author    Linus Torvalds <torvalds@linux-foundation.org>  2019-03-08 08:23:15 -0800
committer Linus Torvalds <torvalds@linux-foundation.org>  2019-03-08 08:23:15 -0800
commit    851ca779d110f694b5d078bc4af06d3ad37169e8
tree      3d03de09e44ef02a6f73924f32fa21646347e64e
parent    b5dd0c658c31b469ccff1b637e5124851e7a4a1c
parent    4b057e73f28f1df13b77b77a52094238ffdf8abd

Merge tag 'drm-next-2019-03-06' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
 "This is the main drm pull request for the 5.1 merge window.

  The big changes I'd highlight are:
   - nouveau has HMM support now; there is finally an in-tree user, so we
     can quieten down the 'rip it out' people.
   - i915 now enables fastboot by default on Skylake+
   - DisplayPort Multistream support has been refactored and should
     hopefully be more reliable.

  Core:
   - header cleanups aiming towards removing drmP.h
   - dma-buf fence seqnos extended to 64 bits
   - common helper for DP MST hotplug for radeon, i915 and amdgpu + new
     refcounting scheme
   - MST i2c improvements
   - drm_syncobj_cb removal
   - ARM FB compression fourcc
   - P010 + P016 fourcc
   - allwinner tiled format modifier
   - i2c over aux I2C_M_STOP support
   - DRM_AUTH handling fixes

  TTM:
   - ref/unref renaming

  New driver:
   - ARM komeda display driver

  scheduler:
   - refactor mirror list handling
   - rework hw fence processing
   - 0 run queue entity fix

  bridge:
   - TI DS90C185 LVDS bridge
   - thc63lvdm83d bridge improvements
   - cadence + allwinner DSI ported to generic phy

  panels:
   - Sitronix ST7701 panel
   - Kingdisplay KD097D04
   - LeMaker BL035-RGB-002
   - PDA 91-00156-A0
   - Innolux EE101IA-01D

  i915:
   - Enable fastboot by default on SKL+/VLV/CHV
   - Export RPCS configuration for ICL media driver
   - Coffeelake PCI ID
   - CNL clocks setup fixes
   - ACPI/PMIC support for MIPI/DSI
   - Per-engine WA init for all engines
   - Shrinker locking fixes
   - Kerneldoc updates
   - Lots of ring improvements and reset fixes
   - Coffeelake GVT Support
   - VFIO GVT EDID Region support
   - runtime PM wakeref tracking
   - ILK->IVB primary plane enable delays
   - userptr mutex locking fixes
   - DSI fixes
   - LVDS/TV cleanups
   - HW readout fixes
   - LUT robustness fixes
   - ICL display and watermark fixes
   - gem mmap race fix

  amdgpu:
   - add scheduled dependencies interface
   - DCC on scanout surfaces
   - vega10/20 BACO support
   - Multiple IH rings on soc15
   - XGMI locking fixes
   - DC i2c/aux cleanups
   - runtime SMU debug interface
   - Kexec improvements
   - SR-IOV fixes
   - DC freesync + ABM fixes
   - GDS fixes
   - GPUVM fixes
   - vega20 PCIE DPM switching fixes
   - Context priority handling fixes

  radeon:
   - fix missing break in evergreen parser

  nouveau:
   - SVM support via HMM

  msm:
   - QCOM Compressed modifier support

  exynos:
   - s5pv210 rotator support

  imx:
   - zpos property support
   - pending update fixes

  v3d:
   - cache flush improvements

  vc4:
   - reflection support
   - HDMI overscan support

  tegra:
   - CEC refactoring
   - HDMI audio fixes
   - Tegra186 prep work
   - SOR crossbar device tree fixes

  sun4i:
   - implicit fencing support
   - YUV and scaler support improvements
   - A23 support
   - tiling fixes

  atmel-hlcdc:
   - clipping and rotation property fixes

  qxl:
   - BO and PRIME improvements
   - generic fbdev emulation

  dw-hdmi:
   - HDMI 2.0 2160p
   - YUV420 output

  rockchip:
   - implicit fencing support
   - reflection properties

  virtio-gpu:
   - use generic fbdev emulation

  tilcdc:
   - cpufreq vs crtc init fix

  rcar-du:
   - R8A774C0 support
   - D3/E3 RGB output routing fixes and DPAD0 support
   - R8A7744 LVDS support

  bochs:
   - atomic and generic fbdev emulation
   - fix for the ID mismatch error on bochs load

  meson:
   - remove firmware fbs"

* tag 'drm-next-2019-03-06' of git://anongit.freedesktop.org/drm/drm: (1130 commits)
  drm/amd/display: Use vrr friendly pageflip throttling in DC.
  drm/imx: only send commit done event when all state has been applied
  drm/imx: allow building under COMPILE_TEST
  drm/imx: imx-tve: depend on COMMON_CLK
  drm/imx: ipuv3-plane: add zpos property
  drm/imx: ipuv3-plane: add function to query atomic update status
  gpu: ipu-v3: prg: add function to get channel configure status
  gpu: ipu-v3: pre: add double buffer status readback
  drm/amdgpu: Bump amdgpu version for context priority override.
  drm/amdgpu/powerplay: fix typo in BACO header guards
  drm/amdgpu/powerplay: fix return codes in BACO code
  drm/amdgpu: add missing license on baco files
  drm/bochs: Fix the ID mismatch error
  drm/nouveau/dmem: use dma addresses during migration copies
  drm/nouveau/dmem: use physical vram addresses during migration copies
  drm/nouveau/dmem: extend copy function to allow direct use of physical addresses
  drm/nouveau/svm: new ioctl to migrate process memory to GPU memory
  drm/nouveau/dmem: device memory helpers for SVM
  drm/nouveau/svm: initial support for shared virtual memory
  drm/nouveau: prepare for enabling svm with existing userspace interfaces
  ...
Diffstat:
 Documentation/devicetree/bindings/display/arm,komeda.txt | 73
 Documentation/devicetree/bindings/display/bridge/lvds-transmitter.txt | 12
 Documentation/devicetree/bindings/display/bridge/renesas,lvds.txt | 4
 Documentation/devicetree/bindings/display/bridge/thine,thc63lvdm83d.txt | 2
 Documentation/devicetree/bindings/display/bridge/ti,ds90c185.txt | 55
 Documentation/devicetree/bindings/display/msm/gmu.txt | 59
 Documentation/devicetree/bindings/display/msm/gpu.txt | 42
 Documentation/devicetree/bindings/display/panel/auo,g101evn010.txt (renamed from Documentation/devicetree/bindings/display/panel/auo,g101evn010) | 0
 Documentation/devicetree/bindings/display/panel/innolux,ee101ia-01d.txt | 7
 Documentation/devicetree/bindings/display/panel/lemaker,bl035-rgb-002.txt | 12
 Documentation/devicetree/bindings/display/panel/pda,91-00156-a0.txt | 14
 Documentation/devicetree/bindings/display/panel/sitronix,st7701.txt | 30
 Documentation/devicetree/bindings/display/renesas,du.txt | 2
 Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt | 1
 Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt | 5
 Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.txt | 3
 Documentation/devicetree/bindings/gpu/samsung-rotator.txt | 7
 Documentation/devicetree/bindings/vendor-prefixes.txt | 2
 Documentation/gpu/afbc.rst | 235
 Documentation/gpu/dp-mst/topology-figure-1.dot | 52
 Documentation/gpu/dp-mst/topology-figure-2.dot | 56
 Documentation/gpu/dp-mst/topology-figure-3.dot | 59
 Documentation/gpu/drivers.rst | 2
 Documentation/gpu/drm-internals.rst | 74
 Documentation/gpu/drm-kms-helpers.rst | 54
 Documentation/gpu/drm-kms.rst | 96
 Documentation/gpu/drm-uapi.rst | 8
 Documentation/gpu/komeda-kms.rst | 488
 Documentation/gpu/todo.rst | 110
 Documentation/gpu/vkms.rst | 11
 MAINTAINERS | 43
 drivers/acpi/pmic/intel_pmic.c | 61
 drivers/acpi/pmic/intel_pmic.h | 4
 drivers/acpi/pmic/intel_pmic_chtwc.c | 19
 drivers/acpi/pmic/intel_pmic_xpower.c | 1
 drivers/dma-buf/dma-buf.c | 12
 drivers/dma-buf/dma-fence.c | 2
 drivers/dma-buf/sw_sync.c | 2
 drivers/dma-buf/sync_debug.c | 16
 drivers/dma-buf/sync_file.c | 4
 drivers/gpu/drm/Kconfig | 4
 drivers/gpu/drm/Makefile | 4
 drivers/gpu/drm/amd/amdgpu/Makefile | 2
 drivers/gpu/drm/amd/amdgpu/amdgpu.h | 11
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 121
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 19
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c | 2
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 142
 drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c | 2
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 13
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 11
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 3
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 129
 drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h | 20
 drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c | 88
 drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h | 17
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 25
 drivers/gpu/drm/amd/amdgpu/amdgpu_gds.h | 2
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 7
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 10
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 46
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 40
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 75
 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h | 10
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 2
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h | 1
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 40
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 1
 drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 354
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 141
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 3
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 4
 drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c | 2
 drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c | 51
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 11
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 9
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 2
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h | 2
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 133
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 54
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 5
 drivers/gpu/drm/amd/amdgpu/atom.c | 2
 drivers/gpu/drm/amd/amdgpu/ci_dpm.c | 6844
 drivers/gpu/drm/amd/amdgpu/ci_dpm.h | 349
 drivers/gpu/drm/amd/amdgpu/ci_smc.c | 279
 drivers/gpu/drm/amd/amdgpu/cik.c | 75
 drivers/gpu/drm/amd/amdgpu/cik_dpm.h | 1
 drivers/gpu/drm/amd/amdgpu/cik_ih.c | 36
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 2
 drivers/gpu/drm/amd/amdgpu/cz_ih.c | 40
 drivers/gpu/drm/amd/amdgpu/dce_v10_0.c | 2
 drivers/gpu/drm/amd/amdgpu/dce_v11_0.c | 2
 drivers/gpu/drm/amd/amdgpu/dce_v6_0.c | 5
 drivers/gpu/drm/amd/amdgpu/dce_v8_0.c | 2
 drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c | 4
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 25
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 25
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 44
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 3
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 14
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c | 36
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 3
 drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c | 10
 drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c | 8
 drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 11
 drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h | 2
 drivers/gpu/drm/amd/amdgpu/psp_v10_0.c | 90
 drivers/gpu/drm/amd/amdgpu/psp_v11_0.c | 75
 drivers/gpu/drm/amd/amdgpu/psp_v3_1.c | 72
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 2
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 5
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 12
 drivers/gpu/drm/amd/amdgpu/si.c | 56
 drivers/gpu/drm/amd/amdgpu/si_dma.c | 2
 drivers/gpu/drm/amd/amdgpu/si_dpm.c | 2
 drivers/gpu/drm/amd/amdgpu/si_ih.c | 38
 drivers/gpu/drm/amd/amdgpu/soc15.c | 167
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c | 60
 drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c | 2
 drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c | 2
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 4
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c | 4
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 2
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 2
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 6
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 324
 drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c | 9
 drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c | 21
 drivers/gpu/drm/amd/amdgpu/vi.c | 69
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 16
 drivers/gpu/drm/amd/amdkfd/kfd_module.c | 31
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 44
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 14
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1519
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c | 48
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 49
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 17
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 109
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c | 59
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_services.c | 2
 drivers/gpu/drm/amd/display/dc/Makefile | 5
 drivers/gpu/drm/amd/display/dc/bios/bios_parser.c | 14
 drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c | 18
 drivers/gpu/drm/amd/display/dc/bios/bios_parser_helper.c | 96
 drivers/gpu/drm/amd/display/dc/bios/bios_parser_helper.h | 4
 drivers/gpu/drm/amd/display/dc/bios/command_table.c | 135
 drivers/gpu/drm/amd/display/dc/bios/command_table.h | 3
 drivers/gpu/drm/amd/display/dc/bios/command_table2.c | 82
 drivers/gpu/drm/amd/display/dc/bios/command_table2.h | 3
 drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c | 10
 drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.c | 45
 drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c | 19
 drivers/gpu/drm/amd/display/dc/core/dc.c | 261
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 147
 drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c | 162
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 90
 drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c | 17
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 82
 drivers/gpu/drm/amd/display/dc/core/dc_stream.c | 69
 drivers/gpu/drm/amd/display/dc/core/dc_surface.c | 9
 drivers/gpu/drm/amd/display/dc/core/dc_vm_helper.c | 123
 drivers/gpu/drm/amd/display/dc/dc.h | 21
 drivers/gpu/drm/amd/display/dc/dc_bios_types.h | 9
 drivers/gpu/drm/amd/display/dc/dc_dp_types.h | 13
 drivers/gpu/drm/amd/display/dc/dc_helper.c | 36
 drivers/gpu/drm/amd/display/dc/dc_hw_types.h | 4
 drivers/gpu/drm/amd/display/dc/dc_link.h | 3
 drivers/gpu/drm/amd/display/dc/dc_stream.h | 69
 drivers/gpu/drm/amd/display/dc/dc_types.h | 3
 drivers/gpu/drm/amd/display/dc/dce/dce_abm.c | 45
 drivers/gpu/drm/amd/display/dc/dce/dce_aux.c | 647
 drivers/gpu/drm/amd/display/dc/dce/dce_aux.h | 30
 drivers/gpu/drm/amd/display/dc/dce/dce_clk_mgr.c | 123
 drivers/gpu/drm/amd/display/dc/dce/dce_clk_mgr.h | 35
 drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c | 224
 drivers/gpu/drm/amd/display/dc/dce/dce_dmcu.c | 129
 drivers/gpu/drm/amd/display/dc/dce/dce_hwseq.h | 12
 drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c | 8
 drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.c | 2
 drivers/gpu/drm/amd/display/dc/dce/dce_stream_encoder.c | 79
 drivers/gpu/drm/amd/display/dc/dce/dce_stream_encoder.h | 8
 drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c | 4
 drivers/gpu/drm/amd/display/dc/dce110/dce110_compressor.c | 2
 drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 212
 drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c | 16
 drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c | 6
 drivers/gpu/drm/amd/display/dc/dce120/dce120_hw_sequencer.c | 15
 drivers/gpu/drm/amd/display/dc/dce120/dce120_hw_sequencer.h | 1
 drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c | 84
 drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c | 4
 drivers/gpu/drm/amd/display/dc/dce80/dce80_timing_generator.c | 8
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_clk_mgr.c | 107
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_cm.c | 12
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_dscl.c | 42
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c | 203
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.h | 37
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c | 28
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h | 7
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 397
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.h | 2
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer_debug.c | 15
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_link_encoder.c | 36
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_link_encoder.h | 4
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c | 184
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h | 23
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c | 74
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.c | 79
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.h | 10
 drivers/gpu/drm/amd/display/dc/dm_helpers.h | 7
 drivers/gpu/drm/amd/display/dc/dm_pp_smu.h | 3
 drivers/gpu/drm/amd/display/dc/dm_services_types.h | 14
 drivers/gpu/drm/amd/display/dc/dml/display_mode_enums.h | 26
 drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.c | 28
 drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.h | 2
 drivers/gpu/drm/amd/display/dc/dml/display_mode_structs.h | 107
 drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.c | 6
 drivers/gpu/drm/amd/display/dc/gpio/gpio_base.c | 12
 drivers/gpu/drm/amd/display/dc/gpio/gpio_service.c | 28
 drivers/gpu/drm/amd/display/dc/gpio/gpio_service.h | 10
 drivers/gpu/drm/amd/display/dc/i2caux/Makefile | 99
 drivers/gpu/drm/amd/display/dc/i2caux/aux_engine.c | 606
 drivers/gpu/drm/amd/display/dc/i2caux/aux_engine.h | 86
 drivers/gpu/drm/amd/display/dc/i2caux/dce100/i2caux_dce100.c | 106
 drivers/gpu/drm/amd/display/dc/i2caux/dce110/aux_engine_dce110.c | 505
 drivers/gpu/drm/amd/display/dc/i2caux/dce110/aux_engine_dce110.h | 78
 drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2c_hw_engine_dce110.c | 574
 drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2c_hw_engine_dce110.h | 218
 drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2c_sw_engine_dce110.c | 160
 drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2caux_dce110.c | 329
 drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2caux_dce110.h | 54
 drivers/gpu/drm/amd/display/dc/i2caux/dce112/i2caux_dce112.c | 129
 drivers/gpu/drm/amd/display/dc/i2caux/dce112/i2caux_dce112.h | 32
 drivers/gpu/drm/amd/display/dc/i2caux/dce120/i2caux_dce120.c | 120
 drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2c_hw_engine_dce80.c | 875
 drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2c_hw_engine_dce80.h | 54
 drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2c_sw_engine_dce80.c | 173
 drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2caux_dce80.c | 284
 drivers/gpu/drm/amd/display/dc/i2caux/dcn10/i2caux_dcn10.c | 120
 drivers/gpu/drm/amd/display/dc/i2caux/dcn10/i2caux_dcn10.h | 32
 drivers/gpu/drm/amd/display/dc/i2caux/diagnostics/i2caux_diag.c | 97
 drivers/gpu/drm/amd/display/dc/i2caux/diagnostics/i2caux_diag.h | 32
 drivers/gpu/drm/amd/display/dc/i2caux/engine.h | 111
 drivers/gpu/drm/amd/display/dc/i2caux/i2c_engine.c | 118
 drivers/gpu/drm/amd/display/dc/i2caux/i2c_engine.h | 115
 drivers/gpu/drm/amd/display/dc/i2caux/i2c_generic_hw_engine.c | 284
 drivers/gpu/drm/amd/display/dc/i2caux/i2c_generic_hw_engine.h | 77
 drivers/gpu/drm/amd/display/dc/i2caux/i2c_hw_engine.c | 251
 drivers/gpu/drm/amd/display/dc/i2caux/i2c_hw_engine.h | 80
 drivers/gpu/drm/amd/display/dc/i2caux/i2c_sw_engine.c | 601
 drivers/gpu/drm/amd/display/dc/i2caux/i2c_sw_engine.h | 81
 drivers/gpu/drm/amd/display/dc/i2caux/i2caux.c | 491
 drivers/gpu/drm/amd/display/dc/i2caux/i2caux.h | 122
 drivers/gpu/drm/amd/display/dc/inc/clock_source.h | 12
 drivers/gpu/drm/amd/display/dc/inc/core_status.h | 2
 drivers/gpu/drm/amd/display/dc/inc/core_types.h | 9
 drivers/gpu/drm/amd/display/dc/inc/dc_link_ddc.h | 17
 drivers/gpu/drm/amd/display/dc/inc/hw/abm.h | 1
 drivers/gpu/drm/amd/display/dc/inc/hw/dchubbub.h | 19
 drivers/gpu/drm/amd/display/dc/inc/hw/dmcu.h | 16
 drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h | 5
 drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h | 3
 drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h | 1
 drivers/gpu/drm/amd/display/dc/inc/hw/mem_input.h | 4
 drivers/gpu/drm/amd/display/dc/inc/hw/stream_encoder.h | 4
 drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h | 17
 drivers/gpu/drm/amd/display/dc/inc/hw/vmid.h (renamed from drivers/gpu/drm/amd/display/dc/i2caux/engine_base.c) | 45
 drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h | 14
 drivers/gpu/drm/amd/display/dc/inc/vm_helper.h (renamed from drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2c_sw_engine_dce110.h) | 39
 drivers/gpu/drm/amd/display/dc/irq_types.h | 8
 drivers/gpu/drm/amd/display/include/bios_parser_types.h | 4
 drivers/gpu/drm/amd/display/include/dal_asic_id.h | 3
 drivers/gpu/drm/amd/display/include/gpio_interface.h | 8
 drivers/gpu/drm/amd/display/include/i2caux_interface.h | 33
 drivers/gpu/drm/amd/display/modules/color/color_gamma.c | 165
 drivers/gpu/drm/amd/display/modules/freesync/freesync.c | 106
 drivers/gpu/drm/amd/display/modules/inc/mod_freesync.h | 2
 drivers/gpu/drm/amd/display/modules/inc/mod_shared.h | 3
 drivers/gpu/drm/amd/display/modules/power/power_helpers.c | 612
 drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_offset.h | 2
 drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_sh_mask.h | 4
 drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_smn.h | 58
 drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_0_smn.h | 54
 drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_4_0_smn.h | 53
 drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_4_offset.h | 2
 drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_4_sh_mask.h | 3
 drivers/gpu/drm/amd/include/asic_reg/thm/thm_11_0_2_offset.h | 3
 drivers/gpu/drm/amd/include/atombios.h | 2
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 72
 drivers/gpu/drm/amd/include/kgd_pp_interface.h | 8
 drivers/gpu/drm/amd/powerplay/amd_powerplay.c | 96
 drivers/gpu/drm/amd/powerplay/hwmgr/Makefile | 2
 drivers/gpu/drm/amd/powerplay/hwmgr/common_baco.c | 101
 drivers/gpu/drm/amd/powerplay/hwmgr/common_baco.h (renamed from drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2c_sw_engine_dce80.h) | 39
 drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c | 9
 drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c | 10
 drivers/gpu/drm/amd/powerplay/hwmgr/pp_psm.c | 14
 drivers/gpu/drm/amd/powerplay/hwmgr/pp_psm.h | 2
 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c | 8
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 2
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_powertune.c | 2
 drivers/gpu/drm/amd/powerplay/hwmgr/smu8_hwmgr.c | 8
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_baco.c | 158
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_baco.h (renamed from drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2caux_dce80.h) | 22
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 196
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.h | 1
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_pptable.h | 24
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.c | 50
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_thermal.c | 37
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_thermal.h | 1
 drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c | 198
 drivers/gpu/drm/amd/powerplay/hwmgr/vega20_baco.c | 103
 drivers/gpu/drm/amd/powerplay/hwmgr/vega20_baco.h (renamed from drivers/gpu/drm/amd/display/dc/i2caux/dce100/i2caux_dce100.h) | 18
 drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c | 419
 drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.h | 3
 drivers/gpu/drm/amd/powerplay/hwmgr/vega20_inc.h | 1
 drivers/gpu/drm/amd/powerplay/inc/hardwaremanager.h | 1
 drivers/gpu/drm/amd/powerplay/inc/hwmgr.h | 12
 drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c | 4
 drivers/gpu/drm/arc/arcpgu_crtc.c | 4
 drivers/gpu/drm/arc/arcpgu_drv.c | 10
 drivers/gpu/drm/arc/arcpgu_sim.c | 4
 drivers/gpu/drm/arm/Kconfig | 12
 drivers/gpu/drm/arm/Makefile | 1
 drivers/gpu/drm/arm/display/Kbuild | 3
 drivers/gpu/drm/arm/display/Kconfig | 14
 drivers/gpu/drm/arm/display/include/malidp_io.h | 42
 drivers/gpu/drm/arm/display/include/malidp_product.h | 23
 drivers/gpu/drm/arm/display/include/malidp_utils.h | 16
 drivers/gpu/drm/arm/display/komeda/Makefile | 21
 drivers/gpu/drm/arm/display/komeda/d71/d71_dev.c | 111
 drivers/gpu/drm/arm/display/komeda/komeda_crtc.c | 110
 drivers/gpu/drm/arm/display/komeda/komeda_dev.c | 190
 drivers/gpu/drm/arm/display/komeda/komeda_dev.h | 110
 drivers/gpu/drm/arm/display/komeda/komeda_drv.c | 144
 drivers/gpu/drm/arm/display/komeda/komeda_format_caps.c | 75
 drivers/gpu/drm/arm/display/komeda/komeda_format_caps.h | 89
 drivers/gpu/drm/arm/display/komeda/komeda_framebuffer.c | 167
 drivers/gpu/drm/arm/display/komeda/komeda_framebuffer.h | 34
 drivers/gpu/drm/arm/display/komeda/komeda_kms.c | 171
 drivers/gpu/drm/arm/display/komeda/komeda_kms.h | 114
 drivers/gpu/drm/arm/display/komeda/komeda_pipeline.c | 202
 drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h | 359
 drivers/gpu/drm/arm/display/komeda/komeda_plane.c | 109
 drivers/gpu/drm/arm/display/komeda/komeda_private_obj.c | 88
 drivers/gpu/drm/arm/hdlcd_crtc.c | 4
 drivers/gpu/drm/arm/hdlcd_drv.c | 6
 drivers/gpu/drm/arm/malidp_crtc.c | 2
 drivers/gpu/drm/arm/malidp_drv.c | 2
 drivers/gpu/drm/arm/malidp_mw.c | 2
 drivers/gpu/drm/armada/armada_510.c | 2
 drivers/gpu/drm/armada/armada_crtc.c | 10
 drivers/gpu/drm/armada/armada_crtc.h | 2
 drivers/gpu/drm/armada/armada_drv.c | 2
 drivers/gpu/drm/armada/armada_fb.c | 2
 drivers/gpu/drm/ast/ast_drv.c | 1
 drivers/gpu/drm/ast/ast_fb.c | 6
 drivers/gpu/drm/ast/ast_main.c | 6
 drivers/gpu/drm/ast/ast_mode.c | 1
 drivers/gpu/drm/ati_pcigart.c | 7
 drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c | 2
 drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c | 2
 drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h | 2
 drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c | 181
 drivers/gpu/drm/bochs/Makefile | 2
 drivers/gpu/drm/bochs/bochs.h | 23
 drivers/gpu/drm/bochs/bochs_drv.c | 36
 drivers/gpu/drm/bochs/bochs_fbdev.c | 163
 drivers/gpu/drm/bochs/bochs_hw.c | 26
 drivers/gpu/drm/bochs/bochs_kms.c | 219
 drivers/gpu/drm/bochs/bochs_mm.c | 69
 drivers/gpu/drm/bridge/Kconfig | 1
 drivers/gpu/drm/bridge/adv7511/adv7511.h | 8
 drivers/gpu/drm/bridge/adv7511/adv7511_drv.c | 9
 drivers/gpu/drm/bridge/adv7511/adv7533.c | 2
 drivers/gpu/drm/bridge/analogix-anx78xx.c | 11
 drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 6
 drivers/gpu/drm/bridge/cdns-dsi.c | 542
 drivers/gpu/drm/bridge/dumb-vga-dac.c | 2
 drivers/gpu/drm/bridge/lvds-encoder.c | 53
 drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c | 2
 drivers/gpu/drm/bridge/nxp-ptn3460.c | 2
 drivers/gpu/drm/bridge/panel.c | 24
 drivers/gpu/drm/bridge/parade-ps8622.c | 2
 drivers/gpu/drm/bridge/sii902x.c | 9
 drivers/gpu/drm/bridge/sil-sii8620.c | 3
 drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c | 9
 drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 151
 drivers/gpu/drm/bridge/synopsys/dw-hdmi.h | 1
 drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c | 20
 drivers/gpu/drm/bridge/tc358764.c | 2
 drivers/gpu/drm/bridge/tc358767.c | 11
 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 2
 drivers/gpu/drm/bridge/ti-tfp410.c | 2
 drivers/gpu/drm/cirrus/cirrus_drv.c | 1
 drivers/gpu/drm/cirrus/cirrus_fbdev.c | 3
 drivers/gpu/drm/cirrus/cirrus_mode.c | 71
 drivers/gpu/drm/drm_agpsupport.c | 2
 drivers/gpu/drm/drm_atomic.c | 19
 drivers/gpu/drm/drm_atomic_helper.c | 19
 drivers/gpu/drm/drm_atomic_uapi.c | 4
 drivers/gpu/drm/drm_bridge.c | 4
 drivers/gpu/drm/drm_bufs.c | 15
 drivers/gpu/drm/drm_color_mgmt.c | 43
 drivers/gpu/drm/drm_connector.c | 95
 drivers/gpu/drm/drm_context.c | 15
 drivers/gpu/drm/drm_crtc.c | 41
 drivers/gpu/drm/drm_crtc_helper.c | 58
 drivers/gpu/drm/drm_crtc_internal.h | 1
 drivers/gpu/drm/drm_damage_helper.c | 42
 drivers/gpu/drm/drm_dp_helper.c | 61
 drivers/gpu/drm/drm_dp_mst_topology.c | 1153
 drivers/gpu/drm/drm_drv.c | 24
 drivers/gpu/drm/drm_dsc.c | 30
 drivers/gpu/drm/drm_edid.c | 101
 drivers/gpu/drm/drm_fb_cma_helper.c | 137
 drivers/gpu/drm/drm_fb_helper.c | 171
 drivers/gpu/drm/drm_file.c | 24
 drivers/gpu/drm/drm_flip_work.c | 1
 drivers/gpu/drm/drm_fourcc.c | 9
 drivers/gpu/drm/drm_framebuffer.c | 3
 drivers/gpu/drm/drm_gem.c | 38
 drivers/gpu/drm/drm_gem_framebuffer_helper.c | 50
 drivers/gpu/drm/drm_internal.h | 2
 drivers/gpu/drm/drm_ioctl.c | 22
 drivers/gpu/drm/drm_irq.c | 10
 drivers/gpu/drm/drm_lease.c | 6
 drivers/gpu/drm/drm_mm.c | 2
 drivers/gpu/drm/drm_mode_config.c | 5
 drivers/gpu/drm/drm_mode_object.c | 9
 drivers/gpu/drm/drm_modes.c | 11
 drivers/gpu/drm/drm_modeset_helper.c | 4
 drivers/gpu/drm/drm_modeset_lock.c | 8
 drivers/gpu/drm/drm_of.c | 4
 drivers/gpu/drm/drm_panel.c | 3
 drivers/gpu/drm/drm_plane.c | 3
 drivers/gpu/drm/drm_probe_helper.c | 2
 drivers/gpu/drm/drm_property.c | 2
 drivers/gpu/drm/drm_rect.c | 108
 drivers/gpu/drm/drm_simple_kms_helper.c | 2
 drivers/gpu/drm/drm_syncobj.c | 91
 drivers/gpu/drm/drm_vblank.c | 47
 drivers/gpu/drm/etnaviv/etnaviv_drv.h | 1
 drivers/gpu/drm/etnaviv/etnaviv_gem.c | 2
 drivers/gpu/drm/etnaviv/etnaviv_sched.c | 11
 drivers/gpu/drm/exynos/exynos_dp.c | 3
 drivers/gpu/drm/exynos/exynos_drm_crtc.c | 2
 drivers/gpu/drm/exynos/exynos_drm_dpi.c | 4
 drivers/gpu/drm/exynos/exynos_drm_drv.c | 2
 drivers/gpu/drm/exynos/exynos_drm_dsi.c | 4
 drivers/gpu/drm/exynos/exynos_drm_fb.c | 6
 drivers/gpu/drm/exynos/exynos_drm_fbdev.c | 3
 drivers/gpu/drm/exynos/exynos_drm_mic.c | 4
 drivers/gpu/drm/exynos/exynos_drm_rotator.c | 23
 drivers/gpu/drm/exynos/exynos_drm_scaler.c | 2
 drivers/gpu/drm/exynos/exynos_drm_vidi.c | 4
 drivers/gpu/drm/exynos/exynos_hdmi.c | 7
 drivers/gpu/drm/exynos/regs-scaler.h | 2
 drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_crtc.c | 2
 drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_drv.c | 4
 drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_kms.c | 2
 drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c | 2
 drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_rgb.c | 2
 drivers/gpu/drm/gma500/framebuffer.c | 1
 drivers/gpu/drm/gma500/psb_drv.c | 3
 drivers/gpu/drm/gma500/psb_intel_drv.h | 1
 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c | 2
 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c | 5
 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c | 2
 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_vdac.c | 2
 drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c | 13
 drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c | 8
 drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c | 44
 drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.h | 4
 drivers/gpu/drm/i2c/ch7006_drv.c | 6
 drivers/gpu/drm/i2c/ch7006_priv.h | 1
 drivers/gpu/drm/i2c/sil164_drv.c | 2
 drivers/gpu/drm/i2c/tda998x_drv.c | 12
 drivers/gpu/drm/i915/Kconfig.debug | 3
 drivers/gpu/drm/i915/Makefile | 12
 drivers/gpu/drm/i915/dvo.h | 1
 drivers/gpu/drm/i915/gvt/Makefile | 1
 drivers/gpu/drm/i915/gvt/aperture_gm.c | 8
 drivers/gpu/drm/i915/gvt/cmd_parser.c | 83
 drivers/gpu/drm/i915/gvt/display.c | 43
 drivers/gpu/drm/i915/gvt/display.h | 37
 drivers/gpu/drm/i915/gvt/dmabuf.c | 5
 drivers/gpu/drm/i915/gvt/edid.c | 32
 drivers/gpu/drm/i915/gvt/fb_decoder.c | 12
 drivers/gpu/drm/i915/gvt/gvt.c | 109
 drivers/gpu/drm/i915/gvt/gvt.h | 11
 drivers/gpu/drm/i915/gvt/handlers.c | 29
 drivers/gpu/drm/i915/gvt/hypercall.h | 10
 drivers/gpu/drm/i915/gvt/interrupt.c | 4
 drivers/gpu/drm/i915/gvt/kvmgt.c | 185
 drivers/gpu/drm/i915/gvt/mmio.c | 6
 drivers/gpu/drm/i915/gvt/mmio.h | 11
 drivers/gpu/drm/i915/gvt/mmio_context.c | 18
 drivers/gpu/drm/i915/gvt/mpt.h | 30
 drivers/gpu/drm/i915/gvt/sched_policy.c | 4
 drivers/gpu/drm/i915/gvt/scheduler.c | 11
 drivers/gpu/drm/i915/gvt/scheduler.h | 2
 drivers/gpu/drm/i915/gvt/trace.h | 2
 drivers/gpu/drm/i915/gvt/vgpu.c | 10
 drivers/gpu/drm/i915/i915_active.c | 286
 drivers/gpu/drm/i915/i915_active.h | 425
 drivers/gpu/drm/i915/i915_active_types.h | 36
 drivers/gpu/drm/i915/i915_cmd_parser.c | 2
 drivers/gpu/drm/i915/i915_debugfs.c | 1064
 drivers/gpu/drm/i915/i915_drv.c | 283
 drivers/gpu/drm/i915/i915_drv.h | 519
 drivers/gpu/drm/i915/i915_gem.c | 961
 drivers/gpu/drm/i915/i915_gem_context.c | 388
 drivers/gpu/drm/i915/i915_gem_context.h | 26
 drivers/gpu/drm/i915/i915_gem_dmabuf.c | 1
 drivers/gpu/drm/i915/i915_gem_evict.c | 90
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 89
 drivers/gpu/drm/i915/i915_gem_fence_reg.c | 29
 drivers/gpu/drm/i915/i915_gem_fence_reg.h | 1
 drivers/gpu/drm/i915/i915_gem_gtt.c | 229
 drivers/gpu/drm/i915/i915_gem_gtt.h | 57
 drivers/gpu/drm/i915/i915_gem_internal.c | 1
 drivers/gpu/drm/i915/i915_gem_object.h | 57
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 175
 drivers/gpu/drm/i915/i915_gem_stolen.c | 13
 drivers/gpu/drm/i915/i915_gem_tiling.c | 5
 drivers/gpu/drm/i915/i915_gem_userptr.c | 225
 drivers/gpu/drm/i915/i915_gpu_error.c | 296
 drivers/gpu/drm/i915/i915_gpu_error.h | 41
 drivers/gpu/drm/i915/i915_ioc32.c | 2
 drivers/gpu/drm/i915/i915_irq.c | 475
 drivers/gpu/drm/i915/i915_params.c | 38
 drivers/gpu/drm/i915/i915_params.h | 13
 drivers/gpu/drm/i915/i915_pci.c | 33
 drivers/gpu/drm/i915/i915_perf.c | 33
 drivers/gpu/drm/i915/i915_pmu.c | 23
 drivers/gpu/drm/i915/i915_query.c | 2
 drivers/gpu/drm/i915/i915_reg.h | 300
 drivers/gpu/drm/i915/i915_request.c | 452
 drivers/gpu/drm/i915/i915_request.h | 497
 drivers/gpu/drm/i915/i915_reset.c | 1349
 drivers/gpu/drm/i915/i915_reset.h | 59
 drivers/gpu/drm/i915/i915_scheduler.c | 29
 drivers/gpu/drm/i915/i915_selftest.h | 1
 drivers/gpu/drm/i915/i915_suspend.c | 17
 drivers/gpu/drm/i915/i915_sw_fence.c | 2
 drivers/gpu/drm/i915/i915_sysfs.c | 27
 drivers/gpu/drm/i915/i915_timeline.c | 257
 drivers/gpu/drm/i915/i915_timeline.h | 61
 drivers/gpu/drm/i915/i915_trace.h | 57
 drivers/gpu/drm/i915/i915_vma.c | 261
 drivers/gpu/drm/i915/i915_vma.h | 58
 drivers/gpu/drm/i915/icl_dsi.c | 45
 drivers/gpu/drm/i915/intel_acpi.c | 1
 drivers/gpu/drm/i915/intel_atomic.c | 9
 drivers/gpu/drm/i915/intel_atomic_plane.c | 43
 drivers/gpu/drm/i915/intel_audio.c | 6
 drivers/gpu/drm/i915/intel_bios.c | 30
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 911
 drivers/gpu/drm/i915/intel_cdclk.c | 60
 drivers/gpu/drm/i915/intel_color.c | 411
 drivers/gpu/drm/i915/intel_connector.c | 5
 drivers/gpu/drm/i915/intel_crt.c | 91
 drivers/gpu/drm/i915/intel_csr.c | 93
 drivers/gpu/drm/i915/intel_ddi.c | 202
 drivers/gpu/drm/i915/intel_device_info.c | 107
 drivers/gpu/drm/i915/intel_device_info.h | 35
 drivers/gpu/drm/i915/intel_display.c | 876
 drivers/gpu/drm/i915/intel_display.h | 6
 drivers/gpu/drm/i915/intel_dp.c | 643
 drivers/gpu/drm/i915/intel_dp_link_training.c | 32
 drivers/gpu/drm/i915/intel_dp_mst.c | 97
 drivers/gpu/drm/i915/intel_dpio_phy.c | 18
 drivers/gpu/drm/i915/intel_dpll_mgr.c | 302
 drivers/gpu/drm/i915/intel_dpll_mgr.h | 55
 drivers/gpu/drm/i915/intel_drv.h | 231
 drivers/gpu/drm/i915/intel_dsi.h | 6
 drivers/gpu/drm/i915/intel_dsi_vbt.c | 24
 drivers/gpu/drm/i915/intel_dvo.c | 12
 drivers/gpu/drm/i915/intel_engine_cs.c | 430
 drivers/gpu/drm/i915/intel_fbc.c | 35
 drivers/gpu/drm/i915/intel_fbdev.c | 12
 drivers/gpu/drm/i915/intel_fifo_underrun.c | 24
 drivers/gpu/drm/i915/intel_frontbuffer.c | 1
 drivers/gpu/drm/i915/intel_gpu_commands.h | 1
 drivers/gpu/drm/i915/intel_guc.h | 3
 drivers/gpu/drm/i915/intel_guc_fw.c | 6
 drivers/gpu/drm/i915/intel_guc_log.c | 32
 drivers/gpu/drm/i915/intel_guc_submission.c | 25
 drivers/gpu/drm/i915/intel_gvt.c | 12
 drivers/gpu/drm/i915/intel_hangcheck.c | 207
 drivers/gpu/drm/i915/intel_hdcp.c | 21
 drivers/gpu/drm/i915/intel_hdmi.c | 59
 drivers/gpu/drm/i915/intel_hotplug.c | 8
 drivers/gpu/drm/i915/intel_huc.c | 8
 drivers/gpu/drm/i915/intel_huc_fw.c | 7
 drivers/gpu/drm/i915/intel_i2c.c | 23
 drivers/gpu/drm/i915/intel_lpe_audio.c | 1
 drivers/gpu/drm/i915/intel_lrc.c | 703
 drivers/gpu/drm/i915/intel_lrc.h | 12
 drivers/gpu/drm/i915/intel_lspcon.c | 36
 drivers/gpu/drm/i915/intel_lvds.c | 49
 drivers/gpu/drm/i915/intel_mocs.c | 408
 drivers/gpu/drm/i915/intel_mocs.h | 1
 drivers/gpu/drm/i915/intel_opregion.c | 1
 drivers/gpu/drm/i915/intel_overlay.c | 48
 drivers/gpu/drm/i915/intel_panel.c | 102
 drivers/gpu/drm/i915/intel_pipe_crc.c | 32
 drivers/gpu/drm/i915/intel_pm.c | 1229
 drivers/gpu/drm/i915/intel_psr.c | 83
 drivers/gpu/drm/i915/intel_ringbuffer.c | 699
 drivers/gpu/drm/i915/intel_ringbuffer.h | 289
 drivers/gpu/drm/i915/intel_runtime_pm.c | 594
 drivers/gpu/drm/i915/intel_sdvo.c | 123
 drivers/gpu/drm/i915/intel_sprite.c | 145
 drivers/gpu/drm/i915/intel_tv.c | 727
 drivers/gpu/drm/i915/intel_uc.c | 15
 drivers/gpu/drm/i915/intel_uc.h | 6
 drivers/gpu/drm/i915/intel_uc_fw.c | 11
 drivers/gpu/drm/i915/intel_uncore.c | 511
 drivers/gpu/drm/i915/intel_vdsc.c | 5
 drivers/gpu/drm/i915/intel_wopcm.c | 6
 drivers/gpu/drm/i915/intel_workarounds.c | 199
 drivers/gpu/drm/i915/selftests/huge_pages.c | 42
 drivers/gpu/drm/i915/selftests/i915_active.c | 157
 drivers/gpu/drm/i915/selftests/i915_gem.c | 47
 drivers/gpu/drm/i915/selftests/i915_gem_coherency.c | 5
 drivers/gpu/drm/i915/selftests/i915_gem_context.c | 607
 drivers/gpu/drm/i915/selftests/i915_gem_evict.c | 105
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 123
 drivers/gpu/drm/i915/selftests/i915_gem_object.c | 22
 drivers/gpu/drm/i915/selftests/i915_live_selftests.h | 2
 drivers/gpu/drm/i915/selftests/i915_mock_selftests.h | 3
 drivers/gpu/drm/i915/selftests/i915_random.c | 33
 drivers/gpu/drm/i915/selftests/i915_random.h | 3
 drivers/gpu/drm/i915/selftests/i915_request.c | 524
 drivers/gpu/drm/i915/selftests/i915_selftest.c | 47
 drivers/gpu/drm/i915/selftests/i915_timeline.c | 464
 drivers/gpu/drm/i915/selftests/i915_vma.c | 81
 drivers/gpu/drm/i915/selftests/igt_live_test.c | 78
 drivers/gpu/drm/i915/selftests/igt_live_test.h | 35
 drivers/gpu/drm/i915/selftests/igt_spinner.c | 91
 drivers/gpu/drm/i915/selftests/intel_breadcrumbs.c | 470
 drivers/gpu/drm/i915/selftests/intel_guc.c | 10
 drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 417
 drivers/gpu/drm/i915/selftests/intel_lrc.c | 272
 drivers/gpu/drm/i915/selftests/intel_workarounds.c | 123
 drivers/gpu/drm/i915/selftests/lib_sw_fence.c | 54
 drivers/gpu/drm/i915/selftests/lib_sw_fence.h | 3
 drivers/gpu/drm/i915/selftests/mock_context.c | 7
 drivers/gpu/drm/i915/selftests/mock_engine.c | 162
 drivers/gpu/drm/i915/selftests/mock_engine.h | 6
 drivers/gpu/drm/i915/selftests/mock_gem_device.c | 26
 drivers/gpu/drm/i915/selftests/mock_gtt.c | 15
 drivers/gpu/drm/i915/selftests/mock_gtt.h | 4
 drivers/gpu/drm/i915/selftests/mock_timeline.c | 6
 drivers/gpu/drm/i915/vlv_dsi.c | 55
 drivers/gpu/drm/i915/vlv_dsi_pll.c | 31
 drivers/gpu/drm/imx/Kconfig | 3
 drivers/gpu/drm/imx/dw_hdmi-imx.c | 2
 drivers/gpu/drm/imx/imx-drm-core.c | 13
 drivers/gpu/drm/imx/imx-ldb.c | 2
 drivers/gpu/drm/imx/imx-tve.c | 2
 drivers/gpu/drm/imx/ipuv3-crtc.c | 42
 drivers/gpu/drm/imx/ipuv3-plane.c | 76
 drivers/gpu/drm/imx/ipuv3-plane.h | 2
 drivers/gpu/drm/imx/parallel-display.c | 2
 drivers/gpu/drm/mediatek/mtk_dpi.c | 2
 drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 2
 drivers/gpu/drm/mediatek/mtk_drm_drv.c | 2
 drivers/gpu/drm/mediatek/mtk_drm_fb.c | 2
 drivers/gpu/drm/mediatek/mtk_dsi.c | 2
 drivers/gpu/drm/mediatek/mtk_hdmi.c | 9
 drivers/gpu/drm/meson/meson_crtc.c | 2
 drivers/gpu/drm/meson/meson_drv.c | 30
 drivers/gpu/drm/meson/meson_dw_hdmi.c | 37
 drivers/gpu/drm/meson/meson_venc.c | 2
 drivers/gpu/drm/meson/meson_venc_cvbs.c | 4
 drivers/gpu/drm/mga/mga_drv.c | 2
 drivers/gpu/drm/mgag200/mgag200_fb.c | 1
 drivers/gpu/drm/mgag200/mgag200_main.c | 8
 drivers/gpu/drm/mgag200/mgag200_mode.c | 1
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 13
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h | 2
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 34
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h | 13
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h | 3
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c | 5
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c | 5
 drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 37
 drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h | 14
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 4
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 19
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog_format.h | 220
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c | 44
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h | 44
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h | 7
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.h | 1
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 65
 drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c | 36
 drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 51
 drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 27
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 325
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h | 28
 drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h | 28
 drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c | 12
 drivers/gpu/drm/msm/disp/mdp4/mdp4_dsi_encoder.c | 11
 drivers/gpu/drm/msm/disp/mdp4/mdp4_dtv_encoder.c | 11
 drivers/gpu/drm/msm/disp/mdp4/mdp4_lcdc_encoder.c | 11
 drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c | 11
 drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c | 11
 drivers/gpu/drm/msm/disp/mdp5/mdp5_encoder.c | 11
 drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 2
 drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c | 1
 drivers/gpu/drm/msm/dsi/dsi.h | 2
 drivers/gpu/drm/msm/dsi/dsi_host.c | 2
 drivers/gpu/drm/msm/dsi/dsi_manager.c | 13
 drivers/gpu/drm/msm/edp/edp_bridge.c | 13
 drivers/gpu/drm/msm/hdmi/hdmi_bridge.c | 7
 drivers/gpu/drm/msm/msm_drv.c | 129
 drivers/gpu/drm/msm/msm_drv.h | 12
 drivers/gpu/drm/msm/msm_fb.c | 2
 drivers/gpu/drm/msm/msm_gem.c | 2
 drivers/gpu/drm/mxsfb/mxsfb_crtc.c | 2
 drivers/gpu/drm/mxsfb/mxsfb_drv.c | 31
 drivers/gpu/drm/mxsfb/mxsfb_drv.h | 1
 drivers/gpu/drm/mxsfb/mxsfb_out.c | 2
 drivers/gpu/drm/nouveau/Kbuild | 2
 drivers/gpu/drm/nouveau/Kconfig | 12
 drivers/gpu/drm/nouveau/dispnv04/crtc.c | 214
 drivers/gpu/drm/nouveau/dispnv04/disp.c | 216
 drivers/gpu/drm/nouveau/dispnv04/disp.h | 5
 drivers/gpu/drm/nouveau/dispnv04/tvnv17.c | 5
 drivers/gpu/drm/nouveau/dispnv50/atom.h | 6
 drivers/gpu/drm/nouveau/dispnv50/core.c | 2
 drivers/gpu/drm/nouveau/dispnv50/curs.c | 2
 drivers/gpu/drm/nouveau/dispnv50/disp.c | 145
 drivers/gpu/drm/nouveau/dispnv50/head.c | 1
 drivers/gpu/drm/nouveau/dispnv50/wimm.c | 2
 drivers/gpu/drm/nouveau/dispnv50/wndw.c | 2
 drivers/gpu/drm/nouveau/include/nvif/class.h | 13
 drivers/gpu/drm/nouveau/include/nvif/clb069.h | 12
 drivers/gpu/drm/nouveau/include/nvif/if000c.h | 30
 drivers/gpu/drm/nouveau/include/nvif/ifc00d.h | 21
 drivers/gpu/drm/nouveau/include/nvif/vmm.h | 4
 drivers/gpu/drm/nouveau/include/nvkm/core/device.h | 3
 drivers/gpu/drm/nouveau/include/nvkm/engine/ce.h | 2
 drivers/gpu/drm/nouveau/include/nvkm/engine/disp.h | 2
 drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h | 2
 drivers/gpu/drm/nouveau/include/nvkm/engine/gr.h | 3
 drivers/gpu/drm/nouveau/include/nvkm/engine/nvdec.h | 2
 drivers/gpu/drm/nouveau/include/nvkm/engine/sec2.h | 3
 drivers/gpu/drm/nouveau/include/nvkm/subdev/bar.h | 2
 drivers/gpu/drm/nouveau/include/nvkm/subdev/devinit.h | 2
 drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h | 4
 drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h | 14
 drivers/gpu/drm/nouveau/include/nvkm/subdev/mc.h | 2
 drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h | 6
 drivers/gpu/drm/nouveau/include/nvkm/subdev/top.h | 1
 drivers/gpu/drm/nouveau/include/nvkm/subdev/volt.h | 1
 drivers/gpu/drm/nouveau/nouveau_abi16.c | 4
 drivers/gpu/drm/nouveau/nouveau_bo.c | 4
 drivers/gpu/drm/nouveau/nouveau_bo.h | 12
 drivers/gpu/drm/nouveau/nouveau_chan.c | 32
 drivers/gpu/drm/nouveau/nouveau_chan.h | 1
 drivers/gpu/drm/nouveau/nouveau_connector.c | 1
 drivers/gpu/drm/nouveau/nouveau_display.c | 309
 drivers/gpu/drm/nouveau/nouveau_display.h | 21
 drivers/gpu/drm/nouveau/nouveau_dmem.c | 887
 drivers/gpu/drm/nouveau/nouveau_dmem.h | 60
 drivers/gpu/drm/nouveau/nouveau_drm.c | 248
 drivers/gpu/drm/nouveau/nouveau_drv.h | 6
 drivers/gpu/drm/nouveau/nouveau_fbcon.c | 6
 drivers/gpu/drm/nouveau/nouveau_fence.h | 2
 drivers/gpu/drm/nouveau/nouveau_gem.c | 46
 drivers/gpu/drm/nouveau/nouveau_svm.c | 835
 drivers/gpu/drm/nouveau/nouveau_svm.h | 48
 drivers/gpu/drm/nouveau/nouveau_vmm.c | 4
 drivers/gpu/drm/nouveau/nouveau_vmm.h | 1
 drivers/gpu/drm/nouveau/nv84_fence.c | 3
 drivers/gpu/drm/nouveau/nvif/disp.c | 2
 drivers/gpu/drm/nouveau/nvif/vmm.c | 5
 drivers/gpu/drm/nouveau/nvkm/core/subdev.c | 1
 drivers/gpu/drm/nouveau/nvkm/engine/ce/Kbuild | 2
 drivers/gpu/drm/nouveau/nvkm/engine/ce/tu102.c (renamed from drivers/gpu/drm/nouveau/nvkm/engine/ce/tu104.c) | 6
 drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 86
 drivers/gpu/drm/nouveau/nvkm/engine/device/priv.h | 1
 drivers/gpu/drm/nouveau/nvkm/engine/device/user.c | 15
 drivers/gpu/drm/nouveau/nvkm/engine/disp/Kbuild | 6
 drivers/gpu/drm/nouveau/nvkm/engine/disp/gf119.c | 16
 drivers/gpu/drm/nouveau/nvkm/engine/disp/gv100.c | 7
 drivers/gpu/drm/nouveau/nvkm/engine/disp/ior.h | 2
 drivers/gpu/drm/nouveau/nvkm/engine/disp/nv50.c | 12
 drivers/gpu/drm/nouveau/nvkm/engine/disp/nv50.h | 3
 drivers/gpu/drm/nouveau/nvkm/engine/disp/rootnv50.h | 2
 drivers/gpu/drm/nouveau/nvkm/engine/disp/roottu102.c (renamed from drivers/gpu/drm/nouveau/nvkm/engine/disp/roottu104.c) | 20
 drivers/gpu/drm/nouveau/nvkm/engine/disp/sortu102.c (renamed from drivers/gpu/drm/nouveau/nvkm/engine/disp/sortu104.c) | 14
 drivers/gpu/drm/nouveau/nvkm/engine/disp/tu102.c (renamed from drivers/gpu/drm/nouveau/nvkm/engine/disp/tu104.c) | 14
 drivers/gpu/drm/nouveau/nvkm/engine/fifo/Kbuild | 6
 drivers/gpu/drm/nouveau/nvkm/engine/fifo/changk104.h | 2
 drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifotu102.c (renamed from drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifotu104.c) | 10
 drivers/gpu/drm/nouveau/nvkm/engine/fifo/tu102.c (renamed from drivers/gpu/drm/nouveau/nvkm/engine/fifo/tu104.c) | 30
 drivers/gpu/drm/nouveau/nvkm/engine/fifo/user.h | 2
 drivers/gpu/drm/nouveau/nvkm/engine/fifo/usertu102.c (renamed from drivers/gpu/drm/nouveau/nvkm/engine/fifo/usertu104.c) | 10
 drivers/gpu/drm/nouveau/nvkm/engine/gr/base.c | 27
 drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c | 10
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c | 331
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h | 16
 drivers/gpu/drm/nouveau/nvkm/engine/gr/priv.h | 5
 drivers/gpu/drm/nouveau/nvkm/engine/nvdec/base.c | 10
 drivers/gpu/drm/nouveau/nvkm/engine/sec2/Kbuild | 1
 drivers/gpu/drm/nouveau/nvkm/engine/sec2/base.c | 23
 drivers/gpu/drm/nouveau/nvkm/engine/sec2/gp102.c | 2
 drivers/gpu/drm/nouveau/nvkm/engine/sec2/priv.h | 3
 drivers/gpu/drm/nouveau/nvkm/engine/sec2/tu102.c (renamed from drivers/gpu/drm/amd/display/dc/i2caux/dce120/i2caux_dce120.h) | 21
 drivers/gpu/drm/nouveau/nvkm/falcon/base.c | 3
 drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c | 6
 drivers/gpu/drm/nouveau/nvkm/subdev/Kbuild | 1
 drivers/gpu/drm/nouveau/nvkm/subdev/bar/Kbuild | 2
 drivers/gpu/drm/nouveau/nvkm/subdev/bar/tu102.c (renamed from drivers/gpu/drm/nouveau/nvkm/subdev/bar/tu104.c) | 30
 drivers/gpu/drm/nouveau/nvkm/subdev/bios/dp.c | 2
 drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c | 11
 drivers/gpu/drm/nouveau/nvkm/subdev/devinit/Kbuild | 2
 drivers/gpu/drm/nouveau/nvkm/subdev/devinit/tu102.c (renamed from drivers/gpu/drm/nouveau/nvkm/subdev/devinit/tu104.c) | 14
 drivers/gpu/drm/nouveau/nvkm/subdev/fault/Kbuild | 3
 drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c | 2
 drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp100.c | 3
 drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c | 16
 drivers/gpu/drm/nouveau/nvkm/subdev/fault/priv.h | 7
 drivers/gpu/drm/nouveau/nvkm/subdev/fault/tu102.c (renamed from drivers/gpu/drm/nouveau/nvkm/subdev/fault/tu104.c) | 39
 drivers/gpu/drm/nouveau/nvkm/subdev/fault/user.c | 106
 drivers/gpu/drm/nouveau/nvkm/subdev/fb/gddr3.c | 2
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/Kbuild | 1
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/gv100.c | 62
 drivers/gpu/drm/nouveau/nvkm/subdev/mc/Kbuild | 2
 drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c (renamed from drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu104.c) | 10
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/Kbuild | 4
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gp100.c | 2
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gp10b.c | 2
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gv100.c | 2
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/priv.h | 2
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/tu102.c (renamed from drivers/gpu/drm/nouveau/nvkm/subdev/mmu/tu104.c) | 8
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c | 83
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 382
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 82
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c | 56
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk104.c | 10
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c | 10
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm200.c | 22
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c | 14
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 210
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c | 12
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgv100.c | 12
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmmcp77.c | 8
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv04.c | 15
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv41.c | 6
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv44.c | 6
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c | 6
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmtu102.c (renamed from drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmtu104.c) | 13
 drivers/gpu/drm/nouveau/nvkm/subdev/pmu/memx.c | 4
 drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c | 4
 drivers/gpu/drm/nouveau/nvkm/subdev/top/base.c | 16
 drivers/gpu/drm/nouveau/nvkm/subdev/top/gk104.c | 2
 drivers/gpu/drm/nouveau/nvkm/subdev/volt/Kbuild | 1
 drivers/gpu/drm/nouveau/nvkm/subdev/volt/gf117.c | 60
 drivers/gpu/drm/omapdrm/omap_connector.c | 11
 drivers/gpu/drm/omapdrm/omap_crtc.c | 9
 drivers/gpu/drm/omapdrm/omap_drv.c | 2
 drivers/gpu/drm/omapdrm/omap_drv.h | 1
 drivers/gpu/drm/omapdrm/omap_encoder.c | 6
 drivers/gpu/drm/omapdrm/omap_fb.c | 2
 drivers/gpu/drm/omapdrm/omap_fbdev.c | 1
 drivers/gpu/drm/panel/Kconfig | 30
 drivers/gpu/drm/panel/Makefile | 3
 drivers/gpu/drm/panel/panel-innolux-p079zca.c | 11
 drivers/gpu/drm/panel/panel-kingdisplay-kd097d04.c | 473
 drivers/gpu/drm/panel/panel-simple.c | 54
 drivers/gpu/drm/panel/panel-sitronix-st7701.c | 426
 drivers/gpu/drm/panel/panel-tpo-tpg110.c | 496
 drivers/gpu/drm/pl111/pl111_drv.c | 8
 drivers/gpu/drm/qxl/Makefile | 2
 drivers/gpu/drm/qxl/qxl_cmd.c | 36
 drivers/gpu/drm/qxl/qxl_display.c | 340
 drivers/gpu/drm/qxl/qxl_draw.c | 241
 drivers/gpu/drm/qxl/qxl_drv.c | 6
 drivers/gpu/drm/qxl/qxl_drv.h | 64
 drivers/gpu/drm/qxl/qxl_dumb.c | 2
 drivers/gpu/drm/qxl/qxl_fb.c | 300
 drivers/gpu/drm/qxl/qxl_kms.c | 83
 drivers/gpu/drm/qxl/qxl_object.c | 12
 drivers/gpu/drm/qxl/qxl_prime.c | 25
 drivers/gpu/drm/qxl/qxl_ttm.c | 20
 drivers/gpu/drm/r128/r128_cce.c | 5
 drivers/gpu/drm/r128/r128_drv.c | 2
 drivers/gpu/drm/radeon/atom.c | 2
 drivers/gpu/drm/radeon/ci_dpm.c | 2
 drivers/gpu/drm/radeon/evergreen_cs.c | 1
 drivers/gpu/drm/radeon/radeon_acpi.c | 1
 drivers/gpu/drm/radeon/radeon_audio.c | 14
 drivers/gpu/drm/radeon/radeon_connectors.c | 1
 drivers/gpu/drm/radeon/radeon_device.c | 1
 drivers/gpu/drm/radeon/radeon_display.c | 3
 drivers/gpu/drm/radeon/radeon_dp_mst.c | 10
 drivers/gpu/drm/radeon/radeon_drv.c | 5
 drivers/gpu/drm/radeon/radeon_irq_kms.c | 1
 drivers/gpu/drm/radeon/radeon_legacy_encoders.c | 1
 drivers/gpu/drm/radeon/si_dpm.c | 2
 drivers/gpu/drm/rcar-du/Kconfig | 1
 drivers/gpu/drm/rcar-du/rcar_du_crtc.c | 81
 drivers/gpu/drm/rcar-du/rcar_du_crtc.h | 8
 drivers/gpu/drm/rcar-du/rcar_du_drv.c | 45
 drivers/gpu/drm/rcar-du/rcar_du_drv.h | 12
 drivers/gpu/drm/rcar-du/rcar_du_encoder.c | 17
 drivers/gpu/drm/rcar-du/rcar_du_encoder.h | 5
 drivers/gpu/drm/rcar-du/rcar_du_group.c | 51
 drivers/gpu/drm/rcar-du/rcar_du_kms.c | 79
 drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7790.dts | 93
 drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7791.dts | 53
 drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7793.dts | 53
 drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7795.dts | 53
 drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7796.dts | 53
 drivers/gpu/drm/rcar-du/rcar_du_plane.c | 4
 drivers/gpu/drm/rcar-du/rcar_du_plane.h | 3
 drivers/gpu/drm/rcar-du/rcar_du_vsp.c | 3
 drivers/gpu/drm/rcar-du/rcar_du_vsp.h | 3
 drivers/gpu/drm/rcar-du/rcar_dw_hdmi.c | 17
 drivers/gpu/drm/rcar-du/rcar_lvds.c | 103
 drivers/gpu/drm/rcar-du/rcar_lvds.h | 27
 drivers/gpu/drm/rockchip/analogix_dp-rockchip.c | 2
 drivers/gpu/drm/rockchip/cdn-dp-core.c | 2
 drivers/gpu/drm/rockchip/cdn-dp-core.h | 2
 drivers/gpu/drm/rockchip/cdn-dp-reg.c | 2
 drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 2
 drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c | 2
 drivers/gpu/drm/rockchip/inno_hdmi.c | 6
 drivers/gpu/drm/rockchip/rockchip_drm_drv.c | 2
 drivers/gpu/drm/rockchip/rockchip_drm_fb.c | 38
 drivers/gpu/drm/rockchip/rockchip_drm_fbdev.c | 3
 drivers/gpu/drm/rockchip/rockchip_drm_psr.c | 39
 drivers/gpu/drm/rockchip/rockchip_drm_psr.h | 3
 drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 163
 drivers/gpu/drm/rockchip/rockchip_drm_vop.h | 15
 drivers/gpu/drm/rockchip/rockchip_lvds.c | 2
 drivers/gpu/drm/rockchip/rockchip_rgb.c | 2
 drivers/gpu/drm/rockchip/rockchip_vop_reg.c | 180
 drivers/gpu/drm/rockchip/rockchip_vop_reg.h | 53
 drivers/gpu/drm/savage/savage_state.c | 4
 drivers/gpu/drm/scheduler/sched_entity.c | 39
 drivers/gpu/drm/scheduler/sched_main.c | 219
 drivers/gpu/drm/shmobile/shmob_drm_crtc.c | 1
 drivers/gpu/drm/shmobile/shmob_drm_drv.c | 7
 drivers/gpu/drm/shmobile/shmob_drm_kms.c | 1
 drivers/gpu/drm/sti/sti_crtc.c | 18
 drivers/gpu/drm/sti/sti_drv.c | 2
 drivers/gpu/drm/sti/sti_dvo.c | 6
 drivers/gpu/drm/sti/sti_hda.c | 6
 drivers/gpu/drm/sti/sti_hdmi.c | 9
 drivers/gpu/drm/sti/sti_tvout.c | 2
 drivers/gpu/drm/stm/drv.c | 8
 drivers/gpu/drm/stm/dw_mipi_dsi-stm.c | 2
 drivers/gpu/drm/stm/ltdc.c | 15
 drivers/gpu/drm/sun4i/Kconfig | 3
 drivers/gpu/drm/sun4i/Makefile | 5
 drivers/gpu/drm/sun4i/sun4i_backend.c | 32
 drivers/gpu/drm/sun4i/sun4i_crtc.c | 2
 drivers/gpu/drm/sun4i/sun4i_drv.c | 7
 drivers/gpu/drm/sun4i/sun4i_frontend.c | 354
 drivers/gpu/drm/sun4i/sun4i_frontend.h | 52
 drivers/gpu/drm/sun4i/sun4i_hdmi_enc.c | 5
 drivers/gpu/drm/sun4i/sun4i_layer.c | 63
 drivers/gpu/drm/sun4i/sun4i_lvds.c | 2
 drivers/gpu/drm/sun4i/sun4i_rgb.c | 2
 drivers/gpu/drm/sun4i/sun4i_tcon.c | 3
 drivers/gpu/drm/sun4i/sun4i_tv.c | 2
 drivers/gpu/drm/sun4i/sun6i_drc.c | 1
 drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c | 34
 drivers/gpu/drm/sun4i/sun6i_mipi_dsi.h | 17
 drivers/gpu/drm/sun4i/sun8i_mixer.c | 2
 drivers/gpu/drm/sun4i/sun8i_ui_layer.c | 2
 drivers/gpu/drm/sun4i/sun8i_vi_layer.c | 2
 drivers/gpu/drm/tegra/Makefile | 1
 drivers/gpu/drm/tegra/drm.c | 57
 drivers/gpu/drm/tegra/drm.h | 5
 drivers/gpu/drm/tegra/fb.c | 2
 drivers/gpu/drm/tegra/hda.c | 63
 drivers/gpu/drm/tegra/hda.h | 20
 drivers/gpu/drm/tegra/hdmi.c | 222
 drivers/gpu/drm/tegra/hub.c | 4
 drivers/gpu/drm/tegra/output.c | 11
 drivers/gpu/drm/tegra/sor.c | 67
 drivers/gpu/drm/tegra/vic.c | 75
 drivers/gpu/drm/tegra/vic.h | 9
 drivers/gpu/drm/tilcdc/tilcdc_drv.c | 37
 drivers/gpu/drm/tilcdc/tilcdc_drv.h | 5
 drivers/gpu/drm/tilcdc/tilcdc_external.c | 1
 drivers/gpu/drm/tilcdc/tilcdc_panel.c | 1
 drivers/gpu/drm/tilcdc/tilcdc_tfp410.c | 1
-rw-r--r--drivers/gpu/drm/tinydrm/core/tinydrm-core.c26
-rw-r--r--drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c100
-rw-r--r--drivers/gpu/drm/tinydrm/core/tinydrm-pipe.c34
-rw-r--r--drivers/gpu/drm/tinydrm/hx8357d.c3
-rw-r--r--drivers/gpu/drm/tinydrm/ili9225.c151
-rw-r--r--drivers/gpu/drm/tinydrm/ili9341.c3
-rw-r--r--drivers/gpu/drm/tinydrm/mi0283qt.c3
-rw-r--r--drivers/gpu/drm/tinydrm/mipi-dbi.c97
-rw-r--r--drivers/gpu/drm/tinydrm/repaper.c44
-rw-r--r--drivers/gpu/drm/tinydrm/st7586.c87
-rw-r--r--drivers/gpu/drm/tinydrm/st7735r.c3
-rw-r--r--drivers/gpu/drm/ttm/ttm_bo.c20
-rw-r--r--drivers/gpu/drm/ttm/ttm_bo_vm.c19
-rw-r--r--drivers/gpu/drm/tve200/tve200_drv.c10
-rw-r--r--drivers/gpu/drm/udl/udl_connector.c1
-rw-r--r--drivers/gpu/drm/udl/udl_drv.c1
-rw-r--r--drivers/gpu/drm/udl/udl_main.c1
-rw-r--r--drivers/gpu/drm/v3d/v3d_drv.h1
-rw-r--r--drivers/gpu/drm/v3d/v3d_gem.c60
-rw-r--r--drivers/gpu/drm/v3d/v3d_sched.c13
-rw-r--r--drivers/gpu/drm/vc4/vc4_crtc.c45
-rw-r--r--drivers/gpu/drm/vc4/vc4_dpi.c2
-rw-r--r--drivers/gpu/drm/vc4/vc4_drv.c1
-rw-r--r--drivers/gpu/drm/vc4/vc4_drv.h4
-rw-r--r--drivers/gpu/drm/vc4/vc4_dsi.c2
-rw-r--r--drivers/gpu/drm/vc4/vc4_hdmi.c34
-rw-r--r--drivers/gpu/drm/vc4/vc4_kms.c7
-rw-r--r--drivers/gpu/drm/vc4/vc4_perfmon.c2
-rw-r--r--drivers/gpu/drm/vc4/vc4_plane.c145
-rw-r--r--drivers/gpu/drm/vc4/vc4_txp.c2
-rw-r--r--drivers/gpu/drm/vc4/vc4_vec.c2
-rw-r--r--drivers/gpu/drm/vgem/vgem_fence.c4
-rw-r--r--drivers/gpu/drm/via/via_dmablit.c4
-rw-r--r--drivers/gpu/drm/via/via_drv.c3
-rw-r--r--drivers/gpu/drm/virtio/Makefile2
-rw-r--r--drivers/gpu/drm/virtio/virtgpu_display.c10
-rw-r--r--drivers/gpu/drm/virtio/virtgpu_drm_bus.c103
-rw-r--r--drivers/gpu/drm/virtio/virtgpu_drv.c84
-rw-r--r--drivers/gpu/drm/virtio/virtgpu_drv.h26
-rw-r--r--drivers/gpu/drm/virtio/virtgpu_fb.c191
-rw-r--r--drivers/gpu/drm/virtio/virtgpu_fence.c8
-rw-r--r--drivers/gpu/drm/virtio/virtgpu_ioctl.c2
-rw-r--r--drivers/gpu/drm/virtio/virtgpu_kms.c18
-rw-r--r--drivers/gpu/drm/virtio/virtgpu_object.c13
-rw-r--r--drivers/gpu/drm/virtio/virtgpu_plane.c17
-rw-r--r--drivers/gpu/drm/virtio/virtgpu_vq.c12
-rw-r--r--drivers/gpu/drm/vkms/vkms_crtc.c26
-rw-r--r--drivers/gpu/drm/vkms/vkms_drv.c3
-rw-r--r--drivers/gpu/drm/vkms/vkms_output.c2
-rw-r--r--drivers/gpu/drm/vmwgfx/vmwgfx_bo.c11
-rw-r--r--drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c11
-rw-r--r--drivers/gpu/drm/vmwgfx/vmwgfx_drv.c2
-rw-r--r--drivers/gpu/drm/vmwgfx/vmwgfx_drv.h9
-rw-r--r--drivers/gpu/drm/vmwgfx/vmwgfx_kms.h2
-rw-r--r--drivers/gpu/drm/vmwgfx/vmwgfx_mob.c21
-rw-r--r--drivers/gpu/drm/vmwgfx/vmwgfx_resource.c9
-rw-r--r--drivers/gpu/drm/vmwgfx/vmwgfx_validation.c6
-rw-r--r--drivers/gpu/drm/xen/xen_drm_front.c2
-rw-r--r--drivers/gpu/drm/xen/xen_drm_front_conn.c3
-rw-r--r--drivers/gpu/drm/xen/xen_drm_front_gem.c15
-rw-r--r--drivers/gpu/drm/xen/xen_drm_front_kms.c4
-rw-r--r--drivers/gpu/drm/zte/zx_drm_drv.c2
-rw-r--r--drivers/gpu/drm/zte/zx_hdmi.c6
-rw-r--r--drivers/gpu/drm/zte/zx_tvenc.c2
-rw-r--r--drivers/gpu/drm/zte/zx_vga.c2
-rw-r--r--drivers/gpu/drm/zte/zx_vou.c2
-rw-r--r--drivers/gpu/host1x/bus.c35
-rw-r--r--drivers/gpu/host1x/cdma.c189
-rw-r--r--drivers/gpu/host1x/cdma.h8
-rw-r--r--drivers/gpu/host1x/dev.c49
-rw-r--r--drivers/gpu/host1x/dev.h8
-rw-r--r--drivers/gpu/host1x/hw/cdma_hw.c46
-rw-r--r--drivers/gpu/host1x/hw/channel_hw.c43
-rw-r--r--drivers/gpu/host1x/hw/host1x06_hardware.h6
-rw-r--r--drivers/gpu/host1x/hw/host1x07_hardware.h6
-rw-r--r--drivers/gpu/host1x/hw/hw_host1x06_channel.h11
-rw-r--r--drivers/gpu/host1x/hw/hw_host1x07_channel.h11
-rw-r--r--drivers/gpu/ipu-v3/ipu-pre.c6
-rw-r--r--drivers/gpu/ipu-v3/ipu-prg.c16
-rw-r--r--drivers/gpu/ipu-v3/ipu-prv.h1
-rw-r--r--drivers/phy/allwinner/Kconfig12
-rw-r--r--drivers/phy/allwinner/Makefile1
-rw-r--r--drivers/phy/allwinner/phy-sun6i-mipi-dphy.c (renamed from drivers/gpu/drm/sun4i/sun6i_mipi_dphy.c)164
-rw-r--r--drivers/staging/vboxvideo/TODO3
-rw-r--r--drivers/staging/vboxvideo/vbox_drv.c10
-rw-r--r--drivers/staging/vboxvideo/vbox_fb.c23
-rw-r--r--drivers/staging/vboxvideo/vbox_irq.c4
-rw-r--r--drivers/staging/vboxvideo/vbox_mode.c9
-rw-r--r--include/drm/bridge/dw_hdmi.h7
-rw-r--r--include/drm/bridge/dw_mipi_dsi.h3
-rw-r--r--include/drm/drmP.h26
-rw-r--r--include/drm/drm_atomic.h54
-rw-r--r--include/drm/drm_bridge.h8
-rw-r--r--include/drm/drm_cache.h18
-rw-r--r--include/drm/drm_client.h2
-rw-r--r--include/drm/drm_color_mgmt.h28
-rw-r--r--include/drm/drm_connector.h12
-rw-r--r--include/drm/drm_crtc.h3
-rw-r--r--include/drm/drm_crtc_helper.h17
-rw-r--r--include/drm/drm_damage_helper.h3
-rw-r--r--include/drm/drm_device.h288
-rw-r--r--include/drm/drm_dp_helper.h37
-rw-r--r--include/drm/drm_dp_mst_helper.h153
-rw-r--r--include/drm/drm_drv.h137
-rw-r--r--include/drm/drm_dsc.h233
-rw-r--r--include/drm/drm_edid.h10
-rw-r--r--include/drm/drm_encoder_slave.h1
-rw-r--r--include/drm/drm_fb_cma_helper.h22
-rw-r--r--include/drm/drm_file.h1
-rw-r--r--include/drm/drm_fourcc.h117
-rw-r--r--include/drm/drm_framebuffer.h10
-rw-r--r--include/drm/drm_gem_cma_helper.h5
-rw-r--r--include/drm/drm_gem_framebuffer_helper.h3
-rw-r--r--include/drm/drm_hdcp.h2
-rw-r--r--include/drm/drm_legacy.h14
-rw-r--r--include/drm/drm_mode_config.h25
-rw-r--r--include/drm/drm_modes.h21
-rw-r--r--include/drm/drm_modeset_helper.h6
-rw-r--r--include/drm/drm_modeset_helper_vtables.h2
-rw-r--r--include/drm/drm_modeset_lock.h2
-rw-r--r--include/drm/drm_probe_helper.h27
-rw-r--r--include/drm/drm_rect.h6
-rw-r--r--include/drm/drm_syncobj.h23
-rw-r--r--include/drm/drm_util.h53
-rw-r--r--include/drm/drm_vblank.h22
-rw-r--r--include/drm/gpu_scheduler.h13
-rw-r--r--include/drm/i915_pciids.h8
-rw-r--r--include/drm/intel-gtt.h3
-rw-r--r--include/drm/tinydrm/mipi-dbi.h5
-rw-r--r--include/drm/tinydrm/tinydrm-helpers.h20
-rw-r--r--include/drm/tinydrm/tinydrm.h26
-rw-r--r--include/drm/ttm/ttm_bo_api.h28
-rw-r--r--include/drm/ttm/ttm_bo_driver.h11
-rw-r--r--include/linux/dma-fence-array.h1
-rw-r--r--include/linux/dma-fence.h22
-rw-r--r--include/linux/hdmi.h15
-rw-r--r--include/linux/mfd/intel_soc_pmic.h3
-rw-r--r--include/trace/events/host1x.h26
-rw-r--r--include/uapi/drm/amdgpu_drm.h9
-rw-r--r--include/uapi/drm/drm_fourcc.h63
-rw-r--r--include/uapi/drm/i915_drm.h64
-rw-r--r--include/uapi/drm/nouveau_drm.h51
-rw-r--r--include/uapi/drm/v3d_drm.h8
-rw-r--r--include/video/imx-ipu-v3.h1
1141 files changed, 39562 insertions, 37911 deletions
diff --git a/Documentation/devicetree/bindings/display/arm,komeda.txt b/Documentation/devicetree/bindings/display/arm,komeda.txt
new file mode 100644
index 000000000000..02b226532ebd
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/arm,komeda.txt
@@ -0,0 +1,73 @@
+Device Tree bindings for Arm Komeda display driver
+
+Required properties:
+- compatible: Should be "arm,mali-d71"
+- reg: Physical base address and length of the registers in the system
+- interrupts: the interrupt line number of the device in the system
+- clocks: A list of phandle + clock-specifier pairs, one for each entry
+    in 'clock-names'
+- clock-names: A list of clock names. It should contain:
+      - "mclk": for the main processor clock
+      - "pclk": for the APB interface clock
+- #address-cells: Must be 1
+- #size-cells: Must be 0
+
+Required properties for sub-node: pipeline@n
+Each device contains one or two pipeline sub-nodes (at least one); each
+pipeline node should provide the following properties:
+- reg: Zero-indexed identifier for the pipeline
+- clocks: A list of phandle + clock-specifier pairs, one for each entry
+    in 'clock-names'
+- clock-names: should contain:
+      - "pxclk": pixel clock
+      - "aclk": AXI interface clock
+
+- port: each pipeline connects to an encoder input port. The connection is
+    modeled using the OF graph bindings specified in
+    Documentation/devicetree/bindings/graph.txt
+
+Optional properties:
+  - memory-region: phandle to a node describing memory (see
+    Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt)
+    to be used for the framebuffer; if not present, the framebuffer may
+    be located anywhere in memory.
+
+Example:
+/ {
+	...
+
+	dp0: display@c00000 {
+		#address-cells = <1>;
+		#size-cells = <0>;
+		compatible = "arm,mali-d71";
+		reg = <0xc00000 0x20000>;
+		interrupts = <0 168 4>;
+		clocks = <&dpu_mclk>, <&dpu_aclk>;
+		clock-names = "mclk", "pclk";
+
+		dp0_pipe0: pipeline@0 {
+			clocks = <&fpgaosc2>, <&dpu_aclk>;
+			clock-names = "pxclk", "aclk";
+			reg = <0>;
+
+			port {
+				dp0_pipe0_out: endpoint {
+					remote-endpoint = <&db_dvi0_in>;
+				};
+			};
+		};
+
+		dp0_pipe1: pipeline@1 {
+			clocks = <&fpgaosc2>, <&dpu_aclk>;
+			clock-names = "pxclk", "aclk";
+			reg = <1>;
+
+			port {
+				dp0_pipe1_out: endpoint {
+					remote-endpoint = <&db_dvi1_in>;
+				};
+			};
+		};
+	};
+	...
+};
diff --git a/Documentation/devicetree/bindings/display/bridge/lvds-transmitter.txt b/Documentation/devicetree/bindings/display/bridge/lvds-transmitter.txt
index 50220190c203..60091db5dfa5 100644
--- a/Documentation/devicetree/bindings/display/bridge/lvds-transmitter.txt
+++ b/Documentation/devicetree/bindings/display/bridge/lvds-transmitter.txt
@@ -22,13 +22,11 @@ among others.
 
 Required properties:
 
-- compatible: Must be one or more of the following
-  - "ti,ds90c185" for the TI DS90C185 FPD-Link Serializer
-  - "lvds-encoder" for a generic LVDS encoder device
+- compatible: Must be "lvds-encoder"
 
-  When compatible with the generic version, nodes must list the
-  device-specific version corresponding to the device first
-  followed by the generic version.
+  Any encoder compatible with this generic binding, but with additional
+  properties not listed here, must list a device-specific compatible first,
+  followed by this generic compatible.
 
 Required nodes:
 
@@ -44,8 +42,6 @@ Example
 
 lvds-encoder {
 	compatible = "lvds-encoder";
-	#address-cells = <1>;
-	#size-cells = <0>;
 
 	ports {
 		#address-cells = <1>;
diff --git a/Documentation/devicetree/bindings/display/bridge/renesas,lvds.txt b/Documentation/devicetree/bindings/display/bridge/renesas,lvds.txt
index ba5469dd09f3..900a884ad9f5 100644
--- a/Documentation/devicetree/bindings/display/bridge/renesas,lvds.txt
+++ b/Documentation/devicetree/bindings/display/bridge/renesas,lvds.txt
@@ -8,6 +8,8 @@ Required properties:
 
 - compatible : Shall contain one of
   - "renesas,r8a7743-lvds" for R8A7743 (RZ/G1M) compatible LVDS encoders
+  - "renesas,r8a7744-lvds" for R8A7744 (RZ/G1N) compatible LVDS encoders
+  - "renesas,r8a774c0-lvds" for R8A774C0 (RZ/G2E) compatible LVDS encoders
   - "renesas,r8a7790-lvds" for R8A7790 (R-Car H2) compatible LVDS encoders
   - "renesas,r8a7791-lvds" for R8A7791 (R-Car M2-W) compatible LVDS encoders
   - "renesas,r8a7793-lvds" for R8A7793 (R-Car M2-N) compatible LVDS encoders
@@ -25,7 +27,7 @@ Required properties:
 - clock-names: Name of the clocks. This property is model-dependent.
   - The functional clock, which is mandatory for all models, shall be listed
     first, and shall be named "fck".
-  - On R8A77990 and R8A77995, the LVDS encoder can use the EXTAL or
+  - On R8A77990, R8A77995 and R8A774C0, the LVDS encoder can use the EXTAL or
     DU_DOTCLKINx clocks. Those clocks are optional. When supplied they must be
     named "extal" and "dclkin.x" respectively, with "x" being the DU_DOTCLKIN
     numerical index.
diff --git a/Documentation/devicetree/bindings/display/bridge/thine,thc63lvdm83d.txt b/Documentation/devicetree/bindings/display/bridge/thine,thc63lvdm83d.txt
index 527e236e9a2a..fee3c88e1a17 100644
--- a/Documentation/devicetree/bindings/display/bridge/thine,thc63lvdm83d.txt
+++ b/Documentation/devicetree/bindings/display/bridge/thine,thc63lvdm83d.txt
@@ -10,7 +10,7 @@ Required properties:
 
 Optional properties:
 
-- pwdn-gpios: Power down control GPIO
+- powerdown-gpios: Power down control GPIO (the /PWDN pin, active low).
 
 Required nodes:
 
diff --git a/Documentation/devicetree/bindings/display/bridge/ti,ds90c185.txt b/Documentation/devicetree/bindings/display/bridge/ti,ds90c185.txt
new file mode 100644
index 000000000000..e575f996959a
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/bridge/ti,ds90c185.txt
@@ -0,0 +1,55 @@
+Texas Instruments FPD-Link (LVDS) Serializer
+--------------------------------------------
+
+The DS90C185 and DS90C187 are low-power serializers for portable
+battery-powered applications that reduce the size of the RGB
+interface between the host GPU and the display.
+
+Required properties:
+
+- compatible: Should be
+  "ti,ds90c185", "lvds-encoder"  for the TI DS90C185 FPD-Link Serializer
+  "ti,ds90c187", "lvds-encoder"  for the TI DS90C187 FPD-Link Serializer
+
+Optional properties:
+
+- powerdown-gpios: Power down control GPIO (the PDB pin, active-low)
+
+Required nodes:
+
+The devices have two video ports. Their connections are modeled using the OF
+graph bindings specified in Documentation/devicetree/bindings/graph.txt.
+
+- Video port 0 for parallel input
+- Video port 1 for LVDS output
+
+
+Example
+-------
+
+lvds-encoder {
+	compatible = "ti,ds90c185", "lvds-encoder";
+
+	powerdown-gpios = <&gpio 17 GPIO_ACTIVE_LOW>;
+
+	ports {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		port@0 {
+			reg = <0>;
+
+			lvds_enc_in: endpoint {
+				remote-endpoint = <&lcdc_out_rgb>;
+			};
+		};
+
+		port@1 {
+			reg = <1>;
+
+			lvds_enc_out: endpoint {
+				remote-endpoint = <&lvds_panel_in>;
+			};
+		};
+	};
+};
diff --git a/Documentation/devicetree/bindings/display/msm/gmu.txt b/Documentation/devicetree/bindings/display/msm/gmu.txt
new file mode 100644
index 000000000000..3439b38e60f2
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/msm/gmu.txt
@@ -0,0 +1,59 @@
+Qualcomm adreno/snapdragon GMU (Graphics management unit)
+
+The GMU is a programmable power controller for the GPU. The CPU controls the
+GMU, which in turn handles power control for the GPU.
+
+Required properties:
+- compatible: "qcom,adreno-gmu-XYZ.W", "qcom,adreno-gmu"
+    for example: "qcom,adreno-gmu-630.2", "qcom,adreno-gmu"
+  Note that the more specific identifier must be listed first to identify
+  the exact device, followed by the less specific "qcom,adreno-gmu" for
+  generic matches.
+- reg: Physical base address and length of the GMU registers.
+- reg-names: Matching names for the register regions
+  * "gmu"
+  * "gmu_pdc"
+  * "gmu_pdc_seg"
+- interrupts: The interrupt signals from the GMU.
+- interrupt-names: Matching names for the interrupts
+  * "hfi"
+  * "gmu"
+- clocks: phandles to the device clocks
+- clock-names: Matching names for the clocks
+   * "gmu"
+   * "cxo"
+   * "axi"
+   * "mnoc"
+- power-domains: should be <&clock_gpucc GPU_CX_GDSC>
+- iommus: phandle to the adreno iommu
+- operating-points-v2: phandle to the OPP operating points
+
+Example:
+
+/ {
+	...
+
+	gmu: gmu@506a000 {
+		compatible="qcom,adreno-gmu-630.2", "qcom,adreno-gmu";
+
+		reg = <0x506a000 0x30000>,
+			<0xb280000 0x10000>,
+			<0xb480000 0x10000>;
+		reg-names = "gmu", "gmu_pdc", "gmu_pdc_seq";
+
+		interrupts = <GIC_SPI 304 IRQ_TYPE_LEVEL_HIGH>,
+		     <GIC_SPI 305 IRQ_TYPE_LEVEL_HIGH>;
+		interrupt-names = "hfi", "gmu";
+
+		clocks = <&gpucc GPU_CC_CX_GMU_CLK>,
+			<&gpucc GPU_CC_CXO_CLK>,
+			<&gcc GCC_DDRSS_GPU_AXI_CLK>,
+			<&gcc GCC_GPU_MEMNOC_GFX_CLK>;
+		clock-names = "gmu", "cxo", "axi", "memnoc";
+
+		power-domains = <&gpucc GPU_CX_GDSC>;
+		iommus = <&adreno_smmu 5>;
+
+		operating-points-v2 = <&gmu_opp_table>;
+	};
+};
diff --git a/Documentation/devicetree/bindings/display/msm/gpu.txt b/Documentation/devicetree/bindings/display/msm/gpu.txt
index f8759145ce1a..aad1aef682f7 100644
--- a/Documentation/devicetree/bindings/display/msm/gpu.txt
+++ b/Documentation/devicetree/bindings/display/msm/gpu.txt
@@ -10,14 +10,23 @@ Required properties:
   If "amd,imageon" is used, there should be no top level msm device.
 - reg: Physical base address and length of the controller's registers.
 - interrupts: The interrupt signal from the gpu.
-- clocks: device clocks
+- clocks: device clocks (if applicable)
   See ../clocks/clock-bindings.txt for details.
-- clock-names: the following clocks are required:
+- clock-names: the following clocks are required by a3xx, a4xx and a5xx
+  cores:
   * "core"
   * "iface"
   * "mem_iface"
+  For GMU attached devices the GPU clocks are not used and are not required. The
+  following devices should not list clocks:
+   - qcom,adreno-630.2
+- iommus: optional phandle to an adreno iommu instance
+- operating-points-v2: optional phandle to the OPP operating points
+- qcom,gmu: For GMU attached devices a phandle to the GMU device that will
+  control the power for the GPU. Applicable targets:
+    - qcom,adreno-630.2
 
-Example:
+Example a3xx/a4xx/a5xx:
 
 / {
 	...
@@ -37,3 +46,30 @@ Example:
 		    <&mmcc MMSS_IMEM_AHB_CLK>;
 	};
 };
+
+Example a6xx (with GMU):
+
+/ {
+	...
+
+	gpu@5000000 {
+		compatible = "qcom,adreno-630.2", "qcom,adreno";
+		#stream-id-cells = <16>;
+
+		reg = <0x5000000 0x40000>, <0x509e000 0x10>;
+		reg-names = "kgsl_3d0_reg_memory", "cx_mem";
+
+		/*
+		 * Look ma, no clocks! The GPU clocks and power are
+		 * controlled entirely by the GMU
+		 */
+
+		interrupts = <GIC_SPI 300 IRQ_TYPE_LEVEL_HIGH>;
+
+		iommus = <&adreno_smmu 0>;
+
+		operating-points-v2 = <&gpu_opp_table>;
+
+		qcom,gmu = <&gmu>;
+	};
+};
diff --git a/Documentation/devicetree/bindings/display/panel/auo,g101evn010 b/Documentation/devicetree/bindings/display/panel/auo,g101evn010.txt
index bc6a0c858e23..bc6a0c858e23 100644
--- a/Documentation/devicetree/bindings/display/panel/auo,g101evn010
+++ b/Documentation/devicetree/bindings/display/panel/auo,g101evn010.txt
diff --git a/Documentation/devicetree/bindings/display/panel/innolux,ee101ia-01d.txt b/Documentation/devicetree/bindings/display/panel/innolux,ee101ia-01d.txt
new file mode 100644
index 000000000000..e5ca4ccd55ed
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/panel/innolux,ee101ia-01d.txt
@@ -0,0 +1,7 @@
+Innolux Corporation 10.1" EE101IA-01D WXGA (1280x800) LVDS panel
+
+Required properties:
+- compatible: should be "innolux,ee101ia-01d"
+
+This binding is compatible with the lvds-panel binding, which is specified
+in panel-lvds.txt in this directory.
diff --git a/Documentation/devicetree/bindings/display/panel/lemaker,bl035-rgb-002.txt b/Documentation/devicetree/bindings/display/panel/lemaker,bl035-rgb-002.txt
new file mode 100644
index 000000000000..74ee7ea6b493
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/panel/lemaker,bl035-rgb-002.txt
@@ -0,0 +1,12 @@
+LeMaker BL035-RGB-002 3.5" QVGA TFT LCD panel
+
+Required properties:
+- compatible: should be "lemaker,bl035-rgb-002"
+- power-supply: as specified in the base binding
+
+Optional properties:
+- backlight: as specified in the base binding
+- enable-gpios: as specified in the base binding
+
+This binding is compatible with the simple-panel binding, which is specified
+in simple-panel.txt in this directory.
diff --git a/Documentation/devicetree/bindings/display/panel/pda,91-00156-a0.txt b/Documentation/devicetree/bindings/display/panel/pda,91-00156-a0.txt
new file mode 100644
index 000000000000..1639fb17a9f0
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/panel/pda,91-00156-a0.txt
@@ -0,0 +1,14 @@
+PDA 91-00156-A0 5.0" WVGA TFT LCD panel
+
+Required properties:
+- compatible: should be "pda,91-00156-a0"
+- power-supply: this panel requires a single power supply, specified as a
+phandle to a regulator. Compatible with the panel-common binding, which is
+specified in panel-common.txt in this directory.
+- backlight: this panel's backlight is controlled by an external backlight
+controller, specified as a phandle to that controller. Compatible with the
+panel-common binding, which is specified in panel-common.txt in this
+directory.
+
+This binding is compatible with the simple-panel binding, which is specified
+in simple-panel.txt in this directory.
diff --git a/Documentation/devicetree/bindings/display/panel/sitronix,st7701.txt b/Documentation/devicetree/bindings/display/panel/sitronix,st7701.txt
new file mode 100644
index 000000000000..ccd17597f1f6
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/panel/sitronix,st7701.txt
@@ -0,0 +1,30 @@
+Sitronix ST7701 based LCD panels
+
+The ST7701 is designed for small and medium size TFT LCD displays and is
+capable of supporting resolutions up to 480RGBX864. It provides several
+system interfaces such as MIPI, RGB and SPI.
+
+The Techstar TS8550B is a 480x854, 2-lane MIPI DSI LCD panel with a
+built-in ST7701 chip.
+
+Required properties:
+- compatible: must be "sitronix,st7701" and one of
+  * "techstar,ts8550b"
+- reset-gpios: a GPIO phandle for the reset pin
+
+Required properties for techstar,ts8550b:
+- reg: DSI virtual channel used by that screen
+- VCC-supply: analog regulator for MIPI circuit
+- IOVCC-supply: I/O system regulator
+
+Optional properties:
+- backlight: phandle for the backlight control.
+
+Example:
+
+panel@0 {
+	compatible = "techstar,ts8550b", "sitronix,st7701";
+	reg = <0>;
+	VCC-supply = <&reg_dldo2>;
+	IOVCC-supply = <&reg_dldo2>;
+	reset-gpios = <&pio 3 24 GPIO_ACTIVE_HIGH>; /* LCD-RST: PD24 */
+	backlight = <&backlight>;
+};
diff --git a/Documentation/devicetree/bindings/display/renesas,du.txt b/Documentation/devicetree/bindings/display/renesas,du.txt
index 3c855d9f2719..aedb22b4d161 100644
--- a/Documentation/devicetree/bindings/display/renesas,du.txt
+++ b/Documentation/devicetree/bindings/display/renesas,du.txt
@@ -7,6 +7,7 @@ Required Properties:
     - "renesas,du-r8a7744" for R8A7744 (RZ/G1N) compatible DU
     - "renesas,du-r8a7745" for R8A7745 (RZ/G1E) compatible DU
     - "renesas,du-r8a77470" for R8A77470 (RZ/G1C) compatible DU
+    - "renesas,du-r8a774c0" for R8A774C0 (RZ/G2E) compatible DU
     - "renesas,du-r8a7779" for R8A7779 (R-Car H1) compatible DU
     - "renesas,du-r8a7790" for R8A7790 (R-Car H2) compatible DU
     - "renesas,du-r8a7791" for R8A7791 (R-Car M2-W) compatible DU
@@ -57,6 +58,7 @@ corresponding to each DU output.
  R8A7744 (RZ/G1N)       DPAD 0         LVDS 0         -              -
  R8A7745 (RZ/G1E)       DPAD 0         DPAD 1         -              -
  R8A77470 (RZ/G1C)      DPAD 0         DPAD 1         LVDS 0         -
+ R8A774C0 (RZ/G2E)      DPAD 0         LVDS 0         LVDS 1         -
  R8A7779 (R-Car H1)     DPAD 0         DPAD 1         -              -
  R8A7790 (R-Car H2)     DPAD 0         LVDS 0         LVDS 1         -
  R8A7791 (R-Car M2-W)   DPAD 0         LVDS 0         -              -
diff --git a/Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt b/Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt
index b79e5769f0ae..4f58c5a2d195 100644
--- a/Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt
+++ b/Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt
@@ -10,6 +10,7 @@ Required properties:
 		"rockchip,rk3126-vop";
 		"rockchip,px30-vop-lit";
 		"rockchip,px30-vop-big";
+		"rockchip,rk3066-vop";
 		"rockchip,rk3188-vop";
 		"rockchip,rk3288-vop";
 		"rockchip,rk3368-vop";
diff --git a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
index f426bdb42f18..31ab72cba3d4 100644
--- a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
+++ b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
@@ -156,6 +156,7 @@ Required properties:
    * allwinner,sun6i-a31-tcon
    * allwinner,sun6i-a31s-tcon
    * allwinner,sun7i-a20-tcon
+   * allwinner,sun8i-a23-tcon
    * allwinner,sun8i-a33-tcon
    * allwinner,sun8i-a83t-tcon-lcd
    * allwinner,sun8i-a83t-tcon-tv
@@ -276,6 +277,7 @@ Required properties:
   - compatible: value must be one of:
     * allwinner,sun6i-a31-drc
     * allwinner,sun6i-a31s-drc
+    * allwinner,sun8i-a23-drc
     * allwinner,sun8i-a33-drc
     * allwinner,sun9i-a80-drc
   - reg: base address and size of the memory-mapped region.
@@ -303,6 +305,7 @@ Required properties:
     * allwinner,sun5i-a13-display-backend
     * allwinner,sun6i-a31-display-backend
     * allwinner,sun7i-a20-display-backend
+    * allwinner,sun8i-a23-display-backend
     * allwinner,sun8i-a33-display-backend
     * allwinner,sun9i-a80-display-backend
   - reg: base address and size of the memory-mapped region.
@@ -360,6 +363,7 @@ Required properties:
     * allwinner,sun5i-a13-display-frontend
     * allwinner,sun6i-a31-display-frontend
     * allwinner,sun7i-a20-display-frontend
+    * allwinner,sun8i-a23-display-frontend
     * allwinner,sun8i-a33-display-frontend
     * allwinner,sun9i-a80-display-frontend
   - reg: base address and size of the memory-mapped region.
@@ -419,6 +423,7 @@ Required properties:
     * allwinner,sun6i-a31-display-engine
     * allwinner,sun6i-a31s-display-engine
     * allwinner,sun7i-a20-display-engine
+    * allwinner,sun8i-a23-display-engine
     * allwinner,sun8i-a33-display-engine
     * allwinner,sun8i-a83t-display-engine
     * allwinner,sun8i-h3-display-engine
diff --git a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.txt b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.txt
index 593be44a53c9..9999255ac5b6 100644
--- a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.txt
+++ b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.txt
@@ -238,6 +238,9 @@ of the following host1x client modules:
   - nvidia,hpd-gpio: specifies a GPIO used for hotplug detection
   - nvidia,edid: supplies a binary EDID blob
   - nvidia,panel: phandle of a display panel
+  - nvidia,xbar-cfg: 5 cells containing the crossbar configuration. Each lane
+    of the SOR, identified by the cell's index, is mapped via the crossbar to
+    the pad specified by the cell's value.
 
   Optional properties when driving an eDP output:
   - nvidia,dpaux: phandle to a DisplayPort AUX interface
diff --git a/Documentation/devicetree/bindings/gpu/samsung-rotator.txt b/Documentation/devicetree/bindings/gpu/samsung-rotator.txt
index 82cd1ed0be93..3aca2578da0b 100644
--- a/Documentation/devicetree/bindings/gpu/samsung-rotator.txt
+++ b/Documentation/devicetree/bindings/gpu/samsung-rotator.txt
@@ -2,9 +2,10 @@
 
 Required properties:
   - compatible : value should be one of the following:
-	(a) "samsung,exynos4210-rotator" for Rotator IP in Exynos4210
-	(b) "samsung,exynos4212-rotator" for Rotator IP in Exynos4212/4412
-	(c) "samsung,exynos5250-rotator" for Rotator IP in Exynos5250
+	* "samsung,s5pv210-rotator" for Rotator IP in S5PV210
+	* "samsung,exynos4210-rotator" for Rotator IP in Exynos4210
+	* "samsung,exynos4212-rotator" for Rotator IP in Exynos4212/4412
+	* "samsung,exynos5250-rotator" for Rotator IP in Exynos5250
 
   - reg : Physical base address of the IP registers and length of memory
 	  mapped region.
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt
index e604e7f73629..98f83edbfc95 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -216,6 +216,7 @@ laird	Laird PLC
 lantiq	Lantiq Semiconductor
 lattice	Lattice Semiconductor
 lego	LEGO Systems A/S
+lemaker	Shenzhen LeMaker Technology Co., Ltd.
 lenovo	Lenovo Group Ltd.
 lg	LG Corporation
 libretech	Shenzhen Libre Technology Co., Ltd
@@ -303,6 +304,7 @@ ovti	OmniVision Technologies
 oxsemi	Oxford Semiconductor, Ltd.
 panasonic	Panasonic Corporation
 parade	Parade Technologies Inc.
+pda	Precision Design Associates, Inc.
 pericom	Pericom Technology Inc.
 pervasive	Pervasive Displays, Inc.
 phicomm PHICOMM Co., Ltd.
diff --git a/Documentation/gpu/afbc.rst b/Documentation/gpu/afbc.rst
new file mode 100644
index 000000000000..4d38dc49d105
--- /dev/null
+++ b/Documentation/gpu/afbc.rst
@@ -0,0 +1,235 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+===================================
+ Arm Framebuffer Compression (AFBC)
+===================================
+
+AFBC is a proprietary lossless image compression protocol and format.
+It provides fine-grained random access and minimizes the amount of
+data transferred between IP blocks.
+
+AFBC can be enabled in drivers which support it by using the AFBC
+format modifiers defined in drm_fourcc.h. See DRM_FORMAT_MOD_ARM_AFBC(*).
+
+All users of the AFBC modifiers must follow the usage guidelines laid
+out in this document, to ensure compatibility across different AFBC
+producers and consumers.
+
+Components and Ordering
+=======================
+
+AFBC streams can contain several components - where a component
+corresponds to a color channel (i.e. R, G, B, X, A, Y, Cb, Cr).
+The assignment of input/output color channels must be consistent
+between the encoder and the decoder for correct operation, otherwise
+the consumer will interpret the decoded data incorrectly.
+
+Furthermore, when the lossless colorspace transform is used
+(AFBC_FORMAT_MOD_YTR, which should be enabled for RGB buffers for
+maximum compression efficiency), the component order must be:
+
+ * Component 0: R
+ * Component 1: G
+ * Component 2: B
+
+The component ordering is communicated via the fourcc code in the
+fourcc:modifier pair. In general, component '0' is considered to
+reside in the least-significant bits of the corresponding linear
+format. For example, COMP(bits):
+
+ * DRM_FORMAT_ABGR8888
+
+   * Component 0: R(8)
+   * Component 1: G(8)
+   * Component 2: B(8)
+   * Component 3: A(8)
+
+ * DRM_FORMAT_BGR888
+
+   * Component 0: R(8)
+   * Component 1: G(8)
+   * Component 2: B(8)
+
+ * DRM_FORMAT_YUYV
+
+   * Component 0: Y(8)
+   * Component 1: Cb(8, 2x1 subsampled)
+   * Component 2: Cr(8, 2x1 subsampled)
+
+In AFBC, 'X' components are not treated any differently from any other
+component. Therefore, an AFBC buffer with fourcc DRM_FORMAT_XBGR8888
+encodes with 4 components, like so:
+
+ * DRM_FORMAT_XBGR8888
+
+   * Component 0: R(8)
+   * Component 1: G(8)
+   * Component 2: B(8)
+   * Component 3: X(8)
+
+Please note, however, that the inclusion of a "wasted" 'X' channel is
+bad for compression efficiency, and so it's recommended to avoid
+formats containing 'X' bits. If a fourth component is
+required/expected by the encoder/decoder, then it is recommended to
+instead use an equivalent format with alpha, setting all alpha bits to
+'1'. If there is no requirement for a fourth component, then a format
+which doesn't include alpha can be used, e.g. DRM_FORMAT_BGR888.
+
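+For illustration, a minimal userspace sketch (assumptions: the buffer
+object handle, pitch and dimensions come from a prior AFBC-capable
+allocation, and add_afbc_fb is a hypothetical helper name) of wrapping
+such a buffer in a DRM framebuffer:
+
+.. code-block:: c
+
+    #include <stdint.h>
+    #include <xf86drm.h>
+    #include <xf86drmMode.h>
+    #include <drm_fourcc.h>
+
+    /*
+     * Sketch: 16x16 superblocks with the sparse layout, and the lossless
+     * YTR transform enabled as recommended above for RGB content.
+     */
+    static int add_afbc_fb(int fd, uint32_t bo_handle, uint32_t pitch,
+                           uint32_t width, uint32_t height, uint32_t *fb_id)
+    {
+        uint64_t modifier = DRM_FORMAT_MOD_ARM_AFBC(
+                AFBC_FORMAT_MOD_BLOCK_SIZE_16x16 |
+                AFBC_FORMAT_MOD_SPARSE |
+                AFBC_FORMAT_MOD_YTR);
+        uint32_t handles[4] = { bo_handle };
+        uint32_t pitches[4] = { pitch };
+        uint32_t offsets[4] = { 0 };
+        uint64_t modifiers[4] = { modifier };
+
+        /* Component 0 (R) resides in the least-significant bits of ABGR8888 */
+        return drmModeAddFB2WithModifiers(fd, width, height,
+                                          DRM_FORMAT_ABGR8888,
+                                          handles, pitches, offsets,
+                                          modifiers, fb_id,
+                                          DRM_MODE_FB_MODIFIERS);
+    }
+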
+Number of Planes
+================
+
+Formats which are typically multi-planar in linear layouts (e.g. YUV
+420) can be encoded into one, or multiple, AFBC planes. As with
+component order, the encoder and decoder must agree about the number
+of planes in order to correctly decode the buffer. The fourcc code is
+used to determine the number of encoded planes in an AFBC buffer,
+matching the number of planes for the linear (unmodified) format.
+Within each plane, the component ordering also follows the fourcc
+code:
+
+For example:
+
+ * DRM_FORMAT_YUYV: nplanes = 1
+
+   * Plane 0:
+
+     * Component 0: Y(8)
+     * Component 1: Cb(8, 2x1 subsampled)
+     * Component 2: Cr(8, 2x1 subsampled)
+
+ * DRM_FORMAT_NV12: nplanes = 2
+
+   * Plane 0:
+
+     * Component 0: Y(8)
+
+   * Plane 1:
+
+     * Component 0: Cb(8, 2x1 subsampled)
+     * Component 1: Cr(8, 2x1 subsampled)
+
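+A kernel-side sketch (for illustration only; afbc_plane_count is a
+hypothetical helper name) of deriving the expected AFBC plane count from
+the fourcc via the core format helper:
+
+.. code-block:: c
+
+    #include <linux/errno.h>
+    #include <drm/drm_fourcc.h>
+
+    /* The AFBC plane count matches the linear format's plane count */
+    static int afbc_plane_count(u32 fourcc)
+    {
+        const struct drm_format_info *info = drm_format_info(fourcc);
+
+        return info ? info->num_planes : -EINVAL;
+    }
+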
+Cross-device interoperability
+=============================
+
+For maximum compatibility across devices, the table below defines
+canonical formats for use between AFBC-enabled devices. Formats which
+are listed here must be used exactly as specified when using the AFBC
+modifiers. Formats which are not listed should be avoided.
+
+.. flat-table:: AFBC formats
+
+   * - Fourcc code
+     - Description
+     - Planes/Components
+
+   * - DRM_FORMAT_ABGR2101010
+     - 10-bit per component RGB, with 2-bit alpha
+     - Plane 0: 4 components
+              * Component 0: R(10)
+              * Component 1: G(10)
+              * Component 2: B(10)
+              * Component 3: A(2)
+
+   * - DRM_FORMAT_ABGR8888
+     - 8-bit per component RGB, with 8-bit alpha
+     - Plane 0: 4 components
+              * Component 0: R(8)
+              * Component 1: G(8)
+              * Component 2: B(8)
+              * Component 3: A(8)
+
+   * - DRM_FORMAT_BGR888
+     - 8-bit per component RGB
+     - Plane 0: 3 components
+              * Component 0: R(8)
+              * Component 1: G(8)
+              * Component 2: B(8)
+
+   * - DRM_FORMAT_BGR565
+     - 5/6-bit per component RGB
+     - Plane 0: 3 components
+              * Component 0: R(5)
+              * Component 1: G(6)
+              * Component 2: B(5)
+
+   * - DRM_FORMAT_ABGR1555
+     - 5-bit per component RGB, with 1-bit alpha
+     - Plane 0: 4 components
+              * Component 0: R(5)
+              * Component 1: G(5)
+              * Component 2: B(5)
+              * Component 3: A(1)
+
+   * - DRM_FORMAT_VUY888
+     - 8-bit per component YCbCr 444, single plane
+     - Plane 0: 3 components
+              * Component 0: Y(8)
+              * Component 1: Cb(8)
+              * Component 2: Cr(8)
+
+   * - DRM_FORMAT_VUY101010
+     - 10-bit per component YCbCr 444, single plane
+     - Plane 0: 3 components
+              * Component 0: Y(10)
+              * Component 1: Cb(10)
+              * Component 2: Cr(10)
+
+   * - DRM_FORMAT_YUYV
+     - 8-bit per component YCbCr 422, single plane
+     - Plane 0: 3 components
+              * Component 0: Y(8)
+              * Component 1: Cb(8, 2x1 subsampled)
+              * Component 2: Cr(8, 2x1 subsampled)
+
+   * - DRM_FORMAT_NV16
+     - 8-bit per component YCbCr 422, two plane
+     - Plane 0: 1 component
+              * Component 0: Y(8)
+       Plane 1: 2 components
+              * Component 0: Cb(8, 2x1 subsampled)
+              * Component 1: Cr(8, 2x1 subsampled)
+
+   * - DRM_FORMAT_Y210
+     - 10-bit per component YCbCr 422, single plane
+     - Plane 0: 3 components
+              * Component 0: Y(10)
+              * Component 1: Cb(10, 2x1 subsampled)
+              * Component 2: Cr(10, 2x1 subsampled)
+
+   * - DRM_FORMAT_P210
+     - 10-bit per component YCbCr 422, two plane
+     - Plane 0: 1 component
+              * Component 0: Y(10)
+       Plane 1: 2 components
+              * Component 0: Cb(10, 2x1 subsampled)
+              * Component 1: Cr(10, 2x1 subsampled)
+
+   * - DRM_FORMAT_YUV420_8BIT
+     - 8-bit per component YCbCr 420, single plane
+     - Plane 0: 3 components
+              * Component 0: Y(8)
+              * Component 1: Cb(8, 2x2 subsampled)
+              * Component 2: Cr(8, 2x2 subsampled)
+
+   * - DRM_FORMAT_YUV420_10BIT
+     - 10-bit per component YCbCr 420, single plane
+     - Plane 0: 3 components
+              * Component 0: Y(10)
+              * Component 1: Cb(10, 2x2 subsampled)
+              * Component 2: Cr(10, 2x2 subsampled)
+
+   * - DRM_FORMAT_NV12
+     - 8-bit per component YCbCr 420, two plane
+     - Plane 0: 1 component
+              * Component 0: Y(8)
+       Plane 1: 2 components
+              * Component 0: Cb(8, 2x2 subsampled)
+              * Component 1: Cr(8, 2x2 subsampled)
+
+   * - DRM_FORMAT_P010
+     - 10-bit per component YCbCr 420, two plane
+     - Plane 0: 1 component
+              * Component 0: Y(10)
+       Plane 1: 2 components
+              * Component 0: Cb(10, 2x2 subsampled)
+              * Component 1: Cr(10, 2x2 subsampled)
diff --git a/Documentation/gpu/dp-mst/topology-figure-1.dot b/Documentation/gpu/dp-mst/topology-figure-1.dot
new file mode 100644
index 000000000000..157e17c7e0b0
--- /dev/null
+++ b/Documentation/gpu/dp-mst/topology-figure-1.dot
@@ -0,0 +1,52 @@
+digraph T {
+    /* Make sure our payloads are always drawn below the driver node */
+    subgraph cluster_driver {
+        fillcolor = grey;
+        style = filled;
+        driver -> {payload1, payload2} [dir=none];
+    }
+
+    /* Driver malloc references */
+    edge [style=dashed];
+    driver -> port1;
+    driver -> port2;
+    driver -> port3:e;
+    driver -> port4;
+
+    payload1:s -> port1:e;
+    payload2:s -> port3:e;
+    edge [style=""];
+
+    subgraph cluster_topology {
+        label="Topology Manager";
+        labelloc=bottom;
+
+        /* Topology references */
+        mstb1 -> {port1, port2};
+        port1 -> mstb2;
+        port2 -> mstb3 -> {port3, port4};
+        port3 -> mstb4;
+
+        /* Malloc references */
+        edge [style=dashed;dir=back];
+        mstb1 -> {port1, port2};
+        port1 -> mstb2;
+        port2 -> mstb3 -> {port3, port4};
+        port3 -> mstb4;
+    }
+
+    driver [label="DRM driver";style=filled;shape=box;fillcolor=lightblue];
+
+    payload1 [label="Payload #1";style=filled;shape=box;fillcolor=lightblue];
+    payload2 [label="Payload #2";style=filled;shape=box;fillcolor=lightblue];
+
+    mstb1 [label="MSTB #1";style=filled;fillcolor=palegreen;shape=oval];
+    mstb2 [label="MSTB #2";style=filled;fillcolor=palegreen;shape=oval];
+    mstb3 [label="MSTB #3";style=filled;fillcolor=palegreen;shape=oval];
+    mstb4 [label="MSTB #4";style=filled;fillcolor=palegreen;shape=oval];
+
+    port1 [label="Port #1";shape=oval];
+    port2 [label="Port #2";shape=oval];
+    port3 [label="Port #3";shape=oval];
+    port4 [label="Port #4";shape=oval];
+}
diff --git a/Documentation/gpu/dp-mst/topology-figure-2.dot b/Documentation/gpu/dp-mst/topology-figure-2.dot
new file mode 100644
index 000000000000..4243dd1737cb
--- /dev/null
+++ b/Documentation/gpu/dp-mst/topology-figure-2.dot
@@ -0,0 +1,56 @@
+digraph T {
+    /* Make sure our payloads are always drawn below the driver node */
+    subgraph cluster_driver {
+        fillcolor = grey;
+        style = filled;
+        driver -> {payload1, payload2} [dir=none];
+    }
+
+    /* Driver malloc references */
+    edge [style=dashed];
+    driver -> port1;
+    driver -> port2;
+    driver -> port3:e;
+    driver -> port4 [color=red];
+
+    payload1:s -> port1:e;
+    payload2:s -> port3:e;
+    edge [style=""];
+
+    subgraph cluster_topology {
+        label="Topology Manager";
+        labelloc=bottom;
+
+        /* Topology references */
+        mstb1 -> {port1, port2};
+        port1 -> mstb2;
+        edge [color=red];
+        port2 -> mstb3 -> {port3, port4};
+        port3 -> mstb4;
+        edge [color=""];
+
+        /* Malloc references */
+        edge [style=dashed;dir=back];
+        mstb1 -> {port1, port2};
+        port1 -> mstb2;
+        port2 -> mstb3 -> port3;
+        edge [color=red];
+        mstb3 -> port4;
+        port3 -> mstb4;
+    }
+
+    mstb1 [label="MSTB #1";style=filled;fillcolor=palegreen];
+    mstb2 [label="MSTB #2";style=filled;fillcolor=palegreen];
+    mstb3 [label="MSTB #3";style=filled;fillcolor=palegreen];
+    mstb4 [label="MSTB #4";style=filled;fillcolor=grey];
+
+    port1 [label="Port #1"];
+    port2 [label="Port #2"];
+    port3 [label="Port #3"];
+    port4 [label="Port #4";style=filled;fillcolor=grey];
+
+    driver [label="DRM driver";style=filled;shape=box;fillcolor=lightblue];
+
+    payload1 [label="Payload #1";style=filled;shape=box;fillcolor=lightblue];
+    payload2 [label="Payload #2";style=filled;shape=box;fillcolor=lightblue];
+}
diff --git a/Documentation/gpu/dp-mst/topology-figure-3.dot b/Documentation/gpu/dp-mst/topology-figure-3.dot
new file mode 100644
index 000000000000..6cd78d06778b
--- /dev/null
+++ b/Documentation/gpu/dp-mst/topology-figure-3.dot
@@ -0,0 +1,59 @@
+digraph T {
+    /* Make sure our payloads are always drawn below the driver node */
+    subgraph cluster_driver {
+        fillcolor = grey;
+        style = filled;
+        edge [dir=none];
+        driver -> payload1;
+        driver -> payload2 [penwidth=3];
+        edge [dir=""];
+    }
+
+    /* Driver malloc references */
+    edge [style=dashed];
+    driver -> port1;
+    driver -> port2;
+    driver -> port3:e;
+    driver -> port4 [color=grey];
+    payload1:s -> port1:e;
+    payload2:s -> port3:e [penwidth=3];
+    edge [style=""];
+
+    subgraph cluster_topology {
+        label="Topology Manager";
+        labelloc=bottom;
+
+        /* Topology references */
+        mstb1 -> {port1, port2};
+        port1 -> mstb2;
+        edge [color=grey];
+        port2 -> mstb3 -> {port3, port4};
+        port3 -> mstb4;
+        edge [color=""];
+
+        /* Malloc references */
+        edge [style=dashed;dir=back];
+        mstb1 -> {port1, port2};
+        port1 -> mstb2;
+        port2 -> mstb3 [penwidth=3];
+        mstb3 -> port3 [penwidth=3];
+        edge [color=grey];
+        mstb3 -> port4;
+        port3 -> mstb4;
+    }
+
+    mstb1 [label="MSTB #1";style=filled;fillcolor=palegreen];
+    mstb2 [label="MSTB #2";style=filled;fillcolor=palegreen];
+    mstb3 [label="MSTB #3";style=filled;fillcolor=palegreen;penwidth=3];
+    mstb4 [label="MSTB #4";style=filled;fillcolor=grey];
+
+    port1 [label="Port #1"];
+    port2 [label="Port #2";penwidth=5];
+    port3 [label="Port #3";penwidth=3];
+    port4 [label="Port #4";style=filled;fillcolor=grey];
+
+    driver [label="DRM driver";style=filled;shape=box;fillcolor=lightblue];
+
+    payload1 [label="Payload #1";style=filled;shape=box;fillcolor=lightblue];
+    payload2 [label="Payload #2";style=filled;shape=box;fillcolor=lightblue;penwidth=3];
+}
diff --git a/Documentation/gpu/drivers.rst b/Documentation/gpu/drivers.rst
index 7c1672118a73..044a7025477c 100644
--- a/Documentation/gpu/drivers.rst
+++ b/Documentation/gpu/drivers.rst
@@ -17,6 +17,8 @@ GPU Driver Documentation
    vkms
    bridge/dw-hdmi
    xen-front
+   afbc
+   komeda-kms
 
 .. only::  subproject and html
 
diff --git a/Documentation/gpu/drm-internals.rst b/Documentation/gpu/drm-internals.rst
index 5ee9674fb9e9..3ae23a5454ac 100644
--- a/Documentation/gpu/drm-internals.rst
+++ b/Documentation/gpu/drm-internals.rst
@@ -39,68 +39,6 @@ sections.
 Driver Information
 ------------------
 
-Driver Features
-~~~~~~~~~~~~~~~
-
-Drivers inform the DRM core about their requirements and supported
-features by setting appropriate flags in the driver_features field.
-Since those flags influence the DRM core behaviour since registration
-time, most of them must be set to registering the :c:type:`struct
-drm_driver <drm_driver>` instance.
-
-u32 driver_features;
-
-DRIVER_USE_AGP
-    Driver uses AGP interface, the DRM core will manage AGP resources.
-
-DRIVER_LEGACY
-    Denote a legacy driver using shadow attach. Don't use.
-
-DRIVER_KMS_LEGACY_CONTEXT
-    Used only by nouveau for backwards compatibility with existing userspace.
-    Don't use.
-
-DRIVER_PCI_DMA
-    Driver is capable of PCI DMA, mapping of PCI DMA buffers to
-    userspace will be enabled. Deprecated.
-
-DRIVER_SG
-    Driver can perform scatter/gather DMA, allocation and mapping of
-    scatter/gather buffers will be enabled. Deprecated.
-
-DRIVER_HAVE_DMA
-    Driver supports DMA, the userspace DMA API will be supported.
-    Deprecated.
-
-DRIVER_HAVE_IRQ; DRIVER_IRQ_SHARED
-    DRIVER_HAVE_IRQ indicates whether the driver has an IRQ handler
-    managed by the DRM Core. The core will support simple IRQ handler
-    installation when the flag is set. The installation process is
-    described in ?.
-
-    DRIVER_IRQ_SHARED indicates whether the device & handler support
-    shared IRQs (note that this is required of PCI drivers).
-
-DRIVER_GEM
-    Driver use the GEM memory manager.
-
-DRIVER_MODESET
-    Driver supports mode setting interfaces (KMS).
-
-DRIVER_PRIME
-    Driver implements DRM PRIME buffer sharing.
-
-DRIVER_RENDER
-    Driver supports dedicated render nodes.
-
-DRIVER_ATOMIC
-    Driver supports atomic properties. In this case the driver must
-    implement appropriate obj->atomic_get_property() vfuncs for any
-    modeset objects with driver specific properties.
-
-DRIVER_SYNCOBJ
-    Driver support drm sync objects.
-
 Major, Minor and Patchlevel
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -143,6 +81,9 @@ Device Instance and Driver Handling
 .. kernel-doc:: drivers/gpu/drm/drm_drv.c
    :doc: driver instance overview
 
+.. kernel-doc:: include/drm/drm_device.h
+   :internal:
+
 .. kernel-doc:: include/drm/drm_drv.h
    :internal:
 
@@ -230,6 +171,15 @@ Printer
 .. kernel-doc:: drivers/gpu/drm/drm_print.c
    :export:
 
+Utilities
+---------
+
+.. kernel-doc:: include/drm/drm_util.h
+   :doc: drm utils
+
+.. kernel-doc:: include/drm/drm_util.h
+   :internal:
+
 
 Legacy Support Code
 ===================
diff --git a/Documentation/gpu/drm-kms-helpers.rst b/Documentation/gpu/drm-kms-helpers.rst
index b422eb8edf16..17ca7f8bf3d3 100644
--- a/Documentation/gpu/drm-kms-helpers.rst
+++ b/Documentation/gpu/drm-kms-helpers.rst
@@ -116,8 +116,6 @@ Framebuffer CMA Helper Functions Reference
 .. kernel-doc:: drivers/gpu/drm/drm_fb_cma_helper.c
    :export:
 
-.. _drm_bridges:
-
 Framebuffer GEM Helper Reference
 ================================
 
@@ -127,6 +125,8 @@ Framebuffer GEM Helper Reference
 .. kernel-doc:: drivers/gpu/drm/drm_gem_framebuffer_helper.c
    :export:
 
+.. _drm_bridges:
+
 Bridges
 =======
 
@@ -208,18 +208,40 @@ Display Port Dual Mode Adaptor Helper Functions Reference
 .. kernel-doc:: drivers/gpu/drm/drm_dp_dual_mode_helper.c
    :export:
 
-Display Port MST Helper Functions Reference
-===========================================
+Display Port MST Helpers
+========================
+
+Overview
+--------
 
 .. kernel-doc:: drivers/gpu/drm/drm_dp_mst_topology.c
    :doc: dp mst helper
 
+.. kernel-doc:: drivers/gpu/drm/drm_dp_mst_topology.c
+   :doc: Branch device and port refcounting
+
+Functions Reference
+-------------------
+
 .. kernel-doc:: include/drm/drm_dp_mst_helper.h
    :internal:
 
 .. kernel-doc:: drivers/gpu/drm/drm_dp_mst_topology.c
    :export:
 
+Topology Lifetime Internals
+---------------------------
+
+These functions aren't exported to drivers, but are documented here to help
+make the MST topology helpers easier to understand.
+
+.. kernel-doc:: drivers/gpu/drm/drm_dp_mst_topology.c
+   :functions: drm_dp_mst_topology_try_get_mstb drm_dp_mst_topology_get_mstb
+               drm_dp_mst_topology_put_mstb
+               drm_dp_mst_topology_try_get_port drm_dp_mst_topology_get_port
+               drm_dp_mst_topology_put_port
+               drm_dp_mst_get_mstb_malloc drm_dp_mst_put_mstb_malloc
+
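+As an illustrative sketch only (these are the non-exported helpers listed
+above, and the real code also holds the topology manager's lock while
+walking), the lookup pattern is roughly:
+
+.. code-block:: c
+
+    static void example_walk_ports(struct drm_dp_mst_branch *mstb)
+    {
+        struct drm_dp_mst_port *port;
+
+        list_for_each_entry(port, &mstb->ports, next) {
+            /* Skip ports whose topology refcount already hit zero */
+            if (!drm_dp_mst_topology_try_get_port(port))
+                continue;
+
+            /* ... the port is now safe to use ... */
+
+            drm_dp_mst_topology_put_port(port);
+        }
+    }
+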
 MIPI DSI Helper Functions Reference
 ===================================
 
@@ -274,18 +296,6 @@ SCDC Helper Functions Reference
 .. kernel-doc:: drivers/gpu/drm/drm_scdc_helper.c
    :export:
 
-Rectangle Utilities Reference
-=============================
-
-.. kernel-doc:: include/drm/drm_rect.h
-   :doc: rect utils
-
-.. kernel-doc:: include/drm/drm_rect.h
-   :internal:
-
-.. kernel-doc:: drivers/gpu/drm/drm_rect.c
-   :export:
-
 HDMI Infoframes Helper Reference
 ================================
 
@@ -300,6 +310,18 @@ libraries and hence is also included here.
 .. kernel-doc:: drivers/video/hdmi.c
    :export:
 
+Rectangle Utilities Reference
+=============================
+
+.. kernel-doc:: include/drm/drm_rect.h
+   :doc: rect utils
+
+.. kernel-doc:: include/drm/drm_rect.h
+   :internal:
+
+.. kernel-doc:: drivers/gpu/drm/drm_rect.c
+   :export:
+
 Flip-work Helper Reference
 ==========================
 
diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst
index 75c882e09fee..23a3c986ef6d 100644
--- a/Documentation/gpu/drm-kms.rst
+++ b/Documentation/gpu/drm-kms.rst
@@ -410,102 +410,6 @@ Encoder Functions Reference
 .. kernel-doc:: drivers/gpu/drm/drm_encoder.c
    :export:
 
-KMS Initialization and Cleanup
-==============================
-
-A KMS device is abstracted and exposed as a set of planes, CRTCs,
-encoders and connectors. KMS drivers must thus create and initialize all
-those objects at load time after initializing mode setting.
-
-CRTCs (:c:type:`struct drm_crtc <drm_crtc>`)
---------------------------------------------
-
-A CRTC is an abstraction representing a part of the chip that contains a
-pointer to a scanout buffer. Therefore, the number of CRTCs available
-determines how many independent scanout buffers can be active at any
-given time. The CRTC structure contains several fields to support this:
-a pointer to some video memory (abstracted as a frame buffer object), a
-display mode, and an (x, y) offset into the video memory to support
-panning or configurations where one piece of video memory spans multiple
-CRTCs.
-
-CRTC Initialization
-~~~~~~~~~~~~~~~~~~~
-
-A KMS device must create and register at least one struct
-:c:type:`struct drm_crtc <drm_crtc>` instance. The instance is
-allocated and zeroed by the driver, possibly as part of a larger
-structure, and registered with a call to :c:func:`drm_crtc_init()`
-with a pointer to CRTC functions.
-
-
-Cleanup
--------
-
-The DRM core manages its objects' lifetime. When an object is not needed
-anymore the core calls its destroy function, which must clean up and
-free every resource allocated for the object. Every
-:c:func:`drm_\*_init()` call must be matched with a corresponding
-:c:func:`drm_\*_cleanup()` call to cleanup CRTCs
-(:c:func:`drm_crtc_cleanup()`), planes
-(:c:func:`drm_plane_cleanup()`), encoders
-(:c:func:`drm_encoder_cleanup()`) and connectors
-(:c:func:`drm_connector_cleanup()`). Furthermore, connectors that
-have been added to sysfs must be removed by a call to
-:c:func:`drm_connector_unregister()` before calling
-:c:func:`drm_connector_cleanup()`.
-
-Connectors state change detection must be cleanup up with a call to
-:c:func:`drm_kms_helper_poll_fini()`.
-
-Output discovery and initialization example
--------------------------------------------
-
-.. code-block:: c
-
-    void intel_crt_init(struct drm_device *dev)
-    {
-        struct drm_connector *connector;
-        struct intel_output *intel_output;
-
-        intel_output = kzalloc(sizeof(struct intel_output), GFP_KERNEL);
-        if (!intel_output)
-            return;
-
-        connector = &intel_output->base;
-        drm_connector_init(dev, &intel_output->base,
-                   &intel_crt_connector_funcs, DRM_MODE_CONNECTOR_VGA);
-
-        drm_encoder_init(dev, &intel_output->enc, &intel_crt_enc_funcs,
-                 DRM_MODE_ENCODER_DAC);
-
-        drm_connector_attach_encoder(&intel_output->base,
-                          &intel_output->enc);
-
-        /* Set up the DDC bus. */
-        intel_output->ddc_bus = intel_i2c_create(dev, GPIOA, "CRTDDC_A");
-        if (!intel_output->ddc_bus) {
-            dev_printk(KERN_ERR, &dev->pdev->dev, "DDC bus registration "
-                   "failed.\n");
-            return;
-        }
-
-        intel_output->type = INTEL_OUTPUT_ANALOG;
-        connector->interlace_allowed = 0;
-        connector->doublescan_allowed = 0;
-
-        drm_encoder_helper_add(&intel_output->enc, &intel_crt_helper_funcs);
-        drm_connector_helper_add(connector, &intel_crt_connector_helper_funcs);
-
-        drm_connector_register(connector);
-    }
-
-In the example above (taken from the i915 driver), a CRTC, connector and
-encoder combination is created. A device-specific i2c bus is also
-created for fetching EDID data and performing monitor detection. Once
-the process is complete, the new connector is registered with sysfs to
-make its properties available to applications.
-
 KMS Locking
 ===========
 
diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
index a752aa561ea4..c9fd23efd957 100644
--- a/Documentation/gpu/drm-uapi.rst
+++ b/Documentation/gpu/drm-uapi.rst
@@ -238,6 +238,14 @@ DRM specific patterns. Note that ENOTTY has the slightly unintuitive meaning of
 Testing and validation
 ======================
 
+Testing Requirements for userspace API
+--------------------------------------
+
+New cross-driver userspace interface extensions, like new IOCTLs, new KMS
+properties, new files in sysfs or anything else that constitutes an API
+change, should have driver-agnostic testcases in IGT for that feature, if
+such a test can reasonably be made using IGT for the target hardware.
+
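+For illustration, a minimal IGT skeleton (a sketch only; the subtest bodies
+depend entirely on the interface being added) could look like:
+
+.. code-block:: c
+
+    #include "igt.h"
+
+    IGT_TEST_DESCRIPTION("Sketch: exercise a new cross-driver KMS interface");
+
+    igt_main
+    {
+        int fd = -1;
+
+        igt_fixture
+            fd = drm_open_driver_master(DRIVER_ANY);
+
+        igt_subtest("invalid-input") {
+            /* feed the new interface invalid input, expect a clean error */
+        }
+
+        igt_subtest("basic") {
+            /* minimal functional check of the new interface */
+        }
+
+        igt_fixture
+            close(fd);
+    }
+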
 Validating changes with IGT
 ---------------------------
 
diff --git a/Documentation/gpu/komeda-kms.rst b/Documentation/gpu/komeda-kms.rst
new file mode 100644
index 000000000000..b08da1cffecc
--- /dev/null
+++ b/Documentation/gpu/komeda-kms.rst
@@ -0,0 +1,488 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================
+ drm/komeda Arm display driver
+==============================
+
+The drm/komeda driver supports the Arm display processor D71 and later
+products. This document gives a brief overview of the driver design: how it
+works and why it is designed this way.
+
+Overview of D71-like display IPs
+================================
+
+Starting with D71, Arm display IP adopts a flexible and modularized
+architecture. A display pipeline is made up of multiple individual,
+functional pipeline stages called components, and every component has some
+specific capabilities that apply a particular processing step to the pixel
+data flowing through the pipeline.
+
+Typical D71 components:
+
+Layer
+-----
+Layer is the first pipeline stage, which prepares the pixel data for the next
+stage. It fetches pixels from memory, decodes them if they are AFBC-compressed,
+rotates the source image, unpacks or converts YUV pixels to the device's
+internal RGB representation, then adjusts the color space of the pixels if
+needed.
+
+Scaler
+------
+As its name suggests, the scaler is responsible for scaling, and D71 also
+supports image enhancement through the scaler.
+The usage of the scaler is very flexible: it can be connected to a layer
+output for layer scaling, or connected to the compositor to scale the whole
+display frame, whose output is then fed into wb_layer, which writes it into
+memory.
+
+Compositor (compiz)
+-------------------
+Compositor blends multiple layers or pixel data flows into one single display
+frame. Its output frame can be fed into the post image processor for display
+on the monitor, or fed into wb_layer and written to memory at the same time.
+The user can also insert a scaler between the compositor and wb_layer to
+downscale the display frame first and then write it to memory.
+
+Writeback Layer (wb_layer)
+--------------------------
+Writeback layer does the opposite of Layer: it connects to compiz and writes
+the composition result to memory.
+
+Post image processor (improc)
+-----------------------------
+Post image processor adjusts frame data like gamma and color space to fit the
+requirements of the monitor.
+
+Timing controller (timing_ctrlr)
+--------------------------------
+The timing controller is the final stage of the display pipeline. It does not
+handle pixels, but only controls the display timing.
+
+Merger
+------
+The D71 scaler mostly has only half the horizontal input/output capability
+compared with Layer; for example, if Layer supports a 4K input size, the
+scaler can only support 2K input/output at the same time. To achieve full
+frame scaling, D71 introduces Layer Split, which splits the whole image into
+two halves and feeds them to two Layers (A and B), which scale them
+independently. After scaling, the results need to be fed to the merger to
+merge the two partial images together, and the merged result is then output
+to compiz.
+
+Splitter
+--------
+Similar to Layer Split, but Splitter is used for writeback: it splits the
+compiz result into two parts and then feeds them to two scalers.
+
+Possible D71 Pipeline usage
+===========================
+
+Benefiting from the modularized architecture, D71 pipelines can be easily
+adjusted to fit different usages. D71 has two pipelines, which support two
+types of working modes:
+
+-   Dual display mode
+    Two pipelines work independently and separately to drive two display outputs.
+
+-   Single display mode
+    Two pipelines work together to drive only one display output.
+
+    In this mode, pipeline_B doesn't work independently, but outputs its
+    composition result into pipeline_A, and its pixel timing is also derived
+    from pipeline_A.timing_ctrlr. Pipeline_B works just like a "slave" of
+    pipeline_A (the master).
+
+Single pipeline data flow
+-------------------------
+
+.. kernel-render:: DOT
+   :alt: Single pipeline digraph
+   :caption: Single pipeline data flow
+
+   digraph single_ppl {
+      rankdir=LR;
+
+      subgraph {
+         "Memory";
+         "Monitor";
+      }
+
+      subgraph cluster_pipeline {
+          style=dashed
+          node [shape=box]
+          {
+              node [bgcolor=grey style=dashed]
+              "Scaler-0";
+              "Scaler-1";
+              "Scaler-0/1"
+          }
+
+         node [bgcolor=grey style=filled]
+         "Layer-0" -> "Scaler-0"
+         "Layer-1" -> "Scaler-0"
+         "Layer-2" -> "Scaler-1"
+         "Layer-3" -> "Scaler-1"
+
+         "Layer-0" -> "Compiz"
+         "Layer-1" -> "Compiz"
+         "Layer-2" -> "Compiz"
+         "Layer-3" -> "Compiz"
+         "Scaler-0" -> "Compiz"
+         "Scaler-1" -> "Compiz"
+
+         "Compiz" -> "Scaler-0/1" -> "Wb_layer"
+         "Compiz" -> "Improc" -> "Timing Controller"
+      }
+
+      "Wb_layer" -> "Memory"
+      "Timing Controller" -> "Monitor"
+   }
+
+Dual pipeline with Slave enabled
+--------------------------------
+
+.. kernel-render:: DOT
+   :alt: Slave pipeline digraph
+   :caption: Slave pipeline enabled data flow
+
+   digraph slave_ppl {
+      rankdir=LR;
+
+      subgraph {
+         "Memory";
+         "Monitor";
+      }
+      node [shape=box]
+      subgraph cluster_pipeline_slave {
+          style=dashed
+          label="Slave Pipeline_B"
+          node [shape=box]
+          {
+              node [bgcolor=grey style=dashed]
+              "Slave.Scaler-0";
+              "Slave.Scaler-1";
+          }
+
+         node [bgcolor=grey style=filled]
+         "Slave.Layer-0" -> "Slave.Scaler-0"
+         "Slave.Layer-1" -> "Slave.Scaler-0"
+         "Slave.Layer-2" -> "Slave.Scaler-1"
+         "Slave.Layer-3" -> "Slave.Scaler-1"
+
+         "Slave.Layer-0" -> "Slave.Compiz"
+         "Slave.Layer-1" -> "Slave.Compiz"
+         "Slave.Layer-2" -> "Slave.Compiz"
+         "Slave.Layer-3" -> "Slave.Compiz"
+         "Slave.Scaler-0" -> "Slave.Compiz"
+         "Slave.Scaler-1" -> "Slave.Compiz"
+      }
+
+      subgraph cluster_pipeline_master {
+          style=dashed
+          label="Master Pipeline_A"
+          node [shape=box]
+          {
+              node [bgcolor=grey style=dashed]
+              "Scaler-0";
+              "Scaler-1";
+              "Scaler-0/1"
+          }
+
+         node [bgcolor=grey style=filled]
+         "Layer-0" -> "Scaler-0"
+         "Layer-1" -> "Scaler-0"
+         "Layer-2" -> "Scaler-1"
+         "Layer-3" -> "Scaler-1"
+
+         "Slave.Compiz" -> "Compiz"
+         "Layer-0" -> "Compiz"
+         "Layer-1" -> "Compiz"
+         "Layer-2" -> "Compiz"
+         "Layer-3" -> "Compiz"
+         "Scaler-0" -> "Compiz"
+         "Scaler-1" -> "Compiz"
+
+         "Compiz" -> "Scaler-0/1" -> "Wb_layer"
+         "Compiz" -> "Improc" -> "Timing Controller"
+      }
+
+      "Wb_layer" -> "Memory"
+      "Timing Controller" -> "Monitor"
+   }
+
+Sub-pipelines for input and output
+----------------------------------
+
+A complete display pipeline can be easily divided into three sub-pipelines
+according to the in/out usage.
+
+Layer(input) pipeline
+~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-render:: DOT
+   :alt: Layer data digraph
+   :caption: Layer (input) data flow
+
+   digraph layer_data_flow {
+      rankdir=LR;
+      node [shape=box]
+
+      {
+         node [bgcolor=grey style=dashed]
+           "Scaler-n";
+      }
+
+      "Layer-n" -> "Scaler-n" -> "Compiz"
+   }
+
+.. kernel-render:: DOT
+   :alt: Layer Split digraph
+   :caption: Layer Split pipeline
+
+   digraph layer_data_flow {
+      rankdir=LR;
+      node [shape=box]
+
+      "Layer-0/1" -> "Scaler-0" -> "Merger"
+      "Layer-2/3" -> "Scaler-1" -> "Merger"
+      "Merger" -> "Compiz"
+   }
+
+Writeback(output) pipeline
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-render:: DOT
+   :alt: writeback digraph
+   :caption: Writeback(output) data flow
+
+   digraph writeback_data_flow {
+      rankdir=LR;
+      node [shape=box]
+
+      {
+         node [bgcolor=grey style=dashed]
+           "Scaler-n";
+      }
+
+      "Compiz" -> "Scaler-n" -> "Wb_layer"
+   }
+
+.. kernel-render:: DOT
+   :alt: split writeback digraph
+   :caption: Writeback(output) Split data flow
+
+   digraph writeback_data_flow {
+      rankdir=LR;
+      node [shape=box]
+
+      "Compiz" -> "Splitter"
+      "Splitter" -> "Scaler-0" -> "Merger"
+      "Splitter" -> "Scaler-1" -> "Merger"
+      "Merger" -> "Wb_layer"
+   }
+
+Display output pipeline
+~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-render:: DOT
+   :alt: display digraph
+   :caption: display output data flow
+
+   digraph single_ppl {
+      rankdir=LR;
+      node [shape=box]
+
+      "Compiz" -> "Improc" -> "Timing Controller"
+   }
+
+In the following sections we'll see that these three sub-pipelines are handled
+by KMS-plane/wb_conn/crtc respectively.
+
+Komeda Resource abstraction
+===========================
+
+struct komeda_pipeline/component
+--------------------------------
+
+To fully utilize and easily access/configure the HW, the driver side also uses
+a similar architecture: Pipeline/Component to describe the HW features and
+capabilities. A specific component includes two parts:
+
+-  Data flow controlling.
+-  Specific component capabilities and features.
+
+So the driver defines a common header struct komeda_component to describe the
+data flow control, and all specific components are subclasses of this base
+structure.
+
+.. kernel-doc:: drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h
+   :internal:
+
+Resource discovery and initialization
+=====================================
+
+Pipeline and component are used to describe how to handle the pixel data. We
+still need a @struct komeda_dev to describe the whole view of the device and
+the control abilities of the device.
+
+We have &komeda_dev, &komeda_pipeline and &komeda_component; now we fill the
+device with pipelines. Since komeda is not only for D71 but also intended for
+later products, we'd better share as much as possible between different
+products. To achieve this, the komeda device is split into two layers: CORE
+and CHIP.
+
+-   CORE: for common features and capabilities handling.
+-   CHIP: for register programming and HW specific feature (limitation) handling.
+
+CORE can access CHIP through three chip function structures:
+
+-   struct komeda_dev_funcs
+-   struct komeda_pipeline_funcs
+-   struct komeda_component_funcs
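+
+A hedged sketch of how a CHIP might plug its implementation into CORE through
+these tables (the D71 hook names and the ``funcs`` member below are
+illustrative assumptions, not the actual komeda definitions):
+
+.. code-block:: c
+
+    /* CHIP side: D71-specific implementation of a CORE-visible hook */
+    static int d71_enum_resources(struct komeda_dev *mdev)
+    {
+        /* read ID/config registers, create pipelines and components ... */
+        return 0;
+    }
+
+    static const struct komeda_dev_funcs d71_dev_funcs = {
+        .enum_resources = d71_enum_resources,
+    };
+
+    /* CORE side: never touches registers directly, only calls the table */
+    static int komeda_dev_init(struct komeda_dev *mdev)
+    {
+        return mdev->funcs->enum_resources(mdev);
+    }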
+
+.. kernel-doc:: drivers/gpu/drm/arm/display/komeda/komeda_dev.h
+   :internal:
+
+Format handling
+===============
+
+.. kernel-doc:: drivers/gpu/drm/arm/display/komeda/komeda_format_caps.h
+   :internal:
+.. kernel-doc:: drivers/gpu/drm/arm/display/komeda/komeda_framebuffer.h
+   :internal:
+
+Attach komeda_dev to DRM-KMS
+============================
+
+Komeda abstracts resources by pipeline/component, but DRM-KMS uses
+crtc/plane/connector. One KMS object cannot simply be represented by one
+single component, since the requirements of a single KMS object usually cannot
+be met by a single component but need multiple components working together.
+For example, set mode, gamma and ctm for KMS all target the CRTC object, but
+komeda needs compiz, improc and timing_ctrlr to work together to meet these
+requirements. And a KMS-Plane may require multiple komeda resources:
+layer/scaler/compiz.
+
+So, one KMS-Obj represents a sub-pipeline of komeda resources.
+
+-   Plane: `Layer(input) pipeline`_
+-   Wb_connector: `Writeback(output) pipeline`_
+-   Crtc: `Display output pipeline`_
+
+So, for komeda, we treat KMS crtc/plane/connector as users of pipeline and
+component, and at any one time a pipeline/component can only be used by one
+user. Pipeline/component will be treated as private objects of DRM-KMS; their
+state will be managed by drm_atomic_state as well.
+
+How to map plane to Layer(input) pipeline
+-----------------------------------------
+
+Komeda has multiple Layer input pipelines, see:
+
+-   `Single pipeline data flow`_
+-   `Dual pipeline with Slave enabled`_
+
+The easiest way is binding a plane to a fixed Layer pipeline, but consider the
+komeda capabilities:
+
+-   Layer Split, see `Layer(input) pipeline`_
+
+    Layer Split is a quite complicated feature: it splits a big image into two
+    parts and handles them with two layers and two scalers individually. But
+    it introduces an edge problem or effect in the middle of the image after
+    the split. To avoid such a problem, it needs a complicated split
+    calculation and some special configuration of the layer and scaler. We'd
+    better hide such HW-related complexity from user mode.
+
+-   Slave pipeline, see `Dual pipeline with Slave enabled`_
+
+    Since the compiz component doesn't output an alpha value, the slave
+    pipeline can only be used for composing the bottom layers. The komeda
+    driver wants to hide this limitation from the user. The way to do this is
+    to pick a suitable Layer according to plane_state->zpos.
+
+So for komeda, a KMS-plane doesn't represent a fixed komeda layer pipeline,
+but multiple Layers with the same capabilities. Komeda will select one or more
+Layers to meet the requirements of one KMS-plane.
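+
+A hedged sketch of that zpos-based selection (the ``n_layers``/``layers[]``
+layout is an assumption about struct komeda_pipeline, not the real
+definition):
+
+.. code-block:: c
+
+    /* pick the Layer whose position matches the plane's zpos */
+    static struct komeda_layer *
+    komeda_pipeline_get_layer_by_zpos(struct komeda_pipeline *pipe, int zpos)
+    {
+        if (zpos < 0 || zpos >= pipe->n_layers)
+            return NULL;
+
+        return pipe->layers[zpos];
+    }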
+
+Make component/pipeline to be drm_private_obj
+---------------------------------------------
+
+Add :c:type:`drm_private_obj` to :c:type:`komeda_component`, :c:type:`komeda_pipeline`.
+
+.. code-block:: c
+
+    struct komeda_component {
+        struct drm_private_obj obj;
+        ...
+    };
+
+    struct komeda_pipeline {
+        struct drm_private_obj obj;
+        ...
+    };
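+
+A hedged sketch of wiring such an embedded object into the DRM core (only
+drm_atomic_private_obj_init() and struct drm_private_state_funcs are core DRM
+API here; the duplicate/destroy helper names are assumptions):
+
+.. code-block:: c
+
+    static const struct drm_private_state_funcs komeda_component_obj_funcs = {
+        /* hypothetical helpers that kmemdup()/kfree() the component state */
+        .atomic_duplicate_state = komeda_component_duplicate_state,
+        .atomic_destroy_state   = komeda_component_destroy_state,
+    };
+
+    static int komeda_component_obj_add(struct drm_device *drm,
+                                        struct komeda_component *c)
+    {
+        struct komeda_component_state *st;
+
+        st = kzalloc(sizeof(*st), GFP_KERNEL);
+        if (!st)
+            return -ENOMEM;
+
+        drm_atomic_private_obj_init(drm, &c->obj, &st->obj,
+                                    &komeda_component_obj_funcs);
+        return 0;
+    }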
+
+Tracking component_state/pipeline_state by drm_atomic_state
+-----------------------------------------------------------
+
+Add :c:type:`drm_private_state` and a binding user to
+:c:type:`komeda_component_state`, :c:type:`komeda_pipeline_state`.
+
+.. code-block:: c
+
+    struct komeda_component_state {
+        struct drm_private_state obj;
+        void *binding_user;
+        ...
+    };
+
+    struct komeda_pipeline_state {
+        struct drm_private_state obj;
+        struct drm_crtc *crtc;
+        ...
+    };
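+
+With that in place, the komeda state can be fetched from a drm_atomic_state
+like any other private object state; a minimal sketch (the wrapper name is an
+assumption, drm_atomic_get_private_obj_state() is the core helper):
+
+.. code-block:: c
+
+    static struct komeda_pipeline_state *
+    komeda_pipeline_get_state(struct komeda_pipeline *pipe,
+                              struct drm_atomic_state *state)
+    {
+        struct drm_private_state *priv_st;
+
+        priv_st = drm_atomic_get_private_obj_state(state, &pipe->obj);
+        if (IS_ERR(priv_st))
+            return ERR_CAST(priv_st);
+
+        return container_of(priv_st, struct komeda_pipeline_state, obj);
+    }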
+
+komeda component validation
+---------------------------
+
+Komeda has multiple types of components, but the process of validation is
+similar, usually including the following steps:
+
+.. code-block:: c
+
+    int komeda_xxxx_validate(struct komeda_component_xxx xxx_comp,
+                struct komeda_component_output *input_dflow,
+                struct drm_plane/crtc/connector *user,
+                struct drm_plane/crtc/connector_state *user_state)
+    {
+         step 1: check if the component is needed; e.g. the scaler is optional
+                 depending on the user_state. If unneeded, just return, and the
+                 caller will put the data flow into the next stage.
+         step 2: check the user_state against the component features and
+                 capabilities to see if the requirements can be met; if not,
+                 return failure.
+         step 3: get the component_state from drm_atomic_state, and try to set
+                 the user to the component; fail if the component has already
+                 been assigned to another user.
+         step 4: configure the component_state, e.g. set its input component,
+                 convert the user_state to a component specific state.
+         step 5: adjust the input_dflow and prepare it for the next stage.
+    }
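+
+As a concrete illustration of step 3, a hedged sketch of the binding check
+(``binding_user`` comes from komeda_component_state above; the helper name is
+an assumption):
+
+.. code-block:: c
+
+    static int komeda_component_set_user(struct komeda_component_state *st,
+                                         void *user)
+    {
+        /* a component can only serve one user at any one time */
+        if (st->binding_user && st->binding_user != user)
+            return -EBUSY;
+
+        st->binding_user = user;
+        return 0;
+    }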
+
+komeda_kms Abstraction
+----------------------
+
+.. kernel-doc:: drivers/gpu/drm/arm/display/komeda/komeda_kms.h
+   :internal:
+
+komeda_kms Functions
+--------------------
+
+.. kernel-doc:: drivers/gpu/drm/arm/display/komeda/komeda_crtc.c
+   :internal:
+.. kernel-doc:: drivers/gpu/drm/arm/display/komeda/komeda_plane.c
+   :internal:
+
+Build komeda to be a Linux module driver
+========================================
+
+Now we have two levels of devices:
+
+-   komeda_dev: describes the real display hardware.
+-   komeda_kms_dev: attaches or connects komeda_dev to DRM-KMS.
+
+All komeda operations are supplied or performed by komeda_dev or
+komeda_kms_dev; the module driver is only a simple wrapper that passes the
+Linux commands (probe/remove/pm) to komeda_dev or komeda_kms_dev.
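+
+A hedged sketch of such a thin wrapper (the komeda_dev/komeda_kms create and
+destroy helper names, the ``kms`` member and the driver name are illustrative
+assumptions):
+
+.. code-block:: c
+
+    static int komeda_platform_probe(struct platform_device *pdev)
+    {
+        struct komeda_dev *mdev;
+
+        mdev = komeda_dev_create(&pdev->dev);   /* build the HW view */
+        if (IS_ERR(mdev))
+            return PTR_ERR(mdev);
+
+        mdev->kms = komeda_kms_attach(mdev);    /* attach DRM-KMS */
+        if (IS_ERR(mdev->kms)) {
+            komeda_dev_destroy(mdev);
+            return PTR_ERR(mdev->kms);
+        }
+
+        platform_set_drvdata(pdev, mdev);
+        return 0;
+    }
+
+    static struct platform_driver komeda_platform_driver = {
+        .probe  = komeda_platform_probe,
+        .driver = { .name = "komeda" },
+    };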
diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst
index 14191b64446d..159a4aba49e6 100644
--- a/Documentation/gpu/todo.rst
+++ b/Documentation/gpu/todo.rst
@@ -82,30 +82,6 @@ events for atomic commits correctly. But fixing these bugs is good anyway.
 
 Contact: Daniel Vetter, respective driver maintainers
 
-Better manual-upload support for atomic
----------------------------------------
-
-This would be especially useful for tinydrm:
-
-- Add a struct drm_rect dirty_clip to drm_crtc_state. When duplicating the
-  crtc state, clear that to the max values, x/y = 0 and w/h = MAX_INT, in
-  __drm_atomic_helper_crtc_duplicate_state().
-
-- Move tinydrm_merge_clips into drm_framebuffer.c, dropping the tinydrm\_
-  prefix ofc and using drm_fb\_. drm_framebuffer.c makes sense since this
-  is a function useful to implement the fb->dirty function.
-
-- Create a new drm_fb_dirty function which does essentially what e.g.
-  mipi_dbi_fb_dirty does. You can use e.g. drm_atomic_helper_update_plane as the
-  template. But instead of doing a simple full-screen plane update, this new
-  helper also sets crtc_state->dirty_clip to the right coordinates. And of
-  course it needs to check whether the fb is actually active (and maybe where),
-  so there's some book-keeping involved. There's also some good fun involved in
-  scaling things appropriately. For that case we might simply give up and
-  declare the entire area covered by the plane as dirty.
-
-Contact: Noralf Trønnes, Daniel Vetter
-
 Fallout from atomic KMS
 -----------------------
 
@@ -209,6 +185,36 @@ Would be great to refactor this all into a set of small common helpers.
 
 Contact: Daniel Vetter
 
+Generic fbdev defio support
+---------------------------
+
+The defio support code in the fbdev core has some very specific requirements,
+which means drivers need to have a special framebuffer for fbdev, and that
+prevents us from using the generic fbdev emulation code everywhere. The main
+issue is that it uses some fields in struct page itself, which breaks shmem
+gem objects (and other things).
+
+A possible solution would be to write our own defio mmap code in the drm fbdev
+emulation. It would need to fully wrap the existing mmap ops, forwarding
+everything after it has done the write-protect/mkwrite trickery:
+
+- In the drm_fbdev_fb_mmap helper, if we need defio, change the
+  default page prots to write-protected with something like this::
+
+      vma->vm_page_prot = pgprot_wrprotect(vma->vm_page_prot);
+
+- Set the mkwrite and fsync callbacks with similar implementations to the core
+  fbdev defio stuff. These should all work on plain ptes, they don't actually
+  require a struct page.
+
+- Track the dirty pages in a separate structure (bitfield with one bit per page
+  should work) to avoid clobbering struct page.
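+
+A hedged sketch of that last point, assuming a per-helper bitfield and a flush
+worker (every name here is made up for illustration)::
+
+      static vm_fault_t drm_fbdev_defio_mkwrite(struct vm_fault *vmf)
+      {
+              struct drm_fb_helper *helper = vmf->vma->vm_private_data;
+
+              /* one bit per page instead of clobbering struct page */
+              set_bit(vmf->pgoff, helper->defio_dirty_pages);
+              schedule_work(&helper->dirty_work);
+
+              return 0;
+      }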
+
+Might be good to also have some igt testcases for this.
+
+Contact: Daniel Vetter, Noralf Tronnes
+
 Put a reservation_object into drm_gem_object
 --------------------------------------------
 
@@ -256,6 +262,44 @@ As a reference, take a look at the conversions already completed in drm core.
 
 Contact: Sean Paul, respective driver maintainers
 
+Rename CMA helpers to DMA helpers
+---------------------------------
+
+CMA (standing for contiguous memory allocator) is really a bit of an accident
+of what these were used for first, a much better name would be DMA helpers. In
+the text these should even be called coherent DMA memory helpers (so maybe
+CDM, but no one knows what that means) since underneath they just use
+dma_alloc_coherent.
+
+Contact: Laurent Pinchart, Daniel Vetter
+
+Convert direct mode.vrefresh accesses to use drm_mode_vrefresh()
+----------------------------------------------------------------
+
+drm_display_mode.vrefresh isn't guaranteed to be populated. As such, using it
+is risky and has been known to cause div-by-zero bugs. Fortunately, drm core
+has a helper which will use mode.vrefresh if it's !0 and will calculate it
+from the timings when it's 0.
+
+Use simple search/replace, or (more fun) cocci to replace instances of direct
+vrefresh access with a call to the helper. Check out
+https://lists.freedesktop.org/archives/dri-devel/2019-January/205186.html for
+inspiration.
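+
+A hedged before/after sketch of the conversion (drm_mode_vrefresh() is the
+existing core helper)::
+
+      /* before: mode->vrefresh may simply be 0 */
+      interval_ms = 1000 / mode->vrefresh;
+
+      /* after: falls back to calculating from the timings */
+      interval_ms = 1000 / drm_mode_vrefresh(mode);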
+
+Once all instances of vrefresh have been converted, remove vrefresh from
+drm_display_mode to avoid future use.
+
+Contact: Sean Paul
+
+Remove drm_display_mode.hsync
+-----------------------------
+
+We have drm_mode_hsync() to calculate this from hsync_start/end. Since drivers
+shouldn't/don't use this, remove this member to avoid any temptation to use it
+in the future. If there is any debug code using drm_display_mode.hsync, convert
+it to use drm_mode_hsync() instead.
+
+Contact: Sean Paul
+
 Core refactorings
 =================
 
@@ -354,13 +398,6 @@ KMS cleanups
 
 Some of these date from the very introduction of KMS in 2008 ...
 
-- drm_mode_config.crtc_idr is misnamed, since it contains all KMS object. Should
-  be renamed to drm_mode_config.object_idr.
-
-- drm_display_mode doesn't need to be derived from drm_mode_object. That's
-  leftovers from older (never merged into upstream) KMS designs where modes
-  where set using their ID, including support to add/remove modes.
-
 - Make ->funcs and ->helper_private vtables optional. There's a bunch of empty
   function tables in drivers, but before we can remove them we need to make sure
   that all the users in helpers and drivers do correctly check for a NULL
@@ -432,21 +469,10 @@ those drivers as simple as possible, so lots of room for refactoring:
   one of the ideas for having a shared dsi/dbi helper, abstracting away the
   transport details more.
 
-- tinydrm_gem_cma_prime_import_sg_table should probably go into the cma
-  helpers, as a _vmapped variant (since not every driver needs the vmap).
-  And tinydrm_gem_cma_free_object could the be merged into
-  drm_gem_cma_free_object().
-
-- tinydrm_fb_create we could move into drm_simple_pipe, only need to add
-  the fb_create hook to drm_simple_pipe_funcs, which would again simplify a
-  bunch of things (since it gives you a one-stop vfunc for simple drivers).
-
 - Quick aside: The unregister devm stuff is kinda getting the lifetimes of
   a drm_device wrong. Doesn't matter, since everyone else gets it wrong
   too :-)
 
-- also rework the drm_framebuffer_funcs->dirty hook wire-up, see above.
-
 Contact: Noralf Trønnes, Daniel Vetter
 
 AMD DC Display Driver
diff --git a/Documentation/gpu/vkms.rst b/Documentation/gpu/vkms.rst
index 7dfc349a4508..61586fc861bb 100644
--- a/Documentation/gpu/vkms.rst
+++ b/Documentation/gpu/vkms.rst
@@ -23,17 +23,6 @@ CRC API Improvements
 - Add igt test to check extreme alpha values i.e. fully opaque and fully
   transparent (intermediate values are affected by hw-specific rounding modes).
 
-Vblank issues
--------------
-
-Some IGT test cases are failing. Need to analyze why and fix the issues:
-
-- plain-flip-fb-recreate
-- plain-flip-ts-check
-- flip-vs-blocking-wf-vblank
-- plain-flip-fb-recreate-interruptible
-- flip-vs-wf_vblank-interruptible
-
 Runtime Configuration
 ---------------------
 
diff --git a/MAINTAINERS b/MAINTAINERS
index fce33cc179b0..51bbae5ef2ba 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1149,13 +1149,26 @@ S:	Supported
 F:	drivers/gpu/drm/arm/hdlcd_*
 F:	Documentation/devicetree/bindings/display/arm,hdlcd.txt
 
+ARM KOMEDA DRM-KMS DRIVER
+M:	James (Qian) Wang <james.qian.wang@arm.com>
+M:	Liviu Dudau <liviu.dudau@arm.com>
+L:	Mali DP Maintainers <malidp@foss.arm.com>
+S:	Supported
+T:	git git://linux-arm.org/linux-ld.git for-upstream/mali-dp
+F:	drivers/gpu/drm/arm/display/include/
+F:	drivers/gpu/drm/arm/display/komeda/
+F:	Documentation/devicetree/bindings/display/arm/arm,komeda.txt
+F:	Documentation/gpu/komeda-kms.rst
+
 ARM MALI-DP DRM DRIVER
 M:	Liviu Dudau <liviu.dudau@arm.com>
 M:	Brian Starkey <brian.starkey@arm.com>
-M:	Mali DP Maintainers <malidp@foss.arm.com>
+L:	Mali DP Maintainers <malidp@foss.arm.com>
 S:	Supported
+T:	git git://linux-arm.org/linux-ld.git for-upstream/mali-dp
 F:	drivers/gpu/drm/arm/
 F:	Documentation/devicetree/bindings/display/arm,malidp.txt
+F:	Documentation/gpu/afbc.rst
 
 ARM MFM AND FLOPPY DRIVERS
 M:	Ian Molton <spyro@f2s.com>
@@ -4900,10 +4913,11 @@ F:	Documentation/devicetree/bindings/display/multi-inno,mi0283qt.txt
 
 DRM DRIVER FOR MSM ADRENO GPU
 M:	Rob Clark <robdclark@gmail.com>
+M:	Sean Paul <sean@poorly.run>
 L:	linux-arm-msm@vger.kernel.org
 L:	dri-devel@lists.freedesktop.org
 L:	freedreno@lists.freedesktop.org
-T:	git git://people.freedesktop.org/~robclark/linux
+T:	git https://gitlab.freedesktop.org/drm/msm.git
 S:	Maintained
 F:	drivers/gpu/drm/msm/
 F:	include/uapi/drm/msm_drm.h
@@ -4943,6 +4957,7 @@ DRM DRIVER FOR QXL VIRTUAL GPU
 M:	Dave Airlie <airlied@redhat.com>
 M:	Gerd Hoffmann <kraxel@redhat.com>
 L:	virtualization@lists.linux-foundation.org
+L:	spice-devel@lists.freedesktop.org
 T:	git git://anongit.freedesktop.org/drm/drm-misc
 S:	Maintained
 F:	drivers/gpu/drm/qxl/
@@ -4963,6 +4978,12 @@ S:	Orphan / Obsolete
 F:	drivers/gpu/drm/sis/
 F:	include/uapi/drm/sis_drm.h
 
+DRM DRIVER FOR SITRONIX ST7701 PANELS
+M:	Jagan Teki <jagan@amarulasolutions.com>
+S:	Maintained
+F:	drivers/gpu/drm/panel/panel-sitronix-st7701.c
+F:	Documentation/devicetree/bindings/display/panel/sitronix,st7701.txt
+
 DRM DRIVER FOR SITRONIX ST7586 PANELS
 M:	David Lechner <david@lechnology.com>
 S:	Maintained
@@ -4979,6 +5000,13 @@ DRM DRIVER FOR TDFX VIDEO CARDS
 S:	Orphan / Obsolete
 F:	drivers/gpu/drm/tdfx/
 
+DRM DRIVER FOR TPO TPG110 PANELS
+M:	Linus Walleij <linus.walleij@linaro.org>
+T:	git git://anongit.freedesktop.org/drm/drm-misc
+S:	Maintained
+F:	drivers/gpu/drm/panel/panel-tpo-tpg110.c
+F:	Documentation/devicetree/bindings/display/panel/tpo,tpg110.txt
+
 DRM DRIVER FOR USB DISPLAYLINK VIDEO ADAPTERS
 M:	Dave Airlie <airlied@redhat.com>
 R:	Sean Paul <sean@poorly.run>
@@ -4987,6 +5015,16 @@ S:	Odd Fixes
 F:	drivers/gpu/drm/udl/
 T:	git git://anongit.freedesktop.org/drm/drm-misc
 
+DRM DRIVER FOR VIRTUAL KERNEL MODESETTING (VKMS)
+M:	Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
+R:	Haneen Mohammed <hamohammed.sa@gmail.com>
+R:	Daniel Vetter <daniel@ffwll.ch>
+T:	git git://anongit.freedesktop.org/drm/drm-misc
+S:	Maintained
+L:	dri-devel@lists.freedesktop.org
+F:	drivers/gpu/drm/vkms/
+F:	Documentation/gpu/vkms.rst
+
 DRM DRIVER FOR VMWARE VIRTUAL GPU
 M:	"VMware Graphics" <linux-graphics-maintainer@vmware.com>
 M:	Thomas Hellstrom <thellstrom@vmware.com>
@@ -5056,7 +5094,6 @@ F:	Documentation/devicetree/bindings/display/atmel/
 T:	git git://anongit.freedesktop.org/drm/drm-misc
 
 DRM DRIVERS FOR BRIDGE CHIPS
-M:	Archit Taneja <architt@codeaurora.org>
 M:	Andrzej Hajda <a.hajda@samsung.com>
 R:	Laurent Pinchart <Laurent.pinchart@ideasonboard.com>
 S:	Maintained
diff --git a/drivers/acpi/pmic/intel_pmic.c b/drivers/acpi/pmic/intel_pmic.c
index ca18e0d23df9..c14cfaea92e2 100644
--- a/drivers/acpi/pmic/intel_pmic.c
+++ b/drivers/acpi/pmic/intel_pmic.c
@@ -15,6 +15,7 @@
 
 #include <linux/export.h>
 #include <linux/acpi.h>
+#include <linux/mfd/intel_soc_pmic.h>
 #include <linux/regmap.h>
 #include <acpi/acpi_lpat.h>
 #include "intel_pmic.h"
@@ -36,6 +37,8 @@ struct intel_pmic_opregion {
 	struct intel_pmic_regs_handler_ctx ctx;
 };
 
+static struct intel_pmic_opregion *intel_pmic_opregion;
+
 static int pmic_get_reg_bit(int address, struct pmic_table *table,
 			    int count, int *reg, int *bit)
 {
@@ -304,6 +307,7 @@ int intel_pmic_install_opregion_handler(struct device *dev, acpi_handle handle,
 	}
 
 	opregion->data = d;
+	intel_pmic_opregion = opregion;
 	return 0;
 
 out_remove_thermal_handler:
@@ -319,3 +323,60 @@ out_error:
 	return ret;
 }
 EXPORT_SYMBOL_GPL(intel_pmic_install_opregion_handler);
+
+/**
+ * intel_soc_pmic_exec_mipi_pmic_seq_element - Execute PMIC MIPI sequence
+ * @i2c_address:  I2C client address for the PMIC
+ * @reg_address:  PMIC register address
+ * @value:        New value for the register bits to change
+ * @mask:         Mask indicating which register bits to change
+ *
+ * DSI LCD panels describe an initialization sequence in the i915 VBT (Video
+ * BIOS Tables) using so called MIPI sequences. One possible element in these
+ * sequences is a PMIC specific element of 15 bytes.
+ *
+ * This function executes these PMIC specific elements sending the embedded
+ * commands to the PMIC.
+ *
+ * Return 0 on success, < 0 on failure.
+ */
+int intel_soc_pmic_exec_mipi_pmic_seq_element(u16 i2c_address, u32 reg_address,
+					      u32 value, u32 mask)
+{
+	struct intel_pmic_opregion_data *d;
+	int ret;
+
+	if (!intel_pmic_opregion) {
+		pr_warn("%s: No PMIC registered\n", __func__);
+		return -ENXIO;
+	}
+
+	d = intel_pmic_opregion->data;
+
+	mutex_lock(&intel_pmic_opregion->lock);
+
+	if (d->exec_mipi_pmic_seq_element) {
+		ret = d->exec_mipi_pmic_seq_element(intel_pmic_opregion->regmap,
+						    i2c_address, reg_address,
+						    value, mask);
+	} else if (d->pmic_i2c_address) {
+		if (i2c_address == d->pmic_i2c_address) {
+			ret = regmap_update_bits(intel_pmic_opregion->regmap,
+						 reg_address, mask, value);
+		} else {
+			pr_err("%s: Unexpected i2c-addr: 0x%02x (reg-addr 0x%x value 0x%x mask 0x%x)\n",
+			       __func__, i2c_address, reg_address, value, mask);
+			ret = -ENXIO;
+		}
+	} else {
+		pr_warn("%s: Not implemented\n", __func__);
+		pr_warn("%s: i2c-addr: 0x%x reg-addr 0x%x value 0x%x mask 0x%x\n",
+			__func__, i2c_address, reg_address, value, mask);
+		ret = -EOPNOTSUPP;
+	}
+
+	mutex_unlock(&intel_pmic_opregion->lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(intel_soc_pmic_exec_mipi_pmic_seq_element);
diff --git a/drivers/acpi/pmic/intel_pmic.h b/drivers/acpi/pmic/intel_pmic.h
index 095afc96952e..89379476a1df 100644
--- a/drivers/acpi/pmic/intel_pmic.h
+++ b/drivers/acpi/pmic/intel_pmic.h
@@ -15,10 +15,14 @@ struct intel_pmic_opregion_data {
 	int (*update_aux)(struct regmap *r, int reg, int raw_temp);
 	int (*get_policy)(struct regmap *r, int reg, int bit, u64 *value);
 	int (*update_policy)(struct regmap *r, int reg, int bit, int enable);
+	int (*exec_mipi_pmic_seq_element)(struct regmap *r, u16 i2c_address,
+					  u32 reg_address, u32 value, u32 mask);
 	struct pmic_table *power_table;
 	int power_table_count;
 	struct pmic_table *thermal_table;
 	int thermal_table_count;
+	/* For generic exec_mipi_pmic_seq_element handling */
+	int pmic_i2c_address;
 };
 
 int intel_pmic_install_opregion_handler(struct device *dev, acpi_handle handle, struct regmap *regmap, struct intel_pmic_opregion_data *d);
diff --git a/drivers/acpi/pmic/intel_pmic_chtwc.c b/drivers/acpi/pmic/intel_pmic_chtwc.c
index 078b0448f30a..7ffd5624b8e1 100644
--- a/drivers/acpi/pmic/intel_pmic_chtwc.c
+++ b/drivers/acpi/pmic/intel_pmic_chtwc.c
@@ -231,6 +231,24 @@ static int intel_cht_wc_pmic_update_power(struct regmap *regmap, int reg,
 	return regmap_update_bits(regmap, reg, bitmask, on ? 1 : 0);
 }
 
+static int intel_cht_wc_exec_mipi_pmic_seq_element(struct regmap *regmap,
+						   u16 i2c_client_address,
+						   u32 reg_address,
+						   u32 value, u32 mask)
+{
+	u32 address;
+
+	if (i2c_client_address > 0xff || reg_address > 0xff) {
+		pr_warn("%s warning addresses too big client 0x%x reg 0x%x\n",
+			__func__, i2c_client_address, reg_address);
+		return -ERANGE;
+	}
+
+	address = (i2c_client_address << 8) | reg_address;
+
+	return regmap_update_bits(regmap, address, mask, value);
+}
+
 /*
  * The thermal table and ops are empty, we do not support the Thermal opregion
  * (DPTF) due to lacking documentation.
@@ -238,6 +256,7 @@ static int intel_cht_wc_pmic_update_power(struct regmap *regmap, int reg,
 static struct intel_pmic_opregion_data intel_cht_wc_pmic_opregion_data = {
 	.get_power		= intel_cht_wc_pmic_get_power,
 	.update_power		= intel_cht_wc_pmic_update_power,
+	.exec_mipi_pmic_seq_element = intel_cht_wc_exec_mipi_pmic_seq_element,
 	.power_table		= power_table,
 	.power_table_count	= ARRAY_SIZE(power_table),
 };
diff --git a/drivers/acpi/pmic/intel_pmic_xpower.c b/drivers/acpi/pmic/intel_pmic_xpower.c
index e7c0006e6602..a091d5a8392c 100644
--- a/drivers/acpi/pmic/intel_pmic_xpower.c
+++ b/drivers/acpi/pmic/intel_pmic_xpower.c
@@ -265,6 +265,7 @@ static struct intel_pmic_opregion_data intel_xpower_pmic_opregion_data = {
 	.power_table_count = ARRAY_SIZE(power_table),
 	.thermal_table = thermal_table,
 	.thermal_table_count = ARRAY_SIZE(thermal_table),
+	.pmic_i2c_address = 0x34,
 };
 
 static acpi_status intel_xpower_pmic_gpio_handler(u32 function,
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 02f7f9a89979..7c858020d14b 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -1093,17 +1093,7 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
 	return 0;
 }
 
-static int dma_buf_debug_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, dma_buf_debug_show, NULL);
-}
-
-static const struct file_operations dma_buf_debug_fops = {
-	.open           = dma_buf_debug_open,
-	.read           = seq_read,
-	.llseek         = seq_lseek,
-	.release        = single_release,
-};
+DEFINE_SHOW_ATTRIBUTE(dma_buf_debug);
 
 static struct dentry *dma_buf_debugfs_dir;
 
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 136ec04d683f..3aa8733f832a 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -649,7 +649,7 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
  */
 void
 dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
-	       spinlock_t *lock, u64 context, unsigned seqno)
+	       spinlock_t *lock, u64 context, u64 seqno)
 {
 	BUG_ON(!lock);
 	BUG_ON(!ops || !ops->get_driver_name || !ops->get_timeline_name);
diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
index 53c1d6d36a64..32dcf7b4c935 100644
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -172,7 +172,7 @@ static bool timeline_fence_enable_signaling(struct dma_fence *fence)
 static void timeline_fence_value_str(struct dma_fence *fence,
 				    char *str, int size)
 {
-	snprintf(str, size, "%d", fence->seqno);
+	snprintf(str, size, "%lld", fence->seqno);
 }
 
 static void timeline_fence_timeline_value_str(struct dma_fence *fence,
diff --git a/drivers/dma-buf/sync_debug.c b/drivers/dma-buf/sync_debug.c
index c4c8ecb24aa9..c0abf37df88b 100644
--- a/drivers/dma-buf/sync_debug.c
+++ b/drivers/dma-buf/sync_debug.c
@@ -147,7 +147,7 @@ static void sync_print_sync_file(struct seq_file *s,
 	}
 }
 
-static int sync_debugfs_show(struct seq_file *s, void *unused)
+static int sync_info_debugfs_show(struct seq_file *s, void *unused)
 {
 	struct list_head *pos;
 
@@ -178,17 +178,7 @@ static int sync_debugfs_show(struct seq_file *s, void *unused)
 	return 0;
 }
 
-static int sync_info_debugfs_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, sync_debugfs_show, inode->i_private);
-}
-
-static const struct file_operations sync_info_debugfs_fops = {
-	.open           = sync_info_debugfs_open,
-	.read           = seq_read,
-	.llseek         = seq_lseek,
-	.release        = single_release,
-};
+DEFINE_SHOW_ATTRIBUTE(sync_info_debugfs);
 
 static __init int sync_debugfs_init(void)
 {
@@ -218,7 +208,7 @@ void sync_dump(void)
 	};
 	int i;
 
-	sync_debugfs_show(&s, NULL);
+	sync_info_debugfs_show(&s, NULL);
 
 	for (i = 0; i < s.count; i += DUMP_CHUNK) {
 		if ((s.count - i) > DUMP_CHUNK) {
diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
index 35dd06479867..4f6305ca52c8 100644
--- a/drivers/dma-buf/sync_file.c
+++ b/drivers/dma-buf/sync_file.c
@@ -144,7 +144,7 @@ char *sync_file_get_name(struct sync_file *sync_file, char *buf, int len)
 	} else {
 		struct dma_fence *fence = sync_file->fence;
 
-		snprintf(buf, len, "%s-%s%llu-%d",
+		snprintf(buf, len, "%s-%s%llu-%lld",
 			 fence->ops->get_driver_name(fence),
 			 fence->ops->get_timeline_name(fence),
 			 fence->context,
@@ -258,7 +258,7 @@ static struct sync_file *sync_file_merge(const char *name, struct sync_file *a,
 
 			i_b++;
 		} else {
-			if (pt_a->seqno - pt_b->seqno <= INT_MAX)
+			if (__dma_fence_is_later(pt_a->seqno, pt_b->seqno))
 				add_fence(fences, &i, pt_a);
 			else
 				add_fence(fences, &i, pt_b);
diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 4385f00e1d05..bd943a71756c 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -170,10 +170,6 @@ config DRM_KMS_CMA_HELPER
 	bool
 	depends on DRM
 	select DRM_GEM_CMA_HELPER
-	select DRM_KMS_FB_HELPER
-	select FB_SYS_FILLRECT
-	select FB_SYS_COPYAREA
-	select FB_SYS_IMAGEBLIT
 	help
 	  Choose this if you need the KMS CMA helper functions
 
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index ce8d1d384319..1ac55c65eac0 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -51,7 +51,7 @@ obj-$(CONFIG_DRM_DEBUG_SELFTEST) += selftests/
 obj-$(CONFIG_DRM)	+= drm.o
 obj-$(CONFIG_DRM_MIPI_DSI) += drm_mipi_dsi.o
 obj-$(CONFIG_DRM_PANEL_ORIENTATION_QUIRKS) += drm_panel_orientation_quirks.o
-obj-$(CONFIG_DRM_ARM)	+= arm/
+obj-y			+= arm/
 obj-$(CONFIG_DRM_TTM)	+= ttm/
 obj-$(CONFIG_DRM_SCHED)	+= scheduler/
 obj-$(CONFIG_DRM_TDFX)	+= tdfx/
@@ -81,7 +81,7 @@ obj-$(CONFIG_DRM_UDL) += udl/
 obj-$(CONFIG_DRM_AST) += ast/
 obj-$(CONFIG_DRM_ARMADA) += armada/
 obj-$(CONFIG_DRM_ATMEL_HLCDC)	+= atmel-hlcdc/
-obj-$(CONFIG_DRM_RCAR_DU) += rcar-du/
+obj-y			+= rcar-du/
 obj-$(CONFIG_DRM_SHMOBILE) +=shmobile/
 obj-y			+= omapdrm/
 obj-$(CONFIG_DRM_SUN4I) += sun4i/
diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
index f76bcb9c45e4..466da5954a68 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -57,7 +57,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
 
 # add asic specific block
 amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \
-	ci_smc.o ci_dpm.o dce_v8_0.o gfx_v7_0.o cik_sdma.o uvd_v4_2.o vce_v2_0.o
+	dce_v8_0.o gfx_v7_0.o cik_sdma.o uvd_v4_2.o vce_v2_0.o
 
 amdgpu-$(CONFIG_DRM_AMDGPU_SI)+= si.o gmc_v6_0.o gfx_v6_0.o si_ih.o si_dma.o dce_v6_0.o si_dpm.o si_smc.o
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index bcef6ea4bcf9..8d0d7f3dd5fb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -411,6 +411,8 @@ struct amdgpu_fpriv {
 	struct amdgpu_ctx_mgr	ctx_mgr;
 };
 
+int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv);
+
 int amdgpu_ib_get(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 		  unsigned size, struct amdgpu_ib *ib);
 void amdgpu_ib_free(struct amdgpu_device *adev, struct amdgpu_ib *ib,
@@ -542,6 +544,11 @@ struct amdgpu_asic_funcs {
 	bool (*need_full_reset)(struct amdgpu_device *adev);
 	/* initialize doorbell layout for specific asic*/
 	void (*init_doorbell_index)(struct amdgpu_device *adev);
+	/* PCIe bandwidth usage */
+	void (*get_pcie_usage)(struct amdgpu_device *adev, uint64_t *count0,
+			       uint64_t *count1);
+	/* do we need to reset the asic at init time (e.g., kexec) */
+	bool (*need_reset_on_init)(struct amdgpu_device *adev);
 };
 
 /*
@@ -634,7 +641,7 @@ struct amdgpu_nbio_funcs {
 	void (*hdp_flush)(struct amdgpu_device *adev, struct amdgpu_ring *ring);
 	u32 (*get_memsize)(struct amdgpu_device *adev);
 	void (*sdma_doorbell_range)(struct amdgpu_device *adev, int instance,
-				    bool use_doorbell, int doorbell_index);
+			bool use_doorbell, int doorbell_index, int doorbell_size);
 	void (*enable_doorbell_aperture)(struct amdgpu_device *adev,
 					 bool enable);
 	void (*enable_doorbell_selfring_aperture)(struct amdgpu_device *adev,
@@ -1042,6 +1049,8 @@ int emu_soc_asic_init(struct amdgpu_device *adev);
 #define amdgpu_asic_invalidate_hdp(adev, r) (adev)->asic_funcs->invalidate_hdp((adev), (r))
 #define amdgpu_asic_need_full_reset(adev) (adev)->asic_funcs->need_full_reset((adev))
 #define amdgpu_asic_init_doorbell_index(adev) (adev)->asic_funcs->init_doorbell_index((adev))
+#define amdgpu_asic_get_pcie_usage(adev, cnt0, cnt1) ((adev)->asic_funcs->get_pcie_usage((adev), (cnt0), (cnt1)))
+#define amdgpu_asic_need_reset_on_init(adev) (adev)->asic_funcs->need_reset_on_init((adev))
 
 /* Common functions */
 bool amdgpu_device_should_recover_gpu(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 2dfaf158ef07..fe1d7368c1e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -28,8 +28,6 @@
 #include <linux/module.h>
 #include <linux/dma-buf.h>
 
-const struct kgd2kfd_calls *kgd2kfd;
-
 static const unsigned int compute_vmid_bitmap = 0xFF00;
 
 /* Total memory size in system memory and all GPU VRAM. Used to
@@ -47,12 +45,9 @@ int amdgpu_amdkfd_init(void)
 	amdgpu_amdkfd_total_mem_size *= si.mem_unit;
 
 #ifdef CONFIG_HSA_AMD
-	ret = kgd2kfd_init(KFD_INTERFACE_VERSION, &kgd2kfd);
-	if (ret)
-		kgd2kfd = NULL;
+	ret = kgd2kfd_init();
 	amdgpu_amdkfd_gpuvm_init_mem_limits();
 #else
-	kgd2kfd = NULL;
 	ret = -ENOENT;
 #endif
 
@@ -61,17 +56,13 @@ int amdgpu_amdkfd_init(void)
 
 void amdgpu_amdkfd_fini(void)
 {
-	if (kgd2kfd)
-		kgd2kfd->exit();
+	kgd2kfd_exit();
 }
 
 void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev)
 {
 	const struct kfd2kgd_calls *kfd2kgd;
 
-	if (!kgd2kfd)
-		return;
-
 	switch (adev->asic_type) {
 #ifdef CONFIG_DRM_AMDGPU_CIK
 	case CHIP_KAVERI:
@@ -98,8 +89,8 @@ void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev)
 		return;
 	}
 
-	adev->kfd.dev = kgd2kfd->probe((struct kgd_dev *)adev,
-				       adev->pdev, kfd2kgd);
+	adev->kfd.dev = kgd2kfd_probe((struct kgd_dev *)adev,
+				      adev->pdev, kfd2kgd);
 
 	if (adev->kfd.dev)
 		amdgpu_amdkfd_total_mem_size += adev->gmc.real_vram_size;
@@ -140,7 +131,7 @@ static void amdgpu_doorbell_get_kfd_info(struct amdgpu_device *adev,
 
 void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
 {
-	int i, n;
+	int i;
 	int last_valid_bit;
 
 	if (adev->kfd.dev) {
@@ -151,7 +142,9 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
 			.gpuvm_size = min(adev->vm_manager.max_pfn
 					  << AMDGPU_GPU_PAGE_SHIFT,
 					  AMDGPU_GMC_HOLE_START),
-			.drm_render_minor = adev->ddev->render->index
+			.drm_render_minor = adev->ddev->render->index,
+			.sdma_doorbell_idx = adev->doorbell_index.sdma_engine,
+
 		};
 
 		/* this is going to have a few of the MSBs set that we need to
@@ -181,44 +174,29 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
 				&gpu_resources.doorbell_aperture_size,
 				&gpu_resources.doorbell_start_offset);
 
-		if (adev->asic_type < CHIP_VEGA10) {
-			kgd2kfd->device_init(adev->kfd.dev, &gpu_resources);
-			return;
-		}
-
-		n = (adev->asic_type < CHIP_VEGA20) ? 2 : 8;
-
-		for (i = 0; i < n; i += 2) {
-			/* On SOC15 the BIF is involved in routing
-			 * doorbells using the low 12 bits of the
-			 * address. Communicate the assignments to
-			 * KFD. KFD uses two doorbell pages per
-			 * process in case of 64-bit doorbells so we
-			 * can use each doorbell assignment twice.
-			 */
-			gpu_resources.sdma_doorbell[0][i] =
-				adev->doorbell_index.sdma_engine0 + (i >> 1);
-			gpu_resources.sdma_doorbell[0][i+1] =
-				adev->doorbell_index.sdma_engine0 + 0x200 + (i >> 1);
-			gpu_resources.sdma_doorbell[1][i] =
-				adev->doorbell_index.sdma_engine1 + (i >> 1);
-			gpu_resources.sdma_doorbell[1][i+1] =
-				adev->doorbell_index.sdma_engine1 + 0x200 + (i >> 1);
-		}
-		/* Doorbells 0x0e0-0ff and 0x2e0-2ff are reserved for
-		 * SDMA, IH and VCN. So don't use them for the CP.
+		/* Since SOC15, BIF starts to statically use the
+		 * lower 12 bits of doorbell addresses for routing
+		 * based on settings in registers like
+		 * SDMA0_DOORBELL_RANGE etc..
+		 * In order to route a doorbell to CP engine, the lower
+		 * 12 bits of its address has to be outside the range
+		 * set for SDMA, VCN, and IH blocks.
 		 */
-		gpu_resources.reserved_doorbell_mask = 0x1e0;
-		gpu_resources.reserved_doorbell_val  = 0x0e0;
+		if (adev->asic_type >= CHIP_VEGA10) {
+			gpu_resources.non_cp_doorbells_start =
+					adev->doorbell_index.first_non_cp;
+			gpu_resources.non_cp_doorbells_end =
+					adev->doorbell_index.last_non_cp;
+		}
 
-		kgd2kfd->device_init(adev->kfd.dev, &gpu_resources);
+		kgd2kfd_device_init(adev->kfd.dev, &gpu_resources);
 	}
 }
 
 void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev)
 {
 	if (adev->kfd.dev) {
-		kgd2kfd->device_exit(adev->kfd.dev);
+		kgd2kfd_device_exit(adev->kfd.dev);
 		adev->kfd.dev = NULL;
 	}
 }
@@ -227,13 +205,13 @@ void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
 		const void *ih_ring_entry)
 {
 	if (adev->kfd.dev)
-		kgd2kfd->interrupt(adev->kfd.dev, ih_ring_entry);
+		kgd2kfd_interrupt(adev->kfd.dev, ih_ring_entry);
 }
 
 void amdgpu_amdkfd_suspend(struct amdgpu_device *adev)
 {
 	if (adev->kfd.dev)
-		kgd2kfd->suspend(adev->kfd.dev);
+		kgd2kfd_suspend(adev->kfd.dev);
 }
 
 int amdgpu_amdkfd_resume(struct amdgpu_device *adev)
@@ -241,7 +219,7 @@ int amdgpu_amdkfd_resume(struct amdgpu_device *adev)
 	int r = 0;
 
 	if (adev->kfd.dev)
-		r = kgd2kfd->resume(adev->kfd.dev);
+		r = kgd2kfd_resume(adev->kfd.dev);
 
 	return r;
 }
@@ -251,7 +229,7 @@ int amdgpu_amdkfd_pre_reset(struct amdgpu_device *adev)
 	int r = 0;
 
 	if (adev->kfd.dev)
-		r = kgd2kfd->pre_reset(adev->kfd.dev);
+		r = kgd2kfd_pre_reset(adev->kfd.dev);
 
 	return r;
 }
@@ -261,7 +239,7 @@ int amdgpu_amdkfd_post_reset(struct amdgpu_device *adev)
 	int r = 0;
 
 	if (adev->kfd.dev)
-		r = kgd2kfd->post_reset(adev->kfd.dev);
+		r = kgd2kfd_post_reset(adev->kfd.dev);
 
 	return r;
 }
@@ -619,4 +597,47 @@ struct kfd2kgd_calls *amdgpu_amdkfd_gfx_9_0_get_functions(void)
 {
 	return NULL;
 }
+
+struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, struct pci_dev *pdev,
+			      const struct kfd2kgd_calls *f2g)
+{
+	return NULL;
+}
+
+bool kgd2kfd_device_init(struct kfd_dev *kfd,
+			 const struct kgd2kfd_shared_resources *gpu_resources)
+{
+	return false;
+}
+
+void kgd2kfd_device_exit(struct kfd_dev *kfd)
+{
+}
+
+void kgd2kfd_exit(void)
+{
+}
+
+void kgd2kfd_suspend(struct kfd_dev *kfd)
+{
+}
+
+int kgd2kfd_resume(struct kfd_dev *kfd)
+{
+	return 0;
+}
+
+int kgd2kfd_pre_reset(struct kfd_dev *kfd)
+{
+	return 0;
+}
+
+int kgd2kfd_post_reset(struct kfd_dev *kfd)
+{
+	return 0;
+}
+
+void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry)
+{
+}
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 70429f7aa9a8..0b31a1859023 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -33,7 +33,6 @@
 #include "amdgpu_sync.h"
 #include "amdgpu_vm.h"
 
-extern const struct kgd2kfd_calls *kgd2kfd;
 extern uint64_t amdgpu_amdkfd_total_mem_size;
 
 struct amdgpu_device;
@@ -214,4 +213,22 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct kgd_dev *kgd,
 void amdgpu_amdkfd_gpuvm_init_mem_limits(void);
 void amdgpu_amdkfd_unreserve_memory_limit(struct amdgpu_bo *bo);
 
+/* KGD2KFD callbacks */
+int kgd2kfd_init(void);
+void kgd2kfd_exit(void);
+struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, struct pci_dev *pdev,
+			      const struct kfd2kgd_calls *f2g);
+bool kgd2kfd_device_init(struct kfd_dev *kfd,
+			 const struct kgd2kfd_shared_resources *gpu_resources);
+void kgd2kfd_device_exit(struct kfd_dev *kfd);
+void kgd2kfd_suspend(struct kfd_dev *kfd);
+int kgd2kfd_resume(struct kfd_dev *kfd);
+int kgd2kfd_pre_reset(struct kfd_dev *kfd);
+int kgd2kfd_post_reset(struct kfd_dev *kfd);
+void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry);
+int kgd2kfd_quiesce_mm(struct mm_struct *mm);
+int kgd2kfd_resume_mm(struct mm_struct *mm);
+int kgd2kfd_schedule_evict_and_restore_process(struct mm_struct *mm,
+					       struct dma_fence *fence);
+
 #endif /* AMDGPU_AMDKFD_H_INCLUDED */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
index 574c1181ae9a..3107b9575929 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
@@ -122,7 +122,7 @@ static bool amdkfd_fence_enable_signaling(struct dma_fence *f)
 	if (dma_fence_is_signaled(f))
 		return true;
 
-	if (!kgd2kfd->schedule_evict_and_restore_process(fence->mm, f))
+	if (!kgd2kfd_schedule_evict_and_restore_process(fence->mm, f))
 		return true;
 
 	return false;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index be1ab43473c6..1921dec3df7a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -204,38 +204,25 @@ void amdgpu_amdkfd_unreserve_memory_limit(struct amdgpu_bo *bo)
 }
 
 
-/* amdgpu_amdkfd_remove_eviction_fence - Removes eviction fence(s) from BO's
+/* amdgpu_amdkfd_remove_eviction_fence - Removes eviction fence from BO's
  *  reservation object.
  *
  * @bo: [IN] Remove eviction fence(s) from this BO
- * @ef: [IN] If ef is specified, then this eviction fence is removed if it
+ * @ef: [IN] This eviction fence is removed if it
  *  is present in the shared list.
- * @ef_list: [OUT] Returns list of eviction fences. These fences are removed
- *  from BO's reservation object shared list.
- * @ef_count: [OUT] Number of fences in ef_list.
  *
- * NOTE: If called with ef_list, then amdgpu_amdkfd_add_eviction_fence must be
- *  called to restore the eviction fences and to avoid memory leak. This is
- *  useful for shared BOs.
  * NOTE: Must be called with BO reserved i.e. bo->tbo.resv->lock held.
  */
 static int amdgpu_amdkfd_remove_eviction_fence(struct amdgpu_bo *bo,
-					struct amdgpu_amdkfd_fence *ef,
-					struct amdgpu_amdkfd_fence ***ef_list,
-					unsigned int *ef_count)
+					struct amdgpu_amdkfd_fence *ef)
 {
 	struct reservation_object *resv = bo->tbo.resv;
 	struct reservation_object_list *old, *new;
 	unsigned int i, j, k;
 
-	if (!ef && !ef_list)
+	if (!ef)
 		return -EINVAL;
 
-	if (ef_list) {
-		*ef_list = NULL;
-		*ef_count = 0;
-	}
-
 	old = reservation_object_get_list(resv);
 	if (!old)
 		return 0;
@@ -254,8 +241,7 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct amdgpu_bo *bo,
 		f = rcu_dereference_protected(old->shared[i],
 					      reservation_object_held(resv));
 
-		if ((ef && f->context == ef->base.context) ||
-		    (!ef && to_amdgpu_amdkfd_fence(f)))
+		if (f->context == ef->base.context)
 			RCU_INIT_POINTER(new->shared[--j], f);
 		else
 			RCU_INIT_POINTER(new->shared[k++], f);
@@ -263,21 +249,6 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct amdgpu_bo *bo,
 	new->shared_max = old->shared_max;
 	new->shared_count = k;
 
-	if (!ef) {
-		unsigned int count = old->shared_count - j;
-
-		/* Alloc memory for count number of eviction fence pointers.
-		 * Fill the ef_list array and ef_count
-		 */
-		*ef_list = kcalloc(count, sizeof(**ef_list), GFP_KERNEL);
-		*ef_count = count;
-
-		if (!*ef_list) {
-			kfree(new);
-			return -ENOMEM;
-		}
-	}
-
 	/* Install the new fence list, seqcount provides the barriers */
 	preempt_disable();
 	write_seqcount_begin(&resv->seq);
@@ -291,46 +262,13 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct amdgpu_bo *bo,
 
 		f = rcu_dereference_protected(new->shared[i],
 					      reservation_object_held(resv));
-		if (!ef)
-			(*ef_list)[k++] = to_amdgpu_amdkfd_fence(f);
-		else
-			dma_fence_put(f);
+		dma_fence_put(f);
 	}
 	kfree_rcu(old, rcu);
 
 	return 0;
 }
 
-/* amdgpu_amdkfd_add_eviction_fence - Adds eviction fence(s) back into BO's
- *  reservation object.
- *
- * @bo: [IN] Add eviction fences to this BO
- * @ef_list: [IN] List of eviction fences to be added
- * @ef_count: [IN] Number of fences in ef_list.
- *
- * NOTE: Must call amdgpu_amdkfd_remove_eviction_fence before calling this
- *  function.
- */
-static void amdgpu_amdkfd_add_eviction_fence(struct amdgpu_bo *bo,
-				struct amdgpu_amdkfd_fence **ef_list,
-				unsigned int ef_count)
-{
-	int i;
-
-	if (!ef_list || !ef_count)
-		return;
-
-	for (i = 0; i < ef_count; i++) {
-		amdgpu_bo_fence(bo, &ef_list[i]->base, true);
-		/* Re-adding the fence takes an additional reference. Drop that
-		 * reference.
-		 */
-		dma_fence_put(&ef_list[i]->base);
-	}
-
-	kfree(ef_list);
-}
-
 static int amdgpu_amdkfd_bo_validate(struct amdgpu_bo *bo, uint32_t domain,
 				     bool wait)
 {
@@ -346,18 +284,8 @@ static int amdgpu_amdkfd_bo_validate(struct amdgpu_bo *bo, uint32_t domain,
 	ret = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
 	if (ret)
 		goto validate_fail;
-	if (wait) {
-		struct amdgpu_amdkfd_fence **ef_list;
-		unsigned int ef_count;
-
-		ret = amdgpu_amdkfd_remove_eviction_fence(bo, NULL, &ef_list,
-							  &ef_count);
-		if (ret)
-			goto validate_fail;
-
-		ttm_bo_wait(&bo->tbo, false, false);
-		amdgpu_amdkfd_add_eviction_fence(bo, ef_list, ef_count);
-	}
+	if (wait)
+		amdgpu_bo_sync_wait(bo, AMDGPU_FENCE_OWNER_KFD, false);
 
 validate_fail:
 	return ret;
@@ -444,7 +372,6 @@ static int add_bo_to_vm(struct amdgpu_device *adev, struct kgd_mem *mem,
 {
 	int ret;
 	struct kfd_bo_va_list *bo_va_entry;
-	struct amdgpu_bo *pd = vm->root.base.bo;
 	struct amdgpu_bo *bo = mem->bo;
 	uint64_t va = mem->va;
 	struct list_head *list_bo_va = &mem->bo_va_list;
@@ -484,14 +411,8 @@ static int add_bo_to_vm(struct amdgpu_device *adev, struct kgd_mem *mem,
 		*p_bo_va_entry = bo_va_entry;
 
 	/* Allocate new page tables if needed and validate
-	 * them. Clearing of new page tables and validate need to wait
-	 * on move fences. We don't want that to trigger the eviction
-	 * fence, so remove it temporarily.
+	 * them.
 	 */
-	amdgpu_amdkfd_remove_eviction_fence(pd,
-					vm->process_info->eviction_fence,
-					NULL, NULL);
-
 	ret = amdgpu_vm_alloc_pts(adev, vm, va, amdgpu_bo_size(bo));
 	if (ret) {
 		pr_err("Failed to allocate pts, err=%d\n", ret);
@@ -504,13 +425,9 @@ static int add_bo_to_vm(struct amdgpu_device *adev, struct kgd_mem *mem,
 		goto err_alloc_pts;
 	}
 
-	/* Add the eviction fence back */
-	amdgpu_bo_fence(pd, &vm->process_info->eviction_fence->base, true);
-
 	return 0;
 
 err_alloc_pts:
-	amdgpu_bo_fence(pd, &vm->process_info->eviction_fence->base, true);
 	amdgpu_vm_bo_rmv(adev, bo_va_entry->bo_va);
 	list_del(&bo_va_entry->bo_list);
 err_vmadd:
@@ -809,24 +726,11 @@ static int unmap_bo_from_gpuvm(struct amdgpu_device *adev,
 {
 	struct amdgpu_bo_va *bo_va = entry->bo_va;
 	struct amdgpu_vm *vm = bo_va->base.vm;
-	struct amdgpu_bo *pd = vm->root.base.bo;
 
-	/* Remove eviction fence from PD (and thereby from PTs too as
-	 * they share the resv. object). Otherwise during PT update
-	 * job (see amdgpu_vm_bo_update_mapping), eviction fence would
-	 * get added to job->sync object and job execution would
-	 * trigger the eviction fence.
-	 */
-	amdgpu_amdkfd_remove_eviction_fence(pd,
-					    vm->process_info->eviction_fence,
-					    NULL, NULL);
 	amdgpu_vm_bo_unmap(adev, bo_va, entry->va);
 
 	amdgpu_vm_clear_freed(adev, vm, &bo_va->last_pt_update);
 
-	/* Add the eviction fence back */
-	amdgpu_bo_fence(pd, &vm->process_info->eviction_fence->base, true);
-
 	amdgpu_sync_fence(NULL, sync, bo_va->last_pt_update, false);
 
 	return 0;
@@ -1002,7 +906,7 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void **process_info,
 		pr_err("validate_pt_pd_bos() failed\n");
 		goto validate_pd_fail;
 	}
-	ret = ttm_bo_wait(&vm->root.base.bo->tbo, false, false);
+	ret = amdgpu_bo_sync_wait(vm->root.base.bo, AMDGPU_FENCE_OWNER_KFD, false);
 	if (ret)
 		goto wait_pd_fail;
 	amdgpu_bo_fence(vm->root.base.bo,
@@ -1389,8 +1293,7 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
 	 * attached
 	 */
 	amdgpu_amdkfd_remove_eviction_fence(mem->bo,
-					process_info->eviction_fence,
-					NULL, NULL);
+					process_info->eviction_fence);
 	pr_debug("Release VA 0x%llx - 0x%llx\n", mem->va,
 		mem->va + bo_size * (1 + mem->aql_queue));
 
@@ -1617,8 +1520,7 @@ int amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu(
 	if (mem->mapped_to_gpu_memory == 0 &&
 	    !amdgpu_ttm_tt_get_usermm(mem->bo->tbo.ttm) && !mem->bo->pin_count)
 		amdgpu_amdkfd_remove_eviction_fence(mem->bo,
-						process_info->eviction_fence,
-						    NULL, NULL);
+						process_info->eviction_fence);
 
 unreserve_out:
 	unreserve_bo_and_vms(&ctx, false, false);
@@ -1679,7 +1581,7 @@ int amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel(struct kgd_dev *kgd,
 	}
 
 	amdgpu_amdkfd_remove_eviction_fence(
-		bo, mem->process_info->eviction_fence, NULL, NULL);
+		bo, mem->process_info->eviction_fence);
 	list_del_init(&mem->validate_list.head);
 
 	if (size)
@@ -1790,7 +1692,7 @@ int amdgpu_amdkfd_evict_userptr(struct kgd_mem *mem,
 	evicted_bos = atomic_inc_return(&process_info->evicted_bos);
 	if (evicted_bos == 1) {
 		/* First eviction, stop the queues */
-		r = kgd2kfd->quiesce_mm(mm);
+		r = kgd2kfd_quiesce_mm(mm);
 		if (r)
 			pr_err("Failed to quiesce KFD\n");
 		schedule_delayed_work(&process_info->restore_userptr_work,
@@ -1945,16 +1847,6 @@ static int validate_invalid_user_pages(struct amdkfd_process_info *process_info)
 
 	amdgpu_sync_create(&sync);
 
-	/* Avoid triggering eviction fences when unmapping invalid
-	 * userptr BOs (waits for all fences, doesn't use
-	 * FENCE_OWNER_VM)
-	 */
-	list_for_each_entry(peer_vm, &process_info->vm_list_head,
-			    vm_list_node)
-		amdgpu_amdkfd_remove_eviction_fence(peer_vm->root.base.bo,
-						process_info->eviction_fence,
-						NULL, NULL);
-
 	ret = process_validate_vms(process_info);
 	if (ret)
 		goto unreserve_out;
@@ -2015,10 +1907,6 @@ static int validate_invalid_user_pages(struct amdkfd_process_info *process_info)
 	ret = process_update_pds(process_info, &sync);
 
 unreserve_out:
-	list_for_each_entry(peer_vm, &process_info->vm_list_head,
-			    vm_list_node)
-		amdgpu_bo_fence(peer_vm->root.base.bo,
-				&process_info->eviction_fence->base, true);
 	ttm_eu_backoff_reservation(&ticket, &resv_list);
 	amdgpu_sync_wait(&sync, false);
 	amdgpu_sync_free(&sync);
@@ -2082,7 +1970,7 @@ static void amdgpu_amdkfd_restore_userptr_worker(struct work_struct *work)
 	    evicted_bos)
 		goto unlock_out;
 	evicted_bos = 0;
-	if (kgd2kfd->resume_mm(mm)) {
+	if (kgd2kfd_resume_mm(mm)) {
 		pr_err("%s: Failed to resume KFD\n", __func__);
 		/* No recovery from this failure. Probably the CP is
 		 * hanging. No point trying again.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
index 69ad6ec0a4f3..bf04c12bd324 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
@@ -25,8 +25,8 @@
  */
 #include <drm/drmP.h>
 #include <drm/drm_edid.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/amdgpu_drm.h>
 #include "amdgpu.h"
 #include "atom.h"
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 1c49b8266d69..52a5e4fdc95b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -214,6 +214,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs
 		case AMDGPU_CHUNK_ID_DEPENDENCIES:
 		case AMDGPU_CHUNK_ID_SYNCOBJ_IN:
 		case AMDGPU_CHUNK_ID_SYNCOBJ_OUT:
+		case AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES:
 			break;
 
 		default:
@@ -1090,6 +1091,15 @@ static int amdgpu_cs_process_fence_dep(struct amdgpu_cs_parser *p,
 
 		fence = amdgpu_ctx_get_fence(ctx, entity,
 					     deps[i].handle);
+
+		if (!IS_ERR_OR_NULL(fence) &&
+		    chunk->chunk_id == AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES) {
+			struct drm_sched_fence *s_fence = to_drm_sched_fence(fence);
+			struct dma_fence *old = fence;
+
+			fence = dma_fence_get(&s_fence->scheduled);
+			dma_fence_put(old);
+		}
+
 		if (IS_ERR(fence)) {
 			r = PTR_ERR(fence);
 			amdgpu_ctx_put(ctx);
@@ -1177,7 +1187,8 @@ static int amdgpu_cs_dependencies(struct amdgpu_device *adev,
 
 		chunk = &p->chunks[i];
 
-		if (chunk->chunk_id == AMDGPU_CHUNK_ID_DEPENDENCIES) {
+		if (chunk->chunk_id == AMDGPU_CHUNK_ID_DEPENDENCIES ||
+		    chunk->chunk_id == AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES) {
 			r = amdgpu_cs_process_fence_dep(p, chunk);
 			if (r)
 				return r;
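
For reference, the AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES handling above
boils down to swapping the scheduler's "finished" fence for its "scheduled"
fence, so the dependent submission only waits for the dependency to be
picked up by the hardware scheduler rather than for it to complete. A
minimal sketch of that swap in isolation (illustrative only, not part of
the patch; to_drm_sched_fence() returns NULL for non-scheduler fences):

static struct dma_fence *example_scheduled_fence(struct dma_fence *fence)
{
	struct drm_sched_fence *s_fence = to_drm_sched_fence(fence);
	struct dma_fence *old = fence;

	/* Not a scheduler fence: keep the reference we were handed. */
	if (!s_fence)
		return fence;

	/* Wait on the "scheduled" stage instead of the "finished" one. */
	fence = dma_fence_get(&s_fence->scheduled);
	dma_fence_put(old);
	return fence;
}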
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index d85184b5b35c..7b526593eb77 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -124,6 +124,7 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev,
 		struct amdgpu_ring *rings[AMDGPU_MAX_RINGS];
 		struct drm_sched_rq *rqs[AMDGPU_MAX_RINGS];
 		unsigned num_rings;
+		unsigned num_rqs = 0;
 
 		switch (i) {
 		case AMDGPU_HW_IP_GFX:
@@ -166,12 +167,16 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev,
 			break;
 		}
 
-		for (j = 0; j < num_rings; ++j)
-			rqs[j] = &rings[j]->sched.sched_rq[priority];
+		for (j = 0; j < num_rings; ++j) {
+			if (!rings[j]->adev)
+				continue;
+
+			rqs[num_rqs++] = &rings[j]->sched.sched_rq[priority];
+		}
 
 		for (j = 0; j < amdgpu_ctx_num_entities[i]; ++j)
 			r = drm_sched_entity_init(&ctx->entities[i][j].entity,
-						  rqs, num_rings, &ctx->guilty);
+						  rqs, num_rqs, &ctx->guilty);
 		if (r)
 			goto error_cleanup_entities;
 	}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index dd9a4fb9ce39..4ae3ff9a1d4c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -158,9 +158,6 @@ static int  amdgpu_debugfs_process_reg_op(bool read, struct file *f,
 	while (size) {
 		uint32_t value;
 
-		if (*pos > adev->rmmio_size)
-			goto end;
-
 		if (read) {
 			value = RREG32(*pos >> 2);
 			r = put_user(value, (uint32_t *)buf);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7ff3a28fc903..4f8fb4ecde34 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -30,8 +30,8 @@
 #include <linux/console.h>
 #include <linux/slab.h>
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/amdgpu_drm.h>
 #include <linux/vgaarb.h>
 #include <linux/vga_switcheroo.h>
@@ -1645,7 +1645,7 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 		if (r) {
 			DRM_ERROR("sw_init of IP block <%s> failed %d\n",
 				  adev->ip_blocks[i].version->funcs->name, r);
-			return r;
+			goto init_failed;
 		}
 		adev->ip_blocks[i].status.sw = true;
 
@@ -1654,17 +1654,17 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 			r = amdgpu_device_vram_scratch_init(adev);
 			if (r) {
 				DRM_ERROR("amdgpu_vram_scratch_init failed %d\n", r);
-				return r;
+				goto init_failed;
 			}
 			r = adev->ip_blocks[i].version->funcs->hw_init((void *)adev);
 			if (r) {
 				DRM_ERROR("hw_init %d failed %d\n", i, r);
-				return r;
+				goto init_failed;
 			}
 			r = amdgpu_device_wb_init(adev);
 			if (r) {
 				DRM_ERROR("amdgpu_device_wb_init failed %d\n", r);
-				return r;
+				goto init_failed;
 			}
 			adev->ip_blocks[i].status.hw = true;
 
@@ -1675,7 +1675,7 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 								AMDGPU_CSA_SIZE);
 				if (r) {
 					DRM_ERROR("allocate CSA failed %d\n", r);
-					return r;
+					goto init_failed;
 				}
 			}
 		}
@@ -1683,30 +1683,32 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 
 	r = amdgpu_ucode_create_bo(adev); /* create ucode bo when sw_init complete*/
 	if (r)
-		return r;
+		goto init_failed;
 
 	r = amdgpu_device_ip_hw_init_phase1(adev);
 	if (r)
-		return r;
+		goto init_failed;
 
 	r = amdgpu_device_fw_loading(adev);
 	if (r)
-		return r;
+		goto init_failed;
 
 	r = amdgpu_device_ip_hw_init_phase2(adev);
 	if (r)
-		return r;
+		goto init_failed;
 
 	if (adev->gmc.xgmi.num_physical_nodes > 1)
 		amdgpu_xgmi_add_device(adev);
 	amdgpu_amdkfd_device_init(adev);
 
+init_failed:
 	if (amdgpu_sriov_vf(adev)) {
-		amdgpu_virt_init_data_exchange(adev);
+		if (!r)
+			amdgpu_virt_init_data_exchange(adev);
 		amdgpu_virt_release_full_gpu(adev, true);
 	}
 
-	return 0;
+	return r;
 }
 
 /**
@@ -2133,7 +2135,7 @@ static int amdgpu_device_ip_reinit_early_sriov(struct amdgpu_device *adev)
 				continue;
 
 			r = block->version->funcs->hw_init(adev);
-			DRM_INFO("RE-INIT: %s %s\n", block->version->funcs->name, r?"failed":"succeeded");
+			DRM_INFO("RE-INIT-early: %s %s\n", block->version->funcs->name, r?"failed":"succeeded");
 			if (r)
 				return r;
 		}
@@ -2167,7 +2169,7 @@ static int amdgpu_device_ip_reinit_late_sriov(struct amdgpu_device *adev)
 				continue;
 
 			r = block->version->funcs->hw_init(adev);
-			DRM_INFO("RE-INIT: %s %s\n", block->version->funcs->name, r?"failed":"succeeded");
+			DRM_INFO("RE-INIT-late: %s %s\n", block->version->funcs->name, r?"failed":"succeeded");
 			if (r)
 				return r;
 		}
@@ -2548,6 +2550,17 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	/* detect if we are with an SRIOV vbios */
 	amdgpu_device_detect_sriov_bios(adev);
 
+	/* check if we need to reset the asic
+	 *  E.g., driver was not cleanly unloaded previously, etc.
+	 */
+	if (!amdgpu_sriov_vf(adev) && amdgpu_asic_need_reset_on_init(adev)) {
+		r = amdgpu_asic_reset(adev);
+		if (r) {
+			dev_err(adev->dev, "asic reset on init failed\n");
+			goto failed;
+		}
+	}
+
 	/* Post card if necessary */
 	if (amdgpu_device_need_post(adev)) {
 		if (!adev->bios) {
@@ -2612,6 +2625,8 @@ fence_driver_init:
 		}
 		dev_err(adev->dev, "amdgpu_device_ip_init failed\n");
 		amdgpu_vf_error_put(adev, AMDGIM_ERROR_VF_AMDGPU_INIT_FAIL, 0, 0);
+		if (amdgpu_virt_request_full_gpu(adev, false))
+			amdgpu_virt_release_full_gpu(adev, false);
 		goto failed;
 	}
 
@@ -2707,7 +2722,7 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
 	amdgpu_irq_disable_all(adev);
 	if (adev->mode_info.mode_config_initialized){
 		if (!amdgpu_device_has_dc_support(adev))
-			drm_crtc_force_disable_all(adev->ddev);
+			drm_helper_force_disable_all(adev->ddev);
 		else
 			drm_atomic_helper_shutdown(adev->ddev);
 	}
@@ -3298,17 +3313,15 @@ static int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
 		if (!ring || !ring->sched.thread)
 			continue;
 
-		kthread_park(ring->sched.thread);
-
-		if (job && job->base.sched != &ring->sched)
-			continue;
-
-		drm_sched_hw_job_reset(&ring->sched, job ? &job->base : NULL);
+		drm_sched_stop(&ring->sched);
 
 		/* after all hw jobs are reset, hw fence is meaningless, so force_completion */
 		amdgpu_fence_driver_force_completion(ring);
 	}
 
+	if (job)
+		drm_sched_increase_karma(&job->base);
+
 
 
 	if (!amdgpu_sriov_vf(adev)) {
@@ -3454,14 +3467,10 @@ static void amdgpu_device_post_asic_reset(struct amdgpu_device *adev,
 		if (!ring || !ring->sched.thread)
 			continue;
 
-		/* only need recovery sched of the given job's ring
-		 * or all rings (in the case @job is NULL)
-		 * after above amdgpu_reset accomplished
-		 */
-		if ((!job || job->base.sched == &ring->sched) && !adev->asic_reset_res)
-			drm_sched_job_recovery(&ring->sched);
+		if (!adev->asic_reset_res)
+			drm_sched_resubmit_jobs(&ring->sched);
 
-		kthread_unpark(ring->sched.thread);
+		drm_sched_start(&ring->sched, !adev->asic_reset_res);
 	}
 
 	if (!amdgpu_device_has_dc_support(adev)) {
@@ -3521,9 +3530,9 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 	 * by different nodes. No point also since the one node already executing
 	 * reset will also reset all the other nodes in the hive.
 	 */
-	hive = amdgpu_get_xgmi_hive(adev);
+	hive = amdgpu_get_xgmi_hive(adev, 0);
 	if (hive && adev->gmc.xgmi.num_physical_nodes > 1 &&
-	    !mutex_trylock(&hive->hive_lock))
+	    !mutex_trylock(&hive->reset_lock))
 		return 0;
 
 	/* Start with adev pre asic reset first for soft reset check.*/
@@ -3602,13 +3611,45 @@ retry:	/* Rest of adevs pre asic reset from XGMI hive. */
 	}
 
 	if (hive && adev->gmc.xgmi.num_physical_nodes > 1)
-		mutex_unlock(&hive->hive_lock);
+		mutex_unlock(&hive->reset_lock);
 
 	if (r)
 		dev_info(adev->dev, "GPU reset end with ret = %d\n", r);
 	return r;
 }
 
+static void amdgpu_device_get_min_pci_speed_width(struct amdgpu_device *adev,
+						  enum pci_bus_speed *speed,
+						  enum pcie_link_width *width)
+{
+	struct pci_dev *pdev = adev->pdev;
+	enum pci_bus_speed cur_speed;
+	enum pcie_link_width cur_width;
+
+	*speed = PCI_SPEED_UNKNOWN;
+	*width = PCIE_LNK_WIDTH_UNKNOWN;
+
+	while (pdev) {
+		cur_speed = pcie_get_speed_cap(pdev);
+		cur_width = pcie_get_width_cap(pdev);
+
+		if (cur_speed != PCI_SPEED_UNKNOWN) {
+			if (*speed == PCI_SPEED_UNKNOWN)
+				*speed = cur_speed;
+			else if (cur_speed < *speed)
+				*speed = cur_speed;
+		}
+
+		if (cur_width != PCIE_LNK_WIDTH_UNKNOWN) {
+			if (*width == PCIE_LNK_WIDTH_UNKNOWN)
+				*width = cur_width;
+			else if (cur_width < *width)
+				*width = cur_width;
+		}
+		pdev = pci_upstream_bridge(pdev);
+	}
+}
+
 /**
  * amdgpu_device_get_pcie_info - fetch pcie info about the PCIE slot
  *
@@ -3621,8 +3662,8 @@ retry:	/* Rest of adevs pre asic reset from XGMI hive. */
 static void amdgpu_device_get_pcie_info(struct amdgpu_device *adev)
 {
 	struct pci_dev *pdev;
-	enum pci_bus_speed speed_cap;
-	enum pcie_link_width link_width;
+	enum pci_bus_speed speed_cap, platform_speed_cap;
+	enum pcie_link_width platform_link_width;
 
 	if (amdgpu_pcie_gen_cap)
 		adev->pm.pcie_gen_mask = amdgpu_pcie_gen_cap;
@@ -3639,6 +3680,12 @@ static void amdgpu_device_get_pcie_info(struct amdgpu_device *adev)
 		return;
 	}
 
+	if (adev->pm.pcie_gen_mask && adev->pm.pcie_mlw_mask)
+		return;
+
+	amdgpu_device_get_min_pci_speed_width(adev, &platform_speed_cap,
+					      &platform_link_width);
+
 	if (adev->pm.pcie_gen_mask == 0) {
 		/* asic caps */
 		pdev = adev->pdev;
@@ -3664,22 +3711,20 @@ static void amdgpu_device_get_pcie_info(struct amdgpu_device *adev)
 				adev->pm.pcie_gen_mask |= CAIL_ASIC_PCIE_LINK_SPEED_SUPPORT_GEN1;
 		}
 		/* platform caps */
-		pdev = adev->ddev->pdev->bus->self;
-		speed_cap = pcie_get_speed_cap(pdev);
-		if (speed_cap == PCI_SPEED_UNKNOWN) {
+		if (platform_speed_cap == PCI_SPEED_UNKNOWN) {
 			adev->pm.pcie_gen_mask |= (CAIL_PCIE_LINK_SPEED_SUPPORT_GEN1 |
 						   CAIL_PCIE_LINK_SPEED_SUPPORT_GEN2);
 		} else {
-			if (speed_cap == PCIE_SPEED_16_0GT)
+			if (platform_speed_cap == PCIE_SPEED_16_0GT)
 				adev->pm.pcie_gen_mask |= (CAIL_PCIE_LINK_SPEED_SUPPORT_GEN1 |
 							   CAIL_PCIE_LINK_SPEED_SUPPORT_GEN2 |
 							   CAIL_PCIE_LINK_SPEED_SUPPORT_GEN3 |
 							   CAIL_PCIE_LINK_SPEED_SUPPORT_GEN4);
-			else if (speed_cap == PCIE_SPEED_8_0GT)
+			else if (platform_speed_cap == PCIE_SPEED_8_0GT)
 				adev->pm.pcie_gen_mask |= (CAIL_PCIE_LINK_SPEED_SUPPORT_GEN1 |
 							   CAIL_PCIE_LINK_SPEED_SUPPORT_GEN2 |
 							   CAIL_PCIE_LINK_SPEED_SUPPORT_GEN3);
-			else if (speed_cap == PCIE_SPEED_5_0GT)
+			else if (platform_speed_cap == PCIE_SPEED_5_0GT)
 				adev->pm.pcie_gen_mask |= (CAIL_PCIE_LINK_SPEED_SUPPORT_GEN1 |
 							   CAIL_PCIE_LINK_SPEED_SUPPORT_GEN2);
 			else
@@ -3688,12 +3733,10 @@ static void amdgpu_device_get_pcie_info(struct amdgpu_device *adev)
 		}
 	}
 	if (adev->pm.pcie_mlw_mask == 0) {
-		pdev = adev->ddev->pdev->bus->self;
-		link_width = pcie_get_width_cap(pdev);
-		if (link_width == PCIE_LNK_WIDTH_UNKNOWN) {
+		if (platform_link_width == PCIE_LNK_WIDTH_UNKNOWN) {
 			adev->pm.pcie_mlw_mask |= AMDGPU_DEFAULT_PCIE_MLW_MASK;
 		} else {
-			switch (link_width) {
+			switch (platform_link_width) {
 			case PCIE_LNK_X32:
 				adev->pm.pcie_mlw_mask = (CAIL_PCIE_LINK_WIDTH_SUPPORT_X32 |
 							  CAIL_PCIE_LINK_WIDTH_SUPPORT_X16 |
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
index be620b29f4aa..68959b923f89 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell.h
@@ -51,14 +51,7 @@ struct amdgpu_doorbell_index {
 	uint32_t userqueue_start;
 	uint32_t userqueue_end;
 	uint32_t gfx_ring0;
-	uint32_t sdma_engine0;
-	uint32_t sdma_engine1;
-	uint32_t sdma_engine2;
-	uint32_t sdma_engine3;
-	uint32_t sdma_engine4;
-	uint32_t sdma_engine5;
-	uint32_t sdma_engine6;
-	uint32_t sdma_engine7;
+	uint32_t sdma_engine[8];
 	uint32_t ih;
 	union {
 		struct {
@@ -78,7 +71,11 @@ struct amdgpu_doorbell_index {
 			uint32_t vce_ring6_7;
 		} uvd_vce;
 	};
+	uint32_t first_non_cp;
+	uint32_t last_non_cp;
 	uint32_t max_assignment;
+	/* Per engine SDMA doorbell size in dword */
+	uint32_t sdma_doorbell_range;
 };
 
 typedef enum _AMDGPU_DOORBELL_ASSIGNMENT
@@ -148,6 +145,10 @@ typedef enum _AMDGPU_VEGA20_DOORBELL_ASSIGNMENT
 	AMDGPU_VEGA20_DOORBELL64_VCE_RING2_3             = 0x18D,
 	AMDGPU_VEGA20_DOORBELL64_VCE_RING4_5             = 0x18E,
 	AMDGPU_VEGA20_DOORBELL64_VCE_RING6_7             = 0x18F,
+
+	AMDGPU_VEGA20_DOORBELL64_FIRST_NON_CP            = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE0,
+	AMDGPU_VEGA20_DOORBELL64_LAST_NON_CP             = AMDGPU_VEGA20_DOORBELL64_VCE_RING6_7,
+
 	AMDGPU_VEGA20_DOORBELL_MAX_ASSIGNMENT            = 0x18F,
 	AMDGPU_VEGA20_DOORBELL_INVALID                   = 0xFFFF
 } AMDGPU_VEGA20_DOORBELL_ASSIGNMENT;
@@ -227,6 +228,9 @@ typedef enum _AMDGPU_DOORBELL64_ASSIGNMENT
 	AMDGPU_DOORBELL64_VCE_RING4_5             = 0xFE,
 	AMDGPU_DOORBELL64_VCE_RING6_7             = 0xFF,
 
+	AMDGPU_DOORBELL64_FIRST_NON_CP            = AMDGPU_DOORBELL64_sDMA_ENGINE0,
+	AMDGPU_DOORBELL64_LAST_NON_CP             = AMDGPU_DOORBELL64_VCE_RING6_7,
+
 	AMDGPU_DOORBELL64_MAX_ASSIGNMENT          = 0xFF,
 	AMDGPU_DOORBELL64_INVALID                 = 0xFFFF
 } AMDGPU_DOORBELL64_ASSIGNMENT;
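
Collapsing the eight per-engine fields into sdma_engine[8] lets chip code
index SDMA doorbells by engine number instead of open-coding each field. A
hypothetical sketch of the resulting idiom (instance and ring member names
as used elsewhere in amdgpu):

int i;

for (i = 0; i < adev->sdma.num_instances; ++i)
	adev->sdma.instance[i].ring.doorbell_index =
		adev->doorbell_index.sdma_engine[i];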
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c
index 1c4595562f8f..344967df3137 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c
@@ -184,61 +184,6 @@ u32 amdgpu_dpm_get_vrefresh(struct amdgpu_device *adev)
 	return vrefresh;
 }
 
-void amdgpu_calculate_u_and_p(u32 i, u32 r_c, u32 p_b,
-			      u32 *p, u32 *u)
-{
-	u32 b_c = 0;
-	u32 i_c;
-	u32 tmp;
-
-	i_c = (i * r_c) / 100;
-	tmp = i_c >> p_b;
-
-	while (tmp) {
-		b_c++;
-		tmp >>= 1;
-	}
-
-	*u = (b_c + 1) / 2;
-	*p = i_c / (1 << (2 * (*u)));
-}
-
-int amdgpu_calculate_at(u32 t, u32 h, u32 fh, u32 fl, u32 *tl, u32 *th)
-{
-	u32 k, a, ah, al;
-	u32 t1;
-
-	if ((fl == 0) || (fh == 0) || (fl > fh))
-		return -EINVAL;
-
-	k = (100 * fh) / fl;
-	t1 = (t * (k - 100));
-	a = (1000 * (100 * h + t1)) / (10000 + (t1 / 100));
-	a = (a + 5) / 10;
-	ah = ((a * t) + 5000) / 10000;
-	al = a - ah;
-
-	*th = t - ah;
-	*tl = t + al;
-
-	return 0;
-}
-
-bool amdgpu_is_uvd_state(u32 class, u32 class2)
-{
-	if (class & ATOM_PPLIB_CLASSIFICATION_UVDSTATE)
-		return true;
-	if (class & ATOM_PPLIB_CLASSIFICATION_HD2STATE)
-		return true;
-	if (class & ATOM_PPLIB_CLASSIFICATION_HDSTATE)
-		return true;
-	if (class & ATOM_PPLIB_CLASSIFICATION_SDSTATE)
-		return true;
-	if (class2 & ATOM_PPLIB_CLASSIFICATION2_MVC)
-		return true;
-	return false;
-}
-
 bool amdgpu_is_internal_thermal_sensor(enum amdgpu_int_thermal_type sensor)
 {
 	switch (sensor) {
@@ -949,39 +894,6 @@ enum amdgpu_pcie_gen amdgpu_get_pcie_gen_support(struct amdgpu_device *adev,
 	return AMDGPU_PCIE_GEN1;
 }
 
-u16 amdgpu_get_pcie_lane_support(struct amdgpu_device *adev,
-				 u16 asic_lanes,
-				 u16 default_lanes)
-{
-	switch (asic_lanes) {
-	case 0:
-	default:
-		return default_lanes;
-	case 1:
-		return 1;
-	case 2:
-		return 2;
-	case 4:
-		return 4;
-	case 8:
-		return 8;
-	case 12:
-		return 12;
-	case 16:
-		return 16;
-	}
-}
-
-u8 amdgpu_encode_pci_lane_width(u32 lanes)
-{
-	u8 encoded_lanes[] = { 0, 1, 2, 0, 3, 0, 0, 0, 4, 0, 0, 0, 5, 0, 0, 0, 6 };
-
-	if (lanes > 16)
-		return 0;
-
-	return encoded_lanes[lanes];
-}
-
 struct amd_vce_state*
 amdgpu_get_vce_clock_state(void *handle, u32 idx)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
index f972cd156795..e871e022c129 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h
@@ -364,6 +364,14 @@ enum amdgpu_pcie_gen {
 		((adev)->powerplay.pp_funcs->enable_mgpu_fan_boost(\
 			(adev)->powerplay.pp_handle))
 
+#define amdgpu_dpm_get_ppfeature_status(adev, buf) \
+		((adev)->powerplay.pp_funcs->get_ppfeature_status(\
+			(adev)->powerplay.pp_handle, (buf)))
+
+#define amdgpu_dpm_set_ppfeature_status(adev, ppfeatures) \
+		((adev)->powerplay.pp_funcs->set_ppfeature_status(\
+			(adev)->powerplay.pp_handle, (ppfeatures)))
+
 struct amdgpu_dpm {
 	struct amdgpu_ps        *ps;
 	/* number of valid power states */
@@ -478,10 +486,6 @@ void amdgpu_dpm_print_ps_status(struct amdgpu_device *adev,
 u32 amdgpu_dpm_get_vblank_time(struct amdgpu_device *adev);
 u32 amdgpu_dpm_get_vrefresh(struct amdgpu_device *adev);
 void amdgpu_dpm_get_active_displays(struct amdgpu_device *adev);
-bool amdgpu_is_uvd_state(u32 class, u32 class2);
-void amdgpu_calculate_u_and_p(u32 i, u32 r_c, u32 p_b,
-			      u32 *p, u32 *u);
-int amdgpu_calculate_at(u32 t, u32 h, u32 fh, u32 fl, u32 *tl, u32 *th);
 
 bool amdgpu_is_internal_thermal_sensor(enum amdgpu_int_thermal_type sensor);
 
@@ -497,11 +501,6 @@ enum amdgpu_pcie_gen amdgpu_get_pcie_gen_support(struct amdgpu_device *adev,
 						 enum amdgpu_pcie_gen asic_gen,
 						 enum amdgpu_pcie_gen default_gen);
 
-u16 amdgpu_get_pcie_lane_support(struct amdgpu_device *adev,
-				 u16 asic_lanes,
-				 u16 default_lanes);
-u8 amdgpu_encode_pci_lane_width(u32 lanes);
-
 struct amd_vce_state*
 amdgpu_get_vce_clock_state(void *handle, u32 idx);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index c806f984bcc5..7419ea8a388b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -32,7 +32,7 @@
 #include <linux/module.h>
 #include <linux/pm_runtime.h>
 #include <linux/vga_switcheroo.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "amdgpu.h"
 #include "amdgpu_irq.h"
@@ -71,9 +71,12 @@
  * - 3.25.0 - Add support for sensor query info (stable pstate sclk/mclk).
  * - 3.26.0 - GFX9: Process AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE.
  * - 3.27.0 - Add new chunk to AMDGPU_CS to enable BO_LIST creation.
+ * - 3.28.0 - Add AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES
+ * - 3.29.0 - Add AMDGPU_IB_FLAG_RESET_GDS_MAX_WAVE_ID
+ * - 3.30.0 - Add AMDGPU_SCHED_OP_CONTEXT_PRIORITY_OVERRIDE.
  */
 #define KMS_DRIVER_MAJOR	3
-#define KMS_DRIVER_MINOR	27
+#define KMS_DRIVER_MINOR	30
 #define KMS_DRIVER_PATCHLEVEL	0
 
 int amdgpu_vram_limit = 0;
@@ -1176,6 +1179,22 @@ static const struct file_operations amdgpu_driver_kms_fops = {
 #endif
 };
 
+int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
+{
+	struct drm_file *file;
+
+	if (!filp)
+		return -EINVAL;
+
+	if (filp->f_op != &amdgpu_driver_kms_fops)
+		return -EINVAL;
+
+	file = filp->private_data;
+	*fpriv = file->driver_priv;
+	return 0;
+}
+
 static bool
 amdgpu_get_crtc_scanout_position(struct drm_device *dev, unsigned int pipe,
 				 bool in_vblank_irq, int *vpos, int *hpos,
@@ -1189,7 +1208,7 @@ amdgpu_get_crtc_scanout_position(struct drm_device *dev, unsigned int pipe,
 static struct drm_driver kms_driver = {
 	.driver_features =
 	    DRIVER_USE_AGP | DRIVER_ATOMIC |
-	    DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED | DRIVER_GEM |
+	    DRIVER_GEM |
 	    DRIVER_PRIME | DRIVER_RENDER | DRIVER_MODESET | DRIVER_SYNCOBJ,
 	.load = amdgpu_driver_load_kms,
 	.open = amdgpu_driver_open_kms,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gds.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gds.h
index ecbcefe49a98..f89f5734d985 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gds.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gds.h
@@ -37,6 +37,8 @@ struct amdgpu_gds {
 	struct amdgpu_gds_asic_info	mem;
 	struct amdgpu_gds_asic_info	gws;
 	struct amdgpu_gds_asic_info	oa;
+	uint32_t			gds_compute_max_wave_id;
+
 	/* At present, GDS, GWS and OA resources for gfx (graphics)
 	 * are always pre-allocated and available for graphics operation.
 	 * Such resources are shared between all gfx clients.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index f4f00217546e..d21dd2f369da 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -54,10 +54,6 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
 
 	memset(&bp, 0, sizeof(bp));
 	*obj = NULL;
-	/* At least align on page size */
-	if (alignment < PAGE_SIZE) {
-		alignment = PAGE_SIZE;
-	}
 
 	bp.size = size;
 	bp.byte_align = alignment;
@@ -244,9 +240,6 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data,
 			return -EINVAL;
 		}
 		flags |= AMDGPU_GEM_CREATE_NO_CPU_ACCESS;
-		/* GDS allocations must be DW aligned */
-		if (args->in.domains & AMDGPU_GEM_DOMAIN_GDS)
-			size = ALIGN(size, 4);
 	}
 
 	if (flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index c48207b377bc..0b8ef2d27d6b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -202,12 +202,12 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 			amdgpu_asic_flush_hdp(adev, ring);
 	}
 
+	if (need_ctx_switch)
+		status |= AMDGPU_HAVE_CTX_SWITCH;
+
 	skip_preamble = ring->current_ctx == fence_ctx;
 	if (job && ring->funcs->emit_cntxcntl) {
-		if (need_ctx_switch)
-			status |= AMDGPU_HAVE_CTX_SWITCH;
 		status |= job->preamble_status;
-
 		amdgpu_ring_emit_cntxcntl(ring, status);
 	}
 
@@ -221,8 +221,8 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 			!amdgpu_sriov_vf(adev)) /* for SRIOV preemption, Preamble CE ib must be inserted anyway */
 			continue;
 
-		amdgpu_ring_emit_ib(ring, job, ib, need_ctx_switch);
-		need_ctx_switch = false;
+		amdgpu_ring_emit_ib(ring, job, ib, status);
+		status &= ~AMDGPU_HAVE_CTX_SWITCH;
 	}
 
 	if (ring->funcs->emit_tmz)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
index 8af67f649660..1c50be3ab8a9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
@@ -52,6 +52,8 @@ int amdgpu_ih_ring_init(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih,
 	ih->use_bus_addr = use_bus_addr;
 
 	if (use_bus_addr) {
+		dma_addr_t dma_addr;
+
 		if (ih->ring)
 			return 0;
 
@@ -59,21 +61,26 @@ int amdgpu_ih_ring_init(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih,
 		 * add them to the end of the ring allocation.
 		 */
 		ih->ring = dma_alloc_coherent(adev->dev, ih->ring_size + 8,
-					      &ih->rb_dma_addr, GFP_KERNEL);
+					      &dma_addr, GFP_KERNEL);
 		if (ih->ring == NULL)
 			return -ENOMEM;
 
 		memset((void *)ih->ring, 0, ih->ring_size + 8);
-		ih->wptr_offs = (ih->ring_size / 4) + 0;
-		ih->rptr_offs = (ih->ring_size / 4) + 1;
+		ih->gpu_addr = dma_addr;
+		ih->wptr_addr = dma_addr + ih->ring_size;
+		ih->wptr_cpu = &ih->ring[ih->ring_size / 4];
+		ih->rptr_addr = dma_addr + ih->ring_size + 4;
+		ih->rptr_cpu = &ih->ring[(ih->ring_size / 4) + 1];
 	} else {
-		r = amdgpu_device_wb_get(adev, &ih->wptr_offs);
+		unsigned wptr_offs, rptr_offs;
+
+		r = amdgpu_device_wb_get(adev, &wptr_offs);
 		if (r)
 			return r;
 
-		r = amdgpu_device_wb_get(adev, &ih->rptr_offs);
+		r = amdgpu_device_wb_get(adev, &rptr_offs);
 		if (r) {
-			amdgpu_device_wb_free(adev, ih->wptr_offs);
+			amdgpu_device_wb_free(adev, wptr_offs);
 			return r;
 		}
 
@@ -82,10 +89,15 @@ int amdgpu_ih_ring_init(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih,
 					    &ih->ring_obj, &ih->gpu_addr,
 					    (void **)&ih->ring);
 		if (r) {
-			amdgpu_device_wb_free(adev, ih->rptr_offs);
-			amdgpu_device_wb_free(adev, ih->wptr_offs);
+			amdgpu_device_wb_free(adev, rptr_offs);
+			amdgpu_device_wb_free(adev, wptr_offs);
 			return r;
 		}
+
+		ih->wptr_addr = adev->wb.gpu_addr + wptr_offs * 4;
+		ih->wptr_cpu = &adev->wb.wb[wptr_offs];
+		ih->rptr_addr = adev->wb.gpu_addr + rptr_offs * 4;
+		ih->rptr_cpu = &adev->wb.wb[rptr_offs];
 	}
 	return 0;
 }
@@ -109,13 +121,13 @@ void amdgpu_ih_ring_fini(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih)
 		 * add them to the end of the ring allocation.
 		 */
 		dma_free_coherent(adev->dev, ih->ring_size + 8,
-				  (void *)ih->ring, ih->rb_dma_addr);
+				  (void *)ih->ring, ih->gpu_addr);
 		ih->ring = NULL;
 	} else {
 		amdgpu_bo_free_kernel(&ih->ring_obj, &ih->gpu_addr,
 				      (void **)&ih->ring);
-		amdgpu_device_wb_free(adev, ih->wptr_offs);
-		amdgpu_device_wb_free(adev, ih->rptr_offs);
+		amdgpu_device_wb_free(adev, (ih->wptr_addr - adev->wb.gpu_addr) / 4);
+		amdgpu_device_wb_free(adev, (ih->rptr_addr - adev->wb.gpu_addr) / 4);
 	}
 }
 
@@ -128,16 +140,14 @@ void amdgpu_ih_ring_fini(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih)
  * Interrupt hander (VI), walk the IH ring.
  * Returns irq process return code.
  */
-int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih,
-		      void (*callback)(struct amdgpu_device *adev,
-				       struct amdgpu_ih_ring *ih))
+int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih)
 {
 	u32 wptr;
 
 	if (!ih->enabled || adev->shutdown)
 		return IRQ_NONE;
 
-	wptr = amdgpu_ih_get_wptr(adev);
+	wptr = amdgpu_ih_get_wptr(adev, ih);
 
 restart_ih:
 	/* is somebody else already processing irqs? */
@@ -150,15 +160,15 @@ restart_ih:
 	rmb();
 
 	while (ih->rptr != wptr) {
-		callback(adev, ih);
+		amdgpu_irq_dispatch(adev, ih);
 		ih->rptr &= ih->ptr_mask;
 	}
 
-	amdgpu_ih_set_rptr(adev);
+	amdgpu_ih_set_rptr(adev, ih);
 	atomic_set(&ih->lock, 0);
 
 	/* make sure wptr hasn't changed while processing */
-	wptr = amdgpu_ih_get_wptr(adev);
+	wptr = amdgpu_ih_get_wptr(adev, ih);
 	if (wptr != ih->rptr)
 		goto restart_ih;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
index f877bb78d10a..113a1ba13d4a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
@@ -31,40 +31,44 @@ struct amdgpu_iv_entry;
  * R6xx+ IH ring
  */
 struct amdgpu_ih_ring {
-	struct amdgpu_bo	*ring_obj;
-	volatile uint32_t	*ring;
-	unsigned		rptr;
 	unsigned		ring_size;
-	uint64_t		gpu_addr;
 	uint32_t		ptr_mask;
-	atomic_t		lock;
-	bool                    enabled;
-	unsigned		wptr_offs;
-	unsigned		rptr_offs;
 	u32			doorbell_index;
 	bool			use_doorbell;
 	bool			use_bus_addr;
-	dma_addr_t		rb_dma_addr; /* only used when use_bus_addr = true */
+
+	struct amdgpu_bo	*ring_obj;
+	volatile uint32_t	*ring;
+	uint64_t		gpu_addr;
+
+	uint64_t		wptr_addr;
+	volatile uint32_t	*wptr_cpu;
+
+	uint64_t		rptr_addr;
+	volatile uint32_t	*rptr_cpu;
+
+	bool                    enabled;
+	unsigned		rptr;
+	atomic_t		lock;
 };
 
 /* provided by the ih block */
 struct amdgpu_ih_funcs {
 	/* ring read/write ptr handling, called from interrupt context */
-	u32 (*get_wptr)(struct amdgpu_device *adev);
-	void (*decode_iv)(struct amdgpu_device *adev,
+	u32 (*get_wptr)(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih);
+	void (*decode_iv)(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih,
 			  struct amdgpu_iv_entry *entry);
-	void (*set_rptr)(struct amdgpu_device *adev);
+	void (*set_rptr)(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih);
 };
 
-#define amdgpu_ih_get_wptr(adev) (adev)->irq.ih_funcs->get_wptr((adev))
-#define amdgpu_ih_decode_iv(adev, iv) (adev)->irq.ih_funcs->decode_iv((adev), (iv))
-#define amdgpu_ih_set_rptr(adev) (adev)->irq.ih_funcs->set_rptr((adev))
+#define amdgpu_ih_get_wptr(adev, ih) (adev)->irq.ih_funcs->get_wptr((adev), (ih))
+#define amdgpu_ih_decode_iv(adev, iv) \
+	(adev)->irq.ih_funcs->decode_iv((adev), (ih), (iv))
+#define amdgpu_ih_set_rptr(adev, ih) (adev)->irq.ih_funcs->set_rptr((adev), (ih))
 
 int amdgpu_ih_ring_init(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih,
 			unsigned ring_size, bool use_bus_addr);
 void amdgpu_ih_ring_fini(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih);
-int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih,
-		      void (*callback)(struct amdgpu_device *adev,
-				       struct amdgpu_ih_ring *ih));
+int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih);
 
 #endif
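
With the ring now passed explicitly, a chip's IH callbacks can serve any of
the rings through the per-ring shadow pointers. A minimal sketch of a
get_wptr() implementation under the new signature (illustrative, not from
this patch):

static u32 example_ih_get_wptr(struct amdgpu_device *adev,
			       struct amdgpu_ih_ring *ih)
{
	/* Read the shadow write pointer for this specific ring. */
	u32 wptr = le32_to_cpu(*ih->wptr_cpu);

	return wptr & ih->ptr_mask;
}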
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
index b7968f426862..af4c3b1af322 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@@ -131,27 +131,6 @@ void amdgpu_irq_disable_all(struct amdgpu_device *adev)
 }
 
 /**
- * amdgpu_irq_callback - callback from the IH ring
- *
- * @adev: amdgpu device pointer
- * @ih: amdgpu ih ring
- *
- * Callback from IH ring processing to handle the entry at the current position
- * and advance the read pointer.
- */
-static void amdgpu_irq_callback(struct amdgpu_device *adev,
-				struct amdgpu_ih_ring *ih)
-{
-	u32 ring_index = ih->rptr >> 2;
-	struct amdgpu_iv_entry entry;
-
-	entry.iv_entry = (const uint32_t *)&ih->ring[ring_index];
-	amdgpu_ih_decode_iv(adev, &entry);
-
-	amdgpu_irq_dispatch(adev, &entry);
-}
-
-/**
  * amdgpu_irq_handler - IRQ handler
  *
  * @irq: IRQ number (unused)
@@ -168,13 +147,43 @@ irqreturn_t amdgpu_irq_handler(int irq, void *arg)
 	struct amdgpu_device *adev = dev->dev_private;
 	irqreturn_t ret;
 
-	ret = amdgpu_ih_process(adev, &adev->irq.ih, amdgpu_irq_callback);
+	ret = amdgpu_ih_process(adev, &adev->irq.ih);
 	if (ret == IRQ_HANDLED)
 		pm_runtime_mark_last_busy(dev->dev);
 	return ret;
 }
 
 /**
+ * amdgpu_irq_handle_ih1 - kick off processing for IH1
+ *
+ * @work: work structure in struct amdgpu_irq
+ *
+ * Kick off processing of IH ring 1.
+ */
+static void amdgpu_irq_handle_ih1(struct work_struct *work)
+{
+	struct amdgpu_device *adev = container_of(work, struct amdgpu_device,
+						  irq.ih1_work);
+
+	amdgpu_ih_process(adev, &adev->irq.ih1);
+}
+
+/**
+ * amdgpu_irq_handle_ih2 - kick off processing for IH2
+ *
+ * @work: work structure in struct amdgpu_irq
+ *
+ * Kick off processing of IH ring 2.
+ */
+static void amdgpu_irq_handle_ih2(struct work_struct *work)
+{
+	struct amdgpu_device *adev = container_of(work, struct amdgpu_device,
+						  irq.ih2_work);
+
+	amdgpu_ih_process(adev, &adev->irq.ih2);
+}
+
+/**
  * amdgpu_msi_ok - check whether MSI functionality is enabled
  *
  * @adev: amdgpu device pointer (unused)
@@ -238,6 +247,9 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
 				amdgpu_hotplug_work_func);
 	}
 
+	INIT_WORK(&adev->irq.ih1_work, amdgpu_irq_handle_ih1);
+	INIT_WORK(&adev->irq.ih2_work, amdgpu_irq_handle_ih2);
+
 	adev->irq.installed = true;
 	r = drm_irq_install(adev->ddev, adev->ddev->pdev->irq);
 	if (r) {
@@ -359,15 +371,22 @@ int amdgpu_irq_add_id(struct amdgpu_device *adev,
  * Dispatches IRQ to IP blocks.
  */
 void amdgpu_irq_dispatch(struct amdgpu_device *adev,
-			 struct amdgpu_iv_entry *entry)
+			 struct amdgpu_ih_ring *ih)
 {
-	unsigned client_id = entry->client_id;
-	unsigned src_id = entry->src_id;
+	u32 ring_index = ih->rptr >> 2;
+	struct amdgpu_iv_entry entry;
+	unsigned client_id, src_id;
 	struct amdgpu_irq_src *src;
 	bool handled = false;
 	int r;
 
-	trace_amdgpu_iv(entry);
+	entry.iv_entry = (const uint32_t *)&ih->ring[ring_index];
+	amdgpu_ih_decode_iv(adev, &entry);
+
+	trace_amdgpu_iv(ih - &adev->irq.ih, &entry);
+
+	client_id = entry.client_id;
+	src_id = entry.src_id;
 
 	if (client_id >= AMDGPU_IRQ_CLIENTID_MAX) {
 		DRM_DEBUG("Invalid client_id in IV: %d\n", client_id);
@@ -383,7 +402,7 @@ void amdgpu_irq_dispatch(struct amdgpu_device *adev,
 			  client_id, src_id);
 
 	} else if ((src = adev->irq.client[client_id].sources[src_id])) {
-		r = src->funcs->process(adev, src, entry);
+		r = src->funcs->process(adev, src, &entry);
 		if (r < 0)
 			DRM_ERROR("error processing interrupt (%d)\n", r);
 		else if (r)
@@ -395,7 +414,7 @@ void amdgpu_irq_dispatch(struct amdgpu_device *adev,
 
 	/* Send it to amdkfd as well if it isn't already handled */
 	if (!handled)
-		amdgpu_amdkfd_interrupt(adev, entry->iv_entry);
+		amdgpu_amdkfd_interrupt(adev, entry.iv_entry);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
index f6ce171cb8aa..c718e94a55c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.h
@@ -87,9 +87,11 @@ struct amdgpu_irq {
 	/* status, etc. */
 	bool				msi_enabled; /* msi enabled */
 
-	/* interrupt ring */
-	struct amdgpu_ih_ring		ih;
-	const struct amdgpu_ih_funcs	*ih_funcs;
+	/* interrupt rings */
+	struct amdgpu_ih_ring		ih, ih1, ih2;
+	const struct amdgpu_ih_funcs    *ih_funcs;
+	struct work_struct		ih1_work, ih2_work;
+	struct amdgpu_irq_src		self_irq;
 
 	/* gen irq stuff */
 	struct irq_domain		*domain; /* GPU irq controller domain */
@@ -106,7 +108,7 @@ int amdgpu_irq_add_id(struct amdgpu_device *adev,
 		      unsigned client_id, unsigned src_id,
 		      struct amdgpu_irq_src *source);
 void amdgpu_irq_dispatch(struct amdgpu_device *adev,
-			 struct amdgpu_iv_entry *entry);
+			 struct amdgpu_ih_ring *ih);
 int amdgpu_irq_update(struct amdgpu_device *adev, struct amdgpu_irq_src *src,
 		      unsigned type);
 int amdgpu_irq_get(struct amdgpu_device *adev, struct amdgpu_irq_src *src,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 5dc349173e4f..e860412043bb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -207,7 +207,7 @@ int amdgpu_driver_load_kms(struct drm_device *dev, unsigned long flags)
 	if (!r) {
 		acpi_status = amdgpu_acpi_init(adev);
 		if (acpi_status)
-		dev_dbg(&dev->pdev->dev,
+			dev_dbg(&dev->pdev->dev,
 				"Error during ACPI methods call\n");
 	}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index 3aa42c64484a..889e443eeee7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -38,6 +38,7 @@
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/i2c.h>
 #include <linux/i2c-algo-bit.h>
 #include <linux/hrtimer.h>
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 728e15e5d68a..ec9e45004bff 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -426,12 +426,20 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
 	size_t acc_size;
 	int r;
 
-	page_align = roundup(bp->byte_align, PAGE_SIZE) >> PAGE_SHIFT;
-	if (bp->domain & (AMDGPU_GEM_DOMAIN_GDS | AMDGPU_GEM_DOMAIN_GWS |
-			  AMDGPU_GEM_DOMAIN_OA))
+	/* Note that GDS/GWS/OA allocates 1 page per byte/resource. */
+	if (bp->domain & (AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA)) {
+		/* GWS and OA don't need any alignment. */
+		page_align = bp->byte_align;
 		size <<= PAGE_SHIFT;
-	else
+	} else if (bp->domain & AMDGPU_GEM_DOMAIN_GDS) {
+		/* Both size and alignment must be a multiple of 4. */
+		page_align = ALIGN(bp->byte_align, 4);
+		size = ALIGN(size, 4) << PAGE_SHIFT;
+	} else {
+		/* Memory should be aligned at least to a page size. */
+		page_align = ALIGN(bp->byte_align, PAGE_SIZE) >> PAGE_SHIFT;
 		size = ALIGN(size, PAGE_SIZE);
+	}
 
 	if (!amdgpu_bo_validate_size(adev, size, bp->domain))
 		return -ENOMEM;
@@ -1277,6 +1285,30 @@ void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
 }
 
 /**
+ * amdgpu_bo_sync_wait - Wait for BO reservation fences
+ *
+ * @bo: buffer object
+ * @owner: fence owner
+ * @intr: Whether the wait is interruptible
+ *
+ * Returns:
+ * 0 on success, errno otherwise.
+ */
+int amdgpu_bo_sync_wait(struct amdgpu_bo *bo, void *owner, bool intr)
+{
+	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
+	struct amdgpu_sync sync;
+	int r;
+
+	amdgpu_sync_create(&sync);
+	amdgpu_sync_resv(adev, &sync, bo->tbo.resv, owner, false);
+	r = amdgpu_sync_wait(&sync, intr);
+	amdgpu_sync_free(&sync);
+
+	return r;
+}
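
Typical usage matches the init_kfd_vm() hunk earlier in this patch: wait
for all fences on the page directory except those belonging to the given
owner (fragment, owner value as used in that hunk):

	ret = amdgpu_bo_sync_wait(vm->root.base.bo,
				  AMDGPU_FENCE_OWNER_KFD, false);
	if (ret)
		goto wait_pd_fail;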
+
+/**
  * amdgpu_bo_gpu_offset - return GPU offset of bo
  * @bo:	amdgpu object for which we query the offset
  *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 9291c2f837e9..220a6a7b1bc1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -266,6 +266,7 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
 int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo);
 void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
 		     bool shared);
+int amdgpu_bo_sync_wait(struct amdgpu_bo *bo, void *owner, bool intr);
 u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo);
 int amdgpu_bo_validate(struct amdgpu_bo *bo);
 int amdgpu_bo_restore_shadow(struct amdgpu_bo *shadow,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
index 0ed41a9d2d77..a7adb7b6bd98 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
@@ -626,11 +626,71 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
 }
 
 /**
- * DOC: pp_dpm_sclk pp_dpm_mclk pp_dpm_pcie
+ * DOC: ppfeatures
+ *
+ * The amdgpu driver provides a sysfs API for adjusting which powerplay
+ * features are enabled. The file ppfeatures is used for this, and it is
+ * only available on Vega10 and later dGPUs.
+ *
+ * Reading back the file shows:
+ * - the current ppfeature mask
+ * - a list of all supported powerplay features with their names,
+ *   bitmasks and enablement status ('Y' = enabled, 'N' = disabled)
+ *
+ * To manually enable or disable a specific feature, set or clear the
+ * corresponding bit in the current ppfeature mask and write the new
+ * mask back to the file.
+ */
+static ssize_t amdgpu_set_ppfeature_status(struct device *dev,
+		struct device_attribute *attr,
+		const char *buf,
+		size_t count)
+{
+	struct drm_device *ddev = dev_get_drvdata(dev);
+	struct amdgpu_device *adev = ddev->dev_private;
+	uint64_t featuremask;
+	int ret;
+
+	ret = kstrtou64(buf, 0, &featuremask);
+	if (ret)
+		return -EINVAL;
+
+	pr_debug("featuremask = 0x%llx\n", featuremask);
+
+	if (adev->powerplay.pp_funcs->set_ppfeature_status) {
+		ret = amdgpu_dpm_set_ppfeature_status(adev, featuremask);
+		if (ret)
+			return -EINVAL;
+	}
+
+	return count;
+}
+
+static ssize_t amdgpu_get_ppfeature_status(struct device *dev,
+		struct device_attribute *attr,
+		char *buf)
+{
+	struct drm_device *ddev = dev_get_drvdata(dev);
+	struct amdgpu_device *adev = ddev->dev_private;
+
+	if (adev->powerplay.pp_funcs->get_ppfeature_status)
+		return amdgpu_dpm_get_ppfeature_status(adev, buf);
+
+	return snprintf(buf, PAGE_SIZE, "\n");
+}
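
A hypothetical userspace sketch of the write path documented above; the
sysfs path and the mask value are illustrative only:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	const char *path = "/sys/class/drm/card0/device/ppfeatures";
	uint64_t mask = 0xffffffffffff3fffULL;	/* example: two bits cleared */
	FILE *f = fopen(path, "w");

	if (!f)
		return 1;
	/* kstrtou64(buf, 0, ...) on the kernel side accepts a 0x prefix. */
	fprintf(f, "0x%llx\n", (unsigned long long)mask);
	fclose(f);
	return 0;
}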
+
+/**
+ * DOC: pp_dpm_sclk pp_dpm_mclk pp_dpm_socclk pp_dpm_fclk pp_dpm_dcefclk
+ * pp_dpm_pcie
  *
  * The amdgpu driver provides a sysfs API for adjusting what power levels
  * are enabled for a given power state.  The files pp_dpm_sclk, pp_dpm_mclk,
- * and pp_dpm_pcie are used for this.
+ * pp_dpm_socclk, pp_dpm_fclk, pp_dpm_dcefclk and pp_dpm_pcie are used for
+ * this.
+ *
+ * The pp_dpm_socclk and pp_dpm_dcefclk interfaces are only available on
+ * Vega10 and later ASICs.
+ * The pp_dpm_fclk interface is only available on Vega20 and later ASICs.
  *
  * Reading back the files will show you the available power levels within
  * the power state and the clock information for those levels.
@@ -640,6 +700,8 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
  * Secondly, enter a new value for each level by writing a string that
  * contains "echo xx xx xx > pp_dpm_sclk/mclk/pcie".
  * E.g., "echo 4 5 6 > pp_dpm_sclk" will enable sclk levels 4, 5, and 6.
+ *
+ * NOTE: changing the dcefclk max dpm level is not currently supported.
  */
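
A hypothetical userspace sketch of the write step above (device path
illustrative; the level list is parsed by amdgpu_read_mask()):

#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/sys/class/drm/card0/device/pp_dpm_sclk", "w");

	if (!f)
		return 1;
	fputs("4 5 6", f);	/* enable sclk levels 4, 5 and 6 */
	fclose(f);
	return 0;
}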
 
 static ssize_t amdgpu_get_pp_dpm_sclk(struct device *dev,
@@ -750,6 +812,114 @@ static ssize_t amdgpu_set_pp_dpm_mclk(struct device *dev,
 	return count;
 }
 
+static ssize_t amdgpu_get_pp_dpm_socclk(struct device *dev,
+		struct device_attribute *attr,
+		char *buf)
+{
+	struct drm_device *ddev = dev_get_drvdata(dev);
+	struct amdgpu_device *adev = ddev->dev_private;
+
+	if (adev->powerplay.pp_funcs->print_clock_levels)
+		return amdgpu_dpm_print_clock_levels(adev, PP_SOCCLK, buf);
+	else
+		return snprintf(buf, PAGE_SIZE, "\n");
+}
+
+static ssize_t amdgpu_set_pp_dpm_socclk(struct device *dev,
+		struct device_attribute *attr,
+		const char *buf,
+		size_t count)
+{
+	struct drm_device *ddev = dev_get_drvdata(dev);
+	struct amdgpu_device *adev = ddev->dev_private;
+	int ret;
+	uint32_t mask = 0;
+
+	ret = amdgpu_read_mask(buf, count, &mask);
+	if (ret)
+		return ret;
+
+	if (adev->powerplay.pp_funcs->force_clock_level)
+		ret = amdgpu_dpm_force_clock_level(adev, PP_SOCCLK, mask);
+
+	if (ret)
+		return -EINVAL;
+
+	return count;
+}
+
+static ssize_t amdgpu_get_pp_dpm_fclk(struct device *dev,
+		struct device_attribute *attr,
+		char *buf)
+{
+	struct drm_device *ddev = dev_get_drvdata(dev);
+	struct amdgpu_device *adev = ddev->dev_private;
+
+	if (adev->powerplay.pp_funcs->print_clock_levels)
+		return amdgpu_dpm_print_clock_levels(adev, PP_FCLK, buf);
+	else
+		return snprintf(buf, PAGE_SIZE, "\n");
+}
+
+static ssize_t amdgpu_set_pp_dpm_fclk(struct device *dev,
+		struct device_attribute *attr,
+		const char *buf,
+		size_t count)
+{
+	struct drm_device *ddev = dev_get_drvdata(dev);
+	struct amdgpu_device *adev = ddev->dev_private;
+	int ret;
+	uint32_t mask = 0;
+
+	ret = amdgpu_read_mask(buf, count, &mask);
+	if (ret)
+		return ret;
+
+	if (adev->powerplay.pp_funcs->force_clock_level)
+		ret = amdgpu_dpm_force_clock_level(adev, PP_FCLK, mask);
+
+	if (ret)
+		return -EINVAL;
+
+	return count;
+}
+
+static ssize_t amdgpu_get_pp_dpm_dcefclk(struct device *dev,
+		struct device_attribute *attr,
+		char *buf)
+{
+	struct drm_device *ddev = dev_get_drvdata(dev);
+	struct amdgpu_device *adev = ddev->dev_private;
+
+	if (adev->powerplay.pp_funcs->print_clock_levels)
+		return amdgpu_dpm_print_clock_levels(adev, PP_DCEFCLK, buf);
+	else
+		return snprintf(buf, PAGE_SIZE, "\n");
+}
+
+static ssize_t amdgpu_set_pp_dpm_dcefclk(struct device *dev,
+		struct device_attribute *attr,
+		const char *buf,
+		size_t count)
+{
+	struct drm_device *ddev = dev_get_drvdata(dev);
+	struct amdgpu_device *adev = ddev->dev_private;
+	int ret;
+	uint32_t mask = 0;
+
+	ret = amdgpu_read_mask(buf, count, &mask);
+	if (ret)
+		return ret;
+
+	if (adev->powerplay.pp_funcs->force_clock_level)
+		ret = amdgpu_dpm_force_clock_level(adev, PP_DCEFCLK, mask);
+
+	if (ret)
+		return -EINVAL;
+
+	return count;
+}
+
 static ssize_t amdgpu_get_pp_dpm_pcie(struct device *dev,
 		struct device_attribute *attr,
 		char *buf)
@@ -990,6 +1160,31 @@ static ssize_t amdgpu_get_busy_percent(struct device *dev,
 	return snprintf(buf, PAGE_SIZE, "%d\n", value);
 }
 
+/**
+ * DOC: pcie_bw
+ *
+ * The amdgpu driver provides a sysfs API for estimating how much data
+ * has been received and sent by the GPU in the last second through PCIe.
+ * The file pcie_bw is used for this.
+ * The Perf counters count the number of received and sent messages and return
+ * those values, as well as the maximum payload size of a PCIe packet (mps).
+ * Note that it is not possible to easily and quickly obtain the size of each
+ * packet transmitted, so we output the max payload size (mps) to allow for
+ * quick estimation of the PCIe bandwidth usage.
+ */
+static ssize_t amdgpu_get_pcie_bw(struct device *dev,
+		struct device_attribute *attr,
+		char *buf)
+{
+	struct drm_device *ddev = dev_get_drvdata(dev);
+	struct amdgpu_device *adev = ddev->dev_private;
+	uint64_t count0, count1;
+
+	amdgpu_asic_get_pcie_usage(adev, &count0, &count1);
+	return snprintf(buf, PAGE_SIZE, "%llu %llu %i\n",
+			count0, count1, pcie_get_mps(adev->pdev));
+}
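
Since only message counts and the maximum payload size are reported,
userspace can at best compute an upper bound on the traffic. A hypothetical
sketch (device path illustrative):

#include <stdio.h>

int main(void)
{
	unsigned long long c0, c1;
	int mps;
	FILE *f = fopen("/sys/class/drm/card0/device/pcie_bw", "r");

	if (!f || fscanf(f, "%llu %llu %d", &c0, &c1, &mps) != 3)
		return 1;
	fclose(f);

	/* Upper bound: assume every counted message used the full payload. */
	printf("<= %llu bytes through PCIe in the last second\n",
	       (c0 + c1) * mps);
	return 0;
}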
+
 static DEVICE_ATTR(power_dpm_state, S_IRUGO | S_IWUSR, amdgpu_get_dpm_state, amdgpu_set_dpm_state);
 static DEVICE_ATTR(power_dpm_force_performance_level, S_IRUGO | S_IWUSR,
 		   amdgpu_get_dpm_forced_performance_level,
@@ -1008,6 +1203,15 @@ static DEVICE_ATTR(pp_dpm_sclk, S_IRUGO | S_IWUSR,
 static DEVICE_ATTR(pp_dpm_mclk, S_IRUGO | S_IWUSR,
 		amdgpu_get_pp_dpm_mclk,
 		amdgpu_set_pp_dpm_mclk);
+static DEVICE_ATTR(pp_dpm_socclk, S_IRUGO | S_IWUSR,
+		amdgpu_get_pp_dpm_socclk,
+		amdgpu_set_pp_dpm_socclk);
+static DEVICE_ATTR(pp_dpm_fclk, S_IRUGO | S_IWUSR,
+		amdgpu_get_pp_dpm_fclk,
+		amdgpu_set_pp_dpm_fclk);
+static DEVICE_ATTR(pp_dpm_dcefclk, S_IRUGO | S_IWUSR,
+		amdgpu_get_pp_dpm_dcefclk,
+		amdgpu_set_pp_dpm_dcefclk);
 static DEVICE_ATTR(pp_dpm_pcie, S_IRUGO | S_IWUSR,
 		amdgpu_get_pp_dpm_pcie,
 		amdgpu_set_pp_dpm_pcie);
@@ -1025,6 +1229,10 @@ static DEVICE_ATTR(pp_od_clk_voltage, S_IRUGO | S_IWUSR,
 		amdgpu_set_pp_od_clk_voltage);
 static DEVICE_ATTR(gpu_busy_percent, S_IRUGO,
 		amdgpu_get_busy_percent, NULL);
+static DEVICE_ATTR(pcie_bw, S_IRUGO, amdgpu_get_pcie_bw, NULL);
+static DEVICE_ATTR(ppfeatures, S_IRUGO | S_IWUSR,
+		amdgpu_get_ppfeature_status,
+		amdgpu_set_ppfeature_status);
 
 static ssize_t amdgpu_hwmon_show_temp(struct device *dev,
 				      struct device_attribute *attr,
@@ -1516,6 +1724,75 @@ static ssize_t amdgpu_hwmon_set_power_cap(struct device *dev,
 	return count;
 }
 
+static ssize_t amdgpu_hwmon_show_sclk(struct device *dev,
+				      struct device_attribute *attr,
+				      char *buf)
+{
+	struct amdgpu_device *adev = dev_get_drvdata(dev);
+	struct drm_device *ddev = adev->ddev;
+	uint32_t sclk;
+	int r, size = sizeof(sclk);
+
+	/* Can't get the clock frequency when the card is off */
+	if ((adev->flags & AMD_IS_PX) &&
+	     (ddev->switch_power_state != DRM_SWITCH_POWER_ON))
+		return -EINVAL;
+
+	/* sanity check PP is enabled */
+	if (!(adev->powerplay.pp_funcs &&
+	      adev->powerplay.pp_funcs->read_sensor))
+		return -EINVAL;
+
+	/* get the sclk */
+	r = amdgpu_dpm_read_sensor(adev, AMDGPU_PP_SENSOR_GFX_SCLK,
+				   (void *)&sclk, &size);
+	if (r)
+		return r;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", sclk * 10 * 1000);
+}
+
+static ssize_t amdgpu_hwmon_show_sclk_label(struct device *dev,
+					    struct device_attribute *attr,
+					    char *buf)
+{
+	return snprintf(buf, PAGE_SIZE, "sclk\n");
+}
+
+static ssize_t amdgpu_hwmon_show_mclk(struct device *dev,
+				      struct device_attribute *attr,
+				      char *buf)
+{
+	struct amdgpu_device *adev = dev_get_drvdata(dev);
+	struct drm_device *ddev = adev->ddev;
+	uint32_t mclk;
+	int r, size = sizeof(mclk);
+
+	/* Can't get the clock frequency when the card is off */
+	if ((adev->flags & AMD_IS_PX) &&
+	     (ddev->switch_power_state != DRM_SWITCH_POWER_ON))
+		return -EINVAL;
+
+	/* sanity check PP is enabled */
+	if (!(adev->powerplay.pp_funcs &&
+	      adev->powerplay.pp_funcs->read_sensor))
+		return -EINVAL;
+
+	/* get the mclk */
+	r = amdgpu_dpm_read_sensor(adev, AMDGPU_PP_SENSOR_GFX_MCLK,
+				   (void *)&mclk, &size);
+	if (r)
+		return r;
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", mclk * 10 * 1000);
+}
+
+static ssize_t amdgpu_hwmon_show_mclk_label(struct device *dev,
+					    struct device_attribute *attr,
+					    char *buf)
+{
+	return snprintf(buf, PAGE_SIZE, "mclk\n");
+}
 
 /**
  * DOC: hwmon
@@ -1532,6 +1809,10 @@ static ssize_t amdgpu_hwmon_set_power_cap(struct device *dev,
  *
  * - GPU fan
  *
+ * - GPU gfx/compute engine clock
+ *
+ * - GPU memory clock (dGPU only)
+ *
  * hwmon interfaces for GPU temperature:
  *
  * - temp1_input: the on die GPU temperature in millidegrees Celsius
@@ -1576,6 +1857,12 @@ static ssize_t amdgpu_hwmon_set_power_cap(struct device *dev,
  *
  * - fan[1-*]_enable: Enable or disable the sensors.1: Enable 0: Disable
  *
+ * hwmon interfaces for GPU clocks:
+ *
+ * - freq1_input: the gfx/compute clock in hertz
+ *
+ * - freq2_input: the memory clock in hertz
+ *
  * You can use hwmon tools like sensors to view this information on your system.
  *
  */
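
A hypothetical userspace sketch reading the new frequency channels (the
hwmon instance number varies per system, so the path is illustrative):

#include <stdio.h>

int main(void)
{
	unsigned long hz;
	FILE *f = fopen("/sys/class/hwmon/hwmon0/freq1_input", "r");

	if (!f || fscanf(f, "%lu", &hz) != 1)
		return 1;
	fclose(f);
	printf("gfx clock: %lu MHz\n", hz / 1000000);
	return 0;
}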
@@ -1600,6 +1887,10 @@ static SENSOR_DEVICE_ATTR(power1_average, S_IRUGO, amdgpu_hwmon_show_power_avg,
 static SENSOR_DEVICE_ATTR(power1_cap_max, S_IRUGO, amdgpu_hwmon_show_power_cap_max, NULL, 0);
 static SENSOR_DEVICE_ATTR(power1_cap_min, S_IRUGO, amdgpu_hwmon_show_power_cap_min, NULL, 0);
 static SENSOR_DEVICE_ATTR(power1_cap, S_IRUGO | S_IWUSR, amdgpu_hwmon_show_power_cap, amdgpu_hwmon_set_power_cap, 0);
+static SENSOR_DEVICE_ATTR(freq1_input, S_IRUGO, amdgpu_hwmon_show_sclk, NULL, 0);
+static SENSOR_DEVICE_ATTR(freq1_label, S_IRUGO, amdgpu_hwmon_show_sclk_label, NULL, 0);
+static SENSOR_DEVICE_ATTR(freq2_input, S_IRUGO, amdgpu_hwmon_show_mclk, NULL, 0);
+static SENSOR_DEVICE_ATTR(freq2_label, S_IRUGO, amdgpu_hwmon_show_mclk_label, NULL, 0);
 
 static struct attribute *hwmon_attributes[] = {
 	&sensor_dev_attr_temp1_input.dev_attr.attr,
@@ -1622,6 +1913,10 @@ static struct attribute *hwmon_attributes[] = {
 	&sensor_dev_attr_power1_cap_max.dev_attr.attr,
 	&sensor_dev_attr_power1_cap_min.dev_attr.attr,
 	&sensor_dev_attr_power1_cap.dev_attr.attr,
+	&sensor_dev_attr_freq1_input.dev_attr.attr,
+	&sensor_dev_attr_freq1_label.dev_attr.attr,
+	&sensor_dev_attr_freq2_input.dev_attr.attr,
+	&sensor_dev_attr_freq2_label.dev_attr.attr,
 	NULL
 };
 
@@ -1713,6 +2008,12 @@ static umode_t hwmon_attributes_visible(struct kobject *kobj,
 	     attr == &sensor_dev_attr_in1_label.dev_attr.attr))
 		return 0;
 
+	/* no mclk on APUs */
+	if ((adev->flags & AMD_IS_APU) &&
+	    (attr == &sensor_dev_attr_freq2_input.dev_attr.attr ||
+	     attr == &sensor_dev_attr_freq2_label.dev_attr.attr))
+		return 0;
+
 	return effective_mode;
 }
 
@@ -2071,6 +2372,25 @@ int amdgpu_pm_sysfs_init(struct amdgpu_device *adev)
 		DRM_ERROR("failed to create device file pp_dpm_mclk\n");
 		return ret;
 	}
+	if (adev->asic_type >= CHIP_VEGA10) {
+		ret = device_create_file(adev->dev, &dev_attr_pp_dpm_socclk);
+		if (ret) {
+			DRM_ERROR("failed to create device file pp_dpm_socclk\n");
+			return ret;
+		}
+		ret = device_create_file(adev->dev, &dev_attr_pp_dpm_dcefclk);
+		if (ret) {
+			DRM_ERROR("failed to create device file pp_dpm_dcefclk\n");
+			return ret;
+		}
+	}
+	if (adev->asic_type >= CHIP_VEGA20) {
+		ret = device_create_file(adev->dev, &dev_attr_pp_dpm_fclk);
+		if (ret) {
+			DRM_ERROR("failed to create device file pp_dpm_fclk\n");
+			return ret;
+		}
+	}
 	ret = device_create_file(adev->dev, &dev_attr_pp_dpm_pcie);
 	if (ret) {
 		DRM_ERROR("failed to create device file pp_dpm_pcie\n");
@@ -2109,12 +2429,31 @@ int amdgpu_pm_sysfs_init(struct amdgpu_device *adev)
 				"gpu_busy_level\n");
 		return ret;
 	}
+	/* PCIe Perf counters won't work on APU nodes */
+	if (!(adev->flags & AMD_IS_APU)) {
+		ret = device_create_file(adev->dev, &dev_attr_pcie_bw);
+		if (ret) {
+			DRM_ERROR("failed to create device file pcie_bw\n");
+			return ret;
+		}
+	}
 	ret = amdgpu_debugfs_pm_init(adev);
 	if (ret) {
 		DRM_ERROR("Failed to register debugfs file for dpm!\n");
 		return ret;
 	}
 
+	if ((adev->asic_type >= CHIP_VEGA10) &&
+	    !(adev->flags & AMD_IS_APU)) {
+		ret = device_create_file(adev->dev, &dev_attr_ppfeatures);
+		if (ret) {
+			DRM_ERROR("failed to create device file ppfeatures\n");
+			return ret;
+		}
+	}
+
 	adev->pm.sysfs_initialized = true;
 
 	return 0;
@@ -2139,7 +2478,13 @@ void amdgpu_pm_sysfs_fini(struct amdgpu_device *adev)
 
 	device_remove_file(adev->dev, &dev_attr_pp_dpm_sclk);
 	device_remove_file(adev->dev, &dev_attr_pp_dpm_mclk);
+	if (adev->asic_type >= CHIP_VEGA10) {
+		device_remove_file(adev->dev, &dev_attr_pp_dpm_socclk);
+		device_remove_file(adev->dev, &dev_attr_pp_dpm_dcefclk);
+	}
 	device_remove_file(adev->dev, &dev_attr_pp_dpm_pcie);
+	if (adev->asic_type >= CHIP_VEGA20)
+		device_remove_file(adev->dev, &dev_attr_pp_dpm_fclk);
 	device_remove_file(adev->dev, &dev_attr_pp_sclk_od);
 	device_remove_file(adev->dev, &dev_attr_pp_mclk_od);
 	device_remove_file(adev->dev,
@@ -2148,6 +2493,11 @@ void amdgpu_pm_sysfs_fini(struct amdgpu_device *adev)
 		device_remove_file(adev->dev,
 				&dev_attr_pp_od_clk_voltage);
 	device_remove_file(adev->dev, &dev_attr_gpu_busy_percent);
+	if (!(adev->flags & AMD_IS_APU))
+		device_remove_file(adev->dev, &dev_attr_pcie_bw);
+	if ((adev->asic_type >= CHIP_VEGA10) &&
+	    !(adev->flags & AMD_IS_APU))
+		device_remove_file(adev->dev, &dev_attr_ppfeatures);
 }
 
 void amdgpu_pm_compute_clocks(struct amdgpu_device *adev)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 3a9b48b227ac..3091488cd8cc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -67,9 +67,6 @@ static int psp_sw_init(void *handle)
 
 	psp->adev = adev;
 
-	if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP)
-		return 0;
-
 	ret = psp_init_microcode(psp);
 	if (ret) {
 		DRM_ERROR("Failed to load psp firmware!\n");
@@ -83,9 +80,6 @@ static int psp_sw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP)
-		return 0;
-
 	release_firmware(adev->psp.sos_fw);
 	adev->psp.sos_fw = NULL;
 	release_firmware(adev->psp.asd_fw);
@@ -142,13 +136,24 @@ psp_cmd_submit_buf(struct psp_context *psp,
 	while (*((unsigned int *)psp->fence_buf) != index)
 		msleep(1);
 
-	/* the status field must be 0 after FW is loaded */
-	if (ucode && psp->cmd_buf_mem->resp.status) {
-		DRM_ERROR("failed loading with status (%d) and ucode id (%d)\n",
-			  psp->cmd_buf_mem->resp.status, ucode->ucode_id);
-		return -EINVAL;
+	/* In some cases the PSP response status is not 0 even though the
+	 * command was submitted without a problem; some versions of the PSP
+	 * firmware simply don't write 0 to that field.
+	 * So only print a warning here instead of returning an error, to
+	 * avoid breaking hw_init during PSP initialization.
+	 */
+	if (psp->cmd_buf_mem->resp.status) {
+		if (ucode)
+			DRM_WARN("failed to load ucode id (%d) ",
+				  ucode->ucode_id);
+		DRM_WARN("psp command failed and response status is (%d)\n",
+			  psp->cmd_buf_mem->resp.status);
 	}
 
+	/* get xGMI session id from response buffer */
+	cmd->resp.session_id = psp->cmd_buf_mem->resp.session_id;
+
 	if (ucode) {
 		ucode->tmr_mc_addr_lo = psp->cmd_buf_mem->resp.fw_addr_lo;
 		ucode->tmr_mc_addr_hi = psp->cmd_buf_mem->resp.fw_addr_hi;
@@ -500,6 +505,98 @@ static int psp_hw_start(struct psp_context *psp)
 	return 0;
 }
 
+static int psp_get_fw_type(struct amdgpu_firmware_info *ucode,
+			   enum psp_gfx_fw_type *type)
+{
+	switch (ucode->ucode_id) {
+	case AMDGPU_UCODE_ID_SDMA0:
+		*type = GFX_FW_TYPE_SDMA0;
+		break;
+	case AMDGPU_UCODE_ID_SDMA1:
+		*type = GFX_FW_TYPE_SDMA1;
+		break;
+	case AMDGPU_UCODE_ID_CP_CE:
+		*type = GFX_FW_TYPE_CP_CE;
+		break;
+	case AMDGPU_UCODE_ID_CP_PFP:
+		*type = GFX_FW_TYPE_CP_PFP;
+		break;
+	case AMDGPU_UCODE_ID_CP_ME:
+		*type = GFX_FW_TYPE_CP_ME;
+		break;
+	case AMDGPU_UCODE_ID_CP_MEC1:
+		*type = GFX_FW_TYPE_CP_MEC;
+		break;
+	case AMDGPU_UCODE_ID_CP_MEC1_JT:
+		*type = GFX_FW_TYPE_CP_MEC_ME1;
+		break;
+	case AMDGPU_UCODE_ID_CP_MEC2:
+		*type = GFX_FW_TYPE_CP_MEC;
+		break;
+	case AMDGPU_UCODE_ID_CP_MEC2_JT:
+		*type = GFX_FW_TYPE_CP_MEC_ME2;
+		break;
+	case AMDGPU_UCODE_ID_RLC_G:
+		*type = GFX_FW_TYPE_RLC_G;
+		break;
+	case AMDGPU_UCODE_ID_RLC_RESTORE_LIST_CNTL:
+		*type = GFX_FW_TYPE_RLC_RESTORE_LIST_SRM_CNTL;
+		break;
+	case AMDGPU_UCODE_ID_RLC_RESTORE_LIST_GPM_MEM:
+		*type = GFX_FW_TYPE_RLC_RESTORE_LIST_GPM_MEM;
+		break;
+	case AMDGPU_UCODE_ID_RLC_RESTORE_LIST_SRM_MEM:
+		*type = GFX_FW_TYPE_RLC_RESTORE_LIST_SRM_MEM;
+		break;
+	case AMDGPU_UCODE_ID_SMC:
+		*type = GFX_FW_TYPE_SMU;
+		break;
+	case AMDGPU_UCODE_ID_UVD:
+		*type = GFX_FW_TYPE_UVD;
+		break;
+	case AMDGPU_UCODE_ID_UVD1:
+		*type = GFX_FW_TYPE_UVD1;
+		break;
+	case AMDGPU_UCODE_ID_VCE:
+		*type = GFX_FW_TYPE_VCE;
+		break;
+	case AMDGPU_UCODE_ID_VCN:
+		*type = GFX_FW_TYPE_VCN;
+		break;
+	case AMDGPU_UCODE_ID_DMCU_ERAM:
+		*type = GFX_FW_TYPE_DMCU_ERAM;
+		break;
+	case AMDGPU_UCODE_ID_DMCU_INTV:
+		*type = GFX_FW_TYPE_DMCU_ISR;
+		break;
+	case AMDGPU_UCODE_ID_MAXIMUM:
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int psp_prep_load_ip_fw_cmd_buf(struct amdgpu_firmware_info *ucode,
+				       struct psp_gfx_cmd_resp *cmd)
+{
+	int ret;
+	uint64_t fw_mem_mc_addr = ucode->mc_addr;
+
+	memset(cmd, 0, sizeof(struct psp_gfx_cmd_resp));
+
+	cmd->cmd_id = GFX_CMD_ID_LOAD_IP_FW;
+	cmd->cmd.cmd_load_ip_fw.fw_phy_addr_lo = lower_32_bits(fw_mem_mc_addr);
+	cmd->cmd.cmd_load_ip_fw.fw_phy_addr_hi = upper_32_bits(fw_mem_mc_addr);
+	cmd->cmd.cmd_load_ip_fw.fw_size = ucode->ucode_size;
+
+	ret = psp_get_fw_type(ucode, &cmd->cmd.cmd_load_ip_fw.fw_type);
+	if (ret)
+		DRM_ERROR("Unknown firmware type\n");
+
+	return ret;
+}
+
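/*
 * A minimal sketch (not part of the patch) of what the
 * lower_32_bits()/upper_32_bits() split in psp_prep_load_ip_fw_cmd_buf()
 * does to the firmware MC address; the address value is hypothetical.
 */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t fw_mem_mc_addr = 0x0000000123456000ULL;
	uint32_t lo = (uint32_t)(fw_mem_mc_addr & 0xffffffffULL);
	uint32_t hi = (uint32_t)(fw_mem_mc_addr >> 32);

	/* prints lo=0x23456000 hi=0x00000001 */
	printf("lo=0x%08x hi=0x%08x\n", lo, hi);
	return 0;
}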
 static int psp_np_fw_load(struct psp_context *psp)
 {
 	int i, ret;
@@ -521,7 +618,7 @@ static int psp_np_fw_load(struct psp_context *psp)
 			/*skip ucode loading in SRIOV VF */
 			continue;
 
-		ret = psp_prep_cmd_buf(ucode, psp->cmd);
+		ret = psp_prep_load_ip_fw_cmd_buf(ucode, psp->cmd);
 		if (ret)
 			return ret;
 
@@ -546,7 +643,7 @@ static int psp_load_fw(struct amdgpu_device *adev)
 	struct psp_context *psp = &adev->psp;
 
 	if (amdgpu_sriov_vf(adev) && adev->in_gpu_reset) {
-		psp_ring_destroy(psp, PSP_RING_TYPE__KM);
+		psp_ring_stop(psp, PSP_RING_TYPE__KM); /* should not destroy ring, only stop */
 		goto skip_memalloc;
 	}
 
@@ -623,10 +720,6 @@ static int psp_hw_init(void *handle)
 	int ret;
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-
-	if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP)
-		return 0;
-
 	mutex_lock(&adev->firmware.mutex);
 	/*
 	 * This sequence is just used on hw_init only once, no need on
@@ -656,9 +749,6 @@ static int psp_hw_fini(void *handle)
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 	struct psp_context *psp = &adev->psp;
 
-	if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP)
-		return 0;
-
 	if (adev->gmc.xgmi.num_physical_nodes > 1 &&
 	    psp->xgmi_context.initialized == 1)
                 psp_xgmi_terminate(psp);
@@ -687,9 +777,6 @@ static int psp_suspend(void *handle)
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 	struct psp_context *psp = &adev->psp;
 
-	if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP)
-		return 0;
-
 	if (adev->gmc.xgmi.num_physical_nodes > 1 &&
 	    psp->xgmi_context.initialized == 1) {
 		ret = psp_xgmi_terminate(psp);
@@ -714,9 +801,6 @@ static int psp_resume(void *handle)
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 	struct psp_context *psp = &adev->psp;
 
-	if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP)
-		return 0;
-
 	DRM_INFO("PSP is resuming...\n");
 
 	mutex_lock(&adev->firmware.mutex);
@@ -752,11 +836,6 @@ static bool psp_check_fw_loading_status(struct amdgpu_device *adev,
 {
 	struct amdgpu_firmware_info *ucode = NULL;
 
-	if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
-		DRM_INFO("firmware is not loaded by PSP\n");
-		return true;
-	}
-
 	if (!adev->firmware.fw_size)
 		return false;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
index 3ee573b4016e..2ef98cc755d6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
@@ -65,8 +65,6 @@ struct psp_funcs
 	int (*init_microcode)(struct psp_context *psp);
 	int (*bootloader_load_sysdrv)(struct psp_context *psp);
 	int (*bootloader_load_sos)(struct psp_context *psp);
-	int (*prep_cmd_buf)(struct amdgpu_firmware_info *ucode,
-			    struct psp_gfx_cmd_resp *cmd);
 	int (*ring_init)(struct psp_context *psp, enum psp_ring_type ring_type);
 	int (*ring_create)(struct psp_context *psp,
 			   enum psp_ring_type ring_type);
@@ -176,7 +174,6 @@ struct psp_xgmi_topology_info {
 	struct psp_xgmi_node_info	nodes[AMDGPU_XGMI_MAX_CONNECTED_NODES];
 };
 
-#define psp_prep_cmd_buf(ucode, type) (psp)->funcs->prep_cmd_buf((ucode), (type))
 #define psp_ring_init(psp, type) (psp)->funcs->ring_init((psp), (type))
 #define psp_ring_create(psp, type) (psp)->funcs->ring_create((psp), (type))
 #define psp_ring_stop(psp, type) (psp)->funcs->ring_stop((psp), (type))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index d87e828a084b..d7fae2676269 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -131,7 +131,7 @@ struct amdgpu_ring_funcs {
 	void (*emit_ib)(struct amdgpu_ring *ring,
 			struct amdgpu_job *job,
 			struct amdgpu_ib *ib,
-			bool ctx_switch);
+			uint32_t flags);
 	void (*emit_fence)(struct amdgpu_ring *ring, uint64_t addr,
 			   uint64_t seq, unsigned flags);
 	void (*emit_pipeline_sync)(struct amdgpu_ring *ring);
@@ -229,7 +229,7 @@ struct amdgpu_ring {
 #define amdgpu_ring_get_rptr(r) (r)->funcs->get_rptr((r))
 #define amdgpu_ring_get_wptr(r) (r)->funcs->get_wptr((r))
 #define amdgpu_ring_set_wptr(r) (r)->funcs->set_wptr((r))
-#define amdgpu_ring_emit_ib(r, job, ib, c) ((r)->funcs->emit_ib((r), (job), (ib), (c)))
+#define amdgpu_ring_emit_ib(r, job, ib, flags) ((r)->funcs->emit_ib((r), (job), (ib), (flags)))
 #define amdgpu_ring_emit_pipeline_sync(r) (r)->funcs->emit_pipeline_sync((r))
 #define amdgpu_ring_emit_vm_flush(r, vmid, addr) (r)->funcs->emit_vm_flush((r), (vmid), (addr))
 #define amdgpu_ring_emit_fence(r, addr, seq, flags) (r)->funcs->emit_fence((r), (addr), (seq), (flags))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
index 12f2bf97611f..bfaf5c6323be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
@@ -388,7 +388,7 @@ void amdgpu_sa_bo_dump_debug_info(struct amdgpu_sa_manager *sa_manager,
 			   soffset, eoffset, eoffset - soffset);
 
 		if (i->fence)
-			seq_printf(m, " protected by 0x%08x on context %llu",
+			seq_printf(m, " protected by 0x%016llx on context %llu",
 				   i->fence->seqno, i->fence->context);
 
 		seq_printf(m, "\n");
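/*
 * A minimal sketch (not part of the patch): dma-buf fence seqnos are
 * 64-bit this cycle, so the dump above must use a 64-bit conversion.
 * Passing a 64-bit value to "%08x" truncates it and can misalign the
 * remaining varargs on many ABIs; the seqno value below is hypothetical.
 */
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
	uint64_t seqno = 0x0000000100000002ULL;

	printf(" protected by 0x%016" PRIx64 "\n", seqno);
	return 0;
}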
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
index 1cafe8d83a4d..0767a93e4d91 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
@@ -54,16 +54,20 @@ static int amdgpu_sched_process_priority_override(struct amdgpu_device *adev,
 						  enum drm_sched_priority priority)
 {
 	struct file *filp = fget(fd);
-	struct drm_file *file;
 	struct amdgpu_fpriv *fpriv;
 	struct amdgpu_ctx *ctx;
 	uint32_t id;
+	int r;
 
 	if (!filp)
 		return -EINVAL;
 
-	file = filp->private_data;
-	fpriv = file->driver_priv;
+	r = amdgpu_file_to_fpriv(filp, &fpriv);
+	if (r) {
+		fput(filp);
+		return r;
+	}
+
 	idr_for_each_entry(&fpriv->ctx_mgr.ctx_handles, ctx, id)
 		amdgpu_ctx_priority_override(ctx, priority);
 
@@ -72,6 +76,39 @@ static int amdgpu_sched_process_priority_override(struct amdgpu_device *adev,
 	return 0;
 }
 
+static int amdgpu_sched_context_priority_override(struct amdgpu_device *adev,
+						  int fd,
+						  unsigned ctx_id,
+						  enum drm_sched_priority priority)
+{
+	struct file *filp = fget(fd);
+	struct amdgpu_fpriv *fpriv;
+	struct amdgpu_ctx *ctx;
+	int r;
+
+	if (!filp)
+		return -EINVAL;
+
+	r = amdgpu_file_to_fpriv(filp, &fpriv);
+	if (r) {
+		fput(filp);
+		return r;
+	}
+
+	ctx = amdgpu_ctx_get(fpriv, ctx_id);
+
+	if (!ctx) {
+		fput(filp);
+		return -EINVAL;
+	}
+
+	amdgpu_ctx_priority_override(ctx, priority);
+	amdgpu_ctx_put(ctx);
+	fput(filp);
+
+	return 0;
+}
+
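/*
 * A sketch of the reference handling used above, with hypothetical
 * helper names (ctx_get/ctx_put are stand-ins, not the driver's API):
 * every early-exit path must drop the file reference taken by fget(),
 * and the context reference once it has been taken.
 */
#include <errno.h>

struct file;
struct ctx;

extern struct file *fget(int fd);	/* takes a file reference */
extern void fput(struct file *filp);	/* drops it */
extern struct ctx *ctx_get(struct file *filp, unsigned int id);
extern void ctx_put(struct ctx *ctx);

static int override_one_ctx(int fd, unsigned int ctx_id, int priority)
{
	struct file *filp = fget(fd);
	struct ctx *ctx;

	if (!filp)
		return -EINVAL;

	ctx = ctx_get(filp, ctx_id);
	if (!ctx) {
		fput(filp);	/* error path still drops the file ref */
		return -EINVAL;
	}

	/* ... apply the priority override to ctx ... */

	ctx_put(ctx);
	fput(filp);
	return 0;
}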
 int amdgpu_sched_ioctl(struct drm_device *dev, void *data,
 		       struct drm_file *filp)
 {
@@ -81,7 +118,7 @@ int amdgpu_sched_ioctl(struct drm_device *dev, void *data,
 	int r;
 
 	priority = amdgpu_to_sched_priority(args->in.priority);
-	if (args->in.flags || priority == DRM_SCHED_PRIORITY_INVALID)
+	if (priority == DRM_SCHED_PRIORITY_INVALID)
 		return -EINVAL;
 
 	switch (args->in.op) {
@@ -90,6 +127,12 @@ int amdgpu_sched_ioctl(struct drm_device *dev, void *data,
 							   args->in.fd,
 							   priority);
 		break;
+	case AMDGPU_SCHED_OP_CONTEXT_PRIORITY_OVERRIDE:
+		r = amdgpu_sched_context_priority_override(adev,
+							   args->in.fd,
+							   args->in.ctx_id,
+							   priority);
+		break;
 	default:
 		DRM_ERROR("Invalid sched op specified: %d\n", args->in.op);
 		r = -EINVAL;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 626abca770a0..d3ca2424b5fe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -76,9 +76,10 @@ TRACE_EVENT(amdgpu_mm_wreg,
 );
 
 TRACE_EVENT(amdgpu_iv,
-	    TP_PROTO(struct amdgpu_iv_entry *iv),
-	    TP_ARGS(iv),
+	    TP_PROTO(unsigned ih, struct amdgpu_iv_entry *iv),
+	    TP_ARGS(ih, iv),
 	    TP_STRUCT__entry(
+			     __field(unsigned, ih)
 			     __field(unsigned, client_id)
 			     __field(unsigned, src_id)
 			     __field(unsigned, ring_id)
@@ -90,6 +91,7 @@ TRACE_EVENT(amdgpu_iv,
 			     __array(unsigned, src_data, 4)
 			    ),
 	    TP_fast_assign(
+			   __entry->ih = ih;
 			   __entry->client_id = iv->client_id;
 			   __entry->src_id = iv->src_id;
 			   __entry->ring_id = iv->ring_id;
@@ -103,8 +105,9 @@ TRACE_EVENT(amdgpu_iv,
 			   __entry->src_data[2] = iv->src_data[2];
 			   __entry->src_data[3] = iv->src_data[3];
 			   ),
-	    TP_printk("client_id:%u src_id:%u ring:%u vmid:%u timestamp: %llu pasid:%u src_data: %08x %08x %08x %08x",
-		      __entry->client_id, __entry->src_id,
+	    TP_printk("ih:%u client_id:%u src_id:%u ring:%u vmid:%u "
+		      "timestamp: %llu pasid:%u src_data: %08x %08x %08x %08x",
+		      __entry->ih, __entry->client_id, __entry->src_id,
 		      __entry->ring_id, __entry->vmid,
 		      __entry->timestamp, __entry->pasid,
 		      __entry->src_data[0], __entry->src_data[1],
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index c91ec3101d00..73e71e61dc99 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1546,7 +1546,8 @@ static struct ttm_bo_driver amdgpu_bo_driver = {
 	.io_mem_reserve = &amdgpu_ttm_io_mem_reserve,
 	.io_mem_free = &amdgpu_ttm_io_mem_free,
 	.io_mem_pfn = amdgpu_ttm_io_mem_pfn,
-	.access_memory = &amdgpu_ttm_access_memory
+	.access_memory = &amdgpu_ttm_access_memory,
+	.del_from_lru_notify = &amdgpu_vm_del_from_lru_notify
 };
 
 /*
@@ -1755,7 +1756,7 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
 	}
 
 	r = amdgpu_bo_create_kernel(adev, adev->gds.mem.gfx_partition_size,
-				    PAGE_SIZE, AMDGPU_GEM_DOMAIN_GDS,
+				    4, AMDGPU_GEM_DOMAIN_GDS,
 				    &adev->gds.gds_gfx_bo, NULL, NULL);
 	if (r)
 		return r;
@@ -1768,7 +1769,7 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
 	}
 
 	r = amdgpu_bo_create_kernel(adev, adev->gds.gws.gfx_partition_size,
-				    PAGE_SIZE, AMDGPU_GEM_DOMAIN_GWS,
+				    1, AMDGPU_GEM_DOMAIN_GWS,
 				    &adev->gds.gws_gfx_bo, NULL, NULL);
 	if (r)
 		return r;
@@ -1781,7 +1782,7 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
 	}
 
 	r = amdgpu_bo_create_kernel(adev, adev->gds.oa.gfx_partition_size,
-				    PAGE_SIZE, AMDGPU_GEM_DOMAIN_OA,
+				    1, AMDGPU_GEM_DOMAIN_OA,
 				    &adev->gds.oa_gfx_bo, NULL, NULL);
 	if (r)
 		return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index 98a1b2ce2b9d..c021b114c8a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -1035,7 +1035,7 @@ out:
 void amdgpu_vce_ring_emit_ib(struct amdgpu_ring *ring,
 				struct amdgpu_job *job,
 				struct amdgpu_ib *ib,
-				bool ctx_switch)
+				uint32_t flags)
 {
 	amdgpu_ring_write(ring, VCE_CMD_IB);
 	amdgpu_ring_write(ring, lower_32_bits(ib->gpu_addr));
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
index 50293652af14..30ea54dd9117 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.h
@@ -66,7 +66,7 @@ void amdgpu_vce_free_handles(struct amdgpu_device *adev, struct drm_file *filp);
 int amdgpu_vce_ring_parse_cs(struct amdgpu_cs_parser *p, uint32_t ib_idx);
 int amdgpu_vce_ring_parse_cs_vm(struct amdgpu_cs_parser *p, uint32_t ib_idx);
 void amdgpu_vce_ring_emit_ib(struct amdgpu_ring *ring, struct amdgpu_job *job,
-				struct amdgpu_ib *ib, bool ctx_switch);
+				struct amdgpu_ib *ib, uint32_t flags);
 void amdgpu_vce_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64 seq,
 				unsigned flags);
 int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 698bcb8ce61d..ead851413c0a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -107,14 +107,6 @@ struct amdgpu_pte_update_params {
 	 * DMA addresses to use for mapping, used during VM update by CPU
 	 */
 	dma_addr_t *pages_addr;
-
-	/**
-	 * @kptr:
-	 *
-	 * Kernel pointer of PD/PT BO that needs to be updated,
-	 * used during VM update by CPU
-	 */
-	void *kptr;
 };
 
 /**
@@ -623,6 +615,28 @@ void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
 	list_add(&entry->tv.head, validated);
 }
 
+void amdgpu_vm_del_from_lru_notify(struct ttm_buffer_object *bo)
+{
+	struct amdgpu_bo *abo;
+	struct amdgpu_vm_bo_base *bo_base;
+
+	if (!amdgpu_bo_is_amdgpu_bo(bo))
+		return;
+
+	if (bo->mem.placement & TTM_PL_FLAG_NO_EVICT)
+		return;
+
+	abo = ttm_to_amdgpu_bo(bo);
+	if (!abo->parent)
+		return;
+	for (bo_base = abo->vm_bo; bo_base; bo_base = bo_base->next) {
+		struct amdgpu_vm *vm = bo_base->vm;
+
+		if (abo->tbo.resv == vm->root.base.bo->tbo.resv)
+			vm->bulk_moveable = false;
+	}
+}
+
 /**
  * amdgpu_vm_move_to_lru_tail - move all BOs to the end of LRU
  *
@@ -686,8 +700,6 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	struct amdgpu_vm_bo_base *bo_base, *tmp;
 	int r = 0;
 
-	vm->bulk_moveable &= list_empty(&vm->evicted);
-
 	list_for_each_entry_safe(bo_base, tmp, &vm->evicted, vm_status) {
 		struct amdgpu_bo *bo = bo_base->bo;
 
@@ -801,15 +813,22 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
 		addr += ats_entries * 8;
 	}
 
-	if (entries)
+	if (entries) {
+		uint64_t value = 0;
+
+		/* Workaround for fault priority problem on GMC9 */
+		if (level == AMDGPU_VM_PTB && adev->asic_type >= CHIP_VEGA10)
+			value = AMDGPU_PTE_EXECUTABLE;
+
 		amdgpu_vm_set_pte_pde(adev, &job->ibs[0], addr, 0,
-				      entries, 0, 0);
+				      entries, 0, value);
+	}
 
 	amdgpu_ring_pad_ib(ring, &job->ibs[0]);
 
 	WARN_ON(job->ibs[0].length_dw > 64);
 	r = amdgpu_sync_resv(adev, &job->sync, bo->tbo.resv,
-			     AMDGPU_FENCE_OWNER_UNDEFINED, false);
+			     AMDGPU_FENCE_OWNER_KFD, false);
 	if (r)
 		goto error_free;
 
@@ -1313,31 +1332,6 @@ static void amdgpu_vm_cpu_set_ptes(struct amdgpu_pte_update_params *params,
 	}
 }
 
-
-/**
- * amdgpu_vm_wait_pd - Wait for PT BOs to be free.
- *
- * @adev: amdgpu_device pointer
- * @vm: related vm
- * @owner: fence owner
- *
- * Returns:
- * 0 on success, errno otherwise.
- */
-static int amdgpu_vm_wait_pd(struct amdgpu_device *adev, struct amdgpu_vm *vm,
-			     void *owner)
-{
-	struct amdgpu_sync sync;
-	int r;
-
-	amdgpu_sync_create(&sync);
-	amdgpu_sync_resv(adev, &sync, vm->root.base.bo->tbo.resv, owner, false);
-	r = amdgpu_sync_wait(&sync, true);
-	amdgpu_sync_free(&sync);
-
-	return r;
-}
-
 /**
  * amdgpu_vm_update_func - helper to call update function
  *
@@ -1432,7 +1426,8 @@ restart:
 	params.adev = adev;
 
 	if (vm->use_cpu_for_update) {
-		r = amdgpu_vm_wait_pd(adev, vm, AMDGPU_FENCE_OWNER_VM);
+		r = amdgpu_bo_sync_wait(vm->root.base.bo,
+					AMDGPU_FENCE_OWNER_VM, true);
 		if (unlikely(r))
 			return r;
 
@@ -1505,20 +1500,27 @@ error:
 }
 
 /**
- * amdgpu_vm_update_huge - figure out parameters for PTE updates
+ * amdgpu_vm_update_flags - figure out flags for PTE updates
  *
  * Make sure to set the right flags for the PTEs at the desired level.
  */
-static void amdgpu_vm_update_huge(struct amdgpu_pte_update_params *params,
-				  struct amdgpu_bo *bo, unsigned level,
-				  uint64_t pe, uint64_t addr,
-				  unsigned count, uint32_t incr,
-				  uint64_t flags)
+static void amdgpu_vm_update_flags(struct amdgpu_pte_update_params *params,
+				   struct amdgpu_bo *bo, unsigned level,
+				   uint64_t pe, uint64_t addr,
+				   unsigned count, uint32_t incr,
+				   uint64_t flags)
 
 {
 	if (level != AMDGPU_VM_PTB) {
 		flags |= AMDGPU_PDE_PTE;
 		amdgpu_gmc_get_vm_pde(params->adev, level, &addr, &flags);
+
+	} else if (params->adev->asic_type >= CHIP_VEGA10 &&
+		   !(flags & AMDGPU_PTE_VALID) &&
+		   !(flags & AMDGPU_PTE_PRT)) {
+
+		/* Workaround for fault priority problem on GMC9 */
+		flags |= AMDGPU_PTE_EXECUTABLE;
 	}
 
 	amdgpu_vm_update_func(params, bo, pe, addr, count, incr, flags);
@@ -1675,9 +1677,9 @@ static int amdgpu_vm_update_ptes(struct amdgpu_pte_update_params *params,
 			uint64_t upd_end = min(entry_end, frag_end);
 			unsigned nptes = (upd_end - frag_start) >> shift;
 
-			amdgpu_vm_update_huge(params, pt, cursor.level,
-					      pe_start, dst, nptes, incr,
-					      flags | AMDGPU_PTE_FRAG(frag));
+			amdgpu_vm_update_flags(params, pt, cursor.level,
+					       pe_start, dst, nptes, incr,
+					       flags | AMDGPU_PTE_FRAG(frag));
 
 			pe_start += nptes * 8;
 			dst += (uint64_t)nptes * AMDGPU_GPU_PAGE_SIZE << shift;
@@ -1746,22 +1748,29 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 	params.adev = adev;
 	params.vm = vm;
 
-	/* sync to everything on unmapping */
+	/* sync to everything except eviction fences on unmapping */
 	if (!(flags & AMDGPU_PTE_VALID))
-		owner = AMDGPU_FENCE_OWNER_UNDEFINED;
+		owner = AMDGPU_FENCE_OWNER_KFD;
 
 	if (vm->use_cpu_for_update) {
 		/* params.src is used as flag to indicate system Memory */
 		if (pages_addr)
 			params.src = ~0;
 
-		/* Wait for PT BOs to be free. PTs share the same resv. object
+		/* Wait for PT BOs to be idle. PTs share the same resv. object
 		 * as the root PD BO
 		 */
-		r = amdgpu_vm_wait_pd(adev, vm, owner);
+		r = amdgpu_bo_sync_wait(vm->root.base.bo, owner, true);
 		if (unlikely(r))
 			return r;
 
+		/* Wait for any BO move to be completed */
+		if (exclusive) {
+			r = dma_fence_wait(exclusive, true);
+			if (unlikely(r))
+				return r;
+		}
+
 		params.func = amdgpu_vm_cpu_set_ptes;
 		params.pages_addr = pages_addr;
 		return amdgpu_vm_update_ptes(&params, start, last + 1,
@@ -1775,13 +1784,12 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 	/*
 	 * reserve space for two commands every (1 << BLOCK_SIZE)
 	 *  entries or 2k dwords (whatever is smaller)
-         *
-         * The second command is for the shadow pagetables.
 	 */
+	ncmds = ((nptes >> min(adev->vm_manager.block_size, 11u)) + 1);
+
+	/* The second command is for the shadow pagetables. */
 	if (vm->root.base.bo->shadow)
-		ncmds = ((nptes >> min(adev->vm_manager.block_size, 11u)) + 1) * 2;
-	else
-		ncmds = ((nptes >> min(adev->vm_manager.block_size, 11u)) + 1);
+		ncmds *= 2;
 
 	/* padding, etc. */
 	ndw = 64;
@@ -1800,10 +1808,11 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 		ndw += ncmds * 10;
 
 		/* extra commands for begin/end fragments */
+		ncmds = 2 * adev->vm_manager.fragment_size;
 		if (vm->root.base.bo->shadow)
-		        ndw += 2 * 10 * adev->vm_manager.fragment_size * 2;
-		else
-		        ndw += 2 * 10 * adev->vm_manager.fragment_size;
+			ncmds *= 2;
+
+		ndw += 10 * ncmds;
 
 		params.func = amdgpu_vm_do_set_ptes;
 	}
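/*
 * A worked example (not from the patch) of the reworked reservation
 * arithmetic above, under assumed values: block_size = 9,
 * fragment_size = 4, nptes = 4096, with a shadow page table present.
 */
#include <stdio.h>

static unsigned int min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

int main(void)
{
	unsigned int block_size = 9, fragment_size = 4, nptes = 4096;
	int has_shadow = 1;
	unsigned int ncmds, ndw = 64;			/* padding, etc. */

	ncmds = (nptes >> min_u(block_size, 11u)) + 1;	/* 8 + 1 = 9 */
	if (has_shadow)
		ncmds *= 2;				/* 18 */
	ndw += ncmds * 10;				/* 64 + 180 = 244 */

	ncmds = 2 * fragment_size;			/* begin/end: 8 */
	if (has_shadow)
		ncmds *= 2;				/* 16 */
	ndw += 10 * ncmds;				/* 244 + 160 = 404 */

	printf("ndw = %u\n", ndw);			/* ndw = 404 */
	return 0;
}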
@@ -3005,7 +3014,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	}
 	DRM_DEBUG_DRIVER("VM update mode is %s\n",
 			 vm->use_cpu_for_update ? "CPU" : "SDMA");
-	WARN_ONCE((vm->use_cpu_for_update & !amdgpu_gmc_vram_full_visible(&adev->gmc)),
+	WARN_ONCE((vm->use_cpu_for_update && !amdgpu_gmc_vram_full_visible(&adev->gmc)),
 		  "CPU update of VM recommended only for large BAR system\n");
 	vm->last_update = NULL;
 
@@ -3135,7 +3144,7 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm, uns
 	vm->pte_support_ats = pte_support_ats;
 	DRM_DEBUG_DRIVER("VM update mode is %s\n",
 			 vm->use_cpu_for_update ? "CPU" : "SDMA");
-	WARN_ONCE((vm->use_cpu_for_update & !amdgpu_gmc_vram_full_visible(&adev->gmc)),
+	WARN_ONCE((vm->use_cpu_for_update && !amdgpu_gmc_vram_full_visible(&adev->gmc)),
 		  "CPU update of VM recommended only for large BAR system\n");
 
 	if (vm->pasid) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index e8dcfd59fc93..81ff8177f092 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -363,4 +363,6 @@ int amdgpu_vm_add_fault(struct amdgpu_retryfault_hashtable *fault_hash, u64 key)
 
 void amdgpu_vm_clear_fault(struct amdgpu_retryfault_hashtable *fault_hash, u64 key);
 
+void amdgpu_vm_del_from_lru_notify(struct ttm_buffer_object *bo);
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index 8a8bc60cb6b4..407dd16cc35c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -40,26 +40,40 @@ void *amdgpu_xgmi_hive_try_lock(struct amdgpu_hive_info *hive)
 	return &hive->device_list;
 }
 
-struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
+struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev, int lock)
 {
 	int i;
 	struct amdgpu_hive_info *tmp;
 
 	if (!adev->gmc.xgmi.hive_id)
 		return NULL;
+
+	mutex_lock(&xgmi_mutex);
+
 	for (i = 0 ; i < hive_count; ++i) {
 		tmp = &xgmi_hives[i];
-		if (tmp->hive_id == adev->gmc.xgmi.hive_id)
+		if (tmp->hive_id == adev->gmc.xgmi.hive_id) {
+			if (lock)
+				mutex_lock(&tmp->hive_lock);
+			mutex_unlock(&xgmi_mutex);
 			return tmp;
+		}
 	}
-	if (i >= AMDGPU_MAX_XGMI_HIVE)
+	if (i >= AMDGPU_MAX_XGMI_HIVE) {
+		mutex_unlock(&xgmi_mutex);
 		return NULL;
+	}
 
 	/* initialize new hive if not exist */
 	tmp = &xgmi_hives[hive_count++];
 	tmp->hive_id = adev->gmc.xgmi.hive_id;
 	INIT_LIST_HEAD(&tmp->device_list);
 	mutex_init(&tmp->hive_lock);
+	mutex_init(&tmp->reset_lock);
+	if (lock)
+		mutex_lock(&tmp->hive_lock);
+
+	mutex_unlock(&xgmi_mutex);
 
 	return tmp;
 }
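/*
 * A userspace sketch (pthreads; not the driver code) of the locking
 * pattern introduced above: a global table mutex guards lookup and
 * creation, and the caller may ask for the per-hive lock before the
 * global mutex is dropped, so the hive cannot be torn down between
 * lookup and use.
 */
#include <pthread.h>
#include <stddef.h>

#define MAX_HIVES 8

struct hive {
	unsigned long long id;
	pthread_mutex_t lock;
};

static pthread_mutex_t table_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct hive hives[MAX_HIVES];
static int hive_count;

static struct hive *get_hive(unsigned long long id, int lock)
{
	struct hive *h = NULL;
	int i;

	pthread_mutex_lock(&table_mutex);
	for (i = 0; i < hive_count; ++i) {
		if (hives[i].id == id) {
			h = &hives[i];
			break;
		}
	}
	if (!h && hive_count < MAX_HIVES) {	/* create if not found */
		h = &hives[hive_count++];
		h->id = id;
		pthread_mutex_init(&h->lock, NULL);
	}
	if (h && lock)
		pthread_mutex_lock(&h->lock);	/* hand it back locked */
	pthread_mutex_unlock(&table_mutex);
	return h;
}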
@@ -77,10 +91,6 @@ int amdgpu_xgmi_update_topology(struct amdgpu_hive_info *hive, struct amdgpu_dev
 			"XGMI: Set topology failure on device %llx, hive %llx, ret %d",
 			adev->gmc.xgmi.node_id,
 			adev->gmc.xgmi.hive_id, ret);
-	else
-		dev_info(adev->dev, "XGMI: Set topology for node %d, hive 0x%llx.\n",
-			 adev->gmc.xgmi.physical_node_id,
-				 adev->gmc.xgmi.hive_id);
 
 	return ret;
 }
@@ -111,10 +121,14 @@ int amdgpu_xgmi_add_device(struct amdgpu_device *adev)
 		return ret;
 	}
 
-	mutex_lock(&xgmi_mutex);
-	hive = amdgpu_get_xgmi_hive(adev);
-	if (!hive)
+	hive = amdgpu_get_xgmi_hive(adev, 1);
+	if (!hive) {
+		ret = -EINVAL;
+		dev_err(adev->dev,
+			"XGMI: node 0x%llx, cannot match hive 0x%llx in the hive list.\n",
+			adev->gmc.xgmi.node_id, adev->gmc.xgmi.hive_id);
 		goto exit;
+	}
 
 	hive_topology = &hive->topology_info;
 
@@ -142,8 +156,11 @@ int amdgpu_xgmi_add_device(struct amdgpu_device *adev)
 			break;
 	}
 
+	dev_info(adev->dev, "XGMI: Add node %d, hive 0x%llx.\n",
+		 adev->gmc.xgmi.physical_node_id, adev->gmc.xgmi.hive_id);
+
+	mutex_unlock(&hive->hive_lock);
 exit:
-	mutex_unlock(&xgmi_mutex);
 	return ret;
 }
 
@@ -154,15 +171,14 @@ void amdgpu_xgmi_remove_device(struct amdgpu_device *adev)
 	if (!adev->gmc.xgmi.supported)
 		return;
 
-	mutex_lock(&xgmi_mutex);
-
-	hive = amdgpu_get_xgmi_hive(adev);
+	hive = amdgpu_get_xgmi_hive(adev, 1);
 	if (!hive)
-		goto exit;
+		return;
 
-	if (!(hive->number_devices--))
+	if (!(hive->number_devices--)) {
 		mutex_destroy(&hive->hive_lock);
-
-exit:
-	mutex_unlock(&xgmi_mutex);
+		mutex_destroy(&hive->reset_lock);
+	} else {
+		mutex_unlock(&hive->hive_lock);
+	}
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
index 6151eb9c8ad3..14bc60664159 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
@@ -29,10 +29,11 @@ struct amdgpu_hive_info {
 	struct list_head	device_list;
 	struct psp_xgmi_topology_info	topology_info;
 	int number_devices;
-	struct mutex hive_lock;
+	struct mutex hive_lock;
+	struct mutex reset_lock;
 };
 
-struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev);
+struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev, int lock);
 int amdgpu_xgmi_update_topology(struct amdgpu_hive_info *hive, struct amdgpu_device *adev);
 int amdgpu_xgmi_add_device(struct amdgpu_device *adev);
 void amdgpu_xgmi_remove_device(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/atom.c b/drivers/gpu/drm/amd/amdgpu/atom.c
index e9934de1b9cf..dd30f4e61a8c 100644
--- a/drivers/gpu/drm/amd/amdgpu/atom.c
+++ b/drivers/gpu/drm/amd/amdgpu/atom.c
@@ -27,6 +27,8 @@
 #include <linux/slab.h>
 #include <asm/unaligned.h>
 
+#include <drm/drm_util.h>
+
 #define ATOM_DEBUG
 
 #include "atom.h"
diff --git a/drivers/gpu/drm/amd/amdgpu/ci_dpm.c b/drivers/gpu/drm/amd/amdgpu/ci_dpm.c
deleted file mode 100644
index 86e14c754dd4..000000000000
--- a/drivers/gpu/drm/amd/amdgpu/ci_dpm.c
+++ /dev/null
@@ -1,6844 +0,0 @@
-/*
- * Copyright 2013 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- */
-
-#include <linux/firmware.h>
-#include <drm/drmP.h>
-#include "amdgpu.h"
-#include "amdgpu_pm.h"
-#include "amdgpu_ucode.h"
-#include "cikd.h"
-#include "amdgpu_dpm.h"
-#include "ci_dpm.h"
-#include "gfx_v7_0.h"
-#include "atom.h"
-#include "amd_pcie.h"
-#include <linux/seq_file.h>
-
-#include "smu/smu_7_0_1_d.h"
-#include "smu/smu_7_0_1_sh_mask.h"
-
-#include "dce/dce_8_0_d.h"
-#include "dce/dce_8_0_sh_mask.h"
-
-#include "bif/bif_4_1_d.h"
-#include "bif/bif_4_1_sh_mask.h"
-
-#include "gca/gfx_7_2_d.h"
-#include "gca/gfx_7_2_sh_mask.h"
-
-#include "gmc/gmc_7_1_d.h"
-#include "gmc/gmc_7_1_sh_mask.h"
-
-MODULE_FIRMWARE("amdgpu/bonaire_smc.bin");
-MODULE_FIRMWARE("amdgpu/bonaire_k_smc.bin");
-MODULE_FIRMWARE("amdgpu/hawaii_smc.bin");
-MODULE_FIRMWARE("amdgpu/hawaii_k_smc.bin");
-
-#define MC_CG_ARB_FREQ_F0           0x0a
-#define MC_CG_ARB_FREQ_F1           0x0b
-#define MC_CG_ARB_FREQ_F2           0x0c
-#define MC_CG_ARB_FREQ_F3           0x0d
-
-#define SMC_RAM_END 0x40000
-
-#define VOLTAGE_SCALE               4
-#define VOLTAGE_VID_OFFSET_SCALE1    625
-#define VOLTAGE_VID_OFFSET_SCALE2    100
-
-static const struct amd_pm_funcs ci_dpm_funcs;
-
-static const struct ci_pt_defaults defaults_hawaii_xt =
-{
-	1, 0xF, 0xFD, 0x19, 5, 0x14, 0, 0xB0000,
-	{ 0x2E,  0x00,  0x00,  0x88,  0x00,  0x00,  0x72,  0x60,  0x51,  0xA7,  0x79,  0x6B,  0x90,  0xBD,  0x79  },
-	{ 0x217, 0x217, 0x217, 0x242, 0x242, 0x242, 0x269, 0x269, 0x269, 0x2A1, 0x2A1, 0x2A1, 0x2C9, 0x2C9, 0x2C9 }
-};
-
-static const struct ci_pt_defaults defaults_hawaii_pro =
-{
-	1, 0xF, 0xFD, 0x19, 5, 0x14, 0, 0x65062,
-	{ 0x2E,  0x00,  0x00,  0x88,  0x00,  0x00,  0x72,  0x60,  0x51,  0xA7,  0x79,  0x6B,  0x90,  0xBD,  0x79  },
-	{ 0x217, 0x217, 0x217, 0x242, 0x242, 0x242, 0x269, 0x269, 0x269, 0x2A1, 0x2A1, 0x2A1, 0x2C9, 0x2C9, 0x2C9 }
-};
-
-static const struct ci_pt_defaults defaults_bonaire_xt =
-{
-	1, 0xF, 0xFD, 0x19, 5, 45, 0, 0xB0000,
-	{ 0x79,  0x253, 0x25D, 0xAE,  0x72,  0x80,  0x83,  0x86,  0x6F,  0xC8,  0xC9,  0xC9,  0x2F,  0x4D,  0x61  },
-	{ 0x17C, 0x172, 0x180, 0x1BC, 0x1B3, 0x1BD, 0x206, 0x200, 0x203, 0x25D, 0x25A, 0x255, 0x2C3, 0x2C5, 0x2B4 }
-};
-
-#if 0
-static const struct ci_pt_defaults defaults_bonaire_pro =
-{
-	1, 0xF, 0xFD, 0x19, 5, 45, 0, 0x65062,
-	{ 0x8C,  0x23F, 0x244, 0xA6,  0x83,  0x85,  0x86,  0x86,  0x83,  0xDB,  0xDB,  0xDA,  0x67,  0x60,  0x5F  },
-	{ 0x187, 0x193, 0x193, 0x1C7, 0x1D1, 0x1D1, 0x210, 0x219, 0x219, 0x266, 0x26C, 0x26C, 0x2C9, 0x2CB, 0x2CB }
-};
-#endif
-
-static const struct ci_pt_defaults defaults_saturn_xt =
-{
-	1, 0xF, 0xFD, 0x19, 5, 55, 0, 0x70000,
-	{ 0x8C,  0x247, 0x249, 0xA6,  0x80,  0x81,  0x8B,  0x89,  0x86,  0xC9,  0xCA,  0xC9,  0x4D,  0x4D,  0x4D  },
-	{ 0x187, 0x187, 0x187, 0x1C7, 0x1C7, 0x1C7, 0x210, 0x210, 0x210, 0x266, 0x266, 0x266, 0x2C9, 0x2C9, 0x2C9 }
-};
-
-#if 0
-static const struct ci_pt_defaults defaults_saturn_pro =
-{
-	1, 0xF, 0xFD, 0x19, 5, 55, 0, 0x30000,
-	{ 0x96,  0x21D, 0x23B, 0xA1,  0x85,  0x87,  0x83,  0x84,  0x81,  0xE6,  0xE6,  0xE6,  0x71,  0x6A,  0x6A  },
-	{ 0x193, 0x19E, 0x19E, 0x1D2, 0x1DC, 0x1DC, 0x21A, 0x223, 0x223, 0x26E, 0x27E, 0x274, 0x2CF, 0x2D2, 0x2D2 }
-};
-#endif
-
-static const struct ci_pt_config_reg didt_config_ci[] =
-{
-	{ 0x10, 0x000000ff, 0, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x10, 0x0000ff00, 8, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x10, 0x00ff0000, 16, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x10, 0xff000000, 24, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x11, 0x000000ff, 0, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x11, 0x0000ff00, 8, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x11, 0x00ff0000, 16, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x11, 0xff000000, 24, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x12, 0x000000ff, 0, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x12, 0x0000ff00, 8, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x12, 0x00ff0000, 16, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x12, 0xff000000, 24, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x2, 0x00003fff, 0, 0x4, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x2, 0x03ff0000, 16, 0x80, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x2, 0x78000000, 27, 0x3, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x1, 0x0000ffff, 0, 0x3FFF, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x1, 0xffff0000, 16, 0x3FFF, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x0, 0x00000001, 0, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x30, 0x000000ff, 0, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x30, 0x0000ff00, 8, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x30, 0x00ff0000, 16, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x30, 0xff000000, 24, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x31, 0x000000ff, 0, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x31, 0x0000ff00, 8, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x31, 0x00ff0000, 16, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x31, 0xff000000, 24, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x32, 0x000000ff, 0, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x32, 0x0000ff00, 8, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x32, 0x00ff0000, 16, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x32, 0xff000000, 24, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x22, 0x00003fff, 0, 0x4, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x22, 0x03ff0000, 16, 0x80, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x22, 0x78000000, 27, 0x3, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x21, 0x0000ffff, 0, 0x3FFF, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x21, 0xffff0000, 16, 0x3FFF, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x20, 0x00000001, 0, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x50, 0x000000ff, 0, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x50, 0x0000ff00, 8, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x50, 0x00ff0000, 16, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x50, 0xff000000, 24, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x51, 0x000000ff, 0, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x51, 0x0000ff00, 8, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x51, 0x00ff0000, 16, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x51, 0xff000000, 24, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x52, 0x000000ff, 0, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x52, 0x0000ff00, 8, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x52, 0x00ff0000, 16, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x52, 0xff000000, 24, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x42, 0x00003fff, 0, 0x4, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x42, 0x03ff0000, 16, 0x80, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x42, 0x78000000, 27, 0x3, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x41, 0x0000ffff, 0, 0x3FFF, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x41, 0xffff0000, 16, 0x3FFF, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x40, 0x00000001, 0, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x70, 0x000000ff, 0, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x70, 0x0000ff00, 8, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x70, 0x00ff0000, 16, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x70, 0xff000000, 24, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x71, 0x000000ff, 0, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x71, 0x0000ff00, 8, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x71, 0x00ff0000, 16, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x71, 0xff000000, 24, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x72, 0x000000ff, 0, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x72, 0x0000ff00, 8, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x72, 0x00ff0000, 16, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x72, 0xff000000, 24, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x62, 0x00003fff, 0, 0x4, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x62, 0x03ff0000, 16, 0x80, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x62, 0x78000000, 27, 0x3, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x61, 0x0000ffff, 0, 0x3FFF, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x61, 0xffff0000, 16, 0x3FFF, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0x60, 0x00000001, 0, 0x0, CISLANDS_CONFIGREG_DIDT_IND },
-	{ 0xFFFFFFFF }
-};
-
-static u8 ci_get_memory_module_index(struct amdgpu_device *adev)
-{
-	return (u8) ((RREG32(mmBIOS_SCRATCH_4) >> 16) & 0xff);
-}
-
-#define MC_CG_ARB_FREQ_F0           0x0a
-#define MC_CG_ARB_FREQ_F1           0x0b
-#define MC_CG_ARB_FREQ_F2           0x0c
-#define MC_CG_ARB_FREQ_F3           0x0d
-
-static int ci_copy_and_switch_arb_sets(struct amdgpu_device *adev,
-				       u32 arb_freq_src, u32 arb_freq_dest)
-{
-	u32 mc_arb_dram_timing;
-	u32 mc_arb_dram_timing2;
-	u32 burst_time;
-	u32 mc_cg_config;
-
-	switch (arb_freq_src) {
-	case MC_CG_ARB_FREQ_F0:
-		mc_arb_dram_timing  = RREG32(mmMC_ARB_DRAM_TIMING);
-		mc_arb_dram_timing2 = RREG32(mmMC_ARB_DRAM_TIMING2);
-		burst_time = (RREG32(mmMC_ARB_BURST_TIME) & MC_ARB_BURST_TIME__STATE0_MASK) >>
-			 MC_ARB_BURST_TIME__STATE0__SHIFT;
-		break;
-	case MC_CG_ARB_FREQ_F1:
-		mc_arb_dram_timing  = RREG32(mmMC_ARB_DRAM_TIMING_1);
-		mc_arb_dram_timing2 = RREG32(mmMC_ARB_DRAM_TIMING2_1);
-		burst_time = (RREG32(mmMC_ARB_BURST_TIME) & MC_ARB_BURST_TIME__STATE1_MASK) >>
-			 MC_ARB_BURST_TIME__STATE1__SHIFT;
-		break;
-	default:
-		return -EINVAL;
-	}
-
-	switch (arb_freq_dest) {
-	case MC_CG_ARB_FREQ_F0:
-		WREG32(mmMC_ARB_DRAM_TIMING, mc_arb_dram_timing);
-		WREG32(mmMC_ARB_DRAM_TIMING2, mc_arb_dram_timing2);
-		WREG32_P(mmMC_ARB_BURST_TIME, (burst_time << MC_ARB_BURST_TIME__STATE0__SHIFT),
-			~MC_ARB_BURST_TIME__STATE0_MASK);
-		break;
-	case MC_CG_ARB_FREQ_F1:
-		WREG32(mmMC_ARB_DRAM_TIMING_1, mc_arb_dram_timing);
-		WREG32(mmMC_ARB_DRAM_TIMING2_1, mc_arb_dram_timing2);
-		WREG32_P(mmMC_ARB_BURST_TIME, (burst_time << MC_ARB_BURST_TIME__STATE1__SHIFT),
-			~MC_ARB_BURST_TIME__STATE1_MASK);
-		break;
-	default:
-		return -EINVAL;
-	}
-
-	mc_cg_config = RREG32(mmMC_CG_CONFIG) | 0x0000000F;
-	WREG32(mmMC_CG_CONFIG, mc_cg_config);
-	WREG32_P(mmMC_ARB_CG, (arb_freq_dest) << MC_ARB_CG__CG_ARB_REQ__SHIFT,
-		~MC_ARB_CG__CG_ARB_REQ_MASK);
-
-	return 0;
-}
-
-static u8 ci_get_ddr3_mclk_frequency_ratio(u32 memory_clock)
-{
-	u8 mc_para_index;
-
-	if (memory_clock < 10000)
-		mc_para_index = 0;
-	else if (memory_clock >= 80000)
-		mc_para_index = 0x0f;
-	else
-		mc_para_index = (u8)((memory_clock - 10000) / 5000 + 1);
-	return mc_para_index;
-}
-
-static u8 ci_get_mclk_frequency_ratio(u32 memory_clock, bool strobe_mode)
-{
-	u8 mc_para_index;
-
-	if (strobe_mode) {
-		if (memory_clock < 12500)
-			mc_para_index = 0x00;
-		else if (memory_clock > 47500)
-			mc_para_index = 0x0f;
-		else
-			mc_para_index = (u8)((memory_clock - 10000) / 2500);
-	} else {
-		if (memory_clock < 65000)
-			mc_para_index = 0x00;
-		else if (memory_clock > 135000)
-			mc_para_index = 0x0f;
-		else
-			mc_para_index = (u8)((memory_clock - 60000) / 5000);
-	}
-	return mc_para_index;
-}
-
-static void ci_trim_voltage_table_to_fit_state_table(struct amdgpu_device *adev,
-						     u32 max_voltage_steps,
-						     struct atom_voltage_table *voltage_table)
-{
-	unsigned int i, diff;
-
-	if (voltage_table->count <= max_voltage_steps)
-		return;
-
-	diff = voltage_table->count - max_voltage_steps;
-
-	for (i = 0; i < max_voltage_steps; i++)
-		voltage_table->entries[i] = voltage_table->entries[i + diff];
-
-	voltage_table->count = max_voltage_steps;
-}
-
-static int ci_get_std_voltage_value_sidd(struct amdgpu_device *adev,
-					 struct atom_voltage_table_entry *voltage_table,
-					 u16 *std_voltage_hi_sidd, u16 *std_voltage_lo_sidd);
-static int ci_set_power_limit(struct amdgpu_device *adev, u32 n);
-static int ci_set_overdrive_target_tdp(struct amdgpu_device *adev,
-				       u32 target_tdp);
-static int ci_update_uvd_dpm(struct amdgpu_device *adev, bool gate);
-static void ci_dpm_set_irq_funcs(struct amdgpu_device *adev);
-
-static PPSMC_Result amdgpu_ci_send_msg_to_smc_with_parameter(struct amdgpu_device *adev,
-							     PPSMC_Msg msg, u32 parameter);
-static void ci_thermal_start_smc_fan_control(struct amdgpu_device *adev);
-static void ci_fan_ctrl_set_default_mode(struct amdgpu_device *adev);
-
-static struct ci_power_info *ci_get_pi(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = adev->pm.dpm.priv;
-
-	return pi;
-}
-
-static struct ci_ps *ci_get_ps(struct amdgpu_ps *rps)
-{
-	struct ci_ps *ps = rps->ps_priv;
-
-	return ps;
-}
-
-static void ci_initialize_powertune_defaults(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	switch (adev->pdev->device) {
-	case 0x6649:
-	case 0x6650:
-	case 0x6651:
-	case 0x6658:
-	case 0x665C:
-	case 0x665D:
-	default:
-		pi->powertune_defaults = &defaults_bonaire_xt;
-		break;
-	case 0x6640:
-	case 0x6641:
-	case 0x6646:
-	case 0x6647:
-		pi->powertune_defaults = &defaults_saturn_xt;
-		break;
-	case 0x67B8:
-	case 0x67B0:
-		pi->powertune_defaults = &defaults_hawaii_xt;
-		break;
-	case 0x67BA:
-	case 0x67B1:
-		pi->powertune_defaults = &defaults_hawaii_pro;
-		break;
-	case 0x67A0:
-	case 0x67A1:
-	case 0x67A2:
-	case 0x67A8:
-	case 0x67A9:
-	case 0x67AA:
-	case 0x67B9:
-	case 0x67BE:
-		pi->powertune_defaults = &defaults_bonaire_xt;
-		break;
-	}
-
-	pi->dte_tj_offset = 0;
-
-	pi->caps_power_containment = true;
-	pi->caps_cac = false;
-	pi->caps_sq_ramping = false;
-	pi->caps_db_ramping = false;
-	pi->caps_td_ramping = false;
-	pi->caps_tcp_ramping = false;
-
-	if (pi->caps_power_containment) {
-		pi->caps_cac = true;
-		if (adev->asic_type == CHIP_HAWAII)
-			pi->enable_bapm_feature = false;
-		else
-			pi->enable_bapm_feature = true;
-		pi->enable_tdc_limit_feature = true;
-		pi->enable_pkg_pwr_tracking_feature = true;
-	}
-}
-
-static u8 ci_convert_to_vid(u16 vddc)
-{
-	return (6200 - (vddc * VOLTAGE_SCALE)) / 25;
-}
-
-static int ci_populate_bapm_vddc_vid_sidd(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u8 *hi_vid = pi->smc_powertune_table.BapmVddCVidHiSidd;
-	u8 *lo_vid = pi->smc_powertune_table.BapmVddCVidLoSidd;
-	u8 *hi2_vid = pi->smc_powertune_table.BapmVddCVidHiSidd2;
-	u32 i;
-
-	if (adev->pm.dpm.dyn_state.cac_leakage_table.entries == NULL)
-		return -EINVAL;
-	if (adev->pm.dpm.dyn_state.cac_leakage_table.count > 8)
-		return -EINVAL;
-	if (adev->pm.dpm.dyn_state.cac_leakage_table.count !=
-	    adev->pm.dpm.dyn_state.vddc_dependency_on_sclk.count)
-		return -EINVAL;
-
-	for (i = 0; i < adev->pm.dpm.dyn_state.cac_leakage_table.count; i++) {
-		if (adev->pm.dpm.platform_caps & ATOM_PP_PLATFORM_CAP_EVV) {
-			lo_vid[i] = ci_convert_to_vid(adev->pm.dpm.dyn_state.cac_leakage_table.entries[i].vddc1);
-			hi_vid[i] = ci_convert_to_vid(adev->pm.dpm.dyn_state.cac_leakage_table.entries[i].vddc2);
-			hi2_vid[i] = ci_convert_to_vid(adev->pm.dpm.dyn_state.cac_leakage_table.entries[i].vddc3);
-		} else {
-			lo_vid[i] = ci_convert_to_vid(adev->pm.dpm.dyn_state.cac_leakage_table.entries[i].vddc);
-			hi_vid[i] = ci_convert_to_vid((u16)adev->pm.dpm.dyn_state.cac_leakage_table.entries[i].leakage);
-		}
-	}
-	return 0;
-}
-
-static int ci_populate_vddc_vid(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u8 *vid = pi->smc_powertune_table.VddCVid;
-	u32 i;
-
-	if (pi->vddc_voltage_table.count > 8)
-		return -EINVAL;
-
-	for (i = 0; i < pi->vddc_voltage_table.count; i++)
-		vid[i] = ci_convert_to_vid(pi->vddc_voltage_table.entries[i].value);
-
-	return 0;
-}
-
-static int ci_populate_svi_load_line(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	const struct ci_pt_defaults *pt_defaults = pi->powertune_defaults;
-
-	pi->smc_powertune_table.SviLoadLineEn = pt_defaults->svi_load_line_en;
-	pi->smc_powertune_table.SviLoadLineVddC = pt_defaults->svi_load_line_vddc;
-	pi->smc_powertune_table.SviLoadLineTrimVddC = 3;
-	pi->smc_powertune_table.SviLoadLineOffsetVddC = 0;
-
-	return 0;
-}
-
-static int ci_populate_tdc_limit(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	const struct ci_pt_defaults *pt_defaults = pi->powertune_defaults;
-	u16 tdc_limit;
-
-	tdc_limit = adev->pm.dpm.dyn_state.cac_tdp_table->tdc * 256;
-	pi->smc_powertune_table.TDC_VDDC_PkgLimit = cpu_to_be16(tdc_limit);
-	pi->smc_powertune_table.TDC_VDDC_ThrottleReleaseLimitPerc =
-		pt_defaults->tdc_vddc_throttle_release_limit_perc;
-	pi->smc_powertune_table.TDC_MAWt = pt_defaults->tdc_mawt;
-
-	return 0;
-}
-
-static int ci_populate_dw8(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	const struct ci_pt_defaults *pt_defaults = pi->powertune_defaults;
-	int ret;
-
-	ret = amdgpu_ci_read_smc_sram_dword(adev,
-				     SMU7_FIRMWARE_HEADER_LOCATION +
-				     offsetof(SMU7_Firmware_Header, PmFuseTable) +
-				     offsetof(SMU7_Discrete_PmFuses, TdcWaterfallCtl),
-				     (u32 *)&pi->smc_powertune_table.TdcWaterfallCtl,
-				     pi->sram_end);
-	if (ret)
-		return -EINVAL;
-	else
-		pi->smc_powertune_table.TdcWaterfallCtl = pt_defaults->tdc_waterfall_ctl;
-
-	return 0;
-}
-
-static int ci_populate_fuzzy_fan(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	if ((adev->pm.dpm.fan.fan_output_sensitivity & (1 << 15)) ||
-	    (adev->pm.dpm.fan.fan_output_sensitivity == 0))
-		adev->pm.dpm.fan.fan_output_sensitivity =
-			adev->pm.dpm.fan.default_fan_output_sensitivity;
-
-	pi->smc_powertune_table.FuzzyFan_PwmSetDelta =
-		cpu_to_be16(adev->pm.dpm.fan.fan_output_sensitivity);
-
-	return 0;
-}
-
-static int ci_min_max_v_gnbl_pm_lid_from_bapm_vddc(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u8 *hi_vid = pi->smc_powertune_table.BapmVddCVidHiSidd;
-	u8 *lo_vid = pi->smc_powertune_table.BapmVddCVidLoSidd;
-	int i, min, max;
-
-	min = max = hi_vid[0];
-	for (i = 0; i < 8; i++) {
-		if (0 != hi_vid[i]) {
-			if (min > hi_vid[i])
-				min = hi_vid[i];
-			if (max < hi_vid[i])
-				max = hi_vid[i];
-		}
-
-		if (0 != lo_vid[i]) {
-			if (min > lo_vid[i])
-				min = lo_vid[i];
-			if (max < lo_vid[i])
-				max = lo_vid[i];
-		}
-	}
-
-	if ((min == 0) || (max == 0))
-		return -EINVAL;
-	pi->smc_powertune_table.GnbLPMLMaxVid = (u8)max;
-	pi->smc_powertune_table.GnbLPMLMinVid = (u8)min;
-
-	return 0;
-}
-
-static int ci_populate_bapm_vddc_base_leakage_sidd(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u16 hi_sidd = pi->smc_powertune_table.BapmVddCBaseLeakageHiSidd;
-	u16 lo_sidd = pi->smc_powertune_table.BapmVddCBaseLeakageLoSidd;
-	struct amdgpu_cac_tdp_table *cac_tdp_table =
-		adev->pm.dpm.dyn_state.cac_tdp_table;
-
-	hi_sidd = cac_tdp_table->high_cac_leakage / 100 * 256;
-	lo_sidd = cac_tdp_table->low_cac_leakage / 100 * 256;
-
-	pi->smc_powertune_table.BapmVddCBaseLeakageHiSidd = cpu_to_be16(hi_sidd);
-	pi->smc_powertune_table.BapmVddCBaseLeakageLoSidd = cpu_to_be16(lo_sidd);
-
-	return 0;
-}
-
-static int ci_populate_bapm_parameters_in_dpm_table(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	const struct ci_pt_defaults *pt_defaults = pi->powertune_defaults;
-	SMU7_Discrete_DpmTable  *dpm_table = &pi->smc_state_table;
-	struct amdgpu_cac_tdp_table *cac_tdp_table =
-		adev->pm.dpm.dyn_state.cac_tdp_table;
-	struct amdgpu_ppm_table *ppm = adev->pm.dpm.dyn_state.ppm_table;
-	int i, j, k;
-	const u16 *def1;
-	const u16 *def2;
-
-	dpm_table->DefaultTdp = cac_tdp_table->tdp * 256;
-	dpm_table->TargetTdp = cac_tdp_table->configurable_tdp * 256;
-
-	dpm_table->DTETjOffset = (u8)pi->dte_tj_offset;
-	dpm_table->GpuTjMax =
-		(u8)(pi->thermal_temp_setting.temperature_high / 1000);
-	dpm_table->GpuTjHyst = 8;
-
-	dpm_table->DTEAmbientTempBase = pt_defaults->dte_ambient_temp_base;
-
-	if (ppm) {
-		dpm_table->PPM_PkgPwrLimit = cpu_to_be16((u16)ppm->dgpu_tdp * 256 / 1000);
-		dpm_table->PPM_TemperatureLimit = cpu_to_be16((u16)ppm->tj_max * 256);
-	} else {
-		dpm_table->PPM_PkgPwrLimit = cpu_to_be16(0);
-		dpm_table->PPM_TemperatureLimit = cpu_to_be16(0);
-	}
-
-	dpm_table->BAPM_TEMP_GRADIENT = cpu_to_be32(pt_defaults->bapm_temp_gradient);
-	def1 = pt_defaults->bapmti_r;
-	def2 = pt_defaults->bapmti_rc;
-
-	for (i = 0; i < SMU7_DTE_ITERATIONS; i++) {
-		for (j = 0; j < SMU7_DTE_SOURCES; j++) {
-			for (k = 0; k < SMU7_DTE_SINKS; k++) {
-				dpm_table->BAPMTI_R[i][j][k] = cpu_to_be16(*def1);
-				dpm_table->BAPMTI_RC[i][j][k] = cpu_to_be16(*def2);
-				def1++;
-				def2++;
-			}
-		}
-	}
-
-	return 0;
-}
-
-static int ci_populate_pm_base(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32 pm_fuse_table_offset;
-	int ret;
-
-	if (pi->caps_power_containment) {
-		ret = amdgpu_ci_read_smc_sram_dword(adev,
-					     SMU7_FIRMWARE_HEADER_LOCATION +
-					     offsetof(SMU7_Firmware_Header, PmFuseTable),
-					     &pm_fuse_table_offset, pi->sram_end);
-		if (ret)
-			return ret;
-		ret = ci_populate_bapm_vddc_vid_sidd(adev);
-		if (ret)
-			return ret;
-		ret = ci_populate_vddc_vid(adev);
-		if (ret)
-			return ret;
-		ret = ci_populate_svi_load_line(adev);
-		if (ret)
-			return ret;
-		ret = ci_populate_tdc_limit(adev);
-		if (ret)
-			return ret;
-		ret = ci_populate_dw8(adev);
-		if (ret)
-			return ret;
-		ret = ci_populate_fuzzy_fan(adev);
-		if (ret)
-			return ret;
-		ret = ci_min_max_v_gnbl_pm_lid_from_bapm_vddc(adev);
-		if (ret)
-			return ret;
-		ret = ci_populate_bapm_vddc_base_leakage_sidd(adev);
-		if (ret)
-			return ret;
-		ret = amdgpu_ci_copy_bytes_to_smc(adev, pm_fuse_table_offset,
-					   (u8 *)&pi->smc_powertune_table,
-					   sizeof(SMU7_Discrete_PmFuses), pi->sram_end);
-		if (ret)
-			return ret;
-	}
-
-	return 0;
-}
-
-static void ci_do_enable_didt(struct amdgpu_device *adev, const bool enable)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32 data;
-
-	if (pi->caps_sq_ramping) {
-		data = RREG32_DIDT(ixDIDT_SQ_CTRL0);
-		if (enable)
-			data |= DIDT_SQ_CTRL0__DIDT_CTRL_EN_MASK;
-		else
-			data &= ~DIDT_SQ_CTRL0__DIDT_CTRL_EN_MASK;
-		WREG32_DIDT(ixDIDT_SQ_CTRL0, data);
-	}
-
-	if (pi->caps_db_ramping) {
-		data = RREG32_DIDT(ixDIDT_DB_CTRL0);
-		if (enable)
-			data |= DIDT_DB_CTRL0__DIDT_CTRL_EN_MASK;
-		else
-			data &= ~DIDT_DB_CTRL0__DIDT_CTRL_EN_MASK;
-		WREG32_DIDT(ixDIDT_DB_CTRL0, data);
-	}
-
-	if (pi->caps_td_ramping) {
-		data = RREG32_DIDT(ixDIDT_TD_CTRL0);
-		if (enable)
-			data |= DIDT_TD_CTRL0__DIDT_CTRL_EN_MASK;
-		else
-			data &= ~DIDT_TD_CTRL0__DIDT_CTRL_EN_MASK;
-		WREG32_DIDT(ixDIDT_TD_CTRL0, data);
-	}
-
-	if (pi->caps_tcp_ramping) {
-		data = RREG32_DIDT(ixDIDT_TCP_CTRL0);
-		if (enable)
-			data |= DIDT_TCP_CTRL0__DIDT_CTRL_EN_MASK;
-		else
-			data &= ~DIDT_TCP_CTRL0__DIDT_CTRL_EN_MASK;
-		WREG32_DIDT(ixDIDT_TCP_CTRL0, data);
-	}
-}
-
-static int ci_program_pt_config_registers(struct amdgpu_device *adev,
-					  const struct ci_pt_config_reg *cac_config_regs)
-{
-	const struct ci_pt_config_reg *config_regs = cac_config_regs;
-	u32 data;
-	u32 cache = 0;
-
-	if (config_regs == NULL)
-		return -EINVAL;
-
-	while (config_regs->offset != 0xFFFFFFFF) {
-		if (config_regs->type == CISLANDS_CONFIGREG_CACHE) {
-			cache |= ((config_regs->value << config_regs->shift) & config_regs->mask);
-		} else {
-			switch (config_regs->type) {
-			case CISLANDS_CONFIGREG_SMC_IND:
-				data = RREG32_SMC(config_regs->offset);
-				break;
-			case CISLANDS_CONFIGREG_DIDT_IND:
-				data = RREG32_DIDT(config_regs->offset);
-				break;
-			default:
-				data = RREG32(config_regs->offset);
-				break;
-			}
-
-			data &= ~config_regs->mask;
-			data |= ((config_regs->value << config_regs->shift) & config_regs->mask);
-			data |= cache;
-
-			switch (config_regs->type) {
-			case CISLANDS_CONFIGREG_SMC_IND:
-				WREG32_SMC(config_regs->offset, data);
-				break;
-			case CISLANDS_CONFIGREG_DIDT_IND:
-				WREG32_DIDT(config_regs->offset, data);
-				break;
-			default:
-				WREG32(config_regs->offset, data);
-				break;
-			}
-			cache = 0;
-		}
-		config_regs++;
-	}
-	return 0;
-}
-
-static int ci_enable_didt(struct amdgpu_device *adev, bool enable)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	int ret;
-
-	if (pi->caps_sq_ramping || pi->caps_db_ramping ||
-	    pi->caps_td_ramping || pi->caps_tcp_ramping) {
-		amdgpu_gfx_rlc_enter_safe_mode(adev);
-
-		if (enable) {
-			ret = ci_program_pt_config_registers(adev, didt_config_ci);
-			if (ret) {
-				amdgpu_gfx_rlc_exit_safe_mode(adev);
-				return ret;
-			}
-		}
-
-		ci_do_enable_didt(adev, enable);
-
-		amdgpu_gfx_rlc_exit_safe_mode(adev);
-	}
-
-	return 0;
-}
-
-static int ci_enable_power_containment(struct amdgpu_device *adev, bool enable)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	PPSMC_Result smc_result;
-	int ret = 0;
-
-	if (enable) {
-		pi->power_containment_features = 0;
-		if (pi->caps_power_containment) {
-			if (pi->enable_bapm_feature) {
-				smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_EnableDTE);
-				if (smc_result != PPSMC_Result_OK)
-					ret = -EINVAL;
-				else
-					pi->power_containment_features |= POWERCONTAINMENT_FEATURE_BAPM;
-			}
-
-			if (pi->enable_tdc_limit_feature) {
-				smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_TDCLimitEnable);
-				if (smc_result != PPSMC_Result_OK)
-					ret = -EINVAL;
-				else
-					pi->power_containment_features |= POWERCONTAINMENT_FEATURE_TDCLimit;
-			}
-
-			if (pi->enable_pkg_pwr_tracking_feature) {
-				smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_PkgPwrLimitEnable);
-				if (smc_result != PPSMC_Result_OK) {
-					ret = -EINVAL;
-				} else {
-					struct amdgpu_cac_tdp_table *cac_tdp_table =
-						adev->pm.dpm.dyn_state.cac_tdp_table;
-					u32 default_pwr_limit =
-						(u32)(cac_tdp_table->maximum_power_delivery_limit * 256);
-
-					pi->power_containment_features |= POWERCONTAINMENT_FEATURE_PkgPwrLimit;
-
-					ci_set_power_limit(adev, default_pwr_limit);
-				}
-			}
-		}
-	} else {
-		if (pi->caps_power_containment && pi->power_containment_features) {
-			if (pi->power_containment_features & POWERCONTAINMENT_FEATURE_TDCLimit)
-				amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_TDCLimitDisable);
-
-			if (pi->power_containment_features & POWERCONTAINMENT_FEATURE_BAPM)
-				amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_DisableDTE);
-
-			if (pi->power_containment_features & POWERCONTAINMENT_FEATURE_PkgPwrLimit)
-				amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_PkgPwrLimitDisable);
-			pi->power_containment_features = 0;
-		}
-	}
-
-	return ret;
-}
-
-static int ci_enable_smc_cac(struct amdgpu_device *adev, bool enable)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	PPSMC_Result smc_result;
-	int ret = 0;
-
-	if (pi->caps_cac) {
-		if (enable) {
-			smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_EnableCac);
-			if (smc_result != PPSMC_Result_OK) {
-				ret = -EINVAL;
-				pi->cac_enabled = false;
-			} else {
-				pi->cac_enabled = true;
-			}
-		} else if (pi->cac_enabled) {
-			amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_DisableCac);
-			pi->cac_enabled = false;
-		}
-	}
-
-	return ret;
-}
-
-static int ci_enable_thermal_based_sclk_dpm(struct amdgpu_device *adev,
-					    bool enable)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	PPSMC_Result smc_result = PPSMC_Result_OK;
-
-	if (pi->thermal_sclk_dpm_enabled) {
-		if (enable)
-			smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_ENABLE_THERMAL_DPM);
-		else
-			smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_DISABLE_THERMAL_DPM);
-	}
-
-	if (smc_result == PPSMC_Result_OK)
-		return 0;
-	else
-		return -EINVAL;
-}
-
-static int ci_power_control_set_level(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct amdgpu_cac_tdp_table *cac_tdp_table =
-		adev->pm.dpm.dyn_state.cac_tdp_table;
-	s32 adjust_percent;
-	s32 target_tdp;
-	int ret = 0;
-	bool adjust_polarity = false; /* ??? */
-
-	if (pi->caps_power_containment) {
-		adjust_percent = adjust_polarity ?
-			adev->pm.dpm.tdp_adjustment : (-1 * adev->pm.dpm.tdp_adjustment);
-		target_tdp = ((100 + adjust_percent) *
-			      (s32)cac_tdp_table->configurable_tdp) / 100;
-
-		ret = ci_set_overdrive_target_tdp(adev, (u32)target_tdp);
-	}
-
-	return ret;
-}
-
-static void ci_dpm_powergate_uvd(void *handle, bool gate)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	pi->uvd_power_gated = gate;
-
-	if (gate) {
-		/* stop the UVD block */
-		amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
-						       AMD_PG_STATE_GATE);
-		ci_update_uvd_dpm(adev, gate);
-	} else {
-		amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_UVD,
-						       AMD_PG_STATE_UNGATE);
-		ci_update_uvd_dpm(adev, gate);
-	}
-}
-
-static bool ci_dpm_vblank_too_short(void *handle)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	u32 vblank_time = amdgpu_dpm_get_vblank_time(adev);
-	u32 switch_limit = adev->gmc.vram_type == AMDGPU_VRAM_TYPE_GDDR5 ? 450 : 300;
-
-	/* disable mclk switching if the refresh is >120Hz, even if the
-	 * blanking period would allow it
-	 */
-	if (amdgpu_dpm_get_vrefresh(adev) > 120)
-		return true;
-
-	return vblank_time < switch_limit;
-}
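-
-/* Illustrative numbers for the check above (not from the original source):
- * with GDDR5 the MCLK switch needs ~450 us of blanking.  A 1080p@60 mode
- * with a 45-line vertical blank and a ~67.5 kHz line rate blanks for about
- * 45 / 67500 s ~= 666 us, so switching stays enabled; the same mode driven
- * at 120 Hz blanks for only ~333 us and fails the vblank_time check even
- * before the >120 Hz refresh cutoff applies.
- */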
-
-static void ci_apply_state_adjust_rules(struct amdgpu_device *adev,
-					struct amdgpu_ps *rps)
-{
-	struct ci_ps *ps = ci_get_ps(rps);
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct amdgpu_clock_and_voltage_limits *max_limits;
-	bool disable_mclk_switching;
-	u32 sclk, mclk;
-	int i;
-
-	if (rps->vce_active) {
-		rps->evclk = adev->pm.dpm.vce_states[adev->pm.dpm.vce_level].evclk;
-		rps->ecclk = adev->pm.dpm.vce_states[adev->pm.dpm.vce_level].ecclk;
-	} else {
-		rps->evclk = 0;
-		rps->ecclk = 0;
-	}
-
-	disable_mclk_switching = (adev->pm.dpm.new_active_crtc_count > 1) ||
-				 ci_dpm_vblank_too_short(adev);
-
-	pi->battery_state = (rps->class & ATOM_PPLIB_CLASSIFICATION_UI_MASK) ==
-		ATOM_PPLIB_CLASSIFICATION_UI_BATTERY;
-
-	if (adev->pm.ac_power)
-		max_limits = &adev->pm.dpm.dyn_state.max_clock_voltage_on_ac;
-	else
-		max_limits = &adev->pm.dpm.dyn_state.max_clock_voltage_on_dc;
-
-	if (!adev->pm.ac_power) {
-		for (i = 0; i < ps->performance_level_count; i++) {
-			if (ps->performance_levels[i].mclk > max_limits->mclk)
-				ps->performance_levels[i].mclk = max_limits->mclk;
-			if (ps->performance_levels[i].sclk > max_limits->sclk)
-				ps->performance_levels[i].sclk = max_limits->sclk;
-		}
-	}
-
-	/* XXX validate the min clocks required for display */
-
-	if (disable_mclk_switching) {
-		mclk = ps->performance_levels[ps->performance_level_count - 1].mclk;
-		sclk = ps->performance_levels[0].sclk;
-	} else {
-		mclk = ps->performance_levels[0].mclk;
-		sclk = ps->performance_levels[0].sclk;
-	}
-
-	if (adev->pm.pm_display_cfg.min_core_set_clock > sclk)
-		sclk = adev->pm.pm_display_cfg.min_core_set_clock;
-
-	if (adev->pm.pm_display_cfg.min_mem_set_clock > mclk)
-		mclk = adev->pm.pm_display_cfg.min_mem_set_clock;
-
-	if (rps->vce_active) {
-		if (sclk < adev->pm.dpm.vce_states[adev->pm.dpm.vce_level].sclk)
-			sclk = adev->pm.dpm.vce_states[adev->pm.dpm.vce_level].sclk;
-		if (mclk < adev->pm.dpm.vce_states[adev->pm.dpm.vce_level].mclk)
-			mclk = adev->pm.dpm.vce_states[adev->pm.dpm.vce_level].mclk;
-	}
-
-	ps->performance_levels[0].sclk = sclk;
-	ps->performance_levels[0].mclk = mclk;
-
-	if (ps->performance_levels[1].sclk < ps->performance_levels[0].sclk)
-		ps->performance_levels[1].sclk = ps->performance_levels[0].sclk;
-
-	if (disable_mclk_switching) {
-		if (ps->performance_levels[0].mclk < ps->performance_levels[1].mclk)
-			ps->performance_levels[0].mclk = ps->performance_levels[1].mclk;
-	} else {
-		if (ps->performance_levels[1].mclk < ps->performance_levels[0].mclk)
-			ps->performance_levels[1].mclk = ps->performance_levels[0].mclk;
-	}
-}
-
-static int ci_thermal_set_temperature_range(struct amdgpu_device *adev,
-					    int min_temp, int max_temp)
-{
-	int low_temp = 0 * 1000;
-	int high_temp = 255 * 1000;
-	u32 tmp;
-
-	if (low_temp < min_temp)
-		low_temp = min_temp;
-	if (high_temp > max_temp)
-		high_temp = max_temp;
-	if (high_temp < low_temp) {
-		DRM_ERROR("invalid thermal range: %d - %d\n", low_temp, high_temp);
-		return -EINVAL;
-	}
-
-	tmp = RREG32_SMC(ixCG_THERMAL_INT);
-	tmp &= ~(CG_THERMAL_INT__DIG_THERM_INTH_MASK | CG_THERMAL_INT__DIG_THERM_INTL_MASK);
-	tmp |= ((high_temp / 1000) << CG_THERMAL_INT__DIG_THERM_INTH__SHIFT) |
-		((low_temp / 1000) << CG_THERMAL_INT__DIG_THERM_INTL__SHIFT);
-	WREG32_SMC(ixCG_THERMAL_INT, tmp);
-
-#if 0
-	/* XXX: need to figure out how to handle this properly */
-	tmp = RREG32_SMC(ixCG_THERMAL_CTRL);
-	tmp &= DIG_THERM_DPM_MASK;
-	tmp |= DIG_THERM_DPM(high_temp / 1000);
-	WREG32_SMC(ixCG_THERMAL_CTRL, tmp);
-#endif
-
-	adev->pm.dpm.thermal.min_temp = low_temp;
-	adev->pm.dpm.thermal.max_temp = high_temp;
-	return 0;
-}
-
-static int ci_thermal_enable_alert(struct amdgpu_device *adev,
-				   bool enable)
-{
-	u32 thermal_int = RREG32_SMC(ixCG_THERMAL_INT);
-	PPSMC_Result result;
-
-	if (enable) {
-		thermal_int &= ~(CG_THERMAL_INT_CTRL__THERM_INTH_MASK_MASK |
-				 CG_THERMAL_INT_CTRL__THERM_INTL_MASK_MASK);
-		WREG32_SMC(ixCG_THERMAL_INT, thermal_int);
-		result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_Thermal_Cntl_Enable);
-		if (result != PPSMC_Result_OK) {
-			DRM_DEBUG_KMS("Could not enable thermal interrupts.\n");
-			return -EINVAL;
-		}
-	} else {
-		thermal_int |= CG_THERMAL_INT_CTRL__THERM_INTH_MASK_MASK |
-			CG_THERMAL_INT_CTRL__THERM_INTL_MASK_MASK;
-		WREG32_SMC(ixCG_THERMAL_INT, thermal_int);
-		result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_Thermal_Cntl_Disable);
-		if (result != PPSMC_Result_OK) {
-			DRM_DEBUG_KMS("Could not disable thermal interrupts.\n");
-			return -EINVAL;
-		}
-	}
-
-	return 0;
-}
-
-static void ci_fan_ctrl_set_static_mode(struct amdgpu_device *adev, u32 mode)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32 tmp;
-
-	if (pi->fan_ctrl_is_in_default_mode) {
-		tmp = (RREG32_SMC(ixCG_FDO_CTRL2) & CG_FDO_CTRL2__FDO_PWM_MODE_MASK)
-			>> CG_FDO_CTRL2__FDO_PWM_MODE__SHIFT;
-		pi->fan_ctrl_default_mode = tmp;
-		tmp = (RREG32_SMC(ixCG_FDO_CTRL2) & CG_FDO_CTRL2__TMIN_MASK)
-			>> CG_FDO_CTRL2__TMIN__SHIFT;
-		pi->t_min = tmp;
-		pi->fan_ctrl_is_in_default_mode = false;
-	}
-
-	tmp = RREG32_SMC(ixCG_FDO_CTRL2) & ~CG_FDO_CTRL2__TMIN_MASK;
-	tmp |= 0 << CG_FDO_CTRL2__TMIN__SHIFT;
-	WREG32_SMC(ixCG_FDO_CTRL2, tmp);
-
-	tmp = RREG32_SMC(ixCG_FDO_CTRL2) & ~CG_FDO_CTRL2__FDO_PWM_MODE_MASK;
-	tmp |= mode << CG_FDO_CTRL2__FDO_PWM_MODE__SHIFT;
-	WREG32_SMC(ixCG_FDO_CTRL2, tmp);
-}
-
-static int ci_thermal_setup_fan_table(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	SMU7_Discrete_FanTable fan_table = { FDO_MODE_HARDWARE };
-	u32 duty100;
-	u32 t_diff1, t_diff2, pwm_diff1, pwm_diff2;
-	u16 fdo_min, slope1, slope2;
-	u32 reference_clock, tmp;
-	int ret;
-	u64 tmp64;
-
-	if (!pi->fan_table_start) {
-		adev->pm.dpm.fan.ucode_fan_control = false;
-		return 0;
-	}
-
-	duty100 = (RREG32_SMC(ixCG_FDO_CTRL1) & CG_FDO_CTRL1__FMAX_DUTY100_MASK)
-		>> CG_FDO_CTRL1__FMAX_DUTY100__SHIFT;
-
-	if (duty100 == 0) {
-		adev->pm.dpm.fan.ucode_fan_control = false;
-		return 0;
-	}
-
-	tmp64 = (u64)adev->pm.dpm.fan.pwm_min * duty100;
-	do_div(tmp64, 10000);
-	fdo_min = (u16)tmp64;
-
-	t_diff1 = adev->pm.dpm.fan.t_med - adev->pm.dpm.fan.t_min;
-	t_diff2 = adev->pm.dpm.fan.t_high - adev->pm.dpm.fan.t_med;
-
-	pwm_diff1 = adev->pm.dpm.fan.pwm_med - adev->pm.dpm.fan.pwm_min;
-	pwm_diff2 = adev->pm.dpm.fan.pwm_high - adev->pm.dpm.fan.pwm_med;
-
-	slope1 = (u16)((50 + ((16 * duty100 * pwm_diff1) / t_diff1)) / 100);
-	slope2 = (u16)((50 + ((16 * duty100 * pwm_diff2) / t_diff2)) / 100);
-
-	fan_table.TempMin = cpu_to_be16((50 + adev->pm.dpm.fan.t_min) / 100);
-	fan_table.TempMed = cpu_to_be16((50 + adev->pm.dpm.fan.t_med) / 100);
-	fan_table.TempMax = cpu_to_be16((50 + adev->pm.dpm.fan.t_max) / 100);
-
-	fan_table.Slope1 = cpu_to_be16(slope1);
-	fan_table.Slope2 = cpu_to_be16(slope2);
-
-	fan_table.FdoMin = cpu_to_be16(fdo_min);
-
-	fan_table.HystDown = cpu_to_be16(adev->pm.dpm.fan.t_hyst);
-
-	fan_table.HystUp = cpu_to_be16(1);
-
-	fan_table.HystSlope = cpu_to_be16(1);
-
-	fan_table.TempRespLim = cpu_to_be16(5);
-
-	reference_clock = amdgpu_asic_get_xclk(adev);
-
-	fan_table.RefreshPeriod = cpu_to_be32((adev->pm.dpm.fan.cycle_delay *
-					       reference_clock) / 1600);
-
-	fan_table.FdoMax = cpu_to_be16((u16)duty100);
-
-	tmp = (RREG32_SMC(ixCG_MULT_THERMAL_CTRL) & CG_MULT_THERMAL_CTRL__TEMP_SEL_MASK)
-		>> CG_MULT_THERMAL_CTRL__TEMP_SEL__SHIFT;
-	fan_table.TempSrc = (uint8_t)tmp;
-
-	ret = amdgpu_ci_copy_bytes_to_smc(adev,
-					  pi->fan_table_start,
-					  (u8 *)(&fan_table),
-					  sizeof(fan_table),
-					  pi->sram_end);
-
-	if (ret) {
-		DRM_ERROR("Failed to load fan table to the SMC.");
-		adev->pm.dpm.fan.ucode_fan_control = false;
-	}
-
-	return 0;
-}
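-
-/* Worked example of the slope math above (illustrative values, not taken
- * from a real fan table): with duty100 = 255 and a pwm_min..pwm_med ramp of
- * 30% -> 60% (pwm_* in 0.01% units, so pwm_diff1 = 3000) spread over
- * t_min..t_med = 45 C -> 65 C (t_* in 0.01 C units, so t_diff1 = 2000):
- *   slope1 = (50 + 16 * 255 * 3000 / 2000) / 100 = 61
- * i.e. roughly 3.8 duty counts per degree in 4-bit fixed point, with the
- * +50 providing round-to-nearest before the final /100.
- */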
-
-static int ci_fan_ctrl_start_smc_fan_control(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	PPSMC_Result ret;
-
-	if (pi->caps_od_fuzzy_fan_control_support) {
-		ret = amdgpu_ci_send_msg_to_smc_with_parameter(adev,
-							       PPSMC_StartFanControl,
-							       FAN_CONTROL_FUZZY);
-		if (ret != PPSMC_Result_OK)
-			return -EINVAL;
-		ret = amdgpu_ci_send_msg_to_smc_with_parameter(adev,
-							       PPSMC_MSG_SetFanPwmMax,
-							       adev->pm.dpm.fan.default_max_fan_pwm);
-		if (ret != PPSMC_Result_OK)
-			return -EINVAL;
-	} else {
-		ret = amdgpu_ci_send_msg_to_smc_with_parameter(adev,
-							       PPSMC_StartFanControl,
-							       FAN_CONTROL_TABLE);
-		if (ret != PPSMC_Result_OK)
-			return -EINVAL;
-	}
-
-	pi->fan_is_controlled_by_smc = true;
-	return 0;
-}
-
-static int ci_fan_ctrl_stop_smc_fan_control(struct amdgpu_device *adev)
-{
-	PPSMC_Result ret;
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	ret = amdgpu_ci_send_msg_to_smc(adev, PPSMC_StopFanControl);
-	if (ret == PPSMC_Result_OK) {
-		pi->fan_is_controlled_by_smc = false;
-		return 0;
-	} else {
-		return -EINVAL;
-	}
-}
-
-static int ci_dpm_get_fan_speed_percent(void *handle,
-					u32 *speed)
-{
-	u32 duty, duty100;
-	u64 tmp64;
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-	if (adev->pm.no_fan)
-		return -ENOENT;
-
-	duty100 = (RREG32_SMC(ixCG_FDO_CTRL1) & CG_FDO_CTRL1__FMAX_DUTY100_MASK)
-		>> CG_FDO_CTRL1__FMAX_DUTY100__SHIFT;
-	duty = (RREG32_SMC(ixCG_THERMAL_STATUS) & CG_THERMAL_STATUS__FDO_PWM_DUTY_MASK)
-		>> CG_THERMAL_STATUS__FDO_PWM_DUTY__SHIFT;
-
-	if (duty100 == 0)
-		return -EINVAL;
-
-	tmp64 = (u64)duty * 100;
-	do_div(tmp64, duty100);
-	*speed = (u32)tmp64;
-
-	if (*speed > 100)
-		*speed = 100;
-
-	return 0;
-}
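-
-/* Example of the duty -> percent conversion above (illustrative): with a
- * full-scale duty100 of 255, a measured duty of 128 reports
- * (128 * 100) / 255 = 50, i.e. a 50% fan speed.
- */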
-
-static int ci_dpm_set_fan_speed_percent(void *handle,
-					u32 speed)
-{
-	u32 tmp;
-	u32 duty, duty100;
-	u64 tmp64;
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	if (adev->pm.no_fan)
-		return -ENOENT;
-
-	if (pi->fan_is_controlled_by_smc)
-		return -EINVAL;
-
-	if (speed > 100)
-		return -EINVAL;
-
-	duty100 = (RREG32_SMC(ixCG_FDO_CTRL1) & CG_FDO_CTRL1__FMAX_DUTY100_MASK)
-		>> CG_FDO_CTRL1__FMAX_DUTY100__SHIFT;
-
-	if (duty100 == 0)
-		return -EINVAL;
-
-	tmp64 = (u64)speed * duty100;
-	do_div(tmp64, 100);
-	duty = (u32)tmp64;
-
-	tmp = RREG32_SMC(ixCG_FDO_CTRL0) & ~CG_FDO_CTRL0__FDO_STATIC_DUTY_MASK;
-	tmp |= duty << CG_FDO_CTRL0__FDO_STATIC_DUTY__SHIFT;
-	WREG32_SMC(ixCG_FDO_CTRL0, tmp);
-
-	return 0;
-}
-
-static void ci_dpm_set_fan_control_mode(void *handle, u32 mode)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-	switch (mode) {
-	case AMD_FAN_CTRL_NONE:
-		if (adev->pm.dpm.fan.ucode_fan_control)
-			ci_fan_ctrl_stop_smc_fan_control(adev);
-		ci_dpm_set_fan_speed_percent(adev, 100);
-		break;
-	case AMD_FAN_CTRL_MANUAL:
-		if (adev->pm.dpm.fan.ucode_fan_control)
-			ci_fan_ctrl_stop_smc_fan_control(adev);
-		break;
-	case AMD_FAN_CTRL_AUTO:
-		if (adev->pm.dpm.fan.ucode_fan_control)
-			ci_thermal_start_smc_fan_control(adev);
-		break;
-	default:
-		break;
-	}
-}
-
-static u32 ci_dpm_get_fan_control_mode(void *handle)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	if (pi->fan_is_controlled_by_smc)
-		return AMD_FAN_CTRL_AUTO;
-	else
-		return AMD_FAN_CTRL_MANUAL;
-}
-
-#if 0
-static int ci_fan_ctrl_get_fan_speed_rpm(struct amdgpu_device *adev,
-					 u32 *speed)
-{
-	u32 tach_period;
-	u32 xclk = amdgpu_asic_get_xclk(adev);
-
-	if (adev->pm.no_fan)
-		return -ENOENT;
-
-	if (adev->pm.fan_pulses_per_revolution == 0)
-		return -ENOENT;
-
-	tach_period = (RREG32_SMC(ixCG_TACH_STATUS) & CG_TACH_STATUS__TACH_PERIOD_MASK)
-		>> CG_TACH_STATUS__TACH_PERIOD__SHIFT;
-	if (tach_period == 0)
-		return -ENOENT;
-
-	*speed = 60 * xclk * 10000 / tach_period;
-
-	return 0;
-}
-
-static int ci_fan_ctrl_set_fan_speed_rpm(struct amdgpu_device *adev,
-					 u32 speed)
-{
-	u32 tach_period, tmp;
-	u32 xclk = amdgpu_asic_get_xclk(adev);
-
-	if (adev->pm.no_fan)
-		return -ENOENT;
-
-	if (adev->pm.fan_pulses_per_revolution == 0)
-		return -ENOENT;
-
-	if ((speed < adev->pm.fan_min_rpm) ||
-	    (speed > adev->pm.fan_max_rpm))
-		return -EINVAL;
-
-	if (adev->pm.dpm.fan.ucode_fan_control)
-		ci_fan_ctrl_stop_smc_fan_control(adev);
-
-	tach_period = 60 * xclk * 10000 / (8 * speed);
-	tmp = RREG32_SMC(ixCG_TACH_CTRL) & ~CG_TACH_CTRL__TARGET_PERIOD_MASK;
-	tmp |= tach_period << CG_TACH_CTRL__TARGET_PERIOD__SHIFT;
-	WREG32_SMC(ixCG_TACH_CTRL, tmp);
-
-	ci_fan_ctrl_set_static_mode(adev, FDO_PWM_MODE_STATIC_RPM);
-
-	return 0;
-}
-#endif
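-
-/* Note on the #if 0 RPM helpers above: the two formulas are only inverses
- * up to a factor of 8 -- the set path programs
- * tach_period = 60 * xclk * 10000 / (8 * rpm), hard-coding 8 tach pulses
- * per revolution, while the readback omits that factor.  Presumably the 8
- * should track adev->pm.fan_pulses_per_revolution, which
- * ci_thermal_initialize() programs into EDGE_PER_REV.
- */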
-
-static void ci_fan_ctrl_set_default_mode(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32 tmp;
-
-	if (!pi->fan_ctrl_is_in_default_mode) {
-		tmp = RREG32_SMC(ixCG_FDO_CTRL2) & ~CG_FDO_CTRL2__FDO_PWM_MODE_MASK;
-		tmp |= pi->fan_ctrl_default_mode << CG_FDO_CTRL2__FDO_PWM_MODE__SHIFT;
-		WREG32_SMC(ixCG_FDO_CTRL2, tmp);
-
-		tmp = RREG32_SMC(ixCG_FDO_CTRL2) & ~CG_FDO_CTRL2__TMIN_MASK;
-		tmp |= pi->t_min << CG_FDO_CTRL2__TMIN__SHIFT;
-		WREG32_SMC(ixCG_FDO_CTRL2, tmp);
-		pi->fan_ctrl_is_in_default_mode = true;
-	}
-}
-
-static void ci_thermal_start_smc_fan_control(struct amdgpu_device *adev)
-{
-	if (adev->pm.dpm.fan.ucode_fan_control) {
-		ci_fan_ctrl_start_smc_fan_control(adev);
-		ci_fan_ctrl_set_static_mode(adev, FDO_PWM_MODE_STATIC);
-	}
-}
-
-static void ci_thermal_initialize(struct amdgpu_device *adev)
-{
-	u32 tmp;
-
-	if (adev->pm.fan_pulses_per_revolution) {
-		tmp = RREG32_SMC(ixCG_TACH_CTRL) & ~CG_TACH_CTRL__EDGE_PER_REV_MASK;
-		tmp |= (adev->pm.fan_pulses_per_revolution - 1)
-			<< CG_TACH_CTRL__EDGE_PER_REV__SHIFT;
-		WREG32_SMC(ixCG_TACH_CTRL, tmp);
-	}
-
-	tmp = RREG32_SMC(ixCG_FDO_CTRL2) & ~CG_FDO_CTRL2__TACH_PWM_RESP_RATE_MASK;
-	tmp |= 0x28 << CG_FDO_CTRL2__TACH_PWM_RESP_RATE__SHIFT;
-	WREG32_SMC(ixCG_FDO_CTRL2, tmp);
-}
-
-static int ci_thermal_start_thermal_controller(struct amdgpu_device *adev)
-{
-	int ret;
-
-	ci_thermal_initialize(adev);
-	ret = ci_thermal_set_temperature_range(adev, CISLANDS_TEMP_RANGE_MIN, CISLANDS_TEMP_RANGE_MAX);
-	if (ret)
-		return ret;
-	ret = ci_thermal_enable_alert(adev, true);
-	if (ret)
-		return ret;
-	if (adev->pm.dpm.fan.ucode_fan_control) {
-		ret = ci_thermal_setup_fan_table(adev);
-		if (ret)
-			return ret;
-		ci_thermal_start_smc_fan_control(adev);
-	}
-
-	return 0;
-}
-
-static void ci_thermal_stop_thermal_controller(struct amdgpu_device *adev)
-{
-	if (!adev->pm.no_fan)
-		ci_fan_ctrl_set_default_mode(adev);
-}
-
-static int ci_read_smc_soft_register(struct amdgpu_device *adev,
-				     u16 reg_offset, u32 *value)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	return amdgpu_ci_read_smc_sram_dword(adev,
-				      pi->soft_regs_start + reg_offset,
-				      value, pi->sram_end);
-}
-
-static int ci_write_smc_soft_register(struct amdgpu_device *adev,
-				      u16 reg_offset, u32 value)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	return amdgpu_ci_write_smc_sram_dword(adev,
-				       pi->soft_regs_start + reg_offset,
-				       value, pi->sram_end);
-}
-
-static void ci_init_fps_limits(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	SMU7_Discrete_DpmTable *table = &pi->smc_state_table;
-
-	if (pi->caps_fps) {
-		table->FpsHighT = cpu_to_be16(45);
-		table->FpsLowT = cpu_to_be16(30);
-	}
-}
-
-static int ci_update_sclk_t(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	int ret = 0;
-	u32 low_sclk_interrupt_t = 0;
-
-	if (pi->caps_sclk_throttle_low_notification) {
-		low_sclk_interrupt_t = cpu_to_be32(pi->low_sclk_interrupt_t);
-
-		ret = amdgpu_ci_copy_bytes_to_smc(adev,
-					   pi->dpm_table_start +
-					   offsetof(SMU7_Discrete_DpmTable, LowSclkInterruptT),
-					   (u8 *)&low_sclk_interrupt_t,
-					   sizeof(u32), pi->sram_end);
-
-	}
-
-	return ret;
-}
-
-static void ci_get_leakage_voltages(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u16 leakage_id, virtual_voltage_id;
-	u16 vddc, vddci;
-	int i;
-
-	pi->vddc_leakage.count = 0;
-	pi->vddci_leakage.count = 0;
-
-	if (adev->pm.dpm.platform_caps & ATOM_PP_PLATFORM_CAP_EVV) {
-		for (i = 0; i < CISLANDS_MAX_LEAKAGE_COUNT; i++) {
-			virtual_voltage_id = ATOM_VIRTUAL_VOLTAGE_ID0 + i;
-			if (amdgpu_atombios_get_voltage_evv(adev, virtual_voltage_id, &vddc) != 0)
-				continue;
-			if (vddc != 0 && vddc != virtual_voltage_id) {
-				pi->vddc_leakage.actual_voltage[pi->vddc_leakage.count] = vddc;
-				pi->vddc_leakage.leakage_id[pi->vddc_leakage.count] = virtual_voltage_id;
-				pi->vddc_leakage.count++;
-			}
-		}
-	} else if (amdgpu_atombios_get_leakage_id_from_vbios(adev, &leakage_id) == 0) {
-		for (i = 0; i < CISLANDS_MAX_LEAKAGE_COUNT; i++) {
-			virtual_voltage_id = ATOM_VIRTUAL_VOLTAGE_ID0 + i;
-			if (amdgpu_atombios_get_leakage_vddc_based_on_leakage_params(adev, &vddc, &vddci,
-										     virtual_voltage_id,
-										     leakage_id) == 0) {
-				if (vddc != 0 && vddc != virtual_voltage_id) {
-					pi->vddc_leakage.actual_voltage[pi->vddc_leakage.count] = vddc;
-					pi->vddc_leakage.leakage_id[pi->vddc_leakage.count] = virtual_voltage_id;
-					pi->vddc_leakage.count++;
-				}
-				if (vddci != 0 && vddci != virtual_voltage_id) {
-					pi->vddci_leakage.actual_voltage[pi->vddci_leakage.count] = vddci;
-					pi->vddci_leakage.leakage_id[pi->vddci_leakage.count] = virtual_voltage_id;
-					pi->vddci_leakage.count++;
-				}
-			}
-		}
-	}
-}
-
-static void ci_set_dpm_event_sources(struct amdgpu_device *adev, u32 sources)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	bool want_thermal_protection;
-	enum amdgpu_dpm_event_src dpm_event_src;
-	u32 tmp;
-
-	switch (sources) {
-	case 0:
-	default:
-		want_thermal_protection = false;
-		break;
-	case (1 << AMDGPU_DPM_AUTO_THROTTLE_SRC_THERMAL):
-		want_thermal_protection = true;
-		dpm_event_src = AMDGPU_DPM_EVENT_SRC_DIGITAL;
-		break;
-	case (1 << AMDGPU_DPM_AUTO_THROTTLE_SRC_EXTERNAL):
-		want_thermal_protection = true;
-		dpm_event_src = AMDGPU_DPM_EVENT_SRC_EXTERNAL;
-		break;
-	case ((1 << AMDGPU_DPM_AUTO_THROTTLE_SRC_EXTERNAL) |
-	      (1 << AMDGPU_DPM_AUTO_THROTTLE_SRC_THERMAL)):
-		want_thermal_protection = true;
-		dpm_event_src = AMDGPU_DPM_EVENT_SRC_DIGIAL_OR_EXTERNAL;
-		break;
-	}
-
-	if (want_thermal_protection) {
-#if 0
-		/* XXX: need to figure out how to handle this properly */
-		tmp = RREG32_SMC(ixCG_THERMAL_CTRL);
-		tmp &= DPM_EVENT_SRC_MASK;
-		tmp |= DPM_EVENT_SRC(dpm_event_src);
-		WREG32_SMC(ixCG_THERMAL_CTRL, tmp);
-#endif
-
-		tmp = RREG32_SMC(ixGENERAL_PWRMGT);
-		if (pi->thermal_protection)
-			tmp &= ~GENERAL_PWRMGT__THERMAL_PROTECTION_DIS_MASK;
-		else
-			tmp |= GENERAL_PWRMGT__THERMAL_PROTECTION_DIS_MASK;
-		WREG32_SMC(ixGENERAL_PWRMGT, tmp);
-	} else {
-		tmp = RREG32_SMC(ixGENERAL_PWRMGT);
-		tmp |= GENERAL_PWRMGT__THERMAL_PROTECTION_DIS_MASK;
-		WREG32_SMC(ixGENERAL_PWRMGT, tmp);
-	}
-}
-
-static void ci_enable_auto_throttle_source(struct amdgpu_device *adev,
-					   enum amdgpu_dpm_auto_throttle_src source,
-					   bool enable)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	if (enable) {
-		if (!(pi->active_auto_throttle_sources & (1 << source))) {
-			pi->active_auto_throttle_sources |= 1 << source;
-			ci_set_dpm_event_sources(adev, pi->active_auto_throttle_sources);
-		}
-	} else {
-		if (pi->active_auto_throttle_sources & (1 << source)) {
-			pi->active_auto_throttle_sources &= ~(1 << source);
-			ci_set_dpm_event_sources(adev, pi->active_auto_throttle_sources);
-		}
-	}
-}
-
-static void ci_enable_vr_hot_gpio_interrupt(struct amdgpu_device *adev)
-{
-	if (adev->pm.dpm.platform_caps & ATOM_PP_PLATFORM_CAP_REGULATOR_HOT)
-		amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_EnableVRHotGPIOInterrupt);
-}
-
-static int ci_unfreeze_sclk_mclk_dpm(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	PPSMC_Result smc_result;
-
-	if (!pi->need_update_smu7_dpm_table)
-		return 0;
-
-	if ((!pi->sclk_dpm_key_disabled) &&
-	    (pi->need_update_smu7_dpm_table & (DPMTABLE_OD_UPDATE_SCLK | DPMTABLE_UPDATE_SCLK))) {
-		smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_SCLKDPM_UnfreezeLevel);
-		if (smc_result != PPSMC_Result_OK)
-			return -EINVAL;
-	}
-
-	if ((!pi->mclk_dpm_key_disabled) &&
-	    (pi->need_update_smu7_dpm_table & DPMTABLE_OD_UPDATE_MCLK)) {
-		smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_MCLKDPM_UnfreezeLevel);
-		if (smc_result != PPSMC_Result_OK)
-			return -EINVAL;
-	}
-
-	pi->need_update_smu7_dpm_table = 0;
-	return 0;
-}
-
-static int ci_enable_sclk_mclk_dpm(struct amdgpu_device *adev, bool enable)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	PPSMC_Result smc_result;
-
-	if (enable) {
-		if (!pi->sclk_dpm_key_disabled) {
-			smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_DPM_Enable);
-			if (smc_result != PPSMC_Result_OK)
-				return -EINVAL;
-		}
-
-		if (!pi->mclk_dpm_key_disabled) {
-			smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_MCLKDPM_Enable);
-			if (smc_result != PPSMC_Result_OK)
-				return -EINVAL;
-
-			WREG32_P(mmMC_SEQ_CNTL_3, MC_SEQ_CNTL_3__CAC_EN_MASK,
-					~MC_SEQ_CNTL_3__CAC_EN_MASK);
-
-			WREG32_SMC(ixLCAC_MC0_CNTL, 0x05);
-			WREG32_SMC(ixLCAC_MC1_CNTL, 0x05);
-			WREG32_SMC(ixLCAC_CPL_CNTL, 0x100005);
-
-			udelay(10);
-
-			WREG32_SMC(ixLCAC_MC0_CNTL, 0x400005);
-			WREG32_SMC(ixLCAC_MC1_CNTL, 0x400005);
-			WREG32_SMC(ixLCAC_CPL_CNTL, 0x500005);
-		}
-	} else {
-		if (!pi->sclk_dpm_key_disabled) {
-			smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_DPM_Disable);
-			if (smc_result != PPSMC_Result_OK)
-				return -EINVAL;
-		}
-
-		if (!pi->mclk_dpm_key_disabled) {
-			smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_MCLKDPM_Disable);
-			if (smc_result != PPSMC_Result_OK)
-				return -EINVAL;
-		}
-	}
-
-	return 0;
-}
-
-static int ci_start_dpm(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	PPSMC_Result smc_result;
-	int ret;
-	u32 tmp;
-
-	tmp = RREG32_SMC(ixGENERAL_PWRMGT);
-	tmp |= GENERAL_PWRMGT__GLOBAL_PWRMGT_EN_MASK;
-	WREG32_SMC(ixGENERAL_PWRMGT, tmp);
-
-	tmp = RREG32_SMC(ixSCLK_PWRMGT_CNTL);
-	tmp |= SCLK_PWRMGT_CNTL__DYNAMIC_PM_EN_MASK;
-	WREG32_SMC(ixSCLK_PWRMGT_CNTL, tmp);
-
-	ci_write_smc_soft_register(adev, offsetof(SMU7_SoftRegisters, VoltageChangeTimeout), 0x1000);
-
-	WREG32_P(mmBIF_LNCNT_RESET, 0, ~BIF_LNCNT_RESET__RESET_LNCNT_EN_MASK);
-
-	smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_Voltage_Cntl_Enable);
-	if (smc_result != PPSMC_Result_OK)
-		return -EINVAL;
-
-	ret = ci_enable_sclk_mclk_dpm(adev, true);
-	if (ret)
-		return ret;
-
-	if (!pi->pcie_dpm_key_disabled) {
-		smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_PCIeDPM_Enable);
-		if (smc_result != PPSMC_Result_OK)
-			return -EINVAL;
-	}
-
-	return 0;
-}
-
-static int ci_freeze_sclk_mclk_dpm(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	PPSMC_Result smc_result;
-
-	if (!pi->need_update_smu7_dpm_table)
-		return 0;
-
-	if ((!pi->sclk_dpm_key_disabled) &&
-	    (pi->need_update_smu7_dpm_table & (DPMTABLE_OD_UPDATE_SCLK | DPMTABLE_UPDATE_SCLK))) {
-		smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_SCLKDPM_FreezeLevel);
-		if (smc_result != PPSMC_Result_OK)
-			return -EINVAL;
-	}
-
-	if ((!pi->mclk_dpm_key_disabled) &&
-	    (pi->need_update_smu7_dpm_table & DPMTABLE_OD_UPDATE_MCLK)) {
-		smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_MCLKDPM_FreezeLevel);
-		if (smc_result != PPSMC_Result_OK)
-			return -EINVAL;
-	}
-
-	return 0;
-}
-
-static int ci_stop_dpm(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	PPSMC_Result smc_result;
-	int ret;
-	u32 tmp;
-
-	tmp = RREG32_SMC(ixGENERAL_PWRMGT);
-	tmp &= ~GENERAL_PWRMGT__GLOBAL_PWRMGT_EN_MASK;
-	WREG32_SMC(ixGENERAL_PWRMGT, tmp);
-
-	tmp = RREG32_SMC(ixSCLK_PWRMGT_CNTL);
-	tmp &= ~SCLK_PWRMGT_CNTL__DYNAMIC_PM_EN_MASK;
-	WREG32_SMC(ixSCLK_PWRMGT_CNTL, tmp);
-
-	if (!pi->pcie_dpm_key_disabled) {
-		smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_PCIeDPM_Disable);
-		if (smc_result != PPSMC_Result_OK)
-			return -EINVAL;
-	}
-
-	ret = ci_enable_sclk_mclk_dpm(adev, false);
-	if (ret)
-		return ret;
-
-	smc_result = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_Voltage_Cntl_Disable);
-	if (smc_result != PPSMC_Result_OK)
-		return -EINVAL;
-
-	return 0;
-}
-
-static void ci_enable_sclk_control(struct amdgpu_device *adev, bool enable)
-{
-	u32 tmp = RREG32_SMC(ixSCLK_PWRMGT_CNTL);
-
-	if (enable)
-		tmp &= ~SCLK_PWRMGT_CNTL__SCLK_PWRMGT_OFF_MASK;
-	else
-		tmp |= SCLK_PWRMGT_CNTL__SCLK_PWRMGT_OFF_MASK;
-	WREG32_SMC(ixSCLK_PWRMGT_CNTL, tmp);
-}
-
-#if 0
-static int ci_notify_hw_of_power_source(struct amdgpu_device *adev,
-					bool ac_power)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct amdgpu_cac_tdp_table *cac_tdp_table =
-		adev->pm.dpm.dyn_state.cac_tdp_table;
-	u32 power_limit;
-
-	if (ac_power)
-		power_limit = (u32)(cac_tdp_table->maximum_power_delivery_limit * 256);
-	else
-		power_limit = (u32)(cac_tdp_table->battery_power_limit * 256);
-
-	ci_set_power_limit(adev, power_limit);
-
-	if (pi->caps_automatic_dc_transition) {
-		if (ac_power)
-			amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_RunningOnAC);
-		else
-			amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_Remove_DC_Clamp);
-	}
-
-	return 0;
-}
-#endif
-
-static PPSMC_Result amdgpu_ci_send_msg_to_smc_with_parameter(struct amdgpu_device *adev,
-						      PPSMC_Msg msg, u32 parameter)
-{
-	WREG32(mmSMC_MSG_ARG_0, parameter);
-	return amdgpu_ci_send_msg_to_smc(adev, msg);
-}
-
-static PPSMC_Result amdgpu_ci_send_msg_to_smc_return_parameter(struct amdgpu_device *adev,
-							PPSMC_Msg msg, u32 *parameter)
-{
-	PPSMC_Result smc_result;
-
-	smc_result = amdgpu_ci_send_msg_to_smc(adev, msg);
-
-	if ((smc_result == PPSMC_Result_OK) && parameter)
-		*parameter = RREG32(mmSMC_MSG_ARG_0);
-
-	return smc_result;
-}
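-
-/* SMC mailbox convention, as far as this file shows it: an optional
- * argument goes into SMC_MSG_ARG_0 before the message id is posted, and a
- * reply comes back in the same register.  Typical (illustrative) use:
- *
- *	u32 freq;
- *	if (amdgpu_ci_send_msg_to_smc_return_parameter(adev,
- *			PPSMC_MSG_API_GetSclkFrequency, &freq) == PPSMC_Result_OK)
- *
- * freq then holds the SMC-reported average sclk on success.
- */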
-
-static int ci_dpm_force_state_sclk(struct amdgpu_device *adev, u32 n)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	if (!pi->sclk_dpm_key_disabled) {
-		PPSMC_Result smc_result =
-			amdgpu_ci_send_msg_to_smc_with_parameter(adev, PPSMC_MSG_SCLKDPM_SetEnabledMask, 1 << n);
-		if (smc_result != PPSMC_Result_OK)
-			return -EINVAL;
-	}
-
-	return 0;
-}
-
-static int ci_dpm_force_state_mclk(struct amdgpu_device *adev, u32 n)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	if (!pi->mclk_dpm_key_disabled) {
-		PPSMC_Result smc_result =
-			amdgpu_ci_send_msg_to_smc_with_parameter(adev, PPSMC_MSG_MCLKDPM_SetEnabledMask, 1 << n);
-		if (smc_result != PPSMC_Result_OK)
-			return -EINVAL;
-	}
-
-	return 0;
-}
-
-static int ci_dpm_force_state_pcie(struct amdgpu_device *adev, u32 n)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	if (!pi->pcie_dpm_key_disabled) {
-		PPSMC_Result smc_result =
-			amdgpu_ci_send_msg_to_smc_with_parameter(adev, PPSMC_MSG_PCIeDPM_ForceLevel, n);
-		if (smc_result != PPSMC_Result_OK)
-			return -EINVAL;
-	}
-
-	return 0;
-}
-
-static int ci_set_power_limit(struct amdgpu_device *adev, u32 n)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	if (pi->power_containment_features & POWERCONTAINMENT_FEATURE_PkgPwrLimit) {
-		PPSMC_Result smc_result =
-			amdgpu_ci_send_msg_to_smc_with_parameter(adev, PPSMC_MSG_PkgPwrSetLimit, n);
-		if (smc_result != PPSMC_Result_OK)
-			return -EINVAL;
-	}
-
-	return 0;
-}
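-
-/* The limit appears to be 8.8 fixed-point watts: callers scale a watt
- * value from the cac_tdp table by 256 before passing it here, so e.g. a
- * 208 W package limit would be sent as 208 * 256 = 53248.
- */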
-
-static int ci_set_overdrive_target_tdp(struct amdgpu_device *adev,
-				       u32 target_tdp)
-{
-	PPSMC_Result smc_result =
-		amdgpu_ci_send_msg_to_smc_with_parameter(adev, PPSMC_MSG_OverDriveSetTargetTdp, target_tdp);
-	if (smc_result != PPSMC_Result_OK)
-		return -EINVAL;
-	return 0;
-}
-
-#if 0
-static int ci_set_boot_state(struct amdgpu_device *adev)
-{
-	return ci_enable_sclk_mclk_dpm(adev, false);
-}
-#endif
-
-static u32 ci_get_average_sclk_freq(struct amdgpu_device *adev)
-{
-	u32 sclk_freq;
-	PPSMC_Result smc_result =
-		amdgpu_ci_send_msg_to_smc_return_parameter(adev,
-						    PPSMC_MSG_API_GetSclkFrequency,
-						    &sclk_freq);
-	if (smc_result != PPSMC_Result_OK)
-		sclk_freq = 0;
-
-	return sclk_freq;
-}
-
-static u32 ci_get_average_mclk_freq(struct amdgpu_device *adev)
-{
-	u32 mclk_freq;
-	PPSMC_Result smc_result =
-		amdgpu_ci_send_msg_to_smc_return_parameter(adev,
-						    PPSMC_MSG_API_GetMclkFrequency,
-						    &mclk_freq);
-	if (smc_result != PPSMC_Result_OK)
-		mclk_freq = 0;
-
-	return mclk_freq;
-}
-
-static void ci_dpm_start_smc(struct amdgpu_device *adev)
-{
-	int i;
-
-	amdgpu_ci_program_jump_on_start(adev);
-	amdgpu_ci_start_smc_clock(adev);
-	amdgpu_ci_start_smc(adev);
-	for (i = 0; i < adev->usec_timeout; i++) {
-		if (RREG32_SMC(ixFIRMWARE_FLAGS) & FIRMWARE_FLAGS__INTERRUPTS_ENABLED_MASK)
-			break;
-	}
-}
-
-static void ci_dpm_stop_smc(struct amdgpu_device *adev)
-{
-	amdgpu_ci_reset_smc(adev);
-	amdgpu_ci_stop_smc_clock(adev);
-}
-
-static int ci_process_firmware_header(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32 tmp;
-	int ret;
-
-	ret = amdgpu_ci_read_smc_sram_dword(adev,
-				     SMU7_FIRMWARE_HEADER_LOCATION +
-				     offsetof(SMU7_Firmware_Header, DpmTable),
-				     &tmp, pi->sram_end);
-	if (ret)
-		return ret;
-
-	pi->dpm_table_start = tmp;
-
-	ret = amdgpu_ci_read_smc_sram_dword(adev,
-				     SMU7_FIRMWARE_HEADER_LOCATION +
-				     offsetof(SMU7_Firmware_Header, SoftRegisters),
-				     &tmp, pi->sram_end);
-	if (ret)
-		return ret;
-
-	pi->soft_regs_start = tmp;
-
-	ret = amdgpu_ci_read_smc_sram_dword(adev,
-				     SMU7_FIRMWARE_HEADER_LOCATION +
-				     offsetof(SMU7_Firmware_Header, mcRegisterTable),
-				     &tmp, pi->sram_end);
-	if (ret)
-		return ret;
-
-	pi->mc_reg_table_start = tmp;
-
-	ret = amdgpu_ci_read_smc_sram_dword(adev,
-				     SMU7_FIRMWARE_HEADER_LOCATION +
-				     offsetof(SMU7_Firmware_Header, FanTable),
-				     &tmp, pi->sram_end);
-	if (ret)
-		return ret;
-
-	pi->fan_table_start = tmp;
-
-	ret = amdgpu_ci_read_smc_sram_dword(adev,
-				     SMU7_FIRMWARE_HEADER_LOCATION +
-				     offsetof(SMU7_Firmware_Header, mcArbDramTimingTable),
-				     &tmp, pi->sram_end);
-	if (ret)
-		return ret;
-
-	pi->arb_table_start = tmp;
-
-	return 0;
-}
-
-static void ci_read_clock_registers(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	pi->clock_registers.cg_spll_func_cntl =
-		RREG32_SMC(ixCG_SPLL_FUNC_CNTL);
-	pi->clock_registers.cg_spll_func_cntl_2 =
-		RREG32_SMC(ixCG_SPLL_FUNC_CNTL_2);
-	pi->clock_registers.cg_spll_func_cntl_3 =
-		RREG32_SMC(ixCG_SPLL_FUNC_CNTL_3);
-	pi->clock_registers.cg_spll_func_cntl_4 =
-		RREG32_SMC(ixCG_SPLL_FUNC_CNTL_4);
-	pi->clock_registers.cg_spll_spread_spectrum =
-		RREG32_SMC(ixCG_SPLL_SPREAD_SPECTRUM);
-	pi->clock_registers.cg_spll_spread_spectrum_2 =
-		RREG32_SMC(ixCG_SPLL_SPREAD_SPECTRUM_2);
-	pi->clock_registers.dll_cntl = RREG32(mmDLL_CNTL);
-	pi->clock_registers.mclk_pwrmgt_cntl = RREG32(mmMCLK_PWRMGT_CNTL);
-	pi->clock_registers.mpll_ad_func_cntl = RREG32(mmMPLL_AD_FUNC_CNTL);
-	pi->clock_registers.mpll_dq_func_cntl = RREG32(mmMPLL_DQ_FUNC_CNTL);
-	pi->clock_registers.mpll_func_cntl = RREG32(mmMPLL_FUNC_CNTL);
-	pi->clock_registers.mpll_func_cntl_1 = RREG32(mmMPLL_FUNC_CNTL_1);
-	pi->clock_registers.mpll_func_cntl_2 = RREG32(mmMPLL_FUNC_CNTL_2);
-	pi->clock_registers.mpll_ss1 = RREG32(mmMPLL_SS1);
-	pi->clock_registers.mpll_ss2 = RREG32(mmMPLL_SS2);
-}
-
-static void ci_init_sclk_t(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	pi->low_sclk_interrupt_t = 0;
-}
-
-static void ci_enable_thermal_protection(struct amdgpu_device *adev,
-					 bool enable)
-{
-	u32 tmp = RREG32_SMC(ixGENERAL_PWRMGT);
-
-	if (enable)
-		tmp &= ~GENERAL_PWRMGT__THERMAL_PROTECTION_DIS_MASK;
-	else
-		tmp |= GENERAL_PWRMGT__THERMAL_PROTECTION_DIS_MASK;
-	WREG32_SMC(ixGENERAL_PWRMGT, tmp);
-}
-
-static void ci_enable_acpi_power_management(struct amdgpu_device *adev)
-{
-	u32 tmp = RREG32_SMC(ixGENERAL_PWRMGT);
-
-	tmp |= GENERAL_PWRMGT__STATIC_PM_EN_MASK;
-
-	WREG32_SMC(ixGENERAL_PWRMGT, tmp);
-}
-
-#if 0
-static int ci_enter_ulp_state(struct amdgpu_device *adev)
-{
-	WREG32(mmSMC_MESSAGE_0, PPSMC_MSG_SwitchToMinimumPower);
-
-	udelay(25000);
-
-	return 0;
-}
-
-static int ci_exit_ulp_state(struct amdgpu_device *adev)
-{
-	int i;
-
-	WREG32(mmSMC_MESSAGE_0, PPSMC_MSG_ResumeFromMinimumPower);
-
-	udelay(7000);
-
-	for (i = 0; i < adev->usec_timeout; i++) {
-		if (RREG32(mmSMC_RESP_0) == 1)
-			break;
-		udelay(1000);
-	}
-
-	return 0;
-}
-#endif
-
-static int ci_notify_smc_display_change(struct amdgpu_device *adev,
-					bool has_display)
-{
-	PPSMC_Msg msg = has_display ? PPSMC_MSG_HasDisplay : PPSMC_MSG_NoDisplay;
-
-	return (amdgpu_ci_send_msg_to_smc(adev, msg) == PPSMC_Result_OK) ? 0 : -EINVAL;
-}
-
-static int ci_enable_ds_master_switch(struct amdgpu_device *adev,
-				      bool enable)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	if (enable) {
-		if (pi->caps_sclk_ds) {
-			if (amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_MASTER_DeepSleep_ON) != PPSMC_Result_OK)
-				return -EINVAL;
-		} else {
-			if (amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_MASTER_DeepSleep_OFF) != PPSMC_Result_OK)
-				return -EINVAL;
-		}
-	} else {
-		if (pi->caps_sclk_ds) {
-			if (amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_MASTER_DeepSleep_OFF) != PPSMC_Result_OK)
-				return -EINVAL;
-		}
-	}
-
-	return 0;
-}
-
-static void ci_program_display_gap(struct amdgpu_device *adev)
-{
-	u32 tmp = RREG32_SMC(ixCG_DISPLAY_GAP_CNTL);
-	u32 pre_vbi_time_in_us;
-	u32 frame_time_in_us;
-	u32 ref_clock = adev->clock.spll.reference_freq;
-	u32 refresh_rate = amdgpu_dpm_get_vrefresh(adev);
-	u32 vblank_time = amdgpu_dpm_get_vblank_time(adev);
-
-	tmp &= ~CG_DISPLAY_GAP_CNTL__DISP_GAP_MASK;
-	if (adev->pm.dpm.new_active_crtc_count > 0)
-		tmp |= (AMDGPU_PM_DISPLAY_GAP_VBLANK_OR_WM << CG_DISPLAY_GAP_CNTL__DISP_GAP__SHIFT);
-	else
-		tmp |= (AMDGPU_PM_DISPLAY_GAP_IGNORE << CG_DISPLAY_GAP_CNTL__DISP_GAP__SHIFT);
-	WREG32_SMC(ixCG_DISPLAY_GAP_CNTL, tmp);
-
-	if (refresh_rate == 0)
-		refresh_rate = 60;
-	if (vblank_time == 0xffffffff)
-		vblank_time = 500;
-	frame_time_in_us = 1000000 / refresh_rate;
-	pre_vbi_time_in_us =
-		frame_time_in_us - 200 - vblank_time;
-	tmp = pre_vbi_time_in_us * (ref_clock / 100);
-
-	WREG32_SMC(ixCG_DISPLAY_GAP_CNTL2, tmp);
-	ci_write_smc_soft_register(adev, offsetof(SMU7_SoftRegisters, PreVBlankGap), 0x64);
-	ci_write_smc_soft_register(adev, offsetof(SMU7_SoftRegisters, VBlankTimeout), (frame_time_in_us - pre_vbi_time_in_us));
-
-	ci_notify_smc_display_change(adev, (adev->pm.dpm.new_active_crtc_count == 1));
-}
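-
-/* Timing example for the gap programming above (illustrative): at 60 Hz
- * with a 500 us vblank, frame_time_in_us = 1000000 / 60 = 16666 and
- * pre_vbi_time_in_us = 16666 - 200 - 500 = 15966.  CG_DISPLAY_GAP_CNTL2
- * gets that interval in reference-clock ticks (ref_clock / 100 converts
- * the 10 kHz-unit reference_freq into ticks per microsecond), and the
- * VBlankTimeout soft register gets the remaining 700 us.
- */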
-
-static void ci_enable_spread_spectrum(struct amdgpu_device *adev, bool enable)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32 tmp;
-
-	if (enable) {
-		if (pi->caps_sclk_ss_support) {
-			tmp = RREG32_SMC(ixGENERAL_PWRMGT);
-			tmp |= GENERAL_PWRMGT__DYN_SPREAD_SPECTRUM_EN_MASK;
-			WREG32_SMC(ixGENERAL_PWRMGT, tmp);
-		}
-	} else {
-		tmp = RREG32_SMC(ixCG_SPLL_SPREAD_SPECTRUM);
-		tmp &= ~CG_SPLL_SPREAD_SPECTRUM__SSEN_MASK;
-		WREG32_SMC(ixCG_SPLL_SPREAD_SPECTRUM, tmp);
-
-		tmp = RREG32_SMC(ixGENERAL_PWRMGT);
-		tmp &= ~GENERAL_PWRMGT__DYN_SPREAD_SPECTRUM_EN_MASK;
-		WREG32_SMC(ixGENERAL_PWRMGT, tmp);
-	}
-}
-
-static void ci_program_sstp(struct amdgpu_device *adev)
-{
-	WREG32_SMC(ixCG_STATIC_SCREEN_PARAMETER,
-		   ((CISLANDS_SSTU_DFLT << CG_STATIC_SCREEN_PARAMETER__STATIC_SCREEN_THRESHOLD_UNIT__SHIFT) |
-		    (CISLANDS_SST_DFLT << CG_STATIC_SCREEN_PARAMETER__STATIC_SCREEN_THRESHOLD__SHIFT)));
-}
-
-static void ci_enable_display_gap(struct amdgpu_device *adev)
-{
-	u32 tmp = RREG32_SMC(ixCG_DISPLAY_GAP_CNTL);
-
-	tmp &= ~(CG_DISPLAY_GAP_CNTL__DISP_GAP_MASK |
-			CG_DISPLAY_GAP_CNTL__DISP_GAP_MCHG_MASK);
-	tmp |= ((AMDGPU_PM_DISPLAY_GAP_IGNORE << CG_DISPLAY_GAP_CNTL__DISP_GAP__SHIFT) |
-		(AMDGPU_PM_DISPLAY_GAP_VBLANK << CG_DISPLAY_GAP_CNTL__DISP_GAP_MCHG__SHIFT));
-
-	WREG32_SMC(ixCG_DISPLAY_GAP_CNTL, tmp);
-}
-
-static void ci_program_vc(struct amdgpu_device *adev)
-{
-	u32 tmp;
-
-	tmp = RREG32_SMC(ixSCLK_PWRMGT_CNTL);
-	tmp &= ~(SCLK_PWRMGT_CNTL__RESET_SCLK_CNT_MASK | SCLK_PWRMGT_CNTL__RESET_BUSY_CNT_MASK);
-	WREG32_SMC(ixSCLK_PWRMGT_CNTL, tmp);
-
-	WREG32_SMC(ixCG_FREQ_TRAN_VOTING_0, CISLANDS_VRC_DFLT0);
-	WREG32_SMC(ixCG_FREQ_TRAN_VOTING_1, CISLANDS_VRC_DFLT1);
-	WREG32_SMC(ixCG_FREQ_TRAN_VOTING_2, CISLANDS_VRC_DFLT2);
-	WREG32_SMC(ixCG_FREQ_TRAN_VOTING_3, CISLANDS_VRC_DFLT3);
-	WREG32_SMC(ixCG_FREQ_TRAN_VOTING_4, CISLANDS_VRC_DFLT4);
-	WREG32_SMC(ixCG_FREQ_TRAN_VOTING_5, CISLANDS_VRC_DFLT5);
-	WREG32_SMC(ixCG_FREQ_TRAN_VOTING_6, CISLANDS_VRC_DFLT6);
-	WREG32_SMC(ixCG_FREQ_TRAN_VOTING_7, CISLANDS_VRC_DFLT7);
-}
-
-static void ci_clear_vc(struct amdgpu_device *adev)
-{
-	u32 tmp;
-
-	tmp = RREG32_SMC(ixSCLK_PWRMGT_CNTL);
-	tmp |= (SCLK_PWRMGT_CNTL__RESET_SCLK_CNT_MASK | SCLK_PWRMGT_CNTL__RESET_BUSY_CNT_MASK);
-	WREG32_SMC(ixSCLK_PWRMGT_CNTL, tmp);
-
-	WREG32_SMC(ixCG_FREQ_TRAN_VOTING_0, 0);
-	WREG32_SMC(ixCG_FREQ_TRAN_VOTING_1, 0);
-	WREG32_SMC(ixCG_FREQ_TRAN_VOTING_2, 0);
-	WREG32_SMC(ixCG_FREQ_TRAN_VOTING_3, 0);
-	WREG32_SMC(ixCG_FREQ_TRAN_VOTING_4, 0);
-	WREG32_SMC(ixCG_FREQ_TRAN_VOTING_5, 0);
-	WREG32_SMC(ixCG_FREQ_TRAN_VOTING_6, 0);
-	WREG32_SMC(ixCG_FREQ_TRAN_VOTING_7, 0);
-}
-
-static int ci_upload_firmware(struct amdgpu_device *adev)
-{
-	int i, ret;
-
-	if (amdgpu_ci_is_smc_running(adev)) {
-		DRM_INFO("smc is running, no need to load smc firmware\n");
-		return 0;
-	}
-
-	for (i = 0; i < adev->usec_timeout; i++) {
-		if (RREG32_SMC(ixRCU_UC_EVENTS) & RCU_UC_EVENTS__boot_seq_done_MASK)
-			break;
-	}
-	WREG32_SMC(ixSMC_SYSCON_MISC_CNTL, 1);
-
-	amdgpu_ci_stop_smc_clock(adev);
-	amdgpu_ci_reset_smc(adev);
-
-	ret = amdgpu_ci_load_smc_ucode(adev, SMC_RAM_END);
-
-	return ret;
-}
-
-static int ci_get_svi2_voltage_table(struct amdgpu_device *adev,
-				     struct amdgpu_clock_voltage_dependency_table *voltage_dependency_table,
-				     struct atom_voltage_table *voltage_table)
-{
-	u32 i;
-
-	if (voltage_dependency_table == NULL)
-		return -EINVAL;
-
-	voltage_table->mask_low = 0;
-	voltage_table->phase_delay = 0;
-
-	voltage_table->count = voltage_dependency_table->count;
-	for (i = 0; i < voltage_table->count; i++) {
-		voltage_table->entries[i].value = voltage_dependency_table->entries[i].v;
-		voltage_table->entries[i].smio_low = 0;
-	}
-
-	return 0;
-}
-
-static int ci_construct_voltage_tables(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	int ret;
-
-	if (pi->voltage_control == CISLANDS_VOLTAGE_CONTROL_BY_GPIO) {
-		ret = amdgpu_atombios_get_voltage_table(adev, VOLTAGE_TYPE_VDDC,
-							VOLTAGE_OBJ_GPIO_LUT,
-							&pi->vddc_voltage_table);
-		if (ret)
-			return ret;
-	} else if (pi->voltage_control == CISLANDS_VOLTAGE_CONTROL_BY_SVID2) {
-		ret = ci_get_svi2_voltage_table(adev,
-						&adev->pm.dpm.dyn_state.vddc_dependency_on_mclk,
-						&pi->vddc_voltage_table);
-		if (ret)
-			return ret;
-	}
-
-	if (pi->vddc_voltage_table.count > SMU7_MAX_LEVELS_VDDC)
-		ci_trim_voltage_table_to_fit_state_table(adev, SMU7_MAX_LEVELS_VDDC,
-							 &pi->vddc_voltage_table);
-
-	if (pi->vddci_control == CISLANDS_VOLTAGE_CONTROL_BY_GPIO) {
-		ret = amdgpu_atombios_get_voltage_table(adev, VOLTAGE_TYPE_VDDCI,
-							VOLTAGE_OBJ_GPIO_LUT,
-							&pi->vddci_voltage_table);
-		if (ret)
-			return ret;
-	} else if (pi->vddci_control == CISLANDS_VOLTAGE_CONTROL_BY_SVID2) {
-		ret = ci_get_svi2_voltage_table(adev,
-						&adev->pm.dpm.dyn_state.vddci_dependency_on_mclk,
-						&pi->vddci_voltage_table);
-		if (ret)
-			return ret;
-	}
-
-	if (pi->vddci_voltage_table.count > SMU7_MAX_LEVELS_VDDCI)
-		ci_trim_voltage_table_to_fit_state_table(adev, SMU7_MAX_LEVELS_VDDCI,
-							 &pi->vddci_voltage_table);
-
-	if (pi->mvdd_control == CISLANDS_VOLTAGE_CONTROL_BY_GPIO) {
-		ret = amdgpu_atombios_get_voltage_table(adev, VOLTAGE_TYPE_MVDDC,
-							VOLTAGE_OBJ_GPIO_LUT,
-							&pi->mvdd_voltage_table);
-		if (ret)
-			return ret;
-	} else if (pi->mvdd_control == CISLANDS_VOLTAGE_CONTROL_BY_SVID2) {
-		ret = ci_get_svi2_voltage_table(adev,
-						&adev->pm.dpm.dyn_state.mvdd_dependency_on_mclk,
-						&pi->mvdd_voltage_table);
-		if (ret)
-			return ret;
-	}
-
-	if (pi->mvdd_voltage_table.count > SMU7_MAX_LEVELS_MVDD)
-		ci_trim_voltage_table_to_fit_state_table(adev, SMU7_MAX_LEVELS_MVDD,
-							 &pi->mvdd_voltage_table);
-
-	return 0;
-}
-
-static void ci_populate_smc_voltage_table(struct amdgpu_device *adev,
-					  struct atom_voltage_table_entry *voltage_table,
-					  SMU7_Discrete_VoltageLevel *smc_voltage_table)
-{
-	int ret;
-
-	ret = ci_get_std_voltage_value_sidd(adev, voltage_table,
-					    &smc_voltage_table->StdVoltageHiSidd,
-					    &smc_voltage_table->StdVoltageLoSidd);
-
-	if (ret) {
-		smc_voltage_table->StdVoltageHiSidd = voltage_table->value * VOLTAGE_SCALE;
-		smc_voltage_table->StdVoltageLoSidd = voltage_table->value * VOLTAGE_SCALE;
-	}
-
-	smc_voltage_table->Voltage = cpu_to_be16(voltage_table->value * VOLTAGE_SCALE);
-	smc_voltage_table->StdVoltageHiSidd =
-		cpu_to_be16(smc_voltage_table->StdVoltageHiSidd);
-	smc_voltage_table->StdVoltageLoSidd =
-		cpu_to_be16(smc_voltage_table->StdVoltageLoSidd);
-}
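-
-/* Everything handed to the SMC here and below goes through cpu_to_be16/32:
- * the SMU7 firmware evidently expects its tables in big-endian layout
- * regardless of host byte order.
- */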
-
-static int ci_populate_smc_vddc_table(struct amdgpu_device *adev,
-				      SMU7_Discrete_DpmTable *table)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	unsigned int count;
-
-	table->VddcLevelCount = pi->vddc_voltage_table.count;
-	for (count = 0; count < table->VddcLevelCount; count++) {
-		ci_populate_smc_voltage_table(adev,
-					      &pi->vddc_voltage_table.entries[count],
-					      &table->VddcLevel[count]);
-
-		if (pi->voltage_control == CISLANDS_VOLTAGE_CONTROL_BY_GPIO)
-			table->VddcLevel[count].Smio |=
-				pi->vddc_voltage_table.entries[count].smio_low;
-		else
-			table->VddcLevel[count].Smio = 0;
-	}
-	table->VddcLevelCount = cpu_to_be32(table->VddcLevelCount);
-
-	return 0;
-}
-
-static int ci_populate_smc_vddci_table(struct amdgpu_device *adev,
-				       SMU7_Discrete_DpmTable *table)
-{
-	unsigned int count;
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	table->VddciLevelCount = pi->vddci_voltage_table.count;
-	for (count = 0; count < table->VddciLevelCount; count++) {
-		ci_populate_smc_voltage_table(adev,
-					      &pi->vddci_voltage_table.entries[count],
-					      &table->VddciLevel[count]);
-
-		if (pi->vddci_control == CISLANDS_VOLTAGE_CONTROL_BY_GPIO)
-			table->VddciLevel[count].Smio |=
-				pi->vddci_voltage_table.entries[count].smio_low;
-		else
-			table->VddciLevel[count].Smio = 0;
-	}
-	table->VddciLevelCount = cpu_to_be32(table->VddciLevelCount);
-
-	return 0;
-}
-
-static int ci_populate_smc_mvdd_table(struct amdgpu_device *adev,
-				      SMU7_Discrete_DpmTable *table)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	unsigned int count;
-
-	table->MvddLevelCount = pi->mvdd_voltage_table.count;
-	for (count = 0; count < table->MvddLevelCount; count++) {
-		ci_populate_smc_voltage_table(adev,
-					      &pi->mvdd_voltage_table.entries[count],
-					      &table->MvddLevel[count]);
-
-		if (pi->mvdd_control == CISLANDS_VOLTAGE_CONTROL_BY_GPIO)
-			table->MvddLevel[count].Smio |=
-				pi->mvdd_voltage_table.entries[count].smio_low;
-		else
-			table->MvddLevel[count].Smio = 0;
-	}
-	table->MvddLevelCount = cpu_to_be32(table->MvddLevelCount);
-
-	return 0;
-}
-
-static int ci_populate_smc_voltage_tables(struct amdgpu_device *adev,
-					  SMU7_Discrete_DpmTable *table)
-{
-	int ret;
-
-	ret = ci_populate_smc_vddc_table(adev, table);
-	if (ret)
-		return ret;
-
-	ret = ci_populate_smc_vddci_table(adev, table);
-	if (ret)
-		return ret;
-
-	ret = ci_populate_smc_mvdd_table(adev, table);
-	if (ret)
-		return ret;
-
-	return 0;
-}
-
-static int ci_populate_mvdd_value(struct amdgpu_device *adev, u32 mclk,
-				  SMU7_Discrete_VoltageLevel *voltage)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32 i;
-
-	if (pi->mvdd_control == CISLANDS_VOLTAGE_CONTROL_NONE)
-		return -EINVAL;
-
-	for (i = 0; i < adev->pm.dpm.dyn_state.mvdd_dependency_on_mclk.count; i++) {
-		if (mclk <= adev->pm.dpm.dyn_state.mvdd_dependency_on_mclk.entries[i].clk) {
-			voltage->Voltage = pi->mvdd_voltage_table.entries[i].value;
-			return 0;
-		}
-	}
-
-	/* no dependency entry covers this mclk */
-	return -EINVAL;
-}
-
-static int ci_get_std_voltage_value_sidd(struct amdgpu_device *adev,
-					 struct atom_voltage_table_entry *voltage_table,
-					 u16 *std_voltage_hi_sidd, u16 *std_voltage_lo_sidd)
-{
-	u16 v_index, idx;
-	bool voltage_found = false;
-	*std_voltage_hi_sidd = voltage_table->value * VOLTAGE_SCALE;
-	*std_voltage_lo_sidd = voltage_table->value * VOLTAGE_SCALE;
-
-	if (adev->pm.dpm.dyn_state.vddc_dependency_on_sclk.entries == NULL)
-		return -EINVAL;
-
-	if (adev->pm.dpm.dyn_state.cac_leakage_table.entries) {
-		for (v_index = 0; (u32)v_index < adev->pm.dpm.dyn_state.vddc_dependency_on_sclk.count; v_index++) {
-			if (voltage_table->value ==
-			    adev->pm.dpm.dyn_state.vddc_dependency_on_sclk.entries[v_index].v) {
-				voltage_found = true;
-				if ((u32)v_index < adev->pm.dpm.dyn_state.cac_leakage_table.count)
-					idx = v_index;
-				else
-					idx = adev->pm.dpm.dyn_state.cac_leakage_table.count - 1;
-				*std_voltage_lo_sidd =
-					adev->pm.dpm.dyn_state.cac_leakage_table.entries[idx].vddc * VOLTAGE_SCALE;
-				*std_voltage_hi_sidd =
-					adev->pm.dpm.dyn_state.cac_leakage_table.entries[idx].leakage * VOLTAGE_SCALE;
-				break;
-			}
-		}
-
-		if (!voltage_found) {
-			for (v_index = 0; (u32)v_index < adev->pm.dpm.dyn_state.vddc_dependency_on_sclk.count; v_index++) {
-				if (voltage_table->value <=
-				    adev->pm.dpm.dyn_state.vddc_dependency_on_sclk.entries[v_index].v) {
-					voltage_found = true;
-					if ((u32)v_index < adev->pm.dpm.dyn_state.cac_leakage_table.count)
-						idx = v_index;
-					else
-						idx = adev->pm.dpm.dyn_state.cac_leakage_table.count - 1;
-					*std_voltage_lo_sidd =
-						adev->pm.dpm.dyn_state.cac_leakage_table.entries[idx].vddc * VOLTAGE_SCALE;
-					*std_voltage_hi_sidd =
-						adev->pm.dpm.dyn_state.cac_leakage_table.entries[idx].leakage * VOLTAGE_SCALE;
-					break;
-				}
-			}
-		}
-	}
-
-	return 0;
-}
-
-static void ci_populate_phase_value_based_on_sclk(struct amdgpu_device *adev,
-						  const struct amdgpu_phase_shedding_limits_table *limits,
-						  u32 sclk,
-						  u32 *phase_shedding)
-{
-	unsigned int i;
-
-	*phase_shedding = 1;
-
-	for (i = 0; i < limits->count; i++) {
-		if (sclk < limits->entries[i].sclk) {
-			*phase_shedding = i;
-			break;
-		}
-	}
-}
-
-static void ci_populate_phase_value_based_on_mclk(struct amdgpu_device *adev,
-						  const struct amdgpu_phase_shedding_limits_table *limits,
-						  u32 mclk,
-						  u32 *phase_shedding)
-{
-	unsigned int i;
-
-	*phase_shedding = 1;
-
-	for (i = 0; i < limits->count; i++) {
-		if (mclk < limits->entries[i].mclk) {
-			*phase_shedding = i;
-			break;
-		}
-	}
-}
-
-static int ci_init_arb_table_index(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32 tmp;
-	int ret;
-
-	ret = amdgpu_ci_read_smc_sram_dword(adev, pi->arb_table_start,
-				     &tmp, pi->sram_end);
-	if (ret)
-		return ret;
-
-	tmp &= 0x00FFFFFF;
-	tmp |= MC_CG_ARB_FREQ_F1 << 24;
-
-	return amdgpu_ci_write_smc_sram_dword(adev, pi->arb_table_start,
-				       tmp, pi->sram_end);
-}
-
-static int ci_get_dependency_volt_by_clk(struct amdgpu_device *adev,
-					 struct amdgpu_clock_voltage_dependency_table *allowed_clock_voltage_table,
-					 u32 clock, u32 *voltage)
-{
-	u32 i = 0;
-
-	if (allowed_clock_voltage_table->count == 0)
-		return -EINVAL;
-
-	for (i = 0; i < allowed_clock_voltage_table->count; i++) {
-		if (allowed_clock_voltage_table->entries[i].clk >= clock) {
-			*voltage = allowed_clock_voltage_table->entries[i].v;
-			return 0;
-		}
-	}
-
-	*voltage = allowed_clock_voltage_table->entries[i-1].v;
-
-	return 0;
-}
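-
-/* Lookup rule above, by example (illustrative table): for entries
- * { 30000 -> 900, 60000 -> 1000 }, a request of 45000 selects the first
- * entry whose clock is >= the request and yields 1000; any clock above
- * 60000 falls through to the last entry's voltage.
- */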
-
-static u8 ci_get_sleep_divider_id_from_clock(u32 sclk, u32 min_sclk_in_sr)
-{
-	u32 i;
-	u32 tmp;
-	u32 min = max(min_sclk_in_sr, (u32)CISLAND_MINIMUM_ENGINE_CLOCK);
-
-	if (sclk < min)
-		return 0;
-
-	for (i = CISLAND_MAX_DEEPSLEEP_DIVIDER_ID; ; i--) {
-		tmp = sclk >> i;
-		if (tmp >= min || i == 0)
-			break;
-	}
-
-	return (u8)i;
-}
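-
-/* Example (illustrative): with a floor of min = 10000 and sclk = 80000
- * (same units), the loop finds the largest divider id whose right shift
- * keeps the clock at or above the floor: 80000 >> 3 = 10000, so the
- * function returns 3.
- */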
-
-static int ci_initial_switch_from_arb_f0_to_f1(struct amdgpu_device *adev)
-{
-	return ci_copy_and_switch_arb_sets(adev, MC_CG_ARB_FREQ_F0, MC_CG_ARB_FREQ_F1);
-}
-
-static int ci_reset_to_default(struct amdgpu_device *adev)
-{
-	return (amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_ResetToDefaults) == PPSMC_Result_OK) ?
-		0 : -EINVAL;
-}
-
-static int ci_force_switch_to_arb_f0(struct amdgpu_device *adev)
-{
-	u32 tmp;
-
-	tmp = (RREG32_SMC(ixSMC_SCRATCH9) & 0x0000ff00) >> 8;
-
-	if (tmp == MC_CG_ARB_FREQ_F0)
-		return 0;
-
-	return ci_copy_and_switch_arb_sets(adev, tmp, MC_CG_ARB_FREQ_F0);
-}
-
-static void ci_register_patching_mc_arb(struct amdgpu_device *adev,
-					const u32 engine_clock,
-					const u32 memory_clock,
-					u32 *dram_timing2)
-{
-	bool patch;
-	u32 tmp, tmp2;
-
-	tmp = RREG32(mmMC_SEQ_MISC0);
-	patch = (tmp & 0x00000f00) == 0x300;
-
-	if (patch &&
-	    ((adev->pdev->device == 0x67B0) ||
-	     (adev->pdev->device == 0x67B1))) {
-		if ((memory_clock > 100000) && (memory_clock <= 125000)) {
-			tmp2 = (((0x31 * engine_clock) / 125000) - 1) & 0xff;
-			*dram_timing2 &= ~0x00ff0000;
-			*dram_timing2 |= tmp2 << 16;
-		} else if ((memory_clock > 125000) && (memory_clock <= 137500)) {
-			tmp2 = (((0x36 * engine_clock) / 137500) - 1) & 0xff;
-			*dram_timing2 &= ~0x00ff0000;
-			*dram_timing2 |= tmp2 << 16;
-		}
-	}
-}
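-
-/* The 0x67B0/0x67B1 device ids above look like Hawaii boards.  The patch
- * rewrites bits 23:16 of DRAM_TIMING2; e.g. engine_clock = 100000 with
- * memory_clock = 120000 lands in the first window and gives
- * tmp2 = ((0x31 * 100000) / 125000 - 1) & 0xff = 38 (0x26).
- */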
-
-static int ci_populate_memory_timing_parameters(struct amdgpu_device *adev,
-						u32 sclk,
-						u32 mclk,
-						SMU7_Discrete_MCArbDramTimingTableEntry *arb_regs)
-{
-	u32 dram_timing;
-	u32 dram_timing2;
-	u32 burst_time;
-
-	amdgpu_atombios_set_engine_dram_timings(adev, sclk, mclk);
-
-	dram_timing  = RREG32(mmMC_ARB_DRAM_TIMING);
-	dram_timing2 = RREG32(mmMC_ARB_DRAM_TIMING2);
-	burst_time = RREG32(mmMC_ARB_BURST_TIME) & MC_ARB_BURST_TIME__STATE0_MASK;
-
-	ci_register_patching_mc_arb(adev, sclk, mclk, &dram_timing2);
-
-	arb_regs->McArbDramTiming  = cpu_to_be32(dram_timing);
-	arb_regs->McArbDramTiming2 = cpu_to_be32(dram_timing2);
-	arb_regs->McArbBurstTime = (u8)burst_time;
-
-	return 0;
-}
-
-static int ci_do_program_memory_timing_parameters(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	SMU7_Discrete_MCArbDramTimingTable arb_regs;
-	u32 i, j;
-	int ret = 0;
-
-	memset(&arb_regs, 0, sizeof(SMU7_Discrete_MCArbDramTimingTable));
-
-	for (i = 0; i < pi->dpm_table.sclk_table.count; i++) {
-		for (j = 0; j < pi->dpm_table.mclk_table.count; j++) {
-			ret = ci_populate_memory_timing_parameters(adev,
-								   pi->dpm_table.sclk_table.dpm_levels[i].value,
-								   pi->dpm_table.mclk_table.dpm_levels[j].value,
-								   &arb_regs.entries[i][j]);
-			if (ret)
-				break;
-		}
-	}
-
-	if (ret == 0)
-		ret = amdgpu_ci_copy_bytes_to_smc(adev,
-					   pi->arb_table_start,
-					   (u8 *)&arb_regs,
-					   sizeof(SMU7_Discrete_MCArbDramTimingTable),
-					   pi->sram_end);
-
-	return ret;
-}
-
-static int ci_program_memory_timing_parameters(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	if (pi->need_update_smu7_dpm_table == 0)
-		return 0;
-
-	return ci_do_program_memory_timing_parameters(adev);
-}
-
-static void ci_populate_smc_initial_state(struct amdgpu_device *adev,
-					  struct amdgpu_ps *amdgpu_boot_state)
-{
-	struct ci_ps *boot_state = ci_get_ps(amdgpu_boot_state);
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32 level = 0;
-
-	for (level = 0; level < adev->pm.dpm.dyn_state.vddc_dependency_on_sclk.count; level++) {
-		if (adev->pm.dpm.dyn_state.vddc_dependency_on_sclk.entries[level].clk >=
-		    boot_state->performance_levels[0].sclk) {
-			pi->smc_state_table.GraphicsBootLevel = level;
-			break;
-		}
-	}
-
-	for (level = 0; level < adev->pm.dpm.dyn_state.vddc_dependency_on_mclk.count; level++) {
-		if (adev->pm.dpm.dyn_state.vddc_dependency_on_mclk.entries[level].clk >=
-		    boot_state->performance_levels[0].mclk) {
-			pi->smc_state_table.MemoryBootLevel = level;
-			break;
-		}
-	}
-}
-
-static u32 ci_get_dpm_level_enable_mask_value(struct ci_single_dpm_table *dpm_table)
-{
-	u32 i;
-	u32 mask_value = 0;
-
-	for (i = dpm_table->count; i > 0; i--) {
-		mask_value <<= 1;
-		if (dpm_table->dpm_levels[i-1].enabled)
-			mask_value |= 0x1;
-	}
-
-	return mask_value;
-}
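-
-/* Example (illustrative): a four-level table with levels 0, 1 and 3
- * enabled and level 2 disabled yields 0b1011 (0xb) -- the walk from the
- * top level down leaves level 0 in bit 0.
- */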
-
-static void ci_populate_smc_link_level(struct amdgpu_device *adev,
-				       SMU7_Discrete_DpmTable *table)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_dpm_table *dpm_table = &pi->dpm_table;
-	u32 i;
-
-	for (i = 0; i < dpm_table->pcie_speed_table.count; i++) {
-		table->LinkLevel[i].PcieGenSpeed =
-			(u8)dpm_table->pcie_speed_table.dpm_levels[i].value;
-		table->LinkLevel[i].PcieLaneCount =
-			amdgpu_encode_pci_lane_width(dpm_table->pcie_speed_table.dpm_levels[i].param1);
-		table->LinkLevel[i].EnabledForActivity = 1;
-		table->LinkLevel[i].DownT = cpu_to_be32(5);
-		table->LinkLevel[i].UpT = cpu_to_be32(30);
-	}
-
-	pi->smc_state_table.LinkLevelCount = (u8)dpm_table->pcie_speed_table.count;
-	pi->dpm_level_enable_mask.pcie_dpm_enable_mask =
-		ci_get_dpm_level_enable_mask_value(&dpm_table->pcie_speed_table);
-}
-
-static int ci_populate_smc_uvd_level(struct amdgpu_device *adev,
-				     SMU7_Discrete_DpmTable *table)
-{
-	u32 count;
-	struct atom_clock_dividers dividers;
-	int ret = -EINVAL;
-
-	table->UvdLevelCount =
-		adev->pm.dpm.dyn_state.uvd_clock_voltage_dependency_table.count;
-
-	for (count = 0; count < table->UvdLevelCount; count++) {
-		table->UvdLevel[count].VclkFrequency =
-			adev->pm.dpm.dyn_state.uvd_clock_voltage_dependency_table.entries[count].vclk;
-		table->UvdLevel[count].DclkFrequency =
-			adev->pm.dpm.dyn_state.uvd_clock_voltage_dependency_table.entries[count].dclk;
-		table->UvdLevel[count].MinVddc =
-			adev->pm.dpm.dyn_state.uvd_clock_voltage_dependency_table.entries[count].v * VOLTAGE_SCALE;
-		table->UvdLevel[count].MinVddcPhases = 1;
-
-		ret = amdgpu_atombios_get_clock_dividers(adev,
-							 COMPUTE_GPUCLK_INPUT_FLAG_DEFAULT_GPUCLK,
-							 table->UvdLevel[count].VclkFrequency, false, &dividers);
-		if (ret)
-			return ret;
-
-		table->UvdLevel[count].VclkDivider = (u8)dividers.post_divider;
-
-		ret = amdgpu_atombios_get_clock_dividers(adev,
-							 COMPUTE_GPUCLK_INPUT_FLAG_DEFAULT_GPUCLK,
-							 table->UvdLevel[count].DclkFrequency, false, &dividers);
-		if (ret)
-			return ret;
-
-		table->UvdLevel[count].DclkDivider = (u8)dividers.post_divider;
-
-		table->UvdLevel[count].VclkFrequency = cpu_to_be32(table->UvdLevel[count].VclkFrequency);
-		table->UvdLevel[count].DclkFrequency = cpu_to_be32(table->UvdLevel[count].DclkFrequency);
-		table->UvdLevel[count].MinVddc = cpu_to_be16(table->UvdLevel[count].MinVddc);
-	}
-
-	return ret;
-}
-
-static int ci_populate_smc_vce_level(struct amdgpu_device *adev,
-				     SMU7_Discrete_DpmTable *table)
-{
-	u32 count;
-	struct atom_clock_dividers dividers;
-	int ret = -EINVAL;
-
-	table->VceLevelCount =
-		adev->pm.dpm.dyn_state.vce_clock_voltage_dependency_table.count;
-
-	for (count = 0; count < table->VceLevelCount; count++) {
-		table->VceLevel[count].Frequency =
-			adev->pm.dpm.dyn_state.vce_clock_voltage_dependency_table.entries[count].evclk;
-		table->VceLevel[count].MinVoltage =
-			(u16)adev->pm.dpm.dyn_state.vce_clock_voltage_dependency_table.entries[count].v * VOLTAGE_SCALE;
-		table->VceLevel[count].MinPhases = 1;
-
-		ret = amdgpu_atombios_get_clock_dividers(adev,
-							 COMPUTE_GPUCLK_INPUT_FLAG_DEFAULT_GPUCLK,
-							 table->VceLevel[count].Frequency, false, &dividers);
-		if (ret)
-			return ret;
-
-		table->VceLevel[count].Divider = (u8)dividers.post_divider;
-
-		table->VceLevel[count].Frequency = cpu_to_be32(table->VceLevel[count].Frequency);
-		table->VceLevel[count].MinVoltage = cpu_to_be16(table->VceLevel[count].MinVoltage);
-	}
-
-	return ret;
-}
-
-static int ci_populate_smc_acp_level(struct amdgpu_device *adev,
-				     SMU7_Discrete_DpmTable *table)
-{
-	u32 count;
-	struct atom_clock_dividers dividers;
-	int ret = -EINVAL;
-
-	table->AcpLevelCount = (u8)
-		(adev->pm.dpm.dyn_state.acp_clock_voltage_dependency_table.count);
-
-	for (count = 0; count < table->AcpLevelCount; count++) {
-		table->AcpLevel[count].Frequency =
-			adev->pm.dpm.dyn_state.acp_clock_voltage_dependency_table.entries[count].clk;
-		table->AcpLevel[count].MinVoltage =
-			adev->pm.dpm.dyn_state.acp_clock_voltage_dependency_table.entries[count].v;
-		table->AcpLevel[count].MinPhases = 1;
-
-		ret = amdgpu_atombios_get_clock_dividers(adev,
-							 COMPUTE_GPUCLK_INPUT_FLAG_DEFAULT_GPUCLK,
-							 table->AcpLevel[count].Frequency, false, &dividers);
-		if (ret)
-			return ret;
-
-		table->AcpLevel[count].Divider = (u8)dividers.post_divider;
-
-		table->AcpLevel[count].Frequency = cpu_to_be32(table->AcpLevel[count].Frequency);
-		table->AcpLevel[count].MinVoltage = cpu_to_be16(table->AcpLevel[count].MinVoltage);
-	}
-
-	return ret;
-}
-
-static int ci_populate_smc_samu_level(struct amdgpu_device *adev,
-				      SMU7_Discrete_DpmTable *table)
-{
-	u32 count;
-	struct atom_clock_dividers dividers;
-	int ret = -EINVAL;
-
-	table->SamuLevelCount =
-		adev->pm.dpm.dyn_state.samu_clock_voltage_dependency_table.count;
-
-	for (count = 0; count < table->SamuLevelCount; count++) {
-		table->SamuLevel[count].Frequency =
-			adev->pm.dpm.dyn_state.samu_clock_voltage_dependency_table.entries[count].clk;
-		table->SamuLevel[count].MinVoltage =
-			adev->pm.dpm.dyn_state.samu_clock_voltage_dependency_table.entries[count].v * VOLTAGE_SCALE;
-		table->SamuLevel[count].MinPhases = 1;
-
-		ret = amdgpu_atombios_get_clock_dividers(adev,
-							 COMPUTE_GPUCLK_INPUT_FLAG_DEFAULT_GPUCLK,
-							 table->SamuLevel[count].Frequency, false, &dividers);
-		if (ret)
-			return ret;
-
-		table->SamuLevel[count].Divider = (u8)dividers.post_divider;
-
-		table->SamuLevel[count].Frequency = cpu_to_be32(table->SamuLevel[count].Frequency);
-		table->SamuLevel[count].MinVoltage = cpu_to_be16(table->SamuLevel[count].MinVoltage);
-	}
-
-	return ret;
-}
-
-static int ci_calculate_mclk_params(struct amdgpu_device *adev,
-				    u32 memory_clock,
-				    SMU7_Discrete_MemoryLevel *mclk,
-				    bool strobe_mode,
-				    bool dll_state_on)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32  dll_cntl = pi->clock_registers.dll_cntl;
-	u32  mclk_pwrmgt_cntl = pi->clock_registers.mclk_pwrmgt_cntl;
-	u32  mpll_ad_func_cntl = pi->clock_registers.mpll_ad_func_cntl;
-	u32  mpll_dq_func_cntl = pi->clock_registers.mpll_dq_func_cntl;
-	u32  mpll_func_cntl = pi->clock_registers.mpll_func_cntl;
-	u32  mpll_func_cntl_1 = pi->clock_registers.mpll_func_cntl_1;
-	u32  mpll_func_cntl_2 = pi->clock_registers.mpll_func_cntl_2;
-	u32  mpll_ss1 = pi->clock_registers.mpll_ss1;
-	u32  mpll_ss2 = pi->clock_registers.mpll_ss2;
-	struct atom_mpll_param mpll_param;
-	int ret;
-
-	ret = amdgpu_atombios_get_memory_pll_dividers(adev, memory_clock, strobe_mode, &mpll_param);
-	if (ret)
-		return ret;
-
-	mpll_func_cntl &= ~MPLL_FUNC_CNTL__BWCTRL_MASK;
-	mpll_func_cntl |= (mpll_param.bwcntl << MPLL_FUNC_CNTL__BWCTRL__SHIFT);
-
-	mpll_func_cntl_1 &= ~(MPLL_FUNC_CNTL_1__CLKF_MASK | MPLL_FUNC_CNTL_1__CLKFRAC_MASK |
-			MPLL_FUNC_CNTL_1__VCO_MODE_MASK);
-	mpll_func_cntl_1 |= (mpll_param.clkf << MPLL_FUNC_CNTL_1__CLKF__SHIFT) |
-		(mpll_param.clkfrac << MPLL_FUNC_CNTL_1__CLKFRAC__SHIFT) |
-		(mpll_param.vco_mode << MPLL_FUNC_CNTL_1__VCO_MODE__SHIFT);
-
-	mpll_ad_func_cntl &= ~MPLL_AD_FUNC_CNTL__YCLK_POST_DIV_MASK;
-	mpll_ad_func_cntl |= (mpll_param.post_div << MPLL_AD_FUNC_CNTL__YCLK_POST_DIV__SHIFT);
-
-	if (adev->gmc.vram_type == AMDGPU_VRAM_TYPE_GDDR5) {
-		mpll_dq_func_cntl &= ~(MPLL_DQ_FUNC_CNTL__YCLK_SEL_MASK |
-				MPLL_AD_FUNC_CNTL__YCLK_POST_DIV_MASK);
-		mpll_dq_func_cntl |= (mpll_param.yclk_sel << MPLL_DQ_FUNC_CNTL__YCLK_SEL__SHIFT) |
-				(mpll_param.post_div << MPLL_AD_FUNC_CNTL__YCLK_POST_DIV__SHIFT);
-	}
-
-	if (pi->caps_mclk_ss_support) {
-		struct amdgpu_atom_ss ss;
-		u32 freq_nom;
-		u32 tmp;
-		u32 reference_clock = adev->clock.mpll.reference_freq;
-
-		if (mpll_param.qdr == 1)
-			freq_nom = memory_clock * 4 * (1 << mpll_param.post_div);
-		else
-			freq_nom = memory_clock * 2 * (1 << mpll_param.post_div);
-
-		tmp = (freq_nom / reference_clock);
-		tmp = tmp * tmp;
-		if (amdgpu_atombios_get_asic_ss_info(adev, &ss,
-						     ASIC_INTERNAL_MEMORY_SS, freq_nom)) {
-			u32 clks = reference_clock * 5 / ss.rate;
-			u32 clkv = (u32)((((131 * ss.percentage * ss.rate) / 100) * tmp) / freq_nom);
-
-			mpll_ss1 &= ~MPLL_SS1__CLKV_MASK;
-			mpll_ss1 |= (clkv << MPLL_SS1__CLKV__SHIFT);
-
-			mpll_ss2 &= ~MPLL_SS2__CLKS_MASK;
-			mpll_ss2 |= (clks << MPLL_SS2__CLKS__SHIFT);
-		}
-	}
-
-	mclk_pwrmgt_cntl &= ~MCLK_PWRMGT_CNTL__DLL_SPEED_MASK;
-	mclk_pwrmgt_cntl |= (mpll_param.dll_speed << MCLK_PWRMGT_CNTL__DLL_SPEED__SHIFT);
-
-	if (dll_state_on)
-		mclk_pwrmgt_cntl |= MCLK_PWRMGT_CNTL__MRDCK0_PDNB_MASK |
-			MCLK_PWRMGT_CNTL__MRDCK1_PDNB_MASK;
-	else
-		mclk_pwrmgt_cntl &= ~(MCLK_PWRMGT_CNTL__MRDCK0_PDNB_MASK |
-			MCLK_PWRMGT_CNTL__MRDCK1_PDNB_MASK);
-
-	mclk->MclkFrequency = memory_clock;
-	mclk->MpllFuncCntl = mpll_func_cntl;
-	mclk->MpllFuncCntl_1 = mpll_func_cntl_1;
-	mclk->MpllFuncCntl_2 = mpll_func_cntl_2;
-	mclk->MpllAdFuncCntl = mpll_ad_func_cntl;
-	mclk->MpllDqFuncCntl = mpll_dq_func_cntl;
-	mclk->MclkPwrmgtCntl = mclk_pwrmgt_cntl;
-	mclk->DllCntl = dll_cntl;
-	mclk->MpllSs1 = mpll_ss1;
-	mclk->MpllSs2 = mpll_ss2;
-
-	return 0;
-}
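
Every register update in ci_calculate_mclk_params() uses the same clear-then-set idiom built on the generated _MASK/__SHIFT macro pairs. A generic sketch of that idiom (the field name and layout here are hypothetical):

#include <stdint.h>

#define DEMO_FIELD_MASK   0x0000F000u	/* hypothetical 4-bit field */
#define DEMO_FIELD__SHIFT 12

static inline uint32_t demo_set_field(uint32_t reg, uint32_t val)
{
	/* Clear the field, then OR in the new value at the right offset. */
	return (reg & ~DEMO_FIELD_MASK) |
	       ((val << DEMO_FIELD__SHIFT) & DEMO_FIELD_MASK);
}

Masking the shifted value as well guards against a caller passing a value wider than the field; the driver code above relies on the AtomBIOS parameters already being in range.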
-
-static int ci_populate_single_memory_level(struct amdgpu_device *adev,
-					   u32 memory_clock,
-					   SMU7_Discrete_MemoryLevel *memory_level)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	int ret;
-	bool dll_state_on;
-
-	if (adev->pm.dpm.dyn_state.vddc_dependency_on_mclk.entries) {
-		ret = ci_get_dependency_volt_by_clk(adev,
-						    &adev->pm.dpm.dyn_state.vddc_dependency_on_mclk,
-						    memory_clock, &memory_level->MinVddc);
-		if (ret)
-			return ret;
-	}
-
-	if (adev->pm.dpm.dyn_state.vddci_dependency_on_mclk.entries) {
-		ret = ci_get_dependency_volt_by_clk(adev,
-						    &adev->pm.dpm.dyn_state.vddci_dependency_on_mclk,
-						    memory_clock, &memory_level->MinVddci);
-		if (ret)
-			return ret;
-	}
-
-	if (adev->pm.dpm.dyn_state.mvdd_dependency_on_mclk.entries) {
-		ret = ci_get_dependency_volt_by_clk(adev,
-						    &adev->pm.dpm.dyn_state.mvdd_dependency_on_mclk,
-						    memory_clock, &memory_level->MinMvdd);
-		if (ret)
-			return ret;
-	}
-
-	memory_level->MinVddcPhases = 1;
-
-	if (pi->vddc_phase_shed_control)
-		ci_populate_phase_value_based_on_mclk(adev,
-						      &adev->pm.dpm.dyn_state.phase_shedding_limits_table,
-						      memory_clock,
-						      &memory_level->MinVddcPhases);
-
-	memory_level->EnabledForActivity = 1;
-	memory_level->EnabledForThrottle = 1;
-	memory_level->UpH = 0;
-	memory_level->DownH = 100;
-	memory_level->VoltageDownH = 0;
-	memory_level->ActivityLevel = (u16)pi->mclk_activity_target;
-
-	memory_level->StutterEnable = false;
-	memory_level->StrobeEnable = false;
-	memory_level->EdcReadEnable = false;
-	memory_level->EdcWriteEnable = false;
-	memory_level->RttEnable = false;
-
-	memory_level->DisplayWatermark = PPSMC_DISPLAY_WATERMARK_LOW;
-
-	if (pi->mclk_stutter_mode_threshold &&
-	    (memory_clock <= pi->mclk_stutter_mode_threshold) &&
-	    (!pi->uvd_enabled) &&
-	    (RREG32(mmDPG_PIPE_STUTTER_CONTROL) & DPG_PIPE_STUTTER_CONTROL__STUTTER_ENABLE_MASK) &&
-	    (adev->pm.dpm.new_active_crtc_count <= 2))
-		memory_level->StutterEnable = true;
-
-	if (pi->mclk_strobe_mode_threshold &&
-	    (memory_clock <= pi->mclk_strobe_mode_threshold))
-		memory_level->StrobeEnable = 1;
-
-	if (adev->gmc.vram_type == AMDGPU_VRAM_TYPE_GDDR5) {
-		memory_level->StrobeRatio =
-			ci_get_mclk_frequency_ratio(memory_clock, memory_level->StrobeEnable);
-		if (pi->mclk_edc_enable_threshold &&
-		    (memory_clock > pi->mclk_edc_enable_threshold))
-			memory_level->EdcReadEnable = true;
-
-		if (pi->mclk_edc_wr_enable_threshold &&
-		    (memory_clock > pi->mclk_edc_wr_enable_threshold))
-			memory_level->EdcWriteEnable = true;
-
-		if (memory_level->StrobeEnable) {
-			if (ci_get_mclk_frequency_ratio(memory_clock, true) >=
-			    ((RREG32(mmMC_SEQ_MISC7) >> 16) & 0xf))
-				dll_state_on = (RREG32(mmMC_SEQ_MISC5) >> 1) & 0x1;
-			else
-				dll_state_on = (RREG32(mmMC_SEQ_MISC6) >> 1) & 0x1;
-		} else {
-			dll_state_on = pi->dll_default_on;
-		}
-	} else {
-		memory_level->StrobeRatio = ci_get_ddr3_mclk_frequency_ratio(memory_clock);
-		dll_state_on = (RREG32(mmMC_SEQ_MISC5) >> 1) & 0x1;
-	}
-
-	ret = ci_calculate_mclk_params(adev, memory_clock, memory_level, memory_level->StrobeEnable, dll_state_on);
-	if (ret)
-		return ret;
-
-	memory_level->MinVddc = cpu_to_be32(memory_level->MinVddc * VOLTAGE_SCALE);
-	memory_level->MinVddcPhases = cpu_to_be32(memory_level->MinVddcPhases);
-	memory_level->MinVddci = cpu_to_be32(memory_level->MinVddci * VOLTAGE_SCALE);
-	memory_level->MinMvdd = cpu_to_be32(memory_level->MinMvdd * VOLTAGE_SCALE);
-
-	memory_level->MclkFrequency = cpu_to_be32(memory_level->MclkFrequency);
-	memory_level->ActivityLevel = cpu_to_be16(memory_level->ActivityLevel);
-	memory_level->MpllFuncCntl = cpu_to_be32(memory_level->MpllFuncCntl);
-	memory_level->MpllFuncCntl_1 = cpu_to_be32(memory_level->MpllFuncCntl_1);
-	memory_level->MpllFuncCntl_2 = cpu_to_be32(memory_level->MpllFuncCntl_2);
-	memory_level->MpllAdFuncCntl = cpu_to_be32(memory_level->MpllAdFuncCntl);
-	memory_level->MpllDqFuncCntl = cpu_to_be32(memory_level->MpllDqFuncCntl);
-	memory_level->MclkPwrmgtCntl = cpu_to_be32(memory_level->MclkPwrmgtCntl);
-	memory_level->DllCntl = cpu_to_be32(memory_level->DllCntl);
-	memory_level->MpllSs1 = cpu_to_be32(memory_level->MpllSs1);
-	memory_level->MpllSs2 = cpu_to_be32(memory_level->MpllSs2);
-
-	return 0;
-}
-
-static int ci_populate_smc_acpi_level(struct amdgpu_device *adev,
-				      SMU7_Discrete_DpmTable *table)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct atom_clock_dividers dividers;
-	SMU7_Discrete_VoltageLevel voltage_level;
-	u32 spll_func_cntl = pi->clock_registers.cg_spll_func_cntl;
-	u32 spll_func_cntl_2 = pi->clock_registers.cg_spll_func_cntl_2;
-	u32 dll_cntl = pi->clock_registers.dll_cntl;
-	u32 mclk_pwrmgt_cntl = pi->clock_registers.mclk_pwrmgt_cntl;
-	int ret;
-
-	table->ACPILevel.Flags &= ~PPSMC_SWSTATE_FLAG_DC;
-
-	if (pi->acpi_vddc)
-		table->ACPILevel.MinVddc = cpu_to_be32(pi->acpi_vddc * VOLTAGE_SCALE);
-	else
-		table->ACPILevel.MinVddc = cpu_to_be32(pi->min_vddc_in_pp_table * VOLTAGE_SCALE);
-
-	table->ACPILevel.MinVddcPhases = pi->vddc_phase_shed_control ? 0 : 1;
-
-	table->ACPILevel.SclkFrequency = adev->clock.spll.reference_freq;
-
-	ret = amdgpu_atombios_get_clock_dividers(adev,
-						 COMPUTE_GPUCLK_INPUT_FLAG_SCLK,
-						 table->ACPILevel.SclkFrequency, false, &dividers);
-	if (ret)
-		return ret;
-
-	table->ACPILevel.SclkDid = (u8)dividers.post_divider;
-	table->ACPILevel.DisplayWatermark = PPSMC_DISPLAY_WATERMARK_LOW;
-	table->ACPILevel.DeepSleepDivId = 0;
-
-	spll_func_cntl &= ~CG_SPLL_FUNC_CNTL__SPLL_PWRON_MASK;
-	spll_func_cntl |= CG_SPLL_FUNC_CNTL__SPLL_RESET_MASK;
-
-	spll_func_cntl_2 &= ~CG_SPLL_FUNC_CNTL_2__SCLK_MUX_SEL_MASK;
-	spll_func_cntl_2 |= (4 << CG_SPLL_FUNC_CNTL_2__SCLK_MUX_SEL__SHIFT);
-
-	table->ACPILevel.CgSpllFuncCntl = spll_func_cntl;
-	table->ACPILevel.CgSpllFuncCntl2 = spll_func_cntl_2;
-	table->ACPILevel.CgSpllFuncCntl3 = pi->clock_registers.cg_spll_func_cntl_3;
-	table->ACPILevel.CgSpllFuncCntl4 = pi->clock_registers.cg_spll_func_cntl_4;
-	table->ACPILevel.SpllSpreadSpectrum = pi->clock_registers.cg_spll_spread_spectrum;
-	table->ACPILevel.SpllSpreadSpectrum2 = pi->clock_registers.cg_spll_spread_spectrum_2;
-	table->ACPILevel.CcPwrDynRm = 0;
-	table->ACPILevel.CcPwrDynRm1 = 0;
-
-	table->ACPILevel.Flags = cpu_to_be32(table->ACPILevel.Flags);
-	table->ACPILevel.MinVddcPhases = cpu_to_be32(table->ACPILevel.MinVddcPhases);
-	table->ACPILevel.SclkFrequency = cpu_to_be32(table->ACPILevel.SclkFrequency);
-	table->ACPILevel.CgSpllFuncCntl = cpu_to_be32(table->ACPILevel.CgSpllFuncCntl);
-	table->ACPILevel.CgSpllFuncCntl2 = cpu_to_be32(table->ACPILevel.CgSpllFuncCntl2);
-	table->ACPILevel.CgSpllFuncCntl3 = cpu_to_be32(table->ACPILevel.CgSpllFuncCntl3);
-	table->ACPILevel.CgSpllFuncCntl4 = cpu_to_be32(table->ACPILevel.CgSpllFuncCntl4);
-	table->ACPILevel.SpllSpreadSpectrum = cpu_to_be32(table->ACPILevel.SpllSpreadSpectrum);
-	table->ACPILevel.SpllSpreadSpectrum2 = cpu_to_be32(table->ACPILevel.SpllSpreadSpectrum2);
-	table->ACPILevel.CcPwrDynRm = cpu_to_be32(table->ACPILevel.CcPwrDynRm);
-	table->ACPILevel.CcPwrDynRm1 = cpu_to_be32(table->ACPILevel.CcPwrDynRm1);
-
-	table->MemoryACPILevel.MinVddc = table->ACPILevel.MinVddc;
-	table->MemoryACPILevel.MinVddcPhases = table->ACPILevel.MinVddcPhases;
-
-	if (pi->vddci_control != CISLANDS_VOLTAGE_CONTROL_NONE) {
-		if (pi->acpi_vddci)
-			table->MemoryACPILevel.MinVddci =
-				cpu_to_be32(pi->acpi_vddci * VOLTAGE_SCALE);
-		else
-			table->MemoryACPILevel.MinVddci =
-				cpu_to_be32(pi->min_vddci_in_pp_table * VOLTAGE_SCALE);
-	}
-
-	if (ci_populate_mvdd_value(adev, 0, &voltage_level))
-		table->MemoryACPILevel.MinMvdd = 0;
-	else
-		table->MemoryACPILevel.MinMvdd =
-			cpu_to_be32(voltage_level.Voltage * VOLTAGE_SCALE);
-
-	mclk_pwrmgt_cntl |= MCLK_PWRMGT_CNTL__MRDCK0_RESET_MASK |
-		MCLK_PWRMGT_CNTL__MRDCK1_RESET_MASK;
-	mclk_pwrmgt_cntl &= ~(MCLK_PWRMGT_CNTL__MRDCK0_PDNB_MASK |
-			MCLK_PWRMGT_CNTL__MRDCK1_PDNB_MASK);
-
-	dll_cntl &= ~(DLL_CNTL__MRDCK0_BYPASS_MASK | DLL_CNTL__MRDCK1_BYPASS_MASK);
-
-	table->MemoryACPILevel.DllCntl = cpu_to_be32(dll_cntl);
-	table->MemoryACPILevel.MclkPwrmgtCntl = cpu_to_be32(mclk_pwrmgt_cntl);
-	table->MemoryACPILevel.MpllAdFuncCntl =
-		cpu_to_be32(pi->clock_registers.mpll_ad_func_cntl);
-	table->MemoryACPILevel.MpllDqFuncCntl =
-		cpu_to_be32(pi->clock_registers.mpll_dq_func_cntl);
-	table->MemoryACPILevel.MpllFuncCntl =
-		cpu_to_be32(pi->clock_registers.mpll_func_cntl);
-	table->MemoryACPILevel.MpllFuncCntl_1 =
-		cpu_to_be32(pi->clock_registers.mpll_func_cntl_1);
-	table->MemoryACPILevel.MpllFuncCntl_2 =
-		cpu_to_be32(pi->clock_registers.mpll_func_cntl_2);
-	table->MemoryACPILevel.MpllSs1 = cpu_to_be32(pi->clock_registers.mpll_ss1);
-	table->MemoryACPILevel.MpllSs2 = cpu_to_be32(pi->clock_registers.mpll_ss2);
-
-	table->MemoryACPILevel.EnabledForThrottle = 0;
-	table->MemoryACPILevel.EnabledForActivity = 0;
-	table->MemoryACPILevel.UpH = 0;
-	table->MemoryACPILevel.DownH = 100;
-	table->MemoryACPILevel.VoltageDownH = 0;
-	table->MemoryACPILevel.ActivityLevel =
-		cpu_to_be16((u16)pi->mclk_activity_target);
-
-	table->MemoryACPILevel.StutterEnable = false;
-	table->MemoryACPILevel.StrobeEnable = false;
-	table->MemoryACPILevel.EdcReadEnable = false;
-	table->MemoryACPILevel.EdcWriteEnable = false;
-	table->MemoryACPILevel.RttEnable = false;
-
-	return 0;
-}
-
-static int ci_enable_ulv(struct amdgpu_device *adev, bool enable)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_ulv_parm *ulv = &pi->ulv;
-
-	if (ulv->supported) {
-		if (enable)
-			return (amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_EnableULV) == PPSMC_Result_OK) ?
-				0 : -EINVAL;
-		else
-			return (amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_DisableULV) == PPSMC_Result_OK) ?
-				0 : -EINVAL;
-	}
-
-	return 0;
-}
-
-static int ci_populate_ulv_level(struct amdgpu_device *adev,
-				 SMU7_Discrete_Ulv *state)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u16 ulv_voltage = adev->pm.dpm.backbias_response_time;
-
-	state->CcPwrDynRm = 0;
-	state->CcPwrDynRm1 = 0;
-
-	if (ulv_voltage == 0) {
-		pi->ulv.supported = false;
-		return 0;
-	}
-
-	if (pi->voltage_control != CISLANDS_VOLTAGE_CONTROL_BY_SVID2) {
-		if (ulv_voltage > adev->pm.dpm.dyn_state.vddc_dependency_on_sclk.entries[0].v)
-			state->VddcOffset = 0;
-		else
-			state->VddcOffset =
-				adev->pm.dpm.dyn_state.vddc_dependency_on_sclk.entries[0].v - ulv_voltage;
-	} else {
-		if (ulv_voltage > adev->pm.dpm.dyn_state.vddc_dependency_on_sclk.entries[0].v)
-			state->VddcOffsetVid = 0;
-		else
-			state->VddcOffsetVid = (u8)
-				((adev->pm.dpm.dyn_state.vddc_dependency_on_sclk.entries[0].v - ulv_voltage) *
-				 VOLTAGE_VID_OFFSET_SCALE2 / VOLTAGE_VID_OFFSET_SCALE1);
-	}
-	state->VddcPhase = pi->vddc_phase_shed_control ? 0 : 1;
-
-	state->CcPwrDynRm = cpu_to_be32(state->CcPwrDynRm);
-	state->CcPwrDynRm1 = cpu_to_be32(state->CcPwrDynRm1);
-	state->VddcOffset = cpu_to_be16(state->VddcOffset);
-
-	return 0;
-}
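
In the SVID2 branch, ci_populate_ulv_level() converts the voltage delta into a VID offset using the VOLTAGE_VID_OFFSET_SCALE2/SCALE1 ratio. A worked sketch of that arithmetic, assuming the usual SMU7 scale factors of 100 and 625 (treat the constants and millivolt inputs here as illustrative):

#include <stdint.h>
#include <stdio.h>

#define SCALE1 625	/* assumed VOLTAGE_VID_OFFSET_SCALE1 */
#define SCALE2 100	/* assumed VOLTAGE_VID_OFFSET_SCALE2 */

int main(void)
{
	uint16_t vddc_entry0 = 900;	/* lowest sclk dependency voltage, mV */
	uint16_t ulv_voltage = 850;	/* ULV target voltage, mV */
	uint8_t offset_vid = (uint8_t)((vddc_entry0 - ulv_voltage) * SCALE2 / SCALE1);

	printf("VddcOffsetVid = %u\n", offset_vid);	/* 50 * 100 / 625 = 8 */
	return 0;
}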
-
-static int ci_calculate_sclk_params(struct amdgpu_device *adev,
-				    u32 engine_clock,
-				    SMU7_Discrete_GraphicsLevel *sclk)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct atom_clock_dividers dividers;
-	u32 spll_func_cntl_3 = pi->clock_registers.cg_spll_func_cntl_3;
-	u32 spll_func_cntl_4 = pi->clock_registers.cg_spll_func_cntl_4;
-	u32 cg_spll_spread_spectrum = pi->clock_registers.cg_spll_spread_spectrum;
-	u32 cg_spll_spread_spectrum_2 = pi->clock_registers.cg_spll_spread_spectrum_2;
-	u32 reference_clock = adev->clock.spll.reference_freq;
-	u32 reference_divider;
-	u32 fbdiv;
-	int ret;
-
-	ret = amdgpu_atombios_get_clock_dividers(adev,
-						 COMPUTE_GPUCLK_INPUT_FLAG_SCLK,
-						 engine_clock, false, &dividers);
-	if (ret)
-		return ret;
-
-	reference_divider = 1 + dividers.ref_div;
-	fbdiv = dividers.fb_div & 0x3FFFFFF;
-
-	spll_func_cntl_3 &= ~CG_SPLL_FUNC_CNTL_3__SPLL_FB_DIV_MASK;
-	spll_func_cntl_3 |= (fbdiv << CG_SPLL_FUNC_CNTL_3__SPLL_FB_DIV__SHIFT);
-	spll_func_cntl_3 |= CG_SPLL_FUNC_CNTL_3__SPLL_DITHEN_MASK;
-
-	if (pi->caps_sclk_ss_support) {
-		struct amdgpu_atom_ss ss;
-		u32 vco_freq = engine_clock * dividers.post_div;
-
-		if (amdgpu_atombios_get_asic_ss_info(adev, &ss,
-						     ASIC_INTERNAL_ENGINE_SS, vco_freq)) {
-			u32 clk_s = reference_clock * 5 / (reference_divider * ss.rate);
-			u32 clk_v = 4 * ss.percentage * fbdiv / (clk_s * 10000);
-
-			cg_spll_spread_spectrum &= ~(CG_SPLL_SPREAD_SPECTRUM__CLKS_MASK | CG_SPLL_SPREAD_SPECTRUM__SSEN_MASK);
-			cg_spll_spread_spectrum |= (clk_s << CG_SPLL_SPREAD_SPECTRUM__CLKS__SHIFT);
-			cg_spll_spread_spectrum |= (1 << CG_SPLL_SPREAD_SPECTRUM__SSEN__SHIFT);
-
-			cg_spll_spread_spectrum_2 &= ~CG_SPLL_SPREAD_SPECTRUM_2__CLKV_MASK;
-			cg_spll_spread_spectrum_2 |= (clk_v << CG_SPLL_SPREAD_SPECTRUM_2__CLKV__SHIFT);
-		}
-	}
-
-	sclk->SclkFrequency = engine_clock;
-	sclk->CgSpllFuncCntl3 = spll_func_cntl_3;
-	sclk->CgSpllFuncCntl4 = spll_func_cntl_4;
-	sclk->SpllSpreadSpectrum = cg_spll_spread_spectrum;
-	sclk->SpllSpreadSpectrum2 = cg_spll_spread_spectrum_2;
-	sclk->SclkDid = (u8)dividers.post_divider;
-
-	return 0;
-}
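
The engine spread-spectrum setup above reduces to two derived values: clk_s from the reference clock and the SS modulation rate, and clk_v from the SS percentage and the feedback divider. A standalone sketch of just that arithmetic (all input values are illustrative, not taken from real hardware tables):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t reference_clock = 2500;	/* reference freq in 10 kHz units */
	uint32_t reference_divider = 2;		/* 1 + ref_div */
	uint32_t ss_rate = 10;			/* modulation rate from the SS table */
	uint32_t ss_percentage = 25;		/* 0.25% in units of 0.01% */
	uint32_t fbdiv = 0x960000;		/* feedback divider incl. fraction bits */

	uint32_t clk_s = reference_clock * 5 / (reference_divider * ss_rate);
	uint32_t clk_v = 4 * ss_percentage * fbdiv / (clk_s * 10000);

	printf("clk_s=%u clk_v=%u\n", clk_s, clk_v);	/* clk_s=625 clk_v=157 */
	return 0;
}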
-
-static int ci_populate_single_graphic_level(struct amdgpu_device *adev,
-					    u32 engine_clock,
-					    u16 sclk_activity_level_t,
-					    SMU7_Discrete_GraphicsLevel *graphic_level)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	int ret;
-
-	ret = ci_calculate_sclk_params(adev, engine_clock, graphic_level);
-	if (ret)
-		return ret;
-
-	ret = ci_get_dependency_volt_by_clk(adev,
-					    &adev->pm.dpm.dyn_state.vddc_dependency_on_sclk,
-					    engine_clock, &graphic_level->MinVddc);
-	if (ret)
-		return ret;
-
-	graphic_level->SclkFrequency = engine_clock;
-
-	graphic_level->Flags =  0;
-	graphic_level->MinVddcPhases = 1;
-
-	if (pi->vddc_phase_shed_control)
-		ci_populate_phase_value_based_on_sclk(adev,
-						      &adev->pm.dpm.dyn_state.phase_shedding_limits_table,
-						      engine_clock,
-						      &graphic_level->MinVddcPhases);
-
-	graphic_level->ActivityLevel = sclk_activity_level_t;
-
-	graphic_level->CcPwrDynRm = 0;
-	graphic_level->CcPwrDynRm1 = 0;
-	graphic_level->EnabledForThrottle = 1;
-	graphic_level->UpH = 0;
-	graphic_level->DownH = 0;
-	graphic_level->VoltageDownH = 0;
-	graphic_level->PowerThrottle = 0;
-
-	if (pi->caps_sclk_ds)
-		graphic_level->DeepSleepDivId = ci_get_sleep_divider_id_from_clock(engine_clock,
-										   CISLAND_MINIMUM_ENGINE_CLOCK);
-
-	graphic_level->DisplayWatermark = PPSMC_DISPLAY_WATERMARK_LOW;
-
-	graphic_level->Flags = cpu_to_be32(graphic_level->Flags);
-	graphic_level->MinVddc = cpu_to_be32(graphic_level->MinVddc * VOLTAGE_SCALE);
-	graphic_level->MinVddcPhases = cpu_to_be32(graphic_level->MinVddcPhases);
-	graphic_level->SclkFrequency = cpu_to_be32(graphic_level->SclkFrequency);
-	graphic_level->ActivityLevel = cpu_to_be16(graphic_level->ActivityLevel);
-	graphic_level->CgSpllFuncCntl3 = cpu_to_be32(graphic_level->CgSpllFuncCntl3);
-	graphic_level->CgSpllFuncCntl4 = cpu_to_be32(graphic_level->CgSpllFuncCntl4);
-	graphic_level->SpllSpreadSpectrum = cpu_to_be32(graphic_level->SpllSpreadSpectrum);
-	graphic_level->SpllSpreadSpectrum2 = cpu_to_be32(graphic_level->SpllSpreadSpectrum2);
-	graphic_level->CcPwrDynRm = cpu_to_be32(graphic_level->CcPwrDynRm);
-	graphic_level->CcPwrDynRm1 = cpu_to_be32(graphic_level->CcPwrDynRm1);
-
-	return 0;
-}
-
-static int ci_populate_all_graphic_levels(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_dpm_table *dpm_table = &pi->dpm_table;
-	u32 level_array_address = pi->dpm_table_start +
-		offsetof(SMU7_Discrete_DpmTable, GraphicsLevel);
-	u32 level_array_size = sizeof(SMU7_Discrete_GraphicsLevel) *
-		SMU7_MAX_LEVELS_GRAPHICS;
-	SMU7_Discrete_GraphicsLevel *levels = pi->smc_state_table.GraphicsLevel;
-	u32 i;
-	int ret;
-
-	memset(levels, 0, level_array_size);
-
-	for (i = 0; i < dpm_table->sclk_table.count; i++) {
-		ret = ci_populate_single_graphic_level(adev,
-						       dpm_table->sclk_table.dpm_levels[i].value,
-						       (u16)pi->activity_target[i],
-						       &pi->smc_state_table.GraphicsLevel[i]);
-		if (ret)
-			return ret;
-		if (i > 1)
-			pi->smc_state_table.GraphicsLevel[i].DeepSleepDivId = 0;
-		if (i == (dpm_table->sclk_table.count - 1))
-			pi->smc_state_table.GraphicsLevel[i].DisplayWatermark =
-				PPSMC_DISPLAY_WATERMARK_HIGH;
-	}
-	pi->smc_state_table.GraphicsLevel[0].EnabledForActivity = 1;
-
-	pi->smc_state_table.GraphicsDpmLevelCount = (u8)dpm_table->sclk_table.count;
-	pi->dpm_level_enable_mask.sclk_dpm_enable_mask =
-		ci_get_dpm_level_enable_mask_value(&dpm_table->sclk_table);
-
-	ret = amdgpu_ci_copy_bytes_to_smc(adev, level_array_address,
-				   (u8 *)levels, level_array_size,
-				   pi->sram_end);
-	if (ret)
-		return ret;
-
-	return 0;
-}
-
-static int ci_populate_ulv_state(struct amdgpu_device *adev,
-				 SMU7_Discrete_Ulv *ulv_level)
-{
-	return ci_populate_ulv_level(adev, ulv_level);
-}
-
-static int ci_populate_all_memory_levels(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_dpm_table *dpm_table = &pi->dpm_table;
-	u32 level_array_address = pi->dpm_table_start +
-		offsetof(SMU7_Discrete_DpmTable, MemoryLevel);
-	u32 level_array_size = sizeof(SMU7_Discrete_MemoryLevel) *
-		SMU7_MAX_LEVELS_MEMORY;
-	SMU7_Discrete_MemoryLevel *levels = pi->smc_state_table.MemoryLevel;
-	u32 i;
-	int ret;
-
-	memset(levels, 0, level_array_size);
-
-	for (i = 0; i < dpm_table->mclk_table.count; i++) {
-		if (dpm_table->mclk_table.dpm_levels[i].value == 0)
-			return -EINVAL;
-		ret = ci_populate_single_memory_level(adev,
-						      dpm_table->mclk_table.dpm_levels[i].value,
-						      &pi->smc_state_table.MemoryLevel[i]);
-		if (ret)
-			return ret;
-	}
-
-	if ((dpm_table->mclk_table.count >= 2) &&
-	    ((adev->pdev->device == 0x67B0) || (adev->pdev->device == 0x67B1))) {
-		pi->smc_state_table.MemoryLevel[1].MinVddc =
-			pi->smc_state_table.MemoryLevel[0].MinVddc;
-		pi->smc_state_table.MemoryLevel[1].MinVddcPhases =
-			pi->smc_state_table.MemoryLevel[0].MinVddcPhases;
-	}
-
-	pi->smc_state_table.MemoryLevel[0].ActivityLevel = cpu_to_be16(0x1F);
-
-	pi->smc_state_table.MemoryDpmLevelCount = (u8)dpm_table->mclk_table.count;
-	pi->dpm_level_enable_mask.mclk_dpm_enable_mask =
-		ci_get_dpm_level_enable_mask_value(&dpm_table->mclk_table);
-
-	pi->smc_state_table.MemoryLevel[dpm_table->mclk_table.count - 1].DisplayWatermark =
-		PPSMC_DISPLAY_WATERMARK_HIGH;
-
-	ret = amdgpu_ci_copy_bytes_to_smc(adev, level_array_address,
-				   (u8 *)levels, level_array_size,
-				   pi->sram_end);
-	if (ret)
-		return ret;
-
-	return 0;
-}
-
-static void ci_reset_single_dpm_table(struct amdgpu_device *adev,
-				      struct ci_single_dpm_table* dpm_table,
-				      u32 count)
-{
-	u32 i;
-
-	dpm_table->count = count;
-	for (i = 0; i < MAX_REGULAR_DPM_NUMBER; i++)
-		dpm_table->dpm_levels[i].enabled = false;
-}
-
-static void ci_setup_pcie_table_entry(struct ci_single_dpm_table* dpm_table,
-				      u32 index, u32 pcie_gen, u32 pcie_lanes)
-{
-	dpm_table->dpm_levels[index].value = pcie_gen;
-	dpm_table->dpm_levels[index].param1 = pcie_lanes;
-	dpm_table->dpm_levels[index].enabled = true;
-}
-
-static int ci_setup_default_pcie_tables(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	if (!pi->use_pcie_performance_levels && !pi->use_pcie_powersaving_levels)
-		return -EINVAL;
-
-	if (pi->use_pcie_performance_levels && !pi->use_pcie_powersaving_levels) {
-		pi->pcie_gen_powersaving = pi->pcie_gen_performance;
-		pi->pcie_lane_powersaving = pi->pcie_lane_performance;
-	} else if (!pi->use_pcie_performance_levels && pi->use_pcie_powersaving_levels) {
-		pi->pcie_gen_performance = pi->pcie_gen_powersaving;
-		pi->pcie_lane_performance = pi->pcie_lane_powersaving;
-	}
-
-	ci_reset_single_dpm_table(adev,
-				  &pi->dpm_table.pcie_speed_table,
-				  SMU7_MAX_LEVELS_LINK);
-
-	if (adev->asic_type == CHIP_BONAIRE)
-		ci_setup_pcie_table_entry(&pi->dpm_table.pcie_speed_table, 0,
-					  pi->pcie_gen_powersaving.min,
-					  pi->pcie_lane_powersaving.max);
-	else
-		ci_setup_pcie_table_entry(&pi->dpm_table.pcie_speed_table, 0,
-					  pi->pcie_gen_powersaving.min,
-					  pi->pcie_lane_powersaving.min);
-	ci_setup_pcie_table_entry(&pi->dpm_table.pcie_speed_table, 1,
-				  pi->pcie_gen_performance.min,
-				  pi->pcie_lane_performance.min);
-	ci_setup_pcie_table_entry(&pi->dpm_table.pcie_speed_table, 2,
-				  pi->pcie_gen_powersaving.min,
-				  pi->pcie_lane_powersaving.max);
-	ci_setup_pcie_table_entry(&pi->dpm_table.pcie_speed_table, 3,
-				  pi->pcie_gen_performance.min,
-				  pi->pcie_lane_performance.max);
-	ci_setup_pcie_table_entry(&pi->dpm_table.pcie_speed_table, 4,
-				  pi->pcie_gen_powersaving.max,
-				  pi->pcie_lane_powersaving.max);
-	ci_setup_pcie_table_entry(&pi->dpm_table.pcie_speed_table, 5,
-				  pi->pcie_gen_performance.max,
-				  pi->pcie_lane_performance.max);
-
-	pi->dpm_table.pcie_speed_table.count = 6;
-
-	return 0;
-}
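
The six fixed link levels built above interleave powersaving and performance gen/lane pairs, with Bonaire special-casing level 0 to use the maximum powersaving lane count. The same table expressed as data makes the layout easier to audit; this is a sketch with invented types, not a drop-in refactor:

#include <stdint.h>

struct pcie_caps { uint32_t min, max; };
struct pcie_level { uint32_t gen, lanes; };

static void build_pcie_levels(struct pcie_level out[6],
			      struct pcie_caps gen_ps, struct pcie_caps lane_ps,
			      struct pcie_caps gen_perf, struct pcie_caps lane_perf)
{
	const struct pcie_level levels[6] = {
		{ gen_ps.min,   lane_ps.min },	/* level 0 (Bonaire uses lane_ps.max) */
		{ gen_perf.min, lane_perf.min },
		{ gen_ps.min,   lane_ps.max },
		{ gen_perf.min, lane_perf.max },
		{ gen_ps.max,   lane_ps.max },
		{ gen_perf.max, lane_perf.max },
	};
	for (int i = 0; i < 6; i++)
		out[i] = levels[i];
}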
-
-static int ci_setup_default_dpm_tables(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct amdgpu_clock_voltage_dependency_table *allowed_sclk_vddc_table =
-		&adev->pm.dpm.dyn_state.vddc_dependency_on_sclk;
-	struct amdgpu_clock_voltage_dependency_table *allowed_mclk_table =
-		&adev->pm.dpm.dyn_state.vddc_dependency_on_mclk;
-	struct amdgpu_cac_leakage_table *std_voltage_table =
-		&adev->pm.dpm.dyn_state.cac_leakage_table;
-	u32 i;
-
-	if (allowed_sclk_vddc_table == NULL)
-		return -EINVAL;
-	if (allowed_sclk_vddc_table->count < 1)
-		return -EINVAL;
-	if (allowed_mclk_table == NULL)
-		return -EINVAL;
-	if (allowed_mclk_table->count < 1)
-		return -EINVAL;
-
-	memset(&pi->dpm_table, 0, sizeof(struct ci_dpm_table));
-
-	ci_reset_single_dpm_table(adev,
-				  &pi->dpm_table.sclk_table,
-				  SMU7_MAX_LEVELS_GRAPHICS);
-	ci_reset_single_dpm_table(adev,
-				  &pi->dpm_table.mclk_table,
-				  SMU7_MAX_LEVELS_MEMORY);
-	ci_reset_single_dpm_table(adev,
-				  &pi->dpm_table.vddc_table,
-				  SMU7_MAX_LEVELS_VDDC);
-	ci_reset_single_dpm_table(adev,
-				  &pi->dpm_table.vddci_table,
-				  SMU7_MAX_LEVELS_VDDCI);
-	ci_reset_single_dpm_table(adev,
-				  &pi->dpm_table.mvdd_table,
-				  SMU7_MAX_LEVELS_MVDD);
-
-	pi->dpm_table.sclk_table.count = 0;
-	for (i = 0; i < allowed_sclk_vddc_table->count; i++) {
-		if ((i == 0) ||
-		    (pi->dpm_table.sclk_table.dpm_levels[pi->dpm_table.sclk_table.count-1].value !=
-		     allowed_sclk_vddc_table->entries[i].clk)) {
-			pi->dpm_table.sclk_table.dpm_levels[pi->dpm_table.sclk_table.count].value =
-				allowed_sclk_vddc_table->entries[i].clk;
-			pi->dpm_table.sclk_table.dpm_levels[pi->dpm_table.sclk_table.count].enabled =
-				(i == 0) ? true : false;
-			pi->dpm_table.sclk_table.count++;
-		}
-	}
-
-	pi->dpm_table.mclk_table.count = 0;
-	for (i = 0; i < allowed_mclk_table->count; i++) {
-		if ((i == 0) ||
-		    (pi->dpm_table.mclk_table.dpm_levels[pi->dpm_table.mclk_table.count-1].value !=
-		     allowed_mclk_table->entries[i].clk)) {
-			pi->dpm_table.mclk_table.dpm_levels[pi->dpm_table.mclk_table.count].value =
-				allowed_mclk_table->entries[i].clk;
-			pi->dpm_table.mclk_table.dpm_levels[pi->dpm_table.mclk_table.count].enabled =
-				(i == 0) ? true : false;
-			pi->dpm_table.mclk_table.count++;
-		}
-	}
-
-	for (i = 0; i < allowed_sclk_vddc_table->count; i++) {
-		pi->dpm_table.vddc_table.dpm_levels[i].value =
-			allowed_sclk_vddc_table->entries[i].v;
-		pi->dpm_table.vddc_table.dpm_levels[i].param1 =
-			std_voltage_table->entries[i].leakage;
-		pi->dpm_table.vddc_table.dpm_levels[i].enabled = true;
-	}
-	pi->dpm_table.vddc_table.count = allowed_sclk_vddc_table->count;
-
-	allowed_mclk_table = &adev->pm.dpm.dyn_state.vddci_dependency_on_mclk;
-	if (allowed_mclk_table) {
-		for (i = 0; i < allowed_mclk_table->count; i++) {
-			pi->dpm_table.vddci_table.dpm_levels[i].value =
-				allowed_mclk_table->entries[i].v;
-			pi->dpm_table.vddci_table.dpm_levels[i].enabled = true;
-		}
-		pi->dpm_table.vddci_table.count = allowed_mclk_table->count;
-	}
-
-	allowed_mclk_table = &adev->pm.dpm.dyn_state.mvdd_dependency_on_mclk;
-	if (allowed_mclk_table) {
-		for (i = 0; i < allowed_mclk_table->count; i++) {
-			pi->dpm_table.mvdd_table.dpm_levels[i].value =
-				allowed_mclk_table->entries[i].v;
-			pi->dpm_table.mvdd_table.dpm_levels[i].enabled = true;
-		}
-		pi->dpm_table.mvdd_table.count = allowed_mclk_table->count;
-	}
-
-	ci_setup_default_pcie_tables(adev);
-
-	/* save a copy of the default DPM table */
-	memcpy(&(pi->golden_dpm_table), &(pi->dpm_table),
-			sizeof(struct ci_dpm_table));
-
-	return 0;
-}
-
-static int ci_find_boot_level(struct ci_single_dpm_table *table,
-			      u32 value, u32 *boot_level)
-{
-	u32 i;
-	int ret = -EINVAL;
-
-	for (i = 0; i < table->count; i++) {
-		if (value == table->dpm_levels[i].value) {
-			*boot_level = i;
-			ret = 0;
-		}
-	}
-
-	return ret;
-}
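
ci_find_boot_level() scans the whole table and keeps the last matching index. Since duplicate clock values are skipped when the DPM tables are built in ci_setup_default_dpm_tables(), an early return is equivalent; a sketch of that variant:

static int find_level_demo(const uint32_t *values, uint32_t count,
			   uint32_t value, uint32_t *index)
{
	for (uint32_t i = 0; i < count; i++) {
		if (values[i] == value) {
			*index = i;
			return 0;	/* values are unique, first hit is enough */
		}
	}
	return -1;	/* stand-in for -EINVAL */
}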
-
-static int ci_init_smc_table(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_ulv_parm *ulv = &pi->ulv;
-	struct amdgpu_ps *amdgpu_boot_state = adev->pm.dpm.boot_ps;
-	SMU7_Discrete_DpmTable *table = &pi->smc_state_table;
-	int ret;
-
-	ret = ci_setup_default_dpm_tables(adev);
-	if (ret)
-		return ret;
-
-	if (pi->voltage_control != CISLANDS_VOLTAGE_CONTROL_NONE)
-		ci_populate_smc_voltage_tables(adev, table);
-
-	ci_init_fps_limits(adev);
-
-	if (adev->pm.dpm.platform_caps & ATOM_PP_PLATFORM_CAP_HARDWAREDC)
-		table->SystemFlags |= PPSMC_SYSTEMFLAG_GPIO_DC;
-
-	if (adev->pm.dpm.platform_caps & ATOM_PP_PLATFORM_CAP_STEPVDDC)
-		table->SystemFlags |= PPSMC_SYSTEMFLAG_STEPVDDC;
-
-	if (adev->gmc.vram_type == AMDGPU_VRAM_TYPE_GDDR5)
-		table->SystemFlags |= PPSMC_SYSTEMFLAG_GDDR5;
-
-	if (ulv->supported) {
-		ret = ci_populate_ulv_state(adev, &pi->smc_state_table.Ulv);
-		if (ret)
-			return ret;
-		WREG32_SMC(ixCG_ULV_PARAMETER, ulv->cg_ulv_parameter);
-	}
-
-	ret = ci_populate_all_graphic_levels(adev);
-	if (ret)
-		return ret;
-
-	ret = ci_populate_all_memory_levels(adev);
-	if (ret)
-		return ret;
-
-	ci_populate_smc_link_level(adev, table);
-
-	ret = ci_populate_smc_acpi_level(adev, table);
-	if (ret)
-		return ret;
-
-	ret = ci_populate_smc_vce_level(adev, table);
-	if (ret)
-		return ret;
-
-	ret = ci_populate_smc_acp_level(adev, table);
-	if (ret)
-		return ret;
-
-	ret = ci_populate_smc_samu_level(adev, table);
-	if (ret)
-		return ret;
-
-	ret = ci_do_program_memory_timing_parameters(adev);
-	if (ret)
-		return ret;
-
-	ret = ci_populate_smc_uvd_level(adev, table);
-	if (ret)
-		return ret;
-
-	table->UvdBootLevel = 0;
-	table->VceBootLevel = 0;
-	table->AcpBootLevel = 0;
-	table->SamuBootLevel = 0;
-	table->GraphicsBootLevel = 0;
-	table->MemoryBootLevel = 0;
-
-	ret = ci_find_boot_level(&pi->dpm_table.sclk_table,
-				 pi->vbios_boot_state.sclk_bootup_value,
-				 (u32 *)&pi->smc_state_table.GraphicsBootLevel);
-
-	ret = ci_find_boot_level(&pi->dpm_table.mclk_table,
-				 pi->vbios_boot_state.mclk_bootup_value,
-				 (u32 *)&pi->smc_state_table.MemoryBootLevel);
-
-	table->BootVddc = pi->vbios_boot_state.vddc_bootup_value;
-	table->BootVddci = pi->vbios_boot_state.vddci_bootup_value;
-	table->BootMVdd = pi->vbios_boot_state.mvdd_bootup_value;
-
-	ci_populate_smc_initial_state(adev, amdgpu_boot_state);
-
-	ret = ci_populate_bapm_parameters_in_dpm_table(adev);
-	if (ret)
-		return ret;
-
-	table->UVDInterval = 1;
-	table->VCEInterval = 1;
-	table->ACPInterval = 1;
-	table->SAMUInterval = 1;
-	table->GraphicsVoltageChangeEnable = 1;
-	table->GraphicsThermThrottleEnable = 1;
-	table->GraphicsInterval = 1;
-	table->VoltageInterval = 1;
-	table->ThermalInterval = 1;
-	table->TemperatureLimitHigh = (u16)((pi->thermal_temp_setting.temperature_high *
-					     CISLANDS_Q88_FORMAT_CONVERSION_UNIT) / 1000);
-	table->TemperatureLimitLow = (u16)((pi->thermal_temp_setting.temperature_low *
-					    CISLANDS_Q88_FORMAT_CONVERSION_UNIT) / 1000);
-	table->MemoryVoltageChangeEnable = 1;
-	table->MemoryInterval = 1;
-	table->VoltageResponseTime = 0;
-	table->VddcVddciDelta = 4000;
-	table->PhaseResponseTime = 0;
-	table->MemoryThermThrottleEnable = 1;
-	table->PCIeBootLinkLevel = pi->dpm_table.pcie_speed_table.count - 1;
-	table->PCIeGenInterval = 1;
-	if (pi->voltage_control == CISLANDS_VOLTAGE_CONTROL_BY_SVID2)
-		table->SVI2Enable = 1;
-	else
-		table->SVI2Enable = 0;
-
-	table->ThermGpio = 17;
-	table->SclkStepSize = 0x4000;
-
-	table->SystemFlags = cpu_to_be32(table->SystemFlags);
-	table->SmioMaskVddcVid = cpu_to_be32(table->SmioMaskVddcVid);
-	table->SmioMaskVddcPhase = cpu_to_be32(table->SmioMaskVddcPhase);
-	table->SmioMaskVddciVid = cpu_to_be32(table->SmioMaskVddciVid);
-	table->SmioMaskMvddVid = cpu_to_be32(table->SmioMaskMvddVid);
-	table->SclkStepSize = cpu_to_be32(table->SclkStepSize);
-	table->TemperatureLimitHigh = cpu_to_be16(table->TemperatureLimitHigh);
-	table->TemperatureLimitLow = cpu_to_be16(table->TemperatureLimitLow);
-	table->VddcVddciDelta = cpu_to_be16(table->VddcVddciDelta);
-	table->VoltageResponseTime = cpu_to_be16(table->VoltageResponseTime);
-	table->PhaseResponseTime = cpu_to_be16(table->PhaseResponseTime);
-	table->BootVddc = cpu_to_be16(table->BootVddc * VOLTAGE_SCALE);
-	table->BootVddci = cpu_to_be16(table->BootVddci * VOLTAGE_SCALE);
-	table->BootMVdd = cpu_to_be16(table->BootMVdd * VOLTAGE_SCALE);
-
-	ret = amdgpu_ci_copy_bytes_to_smc(adev,
-				   pi->dpm_table_start +
-				   offsetof(SMU7_Discrete_DpmTable, SystemFlags),
-				   (u8 *)&table->SystemFlags,
-				   sizeof(SMU7_Discrete_DpmTable) - 3 * sizeof(SMU7_PIDController),
-				   pi->sram_end);
-	if (ret)
-		return ret;
-
-	return 0;
-}
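
The final upload in ci_init_smc_table() deliberately stops short of the trailing PID controller blocks: it copies from the SystemFlags offset for sizeof(table) minus three controllers. A sketch of that partial-struct copy pattern with toy types (the real SMU7_Discrete_DpmTable layout is only mirrored loosely here):

#include <stdint.h>
#include <string.h>

struct pid_demo { uint32_t kp, ki, kd; };
struct dpm_table_demo {
	uint32_t SystemFlags;		/* first field, as in the SMC table */
	uint32_t other_fields[8];
	struct pid_demo pid[3];		/* not uploaded by this path */
};

static void upload_demo(uint8_t *smc_dst, const struct dpm_table_demo *t)
{
	size_t len = sizeof(*t) - 3 * sizeof(struct pid_demo);
	/* Copy everything from SystemFlags up to, but not including, the PIDs. */
	memcpy(smc_dst, (const uint8_t *)&t->SystemFlags, len);
}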
-
-static void ci_trim_single_dpm_states(struct amdgpu_device *adev,
-				      struct ci_single_dpm_table *dpm_table,
-				      u32 low_limit, u32 high_limit)
-{
-	u32 i;
-
-	for (i = 0; i < dpm_table->count; i++) {
-		if ((dpm_table->dpm_levels[i].value < low_limit) ||
-		    (dpm_table->dpm_levels[i].value > high_limit))
-			dpm_table->dpm_levels[i].enabled = false;
-		else
-			dpm_table->dpm_levels[i].enabled = true;
-	}
-}
-
-static void ci_trim_pcie_dpm_states(struct amdgpu_device *adev,
-				    u32 speed_low, u32 lanes_low,
-				    u32 speed_high, u32 lanes_high)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_single_dpm_table *pcie_table = &pi->dpm_table.pcie_speed_table;
-	u32 i, j;
-
-	for (i = 0; i < pcie_table->count; i++) {
-		if ((pcie_table->dpm_levels[i].value < speed_low) ||
-		    (pcie_table->dpm_levels[i].param1 < lanes_low) ||
-		    (pcie_table->dpm_levels[i].value > speed_high) ||
-		    (pcie_table->dpm_levels[i].param1 > lanes_high))
-			pcie_table->dpm_levels[i].enabled = false;
-		else
-			pcie_table->dpm_levels[i].enabled = true;
-	}
-
-	for (i = 0; i < pcie_table->count; i++) {
-		if (pcie_table->dpm_levels[i].enabled) {
-			for (j = i + 1; j < pcie_table->count; j++) {
-				if (pcie_table->dpm_levels[j].enabled) {
-					if ((pcie_table->dpm_levels[i].value == pcie_table->dpm_levels[j].value) &&
-					    (pcie_table->dpm_levels[i].param1 == pcie_table->dpm_levels[j].param1))
-						pcie_table->dpm_levels[j].enabled = false;
-				}
-			}
-		}
-	}
-}
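
The second loop in ci_trim_pcie_dpm_states() is a quadratic pass that disables any later level duplicating an earlier enabled (speed, lanes) pair. Extracted as a generic sketch:

#include <stdbool.h>
#include <stdint.h>

struct link_demo { uint32_t speed, lanes; bool enabled; };

static void dedup_links(struct link_demo *lv, uint32_t count)
{
	for (uint32_t i = 0; i < count; i++) {
		if (!lv[i].enabled)
			continue;
		for (uint32_t j = i + 1; j < count; j++)
			if (lv[j].enabled &&
			    lv[j].speed == lv[i].speed &&
			    lv[j].lanes == lv[i].lanes)
				lv[j].enabled = false;	/* keep first occurrence */
	}
}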
-
-static int ci_trim_dpm_states(struct amdgpu_device *adev,
-			      struct amdgpu_ps *amdgpu_state)
-{
-	struct ci_ps *state = ci_get_ps(amdgpu_state);
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32 high_limit_count;
-
-	if (state->performance_level_count < 1)
-		return -EINVAL;
-
-	if (state->performance_level_count == 1)
-		high_limit_count = 0;
-	else
-		high_limit_count = 1;
-
-	ci_trim_single_dpm_states(adev,
-				  &pi->dpm_table.sclk_table,
-				  state->performance_levels[0].sclk,
-				  state->performance_levels[high_limit_count].sclk);
-
-	ci_trim_single_dpm_states(adev,
-				  &pi->dpm_table.mclk_table,
-				  state->performance_levels[0].mclk,
-				  state->performance_levels[high_limit_count].mclk);
-
-	ci_trim_pcie_dpm_states(adev,
-				state->performance_levels[0].pcie_gen,
-				state->performance_levels[0].pcie_lane,
-				state->performance_levels[high_limit_count].pcie_gen,
-				state->performance_levels[high_limit_count].pcie_lane);
-
-	return 0;
-}
-
-static int ci_apply_disp_minimum_voltage_request(struct amdgpu_device *adev)
-{
-	struct amdgpu_clock_voltage_dependency_table *disp_voltage_table =
-		&adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk;
-	struct amdgpu_clock_voltage_dependency_table *vddc_table =
-		&adev->pm.dpm.dyn_state.vddc_dependency_on_sclk;
-	u32 requested_voltage = 0;
-	u32 i;
-
-	if (disp_voltage_table == NULL)
-		return -EINVAL;
-	if (!disp_voltage_table->count)
-		return -EINVAL;
-
-	for (i = 0; i < disp_voltage_table->count; i++) {
-		if (adev->clock.current_dispclk == disp_voltage_table->entries[i].clk)
-			requested_voltage = disp_voltage_table->entries[i].v;
-	}
-
-	for (i = 0; i < vddc_table->count; i++) {
-		if (requested_voltage <= vddc_table->entries[i].v) {
-			requested_voltage = vddc_table->entries[i].v;
-			return (amdgpu_ci_send_msg_to_smc_with_parameter(adev,
-								  PPSMC_MSG_VddC_Request,
-								  requested_voltage * VOLTAGE_SCALE) == PPSMC_Result_OK) ?
-				0 : -EINVAL;
-		}
-	}
-
-	return -EINVAL;
-}
-
-static int ci_upload_dpm_level_enable_mask(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	PPSMC_Result result;
-
-	ci_apply_disp_minimum_voltage_request(adev);
-
-	if (!pi->sclk_dpm_key_disabled) {
-		if (pi->dpm_level_enable_mask.sclk_dpm_enable_mask) {
-			result = amdgpu_ci_send_msg_to_smc_with_parameter(adev,
-								   PPSMC_MSG_SCLKDPM_SetEnabledMask,
-								   pi->dpm_level_enable_mask.sclk_dpm_enable_mask);
-			if (result != PPSMC_Result_OK)
-				return -EINVAL;
-		}
-	}
-
-	if (!pi->mclk_dpm_key_disabled) {
-		if (pi->dpm_level_enable_mask.mclk_dpm_enable_mask) {
-			result = amdgpu_ci_send_msg_to_smc_with_parameter(adev,
-								   PPSMC_MSG_MCLKDPM_SetEnabledMask,
-								   pi->dpm_level_enable_mask.mclk_dpm_enable_mask);
-			if (result != PPSMC_Result_OK)
-				return -EINVAL;
-		}
-	}
-
-#if 0
-	if (!pi->pcie_dpm_key_disabled) {
-		if (pi->dpm_level_enable_mask.pcie_dpm_enable_mask) {
-			result = amdgpu_ci_send_msg_to_smc_with_parameter(adev,
-								   PPSMC_MSG_PCIeDPM_SetEnabledMask,
-								   pi->dpm_level_enable_mask.pcie_dpm_enable_mask);
-			if (result != PPSMC_Result_OK)
-				return -EINVAL;
-		}
-	}
-#endif
-
-	return 0;
-}
-
-static void ci_find_dpm_states_clocks_in_dpm_table(struct amdgpu_device *adev,
-						   struct amdgpu_ps *amdgpu_state)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_ps *state = ci_get_ps(amdgpu_state);
-	struct ci_single_dpm_table *sclk_table = &pi->dpm_table.sclk_table;
-	u32 sclk = state->performance_levels[state->performance_level_count-1].sclk;
-	struct ci_single_dpm_table *mclk_table = &pi->dpm_table.mclk_table;
-	u32 mclk = state->performance_levels[state->performance_level_count-1].mclk;
-	u32 i;
-
-	pi->need_update_smu7_dpm_table = 0;
-
-	for (i = 0; i < sclk_table->count; i++) {
-		if (sclk == sclk_table->dpm_levels[i].value)
-			break;
-	}
-
-	if (i >= sclk_table->count) {
-		pi->need_update_smu7_dpm_table |= DPMTABLE_OD_UPDATE_SCLK;
-	} else {
-		/* XXX check display min clock requirements; this constant
-		 * self-comparison is always false, so DPMTABLE_UPDATE_SCLK
-		 * is never set here.
-		 */
-		if (CISLAND_MINIMUM_ENGINE_CLOCK != CISLAND_MINIMUM_ENGINE_CLOCK)
-			pi->need_update_smu7_dpm_table |= DPMTABLE_UPDATE_SCLK;
-	}
-
-	for (i = 0; i < mclk_table->count; i++) {
-		if (mclk == mclk_table->dpm_levels[i].value)
-			break;
-	}
-
-	if (i >= mclk_table->count)
-		pi->need_update_smu7_dpm_table |= DPMTABLE_OD_UPDATE_MCLK;
-
-	if (adev->pm.dpm.current_active_crtc_count !=
-	    adev->pm.dpm.new_active_crtc_count)
-		pi->need_update_smu7_dpm_table |= DPMTABLE_UPDATE_MCLK;
-}
-
-static int ci_populate_and_upload_sclk_mclk_dpm_levels(struct amdgpu_device *adev,
-						       struct amdgpu_ps *amdgpu_state)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_ps *state = ci_get_ps(amdgpu_state);
-	u32 sclk = state->performance_levels[state->performance_level_count-1].sclk;
-	u32 mclk = state->performance_levels[state->performance_level_count-1].mclk;
-	struct ci_dpm_table *dpm_table = &pi->dpm_table;
-	int ret;
-
-	if (!pi->need_update_smu7_dpm_table)
-		return 0;
-
-	if (pi->need_update_smu7_dpm_table & DPMTABLE_OD_UPDATE_SCLK)
-		dpm_table->sclk_table.dpm_levels[dpm_table->sclk_table.count-1].value = sclk;
-
-	if (pi->need_update_smu7_dpm_table & DPMTABLE_OD_UPDATE_MCLK)
-		dpm_table->mclk_table.dpm_levels[dpm_table->mclk_table.count-1].value = mclk;
-
-	if (pi->need_update_smu7_dpm_table & (DPMTABLE_OD_UPDATE_SCLK | DPMTABLE_UPDATE_SCLK)) {
-		ret = ci_populate_all_graphic_levels(adev);
-		if (ret)
-			return ret;
-	}
-
-	if (pi->need_update_smu7_dpm_table & (DPMTABLE_OD_UPDATE_MCLK | DPMTABLE_UPDATE_MCLK)) {
-		ret = ci_populate_all_memory_levels(adev);
-		if (ret)
-			return ret;
-	}
-
-	return 0;
-}
-
-static int ci_enable_uvd_dpm(struct amdgpu_device *adev, bool enable)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	const struct amdgpu_clock_and_voltage_limits *max_limits;
-	int i;
-
-	if (adev->pm.ac_power)
-		max_limits = &adev->pm.dpm.dyn_state.max_clock_voltage_on_ac;
-	else
-		max_limits = &adev->pm.dpm.dyn_state.max_clock_voltage_on_dc;
-
-	if (enable) {
-		pi->dpm_level_enable_mask.uvd_dpm_enable_mask = 0;
-
-		for (i = adev->pm.dpm.dyn_state.uvd_clock_voltage_dependency_table.count - 1; i >= 0; i--) {
-			if (adev->pm.dpm.dyn_state.uvd_clock_voltage_dependency_table.entries[i].v <= max_limits->vddc) {
-				pi->dpm_level_enable_mask.uvd_dpm_enable_mask |= 1 << i;
-
-				if (!pi->caps_uvd_dpm)
-					break;
-			}
-		}
-
-		amdgpu_ci_send_msg_to_smc_with_parameter(adev,
-						  PPSMC_MSG_UVDDPM_SetEnabledMask,
-						  pi->dpm_level_enable_mask.uvd_dpm_enable_mask);
-
-		if (pi->last_mclk_dpm_enable_mask & 0x1) {
-			pi->uvd_enabled = true;
-			pi->dpm_level_enable_mask.mclk_dpm_enable_mask &= 0xFFFFFFFE;
-			amdgpu_ci_send_msg_to_smc_with_parameter(adev,
-							  PPSMC_MSG_MCLKDPM_SetEnabledMask,
-							  pi->dpm_level_enable_mask.mclk_dpm_enable_mask);
-		}
-	} else {
-		if (pi->uvd_enabled) {
-			pi->uvd_enabled = false;
-			pi->dpm_level_enable_mask.mclk_dpm_enable_mask |= 1;
-			amdgpu_ci_send_msg_to_smc_with_parameter(adev,
-							  PPSMC_MSG_MCLKDPM_SetEnabledMask,
-							  pi->dpm_level_enable_mask.mclk_dpm_enable_mask);
-		}
-	}
-
-	return (amdgpu_ci_send_msg_to_smc(adev, enable ?
-				   PPSMC_MSG_UVDDPM_Enable : PPSMC_MSG_UVDDPM_Disable) == PPSMC_Result_OK) ?
-		0 : -EINVAL;
-}
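
The UVD and VCE enable paths build their level masks from the top down, setting a bit for each level whose voltage fits within the AC/DC limit and stopping after a single level when per-level DPM is not supported. A sketch of that mask construction:

#include <stdbool.h>
#include <stdint.h>

static uint32_t build_enable_mask(const uint16_t *level_mv, int count,
				  uint16_t max_mv, bool per_level_dpm)
{
	uint32_t mask = 0;

	for (int i = count - 1; i >= 0; i--) {
		if (level_mv[i] <= max_mv) {
			mask |= 1u << i;
			if (!per_level_dpm)
				break;	/* only the highest level that fits */
		}
	}
	return mask;
}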
-
-static int ci_enable_vce_dpm(struct amdgpu_device *adev, bool enable)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	const struct amdgpu_clock_and_voltage_limits *max_limits;
-	int i;
-
-	if (adev->pm.ac_power)
-		max_limits = &adev->pm.dpm.dyn_state.max_clock_voltage_on_ac;
-	else
-		max_limits = &adev->pm.dpm.dyn_state.max_clock_voltage_on_dc;
-
-	if (enable) {
-		pi->dpm_level_enable_mask.vce_dpm_enable_mask = 0;
-		for (i = adev->pm.dpm.dyn_state.vce_clock_voltage_dependency_table.count - 1; i >= 0; i--) {
-			if (adev->pm.dpm.dyn_state.vce_clock_voltage_dependency_table.entries[i].v <= max_limits->vddc) {
-				pi->dpm_level_enable_mask.vce_dpm_enable_mask |= 1 << i;
-
-				if (!pi->caps_vce_dpm)
-					break;
-			}
-		}
-
-		amdgpu_ci_send_msg_to_smc_with_parameter(adev,
-						  PPSMC_MSG_VCEDPM_SetEnabledMask,
-						  pi->dpm_level_enable_mask.vce_dpm_enable_mask);
-	}
-
-	return (amdgpu_ci_send_msg_to_smc(adev, enable ?
-				   PPSMC_MSG_VCEDPM_Enable : PPSMC_MSG_VCEDPM_Disable) == PPSMC_Result_OK) ?
-		0 : -EINVAL;
-}
-
-#if 0
-static int ci_enable_samu_dpm(struct amdgpu_device *adev, bool enable)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	const struct amdgpu_clock_and_voltage_limits *max_limits;
-	int i;
-
-	if (adev->pm.ac_power)
-		max_limits = &adev->pm.dpm.dyn_state.max_clock_voltage_on_ac;
-	else
-		max_limits = &adev->pm.dpm.dyn_state.max_clock_voltage_on_dc;
-
-	if (enable) {
-		pi->dpm_level_enable_mask.samu_dpm_enable_mask = 0;
-		for (i = adev->pm.dpm.dyn_state.samu_clock_voltage_dependency_table.count - 1; i >= 0; i--) {
-			if (adev->pm.dpm.dyn_state.samu_clock_voltage_dependency_table.entries[i].v <= max_limits->vddc) {
-				pi->dpm_level_enable_mask.samu_dpm_enable_mask |= 1 << i;
-
-				if (!pi->caps_samu_dpm)
-					break;
-			}
-		}
-
-		amdgpu_ci_send_msg_to_smc_with_parameter(adev,
-						  PPSMC_MSG_SAMUDPM_SetEnabledMask,
-						  pi->dpm_level_enable_mask.samu_dpm_enable_mask);
-	}
-	return (amdgpu_ci_send_msg_to_smc(adev, enable ?
-				   PPSMC_MSG_SAMUDPM_Enable : PPSMC_MSG_SAMUDPM_Disable) == PPSMC_Result_OK) ?
-		0 : -EINVAL;
-}
-
-static int ci_enable_acp_dpm(struct amdgpu_device *adev, bool enable)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	const struct amdgpu_clock_and_voltage_limits *max_limits;
-	int i;
-
-	if (adev->pm.ac_power)
-		max_limits = &adev->pm.dpm.dyn_state.max_clock_voltage_on_ac;
-	else
-		max_limits = &adev->pm.dpm.dyn_state.max_clock_voltage_on_dc;
-
-	if (enable) {
-		pi->dpm_level_enable_mask.acp_dpm_enable_mask = 0;
-		for (i = adev->pm.dpm.dyn_state.acp_clock_voltage_dependency_table.count - 1; i >= 0; i--) {
-			if (adev->pm.dpm.dyn_state.acp_clock_voltage_dependency_table.entries[i].v <= max_limits->vddc) {
-				pi->dpm_level_enable_mask.acp_dpm_enable_mask |= 1 << i;
-
-				if (!pi->caps_acp_dpm)
-					break;
-			}
-		}
-
-		amdgpu_ci_send_msg_to_smc_with_parameter(adev,
-						  PPSMC_MSG_ACPDPM_SetEnabledMask,
-						  pi->dpm_level_enable_mask.acp_dpm_enable_mask);
-	}
-
-	return (amdgpu_ci_send_msg_to_smc(adev, enable ?
-				   PPSMC_MSG_ACPDPM_Enable : PPSMC_MSG_ACPDPM_Disable) == PPSMC_Result_OK) ?
-		0 : -EINVAL;
-}
-#endif
-
-static int ci_update_uvd_dpm(struct amdgpu_device *adev, bool gate)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32 tmp;
-	int ret = 0;
-
-	if (!gate) {
-		/* turn the clocks on when decoding */
-		if (pi->caps_uvd_dpm ||
-		    (adev->pm.dpm.dyn_state.uvd_clock_voltage_dependency_table.count <= 0))
-			pi->smc_state_table.UvdBootLevel = 0;
-		else
-			pi->smc_state_table.UvdBootLevel =
-				adev->pm.dpm.dyn_state.uvd_clock_voltage_dependency_table.count - 1;
-
-		tmp = RREG32_SMC(ixDPM_TABLE_475);
-		tmp &= ~DPM_TABLE_475__UvdBootLevel_MASK;
-		tmp |= (pi->smc_state_table.UvdBootLevel << DPM_TABLE_475__UvdBootLevel__SHIFT);
-		WREG32_SMC(ixDPM_TABLE_475, tmp);
-		ret = ci_enable_uvd_dpm(adev, true);
-	} else {
-		ret = ci_enable_uvd_dpm(adev, false);
-		if (ret)
-			return ret;
-	}
-
-	return ret;
-}
-
-static u8 ci_get_vce_boot_level(struct amdgpu_device *adev)
-{
-	u8 i;
-	u32 min_evclk = 30000; /* ??? */
-	struct amdgpu_vce_clock_voltage_dependency_table *table =
-		&adev->pm.dpm.dyn_state.vce_clock_voltage_dependency_table;
-
-	for (i = 0; i < table->count; i++) {
-		if (table->entries[i].evclk >= min_evclk)
-			return i;
-	}
-
-	return table->count - 1;
-}
-
-static int ci_update_vce_dpm(struct amdgpu_device *adev,
-			     struct amdgpu_ps *amdgpu_new_state,
-			     struct amdgpu_ps *amdgpu_current_state)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	int ret = 0;
-	u32 tmp;
-
-	if (amdgpu_current_state->evclk != amdgpu_new_state->evclk) {
-		if (amdgpu_new_state->evclk) {
-			pi->smc_state_table.VceBootLevel = ci_get_vce_boot_level(adev);
-			tmp = RREG32_SMC(ixDPM_TABLE_475);
-			tmp &= ~DPM_TABLE_475__VceBootLevel_MASK;
-			tmp |= (pi->smc_state_table.VceBootLevel << DPM_TABLE_475__VceBootLevel__SHIFT);
-			WREG32_SMC(ixDPM_TABLE_475, tmp);
-
-			ret = ci_enable_vce_dpm(adev, true);
-		} else {
-			ret = ci_enable_vce_dpm(adev, false);
-			if (ret)
-				return ret;
-		}
-	}
-	return ret;
-}
-
-#if 0
-static int ci_update_samu_dpm(struct amdgpu_device *adev, bool gate)
-{
-	return ci_enable_samu_dpm(adev, gate);
-}
-
-static int ci_update_acp_dpm(struct amdgpu_device *adev, bool gate)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32 tmp;
-
-	if (!gate) {
-		pi->smc_state_table.AcpBootLevel = 0;
-
-		tmp = RREG32_SMC(ixDPM_TABLE_475);
-		tmp &= ~AcpBootLevel_MASK;
-		tmp |= AcpBootLevel(pi->smc_state_table.AcpBootLevel);
-		WREG32_SMC(ixDPM_TABLE_475, tmp);
-	}
-
-	return ci_enable_acp_dpm(adev, !gate);
-}
-#endif
-
-static int ci_generate_dpm_level_enable_mask(struct amdgpu_device *adev,
-					     struct amdgpu_ps *amdgpu_state)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	int ret;
-
-	ret = ci_trim_dpm_states(adev, amdgpu_state);
-	if (ret)
-		return ret;
-
-	pi->dpm_level_enable_mask.sclk_dpm_enable_mask =
-		ci_get_dpm_level_enable_mask_value(&pi->dpm_table.sclk_table);
-	pi->dpm_level_enable_mask.mclk_dpm_enable_mask =
-		ci_get_dpm_level_enable_mask_value(&pi->dpm_table.mclk_table);
-	pi->last_mclk_dpm_enable_mask =
-		pi->dpm_level_enable_mask.mclk_dpm_enable_mask;
-	if (pi->uvd_enabled) {
-		if (pi->dpm_level_enable_mask.mclk_dpm_enable_mask & 1)
-			pi->dpm_level_enable_mask.mclk_dpm_enable_mask &= 0xFFFFFFFE;
-	}
-	pi->dpm_level_enable_mask.pcie_dpm_enable_mask =
-		ci_get_dpm_level_enable_mask_value(&pi->dpm_table.pcie_speed_table);
-
-	return 0;
-}
-
-static u32 ci_get_lowest_enabled_level(struct amdgpu_device *adev,
-				       u32 level_mask)
-{
-	u32 level = 0;
-
-	while ((level_mask & (1 << level)) == 0)
-		level++;
-
-	return level;
-}
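
ci_get_lowest_enabled_level() spins forever if the mask is zero, which is why every caller checks the mask before calling it. With the GCC/Clang count-trailing-zeros builtin the same lookup can be written without that hazard; a sketch:

#include <stdint.h>

static uint32_t lowest_enabled_level_demo(uint32_t mask)
{
	if (!mask)
		return 0;	/* caller bug; the driver relies on a non-zero mask */
	return (uint32_t)__builtin_ctz(mask);	/* index of lowest set bit */
}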
-
-static int ci_dpm_force_performance_level(void *handle,
-					  enum amd_dpm_forced_level level)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32 tmp, levels, i;
-	int ret;
-
-	if (level == AMD_DPM_FORCED_LEVEL_HIGH) {
-		if ((!pi->pcie_dpm_key_disabled) &&
-		    pi->dpm_level_enable_mask.pcie_dpm_enable_mask) {
-			levels = 0;
-			tmp = pi->dpm_level_enable_mask.pcie_dpm_enable_mask;
-			while (tmp >>= 1)
-				levels++;
-			if (levels) {
-				ret = ci_dpm_force_state_pcie(adev, levels);
-				if (ret)
-					return ret;
-				for (i = 0; i < adev->usec_timeout; i++) {
-					tmp = (RREG32_SMC(ixTARGET_AND_CURRENT_PROFILE_INDEX_1) &
-					TARGET_AND_CURRENT_PROFILE_INDEX_1__CURR_PCIE_INDEX_MASK) >>
-					TARGET_AND_CURRENT_PROFILE_INDEX_1__CURR_PCIE_INDEX__SHIFT;
-					if (tmp == levels)
-						break;
-					udelay(1);
-				}
-			}
-		}
-		if ((!pi->sclk_dpm_key_disabled) &&
-		    pi->dpm_level_enable_mask.sclk_dpm_enable_mask) {
-			levels = 0;
-			tmp = pi->dpm_level_enable_mask.sclk_dpm_enable_mask;
-			while (tmp >>= 1)
-				levels++;
-			if (levels) {
-				ret = ci_dpm_force_state_sclk(adev, levels);
-				if (ret)
-					return ret;
-				for (i = 0; i < adev->usec_timeout; i++) {
-					tmp = (RREG32_SMC(ixTARGET_AND_CURRENT_PROFILE_INDEX) &
-					TARGET_AND_CURRENT_PROFILE_INDEX__CURR_SCLK_INDEX_MASK) >>
-					TARGET_AND_CURRENT_PROFILE_INDEX__CURR_SCLK_INDEX__SHIFT;
-					if (tmp == levels)
-						break;
-					udelay(1);
-				}
-			}
-		}
-		if ((!pi->mclk_dpm_key_disabled) &&
-		    pi->dpm_level_enable_mask.mclk_dpm_enable_mask) {
-			levels = 0;
-			tmp = pi->dpm_level_enable_mask.mclk_dpm_enable_mask;
-			while (tmp >>= 1)
-				levels++;
-			if (levels) {
-				ret = ci_dpm_force_state_mclk(adev, levels);
-				if (ret)
-					return ret;
-				for (i = 0; i < adev->usec_timeout; i++) {
-					tmp = (RREG32_SMC(ixTARGET_AND_CURRENT_PROFILE_INDEX) &
-					TARGET_AND_CURRENT_PROFILE_INDEX__CURR_MCLK_INDEX_MASK) >>
-					TARGET_AND_CURRENT_PROFILE_INDEX__CURR_MCLK_INDEX__SHIFT;
-					if (tmp == levels)
-						break;
-					udelay(1);
-				}
-			}
-		}
-	} else if (level == AMD_DPM_FORCED_LEVEL_LOW) {
-		if ((!pi->sclk_dpm_key_disabled) &&
-		    pi->dpm_level_enable_mask.sclk_dpm_enable_mask) {
-			levels = ci_get_lowest_enabled_level(adev,
-							     pi->dpm_level_enable_mask.sclk_dpm_enable_mask);
-			ret = ci_dpm_force_state_sclk(adev, levels);
-			if (ret)
-				return ret;
-			for (i = 0; i < adev->usec_timeout; i++) {
-				tmp = (RREG32_SMC(ixTARGET_AND_CURRENT_PROFILE_INDEX) &
-				TARGET_AND_CURRENT_PROFILE_INDEX__CURR_SCLK_INDEX_MASK) >>
-				TARGET_AND_CURRENT_PROFILE_INDEX__CURR_SCLK_INDEX__SHIFT;
-				if (tmp == levels)
-					break;
-				udelay(1);
-			}
-		}
-		if ((!pi->mclk_dpm_key_disabled) &&
-		    pi->dpm_level_enable_mask.mclk_dpm_enable_mask) {
-			levels = ci_get_lowest_enabled_level(adev,
-							     pi->dpm_level_enable_mask.mclk_dpm_enable_mask);
-			ret = ci_dpm_force_state_mclk(adev, levels);
-			if (ret)
-				return ret;
-			for (i = 0; i < adev->usec_timeout; i++) {
-				tmp = (RREG32_SMC(ixTARGET_AND_CURRENT_PROFILE_INDEX) &
-				TARGET_AND_CURRENT_PROFILE_INDEX__CURR_MCLK_INDEX_MASK) >>
-				TARGET_AND_CURRENT_PROFILE_INDEX__CURR_MCLK_INDEX__SHIFT;
-				if (tmp == levels)
-					break;
-				udelay(1);
-			}
-		}
-		if ((!pi->pcie_dpm_key_disabled) &&
-		    pi->dpm_level_enable_mask.pcie_dpm_enable_mask) {
-			levels = ci_get_lowest_enabled_level(adev,
-							     pi->dpm_level_enable_mask.pcie_dpm_enable_mask);
-			ret = ci_dpm_force_state_pcie(adev, levels);
-			if (ret)
-				return ret;
-			for (i = 0; i < adev->usec_timeout; i++) {
-				tmp = (RREG32_SMC(ixTARGET_AND_CURRENT_PROFILE_INDEX_1) &
-				TARGET_AND_CURRENT_PROFILE_INDEX_1__CURR_PCIE_INDEX_MASK) >>
-				TARGET_AND_CURRENT_PROFILE_INDEX_1__CURR_PCIE_INDEX__SHIFT;
-				if (tmp == levels)
-					break;
-				udelay(1);
-			}
-		}
-	} else if (level == AMD_DPM_FORCED_LEVEL_AUTO) {
-		if (!pi->pcie_dpm_key_disabled) {
-			PPSMC_Result smc_result;
-
-			smc_result = amdgpu_ci_send_msg_to_smc(adev,
-							       PPSMC_MSG_PCIeDPM_UnForceLevel);
-			if (smc_result != PPSMC_Result_OK)
-				return -EINVAL;
-		}
-		ret = ci_upload_dpm_level_enable_mask(adev);
-		if (ret)
-			return ret;
-	}
-
-	adev->pm.dpm.forced_level = level;
-
-	return 0;
-}
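
In the FORCED_LEVEL_HIGH branch above, the idiom `levels = 0; tmp = mask; while (tmp >>= 1) levels++;` computes the index of the highest set bit of the enable mask, i.e. fls(mask) - 1, which the subsequent udelay poll loops then compare against the index the SMC reports. An equivalent sketch using a GCC/Clang builtin:

#include <stdint.h>

static uint32_t highest_enabled_level_demo(uint32_t mask)
{
	if (!mask)
		return 0;
	/* 31 - clz(mask) == index of highest set bit == fls(mask) - 1 */
	return 31u - (uint32_t)__builtin_clz(mask);
}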
-
-static int ci_set_mc_special_registers(struct amdgpu_device *adev,
-				       struct ci_mc_reg_table *table)
-{
-	u8 i, j, k;
-	u32 temp_reg;
-
-	for (i = 0, j = table->last; i < table->last; i++) {
-		if (j >= SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE)
-			return -EINVAL;
-		switch (table->mc_reg_address[i].s1) {
-		case mmMC_SEQ_MISC1:
-			temp_reg = RREG32(mmMC_PMG_CMD_EMRS);
-			table->mc_reg_address[j].s1 = mmMC_PMG_CMD_EMRS;
-			table->mc_reg_address[j].s0 = mmMC_SEQ_PMG_CMD_EMRS_LP;
-			for (k = 0; k < table->num_entries; k++) {
-				table->mc_reg_table_entry[k].mc_data[j] =
-					(temp_reg & 0xffff0000) | ((table->mc_reg_table_entry[k].mc_data[i] & 0xffff0000) >> 16);
-			}
-			j++;
-
-			if (j >= SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE)
-				return -EINVAL;
-			temp_reg = RREG32(mmMC_PMG_CMD_MRS);
-			table->mc_reg_address[j].s1 = mmMC_PMG_CMD_MRS;
-			table->mc_reg_address[j].s0 = mmMC_SEQ_PMG_CMD_MRS_LP;
-			for (k = 0; k < table->num_entries; k++) {
-				table->mc_reg_table_entry[k].mc_data[j] =
-					(temp_reg & 0xffff0000) | (table->mc_reg_table_entry[k].mc_data[i] & 0x0000ffff);
-				if (adev->gmc.vram_type != AMDGPU_VRAM_TYPE_GDDR5)
-					table->mc_reg_table_entry[k].mc_data[j] |= 0x100;
-			}
-			j++;
-
-			if (adev->gmc.vram_type != AMDGPU_VRAM_TYPE_GDDR5) {
-				if (j >= SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE)
-					return -EINVAL;
-				table->mc_reg_address[j].s1 = mmMC_PMG_AUTO_CMD;
-				table->mc_reg_address[j].s0 = mmMC_PMG_AUTO_CMD;
-				for (k = 0; k < table->num_entries; k++) {
-					table->mc_reg_table_entry[k].mc_data[j] =
-						(table->mc_reg_table_entry[k].mc_data[i] & 0xffff0000) >> 16;
-				}
-				j++;
-			}
-			break;
-		case mmMC_SEQ_RESERVE_M:
-			temp_reg = RREG32(mmMC_PMG_CMD_MRS1);
-			table->mc_reg_address[j].s1 = mmMC_PMG_CMD_MRS1;
-			table->mc_reg_address[j].s0 = mmMC_SEQ_PMG_CMD_MRS1_LP;
-			for (k = 0; k < table->num_entries; k++) {
-				table->mc_reg_table_entry[k].mc_data[j] =
-					(temp_reg & 0xffff0000) | (table->mc_reg_table_entry[k].mc_data[i] & 0x0000ffff);
-			}
-			j++;
-			break;
-		default:
-			break;
-		}
-
-	}
-
-	table->last = j;
-
-	return 0;
-}
-
-static bool ci_check_s0_mc_reg_index(u16 in_reg, u16 *out_reg)
-{
-	bool result = true;
-
-	switch (in_reg) {
-	case mmMC_SEQ_RAS_TIMING:
-		*out_reg = mmMC_SEQ_RAS_TIMING_LP;
-		break;
-	case mmMC_SEQ_DLL_STBY:
-		*out_reg = mmMC_SEQ_DLL_STBY_LP;
-		break;
-	case mmMC_SEQ_G5PDX_CMD0:
-		*out_reg = mmMC_SEQ_G5PDX_CMD0_LP;
-		break;
-	case mmMC_SEQ_G5PDX_CMD1:
-		*out_reg = mmMC_SEQ_G5PDX_CMD1_LP;
-		break;
-	case mmMC_SEQ_G5PDX_CTRL:
-		*out_reg = mmMC_SEQ_G5PDX_CTRL_LP;
-		break;
-	case mmMC_SEQ_CAS_TIMING:
-		*out_reg = mmMC_SEQ_CAS_TIMING_LP;
-		break;
-	case mmMC_SEQ_MISC_TIMING:
-		*out_reg = mmMC_SEQ_MISC_TIMING_LP;
-		break;
-	case mmMC_SEQ_MISC_TIMING2:
-		*out_reg = mmMC_SEQ_MISC_TIMING2_LP;
-		break;
-	case mmMC_SEQ_PMG_DVS_CMD:
-		*out_reg = mmMC_SEQ_PMG_DVS_CMD_LP;
-		break;
-	case mmMC_SEQ_PMG_DVS_CTL:
-		*out_reg = mmMC_SEQ_PMG_DVS_CTL_LP;
-		break;
-	case mmMC_SEQ_RD_CTL_D0:
-		*out_reg = mmMC_SEQ_RD_CTL_D0_LP;
-		break;
-	case mmMC_SEQ_RD_CTL_D1:
-		*out_reg = mmMC_SEQ_RD_CTL_D1_LP;
-		break;
-	case mmMC_SEQ_WR_CTL_D0:
-		*out_reg = mmMC_SEQ_WR_CTL_D0_LP;
-		break;
-	case mmMC_SEQ_WR_CTL_D1:
-		*out_reg = mmMC_SEQ_WR_CTL_D1_LP;
-		break;
-	case mmMC_PMG_CMD_EMRS:
-		*out_reg = mmMC_SEQ_PMG_CMD_EMRS_LP;
-		break;
-	case mmMC_PMG_CMD_MRS:
-		*out_reg = mmMC_SEQ_PMG_CMD_MRS_LP;
-		break;
-	case mmMC_PMG_CMD_MRS1:
-		*out_reg = mmMC_SEQ_PMG_CMD_MRS1_LP;
-		break;
-	case mmMC_SEQ_PMG_TIMING:
-		*out_reg = mmMC_SEQ_PMG_TIMING_LP;
-		break;
-	case mmMC_PMG_CMD_MRS2:
-		*out_reg = mmMC_SEQ_PMG_CMD_MRS2_LP;
-		break;
-	case mmMC_SEQ_WR_CTL_2:
-		*out_reg = mmMC_SEQ_WR_CTL_2_LP;
-		break;
-	default:
-		result = false;
-		break;
-	}
-
-	return result;
-}
-
-static void ci_set_valid_flag(struct ci_mc_reg_table *table)
-{
-	u8 i, j;
-
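-	/* flag a register slot as valid if its value varies across the entries */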
-	for (i = 0; i < table->last; i++) {
-		for (j = 1; j < table->num_entries; j++) {
-			if (table->mc_reg_table_entry[j-1].mc_data[i] !=
-			    table->mc_reg_table_entry[j].mc_data[i]) {
-				table->valid_flag |= 1 << i;
-				break;
-			}
-		}
-	}
-}
-
-static void ci_set_s0_mc_reg_index(struct ci_mc_reg_table *table)
-{
-	u32 i;
-	u16 address;
-
-	for (i = 0; i < table->last; i++) {
-		table->mc_reg_address[i].s0 =
-			ci_check_s0_mc_reg_index(table->mc_reg_address[i].s1, &address) ?
-			address : table->mc_reg_address[i].s1;
-	}
-}
-
-static int ci_copy_vbios_mc_reg_table(const struct atom_mc_reg_table *table,
-				      struct ci_mc_reg_table *ci_table)
-{
-	u8 i, j;
-
-	if (table->last > SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE)
-		return -EINVAL;
-	if (table->num_entries > MAX_AC_TIMING_ENTRIES)
-		return -EINVAL;
-
-	for (i = 0; i < table->last; i++)
-		ci_table->mc_reg_address[i].s1 = table->mc_reg_address[i].s1;
-
-	ci_table->last = table->last;
-
-	for (i = 0; i < table->num_entries; i++) {
-		ci_table->mc_reg_table_entry[i].mclk_max =
-			table->mc_reg_table_entry[i].mclk_max;
-		for (j = 0; j < table->last; j++)
-			ci_table->mc_reg_table_entry[i].mc_data[j] =
-				table->mc_reg_table_entry[i].mc_data[j];
-	}
-	ci_table->num_entries = table->num_entries;
-
-	return 0;
-}
-
-static int ci_register_patching_mc_seq(struct amdgpu_device *adev,
-				       struct ci_mc_reg_table *table)
-{
-	u8 i, k;
-	u32 tmp;
-	bool patch;
-
-	tmp = RREG32(mmMC_SEQ_MISC0);
-	patch = (tmp & 0x0000f00) == 0x300;
-
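-	/* patch memory timings at the 1250 MHz and 1375 MHz points on 0x67B0/0x67B1 boards */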
-	if (patch &&
-	    ((adev->pdev->device == 0x67B0) ||
-	     (adev->pdev->device == 0x67B1))) {
-		for (i = 0; i < table->last; i++) {
-			if (table->last >= SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE)
-				return -EINVAL;
-			switch (table->mc_reg_address[i].s1) {
-			case mmMC_SEQ_MISC1:
-				for (k = 0; k < table->num_entries; k++) {
-					if ((table->mc_reg_table_entry[k].mclk_max == 125000) ||
-					    (table->mc_reg_table_entry[k].mclk_max == 137500))
-						table->mc_reg_table_entry[k].mc_data[i] =
-							(table->mc_reg_table_entry[k].mc_data[i] & 0xFFFFFFF8) |
-							0x00000007;
-				}
-				break;
-			case mmMC_SEQ_WR_CTL_D0:
-				for (k = 0; k < table->num_entries; k++) {
-					if ((table->mc_reg_table_entry[k].mclk_max == 125000) ||
-					    (table->mc_reg_table_entry[k].mclk_max == 137500))
-						table->mc_reg_table_entry[k].mc_data[i] =
-							(table->mc_reg_table_entry[k].mc_data[i] & 0xFFFF0F00) |
-							0x0000D0DD;
-				}
-				break;
-			case mmMC_SEQ_WR_CTL_D1:
-				for (k = 0; k < table->num_entries; k++) {
-					if ((table->mc_reg_table_entry[k].mclk_max == 125000) ||
-					    (table->mc_reg_table_entry[k].mclk_max == 137500))
-						table->mc_reg_table_entry[k].mc_data[i] =
-							(table->mc_reg_table_entry[k].mc_data[i] & 0xFFFF0F00) |
-							0x0000D0DD;
-				}
-				break;
-			case mmMC_SEQ_WR_CTL_2:
-				for (k = 0; k < table->num_entries; k++) {
-					if ((table->mc_reg_table_entry[k].mclk_max == 125000) ||
-					    (table->mc_reg_table_entry[k].mclk_max == 137500))
-						table->mc_reg_table_entry[k].mc_data[i] = 0;
-				}
-				break;
-			case mmMC_SEQ_CAS_TIMING:
-				for (k = 0; k < table->num_entries; k++) {
-					if (table->mc_reg_table_entry[k].mclk_max == 125000)
-						table->mc_reg_table_entry[k].mc_data[i] =
-							(table->mc_reg_table_entry[k].mc_data[i] & 0xFFE0FE0F) |
-							0x000C0140;
-					else if (table->mc_reg_table_entry[k].mclk_max == 137500)
-						table->mc_reg_table_entry[k].mc_data[i] =
-							(table->mc_reg_table_entry[k].mc_data[i] & 0xFFE0FE0F) |
-							0x000C0150;
-				}
-				break;
-			case mmMC_SEQ_MISC_TIMING:
-				for (k = 0; k < table->num_entries; k++) {
-					if (table->mc_reg_table_entry[k].mclk_max == 125000)
-						table->mc_reg_table_entry[k].mc_data[i] =
-							(table->mc_reg_table_entry[k].mc_data[i] & 0xFFFFFFE0) |
-							0x00000030;
-					else if (table->mc_reg_table_entry[k].mclk_max == 137500)
-						table->mc_reg_table_entry[k].mc_data[i] =
-							(table->mc_reg_table_entry[k].mc_data[i] & 0xFFFFFFE0) |
-							0x00000035;
-				}
-				break;
-			default:
-				break;
-			}
-		}
-
-		WREG32(mmMC_SEQ_IO_DEBUG_INDEX, 3);
-		tmp = RREG32(mmMC_SEQ_IO_DEBUG_DATA);
-		tmp = (tmp & 0xFFF8FFFF) | (1 << 16);
-		WREG32(mmMC_SEQ_IO_DEBUG_INDEX, 3);
-		WREG32(mmMC_SEQ_IO_DEBUG_DATA, tmp);
-	}
-
-	return 0;
-}
-
-static int ci_initialize_mc_reg_table(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct atom_mc_reg_table *table;
-	struct ci_mc_reg_table *ci_table = &pi->mc_reg_table;
-	u8 module_index = ci_get_memory_module_index(adev);
-	int ret;
-
-	table = kzalloc(sizeof(struct atom_mc_reg_table), GFP_KERNEL);
-	if (!table)
-		return -ENOMEM;
-
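-	/* seed the low-power (LP) shadow registers with the current MC sequencer values */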
-	WREG32(mmMC_SEQ_RAS_TIMING_LP, RREG32(mmMC_SEQ_RAS_TIMING));
-	WREG32(mmMC_SEQ_CAS_TIMING_LP, RREG32(mmMC_SEQ_CAS_TIMING));
-	WREG32(mmMC_SEQ_DLL_STBY_LP, RREG32(mmMC_SEQ_DLL_STBY));
-	WREG32(mmMC_SEQ_G5PDX_CMD0_LP, RREG32(mmMC_SEQ_G5PDX_CMD0));
-	WREG32(mmMC_SEQ_G5PDX_CMD1_LP, RREG32(mmMC_SEQ_G5PDX_CMD1));
-	WREG32(mmMC_SEQ_G5PDX_CTRL_LP, RREG32(mmMC_SEQ_G5PDX_CTRL));
-	WREG32(mmMC_SEQ_PMG_DVS_CMD_LP, RREG32(mmMC_SEQ_PMG_DVS_CMD));
-	WREG32(mmMC_SEQ_PMG_DVS_CTL_LP, RREG32(mmMC_SEQ_PMG_DVS_CTL));
-	WREG32(mmMC_SEQ_MISC_TIMING_LP, RREG32(mmMC_SEQ_MISC_TIMING));
-	WREG32(mmMC_SEQ_MISC_TIMING2_LP, RREG32(mmMC_SEQ_MISC_TIMING2));
-	WREG32(mmMC_SEQ_PMG_CMD_EMRS_LP, RREG32(mmMC_PMG_CMD_EMRS));
-	WREG32(mmMC_SEQ_PMG_CMD_MRS_LP, RREG32(mmMC_PMG_CMD_MRS));
-	WREG32(mmMC_SEQ_PMG_CMD_MRS1_LP, RREG32(mmMC_PMG_CMD_MRS1));
-	WREG32(mmMC_SEQ_WR_CTL_D0_LP, RREG32(mmMC_SEQ_WR_CTL_D0));
-	WREG32(mmMC_SEQ_WR_CTL_D1_LP, RREG32(mmMC_SEQ_WR_CTL_D1));
-	WREG32(mmMC_SEQ_RD_CTL_D0_LP, RREG32(mmMC_SEQ_RD_CTL_D0));
-	WREG32(mmMC_SEQ_RD_CTL_D1_LP, RREG32(mmMC_SEQ_RD_CTL_D1));
-	WREG32(mmMC_SEQ_PMG_TIMING_LP, RREG32(mmMC_SEQ_PMG_TIMING));
-	WREG32(mmMC_SEQ_PMG_CMD_MRS2_LP, RREG32(mmMC_PMG_CMD_MRS2));
-	WREG32(mmMC_SEQ_WR_CTL_2_LP, RREG32(mmMC_SEQ_WR_CTL_2));
-
-	ret = amdgpu_atombios_init_mc_reg_table(adev, module_index, table);
-	if (ret)
-		goto init_mc_done;
-
-	ret = ci_copy_vbios_mc_reg_table(table, ci_table);
-	if (ret)
-		goto init_mc_done;
-
-	ci_set_s0_mc_reg_index(ci_table);
-
-	ret = ci_register_patching_mc_seq(adev, ci_table);
-	if (ret)
-		goto init_mc_done;
-
-	ret = ci_set_mc_special_registers(adev, ci_table);
-	if (ret)
-		goto init_mc_done;
-
-	ci_set_valid_flag(ci_table);
-
-init_mc_done:
-	kfree(table);
-
-	return ret;
-}
-
-static int ci_populate_mc_reg_addresses(struct amdgpu_device *adev,
-					SMU7_Discrete_MCRegisters *mc_reg_table)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32 i, j;
-
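-	/* pack only the register slots whose valid_flag bit is set */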
-	for (i = 0, j = 0; j < pi->mc_reg_table.last; j++) {
-		if (pi->mc_reg_table.valid_flag & (1 << j)) {
-			if (i >= SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE)
-				return -EINVAL;
-			mc_reg_table->address[i].s0 = cpu_to_be16(pi->mc_reg_table.mc_reg_address[j].s0);
-			mc_reg_table->address[i].s1 = cpu_to_be16(pi->mc_reg_table.mc_reg_address[j].s1);
-			i++;
-		}
-	}
-
-	mc_reg_table->last = (u8)i;
-
-	return 0;
-}
-
-static void ci_convert_mc_registers(const struct ci_mc_reg_entry *entry,
-				    SMU7_Discrete_MCRegisterSet *data,
-				    u32 num_entries, u32 valid_flag)
-{
-	u32 i, j;
-
-	for (i = 0, j = 0; j < num_entries; j++) {
-		if (valid_flag & (1 << j)) {
-			data->value[i] = cpu_to_be32(entry->mc_data[j]);
-			i++;
-		}
-	}
-}
-
-static void ci_convert_mc_reg_table_entry_to_smc(struct amdgpu_device *adev,
-						 const u32 memory_clock,
-						 SMU7_Discrete_MCRegisterSet *mc_reg_table_data)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32 i = 0;
-
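-	/* find the first entry whose mclk_max covers memory_clock, clamping to the last entry */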
-	for (i = 0; i < pi->mc_reg_table.num_entries; i++) {
-		if (memory_clock <= pi->mc_reg_table.mc_reg_table_entry[i].mclk_max)
-			break;
-	}
-
-	if ((i == pi->mc_reg_table.num_entries) && (i > 0))
-		--i;
-
-	ci_convert_mc_registers(&pi->mc_reg_table.mc_reg_table_entry[i],
-				mc_reg_table_data, pi->mc_reg_table.last,
-				pi->mc_reg_table.valid_flag);
-}
-
-static void ci_convert_mc_reg_table_to_smc(struct amdgpu_device *adev,
-					   SMU7_Discrete_MCRegisters *mc_reg_table)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	u32 i;
-
-	for (i = 0; i < pi->dpm_table.mclk_table.count; i++)
-		ci_convert_mc_reg_table_entry_to_smc(adev,
-						     pi->dpm_table.mclk_table.dpm_levels[i].value,
-						     &mc_reg_table->data[i]);
-}
-
-static int ci_populate_initial_mc_reg_table(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	int ret;
-
-	memset(&pi->smc_mc_reg_table, 0, sizeof(SMU7_Discrete_MCRegisters));
-
-	ret = ci_populate_mc_reg_addresses(adev, &pi->smc_mc_reg_table);
-	if (ret)
-		return ret;
-	ci_convert_mc_reg_table_to_smc(adev, &pi->smc_mc_reg_table);
-
-	return amdgpu_ci_copy_bytes_to_smc(adev,
-				    pi->mc_reg_table_start,
-				    (u8 *)&pi->smc_mc_reg_table,
-				    sizeof(SMU7_Discrete_MCRegisters),
-				    pi->sram_end);
-}
-
-static int ci_update_and_upload_mc_reg_table(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	if (!(pi->need_update_smu7_dpm_table & DPMTABLE_OD_UPDATE_MCLK))
-		return 0;
-
-	memset(&pi->smc_mc_reg_table, 0, sizeof(SMU7_Discrete_MCRegisters));
-
-	ci_convert_mc_reg_table_to_smc(adev, &pi->smc_mc_reg_table);
-
-	return amdgpu_ci_copy_bytes_to_smc(adev,
-				    pi->mc_reg_table_start +
-				    offsetof(SMU7_Discrete_MCRegisters, data[0]),
-				    (u8 *)&pi->smc_mc_reg_table.data[0],
-				    sizeof(SMU7_Discrete_MCRegisterSet) *
-				    pi->dpm_table.mclk_table.count,
-				    pi->sram_end);
-}
-
-static void ci_enable_voltage_control(struct amdgpu_device *adev)
-{
-	u32 tmp = RREG32_SMC(ixGENERAL_PWRMGT);
-
-	tmp |= GENERAL_PWRMGT__VOLT_PWRMGT_EN_MASK;
-	WREG32_SMC(ixGENERAL_PWRMGT, tmp);
-}
-
-static enum amdgpu_pcie_gen ci_get_maximum_link_speed(struct amdgpu_device *adev,
-						      struct amdgpu_ps *amdgpu_state)
-{
-	struct ci_ps *state = ci_get_ps(amdgpu_state);
-	int i;
-	u16 pcie_speed, max_speed = 0;
-
-	for (i = 0; i < state->performance_level_count; i++) {
-		pcie_speed = state->performance_levels[i].pcie_gen;
-		if (max_speed < pcie_speed)
-			max_speed = pcie_speed;
-	}
-
-	return max_speed;
-}
-
-static u16 ci_get_current_pcie_speed(struct amdgpu_device *adev)
-{
-	u32 speed_cntl = 0;
-
-	speed_cntl = RREG32_PCIE(ixPCIE_LC_SPEED_CNTL) &
-		PCIE_LC_SPEED_CNTL__LC_CURRENT_DATA_RATE_MASK;
-	speed_cntl >>= PCIE_LC_SPEED_CNTL__LC_CURRENT_DATA_RATE__SHIFT;
-
-	return (u16)speed_cntl;
-}
-
-static int ci_get_current_pcie_lane_number(struct amdgpu_device *adev)
-{
-	u32 link_width = 0;
-
-	link_width = RREG32_PCIE(ixPCIE_LC_LINK_WIDTH_CNTL) &
-		PCIE_LC_LINK_WIDTH_CNTL__LC_LINK_WIDTH_RD_MASK;
-	link_width >>= PCIE_LC_LINK_WIDTH_CNTL__LC_LINK_WIDTH_RD__SHIFT;
-
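-	/* translate the encoded link width field into a lane count */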
-	switch (link_width) {
-	case 1:
-		return 1;
-	case 2:
-		return 2;
-	case 3:
-		return 4;
-	case 4:
-		return 8;
-	case 0:
-	case 6:
-	default:
-		return 16;
-	}
-}
-
-static void ci_request_link_speed_change_before_state_change(struct amdgpu_device *adev,
-							     struct amdgpu_ps *amdgpu_new_state,
-							     struct amdgpu_ps *amdgpu_current_state)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	enum amdgpu_pcie_gen target_link_speed =
-		ci_get_maximum_link_speed(adev, amdgpu_new_state);
-	enum amdgpu_pcie_gen current_link_speed;
-
-	if (pi->force_pcie_gen == AMDGPU_PCIE_GEN_INVALID)
-		current_link_speed = ci_get_maximum_link_speed(adev, amdgpu_current_state);
-	else
-		current_link_speed = pi->force_pcie_gen;
-
-	pi->force_pcie_gen = AMDGPU_PCIE_GEN_INVALID;
-	pi->pspp_notify_required = false;
-	if (target_link_speed > current_link_speed) {
-		switch (target_link_speed) {
-#ifdef CONFIG_ACPI
-		case AMDGPU_PCIE_GEN3:
-			if (amdgpu_acpi_pcie_performance_request(adev, PCIE_PERF_REQ_PECI_GEN3, false) == 0)
-				break;
-			pi->force_pcie_gen = AMDGPU_PCIE_GEN2;
-			if (current_link_speed == AMDGPU_PCIE_GEN2)
-				break;
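-			/* fall through */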
-		case AMDGPU_PCIE_GEN2:
-			if (amdgpu_acpi_pcie_performance_request(adev, PCIE_PERF_REQ_PECI_GEN2, false) == 0)
-				break;
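-			/* fall through */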
-#endif
-		default:
-			pi->force_pcie_gen = ci_get_current_pcie_speed(adev);
-			break;
-		}
-	} else {
-		if (target_link_speed < current_link_speed)
-			pi->pspp_notify_required = true;
-	}
-}
-
-static void ci_notify_link_speed_change_after_state_change(struct amdgpu_device *adev,
-							   struct amdgpu_ps *amdgpu_new_state,
-							   struct amdgpu_ps *amdgpu_current_state)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	enum amdgpu_pcie_gen target_link_speed =
-		ci_get_maximum_link_speed(adev, amdgpu_new_state);
-	u8 request;
-
-	if (pi->pspp_notify_required) {
-		if (target_link_speed == AMDGPU_PCIE_GEN3)
-			request = PCIE_PERF_REQ_PECI_GEN3;
-		else if (target_link_speed == AMDGPU_PCIE_GEN2)
-			request = PCIE_PERF_REQ_PECI_GEN2;
-		else
-			request = PCIE_PERF_REQ_PECI_GEN1;
-
-		if ((request == PCIE_PERF_REQ_PECI_GEN1) &&
-		    (ci_get_current_pcie_speed(adev) > 0))
-			return;
-
-#ifdef CONFIG_ACPI
-		amdgpu_acpi_pcie_performance_request(adev, request, false);
-#endif
-	}
-}
-
-static int ci_set_private_data_variables_based_on_pptable(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct amdgpu_clock_voltage_dependency_table *allowed_sclk_vddc_table =
-		&adev->pm.dpm.dyn_state.vddc_dependency_on_sclk;
-	struct amdgpu_clock_voltage_dependency_table *allowed_mclk_vddc_table =
-		&adev->pm.dpm.dyn_state.vddc_dependency_on_mclk;
-	struct amdgpu_clock_voltage_dependency_table *allowed_mclk_vddci_table =
-		&adev->pm.dpm.dyn_state.vddci_dependency_on_mclk;
-
-	if (allowed_sclk_vddc_table == NULL)
-		return -EINVAL;
-	if (allowed_sclk_vddc_table->count < 1)
-		return -EINVAL;
-	if (allowed_mclk_vddc_table == NULL)
-		return -EINVAL;
-	if (allowed_mclk_vddc_table->count < 1)
-		return -EINVAL;
-	if (allowed_mclk_vddci_table == NULL)
-		return -EINVAL;
-	if (allowed_mclk_vddci_table->count < 1)
-		return -EINVAL;
-
-	pi->min_vddc_in_pp_table = allowed_sclk_vddc_table->entries[0].v;
-	pi->max_vddc_in_pp_table =
-		allowed_sclk_vddc_table->entries[allowed_sclk_vddc_table->count - 1].v;
-
-	pi->min_vddci_in_pp_table = allowed_mclk_vddci_table->entries[0].v;
-	pi->max_vddci_in_pp_table =
-		allowed_mclk_vddci_table->entries[allowed_mclk_vddci_table->count - 1].v;
-
-	adev->pm.dpm.dyn_state.max_clock_voltage_on_ac.sclk =
-		allowed_sclk_vddc_table->entries[allowed_sclk_vddc_table->count - 1].clk;
-	adev->pm.dpm.dyn_state.max_clock_voltage_on_ac.mclk =
-		allowed_mclk_vddc_table->entries[allowed_sclk_vddc_table->count - 1].clk;
-	adev->pm.dpm.dyn_state.max_clock_voltage_on_ac.vddc =
-		allowed_sclk_vddc_table->entries[allowed_sclk_vddc_table->count - 1].v;
-	adev->pm.dpm.dyn_state.max_clock_voltage_on_ac.vddci =
-		allowed_mclk_vddci_table->entries[allowed_mclk_vddci_table->count - 1].v;
-
-	return 0;
-}
-
-static void ci_patch_with_vddc_leakage(struct amdgpu_device *adev, u16 *vddc)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_leakage_voltage *leakage_table = &pi->vddc_leakage;
-	u32 leakage_index;
-
-	for (leakage_index = 0; leakage_index < leakage_table->count; leakage_index++) {
-		if (leakage_table->leakage_id[leakage_index] == *vddc) {
-			*vddc = leakage_table->actual_voltage[leakage_index];
-			break;
-		}
-	}
-}
-
-static void ci_patch_with_vddci_leakage(struct amdgpu_device *adev, u16 *vddci)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_leakage_voltage *leakage_table = &pi->vddci_leakage;
-	u32 leakage_index;
-
-	for (leakage_index = 0; leakage_index < leakage_table->count; leakage_index++) {
-		if (leakage_table->leakage_id[leakage_index] == *vddci) {
-			*vddci = leakage_table->actual_voltage[leakage_index];
-			break;
-		}
-	}
-}
-
-static void ci_patch_clock_voltage_dependency_table_with_vddc_leakage(struct amdgpu_device *adev,
-								      struct amdgpu_clock_voltage_dependency_table *table)
-{
-	u32 i;
-
-	if (table) {
-		for (i = 0; i < table->count; i++)
-			ci_patch_with_vddc_leakage(adev, &table->entries[i].v);
-	}
-}
-
-static void ci_patch_clock_voltage_dependency_table_with_vddci_leakage(struct amdgpu_device *adev,
-								       struct amdgpu_clock_voltage_dependency_table *table)
-{
-	u32 i;
-
-	if (table) {
-		for (i = 0; i < table->count; i++)
-			ci_patch_with_vddci_leakage(adev, &table->entries[i].v);
-	}
-}
-
-static void ci_patch_vce_clock_voltage_dependency_table_with_vddc_leakage(struct amdgpu_device *adev,
-									  struct amdgpu_vce_clock_voltage_dependency_table *table)
-{
-	u32 i;
-
-	if (table) {
-		for (i = 0; i < table->count; i++)
-			ci_patch_with_vddc_leakage(adev, &table->entries[i].v);
-	}
-}
-
-static void ci_patch_uvd_clock_voltage_dependency_table_with_vddc_leakage(struct amdgpu_device *adev,
-									  struct amdgpu_uvd_clock_voltage_dependency_table *table)
-{
-	u32 i;
-
-	if (table) {
-		for (i = 0; i < table->count; i++)
-			ci_patch_with_vddc_leakage(adev, &table->entries[i].v);
-	}
-}
-
-static void ci_patch_vddc_phase_shed_limit_table_with_vddc_leakage(struct amdgpu_device *adev,
-								   struct amdgpu_phase_shedding_limits_table *table)
-{
-	u32 i;
-
-	if (table) {
-		for (i = 0; i < table->count; i++)
-			ci_patch_with_vddc_leakage(adev, &table->entries[i].voltage);
-	}
-}
-
-static void ci_patch_clock_voltage_limits_with_vddc_leakage(struct amdgpu_device *adev,
-							    struct amdgpu_clock_and_voltage_limits *table)
-{
-	if (table) {
-		ci_patch_with_vddc_leakage(adev, (u16 *)&table->vddc);
-		ci_patch_with_vddci_leakage(adev, (u16 *)&table->vddci);
-	}
-}
-
-static void ci_patch_cac_leakage_table_with_vddc_leakage(struct amdgpu_device *adev,
-							 struct amdgpu_cac_leakage_table *table)
-{
-	u32 i;
-
-	if (table) {
-		for (i = 0; i < table->count; i++)
-			ci_patch_with_vddc_leakage(adev, &table->entries[i].vddc);
-	}
-}
-
-static void ci_patch_dependency_tables_with_leakage(struct amdgpu_device *adev)
-{
-	ci_patch_clock_voltage_dependency_table_with_vddc_leakage(adev,
-								  &adev->pm.dpm.dyn_state.vddc_dependency_on_sclk);
-	ci_patch_clock_voltage_dependency_table_with_vddc_leakage(adev,
-								  &adev->pm.dpm.dyn_state.vddc_dependency_on_mclk);
-	ci_patch_clock_voltage_dependency_table_with_vddc_leakage(adev,
-								  &adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk);
-	ci_patch_clock_voltage_dependency_table_with_vddci_leakage(adev,
-								   &adev->pm.dpm.dyn_state.vddci_dependency_on_mclk);
-	ci_patch_vce_clock_voltage_dependency_table_with_vddc_leakage(adev,
-								      &adev->pm.dpm.dyn_state.vce_clock_voltage_dependency_table);
-	ci_patch_uvd_clock_voltage_dependency_table_with_vddc_leakage(adev,
-								      &adev->pm.dpm.dyn_state.uvd_clock_voltage_dependency_table);
-	ci_patch_clock_voltage_dependency_table_with_vddc_leakage(adev,
-								  &adev->pm.dpm.dyn_state.samu_clock_voltage_dependency_table);
-	ci_patch_clock_voltage_dependency_table_with_vddc_leakage(adev,
-								  &adev->pm.dpm.dyn_state.acp_clock_voltage_dependency_table);
-	ci_patch_vddc_phase_shed_limit_table_with_vddc_leakage(adev,
-							       &adev->pm.dpm.dyn_state.phase_shedding_limits_table);
-	ci_patch_clock_voltage_limits_with_vddc_leakage(adev,
-							&adev->pm.dpm.dyn_state.max_clock_voltage_on_ac);
-	ci_patch_clock_voltage_limits_with_vddc_leakage(adev,
-							&adev->pm.dpm.dyn_state.max_clock_voltage_on_dc);
-	ci_patch_cac_leakage_table_with_vddc_leakage(adev,
-						     &adev->pm.dpm.dyn_state.cac_leakage_table);
-}
-
-static void ci_update_current_ps(struct amdgpu_device *adev,
-				 struct amdgpu_ps *rps)
-{
-	struct ci_ps *new_ps = ci_get_ps(rps);
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	pi->current_rps = *rps;
-	pi->current_ps = *new_ps;
-	pi->current_rps.ps_priv = &pi->current_ps;
-	adev->pm.dpm.current_ps = &pi->current_rps;
-}
-
-static void ci_update_requested_ps(struct amdgpu_device *adev,
-				   struct amdgpu_ps *rps)
-{
-	struct ci_ps *new_ps = ci_get_ps(rps);
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	pi->requested_rps = *rps;
-	pi->requested_ps = *new_ps;
-	pi->requested_rps.ps_priv = &pi->requested_ps;
-	adev->pm.dpm.requested_ps = &pi->requested_rps;
-}
-
-static int ci_dpm_pre_set_power_state(void *handle)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct amdgpu_ps requested_ps = *adev->pm.dpm.requested_ps;
-	struct amdgpu_ps *new_ps = &requested_ps;
-
-	ci_update_requested_ps(adev, new_ps);
-
-	ci_apply_state_adjust_rules(adev, &pi->requested_rps);
-
-	return 0;
-}
-
-static void ci_dpm_post_set_power_state(void *handle)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct amdgpu_ps *new_ps = &pi->requested_rps;
-
-	ci_update_current_ps(adev, new_ps);
-}
-
-static void ci_dpm_setup_asic(struct amdgpu_device *adev)
-{
-	ci_read_clock_registers(adev);
-	ci_enable_acpi_power_management(adev);
-	ci_init_sclk_t(adev);
-}
-
-static int ci_dpm_enable(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct amdgpu_ps *boot_ps = adev->pm.dpm.boot_ps;
-	int ret;
-
-	if (pi->voltage_control != CISLANDS_VOLTAGE_CONTROL_NONE) {
-		ci_enable_voltage_control(adev);
-		ret = ci_construct_voltage_tables(adev);
-		if (ret) {
-			DRM_ERROR("ci_construct_voltage_tables failed\n");
-			return ret;
-		}
-	}
-	if (pi->caps_dynamic_ac_timing) {
-		ret = ci_initialize_mc_reg_table(adev);
-		if (ret)
-			pi->caps_dynamic_ac_timing = false;
-	}
-	if (pi->dynamic_ss)
-		ci_enable_spread_spectrum(adev, true);
-	if (pi->thermal_protection)
-		ci_enable_thermal_protection(adev, true);
-	ci_program_sstp(adev);
-	ci_enable_display_gap(adev);
-	ci_program_vc(adev);
-	ret = ci_upload_firmware(adev);
-	if (ret) {
-		DRM_ERROR("ci_upload_firmware failed\n");
-		return ret;
-	}
-	ret = ci_process_firmware_header(adev);
-	if (ret) {
-		DRM_ERROR("ci_process_firmware_header failed\n");
-		return ret;
-	}
-	ret = ci_initial_switch_from_arb_f0_to_f1(adev);
-	if (ret) {
-		DRM_ERROR("ci_initial_switch_from_arb_f0_to_f1 failed\n");
-		return ret;
-	}
-	ret = ci_init_smc_table(adev);
-	if (ret) {
-		DRM_ERROR("ci_init_smc_table failed\n");
-		return ret;
-	}
-	ret = ci_init_arb_table_index(adev);
-	if (ret) {
-		DRM_ERROR("ci_init_arb_table_index failed\n");
-		return ret;
-	}
-	if (pi->caps_dynamic_ac_timing) {
-		ret = ci_populate_initial_mc_reg_table(adev);
-		if (ret) {
-			DRM_ERROR("ci_populate_initial_mc_reg_table failed\n");
-			return ret;
-		}
-	}
-	ret = ci_populate_pm_base(adev);
-	if (ret) {
-		DRM_ERROR("ci_populate_pm_base failed\n");
-		return ret;
-	}
-	ci_dpm_start_smc(adev);
-	ci_enable_vr_hot_gpio_interrupt(adev);
-	ret = ci_notify_smc_display_change(adev, false);
-	if (ret) {
-		DRM_ERROR("ci_notify_smc_display_change failed\n");
-		return ret;
-	}
-	ci_enable_sclk_control(adev, true);
-	ret = ci_enable_ulv(adev, true);
-	if (ret) {
-		DRM_ERROR("ci_enable_ulv failed\n");
-		return ret;
-	}
-	ret = ci_enable_ds_master_switch(adev, true);
-	if (ret) {
-		DRM_ERROR("ci_enable_ds_master_switch failed\n");
-		return ret;
-	}
-	ret = ci_start_dpm(adev);
-	if (ret) {
-		DRM_ERROR("ci_start_dpm failed\n");
-		return ret;
-	}
-	ret = ci_enable_didt(adev, true);
-	if (ret) {
-		DRM_ERROR("ci_enable_didt failed\n");
-		return ret;
-	}
-	ret = ci_enable_smc_cac(adev, true);
-	if (ret) {
-		DRM_ERROR("ci_enable_smc_cac failed\n");
-		return ret;
-	}
-	ret = ci_enable_power_containment(adev, true);
-	if (ret) {
-		DRM_ERROR("ci_enable_power_containment failed\n");
-		return ret;
-	}
-
-	ret = ci_power_control_set_level(adev);
-	if (ret) {
-		DRM_ERROR("ci_power_control_set_level failed\n");
-		return ret;
-	}
-
-	ci_enable_auto_throttle_source(adev, AMDGPU_DPM_AUTO_THROTTLE_SRC_THERMAL, true);
-
-	ret = ci_enable_thermal_based_sclk_dpm(adev, true);
-	if (ret) {
-		DRM_ERROR("ci_enable_thermal_based_sclk_dpm failed\n");
-		return ret;
-	}
-
-	ci_thermal_start_thermal_controller(adev);
-
-	ci_update_current_ps(adev, boot_ps);
-
-	return 0;
-}
-
-static void ci_dpm_disable(struct amdgpu_device *adev)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct amdgpu_ps *boot_ps = adev->pm.dpm.boot_ps;
-
-	amdgpu_irq_put(adev, &adev->pm.dpm.thermal.irq,
-		       AMDGPU_THERMAL_IRQ_LOW_TO_HIGH);
-	amdgpu_irq_put(adev, &adev->pm.dpm.thermal.irq,
-		       AMDGPU_THERMAL_IRQ_HIGH_TO_LOW);
-
-	ci_dpm_powergate_uvd(adev, true);
-
-	if (!amdgpu_ci_is_smc_running(adev))
-		return;
-
-	ci_thermal_stop_thermal_controller(adev);
-
-	if (pi->thermal_protection)
-		ci_enable_thermal_protection(adev, false);
-	ci_enable_power_containment(adev, false);
-	ci_enable_smc_cac(adev, false);
-	ci_enable_didt(adev, false);
-	ci_enable_spread_spectrum(adev, false);
-	ci_enable_auto_throttle_source(adev, AMDGPU_DPM_AUTO_THROTTLE_SRC_THERMAL, false);
-	ci_stop_dpm(adev);
-	ci_enable_ds_master_switch(adev, false);
-	ci_enable_ulv(adev, false);
-	ci_clear_vc(adev);
-	ci_reset_to_default(adev);
-	ci_dpm_stop_smc(adev);
-	ci_force_switch_to_arb_f0(adev);
-	ci_enable_thermal_based_sclk_dpm(adev, false);
-
-	ci_update_current_ps(adev, boot_ps);
-}
-
-static int ci_dpm_set_power_state(void *handle)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct amdgpu_ps *new_ps = &pi->requested_rps;
-	struct amdgpu_ps *old_ps = &pi->current_rps;
-	int ret;
-
-	ci_find_dpm_states_clocks_in_dpm_table(adev, new_ps);
-	if (pi->pcie_performance_request)
-		ci_request_link_speed_change_before_state_change(adev, new_ps, old_ps);
-	ret = ci_freeze_sclk_mclk_dpm(adev);
-	if (ret) {
-		DRM_ERROR("ci_freeze_sclk_mclk_dpm failed\n");
-		return ret;
-	}
-	ret = ci_populate_and_upload_sclk_mclk_dpm_levels(adev, new_ps);
-	if (ret) {
-		DRM_ERROR("ci_populate_and_upload_sclk_mclk_dpm_levels failed\n");
-		return ret;
-	}
-	ret = ci_generate_dpm_level_enable_mask(adev, new_ps);
-	if (ret) {
-		DRM_ERROR("ci_generate_dpm_level_enable_mask failed\n");
-		return ret;
-	}
-
-	ret = ci_update_vce_dpm(adev, new_ps, old_ps);
-	if (ret) {
-		DRM_ERROR("ci_update_vce_dpm failed\n");
-		return ret;
-	}
-
-	ret = ci_update_sclk_t(adev);
-	if (ret) {
-		DRM_ERROR("ci_update_sclk_t failed\n");
-		return ret;
-	}
-	if (pi->caps_dynamic_ac_timing) {
-		ret = ci_update_and_upload_mc_reg_table(adev);
-		if (ret) {
-			DRM_ERROR("ci_update_and_upload_mc_reg_table failed\n");
-			return ret;
-		}
-	}
-	ret = ci_program_memory_timing_parameters(adev);
-	if (ret) {
-		DRM_ERROR("ci_program_memory_timing_parameters failed\n");
-		return ret;
-	}
-	ret = ci_unfreeze_sclk_mclk_dpm(adev);
-	if (ret) {
-		DRM_ERROR("ci_unfreeze_sclk_mclk_dpm failed\n");
-		return ret;
-	}
-	ret = ci_upload_dpm_level_enable_mask(adev);
-	if (ret) {
-		DRM_ERROR("ci_upload_dpm_level_enable_mask failed\n");
-		return ret;
-	}
-	if (pi->pcie_performance_request)
-		ci_notify_link_speed_change_after_state_change(adev, new_ps, old_ps);
-
-	return 0;
-}
-
-#if 0
-static void ci_dpm_reset_asic(struct amdgpu_device *adev)
-{
-	ci_set_boot_state(adev);
-}
-#endif
-
-static void ci_dpm_display_configuration_changed(void *handle)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-	ci_program_display_gap(adev);
-}
-
-union power_info {
-	struct _ATOM_POWERPLAY_INFO info;
-	struct _ATOM_POWERPLAY_INFO_V2 info_2;
-	struct _ATOM_POWERPLAY_INFO_V3 info_3;
-	struct _ATOM_PPLIB_POWERPLAYTABLE pplib;
-	struct _ATOM_PPLIB_POWERPLAYTABLE2 pplib2;
-	struct _ATOM_PPLIB_POWERPLAYTABLE3 pplib3;
-};
-
-union pplib_clock_info {
-	struct _ATOM_PPLIB_R600_CLOCK_INFO r600;
-	struct _ATOM_PPLIB_RS780_CLOCK_INFO rs780;
-	struct _ATOM_PPLIB_EVERGREEN_CLOCK_INFO evergreen;
-	struct _ATOM_PPLIB_SUMO_CLOCK_INFO sumo;
-	struct _ATOM_PPLIB_SI_CLOCK_INFO si;
-	struct _ATOM_PPLIB_CI_CLOCK_INFO ci;
-};
-
-union pplib_power_state {
-	struct _ATOM_PPLIB_STATE v1;
-	struct _ATOM_PPLIB_STATE_V2 v2;
-};
-
-static void ci_parse_pplib_non_clock_info(struct amdgpu_device *adev,
-					  struct amdgpu_ps *rps,
-					  struct _ATOM_PPLIB_NONCLOCK_INFO *non_clock_info,
-					  u8 table_rev)
-{
-	rps->caps = le32_to_cpu(non_clock_info->ulCapsAndSettings);
-	rps->class = le16_to_cpu(non_clock_info->usClassification);
-	rps->class2 = le16_to_cpu(non_clock_info->usClassification2);
-
-	if (ATOM_PPLIB_NONCLOCKINFO_VER1 < table_rev) {
-		rps->vclk = le32_to_cpu(non_clock_info->ulVCLK);
-		rps->dclk = le32_to_cpu(non_clock_info->ulDCLK);
-	} else {
-		rps->vclk = 0;
-		rps->dclk = 0;
-	}
-
-	if (rps->class & ATOM_PPLIB_CLASSIFICATION_BOOT)
-		adev->pm.dpm.boot_ps = rps;
-	if (rps->class & ATOM_PPLIB_CLASSIFICATION_UVDSTATE)
-		adev->pm.dpm.uvd_ps = rps;
-}
-
-static void ci_parse_pplib_clock_info(struct amdgpu_device *adev,
-				      struct amdgpu_ps *rps, int index,
-				      union pplib_clock_info *clock_info)
-{
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_ps *ps = ci_get_ps(rps);
-	struct ci_pl *pl = &ps->performance_levels[index];
-
-	ps->performance_level_count = index + 1;
-
-	pl->sclk = le16_to_cpu(clock_info->ci.usEngineClockLow);
-	pl->sclk |= clock_info->ci.ucEngineClockHigh << 16;
-	pl->mclk = le16_to_cpu(clock_info->ci.usMemoryClockLow);
-	pl->mclk |= clock_info->ci.ucMemoryClockHigh << 16;
-
-	pl->pcie_gen = amdgpu_get_pcie_gen_support(adev,
-						   pi->sys_pcie_mask,
-						   pi->vbios_boot_state.pcie_gen_bootup_value,
-						   clock_info->ci.ucPCIEGen);
-	pl->pcie_lane = amdgpu_get_pcie_lane_support(adev,
-						     pi->vbios_boot_state.pcie_lane_bootup_value,
-						     le16_to_cpu(clock_info->ci.usPCIELane));
-
-	if (rps->class & ATOM_PPLIB_CLASSIFICATION_ACPI)
-		pi->acpi_pcie_gen = pl->pcie_gen;
-
-	if (rps->class2 & ATOM_PPLIB_CLASSIFICATION2_ULV) {
-		pi->ulv.supported = true;
-		pi->ulv.pl = *pl;
-		pi->ulv.cg_ulv_parameter = CISLANDS_CGULVPARAMETER_DFLT;
-	}
-
-	/* patch up boot state */
-	if (rps->class & ATOM_PPLIB_CLASSIFICATION_BOOT) {
-		pl->mclk = pi->vbios_boot_state.mclk_bootup_value;
-		pl->sclk = pi->vbios_boot_state.sclk_bootup_value;
-		pl->pcie_gen = pi->vbios_boot_state.pcie_gen_bootup_value;
-		pl->pcie_lane = pi->vbios_boot_state.pcie_lane_bootup_value;
-	}
-
-	switch (rps->class & ATOM_PPLIB_CLASSIFICATION_UI_MASK) {
-	case ATOM_PPLIB_CLASSIFICATION_UI_BATTERY:
-		pi->use_pcie_powersaving_levels = true;
-		if (pi->pcie_gen_powersaving.max < pl->pcie_gen)
-			pi->pcie_gen_powersaving.max = pl->pcie_gen;
-		if (pi->pcie_gen_powersaving.min > pl->pcie_gen)
-			pi->pcie_gen_powersaving.min = pl->pcie_gen;
-		if (pi->pcie_lane_powersaving.max < pl->pcie_lane)
-			pi->pcie_lane_powersaving.max = pl->pcie_lane;
-		if (pi->pcie_lane_powersaving.min > pl->pcie_lane)
-			pi->pcie_lane_powersaving.min = pl->pcie_lane;
-		break;
-	case ATOM_PPLIB_CLASSIFICATION_UI_PERFORMANCE:
-		pi->use_pcie_performance_levels = true;
-		if (pi->pcie_gen_performance.max < pl->pcie_gen)
-			pi->pcie_gen_performance.max = pl->pcie_gen;
-		if (pi->pcie_gen_performance.min > pl->pcie_gen)
-			pi->pcie_gen_performance.min = pl->pcie_gen;
-		if (pi->pcie_lane_performance.max < pl->pcie_lane)
-			pi->pcie_lane_performance.max = pl->pcie_lane;
-		if (pi->pcie_lane_performance.min > pl->pcie_lane)
-			pi->pcie_lane_performance.min = pl->pcie_lane;
-		break;
-	default:
-		break;
-	}
-}
-
-static int ci_parse_power_table(struct amdgpu_device *adev)
-{
-	struct amdgpu_mode_info *mode_info = &adev->mode_info;
-	struct _ATOM_PPLIB_NONCLOCK_INFO *non_clock_info;
-	union pplib_power_state *power_state;
-	int i, j, k, non_clock_array_index, clock_array_index;
-	union pplib_clock_info *clock_info;
-	struct _StateArray *state_array;
-	struct _ClockInfoArray *clock_info_array;
-	struct _NonClockInfoArray *non_clock_info_array;
-	union power_info *power_info;
-	int index = GetIndexIntoMasterTable(DATA, PowerPlayInfo);
-	u16 data_offset;
-	u8 frev, crev;
-	u8 *power_state_offset;
-	struct ci_ps *ps;
-
-	if (!amdgpu_atom_parse_data_header(mode_info->atom_context, index, NULL,
-				   &frev, &crev, &data_offset))
-		return -EINVAL;
-	power_info = (union power_info *)(mode_info->atom_context->bios + data_offset);
-
-	amdgpu_add_thermal_controller(adev);
-
-	state_array = (struct _StateArray *)
-		(mode_info->atom_context->bios + data_offset +
-		 le16_to_cpu(power_info->pplib.usStateArrayOffset));
-	clock_info_array = (struct _ClockInfoArray *)
-		(mode_info->atom_context->bios + data_offset +
-		 le16_to_cpu(power_info->pplib.usClockInfoArrayOffset));
-	non_clock_info_array = (struct _NonClockInfoArray *)
-		(mode_info->atom_context->bios + data_offset +
-		 le16_to_cpu(power_info->pplib.usNonClockInfoArrayOffset));
-
-	adev->pm.dpm.ps = kcalloc(state_array->ucNumEntries,
-				  sizeof(struct amdgpu_ps),
-				  GFP_KERNEL);
-	if (!adev->pm.dpm.ps)
-		return -ENOMEM;
-	power_state_offset = (u8 *)state_array->states;
-	for (i = 0; i < state_array->ucNumEntries; i++) {
-		u8 *idx;
-		power_state = (union pplib_power_state *)power_state_offset;
-		non_clock_array_index = power_state->v2.nonClockInfoIndex;
-		non_clock_info = (struct _ATOM_PPLIB_NONCLOCK_INFO *)
-			&non_clock_info_array->nonClockInfo[non_clock_array_index];
-		ps = kzalloc(sizeof(struct ci_ps), GFP_KERNEL);
-		if (ps == NULL) {
-			/* record how many states were parsed and let the
-			 * caller unwind via ci_dpm_fini(); freeing
-			 * adev->pm.dpm.ps here would lead to a double free
-			 */
-			adev->pm.dpm.num_ps = i;
-			return -ENOMEM;
-		}
-		adev->pm.dpm.ps[i].ps_priv = ps;
-		ci_parse_pplib_non_clock_info(adev, &adev->pm.dpm.ps[i],
-					      non_clock_info,
-					      non_clock_info_array->ucEntrySize);
-		k = 0;
-		idx = (u8 *)&power_state->v2.clockInfoIndex[0];
-		for (j = 0; j < power_state->v2.ucNumDPMLevels; j++) {
-			clock_array_index = idx[j];
-			if (clock_array_index >= clock_info_array->ucNumEntries)
-				continue;
-			if (k >= CISLANDS_MAX_HARDWARE_POWERLEVELS)
-				break;
-			clock_info = (union pplib_clock_info *)
-				((u8 *)&clock_info_array->clockInfo[0] +
-				 (clock_array_index * clock_info_array->ucEntrySize));
-			ci_parse_pplib_clock_info(adev,
-						  &adev->pm.dpm.ps[i], k,
-						  clock_info);
-			k++;
-		}
-		power_state_offset += 2 + power_state->v2.ucNumDPMLevels;
-	}
-	adev->pm.dpm.num_ps = state_array->ucNumEntries;
-
-	/* fill in the vce power states */
-	for (i = 0; i < adev->pm.dpm.num_of_vce_states; i++) {
-		u32 sclk, mclk;
-		clock_array_index = adev->pm.dpm.vce_states[i].clk_idx;
-		clock_info = (union pplib_clock_info *)
-			&clock_info_array->clockInfo[clock_array_index * clock_info_array->ucEntrySize];
-		sclk = le16_to_cpu(clock_info->ci.usEngineClockLow);
-		sclk |= clock_info->ci.ucEngineClockHigh << 16;
-		mclk = le16_to_cpu(clock_info->ci.usMemoryClockLow);
-		mclk |= clock_info->ci.ucMemoryClockHigh << 16;
-		adev->pm.dpm.vce_states[i].sclk = sclk;
-		adev->pm.dpm.vce_states[i].mclk = mclk;
-	}
-
-	return 0;
-}
-
-static int ci_get_vbios_boot_values(struct amdgpu_device *adev,
-				    struct ci_vbios_boot_state *boot_state)
-{
-	struct amdgpu_mode_info *mode_info = &adev->mode_info;
-	int index = GetIndexIntoMasterTable(DATA, FirmwareInfo);
-	ATOM_FIRMWARE_INFO_V2_2 *firmware_info;
-	u8 frev, crev;
-	u16 data_offset;
-
-	if (amdgpu_atom_parse_data_header(mode_info->atom_context, index, NULL,
-				   &frev, &crev, &data_offset)) {
-		firmware_info =
-			(ATOM_FIRMWARE_INFO_V2_2 *)(mode_info->atom_context->bios +
-						    data_offset);
-		boot_state->mvdd_bootup_value = le16_to_cpu(firmware_info->usBootUpMVDDCVoltage);
-		boot_state->vddc_bootup_value = le16_to_cpu(firmware_info->usBootUpVDDCVoltage);
-		boot_state->vddci_bootup_value = le16_to_cpu(firmware_info->usBootUpVDDCIVoltage);
-		boot_state->pcie_gen_bootup_value = ci_get_current_pcie_speed(adev);
-		boot_state->pcie_lane_bootup_value = ci_get_current_pcie_lane_number(adev);
-		boot_state->sclk_bootup_value = le32_to_cpu(firmware_info->ulDefaultEngineClock);
-		boot_state->mclk_bootup_value = le32_to_cpu(firmware_info->ulDefaultMemoryClock);
-
-		return 0;
-	}
-	return -EINVAL;
-}
-
-static void ci_dpm_fini(struct amdgpu_device *adev)
-{
-	int i;
-
-	for (i = 0; i < adev->pm.dpm.num_ps; i++) {
-		kfree(adev->pm.dpm.ps[i].ps_priv);
-	}
-	kfree(adev->pm.dpm.ps);
-	kfree(adev->pm.dpm.priv);
-	kfree(adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk.entries);
-	amdgpu_free_extended_power_table(adev);
-}
-
-/**
- * ci_dpm_init_microcode - load ucode images from disk
- *
- * @adev: amdgpu_device pointer
- *
- * Use the firmware interface to load the ucode images into
- * the driver (not loaded into hw).
- * Returns 0 on success, error on failure.
- */
-static int ci_dpm_init_microcode(struct amdgpu_device *adev)
-{
-	const char *chip_name;
-	char fw_name[30];
-	int err;
-
-	DRM_DEBUG("\n");
-
-	switch (adev->asic_type) {
-	case CHIP_BONAIRE:
-		if ((adev->pdev->revision == 0x80) ||
-		    (adev->pdev->revision == 0x81) ||
-		    (adev->pdev->device == 0x665f))
-			chip_name = "bonaire_k";
-		else
-			chip_name = "bonaire";
-		break;
-	case CHIP_HAWAII:
-		if (adev->pdev->revision == 0x80)
-			chip_name = "hawaii_k";
-		else
-			chip_name = "hawaii";
-		break;
-	case CHIP_KAVERI:
-	case CHIP_KABINI:
-	case CHIP_MULLINS:
-	default:
-		BUG();
-	}
-
-	snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_smc.bin", chip_name);
-	err = request_firmware(&adev->pm.fw, fw_name, adev->dev);
-	if (err)
-		goto out;
-	err = amdgpu_ucode_validate(adev->pm.fw);
-
-out:
-	if (err) {
-		pr_err("cik_smc: Failed to load firmware \"%s\"\n", fw_name);
-		release_firmware(adev->pm.fw);
-		adev->pm.fw = NULL;
-	}
-	return err;
-}
-
-static int ci_dpm_init(struct amdgpu_device *adev)
-{
-	int index = GetIndexIntoMasterTable(DATA, ASIC_InternalSS_Info);
-	SMU7_Discrete_DpmTable *dpm_table;
-	struct amdgpu_gpio_rec gpio;
-	u16 data_offset, size;
-	u8 frev, crev;
-	struct ci_power_info *pi;
-	int ret;
-
-	pi = kzalloc(sizeof(struct ci_power_info), GFP_KERNEL);
-	if (pi == NULL)
-		return -ENOMEM;
-	adev->pm.dpm.priv = pi;
-
-	pi->sys_pcie_mask =
-		adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_MASK;
-
-	pi->force_pcie_gen = AMDGPU_PCIE_GEN_INVALID;
-
-	pi->pcie_gen_performance.max = AMDGPU_PCIE_GEN1;
-	pi->pcie_gen_performance.min = AMDGPU_PCIE_GEN3;
-	pi->pcie_gen_powersaving.max = AMDGPU_PCIE_GEN1;
-	pi->pcie_gen_powersaving.min = AMDGPU_PCIE_GEN3;
-
-	pi->pcie_lane_performance.max = 0;
-	pi->pcie_lane_performance.min = 16;
-	pi->pcie_lane_powersaving.max = 0;
-	pi->pcie_lane_powersaving.min = 16;
-
-	ret = ci_get_vbios_boot_values(adev, &pi->vbios_boot_state);
-	if (ret) {
-		ci_dpm_fini(adev);
-		return ret;
-	}
-
-	ret = amdgpu_get_platform_caps(adev);
-	if (ret) {
-		ci_dpm_fini(adev);
-		return ret;
-	}
-
-	ret = amdgpu_parse_extended_power_table(adev);
-	if (ret) {
-		ci_dpm_fini(adev);
-		return ret;
-	}
-
-	ret = ci_parse_power_table(adev);
-	if (ret) {
-		ci_dpm_fini(adev);
-		return ret;
-	}
-
-	pi->dll_default_on = false;
-	pi->sram_end = SMC_RAM_END;
-
-	pi->activity_target[0] = CISLAND_TARGETACTIVITY_DFLT;
-	pi->activity_target[1] = CISLAND_TARGETACTIVITY_DFLT;
-	pi->activity_target[2] = CISLAND_TARGETACTIVITY_DFLT;
-	pi->activity_target[3] = CISLAND_TARGETACTIVITY_DFLT;
-	pi->activity_target[4] = CISLAND_TARGETACTIVITY_DFLT;
-	pi->activity_target[5] = CISLAND_TARGETACTIVITY_DFLT;
-	pi->activity_target[6] = CISLAND_TARGETACTIVITY_DFLT;
-	pi->activity_target[7] = CISLAND_TARGETACTIVITY_DFLT;
-
-	pi->mclk_activity_target = CISLAND_MCLK_TARGETACTIVITY_DFLT;
-
-	pi->sclk_dpm_key_disabled = 0;
-	pi->mclk_dpm_key_disabled = 0;
-	pi->pcie_dpm_key_disabled = 0;
-	pi->thermal_sclk_dpm_enabled = 0;
-
-	if (adev->powerplay.pp_feature & PP_SCLK_DEEP_SLEEP_MASK)
-		pi->caps_sclk_ds = true;
-	else
-		pi->caps_sclk_ds = false;
-
-	pi->mclk_strobe_mode_threshold = 40000;
-	pi->mclk_stutter_mode_threshold = 40000;
-	pi->mclk_edc_enable_threshold = 40000;
-	pi->mclk_edc_wr_enable_threshold = 40000;
-
-	ci_initialize_powertune_defaults(adev);
-
-	pi->caps_fps = false;
-
-	pi->caps_sclk_throttle_low_notification = false;
-
-	pi->caps_uvd_dpm = true;
-	pi->caps_vce_dpm = true;
-
-	ci_get_leakage_voltages(adev);
-	ci_patch_dependency_tables_with_leakage(adev);
-	ci_set_private_data_variables_based_on_pptable(adev);
-
-	adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk.entries =
-		kcalloc(4,
-			sizeof(struct amdgpu_clock_voltage_dependency_entry),
-			GFP_KERNEL);
-	if (!adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk.entries) {
-		ci_dpm_fini(adev);
-		return -ENOMEM;
-	}
-	adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk.count = 4;
-	adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk.entries[0].clk = 0;
-	adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk.entries[0].v = 0;
-	adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk.entries[1].clk = 36000;
-	adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk.entries[1].v = 720;
-	adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk.entries[2].clk = 54000;
-	adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk.entries[2].v = 810;
-	adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk.entries[3].clk = 72000;
-	adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk.entries[3].v = 900;
-
-	adev->pm.dpm.dyn_state.mclk_sclk_ratio = 4;
-	adev->pm.dpm.dyn_state.sclk_mclk_delta = 15000;
-	adev->pm.dpm.dyn_state.vddc_vddci_delta = 200;
-
-	adev->pm.dpm.dyn_state.valid_sclk_values.count = 0;
-	adev->pm.dpm.dyn_state.valid_sclk_values.values = NULL;
-	adev->pm.dpm.dyn_state.valid_mclk_values.count = 0;
-	adev->pm.dpm.dyn_state.valid_mclk_values.values = NULL;
-
-	if (adev->asic_type == CHIP_HAWAII) {
-		pi->thermal_temp_setting.temperature_low = 94500;
-		pi->thermal_temp_setting.temperature_high = 95000;
-		pi->thermal_temp_setting.temperature_shutdown = 104000;
-	} else {
-		pi->thermal_temp_setting.temperature_low = 99500;
-		pi->thermal_temp_setting.temperature_high = 100000;
-		pi->thermal_temp_setting.temperature_shutdown = 104000;
-	}
-
-	pi->uvd_enabled = false;
-
-	dpm_table = &pi->smc_state_table;
-
-	gpio = amdgpu_atombios_lookup_gpio(adev, VDDC_VRHOT_GPIO_PINID);
-	if (gpio.valid) {
-		dpm_table->VRHotGpio = gpio.shift;
-		adev->pm.dpm.platform_caps |= ATOM_PP_PLATFORM_CAP_REGULATOR_HOT;
-	} else {
-		dpm_table->VRHotGpio = CISLANDS_UNUSED_GPIO_PIN;
-		adev->pm.dpm.platform_caps &= ~ATOM_PP_PLATFORM_CAP_REGULATOR_HOT;
-	}
-
-	gpio = amdgpu_atombios_lookup_gpio(adev, PP_AC_DC_SWITCH_GPIO_PINID);
-	if (gpio.valid) {
-		dpm_table->AcDcGpio = gpio.shift;
-		adev->pm.dpm.platform_caps |= ATOM_PP_PLATFORM_CAP_HARDWAREDC;
-	} else {
-		dpm_table->AcDcGpio = CISLANDS_UNUSED_GPIO_PIN;
-		adev->pm.dpm.platform_caps &= ~ATOM_PP_PLATFORM_CAP_HARDWAREDC;
-	}
-
-	gpio = amdgpu_atombios_lookup_gpio(adev, VDDC_PCC_GPIO_PINID);
-	if (gpio.valid) {
-		u32 tmp = RREG32_SMC(ixCNB_PWRMGT_CNTL);
-
-		switch (gpio.shift) {
-		case 0:
-			tmp &= ~CNB_PWRMGT_CNTL__GNB_SLOW_MODE_MASK;
-			tmp |= 1 << CNB_PWRMGT_CNTL__GNB_SLOW_MODE__SHIFT;
-			break;
-		case 1:
-			tmp &= ~CNB_PWRMGT_CNTL__GNB_SLOW_MODE_MASK;
-			tmp |= 2 << CNB_PWRMGT_CNTL__GNB_SLOW_MODE__SHIFT;
-			break;
-		case 2:
-			tmp |= CNB_PWRMGT_CNTL__GNB_SLOW_MASK;
-			break;
-		case 3:
-			tmp |= CNB_PWRMGT_CNTL__FORCE_NB_PS1_MASK;
-			break;
-		case 4:
-			tmp |= CNB_PWRMGT_CNTL__DPM_ENABLED_MASK;
-			break;
-		default:
-			DRM_INFO("Invalid PCC GPIO: %u!\n", gpio.shift);
-			break;
-		}
-		WREG32_SMC(ixCNB_PWRMGT_CNTL, tmp);
-	}
-
-	pi->voltage_control = CISLANDS_VOLTAGE_CONTROL_NONE;
-	pi->vddci_control = CISLANDS_VOLTAGE_CONTROL_NONE;
-	pi->mvdd_control = CISLANDS_VOLTAGE_CONTROL_NONE;
-	if (amdgpu_atombios_is_voltage_gpio(adev, VOLTAGE_TYPE_VDDC, VOLTAGE_OBJ_GPIO_LUT))
-		pi->voltage_control = CISLANDS_VOLTAGE_CONTROL_BY_GPIO;
-	else if (amdgpu_atombios_is_voltage_gpio(adev, VOLTAGE_TYPE_VDDC, VOLTAGE_OBJ_SVID2))
-		pi->voltage_control = CISLANDS_VOLTAGE_CONTROL_BY_SVID2;
-
-	if (adev->pm.dpm.platform_caps & ATOM_PP_PLATFORM_CAP_VDDCI_CONTROL) {
-		if (amdgpu_atombios_is_voltage_gpio(adev, VOLTAGE_TYPE_VDDCI, VOLTAGE_OBJ_GPIO_LUT))
-			pi->vddci_control = CISLANDS_VOLTAGE_CONTROL_BY_GPIO;
-		else if (amdgpu_atombios_is_voltage_gpio(adev, VOLTAGE_TYPE_VDDCI, VOLTAGE_OBJ_SVID2))
-			pi->vddci_control = CISLANDS_VOLTAGE_CONTROL_BY_SVID2;
-		else
-			adev->pm.dpm.platform_caps &= ~ATOM_PP_PLATFORM_CAP_VDDCI_CONTROL;
-	}
-
-	if (adev->pm.dpm.platform_caps & ATOM_PP_PLATFORM_CAP_MVDDCONTROL) {
-		if (amdgpu_atombios_is_voltage_gpio(adev, VOLTAGE_TYPE_MVDDC, VOLTAGE_OBJ_GPIO_LUT))
-			pi->mvdd_control = CISLANDS_VOLTAGE_CONTROL_BY_GPIO;
-		else if (amdgpu_atombios_is_voltage_gpio(adev, VOLTAGE_TYPE_MVDDC, VOLTAGE_OBJ_SVID2))
-			pi->mvdd_control = CISLANDS_VOLTAGE_CONTROL_BY_SVID2;
-		else
-			adev->pm.dpm.platform_caps &= ~ATOM_PP_PLATFORM_CAP_MVDDCONTROL;
-	}
-
-	pi->vddc_phase_shed_control = true;
-
-#if defined(CONFIG_ACPI)
-	pi->pcie_performance_request =
-		amdgpu_acpi_is_pcie_performance_request_supported(adev);
-#else
-	pi->pcie_performance_request = false;
-#endif
-
-	pi->dynamic_ss = true;
-	if (amdgpu_atom_parse_data_header(adev->mode_info.atom_context, index, &size,
-				   &frev, &crev, &data_offset)) {
-		pi->caps_sclk_ss_support = true;
-		pi->caps_mclk_ss_support = true;
-	} else {
-		pi->caps_sclk_ss_support = false;
-		pi->caps_mclk_ss_support = false;
-	}
-
-	if (adev->pm.int_thermal_type != THERMAL_TYPE_NONE)
-		pi->thermal_protection = true;
-	else
-		pi->thermal_protection = false;
-
-	pi->caps_dynamic_ac_timing = true;
-
-	pi->uvd_power_gated = true;
-
-	/* make sure dc limits are valid */
-	if ((adev->pm.dpm.dyn_state.max_clock_voltage_on_dc.sclk == 0) ||
-	    (adev->pm.dpm.dyn_state.max_clock_voltage_on_dc.mclk == 0))
-		adev->pm.dpm.dyn_state.max_clock_voltage_on_dc =
-			adev->pm.dpm.dyn_state.max_clock_voltage_on_ac;
-
-	pi->fan_ctrl_is_in_default_mode = true;
-
-	return 0;
-}
-
-static void
-ci_dpm_debugfs_print_current_performance_level(void *handle,
-					       struct seq_file *m)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct amdgpu_ps *rps = &pi->current_rps;
-	u32 sclk = ci_get_average_sclk_freq(adev);
-	u32 mclk = ci_get_average_mclk_freq(adev);
-	u32 activity_percent = 50;
-	int ret;
-
-	ret = ci_read_smc_soft_register(adev, offsetof(SMU7_SoftRegisters, AverageGraphicsA),
-					&activity_percent);
-
-	if (ret == 0) {
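-		/* the SMC reports activity in 8.8 fixed point; round and clamp to a percent */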
-		activity_percent += 0x80;
-		activity_percent >>= 8;
-		activity_percent = activity_percent > 100 ? 100 : activity_percent;
-	}
-
-	seq_printf(m, "uvd %sabled\n", pi->uvd_power_gated ? "dis" : "en");
-	seq_printf(m, "vce %sabled\n", rps->vce_active ? "en" : "dis");
-	seq_printf(m, "power level avg    sclk: %u mclk: %u\n",
-		   sclk, mclk);
-	seq_printf(m, "GPU load: %u %%\n", activity_percent);
-}
-
-static void ci_dpm_print_power_state(void *handle, void *current_ps)
-{
-	struct amdgpu_ps *rps = (struct amdgpu_ps *)current_ps;
-	struct ci_ps *ps = ci_get_ps(rps);
-	struct ci_pl *pl;
-	int i;
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-	amdgpu_dpm_print_class_info(rps->class, rps->class2);
-	amdgpu_dpm_print_cap_info(rps->caps);
-	printk("\tuvd    vclk: %d dclk: %d\n", rps->vclk, rps->dclk);
-	for (i = 0; i < ps->performance_level_count; i++) {
-		pl = &ps->performance_levels[i];
-		printk("\t\tpower level %d    sclk: %u mclk: %u pcie gen: %u pcie lanes: %u\n",
-		       i, pl->sclk, pl->mclk, pl->pcie_gen + 1, pl->pcie_lane);
-	}
-	amdgpu_dpm_print_ps_status(adev, rps);
-}
-
-static inline bool ci_are_power_levels_equal(const struct ci_pl *ci_cpl1,
-						const struct ci_pl *ci_cpl2)
-{
-	return ((ci_cpl1->mclk == ci_cpl2->mclk) &&
-		  (ci_cpl1->sclk == ci_cpl2->sclk) &&
-		  (ci_cpl1->pcie_gen == ci_cpl2->pcie_gen) &&
-		  (ci_cpl1->pcie_lane == ci_cpl2->pcie_lane));
-}
-
-static int ci_check_state_equal(void *handle,
-				void *current_ps,
-				void *request_ps,
-				bool *equal)
-{
-	struct ci_ps *ci_cps;
-	struct ci_ps *ci_rps;
-	int i;
-	struct amdgpu_ps *cps = (struct amdgpu_ps *)current_ps;
-	struct amdgpu_ps *rps = (struct amdgpu_ps *)request_ps;
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-	if (adev == NULL || cps == NULL || rps == NULL || equal == NULL)
-		return -EINVAL;
-
-	ci_cps = ci_get_ps(cps);
-	ci_rps = ci_get_ps(rps);
-
-	if (ci_cps == NULL) {
-		*equal = false;
-		return 0;
-	}
-
-	if (ci_cps->performance_level_count != ci_rps->performance_level_count) {
-		*equal = false;
-		return 0;
-	}
-
-	for (i = 0; i < ci_cps->performance_level_count; i++) {
-		if (!ci_are_power_levels_equal(&(ci_cps->performance_levels[i]),
-					&(ci_rps->performance_levels[i]))) {
-			*equal = false;
-			return 0;
-		}
-	}
-
-	/* If all performance levels are the same, try to use the UVD clocks to break the tie. */
-	*equal = ((cps->vclk == rps->vclk) && (cps->dclk == rps->dclk));
-	*equal &= ((cps->evclk == rps->evclk) && (cps->ecclk == rps->ecclk));
-
-	return 0;
-}
-
-static u32 ci_dpm_get_sclk(void *handle, bool low)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_ps *requested_state = ci_get_ps(&pi->requested_rps);
-
-	if (low)
-		return requested_state->performance_levels[0].sclk;
-	else
-		return requested_state->performance_levels[requested_state->performance_level_count - 1].sclk;
-}
-
-static u32 ci_dpm_get_mclk(void *handle, bool low)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_ps *requested_state = ci_get_ps(&pi->requested_rps);
-
-	if (low)
-		return requested_state->performance_levels[0].mclk;
-	else
-		return requested_state->performance_levels[requested_state->performance_level_count - 1].mclk;
-}
-
-/* get temperature in millidegrees */
-static int ci_dpm_get_temp(void *handle)
-{
-	u32 temp;
-	int actual_temp = 0;
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-	temp = (RREG32_SMC(ixCG_MULT_THERMAL_STATUS) & CG_MULT_THERMAL_STATUS__CTF_TEMP_MASK) >>
-		CG_MULT_THERMAL_STATUS__CTF_TEMP__SHIFT;
-
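-	/* readings with bit 9 set are clamped to 255 degrees C */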
-	if (temp & 0x200)
-		actual_temp = 255;
-	else
-		actual_temp = temp & 0x1ff;
-
-	actual_temp = actual_temp * 1000;
-
-	return actual_temp;
-}
-
-static int ci_set_temperature_range(struct amdgpu_device *adev)
-{
-	int ret;
-
-	ret = ci_thermal_enable_alert(adev, false);
-	if (ret)
-		return ret;
-	ret = ci_thermal_set_temperature_range(adev, CISLANDS_TEMP_RANGE_MIN,
-					       CISLANDS_TEMP_RANGE_MAX);
-	if (ret)
-		return ret;
-	return ci_thermal_enable_alert(adev, true);
-}
-
-static int ci_dpm_early_init(void *handle)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-	adev->powerplay.pp_funcs = &ci_dpm_funcs;
-	adev->powerplay.pp_handle = adev;
-	ci_dpm_set_irq_funcs(adev);
-
-	return 0;
-}
-
-static int ci_dpm_late_init(void *handle)
-{
-	int ret;
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-	if (!adev->pm.dpm_enabled)
-		return 0;
-
-	/* init the sysfs and debugfs files late */
-	ret = amdgpu_pm_sysfs_init(adev);
-	if (ret)
-		return ret;
-
-	ret = ci_set_temperature_range(adev);
-	if (ret)
-		return ret;
-
-	return 0;
-}
-
-static int ci_dpm_sw_init(void *handle)
-{
-	int ret;
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-	ret = amdgpu_irq_add_id(adev, AMDGPU_IRQ_CLIENTID_LEGACY, 230,
-				&adev->pm.dpm.thermal.irq);
-	if (ret)
-		return ret;
-
-	ret = amdgpu_irq_add_id(adev, AMDGPU_IRQ_CLIENTID_LEGACY, 231,
-				&adev->pm.dpm.thermal.irq);
-	if (ret)
-		return ret;
-
-	/* default to balanced state */
-	adev->pm.dpm.state = POWER_STATE_TYPE_BALANCED;
-	adev->pm.dpm.user_state = POWER_STATE_TYPE_BALANCED;
-	adev->pm.dpm.forced_level = AMD_DPM_FORCED_LEVEL_AUTO;
-	adev->pm.default_sclk = adev->clock.default_sclk;
-	adev->pm.default_mclk = adev->clock.default_mclk;
-	adev->pm.current_sclk = adev->clock.default_sclk;
-	adev->pm.current_mclk = adev->clock.default_mclk;
-	adev->pm.int_thermal_type = THERMAL_TYPE_NONE;
-
-	ret = ci_dpm_init_microcode(adev);
-	if (ret)
-		return ret;
-
-	if (amdgpu_dpm == 0)
-		return 0;
-
-	INIT_WORK(&adev->pm.dpm.thermal.work, amdgpu_dpm_thermal_work_handler);
-	mutex_lock(&adev->pm.mutex);
-	ret = ci_dpm_init(adev);
-	if (ret)
-		goto dpm_failed;
-	adev->pm.dpm.current_ps = adev->pm.dpm.requested_ps = adev->pm.dpm.boot_ps;
-	if (amdgpu_dpm == 1)
-		amdgpu_pm_print_power_states(adev);
-	mutex_unlock(&adev->pm.mutex);
-	DRM_INFO("amdgpu: dpm initialized\n");
-
-	return 0;
-
-dpm_failed:
-	ci_dpm_fini(adev);
-	mutex_unlock(&adev->pm.mutex);
-	DRM_ERROR("amdgpu: dpm initialization failed\n");
-	return ret;
-}
-
-static int ci_dpm_sw_fini(void *handle)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-	flush_work(&adev->pm.dpm.thermal.work);
-
-	mutex_lock(&adev->pm.mutex);
-	ci_dpm_fini(adev);
-	mutex_unlock(&adev->pm.mutex);
-
-	release_firmware(adev->pm.fw);
-	adev->pm.fw = NULL;
-
-	return 0;
-}
-
-static int ci_dpm_hw_init(void *handle)
-{
-	int ret;
-
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-	if (!amdgpu_dpm) {
-		ret = ci_upload_firmware(adev);
-		if (ret) {
-			DRM_ERROR("ci_upload_firmware failed\n");
-			return ret;
-		}
-		ci_dpm_start_smc(adev);
-		return 0;
-	}
-
-	mutex_lock(&adev->pm.mutex);
-	ci_dpm_setup_asic(adev);
-	ret = ci_dpm_enable(adev);
-	if (ret)
-		adev->pm.dpm_enabled = false;
-	else
-		adev->pm.dpm_enabled = true;
-	mutex_unlock(&adev->pm.mutex);
-
-	return ret;
-}
-
-static int ci_dpm_hw_fini(void *handle)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-	if (adev->pm.dpm_enabled) {
-		mutex_lock(&adev->pm.mutex);
-		ci_dpm_disable(adev);
-		mutex_unlock(&adev->pm.mutex);
-	} else {
-		ci_dpm_stop_smc(adev);
-	}
-
-	return 0;
-}
-
-static int ci_dpm_suspend(void *handle)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-	if (adev->pm.dpm_enabled) {
-		mutex_lock(&adev->pm.mutex);
-		amdgpu_irq_put(adev, &adev->pm.dpm.thermal.irq,
-			       AMDGPU_THERMAL_IRQ_LOW_TO_HIGH);
-		amdgpu_irq_put(adev, &adev->pm.dpm.thermal.irq,
-			       AMDGPU_THERMAL_IRQ_HIGH_TO_LOW);
-		adev->pm.dpm.last_user_state = adev->pm.dpm.user_state;
-		adev->pm.dpm.last_state = adev->pm.dpm.state;
-		adev->pm.dpm.user_state = POWER_STATE_TYPE_INTERNAL_BOOT;
-		adev->pm.dpm.state = POWER_STATE_TYPE_INTERNAL_BOOT;
-		mutex_unlock(&adev->pm.mutex);
-		amdgpu_pm_compute_clocks(adev);
-
-	}
-
-	return 0;
-}
-
-static int ci_dpm_resume(void *handle)
-{
-	int ret;
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-	if (adev->pm.dpm_enabled) {
-		/* asic init will reset to the boot state */
-		mutex_lock(&adev->pm.mutex);
-		ci_dpm_setup_asic(adev);
-		ret = ci_dpm_enable(adev);
-		if (ret)
-			adev->pm.dpm_enabled = false;
-		else
-			adev->pm.dpm_enabled = true;
-		adev->pm.dpm.user_state = adev->pm.dpm.last_user_state;
-		adev->pm.dpm.state = adev->pm.dpm.last_state;
-		mutex_unlock(&adev->pm.mutex);
-		if (adev->pm.dpm_enabled)
-			amdgpu_pm_compute_clocks(adev);
-	}
-	return 0;
-}
-
-static bool ci_dpm_is_idle(void *handle)
-{
-	/* XXX */
-	return true;
-}
-
-static int ci_dpm_wait_for_idle(void *handle)
-{
-	/* XXX */
-	return 0;
-}
-
-static int ci_dpm_soft_reset(void *handle)
-{
-	return 0;
-}
-
-static int ci_dpm_set_interrupt_state(struct amdgpu_device *adev,
-				      struct amdgpu_irq_src *source,
-				      unsigned type,
-				      enum amdgpu_interrupt_state state)
-{
-	u32 cg_thermal_int;
-
-	switch (type) {
-	case AMDGPU_THERMAL_IRQ_LOW_TO_HIGH:
-		switch (state) {
-		case AMDGPU_IRQ_STATE_DISABLE:
-			cg_thermal_int = RREG32_SMC(ixCG_THERMAL_INT);
-			cg_thermal_int |= CG_THERMAL_INT_CTRL__THERM_INTH_MASK_MASK;
-			WREG32_SMC(ixCG_THERMAL_INT, cg_thermal_int);
-			break;
-		case AMDGPU_IRQ_STATE_ENABLE:
-			cg_thermal_int = RREG32_SMC(ixCG_THERMAL_INT);
-			cg_thermal_int &= ~CG_THERMAL_INT_CTRL__THERM_INTH_MASK_MASK;
-			WREG32_SMC(ixCG_THERMAL_INT, cg_thermal_int);
-			break;
-		default:
-			break;
-		}
-		break;
-
-	case AMDGPU_THERMAL_IRQ_HIGH_TO_LOW:
-		switch (state) {
-		case AMDGPU_IRQ_STATE_DISABLE:
-			cg_thermal_int = RREG32_SMC(ixCG_THERMAL_INT);
-			cg_thermal_int |= CG_THERMAL_INT_CTRL__THERM_INTL_MASK_MASK;
-			WREG32_SMC(ixCG_THERMAL_INT, cg_thermal_int);
-			break;
-		case AMDGPU_IRQ_STATE_ENABLE:
-			cg_thermal_int = RREG32_SMC(ixCG_THERMAL_INT);
-			cg_thermal_int &= ~CG_THERMAL_INT_CTRL__THERM_INTL_MASK_MASK;
-			WREG32_SMC(ixCG_THERMAL_INT, cg_thermal_int);
-			break;
-		default:
-			break;
-		}
-		break;
-
-	default:
-		break;
-	}
-	return 0;
-}
-
-static int ci_dpm_process_interrupt(struct amdgpu_device *adev,
-				    struct amdgpu_irq_src *source,
-				    struct amdgpu_iv_entry *entry)
-{
-	bool queue_thermal = false;
-
-	if (entry == NULL)
-		return -EINVAL;
-
-	switch (entry->src_id) {
-	case 230: /* thermal low to high */
-		DRM_DEBUG("IH: thermal low to high\n");
-		adev->pm.dpm.thermal.high_to_low = false;
-		queue_thermal = true;
-		break;
-	case 231: /* thermal high to low */
-		DRM_DEBUG("IH: thermal high to low\n");
-		adev->pm.dpm.thermal.high_to_low = true;
-		queue_thermal = true;
-		break;
-	default:
-		break;
-	}
-
-	if (queue_thermal)
-		schedule_work(&adev->pm.dpm.thermal.work);
-
-	return 0;
-}
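
Note: the two src_id values handled here line up with the amdgpu_irq_add_id() registrations in sw_init above; the hard-IRQ path only latches the trip direction and defers to process context. A condensed equivalent (ignoring the default case):

	/* 230 = low-to-high trip, 231 = high-to-low trip */
	adev->pm.dpm.thermal.high_to_low = (entry->src_id == 231);
	schedule_work(&adev->pm.dpm.thermal.work);	/* runs amdgpu_dpm_thermal_work_handler() */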
-
-static int ci_dpm_set_clockgating_state(void *handle,
-					  enum amd_clockgating_state state)
-{
-	return 0;
-}
-
-static int ci_dpm_set_powergating_state(void *handle,
-					  enum amd_powergating_state state)
-{
-	return 0;
-}
-
-static int ci_dpm_print_clock_levels(void *handle,
-		enum pp_clock_type type, char *buf)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_single_dpm_table *sclk_table = &pi->dpm_table.sclk_table;
-	struct ci_single_dpm_table *mclk_table = &pi->dpm_table.mclk_table;
-	struct ci_single_dpm_table *pcie_table = &pi->dpm_table.pcie_speed_table;
-
-	int i, now, size = 0;
-	uint32_t clock, pcie_speed;
-
-	switch (type) {
-	case PP_SCLK:
-		amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_API_GetSclkFrequency);
-		clock = RREG32(mmSMC_MSG_ARG_0);
-
-		for (i = 0; i < sclk_table->count; i++) {
-			if (clock > sclk_table->dpm_levels[i].value)
-				continue;
-			break;
-		}
-		now = i;
-
-		for (i = 0; i < sclk_table->count; i++)
-			size += sprintf(buf + size, "%d: %uMhz %s\n",
-					i, sclk_table->dpm_levels[i].value / 100,
-					(i == now) ? "*" : "");
-		break;
-	case PP_MCLK:
-		amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_API_GetMclkFrequency);
-		clock = RREG32(mmSMC_MSG_ARG_0);
-
-		for (i = 0; i < mclk_table->count; i++) {
-			if (clock > mclk_table->dpm_levels[i].value)
-				continue;
-			break;
-		}
-		now = i;
-
-		for (i = 0; i < mclk_table->count; i++)
-			size += sprintf(buf + size, "%d: %uMhz %s\n",
-					i, mclk_table->dpm_levels[i].value / 100,
-					(i == now) ? "*" : "");
-		break;
-	case PP_PCIE:
-		pcie_speed = ci_get_current_pcie_speed(adev);
-		for (i = 0; i < pcie_table->count; i++) {
-			if (pcie_speed != pcie_table->dpm_levels[i].value)
-				continue;
-			break;
-		}
-		now = i;
-
-		for (i = 0; i < pcie_table->count; i++)
-			size += sprintf(buf + size, "%d: %s %s\n", i,
-					(pcie_table->dpm_levels[i].value == 0) ? "2.5GT/s, x1" :
-					(pcie_table->dpm_levels[i].value == 1) ? "5.0GT/s, x16" :
-					(pcie_table->dpm_levels[i].value == 2) ? "8.0GT/s, x16" : "",
-					(i == now) ? "*" : "");
-		break;
-	default:
-		break;
-	}
-
-	return size;
-}
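
Each case above scans for the first DPM level at or above the clock the SMC reports; the table values are evidently stored in 10 kHz units, hence the division by 100 when printing MHz. A minimal standalone sketch of that scan:

	/* First level >= clock; returns count (so no "*" is printed)
	 * when the clock sits above every table entry. */
	static u32 current_level(const struct ci_dpm_level *levels,
				 u32 count, u32 clock)
	{
		u32 i;

		for (i = 0; i < count; i++)
			if (clock <= levels[i].value)
				break;
		return i;
	}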
-
-static int ci_dpm_force_clock_level(void *handle,
-		enum pp_clock_type type, uint32_t mask)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct ci_power_info *pi = ci_get_pi(adev);
-
-	if (adev->pm.dpm.forced_level != AMD_DPM_FORCED_LEVEL_MANUAL)
-		return -EINVAL;
-
-	if (mask == 0)
-		return -EINVAL;
-
-	switch (type) {
-	case PP_SCLK:
-		if (!pi->sclk_dpm_key_disabled)
-			amdgpu_ci_send_msg_to_smc_with_parameter(adev,
-					PPSMC_MSG_SCLKDPM_SetEnabledMask,
-					pi->dpm_level_enable_mask.sclk_dpm_enable_mask & mask);
-		break;
-
-	case PP_MCLK:
-		if (!pi->mclk_dpm_key_disabled)
-			amdgpu_ci_send_msg_to_smc_with_parameter(adev,
-					PPSMC_MSG_MCLKDPM_SetEnabledMask,
-					pi->dpm_level_enable_mask.mclk_dpm_enable_mask & mask);
-		break;
-
-	case PP_PCIE:
-	{
-		uint32_t tmp = mask & pi->dpm_level_enable_mask.pcie_dpm_enable_mask;
-
-		if (!pi->pcie_dpm_key_disabled) {
-			if (fls(tmp) != ffs(tmp))
-				amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_PCIeDPM_UnForceLevel);
-			else
-				amdgpu_ci_send_msg_to_smc_with_parameter(adev,
-					PPSMC_MSG_PCIeDPM_ForceLevel,
-					fls(tmp) - 1);
-		}
-		break;
-	}
-	default:
-		break;
-	}
-
-	return 0;
-}
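
The fls(tmp) != ffs(tmp) test in the PP_PCIE case asks whether more than one level bit survives the mask: for a nonzero word, fls and ffs agree exactly when a single bit is set. For example:

	u32 tmp = 0x04;	/* only level 2 left enabled */
	/* ffs(tmp) == 3 == fls(tmp) -> one level, force fls(tmp) - 1 == 2 */
	tmp = 0x06;	/* levels 1 and 2 enabled */
	/* ffs(tmp) == 2, fls(tmp) == 3 -> several levels, so unforce */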
-
-static int ci_dpm_get_sclk_od(void *handle)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_single_dpm_table *sclk_table = &(pi->dpm_table.sclk_table);
-	struct ci_single_dpm_table *golden_sclk_table =
-			&(pi->golden_dpm_table.sclk_table);
-	int value;
-
-	value = (sclk_table->dpm_levels[sclk_table->count - 1].value -
-			golden_sclk_table->dpm_levels[golden_sclk_table->count - 1].value) *
-			100 /
-			golden_sclk_table->dpm_levels[golden_sclk_table->count - 1].value;
-
-	return value;
-}
-
-static int ci_dpm_set_sclk_od(void *handle, uint32_t value)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_ps *ps = ci_get_ps(adev->pm.dpm.requested_ps);
-	struct ci_single_dpm_table *golden_sclk_table =
-			&(pi->golden_dpm_table.sclk_table);
-
-	if (value > 20)
-		value = 20;
-
-	ps->performance_levels[ps->performance_level_count - 1].sclk =
-			golden_sclk_table->dpm_levels[golden_sclk_table->count - 1].value *
-			value / 100 +
-			golden_sclk_table->dpm_levels[golden_sclk_table->count - 1].value;
-
-	return 0;
-}
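
Worked through, the overdrive formula scales the top golden sclk level up by 'value' percent, capped at 20. Assuming the usual 10 kHz units for the table entries:

	/* golden top level = 100000 (1000 MHz), value = 10:
	 *   100000 * 10 / 100 + 100000 = 110000  ->  1100 MHz */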
-
-static int ci_dpm_get_mclk_od(void *handle)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_single_dpm_table *mclk_table = &(pi->dpm_table.mclk_table);
-	struct ci_single_dpm_table *golden_mclk_table =
-			&(pi->golden_dpm_table.mclk_table);
-	int value;
-
-	value = (mclk_table->dpm_levels[mclk_table->count - 1].value -
-			golden_mclk_table->dpm_levels[golden_mclk_table->count - 1].value) *
-			100 /
-			golden_mclk_table->dpm_levels[golden_mclk_table->count - 1].value;
-
-	return value;
-}
-
-static int ci_dpm_set_mclk_od(void *handle, uint32_t value)
-{
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct ci_power_info *pi = ci_get_pi(adev);
-	struct ci_ps *ps = ci_get_ps(adev->pm.dpm.requested_ps);
-	struct ci_single_dpm_table *golden_mclk_table =
-			&(pi->golden_dpm_table.mclk_table);
-
-	if (value > 20)
-		value = 20;
-
-	ps->performance_levels[ps->performance_level_count - 1].mclk =
-			golden_mclk_table->dpm_levels[golden_mclk_table->count - 1].value *
-			value / 100 +
-			golden_mclk_table->dpm_levels[golden_mclk_table->count - 1].value;
-
-	return 0;
-}
-
-static int ci_dpm_read_sensor(void *handle, int idx,
-			      void *value, int *size)
-{
-	u32 activity_percent = 50;
-	int ret;
-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
-	/* size must be at least 4 bytes for all sensors */
-	if (*size < 4)
-		return -EINVAL;
-
-	switch (idx) {
-	case AMDGPU_PP_SENSOR_GFX_SCLK:
-		*((uint32_t *)value) = ci_get_average_sclk_freq(adev);
-		*size = 4;
-		return 0;
-	case AMDGPU_PP_SENSOR_GFX_MCLK:
-		*((uint32_t *)value) = ci_get_average_mclk_freq(adev);
-		*size = 4;
-		return 0;
-	case AMDGPU_PP_SENSOR_GPU_TEMP:
-		*((uint32_t *)value) = ci_dpm_get_temp(adev);
-		*size = 4;
-		return 0;
-	case AMDGPU_PP_SENSOR_GPU_LOAD:
-		ret = ci_read_smc_soft_register(adev,
-						offsetof(SMU7_SoftRegisters,
-							 AverageGraphicsA),
-						&activity_percent);
-		if (ret == 0) {
-			activity_percent += 0x80;
-			activity_percent >>= 8;
-			activity_percent =
-				activity_percent > 100 ? 100 : activity_percent;
-		}
-		*((uint32_t *)value) = activity_percent;
-		*size = 4;
-		return 0;
-	default:
-		return -EINVAL;
-	}
-}
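
In the GPU_LOAD case the soft register is evidently an 8.8 fixed-point fraction (compare CISLANDS_Q88_FORMAT_CONVERSION_UNIT, 256, in the header below), so the '+ 0x80; >> 8' pair is a round-to-nearest divide by 256. For example:

	u32 raw = 0x3280;		/* 50.5 in Q8.8 */
	u32 pct = (raw + 0x80) >> 8;	/* == 51, rounded to nearest */
	pct = pct > 100 ? 100 : pct;	/* clamped, as above */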
-
-static int ci_set_powergating_by_smu(void *handle,
-				uint32_t block_type, bool gate)
-{
-	switch (block_type) {
-	case AMD_IP_BLOCK_TYPE_UVD:
-		ci_dpm_powergate_uvd(handle, gate);
-		break;
-	default:
-		break;
-	}
-	return 0;
-}
-
-static const struct amd_ip_funcs ci_dpm_ip_funcs = {
-	.name = "ci_dpm",
-	.early_init = ci_dpm_early_init,
-	.late_init = ci_dpm_late_init,
-	.sw_init = ci_dpm_sw_init,
-	.sw_fini = ci_dpm_sw_fini,
-	.hw_init = ci_dpm_hw_init,
-	.hw_fini = ci_dpm_hw_fini,
-	.suspend = ci_dpm_suspend,
-	.resume = ci_dpm_resume,
-	.is_idle = ci_dpm_is_idle,
-	.wait_for_idle = ci_dpm_wait_for_idle,
-	.soft_reset = ci_dpm_soft_reset,
-	.set_clockgating_state = ci_dpm_set_clockgating_state,
-	.set_powergating_state = ci_dpm_set_powergating_state,
-};
-
-const struct amdgpu_ip_block_version ci_smu_ip_block =
-{
-	.type = AMD_IP_BLOCK_TYPE_SMC,
-	.major = 7,
-	.minor = 0,
-	.rev = 0,
-	.funcs = &ci_dpm_ip_funcs,
-};
-
-static const struct amd_pm_funcs ci_dpm_funcs = {
-	.pre_set_power_state = &ci_dpm_pre_set_power_state,
-	.set_power_state = &ci_dpm_set_power_state,
-	.post_set_power_state = &ci_dpm_post_set_power_state,
-	.display_configuration_changed = &ci_dpm_display_configuration_changed,
-	.get_sclk = &ci_dpm_get_sclk,
-	.get_mclk = &ci_dpm_get_mclk,
-	.print_power_state = &ci_dpm_print_power_state,
-	.debugfs_print_current_performance_level = &ci_dpm_debugfs_print_current_performance_level,
-	.force_performance_level = &ci_dpm_force_performance_level,
-	.vblank_too_short = &ci_dpm_vblank_too_short,
-	.set_powergating_by_smu = &ci_set_powergating_by_smu,
-	.set_fan_control_mode = &ci_dpm_set_fan_control_mode,
-	.get_fan_control_mode = &ci_dpm_get_fan_control_mode,
-	.set_fan_speed_percent = &ci_dpm_set_fan_speed_percent,
-	.get_fan_speed_percent = &ci_dpm_get_fan_speed_percent,
-	.print_clock_levels = ci_dpm_print_clock_levels,
-	.force_clock_level = ci_dpm_force_clock_level,
-	.get_sclk_od = ci_dpm_get_sclk_od,
-	.set_sclk_od = ci_dpm_set_sclk_od,
-	.get_mclk_od = ci_dpm_get_mclk_od,
-	.set_mclk_od = ci_dpm_set_mclk_od,
-	.check_state_equal = ci_check_state_equal,
-	.get_vce_clock_state = amdgpu_get_vce_clock_state,
-	.read_sensor = ci_dpm_read_sensor,
-};
-
-static const struct amdgpu_irq_src_funcs ci_dpm_irq_funcs = {
-	.set = ci_dpm_set_interrupt_state,
-	.process = ci_dpm_process_interrupt,
-};
-
-static void ci_dpm_set_irq_funcs(struct amdgpu_device *adev)
-{
-	adev->pm.dpm.thermal.irq.num_types = AMDGPU_THERMAL_IRQ_LAST;
-	adev->pm.dpm.thermal.irq.funcs = &ci_dpm_irq_funcs;
-}
diff --git a/drivers/gpu/drm/amd/amdgpu/ci_dpm.h b/drivers/gpu/drm/amd/amdgpu/ci_dpm.h
deleted file mode 100644
index 91be2996ae7c..000000000000
--- a/drivers/gpu/drm/amd/amdgpu/ci_dpm.h
+++ /dev/null
@@ -1,349 +0,0 @@
-/*
- * Copyright 2013 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- */
-#ifndef __CI_DPM_H__
-#define __CI_DPM_H__
-
-#include "amdgpu_atombios.h"
-#include "ppsmc.h"
-
-#define SMU__NUM_SCLK_DPM_STATE  8
-#define SMU__NUM_MCLK_DPM_LEVELS 6
-#define SMU__NUM_LCLK_DPM_LEVELS 8
-#define SMU__NUM_PCIE_DPM_LEVELS 8
-#include "smu7_discrete.h"
-
-#define CISLANDS_MAX_HARDWARE_POWERLEVELS 2
-
-#define CISLANDS_UNUSED_GPIO_PIN 0x7F
-
-struct ci_pl {
-	u32 mclk;
-	u32 sclk;
-	enum amdgpu_pcie_gen pcie_gen;
-	u16 pcie_lane;
-};
-
-struct ci_ps {
-	u16 performance_level_count;
-	bool dc_compatible;
-	u32 sclk_t;
-	struct ci_pl performance_levels[CISLANDS_MAX_HARDWARE_POWERLEVELS];
-};
-
-struct ci_dpm_level {
-	bool enabled;
-	u32 value;
-	u32 param1;
-};
-
-#define CISLAND_MAX_DEEPSLEEP_DIVIDER_ID 5
-#define MAX_REGULAR_DPM_NUMBER 8
-#define CISLAND_MINIMUM_ENGINE_CLOCK 800
-
-struct ci_single_dpm_table {
-	u32 count;
-	struct ci_dpm_level dpm_levels[MAX_REGULAR_DPM_NUMBER];
-};
-
-struct ci_dpm_table {
-	struct ci_single_dpm_table sclk_table;
-	struct ci_single_dpm_table mclk_table;
-	struct ci_single_dpm_table pcie_speed_table;
-	struct ci_single_dpm_table vddc_table;
-	struct ci_single_dpm_table vddci_table;
-	struct ci_single_dpm_table mvdd_table;
-};
-
-struct ci_mc_reg_entry {
-	u32 mclk_max;
-	u32 mc_data[SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE];
-};
-
-struct ci_mc_reg_table {
-	u8 last;
-	u8 num_entries;
-	u16 valid_flag;
-	struct ci_mc_reg_entry mc_reg_table_entry[MAX_AC_TIMING_ENTRIES];
-	SMU7_Discrete_MCRegisterAddress mc_reg_address[SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE];
-};
-
-struct ci_ulv_parm
-{
-	bool supported;
-	u32 cg_ulv_parameter;
-	u32 volt_change_delay;
-	struct ci_pl pl;
-};
-
-#define CISLANDS_MAX_LEAKAGE_COUNT  8
-
-struct ci_leakage_voltage {
-	u16 count;
-	u16 leakage_id[CISLANDS_MAX_LEAKAGE_COUNT];
-	u16 actual_voltage[CISLANDS_MAX_LEAKAGE_COUNT];
-};
-
-struct ci_dpm_level_enable_mask {
-	u32 uvd_dpm_enable_mask;
-	u32 vce_dpm_enable_mask;
-	u32 acp_dpm_enable_mask;
-	u32 samu_dpm_enable_mask;
-	u32 sclk_dpm_enable_mask;
-	u32 mclk_dpm_enable_mask;
-	u32 pcie_dpm_enable_mask;
-};
-
-struct ci_vbios_boot_state
-{
-	u16 mvdd_bootup_value;
-	u16 vddc_bootup_value;
-	u16 vddci_bootup_value;
-	u32 sclk_bootup_value;
-	u32 mclk_bootup_value;
-	u16 pcie_gen_bootup_value;
-	u16 pcie_lane_bootup_value;
-};
-
-struct ci_clock_registers {
-	u32 cg_spll_func_cntl;
-	u32 cg_spll_func_cntl_2;
-	u32 cg_spll_func_cntl_3;
-	u32 cg_spll_func_cntl_4;
-	u32 cg_spll_spread_spectrum;
-	u32 cg_spll_spread_spectrum_2;
-	u32 dll_cntl;
-	u32 mclk_pwrmgt_cntl;
-	u32 mpll_ad_func_cntl;
-	u32 mpll_dq_func_cntl;
-	u32 mpll_func_cntl;
-	u32 mpll_func_cntl_1;
-	u32 mpll_func_cntl_2;
-	u32 mpll_ss1;
-	u32 mpll_ss2;
-};
-
-struct ci_thermal_temperature_setting {
-	s32 temperature_low;
-	s32 temperature_high;
-	s32 temperature_shutdown;
-};
-
-struct ci_pcie_perf_range {
-	u16 max;
-	u16 min;
-};
-
-enum ci_pt_config_reg_type {
-	CISLANDS_CONFIGREG_MMR = 0,
-	CISLANDS_CONFIGREG_SMC_IND,
-	CISLANDS_CONFIGREG_DIDT_IND,
-	CISLANDS_CONFIGREG_CACHE,
-	CISLANDS_CONFIGREG_MAX
-};
-
-#define POWERCONTAINMENT_FEATURE_BAPM            0x00000001
-#define POWERCONTAINMENT_FEATURE_TDCLimit        0x00000002
-#define POWERCONTAINMENT_FEATURE_PkgPwrLimit     0x00000004
-
-struct ci_pt_config_reg {
-	u32 offset;
-	u32 mask;
-	u32 shift;
-	u32 value;
-	enum ci_pt_config_reg_type type;
-};
-
-struct ci_pt_defaults {
-	u8 svi_load_line_en;
-	u8 svi_load_line_vddc;
-	u8 tdc_vddc_throttle_release_limit_perc;
-	u8 tdc_mawt;
-	u8 tdc_waterfall_ctl;
-	u8 dte_ambient_temp_base;
-	u32 display_cac;
-	u32 bapm_temp_gradient;
-	u16 bapmti_r[SMU7_DTE_ITERATIONS * SMU7_DTE_SOURCES * SMU7_DTE_SINKS];
-	u16 bapmti_rc[SMU7_DTE_ITERATIONS * SMU7_DTE_SOURCES * SMU7_DTE_SINKS];
-};
-
-#define DPMTABLE_OD_UPDATE_SCLK     0x00000001
-#define DPMTABLE_OD_UPDATE_MCLK     0x00000002
-#define DPMTABLE_UPDATE_SCLK        0x00000004
-#define DPMTABLE_UPDATE_MCLK        0x00000008
-
-struct ci_power_info {
-	struct ci_dpm_table dpm_table;
-	struct ci_dpm_table golden_dpm_table;
-	u32 voltage_control;
-	u32 mvdd_control;
-	u32 vddci_control;
-	u32 active_auto_throttle_sources;
-	struct ci_clock_registers clock_registers;
-	u16 acpi_vddc;
-	u16 acpi_vddci;
-	enum amdgpu_pcie_gen force_pcie_gen;
-	enum amdgpu_pcie_gen acpi_pcie_gen;
-	struct ci_leakage_voltage vddc_leakage;
-	struct ci_leakage_voltage vddci_leakage;
-	u16 max_vddc_in_pp_table;
-	u16 min_vddc_in_pp_table;
-	u16 max_vddci_in_pp_table;
-	u16 min_vddci_in_pp_table;
-	u32 mclk_strobe_mode_threshold;
-	u32 mclk_stutter_mode_threshold;
-	u32 mclk_edc_enable_threshold;
-	u32 mclk_edc_wr_enable_threshold;
-	struct ci_vbios_boot_state vbios_boot_state;
-	/* smc offsets */
-	u32 sram_end;
-	u32 dpm_table_start;
-	u32 soft_regs_start;
-	u32 mc_reg_table_start;
-	u32 fan_table_start;
-	u32 arb_table_start;
-	/* smc tables */
-	SMU7_Discrete_DpmTable smc_state_table;
-	SMU7_Discrete_MCRegisters smc_mc_reg_table;
-	SMU7_Discrete_PmFuses smc_powertune_table;
-	/* other stuff */
-	struct ci_mc_reg_table mc_reg_table;
-	struct atom_voltage_table vddc_voltage_table;
-	struct atom_voltage_table vddci_voltage_table;
-	struct atom_voltage_table mvdd_voltage_table;
-	struct ci_ulv_parm ulv;
-	u32 power_containment_features;
-	const struct ci_pt_defaults *powertune_defaults;
-	u32 dte_tj_offset;
-	bool vddc_phase_shed_control;
-	struct ci_thermal_temperature_setting thermal_temp_setting;
-	struct ci_dpm_level_enable_mask dpm_level_enable_mask;
-	u32 need_update_smu7_dpm_table;
-	u32 sclk_dpm_key_disabled;
-	u32 mclk_dpm_key_disabled;
-	u32 pcie_dpm_key_disabled;
-	u32 thermal_sclk_dpm_enabled;
-	struct ci_pcie_perf_range pcie_gen_performance;
-	struct ci_pcie_perf_range pcie_lane_performance;
-	struct ci_pcie_perf_range pcie_gen_powersaving;
-	struct ci_pcie_perf_range pcie_lane_powersaving;
-	u32 activity_target[SMU7_MAX_LEVELS_GRAPHICS];
-	u32 mclk_activity_target;
-	u32 low_sclk_interrupt_t;
-	u32 last_mclk_dpm_enable_mask;
-	u32 sys_pcie_mask;
-	/* caps */
-	bool caps_power_containment;
-	bool caps_cac;
-	bool caps_sq_ramping;
-	bool caps_db_ramping;
-	bool caps_td_ramping;
-	bool caps_tcp_ramping;
-	bool caps_fps;
-	bool caps_sclk_ds;
-	bool caps_sclk_ss_support;
-	bool caps_mclk_ss_support;
-	bool caps_uvd_dpm;
-	bool caps_vce_dpm;
-	bool caps_samu_dpm;
-	bool caps_acp_dpm;
-	bool caps_automatic_dc_transition;
-	bool caps_sclk_throttle_low_notification;
-	bool caps_dynamic_ac_timing;
-	bool caps_od_fuzzy_fan_control_support;
-	/* flags */
-	bool thermal_protection;
-	bool pcie_performance_request;
-	bool dynamic_ss;
-	bool dll_default_on;
-	bool cac_enabled;
-	bool uvd_enabled;
-	bool battery_state;
-	bool pspp_notify_required;
-	bool enable_bapm_feature;
-	bool enable_tdc_limit_feature;
-	bool enable_pkg_pwr_tracking_feature;
-	bool use_pcie_performance_levels;
-	bool use_pcie_powersaving_levels;
-	bool uvd_power_gated;
-	/* driver states */
-	struct amdgpu_ps current_rps;
-	struct ci_ps current_ps;
-	struct amdgpu_ps requested_rps;
-	struct ci_ps requested_ps;
-	/* fan control */
-	bool fan_ctrl_is_in_default_mode;
-	bool fan_is_controlled_by_smc;
-	u32 t_min;
-	u32 fan_ctrl_default_mode;
-};
-
-#define CISLANDS_VOLTAGE_CONTROL_NONE                   0x0
-#define CISLANDS_VOLTAGE_CONTROL_BY_GPIO                0x1
-#define CISLANDS_VOLTAGE_CONTROL_BY_SVID2               0x2
-
-#define CISLANDS_Q88_FORMAT_CONVERSION_UNIT             256
-
-#define CISLANDS_VRC_DFLT0                              0x3FFFC000
-#define CISLANDS_VRC_DFLT1                              0x000400
-#define CISLANDS_VRC_DFLT2                              0xC00080
-#define CISLANDS_VRC_DFLT3                              0xC00200
-#define CISLANDS_VRC_DFLT4                              0xC01680
-#define CISLANDS_VRC_DFLT5                              0xC00033
-#define CISLANDS_VRC_DFLT6                              0xC00033
-#define CISLANDS_VRC_DFLT7                              0x3FFFC000
-
-#define CISLANDS_CGULVPARAMETER_DFLT                    0x00040035
-#define CISLAND_TARGETACTIVITY_DFLT                     30
-#define CISLAND_MCLK_TARGETACTIVITY_DFLT                10
-
-#define PCIE_PERF_REQ_REMOVE_REGISTRY   0
-#define PCIE_PERF_REQ_FORCE_LOWPOWER    1
-#define PCIE_PERF_REQ_PECI_GEN1         2
-#define PCIE_PERF_REQ_PECI_GEN2         3
-#define PCIE_PERF_REQ_PECI_GEN3         4
-
-#define CISLANDS_SSTU_DFLT                               0
-#define CISLANDS_SST_DFLT                                0x00C8
-
-/* XXX are these ok? */
-#define CISLANDS_TEMP_RANGE_MIN (90 * 1000)
-#define CISLANDS_TEMP_RANGE_MAX (120 * 1000)
-
-int amdgpu_ci_copy_bytes_to_smc(struct amdgpu_device *adev,
-			 u32 smc_start_address,
-			 const u8 *src, u32 byte_count, u32 limit);
-void amdgpu_ci_start_smc(struct amdgpu_device *adev);
-void amdgpu_ci_reset_smc(struct amdgpu_device *adev);
-int amdgpu_ci_program_jump_on_start(struct amdgpu_device *adev);
-void amdgpu_ci_stop_smc_clock(struct amdgpu_device *adev);
-void amdgpu_ci_start_smc_clock(struct amdgpu_device *adev);
-bool amdgpu_ci_is_smc_running(struct amdgpu_device *adev);
-PPSMC_Result amdgpu_ci_send_msg_to_smc(struct amdgpu_device *adev, PPSMC_Msg msg);
-PPSMC_Result amdgpu_ci_wait_for_smc_inactive(struct amdgpu_device *adev);
-int amdgpu_ci_load_smc_ucode(struct amdgpu_device *adev, u32 limit);
-int amdgpu_ci_read_smc_sram_dword(struct amdgpu_device *adev,
-			   u32 smc_address, u32 *value, u32 limit);
-int amdgpu_ci_write_smc_sram_dword(struct amdgpu_device *adev,
-			    u32 smc_address, u32 value, u32 limit);
-
-#endif
diff --git a/drivers/gpu/drm/amd/amdgpu/ci_smc.c b/drivers/gpu/drm/amd/amdgpu/ci_smc.c
deleted file mode 100644
index b8ba51e045b5..000000000000
--- a/drivers/gpu/drm/amd/amdgpu/ci_smc.c
+++ /dev/null
@@ -1,279 +0,0 @@
-/*
- * Copyright 2011 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: Alex Deucher
- */
-
-#include <linux/firmware.h>
-#include <drm/drmP.h>
-#include "amdgpu.h"
-#include "cikd.h"
-#include "ppsmc.h"
-#include "amdgpu_ucode.h"
-#include "ci_dpm.h"
-
-#include "smu/smu_7_0_1_d.h"
-#include "smu/smu_7_0_1_sh_mask.h"
-
-static int ci_set_smc_sram_address(struct amdgpu_device *adev,
-				   u32 smc_address, u32 limit)
-{
-	if (smc_address & 3)
-		return -EINVAL;
-	if ((smc_address + 3) > limit)
-		return -EINVAL;
-
-	WREG32(mmSMC_IND_INDEX_0, smc_address);
-	WREG32_P(mmSMC_IND_ACCESS_CNTL, 0, ~SMC_IND_ACCESS_CNTL__AUTO_INCREMENT_IND_0_MASK);
-
-	return 0;
-}
-
-int amdgpu_ci_copy_bytes_to_smc(struct amdgpu_device *adev,
-			 u32 smc_start_address,
-			 const u8 *src, u32 byte_count, u32 limit)
-{
-	unsigned long flags;
-	u32 data, original_data;
-	u32 addr;
-	u32 extra_shift;
-	int ret = 0;
-
-	if (smc_start_address & 3)
-		return -EINVAL;
-	if ((smc_start_address + byte_count) > limit)
-		return -EINVAL;
-
-	addr = smc_start_address;
-
-	spin_lock_irqsave(&adev->smc_idx_lock, flags);
-	while (byte_count >= 4) {
-		/* SMC address space is BE */
-		data = (src[0] << 24) | (src[1] << 16) | (src[2] << 8) | src[3];
-
-		ret = ci_set_smc_sram_address(adev, addr, limit);
-		if (ret)
-			goto done;
-
-		WREG32(mmSMC_IND_DATA_0, data);
-
-		src += 4;
-		byte_count -= 4;
-		addr += 4;
-	}
-
-	/* RMW for the final bytes */
-	if (byte_count > 0) {
-		data = 0;
-
-		ret = ci_set_smc_sram_address(adev, addr, limit);
-		if (ret)
-			goto done;
-
-		original_data = RREG32(mmSMC_IND_DATA_0);
-
-		extra_shift = 8 * (4 - byte_count);
-
-		while (byte_count > 0) {
-			data = (data << 8) + *src++;
-			byte_count--;
-		}
-
-		data <<= extra_shift;
-
-		data |= (original_data & ~((~0UL) << extra_shift));
-
-		ret = ci_set_smc_sram_address(adev, addr, limit);
-		if (ret)
-			goto done;
-
-		WREG32(mmSMC_IND_DATA_0, data);
-	}
-
-done:
-	spin_unlock_irqrestore(&adev->smc_idx_lock, flags);
-
-	return ret;
-}
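
Two details in the routine above: dwords are packed big-endian for the SMC, and a sub-dword tail is merged read-modify-write so adjacent SRAM bytes survive. A standalone sketch of both:

	/* SMC address space is big-endian: */
	static u32 pack_be32(const u8 *src)
	{
		return (src[0] << 24) | (src[1] << 16) | (src[2] << 8) | src[3];
	}
	/* Tail of n < 4 bytes: extra_shift = 8 * (4 - n) moves the new
	 * bytes to the high end of the dword, and the low extra_shift
	 * bits are kept from the dword read back out of SMC SRAM. */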
-
-void amdgpu_ci_start_smc(struct amdgpu_device *adev)
-{
-	u32 tmp = RREG32_SMC(ixSMC_SYSCON_RESET_CNTL);
-
-	tmp &= ~SMC_SYSCON_RESET_CNTL__rst_reg_MASK;
-	WREG32_SMC(ixSMC_SYSCON_RESET_CNTL, tmp);
-}
-
-void amdgpu_ci_reset_smc(struct amdgpu_device *adev)
-{
-	u32 tmp = RREG32_SMC(ixSMC_SYSCON_RESET_CNTL);
-
-	tmp |= SMC_SYSCON_RESET_CNTL__rst_reg_MASK;
-	WREG32_SMC(ixSMC_SYSCON_RESET_CNTL, tmp);
-}
-
-int amdgpu_ci_program_jump_on_start(struct amdgpu_device *adev)
-{
-	static u8 data[] = { 0xE0, 0x00, 0x80, 0x40 };
-
-	return amdgpu_ci_copy_bytes_to_smc(adev, 0x0, data, 4, sizeof(data)+1);
-}
-
-void amdgpu_ci_stop_smc_clock(struct amdgpu_device *adev)
-{
-	u32 tmp = RREG32_SMC(ixSMC_SYSCON_CLOCK_CNTL_0);
-
-	tmp |= SMC_SYSCON_CLOCK_CNTL_0__ck_disable_MASK;
-
-	WREG32_SMC(ixSMC_SYSCON_CLOCK_CNTL_0, tmp);
-}
-
-void amdgpu_ci_start_smc_clock(struct amdgpu_device *adev)
-{
-	u32 tmp = RREG32_SMC(ixSMC_SYSCON_CLOCK_CNTL_0);
-
-	tmp &= ~SMC_SYSCON_CLOCK_CNTL_0__ck_disable_MASK;
-
-	WREG32_SMC(ixSMC_SYSCON_CLOCK_CNTL_0, tmp);
-}
-
-bool amdgpu_ci_is_smc_running(struct amdgpu_device *adev)
-{
-	u32 clk = RREG32_SMC(ixSMC_SYSCON_CLOCK_CNTL_0);
-	u32 pc_c = RREG32_SMC(ixSMC_PC_C);
-
-	if (!(clk & SMC_SYSCON_CLOCK_CNTL_0__ck_disable_MASK) && (0x20100 <= pc_c))
-		return true;
-
-	return false;
-}
-
-PPSMC_Result amdgpu_ci_send_msg_to_smc(struct amdgpu_device *adev, PPSMC_Msg msg)
-{
-	u32 tmp;
-	int i;
-
-	if (!amdgpu_ci_is_smc_running(adev))
-		return PPSMC_Result_Failed;
-
-	WREG32(mmSMC_MESSAGE_0, msg);
-
-	for (i = 0; i < adev->usec_timeout; i++) {
-		tmp = RREG32(mmSMC_RESP_0);
-		if (tmp != 0)
-			break;
-		udelay(1);
-	}
-	tmp = RREG32(mmSMC_RESP_0);
-
-	return (PPSMC_Result)tmp;
-}
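
Callers check the returned PPSMC_Result directly; the response register reads 0 until the firmware acks, which is what the polling loop waits for. A typical call pattern, reusing a message id that appears earlier in this file:

	PPSMC_Result res = amdgpu_ci_send_msg_to_smc(adev, PPSMC_MSG_PCIeDPM_UnForceLevel);

	if (res != PPSMC_Result_OK)
		DRM_ERROR("SMC did not ack message: 0x%x\n", res);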
-
-PPSMC_Result amdgpu_ci_wait_for_smc_inactive(struct amdgpu_device *adev)
-{
-	u32 tmp;
-	int i;
-
-	if (!amdgpu_ci_is_smc_running(adev))
-		return PPSMC_Result_OK;
-
-	for (i = 0; i < adev->usec_timeout; i++) {
-		tmp = RREG32_SMC(ixSMC_SYSCON_CLOCK_CNTL_0);
-		if ((tmp & SMC_SYSCON_CLOCK_CNTL_0__cken_MASK) == 0)
-			break;
-		udelay(1);
-	}
-
-	return PPSMC_Result_OK;
-}
-
-int amdgpu_ci_load_smc_ucode(struct amdgpu_device *adev, u32 limit)
-{
-	const struct smc_firmware_header_v1_0 *hdr;
-	unsigned long flags;
-	u32 ucode_start_address;
-	u32 ucode_size;
-	const u8 *src;
-	u32 data;
-
-	if (!adev->pm.fw)
-		return -EINVAL;
-
-	hdr = (const struct smc_firmware_header_v1_0 *)adev->pm.fw->data;
-	amdgpu_ucode_print_smc_hdr(&hdr->header);
-
-	adev->pm.fw_version = le32_to_cpu(hdr->header.ucode_version);
-	ucode_start_address = le32_to_cpu(hdr->ucode_start_addr);
-	ucode_size = le32_to_cpu(hdr->header.ucode_size_bytes);
-	src = (const u8 *)
-		(adev->pm.fw->data + le32_to_cpu(hdr->header.ucode_array_offset_bytes));
-
-	if (ucode_size & 3)
-		return -EINVAL;
-
-	spin_lock_irqsave(&adev->smc_idx_lock, flags);
-	WREG32(mmSMC_IND_INDEX_0, ucode_start_address);
-	WREG32_P(mmSMC_IND_ACCESS_CNTL, SMC_IND_ACCESS_CNTL__AUTO_INCREMENT_IND_0_MASK,
-		~SMC_IND_ACCESS_CNTL__AUTO_INCREMENT_IND_0_MASK);
-	while (ucode_size >= 4) {
-		/* SMC address space is BE */
-		data = (src[0] << 24) | (src[1] << 16) | (src[2] << 8) | src[3];
-
-		WREG32(mmSMC_IND_DATA_0, data);
-
-		src += 4;
-		ucode_size -= 4;
-	}
-	WREG32_P(mmSMC_IND_ACCESS_CNTL, 0, ~SMC_IND_ACCESS_CNTL__AUTO_INCREMENT_IND_0_MASK);
-	spin_unlock_irqrestore(&adev->smc_idx_lock, flags);
-
-	return 0;
-}
-
-int amdgpu_ci_read_smc_sram_dword(struct amdgpu_device *adev,
-			   u32 smc_address, u32 *value, u32 limit)
-{
-	unsigned long flags;
-	int ret;
-
-	spin_lock_irqsave(&adev->smc_idx_lock, flags);
-	ret = ci_set_smc_sram_address(adev, smc_address, limit);
-	if (ret == 0)
-		*value = RREG32(mmSMC_IND_DATA_0);
-	spin_unlock_irqrestore(&adev->smc_idx_lock, flags);
-
-	return ret;
-}
-
-int amdgpu_ci_write_smc_sram_dword(struct amdgpu_device *adev,
-			    u32 smc_address, u32 value, u32 limit)
-{
-	unsigned long flags;
-	int ret;
-
-	spin_lock_irqsave(&adev->smc_idx_lock, flags);
-	ret = ci_set_smc_sram_address(adev, smc_address, limit);
-	if (ret == 0)
-		WREG32(mmSMC_IND_DATA_0, value);
-	spin_unlock_irqrestore(&adev->smc_idx_lock, flags);
-
-	return ret;
-}
diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c
index 71c50d8900e3..07c1f239e9c3 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik.c
@@ -1741,6 +1741,69 @@ static bool cik_need_full_reset(struct amdgpu_device *adev)
 	return true;
 }
 
+static void cik_get_pcie_usage(struct amdgpu_device *adev, uint64_t *count0,
+			       uint64_t *count1)
+{
+	uint32_t perfctr = 0;
+	uint64_t cnt0_of, cnt1_of;
+	int tmp;
+
+	/* These counters report 0 on APUs, so bail out early rather than
+	 * touching registers that may differ from their dGPU counterparts
+	 */
+	if (adev->flags & AMD_IS_APU)
+		return;
+
+	/* Select the two events to watch:
+	 * event 40 counts received msgs, event 104 counts posted requests sent */
+	perfctr = REG_SET_FIELD(perfctr, PCIE_PERF_CNTL_TXCLK, EVENT0_SEL, 40);
+	perfctr = REG_SET_FIELD(perfctr, PCIE_PERF_CNTL_TXCLK, EVENT1_SEL, 104);
+
+	/* Write to enable desired perf counters */
+	WREG32_PCIE(ixPCIE_PERF_CNTL_TXCLK, perfctr);
+	/* Zero out and enable the perf counters
+	 * Write 0x5:
+	 * Bit 0 = Start all counters(1)
+	 * Bit 2 = Global counter reset enable(1)
+	 */
+	WREG32_PCIE(ixPCIE_PERF_COUNT_CNTL, 0x00000005);
+
+	msleep(1000);
+
+	/* Load the shadow and disable the perf counters
+	 * Write 0x2:
+	 * Bit 0 = Stop counters(0)
+	 * Bit 1 = Load the shadow counters(1)
+	 */
+	WREG32_PCIE(ixPCIE_PERF_COUNT_CNTL, 0x00000002);
+
+	/* Read back the upper counter fields to capture any overflow past 32 bits */
+	tmp = RREG32_PCIE(ixPCIE_PERF_CNTL_TXCLK);
+	cnt0_of = REG_GET_FIELD(tmp, PCIE_PERF_CNTL_TXCLK, COUNTER0_UPPER);
+	cnt1_of = REG_GET_FIELD(tmp, PCIE_PERF_CNTL_TXCLK, COUNTER1_UPPER);
+
+	/* Get the values and add the overflow */
+	*count0 = RREG32_PCIE(ixPCIE_PERF_COUNT0_TXCLK) | (cnt0_of << 32);
+	*count1 = RREG32_PCIE(ixPCIE_PERF_COUNT1_TXCLK) | (cnt1_of << 32);
+}
+
+static bool cik_need_reset_on_init(struct amdgpu_device *adev)
+{
+	u32 clock_cntl, pc;
+
+	if (adev->flags & AMD_IS_APU)
+		return false;
+
+	/* check if the SMC is already running */
+	clock_cntl = RREG32_SMC(ixSMC_SYSCON_CLOCK_CNTL_0);
+	pc = RREG32_SMC(ixSMC_PC_C);
+	if ((0 == REG_GET_FIELD(clock_cntl, SMC_SYSCON_CLOCK_CNTL_0, ck_disable)) &&
+	    (0x20100 <= pc))
+		return true;
+
+	return false;
+}
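+
This is the same "SMC already running" predicate as the amdgpu_ci_is_smc_running() helper deleted above, presumably so the driver can spot firmware left running by a previous kernel (the kexec case mentioned in the merge summary) and reset before reinitializing:

	/* !(clk & ck_disable) && pc >= 0x20100  =>  SMC is running */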
+
 static const struct amdgpu_asic_funcs cik_asic_funcs =
 {
 	.read_disabled_bios = &cik_read_disabled_bios,
@@ -1756,6 +1819,8 @@ static const struct amdgpu_asic_funcs cik_asic_funcs =
 	.invalidate_hdp = &cik_invalidate_hdp,
 	.need_full_reset = &cik_need_full_reset,
 	.init_doorbell_index = &legacy_doorbell_index_init,
+	.get_pcie_usage = &cik_get_pcie_usage,
+	.need_reset_on_init = &cik_need_reset_on_init,
 };
 
 static int cik_common_early_init(void *handle)
@@ -2005,10 +2070,7 @@ int cik_set_ip_blocks(struct amdgpu_device *adev)
 		amdgpu_device_ip_block_add(adev, &cik_ih_ip_block);
 		amdgpu_device_ip_block_add(adev, &gfx_v7_2_ip_block);
 		amdgpu_device_ip_block_add(adev, &cik_sdma_ip_block);
-		if (amdgpu_dpm == -1)
-			amdgpu_device_ip_block_add(adev, &pp_smu_ip_block);
-		else
-			amdgpu_device_ip_block_add(adev, &ci_smu_ip_block);
+		amdgpu_device_ip_block_add(adev, &pp_smu_ip_block);
 		if (adev->enable_virtual_display)
 			amdgpu_device_ip_block_add(adev, &dce_virtual_ip_block);
 #if defined(CONFIG_DRM_AMD_DC)
@@ -2026,10 +2088,7 @@ int cik_set_ip_blocks(struct amdgpu_device *adev)
 		amdgpu_device_ip_block_add(adev, &cik_ih_ip_block);
 		amdgpu_device_ip_block_add(adev, &gfx_v7_3_ip_block);
 		amdgpu_device_ip_block_add(adev, &cik_sdma_ip_block);
-		if (amdgpu_dpm == -1)
-			amdgpu_device_ip_block_add(adev, &pp_smu_ip_block);
-		else
-			amdgpu_device_ip_block_add(adev, &ci_smu_ip_block);
+		amdgpu_device_ip_block_add(adev, &pp_smu_ip_block);
 		if (adev->enable_virtual_display)
 			amdgpu_device_ip_block_add(adev, &dce_virtual_ip_block);
 #if defined(CONFIG_DRM_AMD_DC)
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_dpm.h b/drivers/gpu/drm/amd/amdgpu/cik_dpm.h
index 2a086610f74d..2fcc4b60153c 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_dpm.h
+++ b/drivers/gpu/drm/amd/amdgpu/cik_dpm.h
@@ -24,7 +24,6 @@
 #ifndef __CIK_DPM_H__
 #define __CIK_DPM_H__
 
-extern const struct amdgpu_ip_block_version ci_smu_ip_block;
 extern const struct amdgpu_ip_block_version kv_smu_ip_block;
 
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
index 8a8b4967a101..721c757156e8 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
@@ -103,9 +103,9 @@ static void cik_ih_disable_interrupts(struct amdgpu_device *adev)
  */
 static int cik_ih_irq_init(struct amdgpu_device *adev)
 {
+	struct amdgpu_ih_ring *ih = &adev->irq.ih;
 	int rb_bufsz;
 	u32 interrupt_cntl, ih_cntl, ih_rb_cntl;
-	u64 wptr_off;
 
 	/* disable irqs */
 	cik_ih_disable_interrupts(adev);
@@ -131,9 +131,8 @@ static int cik_ih_irq_init(struct amdgpu_device *adev)
 	ih_rb_cntl |= IH_RB_CNTL__WPTR_WRITEBACK_ENABLE_MASK;
 
 	/* set the writeback address whether it's enabled or not */
-	wptr_off = adev->wb.gpu_addr + (adev->irq.ih.wptr_offs * 4);
-	WREG32(mmIH_RB_WPTR_ADDR_LO, lower_32_bits(wptr_off));
-	WREG32(mmIH_RB_WPTR_ADDR_HI, upper_32_bits(wptr_off) & 0xFF);
+	WREG32(mmIH_RB_WPTR_ADDR_LO, lower_32_bits(ih->wptr_addr));
+	WREG32(mmIH_RB_WPTR_ADDR_HI, upper_32_bits(ih->wptr_addr) & 0xFF);
 
 	WREG32(mmIH_RB_CNTL, ih_rb_cntl);
 
@@ -183,11 +182,12 @@ static void cik_ih_irq_disable(struct amdgpu_device *adev)
  * Used by cik_irq_process().
  * Returns the value of the wptr.
  */
-static u32 cik_ih_get_wptr(struct amdgpu_device *adev)
+static u32 cik_ih_get_wptr(struct amdgpu_device *adev,
+			   struct amdgpu_ih_ring *ih)
 {
 	u32 wptr, tmp;
 
-	wptr = le32_to_cpu(adev->wb.wb[adev->irq.ih.wptr_offs]);
+	wptr = le32_to_cpu(*ih->wptr_cpu);
 
 	if (wptr & IH_RB_WPTR__RB_OVERFLOW_MASK) {
 		wptr &= ~IH_RB_WPTR__RB_OVERFLOW_MASK;
@@ -196,13 +196,13 @@ static u32 cik_ih_get_wptr(struct amdgpu_device *adev)
 		 * this should allow us to catch up.
 		 */
 		dev_warn(adev->dev, "IH ring buffer overflow (0x%08X, 0x%08X, 0x%08X)\n",
-			wptr, adev->irq.ih.rptr, (wptr + 16) & adev->irq.ih.ptr_mask);
-		adev->irq.ih.rptr = (wptr + 16) & adev->irq.ih.ptr_mask;
+			 wptr, ih->rptr, (wptr + 16) & ih->ptr_mask);
+		ih->rptr = (wptr + 16) & ih->ptr_mask;
 		tmp = RREG32(mmIH_RB_CNTL);
 		tmp |= IH_RB_CNTL__WPTR_OVERFLOW_CLEAR_MASK;
 		WREG32(mmIH_RB_CNTL, tmp);
 	}
-	return (wptr & adev->irq.ih.ptr_mask);
+	return (wptr & ih->ptr_mask);
 }
 
 /*        CIK IV Ring
@@ -237,16 +237,17 @@ static u32 cik_ih_get_wptr(struct amdgpu_device *adev)
  * position and also advance the position.
  */
 static void cik_ih_decode_iv(struct amdgpu_device *adev,
+			     struct amdgpu_ih_ring *ih,
 			     struct amdgpu_iv_entry *entry)
 {
 	/* wptr/rptr are in bytes! */
-	u32 ring_index = adev->irq.ih.rptr >> 2;
+	u32 ring_index = ih->rptr >> 2;
 	uint32_t dw[4];
 
-	dw[0] = le32_to_cpu(adev->irq.ih.ring[ring_index + 0]);
-	dw[1] = le32_to_cpu(adev->irq.ih.ring[ring_index + 1]);
-	dw[2] = le32_to_cpu(adev->irq.ih.ring[ring_index + 2]);
-	dw[3] = le32_to_cpu(adev->irq.ih.ring[ring_index + 3]);
+	dw[0] = le32_to_cpu(ih->ring[ring_index + 0]);
+	dw[1] = le32_to_cpu(ih->ring[ring_index + 1]);
+	dw[2] = le32_to_cpu(ih->ring[ring_index + 2]);
+	dw[3] = le32_to_cpu(ih->ring[ring_index + 3]);
 
 	entry->client_id = AMDGPU_IRQ_CLIENTID_LEGACY;
 	entry->src_id = dw[0] & 0xff;
@@ -256,7 +257,7 @@ static void cik_ih_decode_iv(struct amdgpu_device *adev,
 	entry->pasid = (dw[2] >> 16) & 0xffff;
 
 	/* wptr/rptr are in bytes! */
-	adev->irq.ih.rptr += 16;
+	ih->rptr += 16;
 }
 
 /**
@@ -266,9 +267,10 @@ static void cik_ih_decode_iv(struct amdgpu_device *adev,
  *
  * Set the IH ring buffer rptr.
  */
-static void cik_ih_set_rptr(struct amdgpu_device *adev)
+static void cik_ih_set_rptr(struct amdgpu_device *adev,
+			    struct amdgpu_ih_ring *ih)
 {
-	WREG32(mmIH_RB_RPTR, adev->irq.ih.rptr);
+	WREG32(mmIH_RB_RPTR, ih->rptr);
 }
 
 static int cik_ih_early_init(void *handle)
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index 45795191de1f..189599b694e8 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -220,7 +220,7 @@ static void cik_sdma_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count)
 static void cik_sdma_ring_emit_ib(struct amdgpu_ring *ring,
 				  struct amdgpu_job *job,
 				  struct amdgpu_ib *ib,
-				  bool ctx_switch)
+				  uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 	u32 extra_bits = vmid & 0xf;
diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
index 9d3ea298e116..61024b9c7a4b 100644
--- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
@@ -103,9 +103,9 @@ static void cz_ih_disable_interrupts(struct amdgpu_device *adev)
  */
 static int cz_ih_irq_init(struct amdgpu_device *adev)
 {
-	int rb_bufsz;
+	struct amdgpu_ih_ring *ih = &adev->irq.ih;
 	u32 interrupt_cntl, ih_cntl, ih_rb_cntl;
-	u64 wptr_off;
+	int rb_bufsz;
 
 	/* disable irqs */
 	cz_ih_disable_interrupts(adev);
@@ -133,9 +133,8 @@ static int cz_ih_irq_init(struct amdgpu_device *adev)
 	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, WPTR_WRITEBACK_ENABLE, 1);
 
 	/* set the writeback address whether it's enabled or not */
-	wptr_off = adev->wb.gpu_addr + (adev->irq.ih.wptr_offs * 4);
-	WREG32(mmIH_RB_WPTR_ADDR_LO, lower_32_bits(wptr_off));
-	WREG32(mmIH_RB_WPTR_ADDR_HI, upper_32_bits(wptr_off) & 0xFF);
+	WREG32(mmIH_RB_WPTR_ADDR_LO, lower_32_bits(ih->wptr_addr));
+	WREG32(mmIH_RB_WPTR_ADDR_HI, upper_32_bits(ih->wptr_addr) & 0xFF);
 
 	WREG32(mmIH_RB_CNTL, ih_rb_cntl);
 
@@ -185,11 +184,12 @@ static void cz_ih_irq_disable(struct amdgpu_device *adev)
  * Used by cz_irq_process(VI).
  * Returns the value of the wptr.
  */
-static u32 cz_ih_get_wptr(struct amdgpu_device *adev)
+static u32 cz_ih_get_wptr(struct amdgpu_device *adev,
+			  struct amdgpu_ih_ring *ih)
 {
 	u32 wptr, tmp;
 
-	wptr = le32_to_cpu(adev->wb.wb[adev->irq.ih.wptr_offs]);
+	wptr = le32_to_cpu(*ih->wptr_cpu);
 
 	if (REG_GET_FIELD(wptr, IH_RB_WPTR, RB_OVERFLOW)) {
 		wptr = REG_SET_FIELD(wptr, IH_RB_WPTR, RB_OVERFLOW, 0);
@@ -198,13 +198,13 @@ static u32 cz_ih_get_wptr(struct amdgpu_device *adev)
 		 * this should allow us to catch up.
 		 */
 		dev_warn(adev->dev, "IH ring buffer overflow (0x%08X, 0x%08X, 0x%08X)\n",
-			wptr, adev->irq.ih.rptr, (wptr + 16) & adev->irq.ih.ptr_mask);
-		adev->irq.ih.rptr = (wptr + 16) & adev->irq.ih.ptr_mask;
+			wptr, ih->rptr, (wptr + 16) & ih->ptr_mask);
+		ih->rptr = (wptr + 16) & ih->ptr_mask;
 		tmp = RREG32(mmIH_RB_CNTL);
 		tmp = REG_SET_FIELD(tmp, IH_RB_CNTL, WPTR_OVERFLOW_CLEAR, 1);
 		WREG32(mmIH_RB_CNTL, tmp);
 	}
-	return (wptr & adev->irq.ih.ptr_mask);
+	return (wptr & ih->ptr_mask);
 }
 
 /**
@@ -216,16 +216,17 @@ static u32 cz_ih_get_wptr(struct amdgpu_device *adev)
  * position and also advance the position.
  */
 static void cz_ih_decode_iv(struct amdgpu_device *adev,
-				 struct amdgpu_iv_entry *entry)
+			    struct amdgpu_ih_ring *ih,
+			    struct amdgpu_iv_entry *entry)
 {
 	/* wptr/rptr are in bytes! */
-	u32 ring_index = adev->irq.ih.rptr >> 2;
+	u32 ring_index = ih->rptr >> 2;
 	uint32_t dw[4];
 
-	dw[0] = le32_to_cpu(adev->irq.ih.ring[ring_index + 0]);
-	dw[1] = le32_to_cpu(adev->irq.ih.ring[ring_index + 1]);
-	dw[2] = le32_to_cpu(adev->irq.ih.ring[ring_index + 2]);
-	dw[3] = le32_to_cpu(adev->irq.ih.ring[ring_index + 3]);
+	dw[0] = le32_to_cpu(ih->ring[ring_index + 0]);
+	dw[1] = le32_to_cpu(ih->ring[ring_index + 1]);
+	dw[2] = le32_to_cpu(ih->ring[ring_index + 2]);
+	dw[3] = le32_to_cpu(ih->ring[ring_index + 3]);
 
 	entry->client_id = AMDGPU_IRQ_CLIENTID_LEGACY;
 	entry->src_id = dw[0] & 0xff;
@@ -235,7 +236,7 @@ static void cz_ih_decode_iv(struct amdgpu_device *adev,
 	entry->pasid = (dw[2] >> 16) & 0xffff;
 
 	/* wptr/rptr are in bytes! */
-	adev->irq.ih.rptr += 16;
+	ih->rptr += 16;
 }
 
 /**
@@ -245,9 +246,10 @@ static void cz_ih_decode_iv(struct amdgpu_device *adev,
  *
  * Set the IH ring buffer rptr.
  */
-static void cz_ih_set_rptr(struct amdgpu_device *adev)
+static void cz_ih_set_rptr(struct amdgpu_device *adev,
+			   struct amdgpu_ih_ring *ih)
 {
-	WREG32(mmIH_RB_RPTR, adev->irq.ih.rptr);
+	WREG32(mmIH_RB_RPTR, ih->rptr);
 }
 
 static int cz_ih_early_init(void *handle)
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
index 4cfecdce29a3..1f0426d2fc2a 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
@@ -1682,7 +1682,7 @@ static void dce_v10_0_afmt_setmode(struct drm_encoder *encoder,
 	dce_v10_0_audio_write_sad_regs(encoder);
 	dce_v10_0_audio_write_latency_fields(encoder, mode);
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, connector, mode);
 	if (err < 0) {
 		DRM_ERROR("failed to setup AVI infoframe: %zd\n", err);
 		return;
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
index 7c868916d90f..2280b971d758 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
@@ -1724,7 +1724,7 @@ static void dce_v11_0_afmt_setmode(struct drm_encoder *encoder,
 	dce_v11_0_audio_write_sad_regs(encoder);
 	dce_v11_0_audio_write_latency_fields(encoder, mode);
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, connector, mode);
 	if (err < 0) {
 		DRM_ERROR("failed to setup AVI infoframe: %zd\n", err);
 		return;
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
index 17eaaba36017..bea32f076b91 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
@@ -1423,6 +1423,7 @@ static void dce_v6_0_audio_set_avi_infoframe(struct drm_encoder *encoder,
 	struct amdgpu_device *adev = dev->dev_private;
 	struct amdgpu_encoder *amdgpu_encoder = to_amdgpu_encoder(encoder);
 	struct amdgpu_encoder_atom_dig *dig = amdgpu_encoder->enc_priv;
+	struct drm_connector *connector = amdgpu_get_connector_for_encoder(encoder);
 	struct hdmi_avi_infoframe frame;
 	u8 buffer[HDMI_INFOFRAME_HEADER_SIZE + HDMI_AVI_INFOFRAME_SIZE];
 	uint8_t *payload = buffer + 3;
@@ -1430,7 +1431,7 @@ static void dce_v6_0_audio_set_avi_infoframe(struct drm_encoder *encoder,
 	ssize_t err;
 	u32 tmp;
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, connector, mode);
 	if (err < 0) {
 		DRM_ERROR("failed to setup AVI infoframe: %zd\n", err);
 		return;
@@ -2979,7 +2980,7 @@ static int dce_v6_0_pageflip_irq(struct amdgpu_device *adev,
 				 struct amdgpu_irq_src *source,
 				 struct amdgpu_iv_entry *entry)
 {
-		unsigned long flags;
+	unsigned long flags;
 	unsigned crtc_id;
 	struct amdgpu_crtc *amdgpu_crtc;
 	struct amdgpu_flip_work *works;
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c
index 8c0576978d36..13da915991dd 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c
@@ -1616,7 +1616,7 @@ static void dce_v8_0_afmt_setmode(struct drm_encoder *encoder,
 	dce_v8_0_audio_write_sad_regs(encoder);
 	dce_v8_0_audio_write_latency_fields(encoder, mode);
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, connector, mode);
 	if (err < 0) {
 		DRM_ERROR("failed to setup AVI infoframe: %zd\n", err);
 		return;
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
index 1dc3013ea1d5..305276c7e4bf 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
@@ -1842,13 +1842,13 @@ static void gfx_v6_0_ring_emit_fence(struct amdgpu_ring *ring, u64 addr,
 static void gfx_v6_0_ring_emit_ib(struct amdgpu_ring *ring,
 				  struct amdgpu_job *job,
 				  struct amdgpu_ib *ib,
-				  bool ctx_switch)
+				  uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 	u32 header, control = 0;
 
 	/* insert SWITCH_BUFFER packet before first IB in the ring frame */
-	if (ctx_switch) {
+	if (flags & AMDGPU_HAVE_CTX_SWITCH) {
 		amdgpu_ring_write(ring, PACKET3(PACKET3_SWITCH_BUFFER, 0));
 		amdgpu_ring_write(ring, 0);
 	}
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index 3a9fb6018c16..a59e0fdf5a97 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -2228,13 +2228,13 @@ static void gfx_v7_0_ring_emit_fence_compute(struct amdgpu_ring *ring,
 static void gfx_v7_0_ring_emit_ib_gfx(struct amdgpu_ring *ring,
 					struct amdgpu_job *job,
 					struct amdgpu_ib *ib,
-					bool ctx_switch)
+					uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 	u32 header, control = 0;
 
 	/* insert SWITCH_BUFFER packet before first IB in the ring frame */
-	if (ctx_switch) {
+	if (flags & AMDGPU_HAVE_CTX_SWITCH) {
 		amdgpu_ring_write(ring, PACKET3(PACKET3_SWITCH_BUFFER, 0));
 		amdgpu_ring_write(ring, 0);
 	}
@@ -2259,11 +2259,27 @@ static void gfx_v7_0_ring_emit_ib_gfx(struct amdgpu_ring *ring,
 static void gfx_v7_0_ring_emit_ib_compute(struct amdgpu_ring *ring,
 					  struct amdgpu_job *job,
 					  struct amdgpu_ib *ib,
-					  bool ctx_switch)
+					  uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 	u32 control = INDIRECT_BUFFER_VALID | ib->length_dw | (vmid << 24);
 
+	/* Currently there is a high probability of a wave ID mismatch
+	 * between ME and GDS, leading to a hw deadlock, because ME generates
+	 * different wave IDs than the GDS expects. This situation happens
+	 * randomly when at least 5 compute pipes use GDS ordered append.
+	 * The wave IDs generated by ME are also wrong after suspend/resume.
+	 * Those are probably bugs somewhere else in the kernel driver.
+	 *
+	 * Writing GDS_COMPUTE_MAX_WAVE_ID resets wave ID counters in ME and
+	 * GDS to 0 for this ring (me/pipe).
+	 */
+	if (ib->flags & AMDGPU_IB_FLAG_RESET_GDS_MAX_WAVE_ID) {
+		amdgpu_ring_write(ring, PACKET3(PACKET3_SET_CONFIG_REG, 1));
+		amdgpu_ring_write(ring, mmGDS_COMPUTE_MAX_WAVE_ID - PACKET3_SET_CONFIG_REG_START);
+		amdgpu_ring_write(ring, ring->adev->gds.gds_compute_max_wave_id);
+	}
+
 	amdgpu_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
 	amdgpu_ring_write(ring,
 #ifdef __BIG_ENDIAN
@@ -5000,7 +5016,7 @@ static const struct amdgpu_ring_funcs gfx_v7_0_ring_funcs_compute = {
 		7 + /* gfx_v7_0_ring_emit_pipeline_sync */
 		CIK_FLUSH_GPU_TLB_NUM_WREG * 5 + 7 + /* gfx_v7_0_ring_emit_vm_flush */
 		7 + 7 + 7, /* gfx_v7_0_ring_emit_fence_compute x3 for user fence, vm fence */
-	.emit_ib_size =	4, /* gfx_v7_0_ring_emit_ib_compute */
+	.emit_ib_size =	7, /* gfx_v7_0_ring_emit_ib_compute */
 	.emit_ib = gfx_v7_0_ring_emit_ib_compute,
 	.emit_fence = gfx_v7_0_ring_emit_fence_compute,
 	.emit_pipeline_sync = gfx_v7_0_ring_emit_pipeline_sync,
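
The .emit_ib_size bump from 4 to 7 follows from the conditional wave-ID reset added above; the worst-case dword accounting for the compute emit_ib is:

	/* 3 dwords: SET_CONFIG_REG header + register offset + value (new)
	 * 4 dwords: INDIRECT_BUFFER header + addr lo + addr hi + control
	 * total 7  -> .emit_ib_size */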
@@ -5057,6 +5073,7 @@ static void gfx_v7_0_set_gds_init(struct amdgpu_device *adev)
 	adev->gds.mem.total_size = RREG32(mmGDS_VMID0_SIZE);
 	adev->gds.gws.total_size = 64;
 	adev->gds.oa.total_size = 16;
+	adev->gds.gds_compute_max_wave_id = RREG32(mmGDS_COMPUTE_MAX_WAVE_ID);
 
 	if (adev->gds.mem.total_size == 64 * 1024) {
 		adev->gds.mem.gfx_partition_size = 4096;
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 57cb3a51bda7..b8e50a34bdb3 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -6047,7 +6047,7 @@ static void gfx_v8_0_ring_emit_vgt_flush(struct amdgpu_ring *ring)
 static void gfx_v8_0_ring_emit_ib_gfx(struct amdgpu_ring *ring,
 					struct amdgpu_job *job,
 					struct amdgpu_ib *ib,
-					bool ctx_switch)
+					uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 	u32 header, control = 0;
@@ -6079,11 +6079,27 @@ static void gfx_v8_0_ring_emit_ib_gfx(struct amdgpu_ring *ring,
 static void gfx_v8_0_ring_emit_ib_compute(struct amdgpu_ring *ring,
 					  struct amdgpu_job *job,
 					  struct amdgpu_ib *ib,
-					  bool ctx_switch)
+					  uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 	u32 control = INDIRECT_BUFFER_VALID | ib->length_dw | (vmid << 24);
 
+	/* Currently there is a high probability of a wave ID mismatch
+	 * between ME and GDS, leading to a hw deadlock, because ME generates
+	 * different wave IDs than the GDS expects. This situation happens
+	 * randomly when at least 5 compute pipes use GDS ordered append.
+	 * The wave IDs generated by ME are also wrong after suspend/resume.
+	 * Those are probably bugs somewhere else in the kernel driver.
+	 *
+	 * Writing GDS_COMPUTE_MAX_WAVE_ID resets wave ID counters in ME and
+	 * GDS to 0 for this ring (me/pipe).
+	 */
+	if (ib->flags & AMDGPU_IB_FLAG_RESET_GDS_MAX_WAVE_ID) {
+		amdgpu_ring_write(ring, PACKET3(PACKET3_SET_CONFIG_REG, 1));
+		amdgpu_ring_write(ring, mmGDS_COMPUTE_MAX_WAVE_ID - PACKET3_SET_CONFIG_REG_START);
+		amdgpu_ring_write(ring, ring->adev->gds.gds_compute_max_wave_id);
+	}
+
 	amdgpu_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
 	amdgpu_ring_write(ring,
 #ifdef __BIG_ENDIAN
@@ -6890,7 +6906,7 @@ static const struct amdgpu_ring_funcs gfx_v8_0_ring_funcs_compute = {
 		7 + /* gfx_v8_0_ring_emit_pipeline_sync */
 		VI_FLUSH_GPU_TLB_NUM_WREG * 5 + 7 + /* gfx_v8_0_ring_emit_vm_flush */
 		7 + 7 + 7, /* gfx_v8_0_ring_emit_fence_compute x3 for user fence, vm fence */
-	.emit_ib_size =	4, /* gfx_v8_0_ring_emit_ib_compute */
+	.emit_ib_size =	7, /* gfx_v8_0_ring_emit_ib_compute */
 	.emit_ib = gfx_v8_0_ring_emit_ib_compute,
 	.emit_fence = gfx_v8_0_ring_emit_fence_compute,
 	.emit_pipeline_sync = gfx_v8_0_ring_emit_pipeline_sync,
@@ -6920,7 +6936,7 @@ static const struct amdgpu_ring_funcs gfx_v8_0_ring_funcs_kiq = {
 		7 + /* gfx_v8_0_ring_emit_pipeline_sync */
 		17 + /* gfx_v8_0_ring_emit_vm_flush */
 		7 + 7 + 7, /* gfx_v8_0_ring_emit_fence_kiq x3 for user fence, vm fence */
-	.emit_ib_size =	4, /* gfx_v8_0_ring_emit_ib_compute */
+	.emit_ib_size =	7, /* gfx_v8_0_ring_emit_ib_compute */
 	.emit_fence = gfx_v8_0_ring_emit_fence_kiq,
 	.test_ring = gfx_v8_0_ring_test_ring,
 	.insert_nop = amdgpu_ring_insert_nop,
@@ -6996,6 +7012,7 @@ static void gfx_v8_0_set_gds_init(struct amdgpu_device *adev)
 	adev->gds.mem.total_size = RREG32(mmGDS_VMID0_SIZE);
 	adev->gds.gws.total_size = 64;
 	adev->gds.oa.total_size = 16;
+	adev->gds.gds_compute_max_wave_id = RREG32(mmGDS_COMPUTE_MAX_WAVE_ID);
 
 	if (adev->gds.mem.total_size == 64 * 1024) {
 		adev->gds.mem.gfx_partition_size = 4096;
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index fbca0494f871..5533f6e4f4a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -3972,7 +3972,7 @@ static void gfx_v9_0_ring_emit_hdp_flush(struct amdgpu_ring *ring)
 static void gfx_v9_0_ring_emit_ib_gfx(struct amdgpu_ring *ring,
 					struct amdgpu_job *job,
 					struct amdgpu_ib *ib,
-					bool ctx_switch)
+					uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 	u32 header, control = 0;
@@ -4005,11 +4005,27 @@ static void gfx_v9_0_ring_emit_ib_gfx(struct amdgpu_ring *ring,
 static void gfx_v9_0_ring_emit_ib_compute(struct amdgpu_ring *ring,
 					  struct amdgpu_job *job,
 					  struct amdgpu_ib *ib,
-					  bool ctx_switch)
+					  uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 	u32 control = INDIRECT_BUFFER_VALID | ib->length_dw | (vmid << 24);
 
+	/* Currently there is a high probability of a wave ID mismatch
+	 * between ME and GDS, leading to a hw deadlock, because ME generates
+	 * different wave IDs than the GDS expects. This situation happens
+	 * randomly when at least 5 compute pipes use GDS ordered append.
+	 * The wave IDs generated by ME are also wrong after suspend/resume.
+	 * Those are probably bugs somewhere else in the kernel driver.
+	 *
+	 * Writing GDS_COMPUTE_MAX_WAVE_ID resets wave ID counters in ME and
+	 * GDS to 0 for this ring (me/pipe).
+	 */
+	if (ib->flags & AMDGPU_IB_FLAG_RESET_GDS_MAX_WAVE_ID) {
+		amdgpu_ring_write(ring, PACKET3(PACKET3_SET_CONFIG_REG, 1));
+		amdgpu_ring_write(ring, mmGDS_COMPUTE_MAX_WAVE_ID);
+		amdgpu_ring_write(ring, ring->adev->gds.gds_compute_max_wave_id);
+	}
+
 	amdgpu_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
 	BUG_ON(ib->gpu_addr & 0x3); /* Dword align */
 	amdgpu_ring_write(ring,
@@ -4729,7 +4745,7 @@ static const struct amdgpu_ring_funcs gfx_v9_0_ring_funcs_compute = {
 		SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 7 +
 		2 + /* gfx_v9_0_ring_emit_vm_flush */
 		8 + 8 + 8, /* gfx_v9_0_ring_emit_fence x3 for user fence, vm fence */
-	.emit_ib_size =	4, /* gfx_v9_0_ring_emit_ib_compute */
+	.emit_ib_size =	7, /* gfx_v9_0_ring_emit_ib_compute */
 	.emit_ib = gfx_v9_0_ring_emit_ib_compute,
 	.emit_fence = gfx_v9_0_ring_emit_fence,
 	.emit_pipeline_sync = gfx_v9_0_ring_emit_pipeline_sync,
@@ -4764,7 +4780,7 @@ static const struct amdgpu_ring_funcs gfx_v9_0_ring_funcs_kiq = {
 		SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 7 +
 		2 + /* gfx_v9_0_ring_emit_vm_flush */
 		8 + 8 + 8, /* gfx_v9_0_ring_emit_fence_kiq x3 for user fence, vm fence */
-	.emit_ib_size =	4, /* gfx_v9_0_ring_emit_ib_compute */
+	.emit_ib_size =	7, /* gfx_v9_0_ring_emit_ib_compute */
 	.emit_fence = gfx_v9_0_ring_emit_fence_kiq,
 	.test_ring = gfx_v9_0_ring_test_ring,
 	.insert_nop = amdgpu_ring_insert_nop,
@@ -4846,6 +4862,26 @@ static void gfx_v9_0_set_gds_init(struct amdgpu_device *adev)
 		break;
 	}
 
+	switch (adev->asic_type) {
+	case CHIP_VEGA10:
+	case CHIP_VEGA20:
+		adev->gds.gds_compute_max_wave_id = 0x7ff;
+		break;
+	case CHIP_VEGA12:
+		adev->gds.gds_compute_max_wave_id = 0x27f;
+		break;
+	case CHIP_RAVEN:
+		if (adev->rev_id >= 0x8)
+			adev->gds.gds_compute_max_wave_id = 0x77; /* raven2 */
+		else
+			adev->gds.gds_compute_max_wave_id = 0x15f; /* raven1 */
+		break;
+	default:
+		/* this really depends on the chip */
+		adev->gds.gds_compute_max_wave_id = 0x7ff;
+		break;
+	}
+
 	adev->gds.gws.total_size = 64;
 	adev->gds.oa.total_size = 16;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 1ad7e6b8ed1d..34440672f938 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -1471,8 +1471,9 @@ static int gmc_v8_0_process_interrupt(struct amdgpu_device *adev,
 		gmc_v8_0_set_fault_enable_default(adev, false);
 
 	if (printk_ratelimit()) {
-		struct amdgpu_task_info task_info = { 0 };
+		struct amdgpu_task_info task_info;
 
+		memset(&task_info, 0, sizeof(struct amdgpu_task_info));
 		amdgpu_vm_get_task_info(adev, entry->pasid, &task_info);
 
 		dev_err(adev->dev, "GPU fault detected: %d 0x%08x for process %s pid %d thread %s pid %d\n",
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index bacdaef77b6c..600259b4e291 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -305,6 +305,7 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev,
 				struct amdgpu_iv_entry *entry)
 {
 	struct amdgpu_vmhub *hub = &adev->vmhub[entry->vmid_src];
+	bool retry_fault = !!(entry->src_data[1] & 0x80);
 	uint32_t status = 0;
 	u64 addr;
 
@@ -320,13 +321,16 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device *adev,
 	}
 
 	if (printk_ratelimit()) {
-		struct amdgpu_task_info task_info = { 0 };
+		struct amdgpu_task_info task_info;
 
+		memset(&task_info, 0, sizeof(struct amdgpu_task_info));
 		amdgpu_vm_get_task_info(adev, entry->pasid, &task_info);
 
 		dev_err(adev->dev,
-			"[%s] VMC page fault (src_id:%u ring:%u vmid:%u pasid:%u, for process %s pid %d thread %s pid %d)\n",
+			"[%s] %s page fault (src_id:%u ring:%u vmid:%u "
+			"pasid:%u, for process %s pid %d thread %s pid %d)\n",
 			entry->vmid_src ? "mmhub" : "gfxhub",
+			retry_fault ? "retry" : "no-retry",
 			entry->src_id, entry->ring_id, entry->vmid,
 			entry->pasid, task_info.process_name, task_info.tgid,
 			task_info.task_name, task_info.pid);
@@ -961,7 +965,11 @@ static int gmc_v9_0_sw_init(void *handle)
 		 * vm size is 256TB (48bit), maximum size of Vega10,
 		 * block size 512 (9bit)
 		 */
-		amdgpu_vm_adjust_size(adev, 256 * 1024, 9, 3, 48);
+		/* SR-IOV restricts max_pfn below AMDGPU_GMC_HOLE */
+		if (amdgpu_sriov_vf(adev))
+			amdgpu_vm_adjust_size(adev, 256 * 1024, 9, 3, 47);
+		else
+			amdgpu_vm_adjust_size(adev, 256 * 1024, 9, 3, 48);
 		break;
 	default:
 		break;
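A small stand-alone sketch of the retry-fault decode added above, assuming
(as the check implies) that bit 7 of the second IV source-data dword marks a
retryable fault:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* bit 7 of src_data[1] distinguishes retryable from fatal faults */
static bool is_retry_fault(const uint32_t *src_data)
{
	return (src_data[1] & 0x80) != 0;
}

int main(void)
{
	uint32_t retry_iv[2]    = { 0, 0x80 };
	uint32_t no_retry_iv[2] = { 0, 0x00 };

	printf("%s %s\n", is_retry_fault(retry_iv) ? "retry" : "no-retry",
	       is_retry_fault(no_retry_iv) ? "retry" : "no-retry");
	return 0;
}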
diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
index a3984d10b604..b1626e1d2f5d 100644
--- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
@@ -103,9 +103,9 @@ static void iceland_ih_disable_interrupts(struct amdgpu_device *adev)
  */
 static int iceland_ih_irq_init(struct amdgpu_device *adev)
 {
+	struct amdgpu_ih_ring *ih = &adev->irq.ih;
 	int rb_bufsz;
 	u32 interrupt_cntl, ih_cntl, ih_rb_cntl;
-	u64 wptr_off;
 
 	/* disable irqs */
 	iceland_ih_disable_interrupts(adev);
@@ -133,9 +133,8 @@ static int iceland_ih_irq_init(struct amdgpu_device *adev)
 	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, WPTR_WRITEBACK_ENABLE, 1);
 
 	/* set the writeback address whether it's enabled or not */
-	wptr_off = adev->wb.gpu_addr + (adev->irq.ih.wptr_offs * 4);
-	WREG32(mmIH_RB_WPTR_ADDR_LO, lower_32_bits(wptr_off));
-	WREG32(mmIH_RB_WPTR_ADDR_HI, upper_32_bits(wptr_off) & 0xFF);
+	WREG32(mmIH_RB_WPTR_ADDR_LO, lower_32_bits(ih->wptr_addr));
+	WREG32(mmIH_RB_WPTR_ADDR_HI, upper_32_bits(ih->wptr_addr) & 0xFF);
 
 	WREG32(mmIH_RB_CNTL, ih_rb_cntl);
 
@@ -185,11 +184,12 @@ static void iceland_ih_irq_disable(struct amdgpu_device *adev)
  * Used by cz_irq_process(VI).
  * Returns the value of the wptr.
  */
-static u32 iceland_ih_get_wptr(struct amdgpu_device *adev)
+static u32 iceland_ih_get_wptr(struct amdgpu_device *adev,
+			       struct amdgpu_ih_ring *ih)
 {
 	u32 wptr, tmp;
 
-	wptr = le32_to_cpu(adev->wb.wb[adev->irq.ih.wptr_offs]);
+	wptr = le32_to_cpu(*ih->wptr_cpu);
 
 	if (REG_GET_FIELD(wptr, IH_RB_WPTR, RB_OVERFLOW)) {
 		wptr = REG_SET_FIELD(wptr, IH_RB_WPTR, RB_OVERFLOW, 0);
@@ -198,13 +198,13 @@ static u32 iceland_ih_get_wptr(struct amdgpu_device *adev)
 		 * this should allow us to catchup.
 		 */
 		dev_warn(adev->dev, "IH ring buffer overflow (0x%08X, 0x%08X, 0x%08X)\n",
-			wptr, adev->irq.ih.rptr, (wptr + 16) & adev->irq.ih.ptr_mask);
-		adev->irq.ih.rptr = (wptr + 16) & adev->irq.ih.ptr_mask;
+			 wptr, ih->rptr, (wptr + 16) & ih->ptr_mask);
+		ih->rptr = (wptr + 16) & ih->ptr_mask;
 		tmp = RREG32(mmIH_RB_CNTL);
 		tmp = REG_SET_FIELD(tmp, IH_RB_CNTL, WPTR_OVERFLOW_CLEAR, 1);
 		WREG32(mmIH_RB_CNTL, tmp);
 	}
-	return (wptr & adev->irq.ih.ptr_mask);
+	return (wptr & ih->ptr_mask);
 }
 
 /**
@@ -216,16 +216,17 @@ static u32 iceland_ih_get_wptr(struct amdgpu_device *adev)
  * position and also advance the position.
  */
 static void iceland_ih_decode_iv(struct amdgpu_device *adev,
+				 struct amdgpu_ih_ring *ih,
 				 struct amdgpu_iv_entry *entry)
 {
 	/* wptr/rptr are in bytes! */
-	u32 ring_index = adev->irq.ih.rptr >> 2;
+	u32 ring_index = ih->rptr >> 2;
 	uint32_t dw[4];
 
-	dw[0] = le32_to_cpu(adev->irq.ih.ring[ring_index + 0]);
-	dw[1] = le32_to_cpu(adev->irq.ih.ring[ring_index + 1]);
-	dw[2] = le32_to_cpu(adev->irq.ih.ring[ring_index + 2]);
-	dw[3] = le32_to_cpu(adev->irq.ih.ring[ring_index + 3]);
+	dw[0] = le32_to_cpu(ih->ring[ring_index + 0]);
+	dw[1] = le32_to_cpu(ih->ring[ring_index + 1]);
+	dw[2] = le32_to_cpu(ih->ring[ring_index + 2]);
+	dw[3] = le32_to_cpu(ih->ring[ring_index + 3]);
 
 	entry->client_id = AMDGPU_IRQ_CLIENTID_LEGACY;
 	entry->src_id = dw[0] & 0xff;
@@ -235,7 +236,7 @@ static void iceland_ih_decode_iv(struct amdgpu_device *adev,
 	entry->pasid = (dw[2] >> 16) & 0xffff;
 
 	/* wptr/rptr are in bytes! */
-	adev->irq.ih.rptr += 16;
+	ih->rptr += 16;
 }
 
 /**
@@ -245,9 +246,10 @@ static void iceland_ih_decode_iv(struct amdgpu_device *adev,
  *
  * Set the IH ring buffer rptr.
  */
-static void iceland_ih_set_rptr(struct amdgpu_device *adev)
+static void iceland_ih_set_rptr(struct amdgpu_device *adev,
+				struct amdgpu_ih_ring *ih)
 {
-	WREG32(mmIH_RB_RPTR, adev->irq.ih.rptr);
+	WREG32(mmIH_RB_RPTR, ih->rptr);
 }
 
 static int iceland_ih_early_init(void *handle)
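All of the IH callbacks above now operate on an explicit amdgpu_ih_ring
instead of the adev->irq.ih singleton, which is what later lets one chip
expose several rings. A stand-alone sketch of the shared read-pointer
arithmetic (ring size and pointer values illustrative):

#include <stdint.h>
#include <stdio.h>

#define IH_RING_BYTES	4096u	/* power of two, like the real ring */
#define IH_ENTRY_BYTES	16u	/* one IV entry is four dwords */

int main(void)
{
	uint32_t ptr_mask = IH_RING_BYTES - 1;
	uint32_t rptr = 4080, wptr;

	/* normal advance past one entry, wrapping at the ring boundary */
	rptr = (rptr + IH_ENTRY_BYTES) & ptr_mask;
	printf("advanced rptr: %u\n", rptr);	/* 0 */

	/* overflow recovery: resync just past the write pointer */
	wptr = 512;
	rptr = (wptr + IH_ENTRY_BYTES) & ptr_mask;
	printf("resynced rptr: %u\n", rptr);	/* 528 */
	return 0;
}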
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index b11a1c17a7f2..73851ebb3833 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -266,7 +266,8 @@ flr_done:
 	}
 
 	/* Trigger recovery for world switch failure if no TDR */
-	if (amdgpu_device_should_recover_gpu(adev))
+	if (amdgpu_device_should_recover_gpu(adev) &&
+	    amdgpu_lockup_timeout == MAX_SCHEDULE_TIMEOUT)
 		amdgpu_device_gpu_recover(adev, NULL);
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c b/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c
index accdedd63c98..cc967dbfd631 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c
@@ -27,13 +27,9 @@
 #include "nbio/nbio_6_1_default.h"
 #include "nbio/nbio_6_1_offset.h"
 #include "nbio/nbio_6_1_sh_mask.h"
+#include "nbio/nbio_6_1_smn.h"
 #include "vega10_enum.h"
 
-#define smnCPM_CONTROL                                                                                  0x11180460
-#define smnPCIE_CNTL2                                                                                   0x11180070
-#define smnPCIE_CONFIG_CNTL                                                                             0x11180044
-#define smnPCIE_CI_CNTL                                                                                 0x11180080
-
 static u32 nbio_v6_1_get_rev_id(struct amdgpu_device *adev)
 {
         u32 tmp = RREG32_SOC15(NBIO, 0, mmRCC_DEV0_EPF0_STRAP0);
@@ -72,7 +68,7 @@ static u32 nbio_v6_1_get_memsize(struct amdgpu_device *adev)
 }
 
 static void nbio_v6_1_sdma_doorbell_range(struct amdgpu_device *adev, int instance,
-				  bool use_doorbell, int doorbell_index)
+			bool use_doorbell, int doorbell_index, int doorbell_size)
 {
 	u32 reg = instance == 0 ? SOC15_REG_OFFSET(NBIO, 0, mmBIF_SDMA0_DOORBELL_RANGE) :
 			SOC15_REG_OFFSET(NBIO, 0, mmBIF_SDMA1_DOORBELL_RANGE);
@@ -81,7 +77,7 @@ static void nbio_v6_1_sdma_doorbell_range(struct amdgpu_device *adev, int instan
 
 	if (use_doorbell) {
 		doorbell_range = REG_SET_FIELD(doorbell_range, BIF_SDMA0_DOORBELL_RANGE, OFFSET, doorbell_index);
-		doorbell_range = REG_SET_FIELD(doorbell_range, BIF_SDMA0_DOORBELL_RANGE, SIZE, 2);
+		doorbell_range = REG_SET_FIELD(doorbell_range, BIF_SDMA0_DOORBELL_RANGE, SIZE, doorbell_size);
 	} else
 		doorbell_range = REG_SET_FIELD(doorbell_range, BIF_SDMA0_DOORBELL_RANGE, SIZE, 0);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c
index df34dc79d444..1cdb98ad2db3 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c
@@ -27,13 +27,11 @@
 #include "nbio/nbio_7_0_default.h"
 #include "nbio/nbio_7_0_offset.h"
 #include "nbio/nbio_7_0_sh_mask.h"
+#include "nbio/nbio_7_0_smn.h"
 #include "vega10_enum.h"
 
 #define smnNBIF_MGCG_CTRL_LCLK	0x1013a05c
 
-#define smnCPM_CONTROL                                                                                  0x11180460
-#define smnPCIE_CNTL2                                                                                   0x11180070
-
 static u32 nbio_v7_0_get_rev_id(struct amdgpu_device *adev)
 {
         u32 tmp = RREG32_SOC15(NBIO, 0, mmRCC_DEV0_EPF0_STRAP0);
@@ -69,7 +67,7 @@ static u32 nbio_v7_0_get_memsize(struct amdgpu_device *adev)
 }
 
 static void nbio_v7_0_sdma_doorbell_range(struct amdgpu_device *adev, int instance,
-					  bool use_doorbell, int doorbell_index)
+			bool use_doorbell, int doorbell_index, int doorbell_size)
 {
 	u32 reg = instance == 0 ? SOC15_REG_OFFSET(NBIO, 0, mmBIF_SDMA0_DOORBELL_RANGE) :
 			SOC15_REG_OFFSET(NBIO, 0, mmBIF_SDMA1_DOORBELL_RANGE);
@@ -78,7 +76,7 @@ static void nbio_v7_0_sdma_doorbell_range(struct amdgpu_device *adev, int instan
 
 	if (use_doorbell) {
 		doorbell_range = REG_SET_FIELD(doorbell_range, BIF_SDMA0_DOORBELL_RANGE, OFFSET, doorbell_index);
-		doorbell_range = REG_SET_FIELD(doorbell_range, BIF_SDMA0_DOORBELL_RANGE, SIZE, 2);
+		doorbell_range = REG_SET_FIELD(doorbell_range, BIF_SDMA0_DOORBELL_RANGE, SIZE, doorbell_size);
 	} else
 		doorbell_range = REG_SET_FIELD(doorbell_range, BIF_SDMA0_DOORBELL_RANGE, SIZE, 0);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
index 186db182f924..c69d51598cfe 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
@@ -26,16 +26,13 @@
 
 #include "nbio/nbio_7_4_offset.h"
 #include "nbio/nbio_7_4_sh_mask.h"
+#include "nbio/nbio_7_4_0_smn.h"
 
 #define smnNBIF_MGCG_CTRL_LCLK	0x1013a21c
 
-#define smnCPM_CONTROL                                                                                  0x11180460
-#define smnPCIE_CNTL2                                                                                   0x11180070
-#define smnPCIE_CI_CNTL                                                                                 0x11180080
-
 static u32 nbio_v7_4_get_rev_id(struct amdgpu_device *adev)
 {
-    u32 tmp = RREG32_SOC15(NBIO, 0, mmRCC_DEV0_EPF0_STRAP0);
+	u32 tmp = RREG32_SOC15(NBIO, 0, mmRCC_DEV0_EPF0_STRAP0);
 
 	tmp &= RCC_DEV0_EPF0_STRAP0__STRAP_ATI_REV_ID_DEV0_F0_MASK;
 	tmp >>= RCC_DEV0_EPF0_STRAP0__STRAP_ATI_REV_ID_DEV0_F0__SHIFT;
@@ -68,7 +65,7 @@ static u32 nbio_v7_4_get_memsize(struct amdgpu_device *adev)
 }
 
 static void nbio_v7_4_sdma_doorbell_range(struct amdgpu_device *adev, int instance,
-					  bool use_doorbell, int doorbell_index)
+			bool use_doorbell, int doorbell_index, int doorbell_size)
 {
 	u32 reg = instance == 0 ? SOC15_REG_OFFSET(NBIO, 0, mmBIF_SDMA0_DOORBELL_RANGE) :
 			SOC15_REG_OFFSET(NBIO, 0, mmBIF_SDMA1_DOORBELL_RANGE);
@@ -77,7 +74,7 @@ static void nbio_v7_4_sdma_doorbell_range(struct amdgpu_device *adev, int instan
 
 	if (use_doorbell) {
 		doorbell_range = REG_SET_FIELD(doorbell_range, BIF_SDMA0_DOORBELL_RANGE, OFFSET, doorbell_index);
-		doorbell_range = REG_SET_FIELD(doorbell_range, BIF_SDMA0_DOORBELL_RANGE, SIZE, 2);
+		doorbell_range = REG_SET_FIELD(doorbell_range, BIF_SDMA0_DOORBELL_RANGE, SIZE, doorbell_size);
 	} else
 		doorbell_range = REG_SET_FIELD(doorbell_range, BIF_SDMA0_DOORBELL_RANGE, SIZE, 0);
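The three NBIO variants above now take the doorbell SIZE as a parameter
instead of hard-coding 2. A rough stand-alone model of what REG_SET_FIELD()
does with such a field (the mask and shift here are illustrative, not the
real BIF_SDMA0_DOORBELL_RANGE layout):

#include <stdint.h>
#include <stdio.h>

/* illustrative field layout, not the real register definition */
#define SIZE_SHIFT	16u
#define SIZE_MASK	(0x1fu << SIZE_SHIFT)

static uint32_t set_size_field(uint32_t reg, uint32_t size)
{
	/* clear the field, then OR in the new value at its shift */
	return (reg & ~SIZE_MASK) | ((size << SIZE_SHIFT) & SIZE_MASK);
}

int main(void)
{
	uint32_t doorbell_range = set_size_field(0, 2);	/* old hard-coded SIZE */

	doorbell_range = set_size_field(doorbell_range, 8);	/* caller-chosen SIZE */
	printf("0x%08x\n", doorbell_range);	/* 0x00080000 */
	return 0;
}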
 
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h b/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h
index 0de00fbe9233..f3a7d207af07 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h
+++ b/drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h
@@ -191,7 +191,7 @@ enum psp_gfx_fw_type
     GFX_FW_TYPE_MMSCH       = 19,
     GFX_FW_TYPE_RLC_RESTORE_LIST_GPM_MEM        = 20,
     GFX_FW_TYPE_RLC_RESTORE_LIST_SRM_MEM        = 21,
-    GFX_FW_TYPE_RLC_RESTORE_LIST_CNTL           = 22,
+    GFX_FW_TYPE_RLC_RESTORE_LIST_SRM_CNTL       = 22,
     GFX_FW_TYPE_UVD1        = 23,
     GFX_FW_TYPE_MAX         = 24
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v10_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v10_0.c
index d78b4306a36f..77c2bc344dfc 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v10_0.c
@@ -38,75 +38,6 @@ MODULE_FIRMWARE("amdgpu/raven_asd.bin");
 MODULE_FIRMWARE("amdgpu/picasso_asd.bin");
 MODULE_FIRMWARE("amdgpu/raven2_asd.bin");
 
-static int
-psp_v10_0_get_fw_type(struct amdgpu_firmware_info *ucode, enum psp_gfx_fw_type *type)
-{
-	switch(ucode->ucode_id) {
-	case AMDGPU_UCODE_ID_SDMA0:
-		*type = GFX_FW_TYPE_SDMA0;
-		break;
-	case AMDGPU_UCODE_ID_SDMA1:
-		*type = GFX_FW_TYPE_SDMA1;
-		break;
-	case AMDGPU_UCODE_ID_CP_CE:
-		*type = GFX_FW_TYPE_CP_CE;
-		break;
-	case AMDGPU_UCODE_ID_CP_PFP:
-		*type = GFX_FW_TYPE_CP_PFP;
-		break;
-	case AMDGPU_UCODE_ID_CP_ME:
-		*type = GFX_FW_TYPE_CP_ME;
-		break;
-	case AMDGPU_UCODE_ID_CP_MEC1:
-		*type = GFX_FW_TYPE_CP_MEC;
-		break;
-	case AMDGPU_UCODE_ID_CP_MEC1_JT:
-		*type = GFX_FW_TYPE_CP_MEC_ME1;
-		break;
-	case AMDGPU_UCODE_ID_CP_MEC2:
-		*type = GFX_FW_TYPE_CP_MEC;
-		break;
-	case AMDGPU_UCODE_ID_CP_MEC2_JT:
-		*type = GFX_FW_TYPE_CP_MEC_ME2;
-		break;
-	case AMDGPU_UCODE_ID_RLC_G:
-		*type = GFX_FW_TYPE_RLC_G;
-		break;
-	case AMDGPU_UCODE_ID_RLC_RESTORE_LIST_CNTL:
-		*type = GFX_FW_TYPE_RLC_RESTORE_LIST_CNTL;
-		break;
-	case AMDGPU_UCODE_ID_RLC_RESTORE_LIST_GPM_MEM:
-		*type = GFX_FW_TYPE_RLC_RESTORE_LIST_GPM_MEM;
-		break;
-	case AMDGPU_UCODE_ID_RLC_RESTORE_LIST_SRM_MEM:
-		*type = GFX_FW_TYPE_RLC_RESTORE_LIST_SRM_MEM;
-		break;
-	case AMDGPU_UCODE_ID_SMC:
-		*type = GFX_FW_TYPE_SMU;
-		break;
-	case AMDGPU_UCODE_ID_UVD:
-		*type = GFX_FW_TYPE_UVD;
-		break;
-	case AMDGPU_UCODE_ID_VCE:
-		*type = GFX_FW_TYPE_VCE;
-		break;
-	case AMDGPU_UCODE_ID_VCN:
-		*type = GFX_FW_TYPE_VCN;
-		break;
-	case AMDGPU_UCODE_ID_DMCU_ERAM:
-		*type = GFX_FW_TYPE_DMCU_ERAM;
-		break;
-	case AMDGPU_UCODE_ID_DMCU_INTV:
-		*type = GFX_FW_TYPE_DMCU_ISR;
-		break;
-	case AMDGPU_UCODE_ID_MAXIMUM:
-	default:
-		return -EINVAL;
-	}
-
-	return 0;
-}
-
 static int psp_v10_0_init_microcode(struct psp_context *psp)
 {
 	struct amdgpu_device *adev = psp->adev;
@@ -158,26 +89,6 @@ out:
 	return err;
 }
 
-static int psp_v10_0_prep_cmd_buf(struct amdgpu_firmware_info *ucode,
-				  struct psp_gfx_cmd_resp *cmd)
-{
-	int ret;
-	uint64_t fw_mem_mc_addr = ucode->mc_addr;
-
-	memset(cmd, 0, sizeof(struct psp_gfx_cmd_resp));
-
-	cmd->cmd_id = GFX_CMD_ID_LOAD_IP_FW;
-	cmd->cmd.cmd_load_ip_fw.fw_phy_addr_lo = lower_32_bits(fw_mem_mc_addr);
-	cmd->cmd.cmd_load_ip_fw.fw_phy_addr_hi = upper_32_bits(fw_mem_mc_addr);
-	cmd->cmd.cmd_load_ip_fw.fw_size = ucode->ucode_size;
-
-	ret = psp_v10_0_get_fw_type(ucode, &cmd->cmd.cmd_load_ip_fw.fw_type);
-	if (ret)
-		DRM_ERROR("Unknown firmware type\n");
-
-	return ret;
-}
-
 static int psp_v10_0_ring_init(struct psp_context *psp,
 			       enum psp_ring_type ring_type)
 {
@@ -454,7 +365,6 @@ static int psp_v10_0_mode1_reset(struct psp_context *psp)
 
 static const struct psp_funcs psp_v10_0_funcs = {
 	.init_microcode = psp_v10_0_init_microcode,
-	.prep_cmd_buf = psp_v10_0_prep_cmd_buf,
 	.ring_init = psp_v10_0_ring_init,
 	.ring_create = psp_v10_0_ring_create,
 	.ring_stop = psp_v10_0_ring_stop,
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
index 189fcb004579..860b70d80d3c 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v11_0.c
@@ -40,60 +40,6 @@ MODULE_FIRMWARE("amdgpu/vega20_ta.bin");
 /* address block */
 #define smnMP1_FIRMWARE_FLAGS		0x3010024
 
-static int
-psp_v11_0_get_fw_type(struct amdgpu_firmware_info *ucode, enum psp_gfx_fw_type *type)
-{
-	switch (ucode->ucode_id) {
-	case AMDGPU_UCODE_ID_SDMA0:
-		*type = GFX_FW_TYPE_SDMA0;
-		break;
-	case AMDGPU_UCODE_ID_SDMA1:
-		*type = GFX_FW_TYPE_SDMA1;
-		break;
-	case AMDGPU_UCODE_ID_CP_CE:
-		*type = GFX_FW_TYPE_CP_CE;
-		break;
-	case AMDGPU_UCODE_ID_CP_PFP:
-		*type = GFX_FW_TYPE_CP_PFP;
-		break;
-	case AMDGPU_UCODE_ID_CP_ME:
-		*type = GFX_FW_TYPE_CP_ME;
-		break;
-	case AMDGPU_UCODE_ID_CP_MEC1:
-		*type = GFX_FW_TYPE_CP_MEC;
-		break;
-	case AMDGPU_UCODE_ID_CP_MEC1_JT:
-		*type = GFX_FW_TYPE_CP_MEC_ME1;
-		break;
-	case AMDGPU_UCODE_ID_CP_MEC2:
-		*type = GFX_FW_TYPE_CP_MEC;
-		break;
-	case AMDGPU_UCODE_ID_CP_MEC2_JT:
-		*type = GFX_FW_TYPE_CP_MEC_ME2;
-		break;
-	case AMDGPU_UCODE_ID_RLC_G:
-		*type = GFX_FW_TYPE_RLC_G;
-		break;
-	case AMDGPU_UCODE_ID_SMC:
-		*type = GFX_FW_TYPE_SMU;
-		break;
-	case AMDGPU_UCODE_ID_UVD:
-		*type = GFX_FW_TYPE_UVD;
-		break;
-	case AMDGPU_UCODE_ID_VCE:
-		*type = GFX_FW_TYPE_VCE;
-		break;
-	case AMDGPU_UCODE_ID_UVD1:
-		*type = GFX_FW_TYPE_UVD1;
-		break;
-	case AMDGPU_UCODE_ID_MAXIMUM:
-	default:
-		return -EINVAL;
-	}
-
-	return 0;
-}
-
 static int psp_v11_0_init_microcode(struct psp_context *psp)
 {
 	struct amdgpu_device *adev = psp->adev;
@@ -271,26 +217,6 @@ static int psp_v11_0_bootloader_load_sos(struct psp_context *psp)
 	return ret;
 }
 
-static int psp_v11_0_prep_cmd_buf(struct amdgpu_firmware_info *ucode,
-				 struct psp_gfx_cmd_resp *cmd)
-{
-	int ret;
-	uint64_t fw_mem_mc_addr = ucode->mc_addr;
-
-	memset(cmd, 0, sizeof(struct psp_gfx_cmd_resp));
-
-	cmd->cmd_id = GFX_CMD_ID_LOAD_IP_FW;
-	cmd->cmd.cmd_load_ip_fw.fw_phy_addr_lo = lower_32_bits(fw_mem_mc_addr);
-	cmd->cmd.cmd_load_ip_fw.fw_phy_addr_hi = upper_32_bits(fw_mem_mc_addr);
-	cmd->cmd.cmd_load_ip_fw.fw_size = ucode->ucode_size;
-
-	ret = psp_v11_0_get_fw_type(ucode, &cmd->cmd.cmd_load_ip_fw.fw_type);
-	if (ret)
-		DRM_ERROR("Unknown firmware type\n");
-
-	return ret;
-}
-
 static int psp_v11_0_ring_init(struct psp_context *psp,
 			      enum psp_ring_type ring_type)
 {
@@ -757,7 +683,6 @@ static const struct psp_funcs psp_v11_0_funcs = {
 	.init_microcode = psp_v11_0_init_microcode,
 	.bootloader_load_sysdrv = psp_v11_0_bootloader_load_sysdrv,
 	.bootloader_load_sos = psp_v11_0_bootloader_load_sos,
-	.prep_cmd_buf = psp_v11_0_prep_cmd_buf,
 	.ring_init = psp_v11_0_ring_init,
 	.ring_create = psp_v11_0_ring_create,
 	.ring_stop = psp_v11_0_ring_stop,
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
index 79694ff16969..c63de945c021 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
@@ -47,57 +47,6 @@ MODULE_FIRMWARE("amdgpu/vega12_asd.bin");
 
 static uint32_t sos_old_versions[] = {1517616, 1510592, 1448594, 1446554};
 
-static int
-psp_v3_1_get_fw_type(struct amdgpu_firmware_info *ucode, enum psp_gfx_fw_type *type)
-{
-	switch(ucode->ucode_id) {
-	case AMDGPU_UCODE_ID_SDMA0:
-		*type = GFX_FW_TYPE_SDMA0;
-		break;
-	case AMDGPU_UCODE_ID_SDMA1:
-		*type = GFX_FW_TYPE_SDMA1;
-		break;
-	case AMDGPU_UCODE_ID_CP_CE:
-		*type = GFX_FW_TYPE_CP_CE;
-		break;
-	case AMDGPU_UCODE_ID_CP_PFP:
-		*type = GFX_FW_TYPE_CP_PFP;
-		break;
-	case AMDGPU_UCODE_ID_CP_ME:
-		*type = GFX_FW_TYPE_CP_ME;
-		break;
-	case AMDGPU_UCODE_ID_CP_MEC1:
-		*type = GFX_FW_TYPE_CP_MEC;
-		break;
-	case AMDGPU_UCODE_ID_CP_MEC1_JT:
-		*type = GFX_FW_TYPE_CP_MEC_ME1;
-		break;
-	case AMDGPU_UCODE_ID_CP_MEC2:
-		*type = GFX_FW_TYPE_CP_MEC;
-		break;
-	case AMDGPU_UCODE_ID_CP_MEC2_JT:
-		*type = GFX_FW_TYPE_CP_MEC_ME2;
-		break;
-	case AMDGPU_UCODE_ID_RLC_G:
-		*type = GFX_FW_TYPE_RLC_G;
-		break;
-	case AMDGPU_UCODE_ID_SMC:
-		*type = GFX_FW_TYPE_SMU;
-		break;
-	case AMDGPU_UCODE_ID_UVD:
-		*type = GFX_FW_TYPE_UVD;
-		break;
-	case AMDGPU_UCODE_ID_VCE:
-		*type = GFX_FW_TYPE_VCE;
-		break;
-	case AMDGPU_UCODE_ID_MAXIMUM:
-	default:
-		return -EINVAL;
-	}
-
-	return 0;
-}
-
 static int psp_v3_1_init_microcode(struct psp_context *psp)
 {
 	struct amdgpu_device *adev = psp->adev;
@@ -277,26 +226,6 @@ static int psp_v3_1_bootloader_load_sos(struct psp_context *psp)
 	return ret;
 }
 
-static int psp_v3_1_prep_cmd_buf(struct amdgpu_firmware_info *ucode,
-				 struct psp_gfx_cmd_resp *cmd)
-{
-	int ret;
-	uint64_t fw_mem_mc_addr = ucode->mc_addr;
-
-	memset(cmd, 0, sizeof(struct psp_gfx_cmd_resp));
-
-	cmd->cmd_id = GFX_CMD_ID_LOAD_IP_FW;
-	cmd->cmd.cmd_load_ip_fw.fw_phy_addr_lo = lower_32_bits(fw_mem_mc_addr);
-	cmd->cmd.cmd_load_ip_fw.fw_phy_addr_hi = upper_32_bits(fw_mem_mc_addr);
-	cmd->cmd.cmd_load_ip_fw.fw_size = ucode->ucode_size;
-
-	ret = psp_v3_1_get_fw_type(ucode, &cmd->cmd.cmd_load_ip_fw.fw_type);
-	if (ret)
-		DRM_ERROR("Unknown firmware type\n");
-
-	return ret;
-}
-
 static int psp_v3_1_ring_init(struct psp_context *psp,
 			      enum psp_ring_type ring_type)
 {
@@ -615,7 +544,6 @@ static const struct psp_funcs psp_v3_1_funcs = {
 	.init_microcode = psp_v3_1_init_microcode,
 	.bootloader_load_sysdrv = psp_v3_1_bootloader_load_sysdrv,
 	.bootloader_load_sos = psp_v3_1_bootloader_load_sos,
-	.prep_cmd_buf = psp_v3_1_prep_cmd_buf,
 	.ring_init = psp_v3_1_ring_init,
 	.ring_create = psp_v3_1_ring_create,
 	.ring_stop = psp_v3_1_ring_stop,
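The identical get_fw_type()/prep_cmd_buf() helpers are dropped from all three
PSP variants, presumably folded into common PSP code that is not part of this
diff. A hypothetical sketch of one table-driven way to express the shared
mapping (enum names illustrative, not the real amdgpu_ucode.h/psp_gfx_if.h
identifiers):

#include <stdio.h>

/* illustrative IDs only */
enum ucode_id { UCODE_SDMA0, UCODE_CP_CE, UCODE_RLC_G, UCODE_MAX };
enum fw_type  { FW_INVALID, FW_SDMA0, FW_CP_CE, FW_RLC_G };

static const enum fw_type fw_type_map[UCODE_MAX] = {
	[UCODE_SDMA0] = FW_SDMA0,
	[UCODE_CP_CE] = FW_CP_CE,
	[UCODE_RLC_G] = FW_RLC_G,
};

static enum fw_type get_fw_type(unsigned id)
{
	/* unmapped slots default to 0 == FW_INVALID */
	return id < UCODE_MAX ? fw_type_map[id] : FW_INVALID;
}

int main(void)
{
	printf("%d %d\n", get_fw_type(UCODE_CP_CE), get_fw_type(99));	/* 2 0 */
	return 0;
}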
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
index 9f3cb2aec7c2..cca3552b36ed 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c
@@ -247,7 +247,7 @@ static void sdma_v2_4_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count)
 static void sdma_v2_4_ring_emit_ib(struct amdgpu_ring *ring,
 				   struct amdgpu_job *job,
 				   struct amdgpu_ib *ib,
-				   bool ctx_switch)
+				   uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 1bccc5fe2d9d..0ce8331baeb2 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -421,7 +421,7 @@ static void sdma_v3_0_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count)
 static void sdma_v3_0_ring_emit_ib(struct amdgpu_ring *ring,
 				   struct amdgpu_job *job,
 				   struct amdgpu_ib *ib,
-				   bool ctx_switch)
+				   uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 
@@ -1145,8 +1145,7 @@ static int sdma_v3_0_sw_init(void *handle)
 		ring->ring_obj = NULL;
 		if (!amdgpu_sriov_vf(adev)) {
 			ring->use_doorbell = true;
-			ring->doorbell_index = (i == 0) ?
-				adev->doorbell_index.sdma_engine0 : adev->doorbell_index.sdma_engine1;
+			ring->doorbell_index = adev->doorbell_index.sdma_engine[i];
 		} else {
 			ring->use_pollmem = true;
 		}
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index aa2f71cc1eba..c816e55d43a9 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -500,7 +500,7 @@ static void sdma_v4_0_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count)
 static void sdma_v4_0_ring_emit_ib(struct amdgpu_ring *ring,
 				   struct amdgpu_job *job,
 				   struct amdgpu_ib *ib,
-				   bool ctx_switch)
+				   uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 
@@ -834,8 +834,6 @@ static void sdma_v4_0_gfx_resume(struct amdgpu_device *adev, unsigned int i)
 					OFFSET, ring->doorbell_index);
 	WREG32_SDMA(i, mmSDMA0_GFX_DOORBELL, doorbell);
 	WREG32_SDMA(i, mmSDMA0_GFX_DOORBELL_OFFSET, doorbell_offset);
-	adev->nbio_funcs->sdma_doorbell_range(adev, i, ring->use_doorbell,
-					      ring->doorbell_index);
 
 	sdma_v4_0_ring_set_wptr(ring);
 
@@ -1522,9 +1520,7 @@ static int sdma_v4_0_sw_init(void *handle)
 				ring->use_doorbell?"true":"false");
 
 		/* doorbell size is 2 dwords, get DWORD offset */
-		ring->doorbell_index = (i == 0) ?
-			(adev->doorbell_index.sdma_engine0 << 1)
-			: (adev->doorbell_index.sdma_engine1 << 1);
+		ring->doorbell_index = adev->doorbell_index.sdma_engine[i] << 1;
 
 		sprintf(ring->name, "sdma%d", i);
 		r = amdgpu_ring_init(adev, ring, 1024,
@@ -1543,9 +1539,7 @@ static int sdma_v4_0_sw_init(void *handle)
 			/* paging queue use same doorbell index/routing as gfx queue
 			 * with 0x400 (4096 dwords) offset on second doorbell page
 			 */
-			ring->doorbell_index = (i == 0) ?
-				(adev->doorbell_index.sdma_engine0 << 1)
-				: (adev->doorbell_index.sdma_engine1 << 1);
+			ring->doorbell_index = adev->doorbell_index.sdma_engine[i] << 1;
 			ring->doorbell_index += 0x400;
 
 			sprintf(ring->name, "page%d", i);
diff --git a/drivers/gpu/drm/amd/amdgpu/si.c b/drivers/gpu/drm/amd/amdgpu/si.c
index f8408f88cd37..9d8df68893b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/si.c
+++ b/drivers/gpu/drm/amd/amdgpu/si.c
@@ -47,6 +47,7 @@
 #include "dce/dce_6_0_d.h"
 #include "uvd/uvd_4_0_d.h"
 #include "bif/bif_3_0_d.h"
+#include "bif/bif_3_0_sh_mask.h"
 
 static const u32 tahiti_golden_registers[] =
 {
@@ -1258,6 +1259,11 @@ static bool si_need_full_reset(struct amdgpu_device *adev)
 	return true;
 }
 
+static bool si_need_reset_on_init(struct amdgpu_device *adev)
+{
+	return false;
+}
+
 static int si_get_pcie_lanes(struct amdgpu_device *adev)
 {
 	u32 link_width_cntl;
@@ -1323,6 +1329,52 @@ static void si_set_pcie_lanes(struct amdgpu_device *adev, int lanes)
 	WREG32_PCIE_PORT(PCIE_LC_LINK_WIDTH_CNTL, link_width_cntl);
 }
 
+static void si_get_pcie_usage(struct amdgpu_device *adev, uint64_t *count0,
+			      uint64_t *count1)
+{
+	uint32_t perfctr = 0;
+	uint64_t cnt0_of, cnt1_of;
+	int tmp;
+
+	/* This reports 0 on APUs, so return to avoid writing/reading registers
+	 * that may or may not be different from their GPU counterparts
+	 */
+	if (adev->flags & AMD_IS_APU)
+		return;
+
+	/* Set the 2 events that we wish to watch, defined above */
+	/* Reg 40 is # received msgs, Reg 104 is # of posted requests sent */
+	perfctr = REG_SET_FIELD(perfctr, PCIE_PERF_CNTL_TXCLK, EVENT0_SEL, 40);
+	perfctr = REG_SET_FIELD(perfctr, PCIE_PERF_CNTL_TXCLK, EVENT1_SEL, 104);
+
+	/* Write to enable desired perf counters */
+	WREG32_PCIE(ixPCIE_PERF_CNTL_TXCLK, perfctr);
+	/* Zero out and enable the perf counters
+	 * Write 0x5:
+	 * Bit 0 = Start all counters(1)
+	 * Bit 2 = Global counter reset enable(1)
+	 */
+	WREG32_PCIE(ixPCIE_PERF_COUNT_CNTL, 0x00000005);
+
+	msleep(1000);
+
+	/* Load the shadow and disable the perf counters
+	 * Write 0x2:
+	 * Bit 0 = Stop counters(0)
+	 * Bit 1 = Load the shadow counters(1)
+	 */
+	WREG32_PCIE(ixPCIE_PERF_COUNT_CNTL, 0x00000002);
+
+	/* Read register values to get any >32bit overflow */
+	tmp = RREG32_PCIE(ixPCIE_PERF_CNTL_TXCLK);
+	cnt0_of = REG_GET_FIELD(tmp, PCIE_PERF_CNTL_TXCLK, COUNTER0_UPPER);
+	cnt1_of = REG_GET_FIELD(tmp, PCIE_PERF_CNTL_TXCLK, COUNTER1_UPPER);
+
+	/* Get the values and add the overflow */
+	*count0 = RREG32_PCIE(ixPCIE_PERF_COUNT0_TXCLK) | (cnt0_of << 32);
+	*count1 = RREG32_PCIE(ixPCIE_PERF_COUNT1_TXCLK) | (cnt1_of << 32);
+}
+
 static const struct amdgpu_asic_funcs si_asic_funcs =
 {
 	.read_disabled_bios = &si_read_disabled_bios,
@@ -1339,6 +1391,8 @@ static const struct amdgpu_asic_funcs si_asic_funcs =
 	.flush_hdp = &si_flush_hdp,
 	.invalidate_hdp = &si_invalidate_hdp,
 	.need_full_reset = &si_need_full_reset,
+	.get_pcie_usage = &si_get_pcie_usage,
+	.need_reset_on_init = &si_need_reset_on_init,
 };
 
 static uint32_t si_get_rev_id(struct amdgpu_device *adev)
@@ -1382,7 +1436,7 @@ static int si_common_early_init(void *handle)
 			AMD_CG_SUPPORT_UVD_MGCG |
 			AMD_CG_SUPPORT_HDP_LS |
 			AMD_CG_SUPPORT_HDP_MGCG;
-			adev->pg_flags = 0;
+		adev->pg_flags = 0;
 		adev->external_rev_id = (adev->rev_id == 0) ? 1 :
 					(adev->rev_id == 1) ? 5 : 6;
 		break;
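A stand-alone sketch of the counter reconstruction at the tail of
si_get_pcie_usage(): a 32-bit running count plus separate overflow bits
combine into one 64-bit value (register values illustrative):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t lo = 0x00001234;	/* illustrative PCIE_PERF_COUNT0_TXCLK */
	uint64_t overflow = 0x2;	/* illustrative COUNTER0_UPPER field */
	uint64_t count0 = (uint64_t)lo | (overflow << 32);

	printf("count0 = 0x%llx\n", (unsigned long long)count0);	/* 0x200001234 */
	return 0;
}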
diff --git a/drivers/gpu/drm/amd/amdgpu/si_dma.c b/drivers/gpu/drm/amd/amdgpu/si_dma.c
index b6e473134e19..f15f196684ba 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_dma.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_dma.c
@@ -63,7 +63,7 @@ static void si_dma_ring_set_wptr(struct amdgpu_ring *ring)
 static void si_dma_ring_emit_ib(struct amdgpu_ring *ring,
 				struct amdgpu_job *job,
 				struct amdgpu_ib *ib,
-				bool ctx_switch)
+				uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 	/* The indirect buffer packet must end on an 8 DW boundary in the DMA ring.
diff --git a/drivers/gpu/drm/amd/amdgpu/si_dpm.c b/drivers/gpu/drm/amd/amdgpu/si_dpm.c
index da58040fdbdc..41e01a7f57a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_dpm.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_dpm.c
@@ -6216,10 +6216,12 @@ static void si_request_link_speed_change_before_state_change(struct amdgpu_devic
 			si_pi->force_pcie_gen = AMDGPU_PCIE_GEN2;
 			if (current_link_speed == AMDGPU_PCIE_GEN2)
 				break;
+			/* fall through */
 		case AMDGPU_PCIE_GEN2:
 			if (amdgpu_acpi_pcie_performance_request(adev, PCIE_PERF_REQ_PECI_GEN2, false) == 0)
 				break;
 #endif
+			/* fall through */
 		default:
 			si_pi->force_pcie_gen = si_get_current_pcie_speed(adev);
 			break;
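The two comments added above mark the fallthroughs as intentional so that
GCC's -Wimplicit-fallthrough stays quiet. A minimal stand-alone example of
the same cascade pattern, falling through to progressively lower PCIe
generations (function and values illustrative):

#include <stdio.h>

static int clamp_gen(int requested, int supported)
{
	int gen;

	switch (requested) {
	case 3:
		if (supported >= 3) {
			gen = 3;
			break;
		}
		/* fall through */
	case 2:
		if (supported >= 2) {
			gen = 2;
			break;
		}
		/* fall through */
	default:
		gen = 1;
		break;
	}
	return gen;
}

int main(void)
{
	/* prints "1 3 2": a gen3 request on a gen1 link degrades to gen1 */
	printf("%d %d %d\n", clamp_gen(3, 1), clamp_gen(3, 3), clamp_gen(2, 3));
	return 0;
}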
diff --git a/drivers/gpu/drm/amd/amdgpu/si_ih.c b/drivers/gpu/drm/amd/amdgpu/si_ih.c
index 2938fb9f17cc..8c50c9cab455 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c
@@ -57,9 +57,9 @@ static void si_ih_disable_interrupts(struct amdgpu_device *adev)
 
 static int si_ih_irq_init(struct amdgpu_device *adev)
 {
+	struct amdgpu_ih_ring *ih = &adev->irq.ih;
 	int rb_bufsz;
 	u32 interrupt_cntl, ih_cntl, ih_rb_cntl;
-	u64 wptr_off;
 
 	si_ih_disable_interrupts(adev);
 	WREG32(INTERRUPT_CNTL2, adev->irq.ih.gpu_addr >> 8);
@@ -76,9 +76,8 @@ static int si_ih_irq_init(struct amdgpu_device *adev)
 		     (rb_bufsz << 1) |
 		     IH_WPTR_WRITEBACK_ENABLE;
 
-	wptr_off = adev->wb.gpu_addr + (adev->irq.ih.wptr_offs * 4);
-	WREG32(IH_RB_WPTR_ADDR_LO, lower_32_bits(wptr_off));
-	WREG32(IH_RB_WPTR_ADDR_HI, upper_32_bits(wptr_off) & 0xFF);
+	WREG32(IH_RB_WPTR_ADDR_LO, lower_32_bits(ih->wptr_addr));
+	WREG32(IH_RB_WPTR_ADDR_HI, upper_32_bits(ih->wptr_addr) & 0xFF);
 	WREG32(IH_RB_CNTL, ih_rb_cntl);
 	WREG32(IH_RB_RPTR, 0);
 	WREG32(IH_RB_WPTR, 0);
@@ -100,34 +99,36 @@ static void si_ih_irq_disable(struct amdgpu_device *adev)
 	mdelay(1);
 }
 
-static u32 si_ih_get_wptr(struct amdgpu_device *adev)
+static u32 si_ih_get_wptr(struct amdgpu_device *adev,
+			  struct amdgpu_ih_ring *ih)
 {
 	u32 wptr, tmp;
 
-	wptr = le32_to_cpu(adev->wb.wb[adev->irq.ih.wptr_offs]);
+	wptr = le32_to_cpu(*ih->wptr_cpu);
 
 	if (wptr & IH_RB_WPTR__RB_OVERFLOW_MASK) {
 		wptr &= ~IH_RB_WPTR__RB_OVERFLOW_MASK;
 		dev_warn(adev->dev, "IH ring buffer overflow (0x%08X, 0x%08X, 0x%08X)\n",
-			wptr, adev->irq.ih.rptr, (wptr + 16) & adev->irq.ih.ptr_mask);
-		adev->irq.ih.rptr = (wptr + 16) & adev->irq.ih.ptr_mask;
+			wptr, ih->rptr, (wptr + 16) & ih->ptr_mask);
+		ih->rptr = (wptr + 16) & ih->ptr_mask;
 		tmp = RREG32(IH_RB_CNTL);
 		tmp |= IH_RB_CNTL__WPTR_OVERFLOW_CLEAR_MASK;
 		WREG32(IH_RB_CNTL, tmp);
 	}
-	return (wptr & adev->irq.ih.ptr_mask);
+	return (wptr & ih->ptr_mask);
 }
 
 static void si_ih_decode_iv(struct amdgpu_device *adev,
-			     struct amdgpu_iv_entry *entry)
+			    struct amdgpu_ih_ring *ih,
+			    struct amdgpu_iv_entry *entry)
 {
-	u32 ring_index = adev->irq.ih.rptr >> 2;
+	u32 ring_index = ih->rptr >> 2;
 	uint32_t dw[4];
 
-	dw[0] = le32_to_cpu(adev->irq.ih.ring[ring_index + 0]);
-	dw[1] = le32_to_cpu(adev->irq.ih.ring[ring_index + 1]);
-	dw[2] = le32_to_cpu(adev->irq.ih.ring[ring_index + 2]);
-	dw[3] = le32_to_cpu(adev->irq.ih.ring[ring_index + 3]);
+	dw[0] = le32_to_cpu(ih->ring[ring_index + 0]);
+	dw[1] = le32_to_cpu(ih->ring[ring_index + 1]);
+	dw[2] = le32_to_cpu(ih->ring[ring_index + 2]);
+	dw[3] = le32_to_cpu(ih->ring[ring_index + 3]);
 
 	entry->client_id = AMDGPU_IRQ_CLIENTID_LEGACY;
 	entry->src_id = dw[0] & 0xff;
@@ -135,12 +136,13 @@ static void si_ih_decode_iv(struct amdgpu_device *adev,
 	entry->ring_id = dw[2] & 0xff;
 	entry->vmid = (dw[2] >> 8) & 0xff;
 
-	adev->irq.ih.rptr += 16;
+	ih->rptr += 16;
 }
 
-static void si_ih_set_rptr(struct amdgpu_device *adev)
+static void si_ih_set_rptr(struct amdgpu_device *adev,
+			   struct amdgpu_ih_ring *ih)
 {
-	WREG32(IH_RB_RPTR, adev->irq.ih.rptr);
+	WREG32(IH_RB_RPTR, ih->rptr);
 }
 
 static int si_ih_early_init(void *handle)
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
index 9b639974c70c..99ebcf29dcb0 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -43,6 +43,10 @@
 #include "hdp/hdp_4_0_sh_mask.h"
 #include "smuio/smuio_9_0_offset.h"
 #include "smuio/smuio_9_0_sh_mask.h"
+#include "nbio/nbio_7_0_default.h"
+#include "nbio/nbio_7_0_sh_mask.h"
+#include "nbio/nbio_7_0_smn.h"
+#include "mp/mp_9_0_offset.h"
 
 #include "soc15.h"
 #include "soc15_common.h"
@@ -385,14 +389,13 @@ void soc15_program_register_sequence(struct amdgpu_device *adev,
 
 }
 
-
-static int soc15_asic_reset(struct amdgpu_device *adev)
+static int soc15_asic_mode1_reset(struct amdgpu_device *adev)
 {
 	u32 i;
 
 	amdgpu_atombios_scratch_regs_engine_hung(adev, true);
 
-	dev_info(adev->dev, "GPU reset\n");
+	dev_info(adev->dev, "GPU mode1 reset\n");
 
 	/* disable BM */
 	pci_clear_master(adev->pdev);
@@ -417,6 +420,63 @@ static int soc15_asic_reset(struct amdgpu_device *adev)
 	return 0;
 }
 
+static int soc15_asic_get_baco_capability(struct amdgpu_device *adev, bool *cap)
+{
+	void *pp_handle = adev->powerplay.pp_handle;
+	const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
+
+	if (!pp_funcs || !pp_funcs->get_asic_baco_capability) {
+		*cap = false;
+		return -ENOENT;
+	}
+
+	return pp_funcs->get_asic_baco_capability(pp_handle, cap);
+}
+
+static int soc15_asic_baco_reset(struct amdgpu_device *adev)
+{
+	void *pp_handle = adev->powerplay.pp_handle;
+	const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
+
+	if (!pp_funcs || !pp_funcs->get_asic_baco_state || !pp_funcs->set_asic_baco_state)
+		return -ENOENT;
+
+	/* enter BACO state */
+	if (pp_funcs->set_asic_baco_state(pp_handle, 1))
+		return -EIO;
+
+	/* exit BACO state */
+	if (pp_funcs->set_asic_baco_state(pp_handle, 0))
+		return -EIO;
+
+	dev_info(adev->dev, "GPU BACO reset\n");
+
+	return 0;
+}
+
+static int soc15_asic_reset(struct amdgpu_device *adev)
+{
+	int ret;
+	bool baco_reset;
+
+	switch (adev->asic_type) {
+	case CHIP_VEGA10:
+	case CHIP_VEGA20:
+		soc15_asic_get_baco_capability(adev, &baco_reset);
+		break;
+	default:
+		baco_reset = false;
+		break;
+	}
+
+	if (baco_reset)
+		ret = soc15_asic_baco_reset(adev);
+	else
+		ret = soc15_asic_mode1_reset(adev);
+
+	return ret;
+}
+
 /*static int soc15_set_uvd_clock(struct amdgpu_device *adev, u32 clock,
 			u32 cntl_reg, u32 status_reg)
 {
@@ -535,10 +595,12 @@ int soc15_set_ip_blocks(struct amdgpu_device *adev)
 		amdgpu_device_ip_block_add(adev, &vega10_common_ip_block);
 		amdgpu_device_ip_block_add(adev, &gmc_v9_0_ip_block);
 		amdgpu_device_ip_block_add(adev, &vega10_ih_ip_block);
-		if (adev->asic_type == CHIP_VEGA20)
-			amdgpu_device_ip_block_add(adev, &psp_v11_0_ip_block);
-		else
-			amdgpu_device_ip_block_add(adev, &psp_v3_1_ip_block);
+		if (likely(adev->firmware.load_type == AMDGPU_FW_LOAD_PSP)) {
+			if (adev->asic_type == CHIP_VEGA20)
+				amdgpu_device_ip_block_add(adev, &psp_v11_0_ip_block);
+			else
+				amdgpu_device_ip_block_add(adev, &psp_v3_1_ip_block);
+		}
 		amdgpu_device_ip_block_add(adev, &gfx_v9_0_ip_block);
 		amdgpu_device_ip_block_add(adev, &sdma_v4_0_ip_block);
 		if (!amdgpu_sriov_vf(adev))
@@ -560,7 +622,8 @@ int soc15_set_ip_blocks(struct amdgpu_device *adev)
 		amdgpu_device_ip_block_add(adev, &vega10_common_ip_block);
 		amdgpu_device_ip_block_add(adev, &gmc_v9_0_ip_block);
 		amdgpu_device_ip_block_add(adev, &vega10_ih_ip_block);
-		amdgpu_device_ip_block_add(adev, &psp_v10_0_ip_block);
+		if (likely(adev->firmware.load_type == AMDGPU_FW_LOAD_PSP))
+			amdgpu_device_ip_block_add(adev, &psp_v10_0_ip_block);
 		amdgpu_device_ip_block_add(adev, &gfx_v9_0_ip_block);
 		amdgpu_device_ip_block_add(adev, &sdma_v4_0_ip_block);
 		amdgpu_device_ip_block_add(adev, &pp_smu_ip_block);
@@ -601,6 +664,68 @@ static bool soc15_need_full_reset(struct amdgpu_device *adev)
 	/* change this when we implement soft reset */
 	return true;
 }
+
+static void soc15_get_pcie_usage(struct amdgpu_device *adev, uint64_t *count0,
+				 uint64_t *count1)
+{
+	uint32_t perfctr = 0;
+	uint64_t cnt0_of, cnt1_of;
+	int tmp;
+
+	/* This reports 0 on APUs, so return to avoid writing/reading registers
+	 * that may or may not be different from their GPU counterparts
+	 */
+	if (adev->flags & AMD_IS_APU)
+		return;
+
+	/* Set the 2 events that we wish to watch, defined above */
+	/* Reg 40 is # received msgs, Reg 104 is # of posted requests sent */
+	perfctr = REG_SET_FIELD(perfctr, PCIE_PERF_CNTL_TXCLK, EVENT0_SEL, 40);
+	perfctr = REG_SET_FIELD(perfctr, PCIE_PERF_CNTL_TXCLK, EVENT1_SEL, 104);
+
+	/* Write to enable desired perf counters */
+	WREG32_PCIE(smnPCIE_PERF_CNTL_TXCLK, perfctr);
+	/* Zero out and enable the perf counters
+	 * Write 0x5:
+	 * Bit 0 = Start all counters(1)
+	 * Bit 2 = Global counter reset enable(1)
+	 */
+	WREG32_PCIE(smnPCIE_PERF_COUNT_CNTL, 0x00000005);
+
+	msleep(1000);
+
+	/* Load the shadow and disable the perf counters
+	 * Write 0x2:
+	 * Bit 0 = Stop counters(0)
+	 * Bit 1 = Load the shadow counters(1)
+	 */
+	WREG32_PCIE(smnPCIE_PERF_COUNT_CNTL, 0x00000002);
+
+	/* Read register values to get any >32bit overflow */
+	tmp = RREG32_PCIE(smnPCIE_PERF_CNTL_TXCLK);
+	cnt0_of = REG_GET_FIELD(tmp, PCIE_PERF_CNTL_TXCLK, COUNTER0_UPPER);
+	cnt1_of = REG_GET_FIELD(tmp, PCIE_PERF_CNTL_TXCLK, COUNTER1_UPPER);
+
+	/* Get the values and add the overflow */
+	*count0 = RREG32_PCIE(smnPCIE_PERF_COUNT0_TXCLK) | (cnt0_of << 32);
+	*count1 = RREG32_PCIE(smnPCIE_PERF_COUNT1_TXCLK) | (cnt1_of << 32);
+}
+
+static bool soc15_need_reset_on_init(struct amdgpu_device *adev)
+{
+	u32 sol_reg;
+
+	if (adev->flags & AMD_IS_APU)
+		return false;
+
+	/* Check sOS sign of life register to confirm sys driver and sOS
+	 * have already been loaded.
+	 */
+	sol_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81);
+	if (sol_reg)
+		return true;
+
+	return false;
+}
 
 static const struct amdgpu_asic_funcs soc15_asic_funcs =
 {
@@ -617,6 +742,8 @@ static const struct amdgpu_asic_funcs soc15_asic_funcs =
 	.invalidate_hdp = &soc15_invalidate_hdp,
 	.need_full_reset = &soc15_need_full_reset,
 	.init_doorbell_index = &vega10_doorbell_index_init,
+	.get_pcie_usage = &soc15_get_pcie_usage,
+	.need_reset_on_init = &soc15_need_reset_on_init,
 };
 
 static const struct amdgpu_asic_funcs vega20_asic_funcs =
@@ -634,6 +761,8 @@ static const struct amdgpu_asic_funcs vega20_asic_funcs =
 	.invalidate_hdp = &soc15_invalidate_hdp,
 	.need_full_reset = &soc15_need_full_reset,
 	.init_doorbell_index = &vega20_doorbell_index_init,
+	.get_pcie_usage = &soc15_get_pcie_usage,
+	.need_reset_on_init = &soc15_need_reset_on_init,
 };
 
 static int soc15_common_early_init(void *handle)
@@ -842,6 +971,22 @@ static int soc15_common_sw_fini(void *handle)
 	return 0;
 }
 
+static void soc15_doorbell_range_init(struct amdgpu_device *adev)
+{
+	int i;
+	struct amdgpu_ring *ring;
+
+	for (i = 0; i < adev->sdma.num_instances; i++) {
+		ring = &adev->sdma.instance[i].ring;
+		adev->nbio_funcs->sdma_doorbell_range(adev, i,
+			ring->use_doorbell, ring->doorbell_index,
+			adev->doorbell_index.sdma_doorbell_range);
+	}
+
+	adev->nbio_funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
+						adev->irq.ih.doorbell_index);
+}
+
 static int soc15_common_hw_init(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
@@ -854,6 +999,12 @@ static int soc15_common_hw_init(void *handle)
 	adev->nbio_funcs->init_registers(adev);
 	/* enable the doorbell aperture */
 	soc15_enable_doorbell_aperture(adev, true);
+	/* HW doorbell routing policy: doorbell writes that fall outside
+	 * the SDMA/IH/MM/ACV ranges are routed to CP. So the SDMA/IH/MM/ACV
+	 * doorbell ranges must be initialized prior to CP IP block init
+	 * and ring test.
+	 */
+	soc15_doorbell_range_init(adev);
 
 	return 0;
 }
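A compact stand-alone model of the reset-path selection introduced above:
BACO (Bus Active, Chip Off) is used when the capability query reports
support, otherwise the MP1 mode1 reset; baco_supported() here merely stands
in for soc15_asic_get_baco_capability() plus the powerplay checks:

#include <stdbool.h>
#include <stdio.h>

enum chip { VEGA10, VEGA12, VEGA20, RAVEN };

static const char *chip_name[] = { "vega10", "vega12", "vega20", "raven" };

/* stands in for soc15_asic_get_baco_capability() + powerplay checks */
static bool baco_supported(enum chip c)
{
	return c == VEGA10 || c == VEGA20;
}

int main(void)
{
	for (enum chip c = VEGA10; c <= RAVEN; c++)
		printf("%s -> %s reset\n", chip_name[c],
		       baco_supported(c) ? "BACO" : "mode1");
	return 0;
}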
diff --git a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
index 15da06ddeb75..a20b711a6756 100644
--- a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
@@ -99,9 +99,9 @@ static void tonga_ih_disable_interrupts(struct amdgpu_device *adev)
  */
 static int tonga_ih_irq_init(struct amdgpu_device *adev)
 {
-	int rb_bufsz;
 	u32 interrupt_cntl, ih_rb_cntl, ih_doorbell_rtpr;
-	u64 wptr_off;
+	struct amdgpu_ih_ring *ih = &adev->irq.ih;
+	int rb_bufsz;
 
 	/* disable irqs */
 	tonga_ih_disable_interrupts(adev);
@@ -118,10 +118,7 @@ static int tonga_ih_irq_init(struct amdgpu_device *adev)
 	WREG32(mmINTERRUPT_CNTL, interrupt_cntl);
 
 	/* Ring Buffer base. [39:8] of 40-bit address of the beginning of the ring buffer*/
-	if (adev->irq.ih.use_bus_addr)
-		WREG32(mmIH_RB_BASE, adev->irq.ih.rb_dma_addr >> 8);
-	else
-		WREG32(mmIH_RB_BASE, adev->irq.ih.gpu_addr >> 8);
+	WREG32(mmIH_RB_BASE, ih->gpu_addr >> 8);
 
 	rb_bufsz = order_base_2(adev->irq.ih.ring_size / 4);
 	ih_rb_cntl = REG_SET_FIELD(0, IH_RB_CNTL, WPTR_OVERFLOW_CLEAR, 1);
@@ -136,12 +133,8 @@ static int tonga_ih_irq_init(struct amdgpu_device *adev)
 	WREG32(mmIH_RB_CNTL, ih_rb_cntl);
 
 	/* set the writeback address whether it's enabled or not */
-	if (adev->irq.ih.use_bus_addr)
-		wptr_off = adev->irq.ih.rb_dma_addr + (adev->irq.ih.wptr_offs * 4);
-	else
-		wptr_off = adev->wb.gpu_addr + (adev->irq.ih.wptr_offs * 4);
-	WREG32(mmIH_RB_WPTR_ADDR_LO, lower_32_bits(wptr_off));
-	WREG32(mmIH_RB_WPTR_ADDR_HI, upper_32_bits(wptr_off) & 0xFF);
+	WREG32(mmIH_RB_WPTR_ADDR_LO, lower_32_bits(ih->wptr_addr));
+	WREG32(mmIH_RB_WPTR_ADDR_HI, upper_32_bits(ih->wptr_addr) & 0xFF);
 
 	/* set rptr, wptr to 0 */
 	WREG32(mmIH_RB_RPTR, 0);
@@ -193,14 +186,12 @@ static void tonga_ih_irq_disable(struct amdgpu_device *adev)
  * Used by cz_irq_process(VI).
  * Returns the value of the wptr.
  */
-static u32 tonga_ih_get_wptr(struct amdgpu_device *adev)
+static u32 tonga_ih_get_wptr(struct amdgpu_device *adev,
+			     struct amdgpu_ih_ring *ih)
 {
 	u32 wptr, tmp;
 
-	if (adev->irq.ih.use_bus_addr)
-		wptr = le32_to_cpu(adev->irq.ih.ring[adev->irq.ih.wptr_offs]);
-	else
-		wptr = le32_to_cpu(adev->wb.wb[adev->irq.ih.wptr_offs]);
+	wptr = le32_to_cpu(*ih->wptr_cpu);
 
 	if (REG_GET_FIELD(wptr, IH_RB_WPTR, RB_OVERFLOW)) {
 		wptr = REG_SET_FIELD(wptr, IH_RB_WPTR, RB_OVERFLOW, 0);
@@ -209,13 +200,13 @@ static u32 tonga_ih_get_wptr(struct amdgpu_device *adev)
 		 * this should allow us to catchup.
 		 */
 		dev_warn(adev->dev, "IH ring buffer overflow (0x%08X, 0x%08X, 0x%08X)\n",
-			wptr, adev->irq.ih.rptr, (wptr + 16) & adev->irq.ih.ptr_mask);
-		adev->irq.ih.rptr = (wptr + 16) & adev->irq.ih.ptr_mask;
+			 wptr, ih->rptr, (wptr + 16) & ih->ptr_mask);
+		ih->rptr = (wptr + 16) & ih->ptr_mask;
 		tmp = RREG32(mmIH_RB_CNTL);
 		tmp = REG_SET_FIELD(tmp, IH_RB_CNTL, WPTR_OVERFLOW_CLEAR, 1);
 		WREG32(mmIH_RB_CNTL, tmp);
 	}
-	return (wptr & adev->irq.ih.ptr_mask);
+	return (wptr & ih->ptr_mask);
 }
 
 /**
@@ -227,16 +218,17 @@ static u32 tonga_ih_get_wptr(struct amdgpu_device *adev)
  * position and also advance the position.
  */
 static void tonga_ih_decode_iv(struct amdgpu_device *adev,
-				 struct amdgpu_iv_entry *entry)
+			       struct amdgpu_ih_ring *ih,
+			       struct amdgpu_iv_entry *entry)
 {
 	/* wptr/rptr are in bytes! */
-	u32 ring_index = adev->irq.ih.rptr >> 2;
+	u32 ring_index = ih->rptr >> 2;
 	uint32_t dw[4];
 
-	dw[0] = le32_to_cpu(adev->irq.ih.ring[ring_index + 0]);
-	dw[1] = le32_to_cpu(adev->irq.ih.ring[ring_index + 1]);
-	dw[2] = le32_to_cpu(adev->irq.ih.ring[ring_index + 2]);
-	dw[3] = le32_to_cpu(adev->irq.ih.ring[ring_index + 3]);
+	dw[0] = le32_to_cpu(ih->ring[ring_index + 0]);
+	dw[1] = le32_to_cpu(ih->ring[ring_index + 1]);
+	dw[2] = le32_to_cpu(ih->ring[ring_index + 2]);
+	dw[3] = le32_to_cpu(ih->ring[ring_index + 3]);
 
 	entry->client_id = AMDGPU_IRQ_CLIENTID_LEGACY;
 	entry->src_id = dw[0] & 0xff;
@@ -246,7 +238,7 @@ static void tonga_ih_decode_iv(struct amdgpu_device *adev,
 	entry->pasid = (dw[2] >> 16) & 0xffff;
 
 	/* wptr/rptr are in bytes! */
-	adev->irq.ih.rptr += 16;
+	ih->rptr += 16;
 }
 
 /**
@@ -256,17 +248,15 @@ static void tonga_ih_decode_iv(struct amdgpu_device *adev,
  *
  * Set the IH ring buffer rptr.
  */
-static void tonga_ih_set_rptr(struct amdgpu_device *adev)
+static void tonga_ih_set_rptr(struct amdgpu_device *adev,
+			      struct amdgpu_ih_ring *ih)
 {
-	if (adev->irq.ih.use_doorbell) {
+	if (ih->use_doorbell) {
 		/* XXX check if swapping is necessary on BE */
-		if (adev->irq.ih.use_bus_addr)
-			adev->irq.ih.ring[adev->irq.ih.rptr_offs] = adev->irq.ih.rptr;
-		else
-			adev->wb.wb[adev->irq.ih.rptr_offs] = adev->irq.ih.rptr;
-		WDOORBELL32(adev->irq.ih.doorbell_index, adev->irq.ih.rptr);
+		*ih->rptr_cpu = ih->rptr;
+		WDOORBELL32(ih->doorbell_index, ih->rptr);
 	} else {
-		WREG32(mmIH_RB_RPTR, adev->irq.ih.rptr);
+		WREG32(mmIH_RB_RPTR, ih->rptr);
 	}
 }
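A stand-alone sketch of the unified set_rptr path just above: with a
doorbell, the CPU-visible rptr copy is updated before the doorbell write;
without one, the rptr register is written directly (doorbell_write() stands
in for WDOORBELL32(), the printfs for the MMIO writes):

#include <stdint.h>
#include <stdio.h>

static uint32_t rptr_shadow;	/* stands in for *ih->rptr_cpu */

static void doorbell_write(uint32_t index, uint32_t value)
{
	printf("doorbell[0x%x] <- %u\n", index, value);	/* WDOORBELL32() stand-in */
}

static void set_rptr(int use_doorbell, uint32_t index, uint32_t rptr)
{
	if (use_doorbell) {
		rptr_shadow = rptr;		/* update memory copy first */
		doorbell_write(index, rptr);
	} else {
		printf("IH_RB_RPTR <- %u\n", rptr);	/* direct MMIO path */
	}
}

int main(void)
{
	set_rptr(1, 0x24, 64);
	set_rptr(0, 0, 64);
	return (int)(rptr_shadow - 64);	/* 0 on success */
}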
 
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
index d69c8f6daaf8..c4fb58667fd4 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c
@@ -511,7 +511,7 @@ static int uvd_v4_2_ring_test_ring(struct amdgpu_ring *ring)
 static void uvd_v4_2_ring_emit_ib(struct amdgpu_ring *ring,
 				  struct amdgpu_job *job,
 				  struct amdgpu_ib *ib,
-				  bool ctx_switch)
+				  uint32_t flags)
 {
 	amdgpu_ring_write(ring, PACKET0(mmUVD_RBC_IB_BASE, 0));
 	amdgpu_ring_write(ring, ib->gpu_addr);
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c
index ee8cd06ddc38..52bd8a654734 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c
@@ -526,7 +526,7 @@ static int uvd_v5_0_ring_test_ring(struct amdgpu_ring *ring)
 static void uvd_v5_0_ring_emit_ib(struct amdgpu_ring *ring,
 				  struct amdgpu_job *job,
 				  struct amdgpu_ib *ib,
-				  bool ctx_switch)
+				  uint32_t flags)
 {
 	amdgpu_ring_write(ring, PACKET0(mmUVD_LMI_RBC_IB_64BIT_BAR_LOW, 0));
 	amdgpu_ring_write(ring, lower_32_bits(ib->gpu_addr));
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index d4f4a66f8324..c9edddf9f88a 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -977,7 +977,7 @@ static int uvd_v6_0_ring_test_ring(struct amdgpu_ring *ring)
 static void uvd_v6_0_ring_emit_ib(struct amdgpu_ring *ring,
 				  struct amdgpu_job *job,
 				  struct amdgpu_ib *ib,
-				  bool ctx_switch)
+				  uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 
@@ -1003,7 +1003,7 @@ static void uvd_v6_0_ring_emit_ib(struct amdgpu_ring *ring,
 static void uvd_v6_0_enc_ring_emit_ib(struct amdgpu_ring *ring,
 					struct amdgpu_job *job,
 					struct amdgpu_ib *ib,
-					bool ctx_switch)
+					uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
index aef924026a28..dc461df48da0 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
@@ -1272,7 +1272,7 @@ static int uvd_v7_0_ring_patch_cs_in_place(struct amdgpu_cs_parser *p,
 static void uvd_v7_0_ring_emit_ib(struct amdgpu_ring *ring,
 				  struct amdgpu_job *job,
 				  struct amdgpu_ib *ib,
-				  bool ctx_switch)
+				  uint32_t flags)
 {
 	struct amdgpu_device *adev = ring->adev;
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
@@ -1303,7 +1303,7 @@ static void uvd_v7_0_ring_emit_ib(struct amdgpu_ring *ring,
 static void uvd_v7_0_enc_ring_emit_ib(struct amdgpu_ring *ring,
 					struct amdgpu_job *job,
 					struct amdgpu_ib *ib,
-					bool ctx_switch)
+					uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 2668effadd27..6ec65cf11112 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -834,7 +834,7 @@ out:
 static void vce_v3_0_ring_emit_ib(struct amdgpu_ring *ring,
 				  struct amdgpu_job *job,
 				  struct amdgpu_ib *ib,
-				  bool ctx_switch)
+				  uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index 9fb34b7d8e03..aadc3e66ebd7 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -947,7 +947,7 @@ static int vce_v4_0_set_powergating_state(void *handle,
 #endif
 
 static void vce_v4_0_ring_emit_ib(struct amdgpu_ring *ring, struct amdgpu_job *job,
-					struct amdgpu_ib *ib, bool ctx_switch)
+					struct amdgpu_ib *ib, uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
index 89bb2fef90eb..3dbc51f9d3b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -1371,7 +1371,7 @@ static void vcn_v1_0_dec_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u64
 static void vcn_v1_0_dec_ring_emit_ib(struct amdgpu_ring *ring,
 					struct amdgpu_job *job,
 					struct amdgpu_ib *ib,
-					bool ctx_switch)
+					uint32_t flags)
 {
 	struct amdgpu_device *adev = ring->adev;
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
@@ -1531,7 +1531,7 @@ static void vcn_v1_0_enc_ring_insert_end(struct amdgpu_ring *ring)
 static void vcn_v1_0_enc_ring_emit_ib(struct amdgpu_ring *ring,
 					struct amdgpu_job *job,
 					struct amdgpu_ib *ib,
-					bool ctx_switch)
+					uint32_t flags)
 {
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
 
@@ -1736,7 +1736,7 @@ static void vcn_v1_0_jpeg_ring_emit_fence(struct amdgpu_ring *ring, u64 addr, u6
 static void vcn_v1_0_jpeg_ring_emit_ib(struct amdgpu_ring *ring,
 					struct amdgpu_job *job,
 					struct amdgpu_ib *ib,
-					bool ctx_switch)
+					uint32_t flags)
 {
 	struct amdgpu_device *adev = ring->adev;
 	unsigned vmid = AMDGPU_JOB_GET_VMID(job);
diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
index 2c250b01a903..6d1f804277f8 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
@@ -50,6 +50,22 @@ static void vega10_ih_enable_interrupts(struct amdgpu_device *adev)
 	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, ENABLE_INTR, 1);
 	WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL, ih_rb_cntl);
 	adev->irq.ih.enabled = true;
+
+	if (adev->irq.ih1.ring_size) {
+		ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1);
+		ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING1,
+					   RB_ENABLE, 1);
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1, ih_rb_cntl);
+		adev->irq.ih1.enabled = true;
+	}
+
+	if (adev->irq.ih2.ring_size) {
+		ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2);
+		ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING2,
+					   RB_ENABLE, 1);
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2, ih_rb_cntl);
+		adev->irq.ih2.enabled = true;
+	}
 }
 
 /**
@@ -71,6 +87,53 @@ static void vega10_ih_disable_interrupts(struct amdgpu_device *adev)
 	WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR, 0);
 	adev->irq.ih.enabled = false;
 	adev->irq.ih.rptr = 0;
+
+	if (adev->irq.ih1.ring_size) {
+		ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1);
+		ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING1,
+					   RB_ENABLE, 0);
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1, ih_rb_cntl);
+		/* set rptr, wptr to 0 */
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR_RING1, 0);
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR_RING1, 0);
+		adev->irq.ih1.enabled = false;
+		adev->irq.ih1.rptr = 0;
+	}
+
+	if (adev->irq.ih2.ring_size) {
+		ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2);
+		ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING2,
+					   RB_ENABLE, 0);
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2, ih_rb_cntl);
+		/* set rptr, wptr to 0 */
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR_RING2, 0);
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR_RING2, 0);
+		adev->irq.ih2.enabled = false;
+		adev->irq.ih2.rptr = 0;
+	}
+}
+
+static uint32_t vega10_ih_rb_cntl(struct amdgpu_ih_ring *ih, uint32_t ih_rb_cntl)
+{
+	int rb_bufsz = order_base_2(ih->ring_size / 4);
+
+	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL,
+				   MC_SPACE, ih->use_bus_addr ? 1 : 4);
+	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL,
+				   WPTR_OVERFLOW_CLEAR, 1);
+	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL,
+				   WPTR_OVERFLOW_ENABLE, 1);
+	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, RB_SIZE, rb_bufsz);
+	/* Ring Buffer write pointer writeback. If enabled, IH_RB_WPTR register
+	 * value is written to memory
+	 */
+	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL,
+				   WPTR_WRITEBACK_ENABLE, 1);
+	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, MC_SNOOP, 1);
+	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, MC_RO, 0);
+	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, MC_VMID, 0);
+
+	return ih_rb_cntl;
 }
 
 /**
@@ -86,50 +149,32 @@ static void vega10_ih_disable_interrupts(struct amdgpu_device *adev)
  */
 static int vega10_ih_irq_init(struct amdgpu_device *adev)
 {
+	struct amdgpu_ih_ring *ih;
 	int ret = 0;
-	int rb_bufsz;
 	u32 ih_rb_cntl, ih_doorbell_rtpr;
 	u32 tmp;
-	u64 wptr_off;
 
 	/* disable irqs */
 	vega10_ih_disable_interrupts(adev);
 
 	adev->nbio_funcs->ih_control(adev);
 
-	ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL);
+	ih = &adev->irq.ih;
 	/* Ring Buffer base. [39:8] of 40-bit address of the beginning of the ring buffer*/
-	if (adev->irq.ih.use_bus_addr) {
-		WREG32_SOC15(OSSSYS, 0, mmIH_RB_BASE, adev->irq.ih.rb_dma_addr >> 8);
-		WREG32_SOC15(OSSSYS, 0, mmIH_RB_BASE_HI, ((u64)adev->irq.ih.rb_dma_addr >> 40) & 0xff);
-		ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, MC_SPACE, 1);
-	} else {
-		WREG32_SOC15(OSSSYS, 0, mmIH_RB_BASE, adev->irq.ih.gpu_addr >> 8);
-		WREG32_SOC15(OSSSYS, 0, mmIH_RB_BASE_HI, (adev->irq.ih.gpu_addr >> 40) & 0xff);
-		ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, MC_SPACE, 4);
-	}
-	rb_bufsz = order_base_2(adev->irq.ih.ring_size / 4);
-	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, WPTR_OVERFLOW_CLEAR, 1);
-	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, WPTR_OVERFLOW_ENABLE, 1);
-	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, RB_SIZE, rb_bufsz);
-	/* Ring Buffer write pointer writeback. If enabled, IH_RB_WPTR register value is written to memory */
-	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, WPTR_WRITEBACK_ENABLE, 1);
-	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, MC_SNOOP, 1);
-	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, MC_RO, 0);
-	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, MC_VMID, 0);
-
-	if (adev->irq.msi_enabled)
-		ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, RPTR_REARM, 1);
+	WREG32_SOC15(OSSSYS, 0, mmIH_RB_BASE, ih->gpu_addr >> 8);
+	WREG32_SOC15(OSSSYS, 0, mmIH_RB_BASE_HI, (ih->gpu_addr >> 40) & 0xff);
 
+	ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL);
+	ih_rb_cntl = vega10_ih_rb_cntl(ih, ih_rb_cntl);
+	ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, RPTR_REARM,
+				   !!adev->irq.msi_enabled);
 	WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL, ih_rb_cntl);
 
 	/* set the writeback address whether it's enabled or not */
-	if (adev->irq.ih.use_bus_addr)
-		wptr_off = adev->irq.ih.rb_dma_addr + (adev->irq.ih.wptr_offs * 4);
-	else
-		wptr_off = adev->wb.gpu_addr + (adev->irq.ih.wptr_offs * 4);
-	WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR_ADDR_LO, lower_32_bits(wptr_off));
-	WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR_ADDR_HI, upper_32_bits(wptr_off) & 0xFFFF);
+	WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR_ADDR_LO,
+		     lower_32_bits(ih->wptr_addr));
+	WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR_ADDR_HI,
+		     upper_32_bits(ih->wptr_addr) & 0xFFFF);
 
 	/* set rptr, wptr to 0 */
 	WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR, 0);
@@ -137,17 +182,48 @@ static int vega10_ih_irq_init(struct amdgpu_device *adev)
 
 	ih_doorbell_rtpr = RREG32_SOC15(OSSSYS, 0, mmIH_DOORBELL_RPTR);
 	if (adev->irq.ih.use_doorbell) {
-		ih_doorbell_rtpr = REG_SET_FIELD(ih_doorbell_rtpr, IH_DOORBELL_RPTR,
-						 OFFSET, adev->irq.ih.doorbell_index);
-		ih_doorbell_rtpr = REG_SET_FIELD(ih_doorbell_rtpr, IH_DOORBELL_RPTR,
+		ih_doorbell_rtpr = REG_SET_FIELD(ih_doorbell_rtpr,
+						 IH_DOORBELL_RPTR, OFFSET,
+						 adev->irq.ih.doorbell_index);
+		ih_doorbell_rtpr = REG_SET_FIELD(ih_doorbell_rtpr,
+						 IH_DOORBELL_RPTR,
 						 ENABLE, 1);
 	} else {
-		ih_doorbell_rtpr = REG_SET_FIELD(ih_doorbell_rtpr, IH_DOORBELL_RPTR,
+		ih_doorbell_rtpr = REG_SET_FIELD(ih_doorbell_rtpr,
+						 IH_DOORBELL_RPTR,
 						 ENABLE, 0);
 	}
 	WREG32_SOC15(OSSSYS, 0, mmIH_DOORBELL_RPTR, ih_doorbell_rtpr);
-	adev->nbio_funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
-					    adev->irq.ih.doorbell_index);
+
+	ih = &adev->irq.ih1;
+	if (ih->ring_size) {
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_BASE_RING1, ih->gpu_addr >> 8);
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_BASE_HI_RING1,
+			     (ih->gpu_addr >> 40) & 0xff);
+
+		ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1);
+		ih_rb_cntl = vega10_ih_rb_cntl(ih, ih_rb_cntl);
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1, ih_rb_cntl);
+
+		/* set rptr, wptr to 0 */
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR_RING1, 0);
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR_RING1, 0);
+	}
+
+	ih = &adev->irq.ih2;
+	if (ih->ring_size) {
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_BASE_RING2, ih->gpu_addr >> 8);
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_BASE_HI_RING2,
+			     (ih->gpu_addr >> 40) & 0xff);
+
+		ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2);
+		ih_rb_cntl = vega10_ih_rb_cntl(ih, ih_rb_cntl);
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2, ih_rb_cntl);
+
+		/* set rptr, wptr to 0 */
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR_RING2, 0);
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR_RING2, 0);
+	}
 
 	tmp = RREG32_SOC15(OSSSYS, 0, mmIH_STORM_CLIENT_LIST_CNTL);
 	tmp = REG_SET_FIELD(tmp, IH_STORM_CLIENT_LIST_CNTL,
@@ -191,32 +267,58 @@ static void vega10_ih_irq_disable(struct amdgpu_device *adev)
  * ring buffer overflow and deal with it.
  * Returns the value of the wptr.
  */
-static u32 vega10_ih_get_wptr(struct amdgpu_device *adev)
+static u32 vega10_ih_get_wptr(struct amdgpu_device *adev,
+			      struct amdgpu_ih_ring *ih)
 {
-	u32 wptr, tmp;
+	u32 wptr, reg, tmp;
 
-	if (adev->irq.ih.use_bus_addr)
-		wptr = le32_to_cpu(adev->irq.ih.ring[adev->irq.ih.wptr_offs]);
+	wptr = le32_to_cpu(*ih->wptr_cpu);
+
+	if (!REG_GET_FIELD(wptr, IH_RB_WPTR, RB_OVERFLOW))
+		goto out;
+
+	/* Double check that the overflow wasn't already cleared. */
+
+	if (ih == &adev->irq.ih)
+		reg = SOC15_REG_OFFSET(OSSSYS, 0, mmIH_RB_WPTR);
+	else if (ih == &adev->irq.ih1)
+		reg = SOC15_REG_OFFSET(OSSSYS, 0, mmIH_RB_WPTR_RING1);
+	else if (ih == &adev->irq.ih2)
+		reg = SOC15_REG_OFFSET(OSSSYS, 0, mmIH_RB_WPTR_RING2);
 	else
-		wptr = le32_to_cpu(adev->wb.wb[adev->irq.ih.wptr_offs]);
-
-	if (REG_GET_FIELD(wptr, IH_RB_WPTR, RB_OVERFLOW)) {
-		wptr = REG_SET_FIELD(wptr, IH_RB_WPTR, RB_OVERFLOW, 0);
-
-		/* When a ring buffer overflow happen start parsing interrupt
-		 * from the last not overwritten vector (wptr + 32). Hopefully
-		 * this should allow us to catchup.
-		 */
-		tmp = (wptr + 32) & adev->irq.ih.ptr_mask;
-		dev_warn(adev->dev, "IH ring buffer overflow (0x%08X, 0x%08X, 0x%08X)\n",
-			wptr, adev->irq.ih.rptr, tmp);
-		adev->irq.ih.rptr = tmp;
-
-		tmp = RREG32_NO_KIQ(SOC15_REG_OFFSET(OSSSYS, 0, mmIH_RB_CNTL));
-		tmp = REG_SET_FIELD(tmp, IH_RB_CNTL, WPTR_OVERFLOW_CLEAR, 1);
-		WREG32_NO_KIQ(SOC15_REG_OFFSET(OSSSYS, 0, mmIH_RB_CNTL), tmp);
-	}
-	return (wptr & adev->irq.ih.ptr_mask);
+		BUG();
+
+	wptr = RREG32_NO_KIQ(reg);
+	if (!REG_GET_FIELD(wptr, IH_RB_WPTR, RB_OVERFLOW))
+		goto out;
+
+	wptr = REG_SET_FIELD(wptr, IH_RB_WPTR, RB_OVERFLOW, 0);
+
+	/* When a ring buffer overflow happens, start parsing interrupts
+	 * from the last vector that was not overwritten (wptr + 32).
+	 * Hopefully this allows us to catch up.
+	 */
+	tmp = (wptr + 32) & ih->ptr_mask;
+	dev_warn(adev->dev,
+		 "IH ring buffer overflow (0x%08X, 0x%08X, 0x%08X)\n",
+		 wptr, ih->rptr, tmp);
+	ih->rptr = tmp;
+
+	if (ih == &adev->irq.ih)
+		reg = SOC15_REG_OFFSET(OSSSYS, 0, mmIH_RB_CNTL);
+	else if (ih == &adev->irq.ih1)
+		reg = SOC15_REG_OFFSET(OSSSYS, 0, mmIH_RB_CNTL_RING1);
+	else if (ih == &adev->irq.ih2)
+		reg = SOC15_REG_OFFSET(OSSSYS, 0, mmIH_RB_CNTL_RING2);
+	else
+		BUG();
+
+	tmp = RREG32_NO_KIQ(reg);
+	tmp = REG_SET_FIELD(tmp, IH_RB_CNTL, WPTR_OVERFLOW_CLEAR, 1);
+	WREG32_NO_KIQ(reg, tmp);
+
+out:
+	return (wptr & ih->ptr_mask);
 }
 
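The recovery path above resumes reading one 32-byte IV past the write pointer, wrapped by the ring's byte mask. A minimal standalone sketch of that arithmetic (not driver code), assuming the 256 KiB ring size used by vega10_ih_sw_init():

    #include <stdint.h>
    #include <stdio.h>

    static uint32_t resume_rptr(uint32_t wptr, uint32_t ring_bytes)
    {
            uint32_t ptr_mask = ring_bytes - 1;     /* ring size is a power of two */

            return (wptr + 32) & ptr_mask;          /* skip the oldest, clobbered IV */
    }

    int main(void)
    {
            /* 256 KiB ring; write pointer sits just below the wrap point. */
            printf("rptr resumes at 0x%x\n",
                   (unsigned)resume_rptr(0x3fff0, 256 * 1024));  /* 0x10 */
            return 0;
    }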
 /**
@@ -228,20 +330,21 @@ static u32 vega10_ih_get_wptr(struct amdgpu_device *adev)
  * position and also advance the position.
  */
 static void vega10_ih_decode_iv(struct amdgpu_device *adev,
-				 struct amdgpu_iv_entry *entry)
+				struct amdgpu_ih_ring *ih,
+				struct amdgpu_iv_entry *entry)
 {
 	/* wptr/rptr are in bytes! */
-	u32 ring_index = adev->irq.ih.rptr >> 2;
+	u32 ring_index = ih->rptr >> 2;
 	uint32_t dw[8];
 
-	dw[0] = le32_to_cpu(adev->irq.ih.ring[ring_index + 0]);
-	dw[1] = le32_to_cpu(adev->irq.ih.ring[ring_index + 1]);
-	dw[2] = le32_to_cpu(adev->irq.ih.ring[ring_index + 2]);
-	dw[3] = le32_to_cpu(adev->irq.ih.ring[ring_index + 3]);
-	dw[4] = le32_to_cpu(adev->irq.ih.ring[ring_index + 4]);
-	dw[5] = le32_to_cpu(adev->irq.ih.ring[ring_index + 5]);
-	dw[6] = le32_to_cpu(adev->irq.ih.ring[ring_index + 6]);
-	dw[7] = le32_to_cpu(adev->irq.ih.ring[ring_index + 7]);
+	dw[0] = le32_to_cpu(ih->ring[ring_index + 0]);
+	dw[1] = le32_to_cpu(ih->ring[ring_index + 1]);
+	dw[2] = le32_to_cpu(ih->ring[ring_index + 2]);
+	dw[3] = le32_to_cpu(ih->ring[ring_index + 3]);
+	dw[4] = le32_to_cpu(ih->ring[ring_index + 4]);
+	dw[5] = le32_to_cpu(ih->ring[ring_index + 5]);
+	dw[6] = le32_to_cpu(ih->ring[ring_index + 6]);
+	dw[7] = le32_to_cpu(ih->ring[ring_index + 7]);
 
 	entry->client_id = dw[0] & 0xff;
 	entry->src_id = (dw[0] >> 8) & 0xff;
@@ -257,9 +360,8 @@ static void vega10_ih_decode_iv(struct amdgpu_device *adev,
 	entry->src_data[2] = dw[6];
 	entry->src_data[3] = dw[7];
 
-
 	/* wptr/rptr are in bytes! */
-	adev->irq.ih.rptr += 32;
+	ih->rptr += 32;
 }
 
 /**
@@ -269,37 +371,95 @@ static void vega10_ih_decode_iv(struct amdgpu_device *adev,
  *
  * Set the IH ring buffer rptr.
  */
-static void vega10_ih_set_rptr(struct amdgpu_device *adev)
+static void vega10_ih_set_rptr(struct amdgpu_device *adev,
+			       struct amdgpu_ih_ring *ih)
 {
-	if (adev->irq.ih.use_doorbell) {
+	if (ih->use_doorbell) {
 		/* XXX check if swapping is necessary on BE */
-		if (adev->irq.ih.use_bus_addr)
-			adev->irq.ih.ring[adev->irq.ih.rptr_offs] = adev->irq.ih.rptr;
-		else
-			adev->wb.wb[adev->irq.ih.rptr_offs] = adev->irq.ih.rptr;
-		WDOORBELL32(adev->irq.ih.doorbell_index, adev->irq.ih.rptr);
-	} else {
-		WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR, adev->irq.ih.rptr);
+		*ih->rptr_cpu = ih->rptr;
+		WDOORBELL32(ih->doorbell_index, ih->rptr);
+	} else if (ih == &adev->irq.ih) {
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR, ih->rptr);
+	} else if (ih == &adev->irq.ih1) {
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR_RING1, ih->rptr);
+	} else if (ih == &adev->irq.ih2) {
+		WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR_RING2, ih->rptr);
 	}
 }
 
+/**
+ * vega10_ih_self_irq - dispatch work for ring 1 and 2
+ *
+ * @adev: amdgpu_device pointer
+ * @source: irq source
+ * @entry: IV with WPTR update
+ *
+ * Update the WPTR from the IV and schedule work to handle the entries.
+ */
+static int vega10_ih_self_irq(struct amdgpu_device *adev,
+			      struct amdgpu_irq_src *source,
+			      struct amdgpu_iv_entry *entry)
+{
+	uint32_t wptr = cpu_to_le32(entry->src_data[0]);
+
+	switch (entry->ring_id) {
+	case 1:
+		*adev->irq.ih1.wptr_cpu = wptr;
+		schedule_work(&adev->irq.ih1_work);
+		break;
+	case 2:
+		*adev->irq.ih2.wptr_cpu = wptr;
+		schedule_work(&adev->irq.ih2_work);
+		break;
+	default:
+		break;
+	}
+	return 0;
+}
+
+static const struct amdgpu_irq_src_funcs vega10_ih_self_irq_funcs = {
+	.process = vega10_ih_self_irq,
+};
+
+static void vega10_ih_set_self_irq_funcs(struct amdgpu_device *adev)
+{
+	adev->irq.self_irq.num_types = 0;
+	adev->irq.self_irq.funcs = &vega10_ih_self_irq_funcs;
+}
+
 static int vega10_ih_early_init(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
 	vega10_ih_set_interrupt_funcs(adev);
+	vega10_ih_set_self_irq_funcs(adev);
 	return 0;
 }
 
 static int vega10_ih_sw_init(void *handle)
 {
-	int r;
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+	int r;
+
+	r = amdgpu_irq_add_id(adev, SOC15_IH_CLIENTID_IH, 0,
+			      &adev->irq.self_irq);
+	if (r)
+		return r;
 
 	r = amdgpu_ih_ring_init(adev, &adev->irq.ih, 256 * 1024, true);
 	if (r)
 		return r;
 
+	if (adev->asic_type == CHIP_VEGA10) {
+		r = amdgpu_ih_ring_init(adev, &adev->irq.ih1, PAGE_SIZE, true);
+		if (r)
+			return r;
+
+		r = amdgpu_ih_ring_init(adev, &adev->irq.ih2, PAGE_SIZE, true);
+		if (r)
+			return r;
+	}
+
+	/* TODO add doorbell for IH1 & IH2 as well */
 	adev->irq.ih.use_doorbell = true;
 	adev->irq.ih.doorbell_index = adev->doorbell_index.ih << 1;
 
@@ -313,6 +473,8 @@ static int vega10_ih_sw_fini(void *handle)
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
 	amdgpu_irq_fini(adev);
+	amdgpu_ih_ring_fini(adev, &adev->irq.ih2);
+	amdgpu_ih_ring_fini(adev, &adev->irq.ih1);
 	amdgpu_ih_ring_fini(adev, &adev->irq.ih);
 
 	return 0;
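vega10_ih_self_irq() above only records the new write pointer from the IV payload and defers the actual ring walk to a work item. A hedged sketch of that top-half/bottom-half split; the demo_* names are illustrative, not amdgpu's:

    #include <linux/interrupt.h>
    #include <linux/workqueue.h>

    struct demo_ring {
            u32 wptr_cpu;                   /* last write pointer seen */
            struct work_struct work;        /* deferred ring processing */
    };

    static void demo_ring_work(struct work_struct *work)
    {
            struct demo_ring *ring = container_of(work, struct demo_ring, work);

            /* walk ring entries up to ring->wptr_cpu in sleepable context */
            (void)ring;
    }

    static void demo_ring_init(struct demo_ring *ring)
    {
            INIT_WORK(&ring->work, demo_ring_work);
    }

    static irqreturn_t demo_irq(int irq, void *data)
    {
            struct demo_ring *ring = data;

            ring->wptr_cpu = 0x100;         /* would come from the IV payload */
            schedule_work(&ring->work);     /* cheap and safe in hard IRQ context */
            return IRQ_HANDLED;
    }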
diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c b/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c
index 422674bb3cdf..a8e92638a2e8 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega10_reg_init.c
@@ -70,8 +70,8 @@ void vega10_doorbell_index_init(struct amdgpu_device *adev)
 	adev->doorbell_index.userqueue_start = AMDGPU_DOORBELL64_USERQUEUE_START;
 	adev->doorbell_index.userqueue_end = AMDGPU_DOORBELL64_USERQUEUE_END;
 	adev->doorbell_index.gfx_ring0 = AMDGPU_DOORBELL64_GFX_RING0;
-	adev->doorbell_index.sdma_engine0 = AMDGPU_DOORBELL64_sDMA_ENGINE0;
-	adev->doorbell_index.sdma_engine1 = AMDGPU_DOORBELL64_sDMA_ENGINE1;
+	adev->doorbell_index.sdma_engine[0] = AMDGPU_DOORBELL64_sDMA_ENGINE0;
+	adev->doorbell_index.sdma_engine[1] = AMDGPU_DOORBELL64_sDMA_ENGINE1;
 	adev->doorbell_index.ih = AMDGPU_DOORBELL64_IH;
 	adev->doorbell_index.uvd_vce.uvd_ring0_1 = AMDGPU_DOORBELL64_UVD_RING0_1;
 	adev->doorbell_index.uvd_vce.uvd_ring2_3 = AMDGPU_DOORBELL64_UVD_RING2_3;
@@ -81,7 +81,12 @@ void vega10_doorbell_index_init(struct amdgpu_device *adev)
 	adev->doorbell_index.uvd_vce.vce_ring2_3 = AMDGPU_DOORBELL64_VCE_RING2_3;
 	adev->doorbell_index.uvd_vce.vce_ring4_5 = AMDGPU_DOORBELL64_VCE_RING4_5;
 	adev->doorbell_index.uvd_vce.vce_ring6_7 = AMDGPU_DOORBELL64_VCE_RING6_7;
+
+	adev->doorbell_index.first_non_cp = AMDGPU_DOORBELL64_FIRST_NON_CP;
+	adev->doorbell_index.last_non_cp = AMDGPU_DOORBELL64_LAST_NON_CP;
+
 	/* In unit of dword doorbell */
 	adev->doorbell_index.max_assignment = AMDGPU_DOORBELL64_MAX_ASSIGNMENT << 1;
+	adev->doorbell_index.sdma_doorbell_range = 4;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c b/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c
index edce413fda9a..0db84386252a 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega20_reg_init.c
@@ -68,14 +68,14 @@ void vega20_doorbell_index_init(struct amdgpu_device *adev)
 	adev->doorbell_index.userqueue_start = AMDGPU_VEGA20_DOORBELL_USERQUEUE_START;
 	adev->doorbell_index.userqueue_end = AMDGPU_VEGA20_DOORBELL_USERQUEUE_END;
 	adev->doorbell_index.gfx_ring0 = AMDGPU_VEGA20_DOORBELL_GFX_RING0;
-	adev->doorbell_index.sdma_engine0 = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE0;
-	adev->doorbell_index.sdma_engine1 = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE1;
-	adev->doorbell_index.sdma_engine2 = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE2;
-	adev->doorbell_index.sdma_engine3 = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE3;
-	adev->doorbell_index.sdma_engine4 = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE4;
-	adev->doorbell_index.sdma_engine5 = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE5;
-	adev->doorbell_index.sdma_engine6 = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE6;
-	adev->doorbell_index.sdma_engine7 = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE7;
+	adev->doorbell_index.sdma_engine[0] = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE0;
+	adev->doorbell_index.sdma_engine[1] = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE1;
+	adev->doorbell_index.sdma_engine[2] = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE2;
+	adev->doorbell_index.sdma_engine[3] = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE3;
+	adev->doorbell_index.sdma_engine[4] = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE4;
+	adev->doorbell_index.sdma_engine[5] = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE5;
+	adev->doorbell_index.sdma_engine[6] = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE6;
+	adev->doorbell_index.sdma_engine[7] = AMDGPU_VEGA20_DOORBELL_sDMA_ENGINE7;
 	adev->doorbell_index.ih = AMDGPU_VEGA20_DOORBELL_IH;
 	adev->doorbell_index.uvd_vce.uvd_ring0_1 = AMDGPU_VEGA20_DOORBELL64_UVD_RING0_1;
 	adev->doorbell_index.uvd_vce.uvd_ring2_3 = AMDGPU_VEGA20_DOORBELL64_UVD_RING2_3;
@@ -85,6 +85,11 @@ void vega20_doorbell_index_init(struct amdgpu_device *adev)
 	adev->doorbell_index.uvd_vce.vce_ring2_3 = AMDGPU_VEGA20_DOORBELL64_VCE_RING2_3;
 	adev->doorbell_index.uvd_vce.vce_ring4_5 = AMDGPU_VEGA20_DOORBELL64_VCE_RING4_5;
 	adev->doorbell_index.uvd_vce.vce_ring6_7 = AMDGPU_VEGA20_DOORBELL64_VCE_RING6_7;
+
+	adev->doorbell_index.first_non_cp = AMDGPU_VEGA20_DOORBELL64_FIRST_NON_CP;
+	adev->doorbell_index.last_non_cp = AMDGPU_VEGA20_DOORBELL64_LAST_NON_CP;
+
 	adev->doorbell_index.max_assignment = AMDGPU_VEGA20_DOORBELL_MAX_ASSIGNMENT << 1;
+	adev->doorbell_index.sdma_doorbell_range = 20;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
index 77e367459101..5e5b42a0744a 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -941,6 +941,69 @@ static bool vi_need_full_reset(struct amdgpu_device *adev)
 	}
 }
 
+static void vi_get_pcie_usage(struct amdgpu_device *adev, uint64_t *count0,
+			      uint64_t *count1)
+{
+	uint32_t perfctr = 0;
+	uint64_t cnt0_of, cnt1_of;
+	int tmp;
+
+	/* This reports 0 on APUs, so return to avoid writing/reading registers
+	 * that may or may not be different from their GPU counterparts
+	 */
+	if (adev->flags & AMD_IS_APU)
+		return;
+
+	/* Set the 2 events that we wish to watch, defined above */
+	/* Reg 40 is # received msgs, Reg 104 is # of posted requests sent */
+	perfctr = REG_SET_FIELD(perfctr, PCIE_PERF_CNTL_TXCLK, EVENT0_SEL, 40);
+	perfctr = REG_SET_FIELD(perfctr, PCIE_PERF_CNTL_TXCLK, EVENT1_SEL, 104);
+
+	/* Write to enable desired perf counters */
+	WREG32_PCIE(ixPCIE_PERF_CNTL_TXCLK, perfctr);
+	/* Zero out and enable the perf counters
+	 * Write 0x5:
+	 * Bit 0 = Start all counters(1)
+	 * Bit 2 = Global counter reset enable(1)
+	 */
+	WREG32_PCIE(ixPCIE_PERF_COUNT_CNTL, 0x00000005);
+
+	msleep(1000);
+
+	/* Load the shadow and disable the perf counters
+	 * Write 0x2:
+	 * Bit 0 = Stop counters(0)
+	 * Bit 1 = Load the shadow counters(1)
+	 */
+	WREG32_PCIE(ixPCIE_PERF_COUNT_CNTL, 0x00000002);
+
+	/* Read register values to get any >32bit overflow */
+	tmp = RREG32_PCIE(ixPCIE_PERF_CNTL_TXCLK);
+	cnt0_of = REG_GET_FIELD(tmp, PCIE_PERF_CNTL_TXCLK, COUNTER0_UPPER);
+	cnt1_of = REG_GET_FIELD(tmp, PCIE_PERF_CNTL_TXCLK, COUNTER1_UPPER);
+
+	/* Get the values and add the overflow */
+	*count0 = RREG32_PCIE(ixPCIE_PERF_COUNT0_TXCLK) | (cnt0_of << 32);
+	*count1 = RREG32_PCIE(ixPCIE_PERF_COUNT1_TXCLK) | (cnt1_of << 32);
+}
+
+static bool vi_need_reset_on_init(struct amdgpu_device *adev)
+{
+	u32 clock_cntl, pc;
+
+	if (adev->flags & AMD_IS_APU)
+		return false;
+
+	/* check if the SMC is already running */
+	clock_cntl = RREG32_SMC(ixSMC_SYSCON_CLOCK_CNTL_0);
+	pc = RREG32_SMC(ixSMC_PC_C);
+	if ((0 == REG_GET_FIELD(clock_cntl, SMC_SYSCON_CLOCK_CNTL_0, ck_disable)) &&
+	    (0x20100 <= pc))
+		return true;
+
+	return false;
+}
+
 static const struct amdgpu_asic_funcs vi_asic_funcs =
 {
 	.read_disabled_bios = &vi_read_disabled_bios,
@@ -956,6 +1019,8 @@ static const struct amdgpu_asic_funcs vi_asic_funcs =
 	.invalidate_hdp = &vi_invalidate_hdp,
 	.need_full_reset = &vi_need_full_reset,
 	.init_doorbell_index = &legacy_doorbell_index_init,
+	.get_pcie_usage = &vi_get_pcie_usage,
+	.need_reset_on_init = &vi_need_reset_on_init,
 };
 
 #define CZ_REV_BRISTOL(rev)	 \
@@ -1726,8 +1791,8 @@ void legacy_doorbell_index_init(struct amdgpu_device *adev)
 	adev->doorbell_index.mec_ring6 = AMDGPU_DOORBELL_MEC_RING6;
 	adev->doorbell_index.mec_ring7 = AMDGPU_DOORBELL_MEC_RING7;
 	adev->doorbell_index.gfx_ring0 = AMDGPU_DOORBELL_GFX_RING0;
-	adev->doorbell_index.sdma_engine0 = AMDGPU_DOORBELL_sDMA_ENGINE0;
-	adev->doorbell_index.sdma_engine1 = AMDGPU_DOORBELL_sDMA_ENGINE1;
+	adev->doorbell_index.sdma_engine[0] = AMDGPU_DOORBELL_sDMA_ENGINE0;
+	adev->doorbell_index.sdma_engine[1] = AMDGPU_DOORBELL_sDMA_ENGINE1;
 	adev->doorbell_index.ih = AMDGPU_DOORBELL_IH;
 	adev->doorbell_index.max_assignment = AMDGPU_DOORBELL_MAX_ASSIGNMENT;
 }
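vi_get_pcie_usage() above folds each 32-bit hardware count together with its separate "upper" overflow field into one 64-bit value. A standalone sketch of just that assembly step, with made-up numbers:

    #include <stdint.h>
    #include <stdio.h>

    static uint64_t assemble_count(uint32_t lo32, uint32_t upper)
    {
            /* upper holds the bits the 32-bit counter lost on wrap-around */
            return (uint64_t)lo32 | ((uint64_t)upper << 32);
    }

    int main(void)
    {
            /* counter wrapped once: upper = 1, low word restarted at 0x12 */
            printf("count = 0x%llx\n",
                   (unsigned long long)assemble_count(0x12, 1));  /* 0x100000012 */
            return 0;
    }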
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 8372556b52eb..c6c9530e704e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -134,12 +134,18 @@ static int allocate_doorbell(struct qcm_process_device *qpd, struct queue *q)
 		 */
 		q->doorbell_id = q->properties.queue_id;
 	} else if (q->properties.type == KFD_QUEUE_TYPE_SDMA) {
-		/* For SDMA queues on SOC15, use static doorbell
-		 * assignments based on the engine and queue.
+		/* For SDMA queues on SOC15 with 8-byte doorbells, use static
+		 * doorbell assignments based on the engine and queue id.
+		 * The doorbell index distance between RLC (2*i) and (2*i+1)
+		 * queues on the same SDMA engine is 512.
+		 */
-		q->doorbell_id = dev->shared_resources.sdma_doorbell
-			[q->properties.sdma_engine_id]
-			[q->properties.sdma_queue_id];
+		uint32_t *idx_offset =
+				dev->shared_resources.sdma_doorbell_idx;
+
+		q->doorbell_id = idx_offset[q->properties.sdma_engine_id]
+			+ (q->properties.sdma_queue_id & 1)
+			* KFD_QUEUE_DOORBELL_MIRROR_OFFSET
+			+ (q->properties.sdma_queue_id >> 1);
 	} else {
 		/* For CP queues on SOC15 reserve a free doorbell ID */
 		unsigned int found;
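A worked example (plain C, hypothetical base indices) of the static SDMA doorbell assignment above: even queue ids occupy consecutive slots at the engine's base index, odd ids land in the mirrored page 512 doorbells away:

    #include <stdint.h>
    #include <stdio.h>

    #define KFD_QUEUE_DOORBELL_MIRROR_OFFSET 512

    static uint32_t sdma_doorbell_id(const uint32_t *idx_offset,
                                     uint32_t engine, uint32_t queue)
    {
            return idx_offset[engine]
                    + (queue & 1) * KFD_QUEUE_DOORBELL_MIRROR_OFFSET
                    + (queue >> 1);
    }

    int main(void)
    {
            uint32_t idx_offset[2] = { 0x1e0, 0x1e8 };      /* assumed bases */

            /* engine 1, queue 3 -> 0x1e8 + 512 + 1 = 0x3e9 */
            printf("doorbell_id = 0x%x\n",
                   (unsigned)sdma_doorbell_id(idx_offset, 1, 3));
            return 0;
    }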
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 8018163414ff..932007eb9168 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -23,22 +23,7 @@
 #include <linux/sched.h>
 #include <linux/device.h>
 #include "kfd_priv.h"
-
-static const struct kgd2kfd_calls kgd2kfd = {
-	.exit		= kgd2kfd_exit,
-	.probe		= kgd2kfd_probe,
-	.device_init	= kgd2kfd_device_init,
-	.device_exit	= kgd2kfd_device_exit,
-	.interrupt	= kgd2kfd_interrupt,
-	.suspend	= kgd2kfd_suspend,
-	.resume		= kgd2kfd_resume,
-	.quiesce_mm	= kgd2kfd_quiesce_mm,
-	.resume_mm	= kgd2kfd_resume_mm,
-	.schedule_evict_and_restore_process =
-			  kgd2kfd_schedule_evict_and_restore_process,
-	.pre_reset	= kgd2kfd_pre_reset,
-	.post_reset	= kgd2kfd_post_reset,
-};
+#include "amdgpu_amdkfd.h"
 
 static int kfd_init(void)
 {
@@ -91,20 +76,10 @@ static void kfd_exit(void)
 	kfd_chardev_exit();
 }
 
-int kgd2kfd_init(unsigned int interface_version,
-		const struct kgd2kfd_calls **g2f)
+int kgd2kfd_init(void)
 {
-	int err;
-
-	err = kfd_init();
-	if (err)
-		return err;
-
-	*g2f = &kgd2kfd;
-
-	return 0;
+	return kfd_init();
 }
-EXPORT_SYMBOL(kgd2kfd_init);
 
 void kgd2kfd_exit(void)
 {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 0689d4ccbbc0..0eeee3c6d6dc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -97,17 +97,29 @@
 #define KFD_CWSR_TBA_TMA_SIZE (PAGE_SIZE * 2)
 #define KFD_CWSR_TMA_OFFSET PAGE_SIZE
 
+#define KFD_MAX_NUM_OF_QUEUES_PER_DEVICE		\
+	(KFD_MAX_NUM_OF_PROCESSES *			\
+			KFD_MAX_NUM_OF_QUEUES_PER_PROCESS)
+
+#define KFD_KERNEL_QUEUE_SIZE 2048
+
+/*
+ * 512 = 0x200
+ * This is the doorbell index distance between SDMA RLC (2*i) and (2*i+1)
+ * queues on the same SDMA engine on SOC15, which has 8-byte doorbells.
+ * A distance of 512 8-byte doorbells (i.e. one page) ensures that the
+ * SDMA RLC (2*i+1) doorbells (lower 12 address bits) lie exactly within
+ * the OFFSET and SIZE programmed in registers like BIF_SDMA0_DOORBELL_RANGE.
+ */
+#define KFD_QUEUE_DOORBELL_MIRROR_OFFSET 512
+
+
 /*
  * Kernel module parameter to specify maximum number of supported queues per
  * device
  */
 extern int max_num_of_queues_per_device;
 
-#define KFD_MAX_NUM_OF_QUEUES_PER_DEVICE		\
-	(KFD_MAX_NUM_OF_PROCESSES *			\
-			KFD_MAX_NUM_OF_QUEUES_PER_PROCESS)
-
-#define KFD_KERNEL_QUEUE_SIZE 2048
 
 /* Kernel module parameter to specify the scheduling policy */
 extern int sched_policy;
@@ -266,14 +278,6 @@ struct kfd_dev {
 	bool pci_atomic_requested;
 };
 
-/* KGD2KFD callbacks */
-void kgd2kfd_exit(void);
-struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
-			struct pci_dev *pdev, const struct kfd2kgd_calls *f2g);
-bool kgd2kfd_device_init(struct kfd_dev *kfd,
-			const struct kgd2kfd_shared_resources *gpu_resources);
-void kgd2kfd_device_exit(struct kfd_dev *kfd);
-
 enum kfd_mempool {
 	KFD_MEMPOOL_SYSTEM_CACHEABLE = 1,
 	KFD_MEMPOOL_SYSTEM_WRITECOMBINE = 2,
@@ -541,11 +545,6 @@ struct qcm_process_device {
 /* Approx. time before evicting the process again */
 #define PROCESS_ACTIVE_TIME_MS 10
 
-int kgd2kfd_quiesce_mm(struct mm_struct *mm);
-int kgd2kfd_resume_mm(struct mm_struct *mm);
-int kgd2kfd_schedule_evict_and_restore_process(struct mm_struct *mm,
-					       struct dma_fence *fence);
-
 /* 8 byte handle containing GPU ID in the most significant 4 bytes and
  * idr_handle in the least significant 4 bytes
  */
@@ -800,20 +799,11 @@ int kfd_numa_node_to_apic_id(int numa_node_id);
 /* Interrupts */
 int kfd_interrupt_init(struct kfd_dev *dev);
 void kfd_interrupt_exit(struct kfd_dev *dev);
-void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry);
 bool enqueue_ih_ring_entry(struct kfd_dev *kfd,	const void *ih_ring_entry);
 bool interrupt_is_wanted(struct kfd_dev *dev,
 				const uint32_t *ih_ring_entry,
 				uint32_t *patched_ihre, bool *flag);
 
-/* Power Management */
-void kgd2kfd_suspend(struct kfd_dev *kfd);
-int kgd2kfd_resume(struct kfd_dev *kfd);
-
-/* GPU reset */
-int kgd2kfd_pre_reset(struct kfd_dev *kfd);
-int kgd2kfd_post_reset(struct kfd_dev *kfd);
-
 /* amdkfd Apertures */
 int kfd_init_apertures(struct kfd_process *process);
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 80b36e860a0a..4bdae78bab8e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -607,13 +607,17 @@ static int init_doorbell_bitmap(struct qcm_process_device *qpd,
 	if (!qpd->doorbell_bitmap)
 		return -ENOMEM;
 
-	/* Mask out any reserved doorbells */
-	for (i = 0; i < KFD_MAX_NUM_OF_QUEUES_PER_PROCESS; i++)
-		if ((dev->shared_resources.reserved_doorbell_mask & i) ==
-		    dev->shared_resources.reserved_doorbell_val) {
+	/* Mask out doorbells reserved for SDMA, IH, and VCN on SOC15. */
+	for (i = 0; i < KFD_MAX_NUM_OF_QUEUES_PER_PROCESS / 2; i++) {
+		if (i >= dev->shared_resources.non_cp_doorbells_start
+			&& i <= dev->shared_resources.non_cp_doorbells_end) {
 			set_bit(i, qpd->doorbell_bitmap);
-			pr_debug("reserved doorbell 0x%03x\n", i);
+			set_bit(i + KFD_QUEUE_DOORBELL_MIRROR_OFFSET,
+				qpd->doorbell_bitmap);
+			pr_debug("reserved doorbell 0x%03x and 0x%03x\n", i,
+				i + KFD_QUEUE_DOORBELL_MIRROR_OFFSET);
 		}
+	}
 
 	return 0;
 }
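The loop above reserves each non-CP doorbell together with its SDMA mirror one page away. A standalone stand-in for the kernel bitmap calls, using assumed index ranges:

    #include <stdio.h>

    #define MIRROR_OFFSET 512
    #define NUM_DOORBELLS 1024

    static unsigned char reserved[NUM_DOORBELLS];

    static void reserve_pair(unsigned int i)
    {
            reserved[i] = 1;
            reserved[i + MIRROR_OFFSET] = 1;        /* the RLC (2*i+1) mirror */
    }

    int main(void)
    {
            unsigned int i;

            /* hypothetical non-CP range 0x1e0..0x1ff; mirrors land at 0x3e0.. */
            for (i = 0x1e0; i <= 0x1ff; i++)
                    reserve_pair(i);
            printf("0x3e5 reserved: %d\n", reserved[0x3e5]);        /* 1 */
            return 0;
    }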
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 636d14a60952..2f26581b93ff 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1705,7 +1705,8 @@ static int amdgpu_dm_mode_config_init(struct amdgpu_device *adev)
 
 	dc_resource_state_copy_construct_current(adev->dm.dc, state->context);
 
-	drm_atomic_private_obj_init(&adev->dm.atomic_obj,
+	drm_atomic_private_obj_init(adev->ddev,
+				    &adev->dm.atomic_obj,
 				    &state->base,
 				    &dm_atomic_state_funcs);
 
@@ -2296,6 +2297,71 @@ static int get_fb_info(const struct amdgpu_framebuffer *amdgpu_fb,
 	return r;
 }
 
+static inline uint64_t get_dcc_address(uint64_t address, uint64_t tiling_flags)
+{
+	uint32_t offset = AMDGPU_TILING_GET(tiling_flags, DCC_OFFSET_256B);
+
+	return offset ? (address + offset * 256) : 0;
+}
+
+static bool fill_plane_dcc_attributes(struct amdgpu_device *adev,
+				      const struct amdgpu_framebuffer *afb,
+				      struct dc_plane_state *plane_state,
+				      uint64_t info)
+{
+	struct dc *dc = adev->dm.dc;
+	struct dc_dcc_surface_param input;
+	struct dc_surface_dcc_cap output;
+	uint32_t offset = AMDGPU_TILING_GET(info, DCC_OFFSET_256B);
+	uint32_t i64b = AMDGPU_TILING_GET(info, DCC_INDEPENDENT_64B) != 0;
+	uint64_t dcc_address;
+
+	memset(&input, 0, sizeof(input));
+	memset(&output, 0, sizeof(output));
+
+	if (!offset)
+		return false;
+
+	if (!dc->cap_funcs.get_dcc_compression_cap)
+		return false;
+
+	input.format = plane_state->format;
+	input.surface_size.width =
+		plane_state->plane_size.grph.surface_size.width;
+	input.surface_size.height =
+		plane_state->plane_size.grph.surface_size.height;
+	input.swizzle_mode = plane_state->tiling_info.gfx9.swizzle;
+
+	if (plane_state->rotation == ROTATION_ANGLE_0 ||
+	    plane_state->rotation == ROTATION_ANGLE_180)
+		input.scan = SCAN_DIRECTION_HORIZONTAL;
+	else if (plane_state->rotation == ROTATION_ANGLE_90 ||
+		 plane_state->rotation == ROTATION_ANGLE_270)
+		input.scan = SCAN_DIRECTION_VERTICAL;
+
+	if (!dc->cap_funcs.get_dcc_compression_cap(dc, &input, &output))
+		return false;
+
+	if (!output.capable)
+		return false;
+
+	if (i64b == 0 && output.grph.rgb.independent_64b_blks != 0)
+		return false;
+
+	plane_state->dcc.enable = 1;
+	plane_state->dcc.grph.meta_pitch =
+		AMDGPU_TILING_GET(info, DCC_PITCH_MAX) + 1;
+	plane_state->dcc.grph.independent_64b_blks = i64b;
+
+	dcc_address = get_dcc_address(afb->address, info);
+	plane_state->address.grph.meta_addr.low_part =
+		lower_32_bits(dcc_address);
+	plane_state->address.grph.meta_addr.high_part =
+		upper_32_bits(dcc_address);
+
+	return true;
+}
+
 static int fill_plane_attributes_from_fb(struct amdgpu_device *adev,
 					 struct dc_plane_state *plane_state,
 					 const struct amdgpu_framebuffer *amdgpu_fb)
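get_dcc_address(), introduced above, treats the DCC_OFFSET_256B tiling field as an offset in 256-byte units from the framebuffer address. A quick standalone check of that arithmetic with made-up values:

    #include <stdint.h>
    #include <stdio.h>

    static uint64_t dcc_address(uint64_t address, uint32_t offset_256b)
    {
            /* a zero offset means no DCC metadata surface */
            return offset_256b ? address + (uint64_t)offset_256b * 256 : 0;
    }

    int main(void)
    {
            /* fb at 256 MiB, metadata 0x1000 * 256 bytes further in */
            printf("0x%llx\n",
                   (unsigned long long)dcc_address(0x10000000ull, 0x1000));
            return 0;       /* prints 0x10100000 */
    }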
@@ -2348,6 +2414,10 @@ static int fill_plane_attributes_from_fb(struct amdgpu_device *adev,
 		return -EINVAL;
 	}
 
+	memset(&plane_state->address, 0, sizeof(plane_state->address));
+	memset(&plane_state->tiling_info, 0, sizeof(plane_state->tiling_info));
+	memset(&plane_state->dcc, 0, sizeof(plane_state->dcc));
+
 	if (plane_state->format < SURFACE_PIXEL_FORMAT_VIDEO_BEGIN) {
 		plane_state->address.type = PLN_ADDR_TYPE_GRAPHICS;
 		plane_state->plane_size.grph.surface_size.x = 0;
@@ -2379,8 +2449,6 @@ static int fill_plane_attributes_from_fb(struct amdgpu_device *adev,
 		plane_state->color_space = COLOR_SPACE_YCBCR709;
 	}
 
-	memset(&plane_state->tiling_info, 0, sizeof(plane_state->tiling_info));
-
 	/* Fill GFX8 params */
 	if (AMDGPU_TILING_GET(tiling_flags, ARRAY_MODE) == DC_ARRAY_2D_TILED_THIN1) {
 		unsigned int bankw, bankh, mtaspect, tile_split, num_banks;
@@ -2429,6 +2497,9 @@ static int fill_plane_attributes_from_fb(struct amdgpu_device *adev,
 		plane_state->tiling_info.gfx9.swizzle =
 			AMDGPU_TILING_GET(tiling_flags, SWIZZLE_MODE);
 		plane_state->tiling_info.gfx9.shaderEnable = 1;
+
+		fill_plane_dcc_attributes(adev, amdgpu_fb, plane_state,
+					  tiling_flags);
 	}
 
 	plane_state->visible = true;
@@ -2592,7 +2663,7 @@ get_output_color_space(const struct dc_crtc_timing *dc_crtc_timing)
 		 * according to HDMI spec, we use YCbCr709 and YCbCr601
 		 * respectively
 		 */
-		if (dc_crtc_timing->pix_clk_khz > 27030) {
+		if (dc_crtc_timing->pix_clk_100hz > 270300) {
 			if (dc_crtc_timing->flags.Y_ONLY)
 				color_space =
 					COLOR_SPACE_YCBCR709_LIMITED;
@@ -2635,7 +2706,7 @@ static void adjust_colour_depth_from_display_info(struct dc_crtc_timing *timing_
 	if (timing_out->display_color_depth <= COLOR_DEPTH_888)
 		return;
 	do {
-		normalized_clk = timing_out->pix_clk_khz;
+		normalized_clk = timing_out->pix_clk_100hz / 10;
 		/* YCbCr 4:2:0 requires additional adjustment of 1/2 */
 		if (timing_out->pixel_encoding == PIXEL_ENCODING_YCBCR420)
 			normalized_clk /= 2;
@@ -2678,10 +2749,10 @@ fill_stream_properties_from_drm_display_mode(struct dc_stream_state *stream,
 	timing_out->v_border_bottom = 0;
 	/* TODO: un-hardcode */
 	if (drm_mode_is_420_only(info, mode_in)
-			&& stream->sink->sink_signal == SIGNAL_TYPE_HDMI_TYPE_A)
+			&& stream->signal == SIGNAL_TYPE_HDMI_TYPE_A)
 		timing_out->pixel_encoding = PIXEL_ENCODING_YCBCR420;
 	else if ((connector->display_info.color_formats & DRM_COLOR_FORMAT_YCRCB444)
-			&& stream->sink->sink_signal == SIGNAL_TYPE_HDMI_TYPE_A)
+			&& stream->signal == SIGNAL_TYPE_HDMI_TYPE_A)
 		timing_out->pixel_encoding = PIXEL_ENCODING_YCBCR444;
 	else
 		timing_out->pixel_encoding = PIXEL_ENCODING_RGB;
@@ -2716,14 +2787,14 @@ fill_stream_properties_from_drm_display_mode(struct dc_stream_state *stream,
 		mode_in->crtc_vsync_start - mode_in->crtc_vdisplay;
 	timing_out->v_sync_width =
 		mode_in->crtc_vsync_end - mode_in->crtc_vsync_start;
-	timing_out->pix_clk_khz = mode_in->crtc_clock;
+	timing_out->pix_clk_100hz = mode_in->crtc_clock * 10;
 	timing_out->aspect_ratio = get_aspect_ratio(mode_in);
 
 	stream->output_color_space = get_output_color_space(timing_out);
 
 	stream->out_transfer_func->type = TF_TYPE_PREDEFINED;
 	stream->out_transfer_func->tf = TRANSFER_FUNCTION_SRGB;
-	if (stream->sink->sink_signal == SIGNAL_TYPE_HDMI_TYPE_A)
+	if (stream->signal == SIGNAL_TYPE_HDMI_TYPE_A)
 		adjust_colour_depth_from_display_info(timing_out, info);
 }
 
@@ -2844,7 +2915,7 @@ static void set_master_stream(struct dc_stream_state *stream_set[],
 		if (stream_set[j] && stream_set[j]->triggered_crtc_reset.enabled) {
 			int refresh_rate = 0;
 
-			refresh_rate = (stream_set[j]->timing.pix_clk_khz*1000)/
+			refresh_rate = (stream_set[j]->timing.pix_clk_100hz*100)/
 				(stream_set[j]->timing.h_total*stream_set[j]->timing.v_total);
 			if (refresh_rate > highest_rfr) {
 				highest_rfr = refresh_rate;
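The pix_clk_khz to pix_clk_100hz conversion above keeps the refresh-rate expression in Hz while gaining a decimal of pixel-clock precision. A standalone sanity check with a made-up 1080p60 timing:

    #include <stdio.h>

    int main(void)
    {
            unsigned int pix_clk_100hz = 1485000;   /* 148.5 MHz in 100 Hz units */
            unsigned int h_total = 2200, v_total = 1125;

            /* (100 Hz units * 100) / total pixels per frame = frames per second */
            printf("refresh = %u Hz\n",
                   pix_clk_100hz * 100 / (h_total * v_total));  /* 60 */
            return 0;
    }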
@@ -2901,11 +2972,9 @@ create_stream_for_sink(struct amdgpu_dm_connector *aconnector,
 	drm_connector = &aconnector->base;
 
 	if (!aconnector->dc_sink) {
-		if (!aconnector->mst_port) {
-			sink = create_fake_sink(aconnector);
-			if (!sink)
-				return stream;
-		}
+		sink = create_fake_sink(aconnector);
+		if (!sink)
+			return stream;
 	} else {
 		sink = aconnector->dc_sink;
 	}
@@ -2917,6 +2986,8 @@ create_stream_for_sink(struct amdgpu_dm_connector *aconnector,
 		goto finish;
 	}
 
+	stream->dm_stream_context = aconnector;
+
 	list_for_each_entry(preferred_mode, &aconnector->base.modes, head) {
 		/* Search for preferred mode */
 		if (preferred_mode->type & DRM_MODE_TYPE_PREFERRED) {
@@ -2968,10 +3039,7 @@ create_stream_for_sink(struct amdgpu_dm_connector *aconnector,
 		drm_connector,
 		sink);
 
-	update_stream_signal(stream);
-
-	if (dm_state && dm_state->freesync_capable)
-		stream->ignore_msa_timing_param = true;
+	update_stream_signal(stream, sink);
 
 finish:
 	if (sink && sink->sink_signal == SIGNAL_TYPE_VIRTUAL && aconnector->base.force != DRM_FORCE_ON)
@@ -3544,6 +3612,7 @@ static int dm_plane_helper_prepare_fb(struct drm_plane *plane,
 	struct amdgpu_bo *rbo;
 	uint64_t chroma_addr = 0;
 	struct dm_plane_state *dm_plane_state_new, *dm_plane_state_old;
+	uint64_t tiling_flags, dcc_address;
 	unsigned int awidth;
 	uint32_t domain;
 	int r;
@@ -3584,6 +3653,9 @@ static int dm_plane_helper_prepare_fb(struct drm_plane *plane,
 		DRM_ERROR("%p bind failed\n", rbo);
 		return r;
 	}
+
+	amdgpu_bo_get_tiling_flags(rbo, &tiling_flags);
+
 	amdgpu_bo_unreserve(rbo);
 
 	afb->address = amdgpu_bo_gpu_offset(rbo);
@@ -3597,6 +3669,13 @@ static int dm_plane_helper_prepare_fb(struct drm_plane *plane,
 		if (plane_state->format < SURFACE_PIXEL_FORMAT_VIDEO_BEGIN) {
 			plane_state->address.grph.addr.low_part = lower_32_bits(afb->address);
 			plane_state->address.grph.addr.high_part = upper_32_bits(afb->address);
+
+			dcc_address =
+				get_dcc_address(afb->address, tiling_flags);
+			plane_state->address.grph.meta_addr.low_part =
+				lower_32_bits(dcc_address);
+			plane_state->address.grph.meta_addr.high_part =
+				upper_32_bits(dcc_address);
 		} else {
 			awidth = ALIGN(new_state->fb->width, 64);
 			plane_state->address.type = PLN_ADDR_TYPE_VIDEO_PROGRESSIVE;
@@ -3711,7 +3790,6 @@ static const struct drm_plane_helper_funcs dm_plane_helper_funcs = {
  * check will succeed, and let DC implement proper check
  */
 static const uint32_t rgb_formats[] = {
-	DRM_FORMAT_RGB888,
 	DRM_FORMAT_XRGB8888,
 	DRM_FORMAT_ARGB8888,
 	DRM_FORMAT_RGBA8888,
@@ -4469,20 +4547,6 @@ static void prepare_flip_isr(struct amdgpu_crtc *acrtc)
 						 acrtc->crtc_id);
 }
 
-struct dc_stream_status *dc_state_get_stream_status(
-	struct dc_state *state,
-	struct dc_stream_state *stream)
-{
-	uint8_t i;
-
-	for (i = 0; i < state->stream_count; i++) {
-		if (stream == state->streams[i])
-			return &state->stream_status[i];
-	}
-
-	return NULL;
-}
-
 static void update_freesync_state_on_stream(
 	struct amdgpu_display_manager *dm,
 	struct dm_crtc_state *new_crtc_state,
@@ -4536,12 +4600,12 @@ static void update_freesync_state_on_stream(
 		TRANSFER_FUNC_UNKNOWN,
 		&vrr_infopacket);
 
-	new_crtc_state->freesync_timing_changed =
+	new_crtc_state->freesync_timing_changed |=
 		(memcmp(&new_crtc_state->vrr_params.adjust,
 			&vrr_params.adjust,
 			sizeof(vrr_params.adjust)) != 0);
 
-	new_crtc_state->freesync_vrr_info_changed =
+	new_crtc_state->freesync_vrr_info_changed |=
 		(memcmp(&new_crtc_state->vrr_infopacket,
 			&vrr_infopacket,
 			sizeof(vrr_infopacket)) != 0);
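The switch from "=" to "|=" above makes the changed flags sticky: once any comparison during a commit observes a difference, a later no-op comparison must not clear it. A minimal illustration:

    #include <stdbool.h>
    #include <stdio.h>

    static bool differs(int a, int b) { return a != b; }

    int main(void)
    {
            bool changed = false;

            changed |= differs(1, 2);       /* a change is observed... */
            changed |= differs(3, 3);       /* ...and survives a no-op check */
            printf("changed = %d\n", changed);      /* 1 */
            return 0;
    }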
@@ -4557,254 +4621,6 @@ static void update_freesync_state_on_stream(
 			      new_crtc_state->base.crtc->base.id,
 			      (int)new_crtc_state->base.vrr_enabled,
 			      (int)vrr_params.state);
-
-	if (new_crtc_state->freesync_timing_changed)
-		DRM_DEBUG_KMS("VRR timing update: crtc=%u min=%u max=%u\n",
-			      new_crtc_state->base.crtc->base.id,
-				  vrr_params.adjust.v_total_min,
-				  vrr_params.adjust.v_total_max);
-}
-
-/*
- * Executes flip
- *
- * Waits on all BO's fences and for proper vblank count
- */
-static void amdgpu_dm_do_flip(struct drm_crtc *crtc,
-			      struct drm_framebuffer *fb,
-			      uint32_t target,
-			      struct dc_state *state)
-{
-	unsigned long flags;
-	uint64_t timestamp_ns;
-	uint32_t target_vblank;
-	int r, vpos, hpos;
-	struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc);
-	struct amdgpu_framebuffer *afb = to_amdgpu_framebuffer(fb);
-	struct amdgpu_bo *abo = gem_to_amdgpu_bo(fb->obj[0]);
-	struct amdgpu_device *adev = crtc->dev->dev_private;
-	bool async_flip = (crtc->state->pageflip_flags & DRM_MODE_PAGE_FLIP_ASYNC) != 0;
-	struct dc_flip_addrs addr = { {0} };
-	/* TODO eliminate or rename surface_update */
-	struct dc_surface_update surface_updates[1] = { {0} };
-	struct dc_stream_update stream_update = {0};
-	struct dm_crtc_state *acrtc_state = to_dm_crtc_state(crtc->state);
-	struct dc_stream_status *stream_status;
-	struct dc_plane_state *surface;
-
-
-	/* Prepare wait for target vblank early - before the fence-waits */
-	target_vblank = target - (uint32_t)drm_crtc_vblank_count(crtc) +
-			amdgpu_get_vblank_counter_kms(crtc->dev, acrtc->crtc_id);
-
-	/*
-	 * TODO This might fail and hence better not used, wait
-	 * explicitly on fences instead
-	 * and in general should be called for
-	 * blocking commit to as per framework helpers
-	 */
-	r = amdgpu_bo_reserve(abo, true);
-	if (unlikely(r != 0)) {
-		DRM_ERROR("failed to reserve buffer before flip\n");
-		WARN_ON(1);
-	}
-
-	/* Wait for all fences on this FB */
-	WARN_ON(reservation_object_wait_timeout_rcu(abo->tbo.resv, true, false,
-								    MAX_SCHEDULE_TIMEOUT) < 0);
-
-	amdgpu_bo_unreserve(abo);
-
-	/*
-	 * Wait until we're out of the vertical blank period before the one
-	 * targeted by the flip
-	 */
-	while ((acrtc->enabled &&
-		(amdgpu_display_get_crtc_scanoutpos(adev->ddev, acrtc->crtc_id,
-						    0, &vpos, &hpos, NULL,
-						    NULL, &crtc->hwmode)
-		 & (DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_IN_VBLANK)) ==
-		(DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_IN_VBLANK) &&
-		(int)(target_vblank -
-		  amdgpu_get_vblank_counter_kms(adev->ddev, acrtc->crtc_id)) > 0)) {
-		usleep_range(1000, 1100);
-	}
-
-	/* Flip */
-	spin_lock_irqsave(&crtc->dev->event_lock, flags);
-
-	WARN_ON(acrtc->pflip_status != AMDGPU_FLIP_NONE);
-	WARN_ON(!acrtc_state->stream);
-
-	addr.address.grph.addr.low_part = lower_32_bits(afb->address);
-	addr.address.grph.addr.high_part = upper_32_bits(afb->address);
-	addr.flip_immediate = async_flip;
-
-	timestamp_ns = ktime_get_ns();
-	addr.flip_timestamp_in_us = div_u64(timestamp_ns, 1000);
-
-
-	if (acrtc->base.state->event)
-		prepare_flip_isr(acrtc);
-
-	spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
-
-	stream_status = dc_stream_get_status(acrtc_state->stream);
-	if (!stream_status) {
-		DRM_ERROR("No stream status for CRTC: id=%d\n",
-			acrtc->crtc_id);
-		return;
-	}
-
-	surface = stream_status->plane_states[0];
-	surface_updates->surface = surface;
-
-	if (!surface) {
-		DRM_ERROR("No surface for CRTC: id=%d\n",
-			acrtc->crtc_id);
-		return;
-	}
-	surface_updates->flip_addr = &addr;
-
-	if (acrtc_state->stream) {
-		update_freesync_state_on_stream(
-			&adev->dm,
-			acrtc_state,
-			acrtc_state->stream,
-			surface,
-			addr.flip_timestamp_in_us);
-
-		if (acrtc_state->freesync_timing_changed)
-			stream_update.adjust =
-				&acrtc_state->stream->adjust;
-
-		if (acrtc_state->freesync_vrr_info_changed)
-			stream_update.vrr_infopacket =
-				&acrtc_state->stream->vrr_infopacket;
-	}
-
-	/* Update surface timing information. */
-	surface->time.time_elapsed_in_us[surface->time.index] =
-		addr.flip_timestamp_in_us - surface->time.prev_update_time_in_us;
-	surface->time.prev_update_time_in_us = addr.flip_timestamp_in_us;
-	surface->time.index++;
-	if (surface->time.index >= DC_PLANE_UPDATE_TIMES_MAX)
-		surface->time.index = 0;
-
-	mutex_lock(&adev->dm.dc_lock);
-
-	dc_commit_updates_for_stream(adev->dm.dc,
-					     surface_updates,
-					     1,
-					     acrtc_state->stream,
-					     &stream_update,
-					     &surface_updates->surface,
-					     state);
-	mutex_unlock(&adev->dm.dc_lock);
-
-	DRM_DEBUG_DRIVER("%s Flipping to hi: 0x%x, low: 0x%x \n",
-			 __func__,
-			 addr.address.grph.addr.high_part,
-			 addr.address.grph.addr.low_part);
-}
-
-/*
- * TODO this whole function needs to go
- *
- * dc_surface_update is needlessly complex. See if we can just replace this
- * with a dc_plane_state and follow the atomic model a bit more closely here.
- */
-static bool commit_planes_to_stream(
-		struct amdgpu_display_manager *dm,
-		struct dc *dc,
-		struct dc_plane_state **plane_states,
-		uint8_t new_plane_count,
-		struct dm_crtc_state *dm_new_crtc_state,
-		struct dm_crtc_state *dm_old_crtc_state,
-		struct dc_state *state)
-{
-	/* no need to dynamically allocate this. it's pretty small */
-	struct dc_surface_update updates[MAX_SURFACES];
-	struct dc_flip_addrs *flip_addr;
-	struct dc_plane_info *plane_info;
-	struct dc_scaling_info *scaling_info;
-	int i;
-	struct dc_stream_state *dc_stream = dm_new_crtc_state->stream;
-	struct dc_stream_update *stream_update =
-			kzalloc(sizeof(struct dc_stream_update), GFP_KERNEL);
-	unsigned int abm_level;
-
-	if (!stream_update) {
-		BREAK_TO_DEBUGGER();
-		return false;
-	}
-
-	flip_addr = kcalloc(MAX_SURFACES, sizeof(struct dc_flip_addrs),
-			    GFP_KERNEL);
-	plane_info = kcalloc(MAX_SURFACES, sizeof(struct dc_plane_info),
-			     GFP_KERNEL);
-	scaling_info = kcalloc(MAX_SURFACES, sizeof(struct dc_scaling_info),
-			       GFP_KERNEL);
-
-	if (!flip_addr || !plane_info || !scaling_info) {
-		kfree(flip_addr);
-		kfree(plane_info);
-		kfree(scaling_info);
-		kfree(stream_update);
-		return false;
-	}
-
-	memset(updates, 0, sizeof(updates));
-
-	stream_update->src = dc_stream->src;
-	stream_update->dst = dc_stream->dst;
-	stream_update->out_transfer_func = dc_stream->out_transfer_func;
-
-	if (dm_new_crtc_state->abm_level != dm_old_crtc_state->abm_level) {
-		abm_level = dm_new_crtc_state->abm_level;
-		stream_update->abm_level = &abm_level;
-	}
-
-	for (i = 0; i < new_plane_count; i++) {
-		updates[i].surface = plane_states[i];
-		updates[i].gamma =
-			(struct dc_gamma *)plane_states[i]->gamma_correction;
-		updates[i].in_transfer_func = plane_states[i]->in_transfer_func;
-		flip_addr[i].address = plane_states[i]->address;
-		flip_addr[i].flip_immediate = plane_states[i]->flip_immediate;
-		plane_info[i].color_space = plane_states[i]->color_space;
-		plane_info[i].format = plane_states[i]->format;
-		plane_info[i].plane_size = plane_states[i]->plane_size;
-		plane_info[i].rotation = plane_states[i]->rotation;
-		plane_info[i].horizontal_mirror = plane_states[i]->horizontal_mirror;
-		plane_info[i].stereo_format = plane_states[i]->stereo_format;
-		plane_info[i].tiling_info = plane_states[i]->tiling_info;
-		plane_info[i].visible = plane_states[i]->visible;
-		plane_info[i].per_pixel_alpha = plane_states[i]->per_pixel_alpha;
-		plane_info[i].dcc = plane_states[i]->dcc;
-		scaling_info[i].scaling_quality = plane_states[i]->scaling_quality;
-		scaling_info[i].src_rect = plane_states[i]->src_rect;
-		scaling_info[i].dst_rect = plane_states[i]->dst_rect;
-		scaling_info[i].clip_rect = plane_states[i]->clip_rect;
-
-		updates[i].flip_addr = &flip_addr[i];
-		updates[i].plane_info = &plane_info[i];
-		updates[i].scaling_info = &scaling_info[i];
-	}
-
-	mutex_lock(&dm->dc_lock);
-	dc_commit_updates_for_stream(
-			dc,
-			updates,
-			new_plane_count,
-			dc_stream, stream_update, plane_states, state);
-	mutex_unlock(&dm->dc_lock);
-
-	kfree(flip_addr);
-	kfree(plane_info);
-	kfree(scaling_info);
-	kfree(stream_update);
-	return true;
 }
 
 static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
@@ -4814,34 +4630,58 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
 				    struct drm_crtc *pcrtc,
 				    bool *wait_for_vblank)
 {
-	uint32_t i;
+	uint32_t i, r;
+	uint64_t timestamp_ns;
 	struct drm_plane *plane;
 	struct drm_plane_state *old_plane_state, *new_plane_state;
-	struct dc_stream_state *dc_stream_attach;
-	struct dc_plane_state *plane_states_constructed[MAX_SURFACES];
 	struct amdgpu_crtc *acrtc_attach = to_amdgpu_crtc(pcrtc);
 	struct drm_crtc_state *new_pcrtc_state =
 			drm_atomic_get_new_crtc_state(state, pcrtc);
 	struct dm_crtc_state *acrtc_state = to_dm_crtc_state(new_pcrtc_state);
 	struct dm_crtc_state *dm_old_crtc_state =
 			to_dm_crtc_state(drm_atomic_get_old_crtc_state(state, pcrtc));
-	int planes_count = 0;
+	int flip_count = 0, planes_count = 0, vpos, hpos;
 	unsigned long flags;
-	u64 last_flip_vblank;
+	struct amdgpu_bo *abo;
+	uint64_t tiling_flags, dcc_address;
+	uint32_t target, target_vblank;
+	uint64_t last_flip_vblank;
 	bool vrr_active = acrtc_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE;
 
+	struct {
+		struct dc_surface_update surface_updates[MAX_SURFACES];
+		struct dc_flip_addrs flip_addrs[MAX_SURFACES];
+		struct dc_stream_update stream_update;
+	} *flip;
+
+	struct {
+		struct dc_surface_update surface_updates[MAX_SURFACES];
+		struct dc_plane_info plane_infos[MAX_SURFACES];
+		struct dc_scaling_info scaling_infos[MAX_SURFACES];
+		struct dc_stream_update stream_update;
+	} *full;
+
+	flip = kzalloc(sizeof(*flip), GFP_KERNEL);
+	full = kzalloc(sizeof(*full), GFP_KERNEL);
+
+	if (!flip || !full) {
+		dm_error("Failed to allocate update bundles\n");
+		goto cleanup;
+	}
+
 	/* update planes when needed */
 	for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, i) {
 		struct drm_crtc *crtc = new_plane_state->crtc;
 		struct drm_crtc_state *new_crtc_state;
 		struct drm_framebuffer *fb = new_plane_state->fb;
+		struct amdgpu_framebuffer *afb = to_amdgpu_framebuffer(fb);
 		bool pflip_needed;
+		struct dc_plane_state *dc_plane;
 		struct dm_plane_state *dm_new_plane_state = to_dm_plane_state(new_plane_state);
 
-		if (plane->type == DRM_PLANE_TYPE_CURSOR) {
-			handle_cursor_update(plane, old_plane_state);
+		/* Cursor plane is handled after stream updates */
+		if (plane->type == DRM_PLANE_TYPE_CURSOR)
 			continue;
-		}
 
 		if (!fb || !crtc || pcrtc != crtc)
 			continue;
@@ -4850,91 +4690,228 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
 		if (!new_crtc_state->active)
 			continue;
 
-		pflip_needed = !state->allow_modeset;
+		pflip_needed = old_plane_state->fb &&
+			old_plane_state->fb != new_plane_state->fb;
+
+		dc_plane = dm_new_plane_state->dc_state;
+
+		if (pflip_needed) {
+			/*
+			 * Assume even ONE crtc with immediate flip means
+			 * Assume that even ONE crtc with an immediate flip
+			 * means the entire commit can't wait for VBLANK.
+			 * TODO Check if this is correct.
+			if (new_pcrtc_state->pageflip_flags & DRM_MODE_PAGE_FLIP_ASYNC)
+				*wait_for_vblank = false;
+
+			/*
+			 * TODO This might fail and hence is better not used;
+			 * wait explicitly on fences instead. In general this
+			 * should only be done for blocking commits, as per
+			 * the framework helpers.
+			 */
+			abo = gem_to_amdgpu_bo(fb->obj[0]);
+			r = amdgpu_bo_reserve(abo, true);
+			if (unlikely(r != 0))
+				DRM_ERROR("failed to reserve buffer before flip\n");
+
+			/*
+			 * Wait for all fences on this FB. Use a limited wait
+			 * to avoid deadlock during GPU reset, when the fence
+			 * may never signal while we hold the BO's reservation.
+			 */
+			r = reservation_object_wait_timeout_rcu(abo->tbo.resv,
+								true, false,
+								msecs_to_jiffies(5000));
+			if (unlikely(r == 0))
+				DRM_ERROR("Waiting for fences timed out.");
+
+			amdgpu_bo_get_tiling_flags(abo, &tiling_flags);
+
+			amdgpu_bo_unreserve(abo);
+
+			flip->flip_addrs[flip_count].address.grph.addr.low_part = lower_32_bits(afb->address);
+			flip->flip_addrs[flip_count].address.grph.addr.high_part = upper_32_bits(afb->address);
+
+			dcc_address = get_dcc_address(afb->address, tiling_flags);
+			flip->flip_addrs[flip_count].address.grph.meta_addr.low_part = lower_32_bits(dcc_address);
+			flip->flip_addrs[flip_count].address.grph.meta_addr.high_part = upper_32_bits(dcc_address);
+
+			flip->flip_addrs[flip_count].flip_immediate =
+					(crtc->state->pageflip_flags & DRM_MODE_PAGE_FLIP_ASYNC) != 0;
+
+			timestamp_ns = ktime_get_ns();
+			flip->flip_addrs[flip_count].flip_timestamp_in_us = div_u64(timestamp_ns, 1000);
+			flip->surface_updates[flip_count].flip_addr = &flip->flip_addrs[flip_count];
+			flip->surface_updates[flip_count].surface = dc_plane;
+
+			if (!flip->surface_updates[flip_count].surface) {
+				DRM_ERROR("No surface for CRTC: id=%d\n",
+						acrtc_attach->crtc_id);
+				continue;
+			}
+
+			if (plane == pcrtc->primary)
+				update_freesync_state_on_stream(
+					dm,
+					acrtc_state,
+					acrtc_state->stream,
+					dc_plane,
+					flip->flip_addrs[flip_count].flip_timestamp_in_us);
+
+			DRM_DEBUG_DRIVER("%s Flipping to hi: 0x%x, low: 0x%x\n",
+					 __func__,
+					 flip->flip_addrs[flip_count].address.grph.addr.high_part,
+					 flip->flip_addrs[flip_count].address.grph.addr.low_part);
 
-		spin_lock_irqsave(&crtc->dev->event_lock, flags);
-		if (acrtc_attach->pflip_status != AMDGPU_FLIP_NONE) {
-			DRM_ERROR("%s: acrtc %d, already busy\n",
-				  __func__,
-				  acrtc_attach->crtc_id);
-			/* In commit tail framework this cannot happen */
-			WARN_ON(1);
+			flip_count += 1;
 		}
 
-		/* For variable refresh rate mode only:
-		 * Get vblank of last completed flip to avoid > 1 vrr flips per
-		 * video frame by use of throttling, but allow flip programming
-		 * anywhere in the possibly large variable vrr vblank interval
-		 * for fine-grained flip timing control and more opportunity to
-		 * avoid stutter on late submission of amdgpu_dm_do_flip() calls.
-		 */
-		last_flip_vblank = acrtc_attach->last_flip_vblank;
+		full->surface_updates[planes_count].surface = dc_plane;
+		if (new_pcrtc_state->color_mgmt_changed) {
+			full->surface_updates[planes_count].gamma = dc_plane->gamma_correction;
+			full->surface_updates[planes_count].in_transfer_func = dc_plane->in_transfer_func;
+		}
 
-		spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
 
-		if (!pflip_needed || plane->type == DRM_PLANE_TYPE_OVERLAY) {
-			WARN_ON(!dm_new_plane_state->dc_state);
+		full->scaling_infos[planes_count].scaling_quality = dc_plane->scaling_quality;
+		full->scaling_infos[planes_count].src_rect = dc_plane->src_rect;
+		full->scaling_infos[planes_count].dst_rect = dc_plane->dst_rect;
+		full->scaling_infos[planes_count].clip_rect = dc_plane->clip_rect;
+		full->surface_updates[planes_count].scaling_info = &full->scaling_infos[planes_count];
 
-			plane_states_constructed[planes_count] = dm_new_plane_state->dc_state;
 
-			dc_stream_attach = acrtc_state->stream;
-			planes_count++;
+		full->plane_infos[planes_count].color_space = dc_plane->color_space;
+		full->plane_infos[planes_count].format = dc_plane->format;
+		full->plane_infos[planes_count].plane_size = dc_plane->plane_size;
+		full->plane_infos[planes_count].rotation = dc_plane->rotation;
+		full->plane_infos[planes_count].horizontal_mirror = dc_plane->horizontal_mirror;
+		full->plane_infos[planes_count].stereo_format = dc_plane->stereo_format;
+		full->plane_infos[planes_count].tiling_info = dc_plane->tiling_info;
+		full->plane_infos[planes_count].visible = dc_plane->visible;
+		full->plane_infos[planes_count].per_pixel_alpha = dc_plane->per_pixel_alpha;
+		full->plane_infos[planes_count].dcc = dc_plane->dcc;
+		full->surface_updates[planes_count].plane_info = &full->plane_infos[planes_count];
 
-		} else if (new_crtc_state->planes_changed) {
-			/* Assume even ONE crtc with immediate flip means
-			 * entire can't wait for VBLANK
-			 * TODO Check if it's correct
-			 */
-			*wait_for_vblank =
-					new_pcrtc_state->pageflip_flags & DRM_MODE_PAGE_FLIP_ASYNC ?
-				false : true;
+		planes_count += 1;
 
-			/* TODO: Needs rework for multiplane flip */
-			if (plane->type == DRM_PLANE_TYPE_PRIMARY)
-				drm_crtc_vblank_get(crtc);
+	}
 
+	/*
+	 * TODO: For proper atomic behaviour, we should be calling into DC once with
+	 * all the changes.  However, DC refuses to do pageflips and non-pageflip
+	 * changes in the same call.  Change DC to respect atomic behaviour,
+	 * hopefully eliminating dc_*_update structs in their entirety.
+	 */
+	if (flip_count) {
+		if (!vrr_active) {
 			/* Use old throttling in non-vrr fixed refresh rate mode
 			 * to keep flip scheduling based on target vblank counts
-			 * working in a backwards compatible way, e.g., clients
-			 * using GLX_OML_sync_control extension.
+			 * working in a backwards compatible way, e.g., for
+			 * clients using the GLX_OML_sync_control extension or
+			 * DRI3/Present extension with defined target_msc.
+			 */
+			last_flip_vblank = drm_crtc_vblank_count(pcrtc);
+		}
+		} else {
+			 * Get vblank of last completed flip to avoid > 1 vrr
+			 * flips per video frame by use of throttling, but allow
+			 * flip programming anywhere in the possibly large
+			 * variable vrr vblank interval for fine-grained flip
+			 * timing control and more opportunity to avoid stutter
+			 * on late submission of flips.
 			 */
-			if (!vrr_active)
-				last_flip_vblank = drm_crtc_vblank_count(crtc);
-
-			amdgpu_dm_do_flip(
-				crtc,
-				fb,
-				(uint32_t) last_flip_vblank + *wait_for_vblank,
-				dc_state);
+			spin_lock_irqsave(&pcrtc->dev->event_lock, flags);
+			last_flip_vblank = acrtc_attach->last_flip_vblank;
+			spin_unlock_irqrestore(&pcrtc->dev->event_lock, flags);
 		}
 
-	}
+		target = (uint32_t)last_flip_vblank + *wait_for_vblank;
 
-	if (planes_count) {
-		unsigned long flags;
+		/* Prepare wait for target vblank early - before the fence-waits */
+		target_vblank = target - (uint32_t)drm_crtc_vblank_count(pcrtc) +
+				amdgpu_get_vblank_counter_kms(pcrtc->dev, acrtc_attach->crtc_id);
 
-		if (new_pcrtc_state->event) {
+		/*
+		 * Wait until we're out of the vertical blank period before the one
+		 * targeted by the flip
+		 */
+		while ((acrtc_attach->enabled &&
+			(amdgpu_display_get_crtc_scanoutpos(dm->ddev, acrtc_attach->crtc_id,
+							    0, &vpos, &hpos, NULL,
+							    NULL, &pcrtc->hwmode)
+			 & (DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_IN_VBLANK)) ==
+			(DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_IN_VBLANK) &&
+			(int)(target_vblank -
+			  amdgpu_get_vblank_counter_kms(dm->ddev, acrtc_attach->crtc_id)) > 0)) {
+			usleep_range(1000, 1100);
+		}
 
+		if (acrtc_attach->base.state->event) {
 			drm_crtc_vblank_get(pcrtc);
 
 			spin_lock_irqsave(&pcrtc->dev->event_lock, flags);
+
+			WARN_ON(acrtc_attach->pflip_status != AMDGPU_FLIP_NONE);
 			prepare_flip_isr(acrtc_attach);
+
 			spin_unlock_irqrestore(&pcrtc->dev->event_lock, flags);
 		}
 
-		dc_stream_attach->abm_level = acrtc_state->abm_level;
+		if (acrtc_state->stream) {
 
-		if (false == commit_planes_to_stream(dm,
-							dm->dc,
-							plane_states_constructed,
-							planes_count,
-							acrtc_state,
-							dm_old_crtc_state,
-							dc_state))
-			dm_error("%s: Failed to attach plane!\n", __func__);
-	} else {
-		/*TODO BUG Here should go disable planes on CRTC. */
+			if (acrtc_state->freesync_timing_changed)
+				flip->stream_update.adjust =
+					&acrtc_state->stream->adjust;
+
+			if (acrtc_state->freesync_vrr_info_changed)
+				flip->stream_update.vrr_infopacket =
+					&acrtc_state->stream->vrr_infopacket;
+		}
+
+		mutex_lock(&dm->dc_lock);
+		dc_commit_updates_for_stream(dm->dc,
+						     flip->surface_updates,
+						     flip_count,
+						     acrtc_state->stream,
+						     &flip->stream_update,
+						     dc_state);
+		mutex_unlock(&dm->dc_lock);
+	}
+
+	if (planes_count) {
+		if (new_pcrtc_state->mode_changed) {
+			full->stream_update.src = acrtc_state->stream->src;
+			full->stream_update.dst = acrtc_state->stream->dst;
+		}
+
+		if (new_pcrtc_state->color_mgmt_changed)
+			full->stream_update.out_transfer_func = acrtc_state->stream->out_transfer_func;
+
+		acrtc_state->stream->abm_level = acrtc_state->abm_level;
+		if (acrtc_state->abm_level != dm_old_crtc_state->abm_level)
+			full->stream_update.abm_level = &acrtc_state->abm_level;
+
+		mutex_lock(&dm->dc_lock);
+		dc_commit_updates_for_stream(dm->dc,
+						     full->surface_updates,
+						     planes_count,
+						     acrtc_state->stream,
+						     &full->stream_update,
+						     dc_state);
+		mutex_unlock(&dm->dc_lock);
 	}
+
+	for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, i)
+		if (plane->type == DRM_PLANE_TYPE_CURSOR)
+			handle_cursor_update(plane, old_plane_state);
+
+cleanup:
+	kfree(flip);
+	kfree(full);
 }
 
 /*
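The flip/full update bundles above are grouped into single structs and kzalloc'd per commit because the MAX_SURFACES-sized arrays are too large for the kernel stack. A hedged sketch of the pattern with illustrative types (demo_* names are not amdgpu's):

    #include <linux/errno.h>
    #include <linux/slab.h>

    #define DEMO_MAX_SURFACES 3

    struct demo_bundle {
            struct { void *surface; } surface_updates[DEMO_MAX_SURFACES];
            struct { unsigned long long addr; } flip_addrs[DEMO_MAX_SURFACES];
    };

    static int demo_commit(void)
    {
            /* one heap allocation instead of large on-stack arrays */
            struct demo_bundle *bundle = kzalloc(sizeof(*bundle), GFP_KERNEL);

            if (!bundle)
                    return -ENOMEM;
            /* ... fill the bundle and hand it to the hardware layer ... */
            kfree(bundle);
            return 0;
    }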
@@ -4948,7 +4925,8 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
 static void amdgpu_dm_crtc_copy_transient_flags(struct drm_crtc_state *crtc_state,
 						struct dc_stream_state *stream_state)
 {
-	stream_state->mode_changed = crtc_state->mode_changed;
+	stream_state->mode_changed =
+		crtc_state->mode_changed || crtc_state->active_changed;
 }
 
 static int amdgpu_dm_atomic_commit(struct drm_device *dev,
@@ -4969,10 +4947,25 @@ static int amdgpu_dm_atomic_commit(struct drm_device *dev,
 	 */
 	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) {
 		struct dm_crtc_state *dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
+		struct dm_crtc_state *dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
 		struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc);
 
-		if (drm_atomic_crtc_needs_modeset(new_crtc_state) && dm_old_crtc_state->stream)
+		if (drm_atomic_crtc_needs_modeset(new_crtc_state)
+		    && dm_old_crtc_state->stream) {
+			/*
+			 * If the stream is removed and CRC capture was
+			 * enabled on the CRTC the extra vblank reference
+			 * needs to be dropped since CRC capture will be
+			 * disabled.
+			 */
+			if (!dm_new_crtc_state->stream
+			    && dm_new_crtc_state->crc_enabled) {
+				drm_crtc_vblank_put(crtc);
+				dm_new_crtc_state->crc_enabled = false;
+			}
+
 			manage_dm_interrupts(adev, acrtc, false);
+		}
 	}
 	/*
 	 * Add check here for SoC's that support hardware cursor plane, to
@@ -5110,8 +5103,8 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
 					dc_stream_get_status(dm_new_crtc_state->stream);
 
 			if (!status)
-				status = dc_state_get_stream_status(dc_state,
-								    dm_new_crtc_state->stream);
+				status = dc_stream_get_status_from_state(dc_state,
+									 dm_new_crtc_state->stream);
 
 			if (!status)
 				DC_ERR("got no status for stream %p on acrtc%p\n", dm_new_crtc_state->stream, acrtc);
@@ -5120,13 +5113,18 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
 		}
 	}
 
-	/* Handle scaling, underscan, and abm changes*/
+	/* Handle connector state changes */
 	for_each_oldnew_connector_in_state(state, connector, old_con_state, new_con_state, i) {
 		struct dm_connector_state *dm_new_con_state = to_dm_connector_state(new_con_state);
 		struct dm_connector_state *dm_old_con_state = to_dm_connector_state(old_con_state);
 		struct amdgpu_crtc *acrtc = to_amdgpu_crtc(dm_new_con_state->base.crtc);
+		struct dc_surface_update dummy_updates[MAX_SURFACES];
+		struct dc_stream_update stream_update;
 		struct dc_stream_status *status = NULL;
 
+		memset(&dummy_updates, 0, sizeof(dummy_updates));
+		memset(&stream_update, 0, sizeof(stream_update));
+
 		if (acrtc) {
 			new_crtc_state = drm_atomic_get_new_crtc_state(state, &acrtc->base);
 			old_crtc_state = drm_atomic_get_old_crtc_state(state, &acrtc->base);
@@ -5136,37 +5134,48 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
 		if (!acrtc || drm_atomic_crtc_needs_modeset(new_crtc_state))
 			continue;
 
-
 		dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
 		dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
 
-		/* Skip anything that is not scaling or underscan changes */
 		if (!is_scaling_state_different(dm_new_con_state, dm_old_con_state) &&
 				(dm_new_crtc_state->abm_level == dm_old_crtc_state->abm_level))
 			continue;
 
-		update_stream_scaling_settings(&dm_new_con_state->base.crtc->mode,
-				dm_new_con_state, (struct dc_stream_state *)dm_new_crtc_state->stream);
+		if (is_scaling_state_different(dm_new_con_state, dm_old_con_state)) {
+			update_stream_scaling_settings(&dm_new_con_state->base.crtc->mode,
+					dm_new_con_state, (struct dc_stream_state *)dm_new_crtc_state->stream);
 
-		if (!dm_new_crtc_state->stream)
-			continue;
+			stream_update.src = dm_new_crtc_state->stream->src;
+			stream_update.dst = dm_new_crtc_state->stream->dst;
+		}
+
+		if (dm_new_crtc_state->abm_level != dm_old_crtc_state->abm_level) {
+			dm_new_crtc_state->stream->abm_level = dm_new_crtc_state->abm_level;
+
+			stream_update.abm_level = &dm_new_crtc_state->abm_level;
+		}
 
 		status = dc_stream_get_status(dm_new_crtc_state->stream);
 		WARN_ON(!status);
 		WARN_ON(!status->plane_count);
 
-		dm_new_crtc_state->stream->abm_level = dm_new_crtc_state->abm_level;
+		/*
+		 * TODO: DC refuses to perform stream updates without a dc_surface_update.
+		 * Here we create an empty update on each plane.
+		 * To fix this, DC should permit updating only stream properties.
+		 */
+		for (j = 0; j < status->plane_count; j++)
+			dummy_updates[j].surface = status->plane_states[0];
 
-		/*TODO How it works with MPO ?*/
-		if (!commit_planes_to_stream(
-				dm,
-				dm->dc,
-				status->plane_states,
-				status->plane_count,
-				dm_new_crtc_state,
-				to_dm_crtc_state(old_crtc_state),
-				dc_state))
-			dm_error("%s: Failed to update stream scaling!\n", __func__);
+
+		mutex_lock(&dm->dc_lock);
+		dc_commit_updates_for_stream(dm->dc,
+						     dummy_updates,
+						     status->plane_count,
+						     dm_new_crtc_state->stream,
+						     &stream_update,
+						     dc_state);
+		mutex_unlock(&dm->dc_lock);
 	}
 
 	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state,
@@ -5191,6 +5200,12 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
 			continue;
 
 		manage_dm_interrupts(adev, acrtc, true);
+
+#ifdef CONFIG_DEBUG_FS
+		/* The stream has changed so CRC capture needs to be re-enabled. */
+		if (dm_new_crtc_state->crc_enabled)
+			amdgpu_dm_crtc_set_crc_source(crtc, "auto");
+#endif
 	}
 
 	/* update planes when needed per crtc*/
@@ -5217,18 +5232,12 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
 	}
 	spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
 
+	/* Signal HW programming completion */
+	drm_atomic_helper_commit_hw_done(state);
 
 	if (wait_for_vblank)
 		drm_atomic_helper_wait_for_flip_done(dev, state);
 
-	/*
-	 * FIXME:
-	 * Delay hw_done() until flip_done() is signaled. This is to block
-	 * another commit from freeing the CRTC state while we're still
-	 * waiting on flip_done.
-	 */
-	drm_atomic_helper_commit_hw_done(state);
-
 	drm_atomic_helper_cleanup_planes(dev, state);
 
 	/*
@@ -5392,10 +5401,13 @@ static void get_freesync_config_for_crtc(
 	struct mod_freesync_config config = {0};
 	struct amdgpu_dm_connector *aconnector =
 			to_amdgpu_dm_connector(new_con_state->base.connector);
+	struct drm_display_mode *mode = &new_crtc_state->base.mode;
 
-	new_crtc_state->vrr_supported = new_con_state->freesync_capable;
+	new_crtc_state->vrr_supported = new_con_state->freesync_capable &&
+		aconnector->min_vfreq <= drm_mode_vrefresh(mode);
 
-	if (new_con_state->freesync_capable) {
+	if (new_crtc_state->vrr_supported) {
+		new_crtc_state->stream->ignore_msa_timing_param = true;
 		config.state = new_crtc_state->base.vrr_enabled ?
 				VRR_STATE_ACTIVE_VARIABLE :
 				VRR_STATE_INACTIVE;
@@ -5421,15 +5433,15 @@ static void reset_freesync_config_for_crtc(
 	       sizeof(new_crtc_state->vrr_infopacket));
 }
 
-static int dm_update_crtcs_state(struct amdgpu_display_manager *dm,
-				 struct drm_atomic_state *state,
-				 bool enable,
-				 bool *lock_and_validation_needed)
+static int dm_update_crtc_state(struct amdgpu_display_manager *dm,
+				struct drm_atomic_state *state,
+				struct drm_crtc *crtc,
+				struct drm_crtc_state *old_crtc_state,
+				struct drm_crtc_state *new_crtc_state,
+				bool enable,
+				bool *lock_and_validation_needed)
 {
 	struct dm_atomic_state *dm_state = NULL;
-	struct drm_crtc *crtc;
-	struct drm_crtc_state *old_crtc_state, *new_crtc_state;
-	int i;
 	struct dm_crtc_state *dm_old_crtc_state, *dm_new_crtc_state;
 	struct dc_stream_state *new_stream;
 	int ret = 0;
@@ -5438,200 +5450,203 @@ static int dm_update_crtcs_state(struct amdgpu_display_manager *dm,
 	 * TODO Move this code into dm_crtc_atomic_check once we get rid of dc_validation_set
 	 * update changed items
 	 */
-	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) {
-		struct amdgpu_crtc *acrtc = NULL;
-		struct amdgpu_dm_connector *aconnector = NULL;
-		struct drm_connector_state *drm_new_conn_state = NULL, *drm_old_conn_state = NULL;
-		struct dm_connector_state *dm_new_conn_state = NULL, *dm_old_conn_state = NULL;
-		struct drm_plane_state *new_plane_state = NULL;
+	struct amdgpu_crtc *acrtc = NULL;
+	struct amdgpu_dm_connector *aconnector = NULL;
+	struct drm_connector_state *drm_new_conn_state = NULL, *drm_old_conn_state = NULL;
+	struct dm_connector_state *dm_new_conn_state = NULL, *dm_old_conn_state = NULL;
+	struct drm_plane_state *new_plane_state = NULL;
 
-		new_stream = NULL;
+	new_stream = NULL;
 
-		dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
-		dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
-		acrtc = to_amdgpu_crtc(crtc);
+	dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
+	dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
+	acrtc = to_amdgpu_crtc(crtc);
 
-		new_plane_state = drm_atomic_get_new_plane_state(state, new_crtc_state->crtc->primary);
-
-		if (new_crtc_state->enable && new_plane_state && !new_plane_state->fb) {
-			ret = -EINVAL;
-			goto fail;
-		}
+	new_plane_state = drm_atomic_get_new_plane_state(state, new_crtc_state->crtc->primary);
 
-		aconnector = amdgpu_dm_find_first_crtc_matching_connector(state, crtc);
+	if (new_crtc_state->enable && new_plane_state && !new_plane_state->fb) {
+		ret = -EINVAL;
+		goto fail;
+	}
 
-		/* TODO This hack should go away */
-		if (aconnector && enable) {
-			/* Make sure fake sink is created in plug-in scenario */
-			drm_new_conn_state = drm_atomic_get_new_connector_state(state,
- 								    &aconnector->base);
-			drm_old_conn_state = drm_atomic_get_old_connector_state(state,
-								    &aconnector->base);
+	aconnector = amdgpu_dm_find_first_crtc_matching_connector(state, crtc);
 
-			if (IS_ERR(drm_new_conn_state)) {
-				ret = PTR_ERR_OR_ZERO(drm_new_conn_state);
-				break;
-			}
+	/* TODO This hack should go away */
+	if (aconnector && enable) {
+		/* Make sure fake sink is created in plug-in scenario */
+		drm_new_conn_state = drm_atomic_get_new_connector_state(state,
+							    &aconnector->base);
+		drm_old_conn_state = drm_atomic_get_old_connector_state(state,
+							    &aconnector->base);
 
-			dm_new_conn_state = to_dm_connector_state(drm_new_conn_state);
-			dm_old_conn_state = to_dm_connector_state(drm_old_conn_state);
+		if (IS_ERR(drm_new_conn_state)) {
+			ret = PTR_ERR_OR_ZERO(drm_new_conn_state);
+			goto fail;
+		}
 
-			new_stream = create_stream_for_sink(aconnector,
-							     &new_crtc_state->mode,
-							    dm_new_conn_state,
-							    dm_old_crtc_state->stream);
+		dm_new_conn_state = to_dm_connector_state(drm_new_conn_state);
+		dm_old_conn_state = to_dm_connector_state(drm_old_conn_state);
 
-			/*
-			 * we can have no stream on ACTION_SET if a display
-			 * was disconnected during S3, in this case it is not an
-			 * error, the OS will be updated after detection, and
-			 * will do the right thing on next atomic commit
-			 */
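+		/* Without a modeset there is no need to create a new stream. */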
+		if (!drm_atomic_crtc_needs_modeset(new_crtc_state))
+			goto skip_modeset;
 
-			if (!new_stream) {
-				DRM_DEBUG_DRIVER("%s: Failed to create new stream for crtc %d\n",
-						__func__, acrtc->base.base.id);
-				break;
-			}
+		new_stream = create_stream_for_sink(aconnector,
+						     &new_crtc_state->mode,
+						    dm_new_conn_state,
+						    dm_old_crtc_state->stream);
 
-			dm_new_crtc_state->abm_level = dm_new_conn_state->abm_level;
+		/*
+		 * We can have no stream on ACTION_SET if a display
+		 * was disconnected during S3; in this case it is not an
+		 * error, the OS will be updated after detection and
+		 * will do the right thing on the next atomic commit.
+		 */
 
-			if (dc_is_stream_unchanged(new_stream, dm_old_crtc_state->stream) &&
-			    dc_is_stream_scaling_unchanged(new_stream, dm_old_crtc_state->stream)) {
-				new_crtc_state->mode_changed = false;
-				DRM_DEBUG_DRIVER("Mode change not required, setting mode_changed to %d",
-						 new_crtc_state->mode_changed);
-			}
+		if (!new_stream) {
+			DRM_DEBUG_DRIVER("%s: Failed to create new stream for crtc %d\n",
+					__func__, acrtc->base.base.id);
+			ret = -ENOMEM;
+			goto fail;
 		}
 
-		if (!drm_atomic_crtc_needs_modeset(new_crtc_state))
-			goto next_crtc;
+		dm_new_crtc_state->abm_level = dm_new_conn_state->abm_level;
 
-		DRM_DEBUG_DRIVER(
-			"amdgpu_crtc id:%d crtc_state_flags: enable:%d, active:%d, "
-			"planes_changed:%d, mode_changed:%d,active_changed:%d,"
-			"connectors_changed:%d\n",
-			acrtc->crtc_id,
-			new_crtc_state->enable,
-			new_crtc_state->active,
-			new_crtc_state->planes_changed,
-			new_crtc_state->mode_changed,
-			new_crtc_state->active_changed,
-			new_crtc_state->connectors_changed);
+		if (dc_is_stream_unchanged(new_stream, dm_old_crtc_state->stream) &&
+		    dc_is_stream_scaling_unchanged(new_stream, dm_old_crtc_state->stream)) {
+			new_crtc_state->mode_changed = false;
+			DRM_DEBUG_DRIVER("Mode change not required, setting mode_changed to %d",
+					 new_crtc_state->mode_changed);
+		}
+	}
 
-		/* Remove stream for any changed/disabled CRTC */
-		if (!enable) {
+	/* mode_changed flag may get updated above, need to check again */
+	if (!drm_atomic_crtc_needs_modeset(new_crtc_state))
+		goto skip_modeset;
 
-			if (!dm_old_crtc_state->stream)
-				goto next_crtc;
+	DRM_DEBUG_DRIVER(
+		"amdgpu_crtc id:%d crtc_state_flags: enable:%d, active:%d, "
+		"planes_changed:%d, mode_changed:%d, active_changed:%d, "
+		"connectors_changed:%d\n",
+		acrtc->crtc_id,
+		new_crtc_state->enable,
+		new_crtc_state->active,
+		new_crtc_state->planes_changed,
+		new_crtc_state->mode_changed,
+		new_crtc_state->active_changed,
+		new_crtc_state->connectors_changed);
 
-			ret = dm_atomic_get_state(state, &dm_state);
-			if (ret)
-				goto fail;
+	/* Remove stream for any changed/disabled CRTC */
+	if (!enable) {
 
-			DRM_DEBUG_DRIVER("Disabling DRM crtc: %d\n",
-					crtc->base.id);
+		if (!dm_old_crtc_state->stream)
+			goto skip_modeset;
 
-			/* i.e. reset mode */
-			if (dc_remove_stream_from_ctx(
-					dm->dc,
-					dm_state->context,
-					dm_old_crtc_state->stream) != DC_OK) {
-				ret = -EINVAL;
-				goto fail;
-			}
+		ret = dm_atomic_get_state(state, &dm_state);
+		if (ret)
+			goto fail;
 
-			dc_stream_release(dm_old_crtc_state->stream);
-			dm_new_crtc_state->stream = NULL;
+		DRM_DEBUG_DRIVER("Disabling DRM crtc: %d\n",
+				crtc->base.id);
 
-			reset_freesync_config_for_crtc(dm_new_crtc_state);
+		/* i.e. reset mode */
+		if (dc_remove_stream_from_ctx(
+				dm->dc,
+				dm_state->context,
+				dm_old_crtc_state->stream) != DC_OK) {
+			ret = -EINVAL;
+			goto fail;
+		}
 
-			*lock_and_validation_needed = true;
+		dc_stream_release(dm_old_crtc_state->stream);
+		dm_new_crtc_state->stream = NULL;
 
-		} else {/* Add stream for any updated/enabled CRTC */
-			/*
-			 * Quick fix to prevent NULL pointer on new_stream when
-			 * added MST connectors not found in existing crtc_state in the chained mode
-			 * TODO: need to dig out the root cause of that
-			 */
-			if (!aconnector || (!aconnector->dc_sink && aconnector->mst_port))
-				goto next_crtc;
+		reset_freesync_config_for_crtc(dm_new_crtc_state);
 
-			if (modereset_required(new_crtc_state))
-				goto next_crtc;
+		*lock_and_validation_needed = true;
 
-			if (modeset_required(new_crtc_state, new_stream,
-					     dm_old_crtc_state->stream)) {
+	} else { /* Add stream for any updated/enabled CRTC */
+		/*
+		 * Quick fix to prevent a NULL pointer on new_stream when newly
+		 * added MST connectors are not found in the existing crtc_state
+		 * in chained mode. TODO: dig out the root cause of this.
+		 */
+		if (!aconnector || (!aconnector->dc_sink && aconnector->mst_port))
+			goto skip_modeset;
 
-				WARN_ON(dm_new_crtc_state->stream);
+		if (modereset_required(new_crtc_state))
+			goto skip_modeset;
 
-				ret = dm_atomic_get_state(state, &dm_state);
-				if (ret)
-					goto fail;
+		if (modeset_required(new_crtc_state, new_stream,
+				     dm_old_crtc_state->stream)) {
 
-				dm_new_crtc_state->stream = new_stream;
+			WARN_ON(dm_new_crtc_state->stream);
 
-				dc_stream_retain(new_stream);
+			ret = dm_atomic_get_state(state, &dm_state);
+			if (ret)
+				goto fail;
 
-				DRM_DEBUG_DRIVER("Enabling DRM crtc: %d\n",
-							crtc->base.id);
+			dm_new_crtc_state->stream = new_stream;
 
-				if (dc_add_stream_to_ctx(
-						dm->dc,
-						dm_state->context,
-						dm_new_crtc_state->stream) != DC_OK) {
-					ret = -EINVAL;
-					goto fail;
-				}
+			dc_stream_retain(new_stream);
+
+			DRM_DEBUG_DRIVER("Enabling DRM crtc: %d\n",
+						crtc->base.id);
 
-				*lock_and_validation_needed = true;
+			if (dc_add_stream_to_ctx(
+					dm->dc,
+					dm_state->context,
+					dm_new_crtc_state->stream) != DC_OK) {
+				ret = -EINVAL;
+				goto fail;
 			}
-		}
 
-next_crtc:
-		/* Release extra reference */
-		if (new_stream)
-			 dc_stream_release(new_stream);
+			*lock_and_validation_needed = true;
+		}
+	}
 
-		/*
-		 * We want to do dc stream updates that do not require a
-		 * full modeset below.
-		 */
-		if (!(enable && aconnector && new_crtc_state->enable &&
-		      new_crtc_state->active))
-			continue;
-		/*
-		 * Given above conditions, the dc state cannot be NULL because:
-		 * 1. We're in the process of enabling CRTCs (just been added
-		 *    to the dc context, or already is on the context)
-		 * 2. Has a valid connector attached, and
-		 * 3. Is currently active and enabled.
-		 * => The dc stream state currently exists.
-		 */
-		BUG_ON(dm_new_crtc_state->stream == NULL);
+skip_modeset:
+	/* Release extra reference */
+	if (new_stream)
+		 dc_stream_release(new_stream);
 
-		/* Scaling or underscan settings */
-		if (is_scaling_state_different(dm_old_conn_state, dm_new_conn_state))
-			update_stream_scaling_settings(
-				&new_crtc_state->mode, dm_new_conn_state, dm_new_crtc_state->stream);
+	/*
+	 * We want to do dc stream updates that do not require a
+	 * full modeset below.
+	 */
+	if (!(enable && aconnector && new_crtc_state->enable &&
+	      new_crtc_state->active))
+		return 0;
+	/*
+	 * Given the above conditions, the dc state cannot be NULL because:
+	 * 1. We're in the process of enabling the CRTC (the stream has just
+	 *    been added to the dc context, or is already on it),
+	 * 2. The CRTC has a valid connector attached, and
+	 * 3. It is currently active and enabled.
+	 * => The dc stream state currently exists.
+	 */
+	BUG_ON(dm_new_crtc_state->stream == NULL);
 
-		/*
-		 * Color management settings. We also update color properties
-		 * when a modeset is needed, to ensure it gets reprogrammed.
-		 */
-		if (dm_new_crtc_state->base.color_mgmt_changed ||
-		    drm_atomic_crtc_needs_modeset(new_crtc_state)) {
-			ret = amdgpu_dm_set_regamma_lut(dm_new_crtc_state);
-			if (ret)
-				goto fail;
-			amdgpu_dm_set_ctm(dm_new_crtc_state);
-		}
+	/* Scaling or underscan settings */
+	if (is_scaling_state_different(dm_old_conn_state, dm_new_conn_state))
+		update_stream_scaling_settings(
+			&new_crtc_state->mode, dm_new_conn_state, dm_new_crtc_state->stream);
 
-		/* Update Freesync settings. */
-		get_freesync_config_for_crtc(dm_new_crtc_state,
-					     dm_new_conn_state);
+	/*
+	 * Color management settings. We also update color properties
+	 * when a modeset is needed, to ensure it gets reprogrammed.
+	 */
+	if (dm_new_crtc_state->base.color_mgmt_changed ||
+	    drm_atomic_crtc_needs_modeset(new_crtc_state)) {
+		ret = amdgpu_dm_set_regamma_lut(dm_new_crtc_state);
+		if (ret)
+			goto fail;
+		amdgpu_dm_set_ctm(dm_new_crtc_state);
 	}
 
+	/* Update Freesync settings. */
+	get_freesync_config_for_crtc(dm_new_crtc_state,
+				     dm_new_conn_state);
+
 	return ret;
 
 fail:
@@ -5640,145 +5655,141 @@ fail:
 	return ret;
 }
 
-static int dm_update_planes_state(struct dc *dc,
-				  struct drm_atomic_state *state,
-				  bool enable,
-				  bool *lock_and_validation_needed)
+static int dm_update_plane_state(struct dc *dc,
+				 struct drm_atomic_state *state,
+				 struct drm_plane *plane,
+				 struct drm_plane_state *old_plane_state,
+				 struct drm_plane_state *new_plane_state,
+				 bool enable,
+				 bool *lock_and_validation_needed)
 {
 
 	struct dm_atomic_state *dm_state = NULL;
 	struct drm_crtc *new_plane_crtc, *old_plane_crtc;
 	struct drm_crtc_state *old_crtc_state, *new_crtc_state;
-	struct drm_plane *plane;
-	struct drm_plane_state *old_plane_state, *new_plane_state;
 	struct dm_crtc_state *dm_new_crtc_state, *dm_old_crtc_state;
 	struct dm_plane_state *dm_new_plane_state, *dm_old_plane_state;
-	int i ;
 	/* TODO return page_flip_needed() function */
 	bool pflip_needed  = !state->allow_modeset;
 	int ret = 0;
 
 
-	/* Add new planes, in reverse order as DC expectation */
-	for_each_oldnew_plane_in_state_reverse(state, plane, old_plane_state, new_plane_state, i) {
-		new_plane_crtc = new_plane_state->crtc;
-		old_plane_crtc = old_plane_state->crtc;
-		dm_new_plane_state = to_dm_plane_state(new_plane_state);
-		dm_old_plane_state = to_dm_plane_state(old_plane_state);
+	new_plane_crtc = new_plane_state->crtc;
+	old_plane_crtc = old_plane_state->crtc;
+	dm_new_plane_state = to_dm_plane_state(new_plane_state);
+	dm_old_plane_state = to_dm_plane_state(old_plane_state);
 
-		/*TODO Implement atomic check for cursor plane */
-		if (plane->type == DRM_PLANE_TYPE_CURSOR)
-			continue;
+	/*TODO Implement atomic check for cursor plane */
+	if (plane->type == DRM_PLANE_TYPE_CURSOR)
+		return 0;
 
-		/* Remove any changed/removed planes */
-		if (!enable) {
-			if (pflip_needed &&
-			    plane->type != DRM_PLANE_TYPE_OVERLAY)
-				continue;
+	/* Remove any changed/removed planes */
+	if (!enable) {
+		if (pflip_needed &&
+		    plane->type != DRM_PLANE_TYPE_OVERLAY)
+			return 0;
 
-			if (!old_plane_crtc)
-				continue;
+		if (!old_plane_crtc)
+			return 0;
 
-			old_crtc_state = drm_atomic_get_old_crtc_state(
-					state, old_plane_crtc);
-			dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
+		old_crtc_state = drm_atomic_get_old_crtc_state(
+				state, old_plane_crtc);
+		dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
 
-			if (!dm_old_crtc_state->stream)
-				continue;
+		if (!dm_old_crtc_state->stream)
+			return 0;
 
-			DRM_DEBUG_ATOMIC("Disabling DRM plane: %d on DRM crtc %d\n",
-					plane->base.id, old_plane_crtc->base.id);
+		DRM_DEBUG_ATOMIC("Disabling DRM plane: %d on DRM crtc %d\n",
+				plane->base.id, old_plane_crtc->base.id);
 
-			ret = dm_atomic_get_state(state, &dm_state);
-			if (ret)
-				return ret;
+		ret = dm_atomic_get_state(state, &dm_state);
+		if (ret)
+			return ret;
 
-			if (!dc_remove_plane_from_context(
-					dc,
-					dm_old_crtc_state->stream,
-					dm_old_plane_state->dc_state,
-					dm_state->context)) {
+		if (!dc_remove_plane_from_context(
+				dc,
+				dm_old_crtc_state->stream,
+				dm_old_plane_state->dc_state,
+				dm_state->context)) {
 
-				ret = EINVAL;
-				return ret;
-			}
+			ret = -EINVAL;
+			return ret;
+		}
 
 
-			dc_plane_state_release(dm_old_plane_state->dc_state);
-			dm_new_plane_state->dc_state = NULL;
+		dc_plane_state_release(dm_old_plane_state->dc_state);
+		dm_new_plane_state->dc_state = NULL;
 
-			*lock_and_validation_needed = true;
+		*lock_and_validation_needed = true;
 
-		} else { /* Add new planes */
-			struct dc_plane_state *dc_new_plane_state;
+	} else { /* Add new planes */
+		struct dc_plane_state *dc_new_plane_state;
 
-			if (drm_atomic_plane_disabling(plane->state, new_plane_state))
-				continue;
+		if (drm_atomic_plane_disabling(plane->state, new_plane_state))
+			return 0;
 
-			if (!new_plane_crtc)
-				continue;
+		if (!new_plane_crtc)
+			return 0;
 
-			new_crtc_state = drm_atomic_get_new_crtc_state(state, new_plane_crtc);
-			dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
+		new_crtc_state = drm_atomic_get_new_crtc_state(state, new_plane_crtc);
+		dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
 
-			if (!dm_new_crtc_state->stream)
-				continue;
+		if (!dm_new_crtc_state->stream)
+			return 0;
 
-			if (pflip_needed &&
-			    plane->type != DRM_PLANE_TYPE_OVERLAY)
-				continue;
+		if (pflip_needed && plane->type != DRM_PLANE_TYPE_OVERLAY)
+			return 0;
 
-			WARN_ON(dm_new_plane_state->dc_state);
+		WARN_ON(dm_new_plane_state->dc_state);
 
-			dc_new_plane_state = dc_create_plane_state(dc);
-			if (!dc_new_plane_state)
-				return -ENOMEM;
+		dc_new_plane_state = dc_create_plane_state(dc);
+		if (!dc_new_plane_state)
+			return -ENOMEM;
 
-			DRM_DEBUG_DRIVER("Enabling DRM plane: %d on DRM crtc %d\n",
-					plane->base.id, new_plane_crtc->base.id);
+		DRM_DEBUG_DRIVER("Enabling DRM plane: %d on DRM crtc %d\n",
+				plane->base.id, new_plane_crtc->base.id);
 
-			ret = fill_plane_attributes(
-				new_plane_crtc->dev->dev_private,
-				dc_new_plane_state,
-				new_plane_state,
-				new_crtc_state);
-			if (ret) {
-				dc_plane_state_release(dc_new_plane_state);
-				return ret;
-			}
+		ret = fill_plane_attributes(
+			new_plane_crtc->dev->dev_private,
+			dc_new_plane_state,
+			new_plane_state,
+			new_crtc_state);
+		if (ret) {
+			dc_plane_state_release(dc_new_plane_state);
+			return ret;
+		}
 
-			ret = dm_atomic_get_state(state, &dm_state);
-			if (ret) {
-				dc_plane_state_release(dc_new_plane_state);
-				return ret;
-			}
+		ret = dm_atomic_get_state(state, &dm_state);
+		if (ret) {
+			dc_plane_state_release(dc_new_plane_state);
+			return ret;
+		}
 
-			/*
-			 * Any atomic check errors that occur after this will
-			 * not need a release. The plane state will be attached
-			 * to the stream, and therefore part of the atomic
-			 * state. It'll be released when the atomic state is
-			 * cleaned.
-			 */
-			if (!dc_add_plane_to_context(
-					dc,
-					dm_new_crtc_state->stream,
-					dc_new_plane_state,
-					dm_state->context)) {
-
-				dc_plane_state_release(dc_new_plane_state);
-				return -EINVAL;
-			}
+		/*
+		 * Any atomic check errors that occur after this will
+		 * not need a release. The plane state will be attached
+		 * to the stream, and therefore part of the atomic
+		 * state. It'll be released when the atomic state is
+		 * cleaned.
+		 */
+		if (!dc_add_plane_to_context(
+				dc,
+				dm_new_crtc_state->stream,
+				dc_new_plane_state,
+				dm_state->context)) {
 
-			dm_new_plane_state->dc_state = dc_new_plane_state;
+			dc_plane_state_release(dc_new_plane_state);
+			return -EINVAL;
+		}
 
-			/* Tell DC to do a full surface update every time there
-			 * is a plane change. Inefficient, but works for now.
-			 */
-			dm_new_plane_state->dc_state->update_flags.bits.full_update = 1;
+		dm_new_plane_state->dc_state = dc_new_plane_state;
 
-			*lock_and_validation_needed = true;
-		}
+		/* Tell DC to do a full surface update every time there
+		 * is a plane change. Inefficient, but works for now.
+		 */
+		dm_new_plane_state->dc_state->update_flags.bits.full_update = 1;
+
+		*lock_and_validation_needed = true;
 	}
 
 
@@ -5802,11 +5813,13 @@ dm_determine_update_type_for_commit(struct dc *dc,
 	struct dm_crtc_state *new_dm_crtc_state, *old_dm_crtc_state;
 	struct dc_stream_status *status = NULL;
 
-	struct dc_surface_update *updates = kzalloc(MAX_SURFACES * sizeof(struct dc_surface_update), GFP_KERNEL);
-	struct dc_plane_state *surface = kzalloc(MAX_SURFACES * sizeof(struct dc_plane_state), GFP_KERNEL);
-	struct dc_stream_update stream_update;
+	struct dc_surface_update *updates;
+	struct dc_plane_state *surface;
 	enum surface_update_type update_type = UPDATE_TYPE_FAST;
 
+	updates = kcalloc(MAX_SURFACES, sizeof(*updates), GFP_KERNEL);
+	surface = kcalloc(MAX_SURFACES, sizeof(*surface), GFP_KERNEL);
+
 	if (!updates || !surface) {
 		DRM_ERROR("Plane or surface update failed to allocate");
 		/* Set type to FULL to avoid crashing in DC*/
@@ -5815,79 +5828,89 @@ dm_determine_update_type_for_commit(struct dc *dc,
 	}
 
 	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) {
+		struct dc_stream_update stream_update = { 0 };
+
 		new_dm_crtc_state = to_dm_crtc_state(new_crtc_state);
 		old_dm_crtc_state = to_dm_crtc_state(old_crtc_state);
 		num_plane = 0;
 
-		if (new_dm_crtc_state->stream) {
-
-			for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, j) {
-				new_plane_crtc = new_plane_state->crtc;
-				old_plane_crtc = old_plane_state->crtc;
-				new_dm_plane_state = to_dm_plane_state(new_plane_state);
-				old_dm_plane_state = to_dm_plane_state(old_plane_state);
-
-				if (plane->type == DRM_PLANE_TYPE_CURSOR)
-					continue;
-
-				if (!state->allow_modeset)
-					continue;
-
-				if (crtc == new_plane_crtc) {
-					updates[num_plane].surface = &surface[num_plane];
-
-					if (new_crtc_state->mode_changed) {
-						updates[num_plane].surface->src_rect =
-									new_dm_plane_state->dc_state->src_rect;
-						updates[num_plane].surface->dst_rect =
-									new_dm_plane_state->dc_state->dst_rect;
-						updates[num_plane].surface->rotation =
-									new_dm_plane_state->dc_state->rotation;
-						updates[num_plane].surface->in_transfer_func =
-									new_dm_plane_state->dc_state->in_transfer_func;
-						stream_update.dst = new_dm_crtc_state->stream->dst;
-						stream_update.src = new_dm_crtc_state->stream->src;
-					}
-
-					if (new_crtc_state->color_mgmt_changed) {
-						updates[num_plane].gamma =
-								new_dm_plane_state->dc_state->gamma_correction;
-						updates[num_plane].in_transfer_func =
-								new_dm_plane_state->dc_state->in_transfer_func;
-						stream_update.gamut_remap =
-								&new_dm_crtc_state->stream->gamut_remap_matrix;
-						stream_update.out_transfer_func =
-								new_dm_crtc_state->stream->out_transfer_func;
-					}
-
-					num_plane++;
-				}
-			}
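+		/* A changed stream always requires a full update. */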
+		if (new_dm_crtc_state->stream != old_dm_crtc_state->stream) {
+			update_type = UPDATE_TYPE_FULL;
+			goto cleanup;
+		}
 
-			if (num_plane > 0) {
-				ret = dm_atomic_get_state(state, &dm_state);
-				if (ret)
-					goto cleanup;
+		if (!new_dm_crtc_state->stream)
+			continue;
 
-				old_dm_state = dm_atomic_get_old_state(state);
-				if (!old_dm_state) {
-					ret = -EINVAL;
-					goto cleanup;
-				}
+		for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, j) {
+			new_plane_crtc = new_plane_state->crtc;
+			old_plane_crtc = old_plane_state->crtc;
+			new_dm_plane_state = to_dm_plane_state(new_plane_state);
+			old_dm_plane_state = to_dm_plane_state(old_plane_state);
+
+			if (plane->type == DRM_PLANE_TYPE_CURSOR)
+				continue;
 
-				status = dc_state_get_stream_status(old_dm_state->context,
-								    new_dm_crtc_state->stream);
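+			/* A new dc plane state likewise forces a full update. */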
+			if (new_dm_plane_state->dc_state != old_dm_plane_state->dc_state) {
+				update_type = UPDATE_TYPE_FULL;
+				goto cleanup;
+			}
 
-				update_type = dc_check_update_surfaces_for_stream(dc, updates, num_plane,
-										  &stream_update, status);
+			if (!state->allow_modeset)
+				continue;
 
-				if (update_type > UPDATE_TYPE_MED) {
-					update_type = UPDATE_TYPE_FULL;
-					goto cleanup;
-				}
+			if (crtc != new_plane_crtc)
+				continue;
+
+			updates[num_plane].surface = &surface[num_plane];
+
+			if (new_crtc_state->mode_changed) {
+				updates[num_plane].surface->src_rect =
+						new_dm_plane_state->dc_state->src_rect;
+				updates[num_plane].surface->dst_rect =
+						new_dm_plane_state->dc_state->dst_rect;
+				updates[num_plane].surface->rotation =
+						new_dm_plane_state->dc_state->rotation;
+				updates[num_plane].surface->in_transfer_func =
+						new_dm_plane_state->dc_state->in_transfer_func;
+				stream_update.dst = new_dm_crtc_state->stream->dst;
+				stream_update.src = new_dm_crtc_state->stream->src;
+			}
+
+			if (new_crtc_state->color_mgmt_changed) {
+				updates[num_plane].gamma =
+						new_dm_plane_state->dc_state->gamma_correction;
+				updates[num_plane].in_transfer_func =
+						new_dm_plane_state->dc_state->in_transfer_func;
+				stream_update.gamut_remap =
+						&new_dm_crtc_state->stream->gamut_remap_matrix;
+				stream_update.out_transfer_func =
+						new_dm_crtc_state->stream->out_transfer_func;
 			}
 
-		} else if (!new_dm_crtc_state->stream && old_dm_crtc_state->stream) {
+			num_plane++;
+		}
+
+		if (num_plane == 0)
+			continue;
+
+		ret = dm_atomic_get_state(state, &dm_state);
+		if (ret)
+			goto cleanup;
+
+		old_dm_state = dm_atomic_get_old_state(state);
+		if (!old_dm_state) {
+			ret = -EINVAL;
+			goto cleanup;
+		}
+
+		status = dc_stream_get_status_from_state(old_dm_state->context,
+							 new_dm_crtc_state->stream);
+
+		update_type = dc_check_update_surfaces_for_stream(dc, updates, num_plane,
+								  &stream_update, status);
+
+		if (update_type > UPDATE_TYPE_MED) {
 			update_type = UPDATE_TYPE_FULL;
 			goto cleanup;
 		}
@@ -5936,6 +5959,8 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
 	struct drm_connector_state *old_con_state, *new_con_state;
 	struct drm_crtc *crtc;
 	struct drm_crtc_state *old_crtc_state, *new_crtc_state;
+	struct drm_plane *plane;
+	struct drm_plane_state *old_plane_state, *new_plane_state;
 	enum surface_update_type update_type = UPDATE_TYPE_FAST;
 	enum surface_update_type overall_update_type = UPDATE_TYPE_FAST;
 
@@ -5969,28 +5994,84 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
 			goto fail;
 	}
 
+	/*
+	 * Add all primary and overlay planes on the CRTC to the state
+	 * whenever a plane is enabled to maintain correct z-ordering
+	 * and to enable fast surface updates.
+	 */
+	drm_for_each_crtc(crtc, dev) {
+		bool modified = false;
+
+		for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, i) {
+			if (plane->type == DRM_PLANE_TYPE_CURSOR)
+				continue;
+
+			if (new_plane_state->crtc == crtc ||
+			    old_plane_state->crtc == crtc) {
+				modified = true;
+				break;
+			}
+		}
+
+		if (!modified)
+			continue;
+
+		drm_for_each_plane_mask(plane, state->dev, crtc->state->plane_mask) {
+			if (plane->type == DRM_PLANE_TYPE_CURSOR)
+				continue;
+
+			new_plane_state =
+				drm_atomic_get_plane_state(state, plane);
+
+			if (IS_ERR(new_plane_state)) {
+				ret = PTR_ERR(new_plane_state);
+				goto fail;
+			}
+		}
+	}
+
 	/* Remove existing planes if they are modified */
-	ret = dm_update_planes_state(dc, state, false, &lock_and_validation_needed);
-	if (ret) {
-		goto fail;
+	for_each_oldnew_plane_in_state_reverse(state, plane, old_plane_state, new_plane_state, i) {
+		ret = dm_update_plane_state(dc, state, plane,
+					    old_plane_state,
+					    new_plane_state,
+					    false,
+					    &lock_and_validation_needed);
+		if (ret)
+			goto fail;
 	}
 
 	/* Disable all crtcs which require disable */
-	ret = dm_update_crtcs_state(&adev->dm, state, false, &lock_and_validation_needed);
-	if (ret) {
-		goto fail;
+	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) {
+		ret = dm_update_crtc_state(&adev->dm, state, crtc,
+					   old_crtc_state,
+					   new_crtc_state,
+					   false,
+					   &lock_and_validation_needed);
+		if (ret)
+			goto fail;
 	}
 
 	/* Enable all crtcs which require enable */
-	ret = dm_update_crtcs_state(&adev->dm, state, true, &lock_and_validation_needed);
-	if (ret) {
-		goto fail;
+	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) {
+		ret = dm_update_crtc_state(&adev->dm, state, crtc,
+					   old_crtc_state,
+					   new_crtc_state,
+					   true,
+					   &lock_and_validation_needed);
+		if (ret)
+			goto fail;
 	}
 
 	/* Add new/modified planes */
-	ret = dm_update_planes_state(dc, state, true, &lock_and_validation_needed);
-	if (ret) {
-		goto fail;
+	for_each_oldnew_plane_in_state_reverse(state, plane, old_plane_state, new_plane_state, i) {
+		ret = dm_update_plane_state(dc, state, plane,
+					    old_plane_state,
+					    new_plane_state,
+					    true,
+					    &lock_and_validation_needed);
+		if (ret)
+			goto fail;
 	}
 
 	/* Run this here since we want to validate the streams we created */
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c
index f088ac585978..a10e3a50d9ef 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c
@@ -64,8 +64,10 @@ amdgpu_dm_crtc_verify_crc_source(struct drm_crtc *crtc, const char *src_name,
 
 int amdgpu_dm_crtc_set_crc_source(struct drm_crtc *crtc, const char *src_name)
 {
+	struct amdgpu_device *adev = crtc->dev->dev_private;
 	struct dm_crtc_state *crtc_state = to_dm_crtc_state(crtc->state);
 	struct dc_stream_state *stream_state = crtc_state->stream;
+	bool enable;
 
 	enum amdgpu_dm_pipe_crc_source source = dm_parse_crc_source(src_name);
 
@@ -80,29 +82,33 @@ int amdgpu_dm_crtc_set_crc_source(struct drm_crtc *crtc, const char *src_name)
 		return -EINVAL;
 	}
 
-	/* When enabling CRC, we should also disable dithering. */
-	if (source == AMDGPU_DM_PIPE_CRC_SOURCE_AUTO) {
-		if (dc_stream_configure_crc(stream_state->ctx->dc,
-					    stream_state,
-					    true, true)) {
-			crtc_state->crc_enabled = true;
-			dc_stream_set_dither_option(stream_state,
-						    DITHER_OPTION_TRUN8);
-		}
-		else
-			return -EINVAL;
-	} else {
-		if (dc_stream_configure_crc(stream_state->ctx->dc,
-					    stream_state,
-					    false, false)) {
-			crtc_state->crc_enabled = false;
-			dc_stream_set_dither_option(stream_state,
-						    DITHER_OPTION_DEFAULT);
-		}
-		else
-			return -EINVAL;
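+	/* Only the "auto" source enables CRC capture; any other source disables it. */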
+	enable = (source == AMDGPU_DM_PIPE_CRC_SOURCE_AUTO);
+
+	mutex_lock(&adev->dm.dc_lock);
+	if (!dc_stream_configure_crc(stream_state->ctx->dc, stream_state,
+				     enable, enable)) {
+		mutex_unlock(&adev->dm.dc_lock);
+		return -EINVAL;
 	}
 
+	/* When enabling CRC, we should also disable dithering. */
+	dc_stream_set_dither_option(stream_state,
+				    enable ? DITHER_OPTION_TRUN8
+					   : DITHER_OPTION_DEFAULT);
+
+	mutex_unlock(&adev->dm.dc_lock);
+
+	/*
+	 * Reading the CRC requires the vblank interrupt handler to be
+	 * enabled. Keep a reference until CRC capture stops.
+	 */
+	if (!crtc_state->crc_enabled && enable)
+		drm_crtc_vblank_get(crtc);
+	else if (crtc_state->crc_enabled && !enable)
+		drm_crtc_vblank_put(crtc);
+
+	crtc_state->crc_enabled = enable;
+
 	/* Reset crc_skipped on dm state */
 	crtc_state->crc_skip_count = 0;
 	return 0;
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index ddd75a4d8ba5..4a55cde027cf 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -803,6 +803,45 @@ static ssize_t dtn_log_write(
 	return size;
 }
 
+/*
+ * Current backlight value.  Read only.
+ * As written to the display, taking ABM and the backlight LUT into account.
+ * Ranges from 0x0 to 0x10000 (= 100% PWM)
+ */
+static int current_backlight_read(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = (struct drm_info_node *)m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct amdgpu_device *adev = dev->dev_private;
+	struct dc *dc = adev->dm.dc;
+	unsigned int backlight = dc_get_current_backlight_pwm(dc);
+
+	seq_printf(m, "0x%x\n", backlight);
+	return 0;
+}
+
+/*
+ * Backlight value that is being approached.  Read only.
+ * As written to the display, taking ABM and the backlight LUT into account.
+ * Ranges from 0x0 to 0x10000 (= 100% PWM)
+ */
+static int target_backlight_read(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = (struct drm_info_node *)m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct amdgpu_device *adev = dev->dev_private;
+	struct dc *dc = adev->dm.dc;
+	unsigned int backlight = dc_get_target_backlight_pwm(dc);
+
+	seq_printf(m, "0x%x\n", backlight);
+	return 0;
+}
+
+static const struct drm_info_list amdgpu_dm_debugfs_list[] = {
+	{"amdgpu_current_backlight_pwm", &current_backlight_read},
+	{"amdgpu_target_backlight_pwm", &target_backlight_read},
+};
+
 int dtn_debugfs_init(struct amdgpu_device *adev)
 {
 	static const struct file_operations dtn_log_fops = {
@@ -813,9 +852,15 @@ int dtn_debugfs_init(struct amdgpu_device *adev)
 	};
 
 	struct drm_minor *minor = adev->ddev->primary;
-	struct dentry *root = minor->debugfs_root;
+	struct dentry *ent, *root = minor->debugfs_root;
+	int ret;
+
+	ret = amdgpu_debugfs_add_files(adev, amdgpu_dm_debugfs_list,
+				ARRAY_SIZE(amdgpu_dm_debugfs_list));
+	if (ret)
+		return ret;
 
-	struct dentry *ent = debugfs_create_file(
+	ent = debugfs_create_file(
 		"amdgpu_dm_dtn_log",
 		0644,
 		root,
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index 39997d977efb..b39766bd2840 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -29,7 +29,7 @@
 #include <linux/i2c.h>
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/amdgpu_drm.h>
 #include <drm/drm_edid.h>
 
@@ -192,7 +192,7 @@ bool dm_helpers_dp_mst_write_payload_allocation_table(
 	int bpp = 0;
 	int pbn = 0;
 
-	aconnector = stream->sink->priv;
+	aconnector = (struct amdgpu_dm_connector *)stream->dm_stream_context;
 
 	if (!aconnector || !aconnector->mst_port)
 		return false;
@@ -205,7 +205,7 @@ bool dm_helpers_dp_mst_write_payload_allocation_table(
 	mst_port = aconnector->port;
 
 	if (enable) {
-		clock = stream->timing.pix_clk_khz;
+		clock = stream->timing.pix_clk_100hz / 10;
 
 		switch (stream->timing.display_color_depth) {
 
@@ -263,6 +263,13 @@ bool dm_helpers_dp_mst_write_payload_allocation_table(
 	return true;
 }
 
+/*
+ * Poll the pending down reply before clearing the payload allocation table.
+ */
+void dm_helpers_dp_mst_poll_pending_down_reply(
+	struct dc_context *ctx,
+	const struct dc_link *link)
+{}
 
 /*
  * Clear payload allocation table before enabling the MST DP link.
@@ -284,7 +291,7 @@ bool dm_helpers_dp_mst_poll_for_allocation_change_trigger(
 	struct drm_dp_mst_topology_mgr *mst_mgr;
 	int ret;
 
-	aconnector = stream->sink->priv;
+	aconnector = (struct amdgpu_dm_connector *)stream->dm_stream_context;
 
 	if (!aconnector || !aconnector->mst_port)
 		return false;
@@ -312,7 +319,7 @@ bool dm_helpers_dp_mst_send_payload_allocation(
 	struct drm_dp_mst_port *mst_port;
 	int ret;
 
-	aconnector = stream->sink->priv;
+	aconnector = (struct amdgpu_dm_connector *)stream->dm_stream_context;
 
 	if (!aconnector || !aconnector->mst_port)
 		return false;
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
index 1b0d209d8367..f51d52eb52e6 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -35,6 +35,8 @@
 
 #include "dc_link_ddc.h"
 
+#include "i2caux_interface.h"
+
 /* #define TRACE_DPCD */
 
 #ifdef TRACE_DPCD
@@ -81,80 +83,24 @@ static ssize_t dm_dp_aux_transfer(struct drm_dp_aux *aux,
 				  struct drm_dp_aux_msg *msg)
 {
 	ssize_t result = 0;
-	enum i2caux_transaction_action action;
-	enum aux_transaction_type type;
+	struct aux_payload payload;
 
 	if (WARN_ON(msg->size > 16))
 		return -E2BIG;
 
-	switch (msg->request & ~DP_AUX_I2C_MOT) {
-	case DP_AUX_NATIVE_READ:
-		type = AUX_TRANSACTION_TYPE_DP;
-		action = I2CAUX_TRANSACTION_ACTION_DP_READ;
-
-		result = dc_link_aux_transfer(TO_DM_AUX(aux)->ddc_service,
-					      msg->address,
-					      &msg->reply,
-					      msg->buffer,
-					      msg->size,
-					      type,
-					      action);
-		break;
-	case DP_AUX_NATIVE_WRITE:
-		type = AUX_TRANSACTION_TYPE_DP;
-		action = I2CAUX_TRANSACTION_ACTION_DP_WRITE;
-
-		dc_link_aux_transfer(TO_DM_AUX(aux)->ddc_service,
-				     msg->address,
-				     &msg->reply,
-				     msg->buffer,
-				     msg->size,
-				     type,
-				     action);
-		result = msg->size;
-		break;
-	case DP_AUX_I2C_READ:
-		type = AUX_TRANSACTION_TYPE_I2C;
-		if (msg->request & DP_AUX_I2C_MOT)
-			action = I2CAUX_TRANSACTION_ACTION_I2C_READ_MOT;
-		else
-			action = I2CAUX_TRANSACTION_ACTION_I2C_READ;
-
-		result = dc_link_aux_transfer(TO_DM_AUX(aux)->ddc_service,
-					      msg->address,
-					      &msg->reply,
-					      msg->buffer,
-					      msg->size,
-					      type,
-					      action);
-		break;
-	case DP_AUX_I2C_WRITE:
-		type = AUX_TRANSACTION_TYPE_I2C;
-		if (msg->request & DP_AUX_I2C_MOT)
-			action = I2CAUX_TRANSACTION_ACTION_I2C_WRITE_MOT;
-		else
-			action = I2CAUX_TRANSACTION_ACTION_I2C_WRITE;
-
-		dc_link_aux_transfer(TO_DM_AUX(aux)->ddc_service,
-				     msg->address,
-				     &msg->reply,
-				     msg->buffer,
-				     msg->size,
-				     type,
-				     action);
-		result = msg->size;
-		break;
-	default:
-		return -EINVAL;
-	}
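+	/* Translate the DRM AUX message into a single DC aux_payload. */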
+	payload.address = msg->address;
+	payload.data = msg->buffer;
+	payload.length = msg->size;
+	payload.reply = &msg->reply;
+	payload.i2c_over_aux = (msg->request & DP_AUX_NATIVE_WRITE) == 0;
+	payload.write = (msg->request & DP_AUX_I2C_READ) == 0;
+	payload.mot = (msg->request & DP_AUX_I2C_MOT) != 0;
+	payload.defer_delay = 0;
 
-#ifdef TRACE_DPCD
-	log_dpcd(msg->request,
-		 msg->address,
-		 msg->buffer,
-		 msg->size,
-		 r == DDC_RESULT_SUCESSFULL);
-#endif
+	result = dc_link_aux_transfer(TO_DM_AUX(aux)->ddc_service, &payload);
+
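+	/* DC returns the number of bytes read; writes report the full message size. */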
+	if (payload.write)
+		result = msg->size;
 
 	if (result < 0) /* DC doesn't know about kernel error codes */
 		result = -EIO;
@@ -191,6 +137,7 @@ dm_dp_mst_connector_destroy(struct drm_connector *connector)
 	drm_encoder_cleanup(&amdgpu_encoder->base);
 	kfree(amdgpu_encoder);
 	drm_connector_cleanup(connector);
+	drm_dp_mst_put_port_malloc(amdgpu_dm_connector->port);
 	kfree(amdgpu_dm_connector);
 }
 
@@ -227,6 +174,11 @@ static int dm_dp_mst_get_modes(struct drm_connector *connector)
 		aconnector->edid = edid;
 	}
 
+	if (aconnector->dc_sink && aconnector->dc_sink->sink_signal == SIGNAL_TYPE_VIRTUAL) {
+		dc_sink_release(aconnector->dc_sink);
+		aconnector->dc_sink = NULL;
+	}
+
 	if (!aconnector->dc_sink) {
 		struct dc_sink *dc_sink;
 		struct dc_sink_init_data init_params = {
@@ -363,7 +315,9 @@ dm_dp_add_mst_connector(struct drm_dp_mst_topology_mgr *mgr,
 	amdgpu_dm_connector_funcs_reset(connector);
 
 	DRM_INFO("DM_MST: added connector: %p [id: %d] [master: %p]\n",
-			aconnector, connector->base.id, aconnector->mst_port);
+		 aconnector, connector->base.id, aconnector->mst_port);
+
+	drm_dp_mst_get_port_malloc(port);
 
 	DRM_DEBUG_KMS(":%d\n", connector->base.id);
 
@@ -379,12 +333,12 @@ static void dm_dp_destroy_mst_connector(struct drm_dp_mst_topology_mgr *mgr,
 	struct amdgpu_dm_connector *aconnector = to_amdgpu_dm_connector(connector);
 
 	DRM_INFO("DM_MST: Disabling connector: %p [id: %d] [master: %p]\n",
-				aconnector, connector->base.id, aconnector->mst_port);
+		 aconnector, connector->base.id, aconnector->mst_port);
 
-	aconnector->port = NULL;
 	if (aconnector->dc_sink) {
 		amdgpu_dm_update_freesync_caps(connector, NULL);
-		dc_link_remove_remote_sink(aconnector->dc_link, aconnector->dc_sink);
+		dc_link_remove_remote_sink(aconnector->dc_link,
+					   aconnector->dc_sink);
 		dc_sink_release(aconnector->dc_sink);
 		aconnector->dc_sink = NULL;
 	}
@@ -395,14 +349,6 @@ static void dm_dp_destroy_mst_connector(struct drm_dp_mst_topology_mgr *mgr,
 	drm_connector_put(connector);
 }
 
-static void dm_dp_mst_hotplug(struct drm_dp_mst_topology_mgr *mgr)
-{
-	struct amdgpu_dm_connector *master = container_of(mgr, struct amdgpu_dm_connector, mst_mgr);
-	struct drm_device *dev = master->base.dev;
-
-	drm_kms_helper_hotplug_event(dev);
-}
-
 static void dm_dp_mst_register_connector(struct drm_connector *connector)
 {
 	struct drm_device *dev = connector->dev;
@@ -419,7 +365,6 @@ static void dm_dp_mst_register_connector(struct drm_connector *connector)
 static const struct drm_dp_mst_topology_cbs dm_mst_cbs = {
 	.add_connector = dm_dp_add_mst_connector,
 	.destroy_connector = dm_dp_destroy_mst_connector,
-	.hotplug = dm_dp_mst_hotplug,
 	.register_connector = dm_dp_mst_register_connector
 };
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
index 9d2d6986b983..a114954d6a5b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
@@ -25,7 +25,7 @@
 #include <linux/acpi.h>
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/amdgpu_drm.h>
 #include "dm_services.h"
 #include "amdgpu.h"
@@ -559,6 +559,58 @@ void pp_rv_set_pme_wa_enable(struct pp_smu *pp)
 	pp_funcs->notify_smu_enable_pwe(pp_handle);
 }
 
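+/*
+ * Thin wrappers forwarding DC display-count and clock requests to powerplay.
+ * Each is a no-op when the corresponding powerplay callback is missing.
+ */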
+void pp_rv_set_active_display_count(struct pp_smu *pp, int count)
+{
+	const struct dc_context *ctx = pp->dm;
+	struct amdgpu_device *adev = ctx->driver_context;
+	void *pp_handle = adev->powerplay.pp_handle;
+	const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
+
+	if (!pp_funcs || !pp_funcs->set_active_display_count)
+		return;
+
+	pp_funcs->set_active_display_count(pp_handle, count);
+}
+
+void pp_rv_set_min_deep_sleep_dcfclk(struct pp_smu *pp, int clock)
+{
+	const struct dc_context *ctx = pp->dm;
+	struct amdgpu_device *adev = ctx->driver_context;
+	void *pp_handle = adev->powerplay.pp_handle;
+	const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
+
+	if (!pp_funcs || !pp_funcs->set_min_deep_sleep_dcefclk)
+		return;
+
+	pp_funcs->set_min_deep_sleep_dcefclk(pp_handle, clock);
+}
+
+void pp_rv_set_hard_min_dcefclk_by_freq(struct pp_smu *pp, int clock)
+{
+	const struct dc_context *ctx = pp->dm;
+	struct amdgpu_device *adev = ctx->driver_context;
+	void *pp_handle = adev->powerplay.pp_handle;
+	const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
+
+	if (!pp_funcs || !pp_funcs->set_hard_min_dcefclk_by_freq)
+		return;
+
+	pp_funcs->set_hard_min_dcefclk_by_freq(pp_handle, clock);
+}
+
+void pp_rv_set_hard_min_fclk_by_freq(struct pp_smu *pp, int mhz)
+{
+	const struct dc_context *ctx = pp->dm;
+	struct amdgpu_device *adev = ctx->driver_context;
+	void *pp_handle = adev->powerplay.pp_handle;
+	const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
+
+	if (!pp_funcs || !pp_funcs->set_hard_min_fclk_by_freq)
+		return;
+
+	pp_funcs->set_hard_min_fclk_by_freq(pp_handle, mhz);
+}
+
 void dm_pp_get_funcs_rv(
 		struct dc_context *ctx,
 		struct pp_smu_funcs_rv *funcs)
@@ -567,4 +619,9 @@ void dm_pp_get_funcs_rv(
 	funcs->set_display_requirement = pp_rv_set_display_requirement;
 	funcs->set_wm_ranges = pp_rv_set_wm_ranges;
 	funcs->set_pme_wa_enable = pp_rv_set_pme_wa_enable;
+	funcs->set_display_count = pp_rv_set_active_display_count;
+	funcs->set_min_deep_sleep_dcfclk = pp_rv_set_min_deep_sleep_dcfclk;
+	funcs->set_hard_min_dcfclk_by_freq = pp_rv_set_hard_min_dcefclk_by_freq;
+	funcs->set_hard_min_fclk_by_freq = pp_rv_set_hard_min_fclk_by_freq;
 }
+
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_services.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_services.c
index 516795342dd2..d915e8c8769b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_services.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_services.c
@@ -27,7 +27,7 @@
 #include <linux/acpi.h>
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/amdgpu_drm.h>
 #include "dm_services.h"
 #include "amdgpu.h"
diff --git a/drivers/gpu/drm/amd/display/dc/Makefile b/drivers/gpu/drm/amd/display/dc/Makefile
index aed538a4d1ba..b8ddb4acccdb 100644
--- a/drivers/gpu/drm/amd/display/dc/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/Makefile
@@ -23,7 +23,7 @@
 # Makefile for Display Core (dc) component.
 #
 
-DC_LIBS = basics bios calcs dce gpio i2caux irq virtual
+DC_LIBS = basics bios calcs dce gpio irq virtual
 
 ifdef CONFIG_DRM_AMD_DC_DCN1_0
 DC_LIBS += dcn10 dml
@@ -41,7 +41,8 @@ AMD_DC = $(addsuffix /Makefile, $(addprefix $(FULL_AMD_DISPLAY_PATH)/dc/,$(DC_LI
 include $(AMD_DC)
 
 DISPLAY_CORE = dc.o dc_link.o dc_resource.o dc_hw_sequencer.o dc_sink.o \
-dc_surface.o dc_link_hwss.o dc_link_dp.o dc_link_ddc.o dc_debug.o dc_stream.o
+dc_surface.o dc_link_hwss.o dc_link_dp.o dc_link_ddc.o dc_debug.o dc_stream.o \
+dc_vm_helper.o
 
 AMD_DISPLAY_CORE = $(addprefix $(AMDDALPATH)/dc/core/,$(DISPLAY_CORE))
 
diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
index c2ab026aee91..a4c97d32e751 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser.c
@@ -835,18 +835,6 @@ static enum bp_result bios_parser_enable_crtc(
 	return bp->cmd_tbl.enable_crtc(bp, id, enable);
 }
 
-static enum bp_result bios_parser_crtc_source_select(
-	struct dc_bios *dcb,
-	struct bp_crtc_source_select *bp_params)
-{
-	struct bios_parser *bp = BP_FROM_DCB(dcb);
-
-	if (!bp->cmd_tbl.select_crtc_source)
-		return BP_RESULT_FAILURE;
-
-	return bp->cmd_tbl.select_crtc_source(bp, bp_params);
-}
-
 static enum bp_result bios_parser_enable_disp_power_gating(
 	struct dc_bios *dcb,
 	enum controller_id controller_id,
@@ -2842,8 +2830,6 @@ static const struct dc_vbios_funcs vbios_funcs = {
 
 	.program_crtc_timing = bios_parser_program_crtc_timing, /* still use.  should probably retire and program directly */
 
-	.crtc_source_select = bios_parser_crtc_source_select,  /* still use.  should probably retire and program directly */
-
 	.program_display_engine_pll = bios_parser_program_display_engine_pll,
 
 	.enable_disp_power_gating = bios_parser_enable_disp_power_gating,
diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
index c513ab6f3843..fd5266a58297 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
@@ -265,6 +265,7 @@ static struct atom_display_object_path_v2 *get_bios_object(
 					&& id.enum_id == obj_id.enum_id)
 				return &bp->object_info_tbl.v1_4->display_path[i];
 		}
+		/* fall through */
 	case OBJECT_TYPE_CONNECTOR:
 	case OBJECT_TYPE_GENERIC:
 		/* Both Generic and Connector Object ID
@@ -277,6 +278,7 @@ static struct atom_display_object_path_v2 *get_bios_object(
 					&& id.enum_id == obj_id.enum_id)
 				return &bp->object_info_tbl.v1_4->display_path[i];
 		}
+		/* fall through */
 	default:
 		return NULL;
 	}
@@ -1083,18 +1085,6 @@ static enum bp_result bios_parser_enable_crtc(
 	return bp->cmd_tbl.enable_crtc(bp, id, enable);
 }
 
-static enum bp_result bios_parser_crtc_source_select(
-	struct dc_bios *dcb,
-	struct bp_crtc_source_select *bp_params)
-{
-	struct bios_parser *bp = BP_FROM_DCB(dcb);
-
-	if (!bp->cmd_tbl.select_crtc_source)
-		return BP_RESULT_FAILURE;
-
-	return bp->cmd_tbl.select_crtc_source(bp, bp_params);
-}
-
 static enum bp_result bios_parser_enable_disp_power_gating(
 	struct dc_bios *dcb,
 	enum controller_id controller_id,
@@ -1899,8 +1889,6 @@ static const struct dc_vbios_funcs vbios_funcs = {
 
 	.is_accelerated_mode = bios_parser_is_accelerated_mode,
 
-	.is_active_display = bios_is_active_display,
-
 	.set_scratch_critical_state = bios_parser_set_scratch_critical_state,
 
 
@@ -1917,8 +1905,6 @@ static const struct dc_vbios_funcs vbios_funcs = {
 
 	.program_crtc_timing = bios_parser_program_crtc_timing,
 
-	.crtc_source_select = bios_parser_crtc_source_select,
-
 	.enable_disp_power_gating = bios_parser_enable_disp_power_gating,
 
 	.bios_parser_destroy = firmware_parser_destroy,
diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser_helper.c b/drivers/gpu/drm/amd/display/dc/bios/bios_parser_helper.c
index fdda8aa8e303..fce46ab54c54 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser_helper.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser_helper.c
@@ -83,101 +83,7 @@ uint32_t bios_get_vga_enabled_displays(
 {
 	uint32_t active_disp = 1;
 
-	if (bios->regs->BIOS_SCRATCH_3) /*follow up with other asic, todo*/
-		active_disp = REG_READ(BIOS_SCRATCH_3) & 0XFFFF;
+	active_disp = REG_READ(BIOS_SCRATCH_3) & 0XFFFF;
 	return active_disp;
 }
 
-bool bios_is_active_display(
-		struct dc_bios *bios,
-		enum signal_type signal,
-		const struct connector_device_tag_info *device_tag)
-{
-	uint32_t active = 0;
-	uint32_t connected = 0;
-	uint32_t bios_scratch_0 = 0;
-	uint32_t bios_scratch_3 = 0;
-
-	switch (signal)	{
-	case SIGNAL_TYPE_DVI_SINGLE_LINK:
-	case SIGNAL_TYPE_DVI_DUAL_LINK:
-	case SIGNAL_TYPE_HDMI_TYPE_A:
-	case SIGNAL_TYPE_DISPLAY_PORT:
-	case SIGNAL_TYPE_DISPLAY_PORT_MST:
-		{
-			if (device_tag->dev_id.device_type == DEVICE_TYPE_DFP) {
-				switch (device_tag->dev_id.enum_id)	{
-				case 1:
-					{
-						active    = ATOM_S3_DFP1_ACTIVE;
-						connected = 0x0008;	//ATOM_DISPLAY_DFP1_CONNECT
-					}
-					break;
-
-				case 2:
-					{
-						active    = ATOM_S3_DFP2_ACTIVE;
-						connected = 0x0080; //ATOM_DISPLAY_DFP2_CONNECT
-					}
-					break;
-
-				case 3:
-					{
-						active    = ATOM_S3_DFP3_ACTIVE;
-						connected = 0x0200; //ATOM_DISPLAY_DFP3_CONNECT
-					}
-					break;
-
-				case 4:
-					{
-						active    = ATOM_S3_DFP4_ACTIVE;
-						connected = 0x0400;	//ATOM_DISPLAY_DFP4_CONNECT
-					}
-					break;
-
-				case 5:
-					{
-						active    = ATOM_S3_DFP5_ACTIVE;
-						connected = 0x0800; //ATOM_DISPLAY_DFP5_CONNECT
-					}
-					break;
-
-				case 6:
-					{
-						active    = ATOM_S3_DFP6_ACTIVE;
-						connected = 0x0040; //ATOM_DISPLAY_DFP6_CONNECT
-					}
-					break;
-
-				default:
-					break;
-				}
-				}
-			}
-			break;
-
-	case SIGNAL_TYPE_LVDS:
-	case SIGNAL_TYPE_EDP:
-		{
-			active    = ATOM_S3_LCD1_ACTIVE;
-			connected = 0x0002;	//ATOM_DISPLAY_LCD1_CONNECT
-		}
-		break;
-
-	default:
-		break;
-	}
-
-
-	if (bios->regs->BIOS_SCRATCH_0) /*follow up with other asic, todo*/
-		bios_scratch_0 = REG_READ(BIOS_SCRATCH_0);
-	if (bios->regs->BIOS_SCRATCH_3) /*follow up with other asic, todo*/
-		bios_scratch_3 = REG_READ(BIOS_SCRATCH_3);
-
-	bios_scratch_3 &= ATOM_S3_DEVICE_ACTIVE_MASK;
-	if ((active & bios_scratch_3) && (connected & bios_scratch_0))
-		return true;
-
-	return false;
-}
-
diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser_helper.h b/drivers/gpu/drm/amd/display/dc/bios/bios_parser_helper.h
index f33cac2147e3..75a29e68fb27 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser_helper.h
+++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser_helper.h
@@ -35,10 +35,6 @@ bool bios_is_accelerated_mode(struct dc_bios *bios);
 void bios_set_scratch_acc_mode_change(struct dc_bios *bios);
 void bios_set_scratch_critical_state(struct dc_bios *bios, bool state);
 uint32_t bios_get_vga_enabled_displays(struct dc_bios *bios);
-bool bios_is_active_display(
-	struct dc_bios *bios,
-	enum signal_type signal,
-	const struct connector_device_tag_info *device_tag);
 
 #define GET_IMAGE(type, offset) ((type *) bios_get_image(&bp->base, offset, sizeof(type)))
 
diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table.c b/drivers/gpu/drm/amd/display/dc/bios/command_table.c
index 2bd7cd97e00d..5815983caaf8 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/command_table.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/command_table.c
@@ -55,7 +55,6 @@ static void init_adjust_display_pll(struct bios_parser *bp);
 static void init_dac_encoder_control(struct bios_parser *bp);
 static void init_dac_output_control(struct bios_parser *bp);
 static void init_set_crtc_timing(struct bios_parser *bp);
-static void init_select_crtc_source(struct bios_parser *bp);
 static void init_enable_crtc(struct bios_parser *bp);
 static void init_enable_crtc_mem_req(struct bios_parser *bp);
 static void init_external_encoder_control(struct bios_parser *bp);
@@ -73,7 +72,6 @@ void dal_bios_parser_init_cmd_tbl(struct bios_parser *bp)
 	init_dac_encoder_control(bp);
 	init_dac_output_control(bp);
 	init_set_crtc_timing(bp);
-	init_select_crtc_source(bp);
 	init_enable_crtc(bp);
 	init_enable_crtc_mem_req(bp);
 	init_program_clock(bp);
@@ -964,9 +962,9 @@ static enum bp_result set_pixel_clock_v3(
 	allocation.sPCLKInput.ucPostDiv =
 			(uint8_t)bp_params->pixel_clock_post_divider;
 
-	/* We need to convert from KHz units into 10KHz units */
+	/* We need to convert from 100Hz units into 10KHz units */
 	allocation.sPCLKInput.usPixelClock =
-			cpu_to_le16((uint16_t)(bp_params->target_pixel_clock / 10));
+			cpu_to_le16((uint16_t)(bp_params->target_pixel_clock_100hz / 100));
 
 	params = (PIXEL_CLOCK_PARAMETERS_V3 *)&allocation.sPCLKInput;
 	params->ucTransmitterId =
@@ -1042,9 +1040,9 @@ static enum bp_result set_pixel_clock_v5(
 				(uint8_t)bp->cmd_helper->encoder_mode_bp_to_atom(
 						bp_params->signal_type, false);
 
-		/* We need to convert from KHz units into 10KHz units */
+		/* We need to convert from 100Hz units into 10KHz units */
 		clk.sPCLKInput.usPixelClock =
-				cpu_to_le16((uint16_t)(bp_params->target_pixel_clock / 10));
+				cpu_to_le16((uint16_t)(bp_params->target_pixel_clock_100hz / 100));
 
 		if (bp_params->flags.FORCE_PROGRAMMING_OF_PLL)
 			clk.sPCLKInput.ucMiscInfo |=
@@ -1118,9 +1116,9 @@ static enum bp_result set_pixel_clock_v6(
 				(uint8_t) bp->cmd_helper->encoder_mode_bp_to_atom(
 						bp_params->signal_type, false);
 
-		/* We need to convert from KHz units into 10KHz units */
+		/* We need to convert from 100 Hz units into 10KHz units */
 		clk.sPCLKInput.ulCrtcPclkFreq.ulPixelClock =
-				cpu_to_le32(bp_params->target_pixel_clock / 10);
+				cpu_to_le32(bp_params->target_pixel_clock_100hz / 100);
 
 		if (bp_params->flags.FORCE_PROGRAMMING_OF_PLL) {
 			clk.sPCLKInput.ucMiscInfo |=
@@ -1182,8 +1180,7 @@ static enum bp_result set_pixel_clock_v7(
 		clk.ucTransmitterID = bp->cmd_helper->encoder_id_to_atom(dal_graphics_object_id_get_encoder_id(bp_params->encoder_object_id));
 		clk.ucEncoderMode = (uint8_t) bp->cmd_helper->encoder_mode_bp_to_atom(bp_params->signal_type, false);
 
-		/* We need to convert from KHz units into 10KHz units */
-		clk.ulPixelClock = cpu_to_le32(bp_params->target_pixel_clock * 10);
+		clk.ulPixelClock = cpu_to_le32(bp_params->target_pixel_clock_100hz);
 
 		clk.ucDeepColorRatio = (uint8_t) bp->cmd_helper->transmitter_color_depth_to_atom(bp_params->color_depth);
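bp_params now carries the target pixel clock in 100 Hz units instead of kHz, which is why every divisor above moves from 10 to 100: the legacy v3/v5/v6 ATOM tables still take 10 kHz units, while the v7 table takes 100 Hz directly, so its old *10 scaling disappears entirely. The finer unit keeps fractional-kHz clocks (the 1000/1001-scaled NTSC-family rates) exact. A small standalone sketch of the arithmetic, with illustrative values:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* 74.1758 MHz (a 59.94 Hz timing) in the old and new units */
	uint32_t pix_clk_khz = 74176;      /* kHz: fraction already rounded */
	uint32_t pix_clk_100hz = 741758;   /* 100 Hz units: fraction intact */

	/* legacy tables (v3/v5/v6) take 10 kHz units */
	uint16_t atom_10khz_old = (uint16_t)(pix_clk_khz / 10);
	uint16_t atom_10khz_new = (uint16_t)(pix_clk_100hz / 100);

	/* set_pixel_clock_v7 takes 100 Hz units, no scaling needed */
	uint32_t atom_100hz = pix_clk_100hz;

	printf("%u %u %u\n", atom_10khz_old, atom_10khz_new, atom_100hz);
	return 0;
}

Both 10 kHz conversions land on the same value; only the 100 Hz form preserves the fraction for tables that can use it.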
 
@@ -1899,120 +1896,6 @@ static enum bp_result set_crtc_using_dtd_timing_v3(
 /*******************************************************************************
  ********************************************************************************
  **
- **                  SELECT CRTC SOURCE
- **
- ********************************************************************************
- *******************************************************************************/
-
-static enum bp_result select_crtc_source_v2(
-	struct bios_parser *bp,
-	struct bp_crtc_source_select *bp_params);
-static enum bp_result select_crtc_source_v3(
-	struct bios_parser *bp,
-	struct bp_crtc_source_select *bp_params);
-
-static void init_select_crtc_source(struct bios_parser *bp)
-{
-	switch (BIOS_CMD_TABLE_PARA_REVISION(SelectCRTC_Source)) {
-	case 2:
-		bp->cmd_tbl.select_crtc_source = select_crtc_source_v2;
-		break;
-	case 3:
-		bp->cmd_tbl.select_crtc_source = select_crtc_source_v3;
-		break;
-	default:
-		dm_output_to_console("Don't select_crtc_source enable_crtc for v%d\n",
-			 BIOS_CMD_TABLE_PARA_REVISION(SelectCRTC_Source));
-		bp->cmd_tbl.select_crtc_source = NULL;
-		break;
-	}
-}
-
-static enum bp_result select_crtc_source_v2(
-	struct bios_parser *bp,
-	struct bp_crtc_source_select *bp_params)
-{
-	enum bp_result result = BP_RESULT_FAILURE;
-	SELECT_CRTC_SOURCE_PARAMETERS_V2 params;
-	uint8_t atom_controller_id;
-	uint32_t atom_engine_id;
-	enum signal_type s = bp_params->signal;
-
-	memset(&params, 0, sizeof(params));
-
-	/* set controller id */
-	if (bp->cmd_helper->controller_id_to_atom(
-			bp_params->controller_id, &atom_controller_id))
-		params.ucCRTC = atom_controller_id;
-	else
-		return BP_RESULT_FAILURE;
-
-	/* set encoder id */
-	if (bp->cmd_helper->engine_bp_to_atom(
-			bp_params->engine_id, &atom_engine_id))
-		params.ucEncoderID = (uint8_t)atom_engine_id;
-	else
-		return BP_RESULT_FAILURE;
-
-	if (SIGNAL_TYPE_EDP == s ||
-			(SIGNAL_TYPE_DISPLAY_PORT == s &&
-					SIGNAL_TYPE_LVDS == bp_params->sink_signal))
-		s = SIGNAL_TYPE_LVDS;
-
-	params.ucEncodeMode =
-			(uint8_t)bp->cmd_helper->encoder_mode_bp_to_atom(
-					s, bp_params->enable_dp_audio);
-
-	if (EXEC_BIOS_CMD_TABLE(SelectCRTC_Source, params))
-		result = BP_RESULT_OK;
-
-	return result;
-}
-
-static enum bp_result select_crtc_source_v3(
-	struct bios_parser *bp,
-	struct bp_crtc_source_select *bp_params)
-{
-	bool result = BP_RESULT_FAILURE;
-	SELECT_CRTC_SOURCE_PARAMETERS_V3 params;
-	uint8_t atom_controller_id;
-	uint32_t atom_engine_id;
-	enum signal_type s = bp_params->signal;
-
-	memset(&params, 0, sizeof(params));
-
-	if (bp->cmd_helper->controller_id_to_atom(bp_params->controller_id,
-			&atom_controller_id))
-		params.ucCRTC = atom_controller_id;
-	else
-		return result;
-
-	if (bp->cmd_helper->engine_bp_to_atom(bp_params->engine_id,
-			&atom_engine_id))
-		params.ucEncoderID = (uint8_t)atom_engine_id;
-	else
-		return result;
-
-	if (SIGNAL_TYPE_EDP == s ||
-			(SIGNAL_TYPE_DISPLAY_PORT == s &&
-					SIGNAL_TYPE_LVDS == bp_params->sink_signal))
-		s = SIGNAL_TYPE_LVDS;
-
-	params.ucEncodeMode =
-			bp->cmd_helper->encoder_mode_bp_to_atom(
-					s, bp_params->enable_dp_audio);
-	/* Needed for VBIOS Random Spatial Dithering feature */
-	params.ucDstBpc = (uint8_t)(bp_params->display_output_bit_depth);
-
-	if (EXEC_BIOS_CMD_TABLE(SelectCRTC_Source, params))
-		result = BP_RESULT_OK;
-
-	return result;
-}
-
-/*******************************************************************************
- ********************************************************************************
- **
  **                  ENABLE CRTC
  **
  ********************************************************************************
@@ -2164,7 +2047,7 @@ static enum bp_result program_clock_v5(
 	/* We need to convert from KHz units into 10KHz units */
 	params.sPCLKInput.ucPpll = (uint8_t) atom_pll_id;
 	params.sPCLKInput.usPixelClock =
-			cpu_to_le16((uint16_t) (bp_params->target_pixel_clock / 10));
+			cpu_to_le16((uint16_t) (bp_params->target_pixel_clock_100hz / 100));
 	params.sPCLKInput.ucCRTC = (uint8_t) ATOM_CRTC_INVALID;
 
 	if (bp_params->flags.SET_EXTERNAL_REF_DIV_SRC)
@@ -2196,7 +2079,7 @@ static enum bp_result program_clock_v6(
 	/* We need to convert from KHz units into 10KHz units */
 	params.sPCLKInput.ucPpll = (uint8_t)atom_pll_id;
 	params.sPCLKInput.ulDispEngClkFreq =
-			cpu_to_le32(bp_params->target_pixel_clock / 10);
+			cpu_to_le32(bp_params->target_pixel_clock_100hz / 100);
 
 	if (bp_params->flags.SET_EXTERNAL_REF_DIV_SRC)
 		params.sPCLKInput.ucMiscInfo |= PIXEL_CLOCK_MISC_REF_DIV_SRC;
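Dropping select_crtc_source also removes its init_select_crtc_source() dispatcher, one instance of the pattern every command in this file follows: read the table's parameter revision from the VBIOS and point the cmd_tbl hook at the matching implementation, or NULL it out when the revision is unknown. A reduced sketch of that pattern (the names are illustrative, not the driver's):

#include <stdio.h>

struct bios_parser;
typedef int (*cmd_fn)(struct bios_parser *bp);

static int enable_crtc_v1(struct bios_parser *bp) { (void)bp; return 0; }

/* pick an implementation by VBIOS table revision, as the init_*()
 * helpers do; unknown revisions leave the hook NULL for callers to skip
 */
static cmd_fn init_enable_crtc(unsigned int para_revision)
{
	switch (para_revision) {
	case 1:
		return enable_crtc_v1;
	default:
		fprintf(stderr, "no enable_crtc for v%u\n", para_revision);
		return NULL;
	}
}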
diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table.h b/drivers/gpu/drm/amd/display/dc/bios/command_table.h
index 94f3d43a7471..ad533775e724 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/command_table.h
+++ b/drivers/gpu/drm/amd/display/dc/bios/command_table.h
@@ -71,9 +71,6 @@ struct cmd_tbl {
 	enum bp_result (*set_crtc_timing)(
 		struct bios_parser *bp,
 		struct bp_hw_crtc_timing_parameters *bp_params);
-	enum bp_result (*select_crtc_source)(
-		struct bios_parser *bp,
-		struct bp_crtc_source_select *bp_params);
 	enum bp_result (*enable_crtc)(
 		struct bios_parser *bp,
 		enum controller_id controller_id,
diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table2.c b/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
index 2b5dc499a35e..bb2e8105e6ab 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
@@ -301,17 +301,17 @@ static enum bp_result set_pixel_clock_v7(
 			cmd_helper->encoder_mode_bp_to_atom(
 				bp_params->signal_type, false);
 
-		/* We need to convert from KHz units into 10KHz units */
-		clk.pixclk_100hz = cpu_to_le32(bp_params->target_pixel_clock *
-				10);
+		clk.pixclk_100hz = cpu_to_le32(bp_params->target_pixel_clock_100hz);
 
 		clk.deep_color_ratio =
 			(uint8_t) bp->cmd_helper->
 				transmitter_color_depth_to_atom(
 					bp_params->color_depth);
-		DC_LOG_BIOS("%s:program display clock = %d"\
-				"colorDepth = %d\n", __func__,\
-				bp_params->target_pixel_clock, bp_params->color_depth);
+
+		DC_LOG_BIOS("%s:program display clock = %d, tg = %d, pll = %d, "\
+				"colorDepth = %d\n", __func__,
+				bp_params->target_pixel_clock_100hz, (int)controller_id,
+				pll_id, bp_params->color_depth);
 
 		if (bp_params->flags.FORCE_PROGRAMMING_OF_PLL)
 			clk.miscinfo |= PIXEL_CLOCK_V7_MISC_FORCE_PROG_PPLL;
@@ -463,75 +463,6 @@ static enum bp_result set_crtc_using_dtd_timing_v3(
 /******************************************************************************
  ******************************************************************************
  **
- **                  SELECT CRTC SOURCE
- **
- ******************************************************************************
- *****************************************************************************/
-
-
-static enum bp_result select_crtc_source_v3(
-	struct bios_parser *bp,
-	struct bp_crtc_source_select *bp_params);
-
-static void init_select_crtc_source(struct bios_parser *bp)
-{
-	switch (BIOS_CMD_TABLE_PARA_REVISION(selectcrtc_source)) {
-	case 3:
-		bp->cmd_tbl.select_crtc_source = select_crtc_source_v3;
-		break;
-	default:
-		dm_output_to_console("Don't select_crtc_source enable_crtc for v%d\n",
-			 BIOS_CMD_TABLE_PARA_REVISION(selectcrtc_source));
-		bp->cmd_tbl.select_crtc_source = NULL;
-		break;
-	}
-}
-
-
-static enum bp_result select_crtc_source_v3(
-	struct bios_parser *bp,
-	struct bp_crtc_source_select *bp_params)
-{
-	bool result = BP_RESULT_FAILURE;
-	struct select_crtc_source_parameters_v2_3 params;
-	uint8_t atom_controller_id;
-	uint32_t atom_engine_id;
-	enum signal_type s = bp_params->signal;
-
-	memset(&params, 0, sizeof(params));
-
-	if (bp->cmd_helper->controller_id_to_atom(bp_params->controller_id,
-			&atom_controller_id))
-		params.crtc_id = atom_controller_id;
-	else
-		return result;
-
-	if (bp->cmd_helper->engine_bp_to_atom(bp_params->engine_id,
-			&atom_engine_id))
-		params.encoder_id = (uint8_t)atom_engine_id;
-	else
-		return result;
-
-	if (s == SIGNAL_TYPE_EDP ||
-		(s == SIGNAL_TYPE_DISPLAY_PORT && bp_params->sink_signal ==
-							SIGNAL_TYPE_LVDS))
-		s = SIGNAL_TYPE_LVDS;
-
-	params.encode_mode =
-			bp->cmd_helper->encoder_mode_bp_to_atom(
-					s, bp_params->enable_dp_audio);
-	/* Needed for VBIOS Random Spatial Dithering feature */
-	params.dst_bpc = (uint8_t)(bp_params->display_output_bit_depth);
-
-	if (EXEC_BIOS_CMD_TABLE(selectcrtc_source, params))
-		result = BP_RESULT_OK;
-
-	return result;
-}
-
-/******************************************************************************
- ******************************************************************************
- **
  **                  ENABLE CRTC
  **
  ******************************************************************************
@@ -808,7 +739,6 @@ void dal_firmware_parser_init_cmd_tbl(struct bios_parser *bp)
 
 	init_set_crtc_timing(bp);
 
-	init_select_crtc_source(bp);
 	init_enable_crtc(bp);
 
 	init_external_encoder_control(bp);
diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table2.h b/drivers/gpu/drm/amd/display/dc/bios/command_table2.h
index ec1c0c9f3f1d..7a2af24dfe60 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/command_table2.h
+++ b/drivers/gpu/drm/amd/display/dc/bios/command_table2.h
@@ -71,9 +71,6 @@ struct cmd_tbl {
 	enum bp_result (*set_crtc_timing)(
 		struct bios_parser *bp,
 		struct bp_hw_crtc_timing_parameters *bp_params);
-	enum bp_result (*select_crtc_source)(
-		struct bios_parser *bp,
-		struct bp_crtc_source_select *bp_params);
 	enum bp_result (*enable_crtc)(
 		struct bios_parser *bp,
 		enum controller_id controller_id,
diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c b/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
index 9ebe30ba4dab..f3aa7b53d2aa 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
+++ b/drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
@@ -2792,7 +2792,7 @@ static void populate_initial_data(
 		data->lpt_en[num_displays + 4] = false;
 		data->h_total[num_displays + 4] = bw_int_to_fixed(pipe[i].stream->timing.h_total);
 		data->v_total[num_displays + 4] = bw_int_to_fixed(pipe[i].stream->timing.v_total);
-		data->pixel_rate[num_displays + 4] = bw_frc_to_fixed(pipe[i].stream->timing.pix_clk_khz, 1000);
+		data->pixel_rate[num_displays + 4] = bw_frc_to_fixed(pipe[i].stream->timing.pix_clk_100hz, 10000);
 		data->src_width[num_displays + 4] = bw_int_to_fixed(pipe[i].plane_res.scl_data.viewport.width);
 		data->pitch_in_pixels[num_displays + 4] = data->src_width[num_displays + 4];
 		data->src_height[num_displays + 4] = bw_int_to_fixed(pipe[i].plane_res.scl_data.viewport.height);
@@ -2881,7 +2881,7 @@ static void populate_initial_data(
 
 	/* Pipes without underlay after */
 	for (i = 0; i < pipe_count; i++) {
-		unsigned int pixel_clock_khz;
+		unsigned int pixel_clock_100hz;
 		if (!pipe[i].stream || pipe[i].bottom_pipe)
 			continue;
 
@@ -2890,10 +2890,10 @@ static void populate_initial_data(
 		data->lpt_en[num_displays + 4] = false;
 		data->h_total[num_displays + 4] = bw_int_to_fixed(pipe[i].stream->timing.h_total);
 		data->v_total[num_displays + 4] = bw_int_to_fixed(pipe[i].stream->timing.v_total);
-		pixel_clock_khz = pipe[i].stream->timing.pix_clk_khz;
+		pixel_clock_100hz = pipe[i].stream->timing.pix_clk_100hz;
 		if (pipe[i].stream->timing.timing_3d_format == TIMING_3D_FORMAT_HW_FRAME_PACKING)
-			pixel_clock_khz *= 2;
-		data->pixel_rate[num_displays + 4] = bw_frc_to_fixed(pixel_clock_khz, 1000);
+			pixel_clock_100hz *= 2;
+		data->pixel_rate[num_displays + 4] = bw_frc_to_fixed(pixel_clock_100hz, 10000);
 		if (pipe[i].plane_state) {
 			data->src_width[num_displays + 4] = bw_int_to_fixed(pipe[i].plane_res.scl_data.viewport.width);
 			data->pitch_in_pixels[num_displays + 4] = data->src_width[num_displays + 4];
diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.c b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.c
index d0fc54f8fb1c..1ef0074302c5 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.c
+++ b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.c
@@ -63,7 +63,7 @@ void scaler_settings_calculation(struct dcn_bw_internal_vars *v)
 		if (v->interlace_output[k] == 1.0) {
 			v->v_ratio[k] = 2.0 * v->v_ratio[k];
 		}
-		if ((v->underscan_output[k] == 1.0)) {
+		if (v->underscan_output[k] == 1.0) {
 			v->h_ratio[k] = v->h_ratio[k] * v->under_scan_factor;
 			v->v_ratio[k] = v->v_ratio[k] * v->under_scan_factor;
 		}
@@ -797,9 +797,40 @@ void mode_support_and_system_configuration(struct dcn_bw_internal_vars *v)
 				else {
 					v->maximum_vstartup = v->v_sync_plus_back_porch[k] - 1.0;
 				}
-				v->line_times_for_prefetch[k] = v->maximum_vstartup - v->urgent_latency / (v->htotal[k] / v->pixel_clock[k]) - (v->time_calc + v->time_setup) / (v->htotal[k] / v->pixel_clock[k]) - (v->dst_y_after_scaler + v->dst_x_after_scaler / v->htotal[k]);
-				v->line_times_for_prefetch[k] =dcn_bw_floor2(4.0 * (v->line_times_for_prefetch[k] + 0.125), 1.0) / 4;
-				v->prefetch_bw[k] = (v->meta_pte_bytes_per_frame[k] + 2.0 * v->meta_row_bytes[k] + 2.0 * v->dpte_bytes_per_row[k] + v->prefetch_lines_y[k] * v->swath_width_yper_state[i][j][k] *dcn_bw_ceil2(v->byte_per_pixel_in_dety[k], 1.0) + v->prefetch_lines_c[k] * v->swath_width_yper_state[i][j][k] / 2.0 *dcn_bw_ceil2(v->byte_per_pixel_in_detc[k], 2.0)) / (v->line_times_for_prefetch[k] * v->htotal[k] / v->pixel_clock[k]);
+
+				do {
+					v->line_times_for_prefetch[k] = v->maximum_vstartup - v->urgent_latency / (v->htotal[k] / v->pixel_clock[k]) - (v->time_calc + v->time_setup) / (v->htotal[k] / v->pixel_clock[k]) - (v->dst_y_after_scaler + v->dst_x_after_scaler / v->htotal[k]);
+					v->line_times_for_prefetch[k] =dcn_bw_floor2(4.0 * (v->line_times_for_prefetch[k] + 0.125), 1.0) / 4;
+					v->prefetch_bw[k] = (v->meta_pte_bytes_per_frame[k] + 2.0 * v->meta_row_bytes[k] + 2.0 * v->dpte_bytes_per_row[k] + v->prefetch_lines_y[k] * v->swath_width_yper_state[i][j][k] *dcn_bw_ceil2(v->byte_per_pixel_in_dety[k], 1.0) + v->prefetch_lines_c[k] * v->swath_width_yper_state[i][j][k] / 2.0 *dcn_bw_ceil2(v->byte_per_pixel_in_detc[k], 2.0)) / (v->line_times_for_prefetch[k] * v->htotal[k] / v->pixel_clock[k]);
+
+					if (v->pte_enable == dcn_bw_yes && v->dcc_enable[k] == dcn_bw_yes) {
+						v->time_for_meta_pte_without_immediate_flip = dcn_bw_max3(
+								v->meta_pte_bytes_frame[k] / v->prefetch_bandwidth[k],
+								v->extra_latency,
+								v->htotal[k] / v->pixel_clock[k] / 4.0);
+					} else {
+						v->time_for_meta_pte_without_immediate_flip = v->htotal[k] / v->pixel_clock[k] / 4.0;
+					}
+
+					if (v->pte_enable == dcn_bw_yes || v->dcc_enable[k] == dcn_bw_yes) {
+						v->time_for_meta_and_dpte_row_without_immediate_flip = dcn_bw_max3((
+								v->meta_row_bytes[k] + v->dpte_bytes_per_row[k]) / v->prefetch_bandwidth[k],
+								v->htotal[k] / v->pixel_clock[k] - v->time_for_meta_pte_without_immediate_flip,
+								v->extra_latency);
+					} else {
+						v->time_for_meta_and_dpte_row_without_immediate_flip = dcn_bw_max2(
+								v->htotal[k] / v->pixel_clock[k] - v->time_for_meta_pte_without_immediate_flip,
+								v->extra_latency - v->time_for_meta_pte_with_immediate_flip);
+					}
+
+					v->lines_for_meta_pte_without_immediate_flip[k] =dcn_bw_floor2(4.0 * (v->time_for_meta_pte_without_immediate_flip / (v->htotal[k] / v->pixel_clock[k]) + 0.125), 1.0) / 4;
+					v->lines_for_meta_and_dpte_row_without_immediate_flip[k] =dcn_bw_floor2(4.0 * (v->time_for_meta_and_dpte_row_without_immediate_flip / (v->htotal[k] / v->pixel_clock[k]) + 0.125), 1.0) / 4;
+					v->maximum_vstartup = v->maximum_vstartup - 1;
+
+					if (v->lines_for_meta_pte_without_immediate_flip[k] < 8.0 && v->lines_for_meta_and_dpte_row_without_immediate_flip[k] < 16.0)
+						break;
+
+				} while (1);
 			}
 			v->bw_available_for_immediate_flip = v->return_bw_per_state[i];
 			for (k = 0; k <= v->number_of_active_planes - 1; k++) {
@@ -814,24 +845,18 @@ void mode_support_and_system_configuration(struct dcn_bw_internal_vars *v)
 			for (k = 0; k <= v->number_of_active_planes - 1; k++) {
 				if (v->pte_enable == dcn_bw_yes && v->dcc_enable[k] == dcn_bw_yes) {
 					v->time_for_meta_pte_with_immediate_flip =dcn_bw_max5(v->meta_pte_bytes_per_frame[k] / v->prefetch_bw[k], v->meta_pte_bytes_per_frame[k] * v->total_immediate_flip_bytes[k] / (v->bw_available_for_immediate_flip * (v->meta_pte_bytes_per_frame[k] + v->meta_row_bytes[k] + v->dpte_bytes_per_row[k])), v->extra_latency, v->urgent_latency, v->htotal[k] / v->pixel_clock[k] / 4.0);
-					v->time_for_meta_pte_without_immediate_flip =dcn_bw_max3(v->meta_pte_bytes_per_frame[k] / v->prefetch_bw[k], v->extra_latency, v->htotal[k] / v->pixel_clock[k] / 4.0);
 				}
 				else {
 					v->time_for_meta_pte_with_immediate_flip = v->htotal[k] / v->pixel_clock[k] / 4.0;
-					v->time_for_meta_pte_without_immediate_flip = v->htotal[k] / v->pixel_clock[k] / 4.0;
 				}
 				if (v->pte_enable == dcn_bw_yes || v->dcc_enable[k] == dcn_bw_yes) {
 					v->time_for_meta_and_dpte_row_with_immediate_flip =dcn_bw_max5((v->meta_row_bytes[k] + v->dpte_bytes_per_row[k]) / v->prefetch_bw[k], (v->meta_row_bytes[k] + v->dpte_bytes_per_row[k]) * v->total_immediate_flip_bytes[k] / (v->bw_available_for_immediate_flip * (v->meta_pte_bytes_per_frame[k] + v->meta_row_bytes[k] + v->dpte_bytes_per_row[k])), v->htotal[k] / v->pixel_clock[k] - v->time_for_meta_pte_with_immediate_flip, v->extra_latency, 2.0 * v->urgent_latency);
-					v->time_for_meta_and_dpte_row_without_immediate_flip =dcn_bw_max3((v->meta_row_bytes[k] + v->dpte_bytes_per_row[k]) / v->prefetch_bw[k], v->htotal[k] / v->pixel_clock[k] - v->time_for_meta_pte_without_immediate_flip, v->extra_latency);
 				}
 				else {
 					v->time_for_meta_and_dpte_row_with_immediate_flip =dcn_bw_max2(v->htotal[k] / v->pixel_clock[k] - v->time_for_meta_pte_with_immediate_flip, v->extra_latency - v->time_for_meta_pte_with_immediate_flip);
-					v->time_for_meta_and_dpte_row_without_immediate_flip =dcn_bw_max2(v->htotal[k] / v->pixel_clock[k] - v->time_for_meta_pte_without_immediate_flip, v->extra_latency - v->time_for_meta_pte_without_immediate_flip);
 				}
 				v->lines_for_meta_pte_with_immediate_flip[k] =dcn_bw_floor2(4.0 * (v->time_for_meta_pte_with_immediate_flip / (v->htotal[k] / v->pixel_clock[k]) + 0.125), 1.0) / 4;
-				v->lines_for_meta_pte_without_immediate_flip[k] =dcn_bw_floor2(4.0 * (v->time_for_meta_pte_without_immediate_flip / (v->htotal[k] / v->pixel_clock[k]) + 0.125), 1.0) / 4;
 				v->lines_for_meta_and_dpte_row_with_immediate_flip[k] =dcn_bw_floor2(4.0 * (v->time_for_meta_and_dpte_row_with_immediate_flip / (v->htotal[k] / v->pixel_clock[k]) + 0.125), 1.0) / 4;
-				v->lines_for_meta_and_dpte_row_without_immediate_flip[k] =dcn_bw_floor2(4.0 * (v->time_for_meta_and_dpte_row_without_immediate_flip / (v->htotal[k] / v->pixel_clock[k]) + 0.125), 1.0) / 4;
 				v->line_times_to_request_prefetch_pixel_data_with_immediate_flip = v->line_times_for_prefetch[k] - v->lines_for_meta_pte_with_immediate_flip[k] - v->lines_for_meta_and_dpte_row_with_immediate_flip[k];
 				v->line_times_to_request_prefetch_pixel_data_without_immediate_flip = v->line_times_for_prefetch[k] - v->lines_for_meta_pte_without_immediate_flip[k] - v->lines_for_meta_and_dpte_row_without_immediate_flip[k];
 				if (v->line_times_to_request_prefetch_pixel_data_with_immediate_flip > 0.0) {
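The new do/while above turns a one-shot prefetch calculation into a search: if the meta-PTE or meta/DPTE-row budgets come out at 8 or 16 lines or more, the code steps maximum_vstartup down one line and recomputes until both fit. The shape of that loop, heavily reduced; the budget helpers are hypothetical placeholders for the full formulas, and, like the original, the sketch has no explicit lower bound on vstartup:

/* hypothetical stand-ins for the full line-budget formulas */
extern double lines_for_meta_pte(double vstartup);
extern double lines_for_meta_and_dpte_row(double vstartup);

static double fit_prefetch_vstartup(double max_vstartup)
{
	for (;;) {
		double pte_lines = lines_for_meta_pte(max_vstartup);
		double row_lines = lines_for_meta_and_dpte_row(max_vstartup);

		max_vstartup -= 1.0;	/* shrink the budget for the next pass */

		if (pte_lines < 8.0 && row_lines < 16.0)
			break;		/* both budgets fit: done */
	}
	return max_vstartup;
}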
diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
index 43e4a2be0fa6..12d1842079ae 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
+++ b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
@@ -290,41 +290,34 @@ static void pipe_ctx_to_e2e_pipe_params (
 	switch (pipe->plane_state->tiling_info.gfx9.swizzle) {
 	/* for 4/8/16 high tiles */
 	case DC_SW_LINEAR:
-		input->src.is_display_sw = 1;
 		input->src.macro_tile_size = dm_4k_tile;
 		break;
 	case DC_SW_4KB_S:
 	case DC_SW_4KB_S_X:
-		input->src.is_display_sw = 0;
 		input->src.macro_tile_size = dm_4k_tile;
 		break;
 	case DC_SW_64KB_S:
 	case DC_SW_64KB_S_X:
 	case DC_SW_64KB_S_T:
-		input->src.is_display_sw = 0;
 		input->src.macro_tile_size = dm_64k_tile;
 		break;
 	case DC_SW_VAR_S:
 	case DC_SW_VAR_S_X:
-		input->src.is_display_sw = 0;
 		input->src.macro_tile_size = dm_256k_tile;
 		break;
 
 	/* For 64bpp 2 high tiles */
 	case DC_SW_4KB_D:
 	case DC_SW_4KB_D_X:
-		input->src.is_display_sw = 1;
 		input->src.macro_tile_size = dm_4k_tile;
 		break;
 	case DC_SW_64KB_D:
 	case DC_SW_64KB_D_X:
 	case DC_SW_64KB_D_T:
-		input->src.is_display_sw = 1;
 		input->src.macro_tile_size = dm_64k_tile;
 		break;
 	case DC_SW_VAR_D:
 	case DC_SW_VAR_D_X:
-		input->src.is_display_sw = 1;
 		input->src.macro_tile_size = dm_256k_tile;
 		break;
 
@@ -423,7 +416,7 @@ static void pipe_ctx_to_e2e_pipe_params (
 			- pipe->stream->timing.v_addressable
 			- pipe->stream->timing.v_border_bottom
 			- pipe->stream->timing.v_border_top;
-	input->dest.pixel_rate_mhz = pipe->stream->timing.pix_clk_khz/1000.0;
+	input->dest.pixel_rate_mhz = pipe->stream->timing.pix_clk_100hz/10000.0;
 	input->dest.vstartup_start = pipe->pipe_dlg_param.vstartup_start;
 	input->dest.vupdate_offset = pipe->pipe_dlg_param.vupdate_offset;
 	input->dest.vupdate_offset = pipe->pipe_dlg_param.vupdate_offset;
@@ -670,9 +663,9 @@ static void hack_disable_optional_pipe_split(struct dcn_bw_internal_vars *v)
 }
 
 static void hack_force_pipe_split(struct dcn_bw_internal_vars *v,
-		unsigned int pixel_rate_khz)
+		unsigned int pixel_rate_100hz)
 {
-	float pixel_rate_mhz = pixel_rate_khz / 1000;
+	float pixel_rate_mhz = pixel_rate_100hz / 10000;
 
 	/*
 	 * force enabling pipe split by lower dpp clock for DPM0 to just
@@ -695,7 +688,7 @@ static void hack_bounding_box(struct dcn_bw_internal_vars *v,
 
 	if (context->stream_count == 1 &&
 			dbg->force_single_disp_pipe_split)
-		hack_force_pipe_split(v, context->streams[0]->timing.pix_clk_khz);
+		hack_force_pipe_split(v, context->streams[0]->timing.pix_clk_100hz);
 }
 
 bool dcn_validate_bandwidth(
@@ -852,7 +845,7 @@ bool dcn_validate_bandwidth(
 		v->v_sync_plus_back_porch[input_idx] = pipe->stream->timing.v_total
 				- v->vactive[input_idx]
 				- pipe->stream->timing.v_front_porch;
-		v->pixel_clock[input_idx] = pipe->stream->timing.pix_clk_khz/1000.0;
+		v->pixel_clock[input_idx] = pipe->stream->timing.pix_clk_100hz/10000.0;
 		if (pipe->stream->timing.timing_3d_format == TIMING_3D_FORMAT_HW_FRAME_PACKING)
 			v->pixel_clock[input_idx] *= 2;
 		if (!pipe->plane_state) {
@@ -961,7 +954,7 @@ bool dcn_validate_bandwidth(
 		v->dcc_rate[input_idx] = 1; /*TODO: Worst case? does this change?*/
 		v->output_format[input_idx] = pipe->stream->timing.pixel_encoding ==
 				PIXEL_ENCODING_YCBCR420 ? dcn_bw_420 : dcn_bw_444;
-		v->output[input_idx] = pipe->stream->sink->sink_signal ==
+		v->output[input_idx] = pipe->stream->signal ==
 				SIGNAL_TYPE_HDMI_TYPE_A ? dcn_bw_hdmi : dcn_bw_dp;
 		v->output_deep_color[input_idx] = dcn_bw_encoder_8bpc;
 		if (v->output[input_idx] == dcn_bw_hdmi) {
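Two themes run through these validator hunks: the per-stream pixel clock now comes from pix_clk_100hz, divided by 10000.0 for MHz, with HW frame-packed 3D still doubling it, and the output type is keyed off the stream's own signal instead of reaching through the sink (MST sinks share a link; streams do not). A sketch of the clock derivation:

/* derive the validator's per-stream pixel clock in MHz from 100 Hz
 * units, doubling for HW frame packing as dcn_validate_bandwidth does
 */
static float stream_pixel_clock_mhz(unsigned int pix_clk_100hz,
				    int hw_frame_packing)
{
	float mhz = pix_clk_100hz / 10000.0f;

	if (hw_frame_packing)
		mhz *= 2.0f;	/* both eyes scan out within one frame */
	return mhz;
}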
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 5fd52094d459..c68fbd55db3c 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -384,7 +384,7 @@ void dc_stream_set_dither_option(struct dc_stream_state *stream,
 		enum dc_dither_option option)
 {
 	struct bit_depth_reduction_params params;
-	struct dc_link *link = stream->sink->link;
+	struct dc_link *link = stream->link;
 	struct pipe_ctx *pipes = NULL;
 	int i;
 
@@ -451,7 +451,7 @@ bool dc_stream_program_csc_matrix(struct dc *dc, struct dc_stream_state *stream)
 					pipes,
 					stream->output_color_space,
 					stream->csc_color_matrix.matrix,
-					pipes->plane_res.hubp->opp_id);
+					pipes->plane_res.hubp ? pipes->plane_res.hubp->opp_id : 0);
 			ret = true;
 		}
 	}
@@ -526,9 +526,8 @@ void dc_link_set_preferred_link_settings(struct dc *dc,
 
 	for (i = 0; i < MAX_PIPES; i++) {
 		pipe = &dc->current_state->res_ctx.pipe_ctx[i];
-		if (pipe->stream && pipe->stream->sink
-			&& pipe->stream->sink->link) {
-			if (pipe->stream->sink->link == link)
+		if (pipe->stream && pipe->stream->link) {
+			if (pipe->stream->link == link)
 				break;
 		}
 	}
@@ -586,9 +585,6 @@ static void destruct(struct dc *dc)
 	if (dc->ctx->gpio_service)
 		dal_gpio_service_destroy(&dc->ctx->gpio_service);
 
-	if (dc->ctx->i2caux)
-		dal_i2caux_destroy(&dc->ctx->i2caux);
-
 	if (dc->ctx->created_bios)
 		dal_bios_parser_destroy(&dc->ctx->dc_bios);
 
@@ -625,7 +621,6 @@ static bool construct(struct dc *dc,
 #endif
 
 	enum dce_version dc_version = DCE_VERSION_UNKNOWN;
-
 	dc_dceip = kzalloc(sizeof(*dc_dceip), GFP_KERNEL);
 	if (!dc_dceip) {
 		dm_error("%s: failed to create dceip\n", __func__);
@@ -670,6 +665,7 @@ static bool construct(struct dc *dc,
 	dc_ctx->dc = dc;
 	dc_ctx->asic_id = init_params->asic_id;
 	dc_ctx->dc_sink_id_count = 0;
+	dc_ctx->dc_stream_id_count = 0;
 	dc->ctx = dc_ctx;
 
 	dc->current_state = dc_create_state();
@@ -709,14 +705,6 @@ static bool construct(struct dc *dc,
 		dc_ctx->created_bios = true;
 		}
 
-	/* Create I2C AUX */
-	dc_ctx->i2caux = dal_i2caux_create(dc_ctx);
-
-	if (!dc_ctx->i2caux) {
-		ASSERT_CRITICAL(false);
-		goto fail;
-	}
-
 	dc_ctx->perf_trace = dc_perf_trace_create();
 	if (!dc_ctx->perf_trace) {
 		ASSERT_CRITICAL(false);
@@ -840,6 +828,11 @@ alloc_fail:
 	return NULL;
 }
 
+void dc_init_callbacks(struct dc *dc,
+		const struct dc_callback_init *init_params)
+{
+}
+
 void dc_destroy(struct dc **dc)
 {
 	destruct(*dc);
@@ -875,8 +868,9 @@ static void program_timing_sync(
 		struct dc *dc,
 		struct dc_state *ctx)
 {
-	int i, j;
+	int i, j, k;
 	int group_index = 0;
+	int num_group = 0;
 	int pipe_count = dc->res_pool->pipe_count;
 	struct pipe_ctx *unsynced_pipes[MAX_PIPES] = { NULL };
 
@@ -913,11 +907,11 @@ static void program_timing_sync(
 			}
 		}
 
-		/* set first unblanked pipe as master */
+		/* set first pipe with plane as master */
 		for (j = 0; j < group_size; j++) {
 			struct pipe_ctx *temp;
 
-			if (pipe_set[j]->stream_res.tg->funcs->is_blanked && !pipe_set[j]->stream_res.tg->funcs->is_blanked(pipe_set[j]->stream_res.tg)) {
+			if (pipe_set[j]->plane_state) {
 				if (j == 0)
 					break;
 
@@ -928,9 +922,21 @@ static void program_timing_sync(
 			}
 		}
 
-		/* remove any other unblanked pipes as they have already been synced */
+
+		for (k = 0; k < group_size; k++) {
+			struct dc_stream_status *status = dc_stream_get_status_from_state(ctx, pipe_set[k]->stream);
+
+			status->timing_sync_info.group_id = num_group;
+			status->timing_sync_info.group_size = group_size;
+			if (k == 0)
+				status->timing_sync_info.master = true;
+			else
+				status->timing_sync_info.master = false;
+
+		}
+		/* remove any other pipes with plane as they have already been synced */
 		for (j = j + 1; j < group_size; j++) {
-			if (pipe_set[j]->stream_res.tg->funcs->is_blanked && !pipe_set[j]->stream_res.tg->funcs->is_blanked(pipe_set[j]->stream_res.tg)) {
+			if (pipe_set[j]->plane_state) {
 				group_size--;
 				pipe_set[j] = pipe_set[group_size];
 				j--;
@@ -942,6 +948,7 @@ static void program_timing_sync(
 				dc, group_index, group_size, pipe_set);
 			group_index++;
 		}
+		num_group++;
 	}
 }
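program_timing_sync() now picks the first pipe that has a plane, rather than the first unblanked one, as the sync master, and records a per-stream timing_sync_info so later consumers can tell which group a stream landed in and which stream leads it. The bookkeeping, reduced to its essentials:

#include <stdbool.h>

struct sync_info {
	int group_id;
	int group_size;
	bool master;
};

/* tag every stream in a sync group; slot 0 holds the master pipe,
 * mirroring the swap-to-front done while scanning for a plane
 */
static void record_group(struct sync_info *info, int group_size, int group_id)
{
	for (int k = 0; k < group_size; k++) {
		info[k].group_id = group_id;
		info[k].group_size = group_size;
		info[k].master = (k == 0);
	}
}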
 
@@ -962,6 +969,52 @@ static bool context_changed(
 	return false;
 }
 
+bool dc_validate_seamless_boot_timing(struct dc *dc,
+				const struct dc_sink *sink,
+				struct dc_crtc_timing *crtc_timing)
+{
+	struct timing_generator *tg;
+	struct dc_link *link = sink->link;
+	unsigned int inst;
+
+	/* Check for enabled DIG to identify enabled display */
+	if (!link->link_enc->funcs->is_dig_enabled(link->link_enc))
+		return false;
+
+	/* Check which front end is used by this encoder.
+	 * Note the inst is 1-indexed, where 0 is undefined.
+	 * Note that DIG_FE can source from a different OTG, but our
+	 * current implementation always maps 1-to-1, so this code makes
+	 * the same assumption and doesn't check the OTG source.
+	 */
+	inst = link->link_enc->funcs->get_dig_frontend(link->link_enc) - 1;
+
+	/* Instance should be within the range of the pool */
+	if (inst >= dc->res_pool->pipe_count)
+		return false;
+
+	tg = dc->res_pool->timing_generators[inst];
+
+	if (!tg->funcs->is_matching_timing)
+		return false;
+
+	if (!tg->funcs->is_matching_timing(tg, crtc_timing))
+		return false;
+
+	if (dc_is_dp_signal(link->connector_signal)) {
+		unsigned int pix_clk_100hz;
+
+		dc->res_pool->dp_clock_source->funcs->get_pixel_clk_frequency_100hz(
+			dc->res_pool->dp_clock_source,
+			inst, &pix_clk_100hz);
+
+		if (crtc_timing->pix_clk_100hz != pix_clk_100hz)
+			return false;
+	}
+
+	return true;
+}
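dc_validate_seamless_boot_timing() distills to an ordered series of bail-outs: no lit DIG means nothing to inherit, the DIG front end indexes the OTG (1-based, assumed to map 1-to-1), the OTG must already be scanning the requested timing, and DP additionally pins the exact pixel clock. A condensed view with hypothetical predicate helpers standing in for the link/tg/clock-source hooks:

#include <stdbool.h>
#include <stdint.h>

extern bool dig_enabled(void);
extern unsigned int dig_frontend(void);	/* 1-based, 0 = undefined */
extern unsigned int pipe_count(void);
extern bool otg_timing_matches(unsigned int inst);
extern bool is_dp(void);
extern uint32_t pll_pix_clk_100hz(unsigned int inst);

static bool seamless_boot_ok(uint32_t requested_pix_clk_100hz)
{
	unsigned int inst;

	if (!dig_enabled())
		return false;			/* no display lit by firmware */

	inst = dig_frontend() - 1;
	if (inst >= pipe_count())
		return false;			/* index outside the pool */

	if (!otg_timing_matches(inst))
		return false;			/* OTG runs a different mode */

	if (is_dp() &&
	    pll_pix_clk_100hz(inst) != requested_pix_clk_100hz)
		return false;			/* DP pins the exact clock */

	return true;
}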
+
 bool dc_enable_stereo(
 	struct dc *dc,
 	struct dc_state *context,
@@ -1040,7 +1093,11 @@ static enum dc_status dc_commit_state_no_check(struct dc *dc, struct dc_state *c
 
 	/* Program all planes within new context*/
 	for (i = 0; i < context->stream_count; i++) {
-		const struct dc_sink *sink = context->streams[i]->sink;
+		const struct dc_link *link = context->streams[i]->link;
+		struct dc_stream_status *status;
+
+		if (context->streams[i]->apply_seamless_boot_optimization)
+			context->streams[i]->apply_seamless_boot_optimization = false;
 
 		if (!context->streams[i]->mode_changed)
 			continue;
@@ -1065,12 +1122,15 @@ static enum dc_status dc_commit_state_no_check(struct dc *dc, struct dc_state *c
 			}
 		}
 
-		CONN_MSG_MODE(sink->link, "{%dx%d, %dx%d@%dKhz}",
+		status = dc_stream_get_status_from_state(context, context->streams[i]);
+		context->streams[i]->out.otg_offset = status->primary_otg_inst;
+
+		CONN_MSG_MODE(link, "{%dx%d, %dx%d@%dKhz}",
 				context->streams[i]->timing.h_addressable,
 				context->streams[i]->timing.v_addressable,
 				context->streams[i]->timing.h_total,
 				context->streams[i]->timing.v_total,
-				context->streams[i]->timing.pix_clk_khz);
+				context->streams[i]->timing.pix_clk_100hz / 10);
 	}
 
 	dc_enable_stereo(dc, context, dc_streams, context->stream_count);
@@ -1078,6 +1138,9 @@ static enum dc_status dc_commit_state_no_check(struct dc *dc, struct dc_state *c
 	/* pplib is notified if disp_num changed */
 	dc->hwss.optimize_bandwidth(dc, context);
 
+	for (i = 0; i < context->stream_count; i++)
+		context->streams[i]->mode_changed = false;
+
 	dc_release_state(dc->current_state);
 
 	dc->current_state = context;
@@ -1114,6 +1177,9 @@ bool dc_post_update_surfaces_to_stream(struct dc *dc)
 	int i;
 	struct dc_state *context = dc->current_state;
 
+	if (dc->optimized_required == false)
+		return true;
+
 	post_surface_trace(dc);
 
 	for (i = 0; i < dc->res_pool->pipe_count; i++)
@@ -1215,6 +1281,12 @@ static enum surface_update_type get_plane_info_update_type(const struct dc_surfa
 		 */
 		update_flags->bits.bpp_change = 1;
 
+	if (u->plane_info->plane_size.grph.surface_pitch != u->surface->plane_size.grph.surface_pitch
+			|| u->plane_info->plane_size.video.luma_pitch != u->surface->plane_size.video.luma_pitch
+			|| u->plane_info->plane_size.video.chroma_pitch != u->surface->plane_size.video.chroma_pitch)
+		update_flags->bits.plane_size_change = 1;
+
+
 	if (memcmp(&u->plane_info->tiling_info, &u->surface->tiling_info,
 			sizeof(union dc_tiling_info)) != 0) {
 		update_flags->bits.swizzle_change = 1;
@@ -1236,7 +1308,7 @@ static enum surface_update_type get_plane_info_update_type(const struct dc_surfa
 			|| update_flags->bits.output_tf_change)
 		return UPDATE_TYPE_FULL;
 
-	return UPDATE_TYPE_MED;
+	return update_flags->raw ? UPDATE_TYPE_MED : UPDATE_TYPE_FAST;
 }
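The early return above relies on the update flags living in a union: the per-condition bits share storage with a raw word, so once all the plane-info comparisons have run, a single test of update_flags->raw distinguishes "something changed, take the medium path" from "nothing changed, flip fast". The pattern in miniature:

#include <stdint.h>

enum update_type { UPDATE_TYPE_FAST, UPDATE_TYPE_MED, UPDATE_TYPE_FULL };

/* reduced flag set; the real union carries many more bits */
union update_flags {
	struct {
		uint32_t bpp_change : 1;
		uint32_t plane_size_change : 1;
		uint32_t swizzle_change : 1;
	} bits;
	uint32_t raw;
};

static enum update_type classify(const union update_flags *f)
{
	/* no bit set anywhere means the update is flip-only */
	return f->raw ? UPDATE_TYPE_MED : UPDATE_TYPE_FAST;
}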
 
 static enum surface_update_type get_scaling_info_update_type(
@@ -1436,6 +1508,101 @@ static struct dc_stream_status *stream_get_status(
 
 static const enum surface_update_type update_surface_trace_level = UPDATE_TYPE_FULL;
 
+static void copy_surface_update_to_plane(
+		struct dc_plane_state *surface,
+		struct dc_surface_update *srf_update)
+{
+	if (srf_update->flip_addr) {
+		surface->address = srf_update->flip_addr->address;
+		surface->flip_immediate =
+			srf_update->flip_addr->flip_immediate;
+		surface->time.time_elapsed_in_us[surface->time.index] =
+			srf_update->flip_addr->flip_timestamp_in_us -
+				surface->time.prev_update_time_in_us;
+		surface->time.prev_update_time_in_us =
+			srf_update->flip_addr->flip_timestamp_in_us;
+		surface->time.index++;
+		if (surface->time.index >= DC_PLANE_UPDATE_TIMES_MAX)
+			surface->time.index = 0;
+	}
+
+	if (srf_update->scaling_info) {
+		surface->scaling_quality =
+				srf_update->scaling_info->scaling_quality;
+		surface->dst_rect =
+				srf_update->scaling_info->dst_rect;
+		surface->src_rect =
+				srf_update->scaling_info->src_rect;
+		surface->clip_rect =
+				srf_update->scaling_info->clip_rect;
+	}
+
+	if (srf_update->plane_info) {
+		surface->color_space =
+				srf_update->plane_info->color_space;
+		surface->format =
+				srf_update->plane_info->format;
+		surface->plane_size =
+				srf_update->plane_info->plane_size;
+		surface->rotation =
+				srf_update->plane_info->rotation;
+		surface->horizontal_mirror =
+				srf_update->plane_info->horizontal_mirror;
+		surface->stereo_format =
+				srf_update->plane_info->stereo_format;
+		surface->tiling_info =
+				srf_update->plane_info->tiling_info;
+		surface->visible =
+				srf_update->plane_info->visible;
+		surface->per_pixel_alpha =
+				srf_update->plane_info->per_pixel_alpha;
+		surface->global_alpha =
+				srf_update->plane_info->global_alpha;
+		surface->global_alpha_value =
+				srf_update->plane_info->global_alpha_value;
+		surface->dcc =
+				srf_update->plane_info->dcc;
+		surface->sdr_white_level =
+				srf_update->plane_info->sdr_white_level;
+	}
+
+	if (srf_update->gamma &&
+			(surface->gamma_correction !=
+					srf_update->gamma)) {
+		memcpy(&surface->gamma_correction->entries,
+			&srf_update->gamma->entries,
+			sizeof(struct dc_gamma_entries));
+		surface->gamma_correction->is_identity =
+			srf_update->gamma->is_identity;
+		surface->gamma_correction->num_entries =
+			srf_update->gamma->num_entries;
+		surface->gamma_correction->type =
+			srf_update->gamma->type;
+	}
+
+	if (srf_update->in_transfer_func &&
+			(surface->in_transfer_func !=
+				srf_update->in_transfer_func)) {
+		surface->in_transfer_func->sdr_ref_white_level =
+			srf_update->in_transfer_func->sdr_ref_white_level;
+		surface->in_transfer_func->tf =
+			srf_update->in_transfer_func->tf;
+		surface->in_transfer_func->type =
+			srf_update->in_transfer_func->type;
+		memcpy(&surface->in_transfer_func->tf_pts,
+			&srf_update->in_transfer_func->tf_pts,
+			sizeof(struct dc_transfer_func_distributed_points));
+	}
+
+	if (srf_update->input_csc_color_matrix)
+		surface->input_csc_color_matrix =
+			*srf_update->input_csc_color_matrix;
+
+	if (srf_update->coeff_reduction_factor)
+		surface->coeff_reduction_factor =
+			*srf_update->coeff_reduction_factor;
+}
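copy_surface_update_to_plane() centralizes what the flip path used to do ad hoc: a dc_surface_update is a bundle of optional pointers, and only the non-NULL members overwrite plane state, so callers can submit just an address flip, just scaling, and so on. The idiom with a reduced field set:

#include <stddef.h>

struct rect { int x, y, width, height; };

struct plane_state {
	unsigned long long address;
	struct rect dst_rect;
};

struct surface_update {
	/* NULL members mean "leave the current value alone" */
	const unsigned long long *flip_address;
	const struct rect *dst_rect;
};

static void apply_update(struct plane_state *p, const struct surface_update *u)
{
	if (u->flip_address)
		p->address = *u->flip_address;
	if (u->dst_rect)
		p->dst_rect = *u->dst_rect;
}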
+
 static void commit_planes_do_stream_update(struct dc *dc,
 		struct dc_stream_state *stream,
 		struct dc_stream_update *stream_update,
@@ -1459,11 +1626,13 @@ static void commit_planes_do_stream_update(struct dc *dc,
 					stream_update->adjust->v_total_min,
 					stream_update->adjust->v_total_max);
 
-			if (stream_update->periodic_fn_vsync_delta &&
-					pipe_ctx->stream_res.tg->funcs->program_vline_interrupt)
-				pipe_ctx->stream_res.tg->funcs->program_vline_interrupt(
-					pipe_ctx->stream_res.tg, &pipe_ctx->stream->timing,
-					pipe_ctx->stream->periodic_fn_vsync_delta);
+			if (stream_update->periodic_interrupt0 &&
+					dc->hwss.setup_periodic_interrupt)
+				dc->hwss.setup_periodic_interrupt(pipe_ctx, VLINE0);
+
+			if (stream_update->periodic_interrupt1 &&
+					dc->hwss.setup_periodic_interrupt)
+				dc->hwss.setup_periodic_interrupt(pipe_ctx, VLINE1);
 
 			if ((stream_update->hdr_static_metadata && !stream->use_dynamic_meta) ||
 					stream_update->vrr_infopacket ||
@@ -1605,7 +1774,6 @@ void dc_commit_updates_for_stream(struct dc *dc,
 		int surface_count,
 		struct dc_stream_state *stream,
 		struct dc_stream_update *stream_update,
-		struct dc_plane_state **plane_states,
 		struct dc_state *state)
 {
 	const struct dc_stream_status *stream_status;
@@ -1640,14 +1808,7 @@ void dc_commit_updates_for_stream(struct dc *dc,
 	for (i = 0; i < surface_count; i++) {
 		struct dc_plane_state *surface = srf_updates[i].surface;
 
-		/* TODO: On flip we don't build the state, so it still has the
-		 * old address. Which is why we are updating the address here
-		 */
-		if (srf_updates[i].flip_addr) {
-			surface->address = srf_updates[i].flip_addr->address;
-			surface->flip_immediate = srf_updates[i].flip_addr->flip_immediate;
-
-		}
+		copy_surface_update_to_plane(surface, &srf_updates[i]);
 
 		if (update_type >= UPDATE_TYPE_MED) {
 			for (j = 0; j < dc->res_pool->pipe_count; j++) {
@@ -1764,6 +1925,26 @@ void dc_resume(struct dc *dc)
 		core_link_resume(dc->links[i]);
 }
 
+unsigned int dc_get_current_backlight_pwm(struct dc *dc)
+{
+	struct abm *abm = dc->res_pool->abm;
+
+	if (abm)
+		return abm->funcs->get_current_backlight(abm);
+
+	return 0;
+}
+
+unsigned int dc_get_target_backlight_pwm(struct dc *dc)
+{
+	struct abm *abm = dc->res_pool->abm;
+
+	if (abm)
+		return abm->funcs->get_target_backlight(abm);
+
+	return 0;
+}
+
 bool dc_is_dmcu_initialized(struct dc *dc)
 {
 	struct dmcu *dmcu = dc->res_pool->dmcu;
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index b0265dbebd4c..7f5a947ad31d 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -43,10 +43,6 @@
 #include "dpcd_defs.h"
 #include "dmcu.h"
 
-#include "dce/dce_11_0_d.h"
-#include "dce/dce_11_0_enum.h"
-#include "dce/dce_11_0_sh_mask.h"
-
 #define DC_LOGGER_INIT(logger)
 
 
@@ -80,6 +76,12 @@ static void destruct(struct dc_link *link)
 {
 	int i;
 
+	if (link->hpd_gpio != NULL) {
+		dal_gpio_close(link->hpd_gpio);
+		dal_gpio_destroy_irq(&link->hpd_gpio);
+		link->hpd_gpio = NULL;
+	}
+
 	if (link->ddc)
 		dal_ddc_service_destroy(&link->ddc);
 
@@ -789,7 +791,7 @@ bool dc_link_detect(struct dc_link *link, enum dc_detect_reason reason)
 			return false;
 		}
 
-		sink->dongle_max_pix_clk = sink_caps.max_hdmi_pixel_clock;
+		sink->link->dongle_max_pix_clk = sink_caps.max_hdmi_pixel_clock;
 		sink->converter_disable_audio = converter_disable_audio;
 
 		link->local_sink = sink;
@@ -935,18 +937,11 @@ bool dc_link_detect(struct dc_link *link, enum dc_detect_reason reason)
 
 bool dc_link_get_hpd_state(struct dc_link *dc_link)
 {
-	struct gpio *hpd_pin;
 	uint32_t state;
 
-	hpd_pin = get_hpd_gpio(dc_link->ctx->dc_bios,
-					dc_link->link_id, dc_link->ctx->gpio_service);
-	if (hpd_pin == NULL)
-		ASSERT(false);
-
-	dal_gpio_open(hpd_pin, GPIO_MODE_INTERRUPT);
-	dal_gpio_get_value(hpd_pin, &state);
-	dal_gpio_close(hpd_pin);
-	dal_gpio_destroy_irq(&hpd_pin);
+	dal_gpio_lock_pin(dc_link->hpd_gpio);
+	dal_gpio_get_value(dc_link->hpd_gpio, &state);
+	dal_gpio_unlock_pin(dc_link->hpd_gpio);
 
 	return state;
 }
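dc_link_get_hpd_state() used to create, open, read, close, and destroy the HPD pin on every query; the pin is now opened once at link construction, cached as link->hpd_gpio, and each read only has to lock it. The lifetime, sketched with hypothetical wrappers around the dal_gpio calls:

#include <stdint.h>

struct gpio;

/* hypothetical wrappers over the dal_gpio_* API */
extern void gpio_lock(struct gpio *pin);
extern void gpio_unlock(struct gpio *pin);
extern void gpio_get_value(struct gpio *pin, uint32_t *state);

/* per-query cost after the change: lock, read, unlock -- no setup */
static uint32_t read_hpd(struct gpio *hpd)
{
	uint32_t state;

	gpio_lock(hpd);
	gpio_get_value(hpd, &state);
	gpio_unlock(hpd);
	return state;
}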
@@ -1102,7 +1097,6 @@ static bool construct(
 	const struct link_init_data *init_params)
 {
 	uint8_t i;
-	struct gpio *hpd_gpio = NULL;
 	struct ddc_service_init_data ddc_service_init_data = { { 0 } };
 	struct dc_context *dc_ctx = init_params->ctx;
 	struct encoder_init_data enc_init_data = { 0 };
@@ -1132,10 +1126,12 @@ static bool construct(
 	if (link->dc->res_pool->funcs->link_init)
 		link->dc->res_pool->funcs->link_init(link);
 
-	hpd_gpio = get_hpd_gpio(link->ctx->dc_bios, link->link_id, link->ctx->gpio_service);
-
-	if (hpd_gpio != NULL)
-		link->irq_source_hpd = dal_irq_get_source(hpd_gpio);
+	link->hpd_gpio = get_hpd_gpio(link->ctx->dc_bios, link->link_id, link->ctx->gpio_service);
+	if (link->hpd_gpio != NULL) {
+		dal_gpio_open(link->hpd_gpio, GPIO_MODE_INTERRUPT);
+		dal_gpio_unlock_pin(link->hpd_gpio);
+		link->irq_source_hpd = dal_irq_get_source(link->hpd_gpio);
+	}
 
 	switch (link->link_id.id) {
 	case CONNECTOR_ID_HDMI_TYPE_A:
@@ -1153,18 +1149,18 @@ static bool construct(
 	case CONNECTOR_ID_DISPLAY_PORT:
 		link->connector_signal =	SIGNAL_TYPE_DISPLAY_PORT;
 
-		if (hpd_gpio != NULL)
+		if (link->hpd_gpio != NULL)
 			link->irq_source_hpd_rx =
-					dal_irq_get_rx_source(hpd_gpio);
+					dal_irq_get_rx_source(link->hpd_gpio);
 
 		break;
 	case CONNECTOR_ID_EDP:
 		link->connector_signal = SIGNAL_TYPE_EDP;
 
-		if (hpd_gpio != NULL) {
+		if (link->hpd_gpio != NULL) {
 			link->irq_source_hpd = DC_IRQ_SOURCE_INVALID;
 			link->irq_source_hpd_rx =
-					dal_irq_get_rx_source(hpd_gpio);
+					dal_irq_get_rx_source(link->hpd_gpio);
 		}
 		break;
 	case CONNECTOR_ID_LVDS:
@@ -1175,10 +1171,7 @@ static bool construct(
 		goto create_fail;
 	}
 
-	if (hpd_gpio != NULL) {
-		dal_gpio_destroy_irq(&hpd_gpio);
-		hpd_gpio = NULL;
-	}
+
 
 	/* TODO: #DAL3 Implement id to str function.*/
 	LINK_INFO("Connector[%d] description:"
@@ -1281,8 +1274,9 @@ link_enc_create_fail:
 ddc_create_fail:
 create_fail:
 
-	if (hpd_gpio != NULL) {
-		dal_gpio_destroy_irq(&hpd_gpio);
+	if (link->hpd_gpio != NULL) {
+		dal_gpio_destroy_irq(&link->hpd_gpio);
+		link->hpd_gpio = NULL;
 	}
 
 	return false;
@@ -1372,7 +1366,7 @@ static void dpcd_configure_panel_mode(
 static void enable_stream_features(struct pipe_ctx *pipe_ctx)
 {
 	struct dc_stream_state *stream = pipe_ctx->stream;
-	struct dc_link *link = stream->sink->link;
+	struct dc_link *link = stream->link;
 	union down_spread_ctrl old_downspread;
 	union down_spread_ctrl new_downspread;
 
@@ -1397,7 +1391,7 @@ static enum dc_status enable_link_dp(
 	struct dc_stream_state *stream = pipe_ctx->stream;
 	enum dc_status status;
 	bool skip_video_pattern;
-	struct dc_link *link = stream->sink->link;
+	struct dc_link *link = stream->link;
 	struct dc_link_settings link_settings = {0};
 	enum dp_panel_mode panel_mode;
 
@@ -1414,8 +1408,8 @@ static enum dc_status enable_link_dp(
 		pipe_ctx->clock_source->id,
 		&link_settings);
 
-	if (stream->sink->edid_caps.panel_patch.dppowerup_delay > 0) {
-		int delay_dp_power_up_in_ms = stream->sink->edid_caps.panel_patch.dppowerup_delay;
+	if (stream->sink_patches.dppowerup_delay > 0) {
+		int delay_dp_power_up_in_ms = stream->sink_patches.dppowerup_delay;
 
 		msleep(delay_dp_power_up_in_ms);
 	}
@@ -1448,7 +1442,7 @@ static enum dc_status enable_link_edp(
 {
 	enum dc_status status;
 	struct dc_stream_state *stream = pipe_ctx->stream;
-	struct dc_link *link = stream->sink->link;
+	struct dc_link *link = stream->link;
 	/*in case it is not on*/
 	link->dc->hwss.edp_power_control(link, true);
 	link->dc->hwss.edp_wait_for_hpd_ready(link, true);
@@ -1463,7 +1457,7 @@ static enum dc_status enable_link_dp_mst(
 		struct dc_state *state,
 		struct pipe_ctx *pipe_ctx)
 {
-	struct dc_link *link = pipe_ctx->stream->sink->link;
+	struct dc_link *link = pipe_ctx->stream->link;
 
 	/* sink signal type after MST branch is MST. Multiple MST sinks
 	 * share one link. Link DP PHY is enable or training only once.
@@ -1471,6 +1465,11 @@ static enum dc_status enable_link_dp_mst(
 	if (link->cur_link_settings.lane_count != LANE_COUNT_UNKNOWN)
 		return DC_OK;
 
+	/* make sure any pending down reply is processed
+	 * before the payload table is cleared
+	 */
+	dm_helpers_dp_mst_poll_pending_down_reply(link->ctx, link);
+
 	/* clear payload table */
 	dm_helpers_dp_mst_clear_payload_allocation_table(link->ctx, link);
 
@@ -1597,7 +1596,7 @@ static bool i2c_write(struct pipe_ctx *pipe_ctx,
 	cmd.payloads = &payload;
 
 	if (dm_helpers_submit_i2c(pipe_ctx->stream->ctx,
-			pipe_ctx->stream->sink->link, &cmd))
+			pipe_ctx->stream->link, &cmd))
 		return true;
 
 	return false;
@@ -1651,7 +1650,7 @@ static void write_i2c_retimer_setting(
 				else {
 					i2c_success =
 						dal_ddc_service_query_ddc_data(
-						pipe_ctx->stream->sink->link->ddc,
+						pipe_ctx->stream->link->ddc,
 						slave_address, &offset, 1, &value, 1);
 					if (!i2c_success)
 						/* Write failure */
@@ -1704,7 +1703,7 @@ static void write_i2c_retimer_setting(
 					else {
 						i2c_success =
 								dal_ddc_service_query_ddc_data(
-								pipe_ctx->stream->sink->link->ddc,
+								pipe_ctx->stream->link->ddc,
 								slave_address, &offset, 1, &value, 1);
 						if (!i2c_success)
 							/* Write failure */
@@ -1929,7 +1928,7 @@ static void write_i2c_redriver_setting(
 static void enable_link_hdmi(struct pipe_ctx *pipe_ctx)
 {
 	struct dc_stream_state *stream = pipe_ctx->stream;
-	struct dc_link *link = stream->sink->link;
+	struct dc_link *link = stream->link;
 	enum dc_color_depth display_color_depth;
 	enum engine_id eng_id;
 	struct ext_hdmi_settings settings = {0};
@@ -1938,12 +1937,12 @@ static void enable_link_hdmi(struct pipe_ctx *pipe_ctx)
 			&& (stream->timing.v_addressable == 480);
 
 	if (stream->phy_pix_clk == 0)
-		stream->phy_pix_clk = stream->timing.pix_clk_khz;
+		stream->phy_pix_clk = stream->timing.pix_clk_100hz / 10;
 	if (stream->phy_pix_clk > 340000)
 		is_over_340mhz = true;
 
 	if (dc_is_hdmi_signal(pipe_ctx->stream->signal)) {
-		unsigned short masked_chip_caps = pipe_ctx->stream->sink->link->chip_caps &
+		unsigned short masked_chip_caps = pipe_ctx->stream->link->chip_caps &
 				EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK;
 		if (masked_chip_caps == EXT_DISPLAY_PATH_CAPS__HDMI20_TISN65DP159RSBT) {
 			/* DP159, Retimer settings */
@@ -1964,11 +1963,11 @@ static void enable_link_hdmi(struct pipe_ctx *pipe_ctx)
 
 	if (dc_is_hdmi_signal(pipe_ctx->stream->signal))
 		dal_ddc_service_write_scdc_data(
-			stream->sink->link->ddc,
+			stream->link->ddc,
 			stream->phy_pix_clk,
 			stream->timing.flags.LTE_340MCSC_SCRAMBLE);
 
-	memset(&stream->sink->link->cur_link_settings, 0,
+	memset(&stream->link->cur_link_settings, 0,
 			sizeof(struct dc_link_settings));
 
 	display_color_depth = stream->timing.display_color_depth;
@@ -1989,12 +1988,12 @@ static void enable_link_hdmi(struct pipe_ctx *pipe_ctx)
 static void enable_link_lvds(struct pipe_ctx *pipe_ctx)
 {
 	struct dc_stream_state *stream = pipe_ctx->stream;
-	struct dc_link *link = stream->sink->link;
+	struct dc_link *link = stream->link;
 
 	if (stream->phy_pix_clk == 0)
-		stream->phy_pix_clk = stream->timing.pix_clk_khz;
+		stream->phy_pix_clk = stream->timing.pix_clk_100hz / 10;
 
-	memset(&stream->sink->link->cur_link_settings, 0,
+	memset(&stream->link->cur_link_settings, 0,
 			sizeof(struct dc_link_settings));
 
 	link->link_enc->funcs->enable_lvds_output(
@@ -2067,7 +2066,7 @@ static bool dp_active_dongle_validate_timing(
 		const struct dc_crtc_timing *timing,
 		const struct dpcd_caps *dpcd_caps)
 {
-	unsigned int required_pix_clk = timing->pix_clk_khz;
+	unsigned int required_pix_clk_100hz = timing->pix_clk_100hz;
 	const struct dc_dongle_caps *dongle_caps = &dpcd_caps->dongle_caps;
 
 	switch (dpcd_caps->dongle_type) {
@@ -2107,9 +2106,9 @@ static bool dp_active_dongle_validate_timing(
 
 	/* Check Color Depth and Pixel Clock */
 	if (timing->pixel_encoding == PIXEL_ENCODING_YCBCR420)
-		required_pix_clk /= 2;
+		required_pix_clk_100hz /= 2;
 	else if (timing->pixel_encoding == PIXEL_ENCODING_YCBCR422)
-		required_pix_clk = required_pix_clk * 2 / 3;
+		required_pix_clk_100hz = required_pix_clk_100hz * 2 / 3;
 
 	switch (timing->display_color_depth) {
 	case COLOR_DEPTH_666:
@@ -2119,12 +2118,12 @@ static bool dp_active_dongle_validate_timing(
 	case COLOR_DEPTH_101010:
 		if (dongle_caps->dp_hdmi_max_bpc < 10)
 			return false;
-		required_pix_clk = required_pix_clk * 10 / 8;
+		required_pix_clk_100hz = required_pix_clk_100hz * 10 / 8;
 		break;
 	case COLOR_DEPTH_121212:
 		if (dongle_caps->dp_hdmi_max_bpc < 12)
 			return false;
-		required_pix_clk = required_pix_clk * 12 / 8;
+		required_pix_clk_100hz = required_pix_clk_100hz * 12 / 8;
 		break;
 
 	case COLOR_DEPTH_141414:
@@ -2134,7 +2133,7 @@ static bool dp_active_dongle_validate_timing(
 		return false;
 	}
 
-	if (required_pix_clk > dongle_caps->dp_hdmi_max_pixel_clk)
+	if (required_pix_clk_100hz > (dongle_caps->dp_hdmi_max_pixel_clk * 10))
 		return false;
 
 	return true;
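The validation above now works in 100 Hz units end to end: chroma subsampling scales the required clock down (half for 4:2:0, 2/3 for 4:2:2), deep color scales it up (10/8 or 12/8), and since the dongle still reports its HDMI ceiling in kHz, the cap is multiplied by 10 for the final compare. The arithmetic as a standalone check; the dongle's max-bpc capability gate is omitted here:

#include <stdbool.h>
#include <stdint.h>

static bool dongle_clock_ok(uint32_t pix_clk_100hz, bool ycbcr420,
			    bool ycbcr422, unsigned int bpc,
			    uint32_t dongle_max_khz)
{
	uint32_t required = pix_clk_100hz;

	if (ycbcr420)
		required /= 2;			/* 4:2:0 halves the rate */
	else if (ycbcr422)
		required = required * 2 / 3;	/* 4:2:2 packs 2/3 */

	if (bpc == 10)
		required = required * 10 / 8;	/* 30 bpp vs. 24 bpp */
	else if (bpc == 12)
		required = required * 12 / 8;

	/* dongle cap is reported in kHz; bring it into 100 Hz units */
	return required <= dongle_max_khz * 10;
}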
@@ -2145,7 +2144,7 @@ enum dc_status dc_link_validate_mode_timing(
 		struct dc_link *link,
 		const struct dc_crtc_timing *timing)
 {
-	uint32_t max_pix_clk = stream->sink->dongle_max_pix_clk;
+	uint32_t max_pix_clk = stream->link->dongle_max_pix_clk * 10;
 	struct dpcd_caps *dpcd_caps = &link->dpcd_caps;
 
 	/* A hack to avoid failing any modes for EDID override feature on
@@ -2155,7 +2154,7 @@ enum dc_status dc_link_validate_mode_timing(
 		return DC_OK;
 
 	/* Passive Dongle */
-	if (0 != max_pix_clk && timing->pix_clk_khz > max_pix_clk)
+	if (0 != max_pix_clk && timing->pix_clk_100hz > max_pix_clk)
 		return DC_EXCEED_DONGLE_CAP;
 
 	/* Active Dongle*/
@@ -2214,7 +2213,7 @@ bool dc_link_set_backlight_level(const struct dc_link *link,
 		for (i = 0; i < MAX_PIPES; i++) {
 			if (core_dc->current_state->res_ctx.pipe_ctx[i].stream) {
 				if (core_dc->current_state->res_ctx.
-						pipe_ctx[i].stream->sink->link
+						pipe_ctx[i].stream->link
 						== link)
 					/* DMCU -1 for all controller id values,
 					 * therefore +1 here
@@ -2274,7 +2273,7 @@ void core_link_resume(struct dc_link *link)
 static struct fixed31_32 get_pbn_per_slot(struct dc_stream_state *stream)
 {
 	struct dc_link_settings *link_settings =
-			&stream->sink->link->cur_link_settings;
+			&stream->link->cur_link_settings;
 	uint32_t link_rate_in_mbps =
 			link_settings->link_rate * LINK_RATE_REF_FREQ_IN_MHZ;
 	struct fixed31_32 mbps = dc_fixpt_from_int(
@@ -2305,7 +2304,7 @@ static struct fixed31_32 get_pbn_from_timing(struct pipe_ctx *pipe_ctx)
 	uint32_t denominator;
 
 	bpc = get_color_depth(pipe_ctx->stream_res.pix_clk_params.color_depth);
-	kbps = pipe_ctx->stream_res.pix_clk_params.requested_pix_clk * bpc * 3;
+	kbps = pipe_ctx->stream_res.pix_clk_params.requested_pix_clk_100hz / 10 * bpc * 3;
 
 	/*
 	 * margin 5300ppm + 300ppm ~ 0.6% as per spec, factor is 1.006
@@ -2381,7 +2380,7 @@ static void update_mst_stream_alloc_table(
 static enum dc_status allocate_mst_payload(struct pipe_ctx *pipe_ctx)
 {
 	struct dc_stream_state *stream = pipe_ctx->stream;
-	struct dc_link *link = stream->sink->link;
+	struct dc_link *link = stream->link;
 	struct link_encoder *link_encoder = link->link_enc;
 	struct stream_encoder *stream_encoder = pipe_ctx->stream_res.stream_enc;
 	struct dp_mst_stream_allocation_table proposed_table = {0};
@@ -2461,7 +2460,7 @@ static enum dc_status allocate_mst_payload(struct pipe_ctx *pipe_ctx)
 static enum dc_status deallocate_mst_payload(struct pipe_ctx *pipe_ctx)
 {
 	struct dc_stream_state *stream = pipe_ctx->stream;
-	struct dc_link *link = stream->sink->link;
+	struct dc_link *link = stream->link;
 	struct link_encoder *link_encoder = link->link_enc;
 	struct stream_encoder *stream_encoder = pipe_ctx->stream_res.stream_enc;
 	struct dp_mst_stream_allocation_table proposed_table = {0};
@@ -2546,8 +2545,8 @@ void core_link_enable_stream(
 	DC_LOGGER_INIT(pipe_ctx->stream->ctx->logger);
 
 	if (pipe_ctx->stream->signal != SIGNAL_TYPE_VIRTUAL) {
-		stream->sink->link->link_enc->funcs->setup(
-			stream->sink->link->link_enc,
+		stream->link->link_enc->funcs->setup(
+			stream->link->link_enc,
 			pipe_ctx->stream->signal);
 		pipe_ctx->stream_res.stream_enc->funcs->setup_stereo_sync(
 			pipe_ctx->stream_res.stream_enc,
@@ -2581,13 +2580,23 @@ void core_link_enable_stream(
 			&stream->timing);
 
 	if (!IS_FPGA_MAXIMUS_DC(core_dc->ctx->dce_environment)) {
+		bool apply_edp_fast_boot_optimization =
+			pipe_ctx->stream->apply_edp_fast_boot_optimization;
+
+		pipe_ctx->stream->apply_edp_fast_boot_optimization = false;
+
 		resource_build_info_frame(pipe_ctx);
 		core_dc->hwss.update_info_frame(pipe_ctx);
 
+		/* Do not touch link on seamless boot optimization. */
+		if (pipe_ctx->stream->apply_seamless_boot_optimization) {
+			pipe_ctx->stream->dpms_off = false;
+			return;
+		}
+
 		/* eDP lit up by bios already, no need to enable again. */
 		if (pipe_ctx->stream->signal == SIGNAL_TYPE_EDP &&
-				pipe_ctx->stream->apply_edp_fast_boot_optimization) {
-			pipe_ctx->stream->apply_edp_fast_boot_optimization = false;
+					apply_edp_fast_boot_optimization) {
 			pipe_ctx->stream->dpms_off = false;
 			return;
 		}
@@ -2599,7 +2608,7 @@ void core_link_enable_stream(
 
 		if (status != DC_OK) {
 			DC_LOG_WARNING("enabling link %u failed: %d\n",
-			pipe_ctx->stream->sink->link->link_index,
+			pipe_ctx->stream->link->link_index,
 			status);
 
 			/* Abort stream enable *unless* the failure was due to
@@ -2614,6 +2623,8 @@ void core_link_enable_stream(
 			}
 		}
 
+		stream->link->link_status.link_active = true;
+
 		core_dc->hwss.enable_audio_stream(pipe_ctx);
 
 		/* turn off otg test pattern if enable */
@@ -2628,7 +2639,7 @@ void core_link_enable_stream(
 			allocate_mst_payload(pipe_ctx);
 
 		core_dc->hwss.unblank_stream(pipe_ctx,
-			&pipe_ctx->stream->sink->link->cur_link_settings);
+			&pipe_ctx->stream->link->cur_link_settings);
 
 		if (dc_is_dp_signal(pipe_ctx->stream->signal))
 			enable_stream_features(pipe_ctx);
@@ -2647,7 +2658,9 @@ void core_link_disable_stream(struct pipe_ctx *pipe_ctx, int option)
 
 	core_dc->hwss.disable_stream(pipe_ctx, option);
 
-	disable_link(pipe_ctx->stream->sink->link, pipe_ctx->stream->signal);
+	disable_link(pipe_ctx->stream->link, pipe_ctx->stream->signal);
+
+	pipe_ctx->stream->link->link_status.link_active = false;
 }
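With link_active now kept up to date by core_link_enable_stream() and core_link_disable_stream(), callers can consult it through the existing accessor. A hedged usage sketch (the wrapper below is hypothetical):

	/* Sketch: skip link-dependent work while the link is down. */
	static bool link_is_up(const struct dc_link *link)
	{
		const struct dc_link_status *status = dc_link_get_status(link);

		return status && status->link_active;
	}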
 
 void core_link_set_avmute(struct pipe_ctx *pipe_ctx, bool enable)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
index 506a97e16956..b7ee63cd8dc7 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
@@ -33,7 +33,7 @@
 #include "include/vector.h"
 #include "core_types.h"
 #include "dc_link_ddc.h"
-#include "aux_engine.h"
+#include "dce/dce_aux.h"
 
 #define AUX_POWER_UP_WA_DELAY 500
 #define I2C_OVER_AUX_DEFER_WA_DELAY 70
@@ -42,7 +42,6 @@
 #define CV_SMART_DONGLE_ADDRESS 0x20
 /* DVI-HDMI dongle slave address for retrieving dongle signature*/
 #define DVI_HDMI_DONGLE_ADDRESS 0x68
-static const int8_t dvi_hdmi_dongle_signature_str[] = "6140063500G";
 struct dvi_hdmi_dongle_signature_data {
 	int8_t vendor[3];/* "AMD" */
 	uint8_t version[2];
@@ -165,43 +164,6 @@ static void dal_ddc_i2c_payloads_destroy(struct i2c_payloads **p)
 
 }
 
-static struct aux_payloads *dal_ddc_aux_payloads_create(struct dc_context *ctx, uint32_t count)
-{
-	struct aux_payloads *payloads;
-
-	payloads = kzalloc(sizeof(struct aux_payloads), GFP_KERNEL);
-
-	if (!payloads)
-		return NULL;
-
-	if (dal_vector_construct(
-		&payloads->payloads, ctx, count, sizeof(struct aux_payload)))
-		return payloads;
-
-	kfree(payloads);
-	return NULL;
-}
-
-static struct aux_payload *dal_ddc_aux_payloads_get(struct aux_payloads *p)
-{
-	return (struct aux_payload *)p->payloads.container;
-}
-
-static uint32_t  dal_ddc_aux_payloads_get_count(struct aux_payloads *p)
-{
-	return p->payloads.count;
-}
-
-static void dal_ddc_aux_payloads_destroy(struct aux_payloads **p)
-{
-	if (!p || !*p)
-		return;
-
-	dal_vector_destruct(&(*p)->payloads);
-	kfree(*p);
-	*p = NULL;
-}
-
 #define DDC_MIN(a, b) (((a) < (b)) ? (a) : (b))
 
 void dal_ddc_i2c_payloads_add(
@@ -225,27 +187,6 @@ void dal_ddc_i2c_payloads_add(
 
 }
 
-void dal_ddc_aux_payloads_add(
-	struct aux_payloads *payloads,
-	uint32_t address,
-	uint32_t len,
-	uint8_t *data,
-	bool write)
-{
-	uint32_t payload_size = DEFAULT_AUX_MAX_DATA_SIZE;
-	uint32_t pos;
-
-	for (pos = 0; pos < len; pos += payload_size) {
-		struct aux_payload payload = {
-			.i2c_over_aux = true,
-			.write = write,
-			.address = address,
-			.length = DDC_MIN(payload_size, len - pos),
-			.data = data + pos };
-		dal_vector_append(&payloads->payloads, &payload);
-	}
-}
-
 static void construct(
 	struct ddc_service *ddc_service,
 	struct ddc_service_init_data *init_data)
@@ -574,32 +515,34 @@ bool dal_ddc_service_query_ddc_data(
 	/*TODO: len of payload data for i2c and aux is uint8!!!!,
 	 *  but we want to read 256 over i2c!!!!*/
 	if (dal_ddc_service_is_in_aux_transaction_mode(ddc)) {
-
-		struct aux_payloads *payloads =
-			dal_ddc_aux_payloads_create(ddc->ctx, payloads_num);
-
-		struct aux_command command = {
-			.payloads = dal_ddc_aux_payloads_get(payloads),
-			.number_of_payloads = 0,
+		struct aux_payload write_payload = {
+			.i2c_over_aux = true,
+			.write = true,
+			.mot = true,
+			.address = address,
+			.length = write_size,
+			.data = write_buf,
+			.reply = NULL,
 			.defer_delay = get_defer_delay(ddc),
-			.max_defer_write_retry = 0 };
+		};
 
-		dal_ddc_aux_payloads_add(
-			payloads, address, write_size, write_buf, true);
-
-		dal_ddc_aux_payloads_add(
-			payloads, address, read_size, read_buf, false);
-
-		command.number_of_payloads =
-			dal_ddc_aux_payloads_get_count(payloads);
+		struct aux_payload read_payload = {
+			.i2c_over_aux = true,
+			.write = false,
+			.mot = false,
+			.address = address,
+			.length = read_size,
+			.data = read_buf,
+			.reply = NULL,
+			.defer_delay = get_defer_delay(ddc),
+		};
 
-		ret = dal_i2caux_submit_aux_command(
-				ddc->ctx->i2caux,
-				ddc->ddc_pin,
-				&command);
+		ret = dc_link_aux_transfer_with_retries(ddc, &write_payload);
 
-		dal_ddc_aux_payloads_destroy(&payloads);
+		if (!ret)
+			return false;
 
+		ret = dc_link_aux_transfer_with_retries(ddc, &read_payload);
 	} else {
 		struct i2c_payloads *payloads =
 			dal_ddc_i2c_payloads_create(ddc->ctx, payloads_num);
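The rewrite above drops the heap-allocated payload vector in favor of two stack payloads passed straight to dc_link_aux_transfer_with_retries(); the write leg keeps .mot (middle-of-transaction) set so the I2C-over-AUX transaction stays open for the read, which clears it. A reduced sketch of the same write-then-read shape, with made-up address and sizes:

	/* Sketch: set an I2C register offset, then read back a block.
	 * MOT on the write keeps the transaction open between the legs. */
	uint8_t offset = 0;
	uint8_t buf[16];
	struct aux_payload set_offset = {
		.i2c_over_aux = true, .write = true, .mot = true,
		.address = 0x50, .length = 1, .data = &offset,
	};
	struct aux_payload read_block = {
		.i2c_over_aux = true, .write = false, .mot = false,
		.address = 0x50, .length = sizeof(buf), .data = buf,
	};

	if (dc_link_aux_transfer_with_retries(ddc, &set_offset) &&
	    dc_link_aux_transfer_with_retries(ddc, &read_block))
		; /* buf now holds 16 bytes from the sink */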
@@ -631,56 +574,15 @@ bool dal_ddc_service_query_ddc_data(
 }
 
 int dc_link_aux_transfer(struct ddc_service *ddc,
-			     unsigned int address,
-			     uint8_t *reply,
-			     void *buffer,
-			     unsigned int size,
-			     enum aux_transaction_type type,
-			     enum i2caux_transaction_action action)
+		struct aux_payload *payload)
 {
-	struct ddc *ddc_pin = ddc->ddc_pin;
-	struct aux_engine *aux_engine;
-	enum aux_channel_operation_result operation_result;
-	struct aux_request_transaction_data aux_req;
-	struct aux_reply_transaction_data aux_rep;
-	uint8_t returned_bytes = 0;
-	int res = -1;
-	uint32_t status;
-
-	memset(&aux_req, 0, sizeof(aux_req));
-	memset(&aux_rep, 0, sizeof(aux_rep));
-
-	aux_engine = ddc->ctx->dc->res_pool->engines[ddc_pin->pin_data->en];
-	aux_engine->funcs->acquire(aux_engine, ddc_pin);
-
-	aux_req.type = type;
-	aux_req.action = action;
-
-	aux_req.address = address;
-	aux_req.delay = 0;
-	aux_req.length = size;
-	aux_req.data = buffer;
-
-	aux_engine->funcs->submit_channel_request(aux_engine, &aux_req);
-	operation_result = aux_engine->funcs->get_channel_status(aux_engine, &returned_bytes);
-
-	switch (operation_result) {
-	case AUX_CHANNEL_OPERATION_SUCCEEDED:
-		res = aux_engine->funcs->read_channel_reply(aux_engine, size,
-							buffer, reply,
-							&status);
-		break;
-	case AUX_CHANNEL_OPERATION_FAILED_HPD_DISCON:
-		res = 0;
-		break;
-	case AUX_CHANNEL_OPERATION_FAILED_REASON_UNKNOWN:
-	case AUX_CHANNEL_OPERATION_FAILED_INVALID_REPLY:
-	case AUX_CHANNEL_OPERATION_FAILED_TIMEOUT:
-		res = -1;
-		break;
-	}
-	aux_engine->funcs->release_engine(aux_engine);
-	return res;
+	return dce_aux_transfer(ddc, payload);
+}
+
+bool dc_link_aux_transfer_with_retries(struct ddc_service *ddc,
+		struct aux_payload *payload)
+{
+	return dce_aux_transfer_with_retries(ddc, payload);
 }
 
 /*test only function*/
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 0caacb60b02f..09d301216076 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -49,6 +49,8 @@ static void wait_for_training_aux_rd_interval(
 {
 	union training_aux_rd_interval training_rd_interval;
 
+	memset(&training_rd_interval, 0, sizeof(training_rd_interval));
+
 	/* overwrite the delay if rev > 1.1*/
 	if (link->dpcd_caps.dpcd_rev.raw >= DPCD_REV_12) {
 		/* DP 1.2 or later - retrieve delay through
@@ -117,6 +119,13 @@ static void dpcd_set_link_settings(
 	core_link_write_dpcd(link, DP_DOWNSPREAD_CTRL,
 	&downspread.raw, sizeof(downspread));
 
+	if (link->dpcd_caps.dpcd_rev.raw >= DPCD_REV_14 &&
+		(link->dpcd_caps.link_rate_set >= 1 &&
+		link->dpcd_caps.link_rate_set <= 8)) {
+		core_link_write_dpcd(link, DP_LINK_RATE_SET,
+		&link->dpcd_caps.link_rate_set, 1);
+	}
+
 	DC_LOG_HW_LINK_TRAINING("%s\n %x rate = %x\n %x lane = %x\n %x spread = %x\n",
 		__func__,
 		DP_LINK_BW_SET,
@@ -1542,7 +1551,7 @@ static uint32_t bandwidth_in_kbps_from_timing(
 
 	ASSERT(bits_per_channel != 0);
 
-	kbps = timing->pix_clk_khz;
+	kbps = timing->pix_clk_100hz / 10;
 	kbps *= bits_per_channel;
 
 	if (timing->flags.Y_ONLY != 1) {
@@ -1584,7 +1593,7 @@ bool dp_validate_mode_timing(
 	const struct dc_link_settings *link_setting;
 
 	/*always DP fail safe mode*/
-	if (timing->pix_clk_khz == (uint32_t) 25175 &&
+	if ((timing->pix_clk_100hz / 10) == (uint32_t) 25175 &&
 		timing->h_addressable == (uint32_t) 640 &&
 		timing->v_addressable == (uint32_t) 480)
 		return true;
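The fail-safe check above is easy to sanity-check against the new units: the VGA fail-safe mode is 640x480 at a 25.175 MHz pixel clock, stored as pix_clk_100hz = 251750, and dividing by 10 recovers the 25175 kHz value the constant encodes. Any timing still holding kHz after this series would miss the comparison by a factor of 10, which is why the conversion is done at the comparison site rather than by changing the constant.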
@@ -1634,7 +1643,7 @@ void decide_link_settings(struct dc_stream_state *stream,
 
 	req_bw = bandwidth_in_kbps_from_timing(&stream->timing);
 
-	link = stream->sink->link;
+	link = stream->link;
 
 	/* if preferred is specified through AMDDP, use it, if it's enough
 	 * to drive the mode
@@ -1656,7 +1665,7 @@ void decide_link_settings(struct dc_stream_state *stream,
 	}
 
 	/* EDP use the link cap setting */
-	if (stream->sink->sink_signal == SIGNAL_TYPE_EDP) {
+	if (link->connector_signal == SIGNAL_TYPE_EDP) {
 		*link_setting = link->verified_link_cap;
 		return;
 	}
@@ -2002,11 +2011,7 @@ static void handle_automated_test(struct dc_link *link)
 		dp_test_send_phy_test_pattern(link);
 		test_response.bits.ACK = 1;
 	}
-	if (!test_request.raw)
-		/* no requests, revert all test signals
-		 * TODO: revert all test signals
-		 */
-		test_response.bits.ACK = 1;
+
 	/* send request acknowledgment */
 	if (test_response.bits.ACK)
 		core_link_write_dpcd(
@@ -2493,13 +2498,72 @@ bool detect_dp_sink_caps(struct dc_link *link)
 	/* TODO save sink caps in link->sink */
 }
 
+enum dc_link_rate linkRateInKHzToLinkRateMultiplier(uint32_t link_rate_in_khz)
+{
+	enum dc_link_rate link_rate;
+	// LinkRate is normally stored as a multiplier of 0.27 Gbps per lane. Do the translation.
+	switch (link_rate_in_khz) {
+	case 1620000:
+		link_rate = LINK_RATE_LOW;		// Rate_1 (RBR)		- 1.62 Gbps/Lane
+		break;
+	case 2160000:
+		link_rate = LINK_RATE_RATE_2;	// Rate_2			- 2.16 Gbps/Lane
+		break;
+	case 2430000:
+		link_rate = LINK_RATE_RATE_3;	// Rate_3			- 2.43 Gbps/Lane
+		break;
+	case 2700000:
+		link_rate = LINK_RATE_HIGH;		// Rate_4 (HBR)		- 2.70 Gbps/Lane
+		break;
+	case 3240000:
+		link_rate = LINK_RATE_RBR2;		// Rate_5 (RBR2)	- 3.24 Gbps/Lane
+		break;
+	case 4320000:
+		link_rate = LINK_RATE_RATE_6;	// Rate_6			- 4.32 Gbps/Lane
+		break;
+	case 5400000:
+		link_rate = LINK_RATE_HIGH2;	// Rate_7 (HBR2)	- 5.40 Gbps/Lane
+		break;
+	case 8100000:
+		link_rate = LINK_RATE_HIGH3;	// Rate_8 (HBR3)	- 8.10 Gbps/Lane
+		break;
+	default:
+		link_rate = LINK_RATE_UNKNOWN;
+		break;
+	}
+	return link_rate;
+}
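A useful cross-check on the table above: every DPCD link-rate code is the per-lane rate divided by 0.27 Gbps, i.e. by 270000 kHz, so 1620000 / 270000 = 6 = 0x06 (RBR) and 8100000 / 270000 = 30 = 0x1E (HBR3). The switch is still the right implementation, since plain division would also accept multiples of 270000 kHz that are not defined DP rates (for example 1890000 kHz, which would yield code 7).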
+
 void detect_edp_sink_caps(struct dc_link *link)
 {
-	retrieve_link_cap(link);
+	uint8_t supported_link_rates[16] = {0};
+	uint32_t entry;
+	uint32_t link_rate_in_khz;
+	enum dc_link_rate link_rate = LINK_RATE_UNKNOWN;
 
-	if (link->reported_link_cap.link_rate == LINK_RATE_UNKNOWN)
-		link->reported_link_cap.link_rate = LINK_RATE_HIGH2;
+	retrieve_link_cap(link);
 
+	if (link->dpcd_caps.dpcd_rev.raw >= DPCD_REV_14) {
+		// Read DPCD 00010h - 0001Fh, 16 bytes in one shot
+		core_link_read_dpcd(link, DP_SUPPORTED_LINK_RATES,
+							supported_link_rates, sizeof(supported_link_rates));
+
+		link->dpcd_caps.link_rate_set = 0;
+		for (entry = 0; entry < 16; entry += 2) {
+			// DPCD register reports per-lane link rate = 16-bit link rate capability
+			// value X 200 kHz. Need multiplier to find link rate in kHz.
+			link_rate_in_khz = (supported_link_rates[entry+1] * 0x100 +
+										supported_link_rates[entry]) * 200;
+
+			if (link_rate_in_khz != 0) {
+				link_rate = linkRateInKHzToLinkRateMultiplier(link_rate_in_khz);
+				if (link->reported_link_cap.link_rate < link_rate) {
+					link->reported_link_cap.link_rate = link_rate;
+					link->dpcd_caps.link_rate_set = entry;
+				}
+			}
+		}
+	}
 	link->verified_link_cap = link->reported_link_cap;
 }
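DP_SUPPORTED_LINK_RATES is a 16-byte table of eight little-endian 16-bit entries in units of 200 kHz, and the decode above is easy to check by hand: the byte pair 0x34, 0x9E gives (0x9E * 0x100 + 0x34) * 200 = 40500 * 200 = 8100000 kHz, which maps to LINK_RATE_HIGH3; zero entries are simply skipped. The byte offset of the winning entry is cached in dpcd_caps.link_rate_set, which dpcd_set_link_settings() above writes to DP_LINK_RATE_SET under its 1..8 range check.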
 
@@ -2621,7 +2685,7 @@ bool dc_link_dp_set_test_pattern(
 	memset(&training_pattern, 0, sizeof(training_pattern));
 
 	for (i = 0; i < MAX_PIPES; i++) {
-		if (pipes[i].stream->sink->link == link) {
+		if (pipes[i].stream->link == link) {
 			pipe_ctx = &pipes[i];
 			break;
 		}
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c
index 0065ec7d5330..f7f7515f65f4 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_hwss.c
@@ -70,13 +70,12 @@ void dp_enable_link_phy(
 	 */
 	for (i = 0; i < MAX_PIPES; i++) {
 		if (pipes[i].stream != NULL &&
-			pipes[i].stream->sink != NULL &&
-			pipes[i].stream->sink->link == link) {
+			pipes[i].stream->link == link) {
 			if (pipes[i].clock_source != NULL &&
 					pipes[i].clock_source->id != CLOCK_SOURCE_ID_DP_DTO) {
 				pipes[i].clock_source = dp_cs;
-				pipes[i].stream_res.pix_clk_params.requested_pix_clk =
-						pipes[i].stream->timing.pix_clk_khz;
+				pipes[i].stream_res.pix_clk_params.requested_pix_clk_100hz =
+						pipes[i].stream->timing.pix_clk_100hz;
 				pipes[i].clock_source->funcs->program_pix_clk(
 							pipes[i].clock_source,
 							&pipes[i].stream_res.pix_clk_params,
@@ -120,6 +119,10 @@ bool edp_receiver_ready_T9(struct dc_link *link)
 			break;
 		udelay(100); //Max T9
 	} while (++tries < 50);
+
+	if (link->local_sink->edid_caps.panel_patch.extra_delay_backlight_off > 0)
+		udelay(link->local_sink->edid_caps.panel_patch.extra_delay_backlight_off * 1000);
+
 	return result;
 }
 bool edp_receiver_ready_T7(struct dc_link *link)
@@ -279,10 +282,8 @@ void dp_retrain_link_dp_test(struct dc_link *link,
 	for (i = 0; i < MAX_PIPES; i++) {
 		if (pipes[i].stream != NULL &&
 			!pipes[i].top_pipe &&
-			pipes[i].stream->sink != NULL &&
-			pipes[i].stream->sink->link != NULL &&
-			pipes[i].stream_res.stream_enc != NULL &&
-			pipes[i].stream->sink->link == link) {
+			pipes[i].stream->link != NULL &&
+			pipes[i].stream_res.stream_enc != NULL) {
 			udelay(100);
 
 			pipes[i].stream_res.stream_enc->funcs->dp_blank(
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
index 76137df74a53..349ab8017776 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
@@ -355,8 +355,8 @@ bool resource_are_streams_timing_synchronizable(
 				!= stream2->timing.v_addressable)
 		return false;
 
-	if (stream1->timing.pix_clk_khz
-				!= stream2->timing.pix_clk_khz)
+	if (stream1->timing.pix_clk_100hz
+				!= stream2->timing.pix_clk_100hz)
 		return false;
 
 	if (stream1->clamping.c_depth != stream2->clamping.c_depth)
@@ -1559,7 +1559,7 @@ static struct stream_encoder *find_first_free_match_stream_enc_for_link(
 {
 	int i;
 	int j = -1;
-	struct dc_link *link = stream->sink->link;
+	struct dc_link *link = stream->link;
 
 	for (i = 0; i < pool->stream_enc_count; i++) {
 		if (!res_ctx->is_stream_enc_acquired[i] &&
@@ -1748,7 +1748,7 @@ static struct dc_stream_state *find_pll_sharable_stream(
 		if (resource_are_streams_timing_synchronizable(
 			stream_needs_pll, stream_has_pll)
 			&& !dc_is_dp_signal(stream_has_pll->signal)
-			&& stream_has_pll->sink->link->connector_signal
+			&& stream_has_pll->link->connector_signal
 			!= SIGNAL_TYPE_VIRTUAL)
 			return stream_has_pll;
 
@@ -1759,7 +1759,7 @@ static struct dc_stream_state *find_pll_sharable_stream(
 
 static int get_norm_pix_clk(const struct dc_crtc_timing *timing)
 {
-	uint32_t pix_clk = timing->pix_clk_khz;
+	uint32_t pix_clk = timing->pix_clk_100hz;
 	uint32_t normalized_pix_clk = pix_clk;
 
 	if (timing->pixel_encoding == PIXEL_ENCODING_YCBCR420)
@@ -1791,15 +1791,60 @@ static void calculate_phy_pix_clks(struct dc_stream_state *stream)
 	/* update actual pixel clock on all streams */
 	if (dc_is_hdmi_signal(stream->signal))
 		stream->phy_pix_clk = get_norm_pix_clk(
-			&stream->timing);
+			&stream->timing) / 10;
 	else
 		stream->phy_pix_clk =
-			stream->timing.pix_clk_khz;
+			stream->timing.pix_clk_100hz / 10;
 
 	if (stream->timing.timing_3d_format == TIMING_3D_FORMAT_HW_FRAME_PACKING)
 		stream->phy_pix_clk *= 2;
 }
 
+static int acquire_resource_from_hw_enabled_state(
+		struct resource_context *res_ctx,
+		const struct resource_pool *pool,
+		struct dc_stream_state *stream)
+{
+	struct dc_link *link = stream->link;
+	unsigned int inst;
+
+	/* Check for enabled DIG to identify enabled display */
+	if (!link->link_enc->funcs->is_dig_enabled(link->link_enc))
+		return -1;
+
+	/* Check which front end is used by this encoder.
+	 * Note the inst is 1-indexed, where 0 is undefined.
+	 * Note that DIG_FE can source from a different OTG, but our
+	 * current implementation always maps 1-to-1, so this code makes
+	 * the same assumption and doesn't check the OTG source.
+	 */
+	inst = link->link_enc->funcs->get_dig_frontend(link->link_enc) - 1;
+
+	/* Instance should be within the range of the pool */
+	if (inst >= pool->pipe_count)
+		return -1;
+
+	if (!res_ctx->pipe_ctx[inst].stream) {
+		struct pipe_ctx *pipe_ctx = &res_ctx->pipe_ctx[inst];
+
+		pipe_ctx->stream_res.tg = pool->timing_generators[inst];
+		pipe_ctx->plane_res.mi = pool->mis[inst];
+		pipe_ctx->plane_res.hubp = pool->hubps[inst];
+		pipe_ctx->plane_res.ipp = pool->ipps[inst];
+		pipe_ctx->plane_res.xfm = pool->transforms[inst];
+		pipe_ctx->plane_res.dpp = pool->dpps[inst];
+		pipe_ctx->stream_res.opp = pool->opps[inst];
+		if (pool->dpps[inst])
+			pipe_ctx->plane_res.mpcc_inst = pool->dpps[inst]->inst;
+		pipe_ctx->pipe_idx = inst;
+
+		pipe_ctx->stream = stream;
+		return inst;
+	}
+
+	return -1;
+}
+
 enum dc_status resource_map_pool_resources(
 		const struct dc  *dc,
 		struct dc_state *context,
@@ -1824,8 +1869,15 @@ enum dc_status resource_map_pool_resources(
 
 	calculate_phy_pix_clks(stream);
 
-	/* acquire new resources */
-	pipe_idx = acquire_first_free_pipe(&context->res_ctx, pool, stream);
+	if (stream->apply_seamless_boot_optimization)
+		pipe_idx = acquire_resource_from_hw_enabled_state(
+				&context->res_ctx,
+				pool,
+				stream);
+
+	if (pipe_idx < 0)
+		/* acquire new resources */
+		pipe_idx = acquire_first_free_pipe(&context->res_ctx, pool, stream);
 
 #ifdef CONFIG_DRM_AMD_DC_DCN1_0
 	if (pipe_idx < 0)
@@ -1842,7 +1894,7 @@ enum dc_status resource_map_pool_resources(
 			&context->res_ctx, pool, stream);
 
 	if (!pipe_ctx->stream_res.stream_enc)
-		return DC_NO_STREAM_ENG_RESOURCE;
+		return DC_NO_STREAM_ENC_RESOURCE;
 
 	update_stream_engine_usage(
 		&context->res_ctx, pool,
@@ -1850,7 +1902,7 @@ enum dc_status resource_map_pool_resources(
 		true);
 
 	/* TODO: Add check if ASIC support and EDID audio */
-	if (!stream->sink->converter_disable_audio &&
+	if (!stream->converter_disable_audio &&
 	    dc_is_audio_capable_signal(pipe_ctx->stream->signal) &&
 	    stream->audio_info.mode_count) {
 		pipe_ctx->stream_res.audio = find_first_free_audio(
@@ -2112,7 +2164,7 @@ static void set_avi_info_frame(
 	itc = true;
 	itc_value = 1;
 
-	support = stream->sink->edid_caps.content_support;
+	support = stream->content_support;
 
 	if (itc) {
 		if (!support.bits.valid_content_type) {
@@ -2151,8 +2203,8 @@ static void set_avi_info_frame(
 
 	/* TODO : We should handle YCC quantization */
 	/* but we do not have matrix calculation */
-	if (stream->sink->edid_caps.qs_bit == 1 &&
-			stream->sink->edid_caps.qy_bit == 1) {
+	if (stream->qs_bit == 1 &&
+			stream->qy_bit == 1) {
 		if (color_space == COLOR_SPACE_SRGB ||
 			color_space == COLOR_SPACE_2020_RGB_FULLRANGE) {
 			hdmi_info.bits.Q0_Q1   = RGB_QUANTIZATION_FULL_RANGE;
@@ -2596,7 +2648,7 @@ void resource_build_bit_depth_reduction_params(struct dc_stream_state *stream,
 enum dc_status dc_validate_stream(struct dc *dc, struct dc_stream_state *stream)
 {
 	struct dc  *core_dc = dc;
-	struct dc_link *link = stream->sink->link;
+	struct dc_link *link = stream->link;
 	struct timing_generator *tg = core_dc->res_pool->timing_generators[0];
 	enum dc_status res = DC_OK;
 
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
index 66e5c4623a49..996298c35f42 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
@@ -35,20 +35,17 @@
 /*******************************************************************************
  * Private functions
  ******************************************************************************/
-void update_stream_signal(struct dc_stream_state *stream)
+void update_stream_signal(struct dc_stream_state *stream, struct dc_sink *sink)
 {
-
-	struct dc_sink *dc_sink = stream->sink;
-
-	if (dc_sink->sink_signal == SIGNAL_TYPE_NONE)
-		stream->signal = stream->sink->link->connector_signal;
+	if (sink->sink_signal == SIGNAL_TYPE_NONE)
+		stream->signal = stream->link->connector_signal;
 	else
-		stream->signal = dc_sink->sink_signal;
+		stream->signal = sink->sink_signal;
 
 	if (dc_is_dvi_signal(stream->signal)) {
 		if (stream->ctx->dc->caps.dual_link_dvi &&
-		    stream->timing.pix_clk_khz > TMDS_MAX_PIXEL_CLOCK &&
-		    stream->sink->sink_signal != SIGNAL_TYPE_DVI_SINGLE_LINK)
+		    (stream->timing.pix_clk_100hz / 10) > TMDS_MAX_PIXEL_CLOCK &&
+		    sink->sink_signal != SIGNAL_TYPE_DVI_SINGLE_LINK)
 			stream->signal = SIGNAL_TYPE_DVI_DUAL_LINK;
 		else
 			stream->signal = SIGNAL_TYPE_DVI_SINGLE_LINK;
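The dual-link decision reads naturally in the new units: a 330 MHz dual-link-DVI mode is stored as pix_clk_100hz = 3300000, and 3300000 / 10 = 330000 kHz exceeds TMDS_MAX_PIXEL_CLOCK (assuming the usual 165000 kHz single-link TMDS limit), so the signal is promoted to SIGNAL_TYPE_DVI_DUAL_LINK whenever the sink does not insist on single-link.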
@@ -61,10 +58,15 @@ static void construct(struct dc_stream_state *stream,
 	uint32_t i = 0;
 
 	stream->sink = dc_sink_data;
-	stream->ctx = stream->sink->ctx;
-
 	dc_sink_retain(dc_sink_data);
 
+	stream->ctx = dc_sink_data->ctx;
+	stream->link = dc_sink_data->link;
+	stream->sink_patches = dc_sink_data->edid_caps.panel_patch;
+	stream->converter_disable_audio = dc_sink_data->converter_disable_audio;
+	stream->qs_bit = dc_sink_data->edid_caps.qs_bit;
+	stream->qy_bit = dc_sink_data->edid_caps.qy_bit;
+
 	/* Copy audio modes */
 	/* TODO - Remove this translation */
 	for (i = 0; i < (dc_sink_data->edid_caps.audio_mode_count); i++)
@@ -100,11 +102,14 @@ static void construct(struct dc_stream_state *stream,
 	/* EDID CAP translation for HDMI 2.0 */
 	stream->timing.flags.LTE_340MCSC_SCRAMBLE = dc_sink_data->edid_caps.lte_340mcsc_scramble;
 
-	update_stream_signal(stream);
+	update_stream_signal(stream, dc_sink_data);
 
 	stream->out_transfer_func = dc_create_transfer_func();
 	stream->out_transfer_func->type = TF_TYPE_BYPASS;
 	stream->out_transfer_func->ctx = stream->ctx;
+
+	stream->stream_id = stream->ctx->dc_stream_id_count;
+	stream->ctx->dc_stream_id_count++;
 }
 
 static void destruct(struct dc_stream_state *stream)
@@ -155,21 +160,43 @@ struct dc_stream_state *dc_create_stream_for_sink(
 	return stream;
 }
 
-struct dc_stream_status *dc_stream_get_status(
+/**
+ * dc_stream_get_status_from_state - Get stream status from given dc state
+ * @state: DC state to find the stream status in
+ * @stream: The stream to get the stream status for
+ *
+ * The given stream is expected to exist in the given dc state. Otherwise, NULL
+ * will be returned.
+ */
+struct dc_stream_status *dc_stream_get_status_from_state(
+	struct dc_state *state,
 	struct dc_stream_state *stream)
 {
 	uint8_t i;
-	struct dc  *dc = stream->ctx->dc;
 
-	for (i = 0; i < dc->current_state->stream_count; i++) {
-		if (stream == dc->current_state->streams[i])
-			return &dc->current_state->stream_status[i];
+	for (i = 0; i < state->stream_count; i++) {
+		if (stream == state->streams[i])
+			return &state->stream_status[i];
 	}
 
 	return NULL;
 }
 
 /**
+ * dc_stream_get_status() - Get current stream status of the given stream state
+ * @stream: The stream to get the stream status for.
+ *
+ * The given stream is expected to exist in dc->current_state. Otherwise, NULL
+ * will be returned.
+ */
+struct dc_stream_status *dc_stream_get_status(
+	struct dc_stream_state *stream)
+{
+	struct dc *dc = stream->ctx->dc;
+	return dc_stream_get_status_from_state(dc->current_state, stream);
+}
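A short usage sketch for the split accessors (old_state and the log line are illustrative):

	/* Sketch: look in the current state first, then a saved one. */
	struct dc_stream_status *status = dc_stream_get_status(stream);

	if (!status)
		status = dc_stream_get_status_from_state(old_state, stream);
	if (status)
		DC_LOG_DC("stream %u: %d plane(s)\n",
				stream->stream_id, status->plane_count);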
+
+/**
  * dc_stream_set_cursor_attributes() - Update cursor attributes and set cursor surface address
  */
 bool dc_stream_set_cursor_attributes(
@@ -334,16 +361,12 @@ void dc_stream_log(const struct dc *dc, const struct dc_stream_state *stream)
 			stream->output_color_space);
 	DC_LOG_DC(
 			"\tpix_clk_khz: %d, h_total: %d, v_total: %d, pixelencoder:%d, displaycolorDepth:%d\n",
-			stream->timing.pix_clk_khz,
+			stream->timing.pix_clk_100hz / 10,
 			stream->timing.h_total,
 			stream->timing.v_total,
 			stream->timing.pixel_encoding,
 			stream->timing.display_color_depth);
 	DC_LOG_DC(
-			"\tsink name: %s, serial: %d\n",
-			stream->sink->edid_caps.display_name,
-			stream->sink->edid_caps.serial_number);
-	DC_LOG_DC(
 			"\tlink: %d\n",
-			stream->sink->link->link_index);
+			stream->link->link_index);
 }
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_surface.c b/drivers/gpu/drm/amd/display/dc/core/dc_surface.c
index c60c9b4c3075..ee6bd50f60b8 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_surface.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_surface.c
@@ -40,11 +40,14 @@ static void construct(struct dc_context *ctx, struct dc_plane_state *plane_state
 	plane_state->ctx = ctx;
 
 	plane_state->gamma_correction = dc_create_gamma();
-	plane_state->gamma_correction->is_identity = true;
+	if (plane_state->gamma_correction != NULL)
+		plane_state->gamma_correction->is_identity = true;
 
 	plane_state->in_transfer_func = dc_create_transfer_func();
-	plane_state->in_transfer_func->type = TF_TYPE_BYPASS;
-	plane_state->in_transfer_func->ctx = ctx;
+	if (plane_state->in_transfer_func != NULL) {
+		plane_state->in_transfer_func->type = TF_TYPE_BYPASS;
+		plane_state->in_transfer_func->ctx = ctx;
+	}
 }
 
 static void destruct(struct dc_plane_state *plane_state)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_vm_helper.c b/drivers/gpu/drm/amd/display/dc/core/dc_vm_helper.c
new file mode 100644
index 000000000000..6ce87b682a32
--- /dev/null
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_vm_helper.c
@@ -0,0 +1,123 @@
+/*
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: AMD
+ *
+ */
+
+#include "vm_helper.h"
+
+static void mark_vmid_used(struct vm_helper *vm_helper, unsigned int pos, uint8_t hubp_idx)
+{
+	struct vmid_usage *vmids = &vm_helper->hubp_vmid_usage[hubp_idx];
+
+	/* Use a pointer so the update lands in the helper's table rather
+	 * than in a local copy that is thrown away on return. */
+	vmids->vmid_usage[0] = vmids->vmid_usage[1];
+	vmids->vmid_usage[1] = 1 << pos;
+}
+
+static void add_ptb_to_table(struct vm_helper *vm_helper, unsigned int vmid, uint64_t ptb)
+{
+	vm_helper->ptb_assigned_to_vmid[vmid] = ptb;
+	vm_helper->num_vmids_available--;
+}
+
+static void clear_entry_from_vmid_table(struct vm_helper *vm_helper, unsigned int vmid)
+{
+	vm_helper->ptb_assigned_to_vmid[vmid] = 0;
+	vm_helper->num_vmids_available++;
+}
+
+static void evict_vmids(struct vm_helper *vm_helper)
+{
+	int i;
+	uint16_t ord = 0;
+
+	for (i = 0; i < vm_helper->num_vmid; i++)
+		ord |= vm_helper->hubp_vmid_usage[i].vmid_usage[0] | vm_helper->hubp_vmid_usage[i].vmid_usage[1];
+
+	// At this point any positions with value 0 are unused vmids, evict them
+	for (i = 1; i < vm_helper->num_vmid; i++) {
+		if (!(ord & (1u << i)))
+			clear_entry_from_vmid_table(vm_helper, i);
+	}
+}
+
+// A return value of -1 indicates the vmid table is uninitialized or the ptb is not in the table
+static int get_existing_vmid_for_ptb(struct vm_helper *vm_helper, uint64_t ptb)
+{
+	int i;
+
+	for (i = 0; i < vm_helper->num_vmid; i++) {
+		if (vm_helper->ptb_assigned_to_vmid[i] == ptb)
+			return i;
+	}
+
+	return -1;
+}
+
+// Expected to be called only when there's an available vmid
+static int get_next_available_vmid(struct vm_helper *vm_helper)
+{
+	int i;
+
+	for (i = 1; i < vm_helper->num_vmid; i++) {
+		if (vm_helper->ptb_assigned_to_vmid[i] == 0)
+			return i;
+	}
+
+	return -1;
+}
+
+uint8_t get_vmid_for_ptb(struct vm_helper *vm_helper, int64_t ptb, uint8_t hubp_idx)
+{
+	unsigned int vmid = 0;
+	int vmid_exists = -1;
+
+	// Physical address gets vmid 0
+	if (ptb == 0)
+		return 0;
+
+	vmid_exists = get_existing_vmid_for_ptb(vm_helper, ptb);
+
+	if (vmid_exists != -1) {
+		mark_vmid_used(vm_helper, vmid_exists, hubp_idx);
+		vmid = vmid_exists;
+	} else {
+		if (vm_helper->num_vmids_available == 0)
+			evict_vmids(vm_helper);
+
+		vmid = get_next_available_vmid(vm_helper);
+		mark_vmid_used(vm_helper, vmid, hubp_idx);
+		add_ptb_to_table(vm_helper, vmid, ptb);
+	}
+
+	return vmid;
+}
+
+void init_vm_helper(struct vm_helper *vm_helper, unsigned int num_vmid, unsigned int num_hubp)
+{
+	vm_helper->num_vmid = num_vmid;
+	vm_helper->num_hubp = num_hubp;
+	vm_helper->num_vmids_available = num_vmid - 1;
+
+	memset(vm_helper->hubp_vmid_usage, 0, sizeof(vm_helper->hubp_vmid_usage[0]) * MAX_HUBP);
+	memset(vm_helper->ptb_assigned_to_vmid, 0, sizeof(vm_helper->ptb_assigned_to_vmid[0]) * MAX_VMID);
+}
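A minimal usage sketch of the helper (the page-table base value is made up; MAX_VMID and MAX_HUBP are the bounds the init function already memsets against):

	/* Sketch: PTB 0 always yields VMID 0 (physical addressing);
	 * nonzero PTBs are assigned from the remaining pool. */
	struct vm_helper helper;

	init_vm_helper(&helper, MAX_VMID, MAX_HUBP);
	uint8_t vmid = get_vmid_for_ptb(&helper, 0x100000, 0 /* hubp_idx */);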
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h
index 4b5bbb13ce7f..1a7fd6aa77eb 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -39,7 +39,7 @@
 #include "inc/hw/dmcu.h"
 #include "dml/display_mode_lib.h"
 
-#define DC_VER "3.2.08"
+#define DC_VER "3.2.17"
 
 #define MAX_SURFACES 3
 #define MAX_STREAMS 6
@@ -255,6 +255,8 @@ struct dc_debug_options {
 	bool scl_reset_length10;
 	bool hdmi20_disable;
 	bool skip_detection_link_training;
+	unsigned int force_odm_combine; //bit vector based on otg inst
+	unsigned int force_fclk_khz;
 };
 
 struct dc_debug_data {
@@ -263,7 +265,6 @@ struct dc_debug_data {
 	uint32_t auxErrorCount;
 };
 
-
 struct dc_state;
 struct resource_pool;
 struct dce_hwseq;
@@ -339,8 +340,13 @@ struct dc_init_data {
 	uint32_t log_mask;
 };
 
-struct dc *dc_create(const struct dc_init_data *init_params);
+struct dc_callback_init {
+	uint8_t reserved;
+};
 
+struct dc *dc_create(const struct dc_init_data *init_params);
+void dc_init_callbacks(struct dc *dc,
+		const struct dc_callback_init *init_params);
 void dc_destroy(struct dc **dc);
 
 /*******************************************************************************
@@ -440,6 +446,7 @@ union surface_update_flags {
 		uint32_t coeff_reduction_change:1;
 		uint32_t output_tf_change:1;
 		uint32_t pixel_format_change:1;
+		uint32_t plane_size_change:1;
 
 		/* Full updates */
 		uint32_t new_plane:1;
@@ -587,6 +594,10 @@ struct dc_validation_set {
 	uint8_t plane_count;
 };
 
+bool dc_validate_seamless_boot_timing(struct dc *dc,
+				const struct dc_sink *sink,
+				struct dc_crtc_timing *crtc_timing);
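This declaration is the DM-facing half of the seamless-boot plumbing added in dc_resource.c earlier in this patch. A hedged sketch of the intended flow (only the flag and the validator come from this diff; the surrounding code is hypothetical):

	/* Sketch: keep a firmware-lit display up if the HW timing matches. */
	if (dc_validate_seamless_boot_timing(dc, sink, &stream->timing))
		stream->apply_seamless_boot_optimization = true;
	/* resource_map_pool_resources() then tries
	 * acquire_resource_from_hw_enabled_state() before a free pipe. */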
+
 enum dc_status dc_validate_plane(struct dc *dc, const struct dc_plane_state *plane_state);
 
 void get_clock_requirements_for_state(struct dc_state *state, struct AsicStateEx *info);
@@ -652,6 +663,7 @@ struct dpcd_caps {
 	int8_t branch_dev_name[6];
 	int8_t branch_hw_revision;
 	int8_t branch_fw_revision[2];
+	uint8_t link_rate_set;
 
 	bool allow_invalid_MSA_timing_param;
 	bool panel_mode_edp;
@@ -742,6 +754,9 @@ void dc_set_power_state(
 		struct dc *dc,
 		enum dc_acpi_cm_power_state power_state);
 void dc_resume(struct dc *dc);
+unsigned int dc_get_current_backlight_pwm(struct dc *dc);
+unsigned int dc_get_target_backlight_pwm(struct dc *dc);
+
 bool dc_is_dmcu_initialized(struct dc *dc);
 
 #endif /* DC_INTERFACE_H_ */
diff --git a/drivers/gpu/drm/amd/display/dc/dc_bios_types.h b/drivers/gpu/drm/amd/display/dc/dc_bios_types.h
index a8b3cedf9431..78c3b300ec45 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_bios_types.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_bios_types.h
@@ -86,10 +86,6 @@ struct dc_vbios_funcs {
 
 	bool (*is_accelerated_mode)(
 		struct dc_bios *bios);
-	bool (*is_active_display)(
-		struct dc_bios *bios,
-		enum signal_type signal,
-		const struct connector_device_tag_info *device_tag);
 	void (*set_scratch_critical_state)(
 		struct dc_bios *bios,
 		bool state);
@@ -125,10 +121,6 @@ struct dc_vbios_funcs {
 	enum bp_result (*program_crtc_timing)(
 		struct dc_bios *bios,
 		struct bp_hw_crtc_timing_parameters *bp_params);
-
-	enum bp_result (*crtc_source_select)(
-		struct dc_bios *bios,
-		struct bp_crtc_source_select *bp_params);
 	enum bp_result (*program_display_engine_pll)(
 		struct dc_bios *bios,
 		struct bp_pixel_clock_parameters *bp_params);
@@ -145,7 +137,6 @@ struct dc_vbios_funcs {
 };
 
 struct bios_registers {
-	uint32_t BIOS_SCRATCH_0;
 	uint32_t BIOS_SCRATCH_3;
 	uint32_t BIOS_SCRATCH_6;
 };
diff --git a/drivers/gpu/drm/amd/display/dc/dc_dp_types.h b/drivers/gpu/drm/amd/display/dc/dc_dp_types.h
index da93ab43f2d8..d4eab33c453b 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_dp_types.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_dp_types.h
@@ -46,11 +46,14 @@ enum dc_lane_count {
  */
 enum dc_link_rate {
 	LINK_RATE_UNKNOWN = 0,
-	LINK_RATE_LOW = 0x06,
-	LINK_RATE_HIGH = 0x0A,
-	LINK_RATE_RBR2 = 0x0C,
-	LINK_RATE_HIGH2 = 0x14,
-	LINK_RATE_HIGH3 = 0x1E
+	LINK_RATE_LOW = 0x06,		// Rate_1 (RBR)	- 1.62 Gbps/Lane
+	LINK_RATE_RATE_2 = 0x08,	// Rate_2		- 2.16 Gbps/Lane
+	LINK_RATE_RATE_3 = 0x09,	// Rate_3		- 2.43 Gbps/Lane
+	LINK_RATE_HIGH = 0x0A,		// Rate_4 (HBR)	- 2.70 Gbps/Lane
+	LINK_RATE_RBR2 = 0x0C,		// Rate_5 (RBR2)- 3.24 Gbps/Lane
+	LINK_RATE_RATE_6 = 0x10,	// Rate_6		- 4.32 Gbps/Lane
+	LINK_RATE_HIGH2 = 0x14,		// Rate_7 (HBR2)- 5.40 Gbps/Lane
+	LINK_RATE_HIGH3 = 0x1E		// Rate_8 (HBR3)- 8.10 Gbps/Lane
 };
 
 enum dc_link_spread {
diff --git a/drivers/gpu/drm/amd/display/dc/dc_helper.c b/drivers/gpu/drm/amd/display/dc/dc_helper.c
index 4842d2378bbf..597d38393379 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_helper.c
+++ b/drivers/gpu/drm/amd/display/dc/dc_helper.c
@@ -29,31 +29,59 @@
 #include "dm_services.h"
 #include <stdarg.h>
 
+struct dc_reg_value_masks {
+	uint32_t value;
+	uint32_t mask;
+};
+
+struct dc_reg_sequence {
+	uint32_t addr;
+	struct dc_reg_value_masks value_masks;
+};
+
+static inline void set_reg_field_value_masks(
+	struct dc_reg_value_masks *field_value_mask,
+	uint32_t value,
+	uint32_t mask,
+	uint8_t shift)
+{
+	ASSERT(mask != 0);
+
+	field_value_mask->value = (field_value_mask->value & ~mask) | (mask & (value << shift));
+	field_value_mask->mask = field_value_mask->mask | mask;
+}
+
 uint32_t generic_reg_update_ex(const struct dc_context *ctx,
 		uint32_t addr, uint32_t reg_val, int n,
 		uint8_t shift1, uint32_t mask1, uint32_t field_value1,
 		...)
 {
+	struct dc_reg_value_masks field_value_mask = {0};
 	uint32_t shift, mask, field_value;
 	int i = 1;
 
 	va_list ap;
 	va_start(ap, field_value1);
 
-	reg_val = set_reg_field_value_ex(reg_val, field_value1, mask1, shift1);
+	/* gather all bits value/mask getting updated in this register */
+	set_reg_field_value_masks(&field_value_mask,
+			field_value1, mask1, shift1);
 
 	while (i < n) {
 		shift = va_arg(ap, uint32_t);
 		mask = va_arg(ap, uint32_t);
 		field_value = va_arg(ap, uint32_t);
 
-		reg_val = set_reg_field_value_ex(reg_val, field_value, mask, shift);
+		set_reg_field_value_masks(&field_value_mask,
+				field_value, mask, shift);
 		i++;
 	}
-
-	dm_write_reg(ctx, addr, reg_val);
 	va_end(ap);
 
+
+	/* mmio write directly */
+	reg_val = (reg_val & ~field_value_mask.mask) | field_value_mask.value;
+	dm_write_reg(ctx, addr, reg_val);
 	return reg_val;
 }
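The accumulation above is worth a worked example: programming value 0x3 into a field with mask 0x00F0 and shift 4 leaves field_value_mask.value = 0x30 and .mask = 0xF0; a second field with mask 0x000F, shift 0 and value 0x2 extends them to value 0x32, mask 0xFF; the final statement then commits (reg_val & ~0xFF) | 0x32 in a single dm_write_reg(). The register was written exactly once before this change too; the gain is that the update is now expressed as one value/mask pair, which also fits the (currently unused) struct dc_reg_sequence added above, presumably groundwork for batching register writes.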
 
diff --git a/drivers/gpu/drm/amd/display/dc/dc_hw_types.h b/drivers/gpu/drm/amd/display/dc/dc_hw_types.h
index e72fce4eca65..da55d623647a 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_hw_types.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_hw_types.h
@@ -97,6 +97,8 @@ struct dc_plane_address {
 			union large_integer chroma_dcc_const_color;
 		} video_progressive;
 	};
+
+	union large_integer page_table_base;
 };
 
 struct dc_size {
@@ -730,7 +732,7 @@ struct dc_crtc_timing {
 	uint32_t v_front_porch;
 	uint32_t v_sync_width;
 
-	uint32_t pix_clk_khz;
+	uint32_t pix_clk_100hz;
 
 	uint32_t vic;
 	uint32_t hdmi_vic;
diff --git a/drivers/gpu/drm/amd/display/dc/dc_link.h b/drivers/gpu/drm/amd/display/dc/dc_link.h
index b2243e0dad1f..8fc223defed4 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_link.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_link.h
@@ -30,6 +30,7 @@
 #include "grph_object_defs.h"
 
 struct dc_link_status {
+	bool link_active;
 	struct dpcd_caps *dpcd_caps;
 };
 
@@ -110,6 +111,7 @@ struct dc_link {
 	union ddi_channel_mapping ddi_channel_mapping;
 	struct connector_device_tag_info device_tag;
 	struct dpcd_caps dpcd_caps;
+	uint32_t dongle_max_pix_clk;
 	unsigned short chip_caps;
 	unsigned int dpcd_sink_count;
 	enum edp_revision edp_revision;
@@ -124,6 +126,7 @@ struct dc_link {
 	struct dc_link_status link_status;
 
 	struct link_trace link_trace;
+	struct gpio *hpd_gpio;
 };
 
 const struct dc_link_status *dc_link_get_status(const struct dc_link *dc_link);
diff --git a/drivers/gpu/drm/amd/display/dc/dc_stream.h b/drivers/gpu/drm/amd/display/dc/dc_stream.h
index d70c9e1cda3d..5657cb3a2ad3 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_stream.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_stream.h
@@ -32,17 +32,18 @@
 /*******************************************************************************
  * Stream Interfaces
  ******************************************************************************/
+struct timing_sync_info {
+	int group_id;
+	int group_size;
+	bool master;
+};
 
 struct dc_stream_status {
 	int primary_otg_inst;
 	int stream_enc_inst;
 	int plane_count;
+	struct timing_sync_info timing_sync_info;
 	struct dc_plane_state *plane_states[MAX_SURFACE_NUM];
-
-	/*
-	 * link this stream passes through
-	 */
-	struct dc_link *link;
 };
 
 // TODO: References to this need to be removed.
@@ -50,8 +51,30 @@ struct freesync_context {
 	bool dummy;
 };
 
+enum vertical_interrupt_ref_point {
+	START_V_UPDATE = 0,
+	START_V_SYNC,
+	INVALID_POINT
+
+	//For now, only v_update interrupt is used.
+	//START_V_BLANK,
+	//START_V_ACTIVE
+};
+
+struct periodic_interrupt_config {
+	enum vertical_interrupt_ref_point ref_point;
+	int lines_offset;
+};
+
+
 struct dc_stream_state {
+	// sink is deprecated; new code should not reference
+	// this pointer
 	struct dc_sink *sink;
+
+	struct dc_link *link;
+	struct dc_panel_patch sink_patches;
+	union display_content_support content_support;
 	struct dc_crtc_timing timing;
 	struct dc_crtc_timing_adjust adjust;
 	struct dc_info_packet vrr_infopacket;
@@ -80,8 +103,9 @@ struct dc_stream_state {
 	enum view_3d_format view_format;
 
 	bool ignore_msa_timing_param;
-
-	unsigned long long periodic_fn_vsync_delta;
+	bool converter_disable_audio;
+	uint8_t qs_bit;
+	uint8_t qy_bit;
 
 	/* TODO: custom INFO packets */
 	/* TODO: ABM info (DMCU) */
@@ -92,6 +116,9 @@ struct dc_stream_state {
 	/* DMCU info */
 	unsigned int abm_level;
 
+	struct periodic_interrupt_config periodic_interrupt0;
+	struct periodic_interrupt_config periodic_interrupt1;
+
 	/* from core_stream struct */
 	struct dc_context *ctx;
 
@@ -102,7 +129,8 @@ struct dc_stream_state {
 	int phy_pix_clk;
 	enum signal_type signal;
 	bool dpms_off;
-	bool apply_edp_fast_boot_optimization;
+
+	void *dm_stream_context;
 
 	struct dc_cursor_attributes cursor_attributes;
 	struct dc_cursor_position cursor_position;
@@ -116,6 +144,21 @@ struct dc_stream_state {
 	/* Computed state bits */
 	bool mode_changed : 1;
 
+	/* Output from DC when stream state is committed or altered
+	 * DC may only access these values during:
+	 * dc_commit_state, dc_commit_state_no_check, dc_commit_streams
+	 * values may not change outside of those calls
+	 */
+	struct {
+		// For interrupt management, some hardware instance
+		// offsets need to be exposed to DM
+		uint8_t otg_offset;
+	} out;
+
+	bool apply_edp_fast_boot_optimization;
+	bool apply_seamless_boot_optimization;
+
+	uint32_t stream_id;
 };
 
 struct dc_stream_update {
@@ -125,7 +168,9 @@ struct dc_stream_update {
 	struct dc_info_packet *hdr_static_metadata;
 	unsigned int *abm_level;
 
-	unsigned long long *periodic_fn_vsync_delta;
+	struct periodic_interrupt_config *periodic_interrupt0;
+	struct periodic_interrupt_config *periodic_interrupt1;
+
 	struct dc_crtc_timing_adjust *adjust;
 	struct dc_info_packet *vrr_infopacket;
 	struct dc_info_packet *vsc_infopacket;
@@ -162,7 +207,6 @@ void dc_commit_updates_for_stream(struct dc *dc,
 		int surface_count,
 		struct dc_stream_state *stream,
 		struct dc_stream_update *stream_update,
-		struct dc_plane_state **plane_states,
 		struct dc_state *state);
 /*
  * Log the current stream state.
@@ -255,11 +299,14 @@ enum surface_update_type dc_check_update_surfaces_for_stream(
  */
 struct dc_stream_state *dc_create_stream_for_sink(struct dc_sink *dc_sink);
 
-void update_stream_signal(struct dc_stream_state *stream);
+void update_stream_signal(struct dc_stream_state *stream, struct dc_sink *sink);
 
 void dc_stream_retain(struct dc_stream_state *dc_stream);
 void dc_stream_release(struct dc_stream_state *dc_stream);
 
+struct dc_stream_status *dc_stream_get_status_from_state(
+	struct dc_state *state,
+	struct dc_stream_state *stream);
 struct dc_stream_status *dc_stream_get_status(
 	struct dc_stream_state *dc_stream);
 
diff --git a/drivers/gpu/drm/amd/display/dc/dc_types.h b/drivers/gpu/drm/amd/display/dc/dc_types.h
index 0b20ae23f169..da2009a108cf 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_types.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_types.h
@@ -97,8 +97,8 @@ struct dc_context {
 	struct dc_bios *dc_bios;
 	bool created_bios;
 	struct gpio_service *gpio_service;
-	struct i2caux *i2caux;
 	uint32_t dc_sink_id_count;
+	uint32_t dc_stream_id_count;
 	uint64_t fbc_gpu_addr;
 };
 
@@ -201,6 +201,7 @@ union display_content_support {
 struct dc_panel_patch {
 	unsigned int dppowerup_delay;
 	unsigned int extra_t12_ms;
+	unsigned int extra_delay_backlight_off;
 };
 
 struct dc_edid_caps {
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_abm.c b/drivers/gpu/drm/amd/display/dc/dce/dce_abm.c
index 2a342eae80fd..da96229db53a 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_abm.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_abm.c
@@ -53,6 +53,27 @@
 
 #define MCP_DISABLE_ABM_IMMEDIATELY 255
 
+static bool dce_abm_set_pipe(struct abm *abm, uint32_t controller_id)
+{
+	struct dce_abm *abm_dce = TO_DCE_ABM(abm);
+	uint32_t rampingBoundary = 0xFFFF;
+
+	REG_WAIT(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 0,
+			1, 80000);
+
+	/* set ramping boundary */
+	REG_WRITE(MASTER_COMM_DATA_REG1, rampingBoundary);
+
+	/* setDMCUParam_Pipe */
+	REG_UPDATE_2(MASTER_COMM_CMD_REG,
+			MASTER_COMM_CMD_REG_BYTE0, MCP_ABM_PIPE_SET,
+			MASTER_COMM_CMD_REG_BYTE1, controller_id);
+
+	/* notifyDMCUMsg */
+	REG_UPDATE(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 1);
+
+	return true;
+}
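Factoring dce_abm_set_pipe() out lets the backlight path and the immediate-disable path share one DMCU mailbox sequence: wait for MASTER_COMM_INTERRUPT to read back 0 (firmware has consumed the previous command), stage the ramping boundary in MASTER_COMM_DATA_REG1, write the opcode and argument bytes into MASTER_COMM_CMD_REG, then raise MASTER_COMM_INTERRUPT as the doorbell (macro semantics assumed from their names). As the hunks below show, both users now funnel through it:

	/* Sketch: the two call sites unified by the helper. */
	dce_abm_set_pipe(abm, controller_id);                  /* backlight path */
	dce_abm_set_pipe(abm, MCP_DISABLE_ABM_IMMEDIATELY);    /* ABM off, id 255 */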
 
 static unsigned int calculate_16_bit_backlight_from_pwm(struct dce_abm *abm_dce)
 {
@@ -175,7 +196,6 @@ static void dmcu_set_backlight_level(
 	uint32_t controller_id)
 {
 	unsigned int backlight_8_bit = 0;
-	uint32_t rampingBoundary = 0xFFFF;
 	uint32_t s2;
 
 	if (backlight_pwm_u16_16 & 0x10000)
@@ -185,16 +205,7 @@ static void dmcu_set_backlight_level(
 		// Take MSB of fractional part since backlight is not max
 		backlight_8_bit = (backlight_pwm_u16_16 >> 8) & 0xFF;
 
-	/* set ramping boundary */
-	REG_WRITE(MASTER_COMM_DATA_REG1, rampingBoundary);
-
-	/* setDMCUParam_Pipe */
-	REG_UPDATE_2(MASTER_COMM_CMD_REG,
-			MASTER_COMM_CMD_REG_BYTE0, MCP_ABM_PIPE_SET,
-			MASTER_COMM_CMD_REG_BYTE1, controller_id);
-
-	/* notifyDMCUMsg */
-	REG_UPDATE(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 1);
+	dce_abm_set_pipe(&abm_dce->base, controller_id);
 
 	/* waitDMCUReadyForCmd */
 	REG_WAIT(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT,
@@ -309,16 +320,7 @@ static bool dce_abm_immediate_disable(struct abm *abm)
 {
 	struct dce_abm *abm_dce = TO_DCE_ABM(abm);
 
-	REG_WAIT(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 0,
-			1, 80000);
-
-	/* setDMCUParam_ABMLevel */
-	REG_UPDATE_2(MASTER_COMM_CMD_REG,
-			MASTER_COMM_CMD_REG_BYTE0, MCP_ABM_LEVEL_SET,
-			MASTER_COMM_CMD_REG_BYTE2, MCP_DISABLE_ABM_IMMEDIATELY);
-
-	/* notifyDMCUMsg */
-	REG_UPDATE(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 1);
+	dce_abm_set_pipe(abm, MCP_DISABLE_ABM_IMMEDIATELY);
 
 	abm->stored_backlight_registers.BL_PWM_CNTL =
 		REG_READ(BL_PWM_CNTL);
@@ -419,6 +421,7 @@ static const struct abm_funcs dce_funcs = {
 	.abm_init = dce_abm_init,
 	.set_abm_level = dce_abm_set_level,
 	.init_backlight = dce_abm_init_backlight,
+	.set_pipe = dce_abm_set_pipe,
 	.set_backlight_level_pwm = dce_abm_set_backlight_level_pwm,
 	.get_current_backlight = dce_abm_get_current_backlight,
 	.get_target_backlight = dce_abm_get_target_backlight,
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
index aaeb7faac0c4..4febf4ef7240 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
@@ -24,6 +24,7 @@
  */
 
 #include "dm_services.h"
+#include "core_types.h"
 #include "dce_aux.h"
 #include "dce/dce_11_0_sh_mask.h"
 
@@ -41,17 +42,17 @@
 	container_of((ptr), struct aux_engine_dce110, base)
 
 #define FROM_ENGINE(ptr) \
-	FROM_AUX_ENGINE(container_of((ptr), struct aux_engine, base))
+	FROM_AUX_ENGINE(container_of((ptr), struct dce_aux, base))
 
 #define FROM_AUX_ENGINE_ENGINE(ptr) \
-	container_of((ptr), struct aux_engine, base)
+	container_of((ptr), struct dce_aux, base)
 enum {
 	AUX_INVALID_REPLY_RETRY_COUNTER = 1,
 	AUX_TIMED_OUT_RETRY_COUNTER = 2,
 	AUX_DEFER_RETRY_COUNTER = 6
 };
 static void release_engine(
-	struct aux_engine *engine)
+	struct dce_aux *engine)
 {
 	struct aux_engine_dce110 *aux110 = FROM_AUX_ENGINE(engine);
 
@@ -66,7 +67,7 @@ static void release_engine(
 #define DMCU_CAN_ACCESS_AUX 2
 
 static bool is_engine_available(
-	struct aux_engine *engine)
+	struct dce_aux *engine)
 {
 	struct aux_engine_dce110 *aux110 = FROM_AUX_ENGINE(engine);
 
@@ -79,7 +80,7 @@ static bool is_engine_available(
 	return (field != DMCU_CAN_ACCESS_AUX);
 }
 static bool acquire_engine(
-	struct aux_engine *engine)
+	struct dce_aux *engine)
 {
 	struct aux_engine_dce110 *aux110 = FROM_AUX_ENGINE(engine);
 
@@ -155,7 +156,7 @@ static bool acquire_engine(
 	(0xFF & (address))
 
 static void submit_channel_request(
-	struct aux_engine *engine,
+	struct dce_aux *engine,
 	struct aux_request_transaction_data *request)
 {
 	struct aux_engine_dce110 *aux110 = FROM_AUX_ENGINE(engine);
@@ -247,7 +248,7 @@ static void submit_channel_request(
 	REG_UPDATE(AUX_SW_CONTROL, AUX_SW_GO, 1);
 }
 
-static int read_channel_reply(struct aux_engine *engine, uint32_t size,
+static int read_channel_reply(struct dce_aux *engine, uint32_t size,
 			      uint8_t *buffer, uint8_t *reply_result,
 			      uint32_t *sw_status)
 {
@@ -273,7 +274,8 @@ static int read_channel_reply(struct aux_engine *engine, uint32_t size,
 
 	REG_GET(AUX_SW_DATA, AUX_SW_DATA, &reply_result_32);
 	reply_result_32 = reply_result_32 >> 4;
-	*reply_result = (uint8_t)reply_result_32;
+	if (reply_result != NULL)
+		*reply_result = (uint8_t)reply_result_32;
 
 	if (reply_result_32 == 0) { /* ACK */
 		uint32_t i = 0;
@@ -299,61 +301,8 @@ static int read_channel_reply(struct aux_engine *engine, uint32_t size,
 	return 0;
 }
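The NULL tolerance added to read_channel_reply() above lets callers that only need the payload length skip reply decoding. A caller-side sketch (illustrative):

	/* Sketch: fetch reply bytes without the reply code. */
	uint32_t sw_status;
	int len = read_channel_reply(engine, size, buffer,
			NULL /* reply_result not needed */, &sw_status);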
 
-static void process_channel_reply(
-	struct aux_engine *engine,
-	struct aux_reply_transaction_data *reply)
-{
-	int bytes_replied;
-	uint8_t reply_result;
-	uint32_t sw_status;
-
-	bytes_replied = read_channel_reply(engine, reply->length, reply->data,
-					   &reply_result, &sw_status);
-
-	/* in case HPD is LOW, exit AUX transaction */
-	if ((sw_status & AUX_SW_STATUS__AUX_SW_HPD_DISCON_MASK)) {
-		reply->status = AUX_TRANSACTION_REPLY_HPD_DISCON;
-		return;
-	}
-
-	if (bytes_replied < 0) {
-		/* Need to handle an error case...
-		 * Hopefully, upper layer function won't call this function if
-		 * the number of bytes in the reply was 0, because there was
-		 * surely an error that was asserted that should have been
-		 * handled for hot plug case, this could happens
-		 */
-		if (!(sw_status & AUX_SW_STATUS__AUX_SW_HPD_DISCON_MASK)) {
-			reply->status = AUX_TRANSACTION_REPLY_INVALID;
-			ASSERT_CRITICAL(false);
-			return;
-		}
-	} else {
-
-		switch (reply_result) {
-		case 0: /* ACK */
-			reply->status = AUX_TRANSACTION_REPLY_AUX_ACK;
-		break;
-		case 1: /* NACK */
-			reply->status = AUX_TRANSACTION_REPLY_AUX_NACK;
-		break;
-		case 2: /* DEFER */
-			reply->status = AUX_TRANSACTION_REPLY_AUX_DEFER;
-		break;
-		case 4: /* AUX ACK / I2C NACK */
-			reply->status = AUX_TRANSACTION_REPLY_I2C_NACK;
-		break;
-		case 8: /* AUX ACK / I2C DEFER */
-			reply->status = AUX_TRANSACTION_REPLY_I2C_DEFER;
-		break;
-		default:
-			reply->status = AUX_TRANSACTION_REPLY_INVALID;
-		}
-	}
-}
-
 static enum aux_channel_operation_result get_channel_status(
-	struct aux_engine *engine,
+	struct dce_aux *engine,
 	uint8_t *returned_bytes)
 {
 	struct aux_engine_dce110 *aux110 = FROM_AUX_ENGINE(engine);
@@ -414,469 +363,22 @@ static enum aux_channel_operation_result get_channel_status(
 		return AUX_CHANNEL_OPERATION_FAILED_TIMEOUT;
 	}
 }
-static void process_read_reply(
-	struct aux_engine *engine,
-	struct read_command_context *ctx)
-{
-	engine->funcs->process_channel_reply(engine, &ctx->reply);
-
-	switch (ctx->reply.status) {
-	case AUX_TRANSACTION_REPLY_AUX_ACK:
-		ctx->defer_retry_aux = 0;
-		if (ctx->returned_byte > ctx->current_read_length) {
-			ctx->status =
-				I2CAUX_TRANSACTION_STATUS_FAILED_PROTOCOL_ERROR;
-			ctx->operation_succeeded = false;
-		} else if (ctx->returned_byte < ctx->current_read_length) {
-			ctx->current_read_length -= ctx->returned_byte;
-
-			ctx->offset += ctx->returned_byte;
-
-			++ctx->invalid_reply_retry_aux_on_ack;
-
-			if (ctx->invalid_reply_retry_aux_on_ack >
-				AUX_INVALID_REPLY_RETRY_COUNTER) {
-				ctx->status =
-				I2CAUX_TRANSACTION_STATUS_FAILED_PROTOCOL_ERROR;
-				ctx->operation_succeeded = false;
-			}
-		} else {
-			ctx->status = I2CAUX_TRANSACTION_STATUS_SUCCEEDED;
-			ctx->transaction_complete = true;
-			ctx->operation_succeeded = true;
-		}
-	break;
-	case AUX_TRANSACTION_REPLY_AUX_NACK:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_NACK;
-		ctx->operation_succeeded = false;
-	break;
-	case AUX_TRANSACTION_REPLY_AUX_DEFER:
-		++ctx->defer_retry_aux;
-
-		if (ctx->defer_retry_aux > AUX_DEFER_RETRY_COUNTER) {
-			ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-			ctx->operation_succeeded = false;
-		}
-	break;
-	case AUX_TRANSACTION_REPLY_I2C_DEFER:
-		ctx->defer_retry_aux = 0;
-
-		++ctx->defer_retry_i2c;
-
-		if (ctx->defer_retry_i2c > AUX_DEFER_RETRY_COUNTER) {
-			ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-			ctx->operation_succeeded = false;
-		}
-	break;
-	case AUX_TRANSACTION_REPLY_HPD_DISCON:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_HPD_DISCON;
-		ctx->operation_succeeded = false;
-	break;
-	default:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_UNKNOWN;
-		ctx->operation_succeeded = false;
-	}
-}
-static void process_read_request(
-	struct aux_engine *engine,
-	struct read_command_context *ctx)
-{
-	enum aux_channel_operation_result operation_result;
 
-	engine->funcs->submit_channel_request(engine, &ctx->request);
-
-	operation_result = engine->funcs->get_channel_status(
-		engine, &ctx->returned_byte);
-
-	switch (operation_result) {
-	case AUX_CHANNEL_OPERATION_SUCCEEDED:
-		if (ctx->returned_byte > ctx->current_read_length) {
-			ctx->status =
-				I2CAUX_TRANSACTION_STATUS_FAILED_PROTOCOL_ERROR;
-			ctx->operation_succeeded = false;
-		} else {
-			ctx->timed_out_retry_aux = 0;
-			ctx->invalid_reply_retry_aux = 0;
-
-			ctx->reply.length = ctx->returned_byte;
-			ctx->reply.data = ctx->buffer;
-
-			process_read_reply(engine, ctx);
-		}
-	break;
-	case AUX_CHANNEL_OPERATION_FAILED_INVALID_REPLY:
-		++ctx->invalid_reply_retry_aux;
-
-		if (ctx->invalid_reply_retry_aux >
-			AUX_INVALID_REPLY_RETRY_COUNTER) {
-			ctx->status =
-				I2CAUX_TRANSACTION_STATUS_FAILED_PROTOCOL_ERROR;
-			ctx->operation_succeeded = false;
-		} else
-			udelay(400);
-	break;
-	case AUX_CHANNEL_OPERATION_FAILED_TIMEOUT:
-		++ctx->timed_out_retry_aux;
-
-		if (ctx->timed_out_retry_aux > AUX_TIMED_OUT_RETRY_COUNTER) {
-			ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-			ctx->operation_succeeded = false;
-		} else {
-			/* DP 1.2a, table 2-58:
-			 * "S3: AUX Request CMD PENDING:
-			 * retry 3 times, with 400usec wait on each"
-			 * The HW timeout is set to 550usec,
-			 * so we should not wait here
-			 */
-		}
-	break;
-	case AUX_CHANNEL_OPERATION_FAILED_HPD_DISCON:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_HPD_DISCON;
-		ctx->operation_succeeded = false;
-	break;
-	default:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_UNKNOWN;
-		ctx->operation_succeeded = false;
-	}
-}
-static bool read_command(
-	struct aux_engine *engine,
-	struct i2caux_transaction_request *request,
-	bool middle_of_transaction)
-{
-	struct read_command_context ctx;
-
-	ctx.buffer = request->payload.data;
-	ctx.current_read_length = request->payload.length;
-	ctx.offset = 0;
-	ctx.timed_out_retry_aux = 0;
-	ctx.invalid_reply_retry_aux = 0;
-	ctx.defer_retry_aux = 0;
-	ctx.defer_retry_i2c = 0;
-	ctx.invalid_reply_retry_aux_on_ack = 0;
-	ctx.transaction_complete = false;
-	ctx.operation_succeeded = true;
-
-	if (request->payload.address_space ==
-		I2CAUX_TRANSACTION_ADDRESS_SPACE_DPCD) {
-		ctx.request.type = AUX_TRANSACTION_TYPE_DP;
-		ctx.request.action = I2CAUX_TRANSACTION_ACTION_DP_READ;
-		ctx.request.address = request->payload.address;
-	} else if (request->payload.address_space ==
-		I2CAUX_TRANSACTION_ADDRESS_SPACE_I2C) {
-		ctx.request.type = AUX_TRANSACTION_TYPE_I2C;
-		ctx.request.action = middle_of_transaction ?
-			I2CAUX_TRANSACTION_ACTION_I2C_READ_MOT :
-			I2CAUX_TRANSACTION_ACTION_I2C_READ;
-		ctx.request.address = request->payload.address >> 1;
-	} else {
-		/* in DAL2, there was no return in such case */
-		BREAK_TO_DEBUGGER();
-		return false;
-	}
-
-	ctx.request.delay = 0;
-
-	do {
-		memset(ctx.buffer + ctx.offset, 0, ctx.current_read_length);
-
-		ctx.request.data = ctx.buffer + ctx.offset;
-		ctx.request.length = ctx.current_read_length;
-
-		process_read_request(engine, &ctx);
-
-		request->status = ctx.status;
-
-		if (ctx.operation_succeeded && !ctx.transaction_complete)
-			if (ctx.request.type == AUX_TRANSACTION_TYPE_I2C)
-				msleep(engine->delay);
-	} while (ctx.operation_succeeded && !ctx.transaction_complete);
-
-	if (request->payload.address_space ==
-		I2CAUX_TRANSACTION_ADDRESS_SPACE_DPCD) {
-		DC_LOG_I2C_AUX("READ: addr:0x%x  value:0x%x Result:%d",
-				request->payload.address,
-				request->payload.data[0],
-				ctx.operation_succeeded);
-	}
-
-	return ctx.operation_succeeded;
-}
-
-static void process_write_reply(
-	struct aux_engine *engine,
-	struct write_command_context *ctx)
-{
-	engine->funcs->process_channel_reply(engine, &ctx->reply);
-
-	switch (ctx->reply.status) {
-	case AUX_TRANSACTION_REPLY_AUX_ACK:
-		ctx->operation_succeeded = true;
-
-		if (ctx->returned_byte) {
-			ctx->request.action = ctx->mot ?
-			I2CAUX_TRANSACTION_ACTION_I2C_STATUS_REQUEST_MOT :
-			I2CAUX_TRANSACTION_ACTION_I2C_STATUS_REQUEST;
-
-			ctx->current_write_length = 0;
-
-			++ctx->ack_m_retry;
-
-			if (ctx->ack_m_retry > AUX_DEFER_RETRY_COUNTER) {
-				ctx->status =
-				I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-				ctx->operation_succeeded = false;
-			} else
-				udelay(300);
-		} else {
-			ctx->status = I2CAUX_TRANSACTION_STATUS_SUCCEEDED;
-			ctx->defer_retry_aux = 0;
-			ctx->ack_m_retry = 0;
-			ctx->transaction_complete = true;
-		}
-	break;
-	case AUX_TRANSACTION_REPLY_AUX_NACK:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_NACK;
-		ctx->operation_succeeded = false;
-	break;
-	case AUX_TRANSACTION_REPLY_AUX_DEFER:
-		++ctx->defer_retry_aux;
-
-		if (ctx->defer_retry_aux > ctx->max_defer_retry) {
-			ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-			ctx->operation_succeeded = false;
-		}
-	break;
-	case AUX_TRANSACTION_REPLY_I2C_DEFER:
-		ctx->defer_retry_aux = 0;
-		ctx->current_write_length = 0;
-
-		ctx->request.action = ctx->mot ?
-			I2CAUX_TRANSACTION_ACTION_I2C_STATUS_REQUEST_MOT :
-			I2CAUX_TRANSACTION_ACTION_I2C_STATUS_REQUEST;
-
-		++ctx->defer_retry_i2c;
-
-		if (ctx->defer_retry_i2c > ctx->max_defer_retry) {
-			ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-			ctx->operation_succeeded = false;
-		}
-	break;
-	case AUX_TRANSACTION_REPLY_HPD_DISCON:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_HPD_DISCON;
-		ctx->operation_succeeded = false;
-	break;
-	default:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_UNKNOWN;
-		ctx->operation_succeeded = false;
-	}
-}
-static void process_write_request(
-	struct aux_engine *engine,
-	struct write_command_context *ctx)
-{
-	enum aux_channel_operation_result operation_result;
-
-	engine->funcs->submit_channel_request(engine, &ctx->request);
-
-	operation_result = engine->funcs->get_channel_status(
-		engine, &ctx->returned_byte);
-
-	switch (operation_result) {
-	case AUX_CHANNEL_OPERATION_SUCCEEDED:
-		ctx->timed_out_retry_aux = 0;
-		ctx->invalid_reply_retry_aux = 0;
-
-		ctx->reply.length = ctx->returned_byte;
-		ctx->reply.data = ctx->reply_data;
-
-		process_write_reply(engine, ctx);
-	break;
-	case AUX_CHANNEL_OPERATION_FAILED_INVALID_REPLY:
-		++ctx->invalid_reply_retry_aux;
-
-		if (ctx->invalid_reply_retry_aux >
-			AUX_INVALID_REPLY_RETRY_COUNTER) {
-			ctx->status =
-				I2CAUX_TRANSACTION_STATUS_FAILED_PROTOCOL_ERROR;
-			ctx->operation_succeeded = false;
-		} else
-			udelay(400);
-	break;
-	case AUX_CHANNEL_OPERATION_FAILED_TIMEOUT:
-		++ctx->timed_out_retry_aux;
-
-		if (ctx->timed_out_retry_aux > AUX_TIMED_OUT_RETRY_COUNTER) {
-			ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-			ctx->operation_succeeded = false;
-		} else {
-			/* DP 1.2a, table 2-58:
-			 * "S3: AUX Request CMD PENDING:
-			 * retry 3 times, with 400usec wait on each"
-			 * The HW timeout is set to 550usec,
-			 * so we should not wait here
-			 */
-		}
-	break;
-	case AUX_CHANNEL_OPERATION_FAILED_HPD_DISCON:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_HPD_DISCON;
-		ctx->operation_succeeded = false;
-	break;
-	default:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_UNKNOWN;
-		ctx->operation_succeeded = false;
-	}
-}
-static bool write_command(
-	struct aux_engine *engine,
-	struct i2caux_transaction_request *request,
-	bool middle_of_transaction)
-{
-	struct write_command_context ctx;
-
-	ctx.mot = middle_of_transaction;
-	ctx.buffer = request->payload.data;
-	ctx.current_write_length = request->payload.length;
-	ctx.timed_out_retry_aux = 0;
-	ctx.invalid_reply_retry_aux = 0;
-	ctx.defer_retry_aux = 0;
-	ctx.defer_retry_i2c = 0;
-	ctx.ack_m_retry = 0;
-	ctx.transaction_complete = false;
-	ctx.operation_succeeded = true;
-
-	if (request->payload.address_space ==
-		I2CAUX_TRANSACTION_ADDRESS_SPACE_DPCD) {
-		ctx.request.type = AUX_TRANSACTION_TYPE_DP;
-		ctx.request.action = I2CAUX_TRANSACTION_ACTION_DP_WRITE;
-		ctx.request.address = request->payload.address;
-	} else if (request->payload.address_space ==
-		I2CAUX_TRANSACTION_ADDRESS_SPACE_I2C) {
-		ctx.request.type = AUX_TRANSACTION_TYPE_I2C;
-		ctx.request.action = middle_of_transaction ?
-			I2CAUX_TRANSACTION_ACTION_I2C_WRITE_MOT :
-			I2CAUX_TRANSACTION_ACTION_I2C_WRITE;
-		ctx.request.address = request->payload.address >> 1;
-	} else {
-		/* in DAL2, there was no return in such a case */
-		BREAK_TO_DEBUGGER();
-		return false;
-	}
-
-	ctx.request.delay = 0;
-
-	ctx.max_defer_retry =
-		(engine->max_defer_write_retry > AUX_DEFER_RETRY_COUNTER) ?
-			engine->max_defer_write_retry : AUX_DEFER_RETRY_COUNTER;
-
-	do {
-		ctx.request.data = ctx.buffer;
-		ctx.request.length = ctx.current_write_length;
-
-		process_write_request(engine, &ctx);
-
-		request->status = ctx.status;
-
-		if (ctx.operation_succeeded && !ctx.transaction_complete)
-			if (ctx.request.type == AUX_TRANSACTION_TYPE_I2C)
-				msleep(engine->delay);
-	} while (ctx.operation_succeeded && !ctx.transaction_complete);
-
-	if (request->payload.address_space ==
-		I2CAUX_TRANSACTION_ADDRESS_SPACE_DPCD) {
-		DC_LOG_I2C_AUX("WRITE: addr:0x%x  value:0x%x Result:%d",
-				request->payload.address,
-				request->payload.data[0],
-				ctx.operation_succeeded);
-	}
-
-	return ctx.operation_succeeded;
-}
-static bool end_of_transaction_command(
-	struct aux_engine *engine,
-	struct i2caux_transaction_request *request)
-{
-	struct i2caux_transaction_request dummy_request;
-	uint8_t dummy_data;
-
-	/* [tcheng] We only need to send the stop (read with MOT = 0)
-	 * for I2C-over-Aux, not native AUX
-	 */
-
-	if (request->payload.address_space !=
-		I2CAUX_TRANSACTION_ADDRESS_SPACE_I2C)
-		return false;
-
-	dummy_request.operation = request->operation;
-	dummy_request.payload.address_space = request->payload.address_space;
-	dummy_request.payload.address = request->payload.address;
-
-	/*
-	 * Add a dummy byte due to some receiver quirk
-	 * where one byte is sent along with MOT = 0.
-	 * Ideally this should be 0.
-	 */
-
-	dummy_request.payload.length = 0;
-	dummy_request.payload.data = &dummy_data;
-
-	if (request->operation == I2CAUX_TRANSACTION_READ)
-		return read_command(engine, &dummy_request, false);
-	else
-		return write_command(engine, &dummy_request, false);
-
-	/* according to Syed, DoDummyMOT is not needed any more */
-}
-static bool submit_request(
-	struct aux_engine *engine,
-	struct i2caux_transaction_request *request,
-	bool middle_of_transaction)
-{
-
-	bool result;
-	bool mot_used = true;
-
-	switch (request->operation) {
-	case I2CAUX_TRANSACTION_READ:
-		result = read_command(engine, request, mot_used);
-	break;
-	case I2CAUX_TRANSACTION_WRITE:
-		result = write_command(engine, request, mot_used);
-	break;
-	default:
-		result = false;
-	}
-
-	/* [tcheng]
-	 * need to send a stop for the last transaction to free up the AUX;
-	 * if the above command fails, this becomes the last transaction
-	 */
-
-	if (!middle_of_transaction || !result)
-		end_of_transaction_command(engine, request);
-
-	/* mask AUX interrupt */
-
-	return result;
-}
 enum i2caux_engine_type get_engine_type(
-		const struct aux_engine *engine)
+		const struct dce_aux *engine)
 {
 	return I2CAUX_ENGINE_TYPE_AUX;
 }
 
 static bool acquire(
-	struct aux_engine *engine,
+	struct dce_aux *engine,
 	struct ddc *ddc)
 {
 
 	enum gpio_result result;
 
-	if (engine->funcs->is_engine_available) {
-		/*check whether SW could use the engine*/
-		if (!engine->funcs->is_engine_available(engine))
-			return false;
-	}
+	if (!is_engine_available(engine))
+		return false;
 
 	result = dal_ddc_open(ddc, GPIO_MODE_HARDWARE,
 		GPIO_DDC_CONFIG_TYPE_MODE_AUX);
@@ -884,7 +386,7 @@ static bool acquire(
 	if (result != GPIO_RESULT_OK)
 		return false;
 
-	if (!engine->funcs->acquire_engine(engine)) {
+	if (!acquire_engine(engine)) {
 		dal_ddc_close(ddc);
 		return false;
 	}
@@ -894,21 +396,7 @@ static bool acquire(
 	return true;
 }
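
A note on the acquire() flow above: any step that fails after dal_ddc_open() succeeds must roll the pin state back, otherwise the DDC pin is leaked. A minimal standalone sketch of that acquire-with-rollback pattern (every helper name below is an invented stand-in, not a kernel function):

#include <stdbool.h>
#include <stdio.h>

/* invented stand-ins for is_engine_available()/dal_ddc_open()/acquire_engine() */
static bool engine_available(void) { return true; }
static bool pin_open(void)         { return true; }
static void pin_close(void)        { puts("pin closed"); }
static bool engine_grab(void)      { return false; } /* force the rollback path */

static bool acquire(void)
{
	if (!engine_available())
		return false;
	if (!pin_open())
		return false;
	if (!engine_grab()) {
		pin_close();	/* undo pin_open() before reporting failure */
		return false;
	}
	return true;
}

int main(void)
{
	printf("acquired: %d\n", acquire());
	return 0;
}
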
 
-static const struct aux_engine_funcs aux_engine_funcs = {
-	.acquire_engine = acquire_engine,
-	.submit_channel_request = submit_channel_request,
-	.process_channel_reply = process_channel_reply,
-	.read_channel_reply = read_channel_reply,
-	.get_channel_status = get_channel_status,
-	.is_engine_available = is_engine_available,
-	.release_engine = release_engine,
-	.destroy_engine = dce110_engine_destroy,
-	.submit_request = submit_request,
-	.get_engine_type = get_engine_type,
-	.acquire = acquire,
-};
-
-void dce110_engine_destroy(struct aux_engine **engine)
+void dce110_engine_destroy(struct dce_aux **engine)
 {
 
 	struct aux_engine_dce110 *engine110 = FROM_AUX_ENGINE(*engine);
@@ -917,7 +405,7 @@ void dce110_engine_destroy(struct aux_engine **engine)
 	*engine = NULL;
 
 }
-struct aux_engine *dce110_aux_engine_construct(struct aux_engine_dce110 *aux_engine110,
+struct dce_aux *dce110_aux_engine_construct(struct aux_engine_dce110 *aux_engine110,
 		struct dc_context *ctx,
 		uint32_t inst,
 		uint32_t timeout_period,
@@ -927,7 +415,6 @@ struct aux_engine *dce110_aux_engine_construct(struct aux_engine_dce110 *aux_eng
 	aux_engine110->base.ctx = ctx;
 	aux_engine110->base.delay = 0;
 	aux_engine110->base.max_defer_write_retry = 0;
-	aux_engine110->base.funcs = &aux_engine_funcs;
 	aux_engine110->base.inst = inst;
 	aux_engine110->timeout_period = timeout_period;
 	aux_engine110->regs = regs;
@@ -935,3 +422,101 @@ struct aux_engine *dce110_aux_engine_construct(struct aux_engine_dce110 *aux_eng
 	return &aux_engine110->base;
 }
 
+static enum i2caux_transaction_action i2caux_action_from_payload(struct aux_payload *payload)
+{
+	if (payload->i2c_over_aux) {
+		if (payload->write) {
+			if (payload->mot)
+				return I2CAUX_TRANSACTION_ACTION_I2C_WRITE_MOT;
+			return I2CAUX_TRANSACTION_ACTION_I2C_WRITE;
+		}
+		if (payload->mot)
+			return I2CAUX_TRANSACTION_ACTION_I2C_READ_MOT;
+		return I2CAUX_TRANSACTION_ACTION_I2C_READ;
+	}
+	if (payload->write)
+		return I2CAUX_TRANSACTION_ACTION_DP_WRITE;
+	return I2CAUX_TRANSACTION_ACTION_DP_READ;
+}
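
The new i2caux_action_from_payload() helper collapses the three payload flags (i2c_over_aux, write, mot) into a single action value. A standalone sketch of the same decision table, with shortened, invented names:

#include <stdbool.h>
#include <stdio.h>

enum action { DP_WRITE, DP_READ, I2C_WRITE_MOT, I2C_WRITE, I2C_READ_MOT, I2C_READ };

struct payload { bool i2c_over_aux, write, mot; };

static enum action action_from_payload(const struct payload *p)
{
	if (p->i2c_over_aux) {
		if (p->write)
			return p->mot ? I2C_WRITE_MOT : I2C_WRITE;
		return p->mot ? I2C_READ_MOT : I2C_READ;
	}
	return p->write ? DP_WRITE : DP_READ;
}

int main(void)
{
	struct payload p = { .i2c_over_aux = true, .write = false, .mot = true };
	printf("action = %d\n", action_from_payload(&p)); /* prints 4 == I2C_READ_MOT */
	return 0;
}
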
+
+int dce_aux_transfer(struct ddc_service *ddc,
+		struct aux_payload *payload)
+{
+	struct ddc *ddc_pin = ddc->ddc_pin;
+	struct dce_aux *aux_engine;
+	enum aux_channel_operation_result operation_result;
+	struct aux_request_transaction_data aux_req;
+	struct aux_reply_transaction_data aux_rep;
+	uint8_t returned_bytes = 0;
+	int res = -1;
+	uint32_t status;
+
+	memset(&aux_req, 0, sizeof(aux_req));
+	memset(&aux_rep, 0, sizeof(aux_rep));
+
+	aux_engine = ddc->ctx->dc->res_pool->engines[ddc_pin->pin_data->en];
+	acquire(aux_engine, ddc_pin);
+
+	if (payload->i2c_over_aux)
+		aux_req.type = AUX_TRANSACTION_TYPE_I2C;
+	else
+		aux_req.type = AUX_TRANSACTION_TYPE_DP;
+
+	aux_req.action = i2caux_action_from_payload(payload);
+
+	aux_req.address = payload->address;
+	aux_req.delay = payload->defer_delay * 10;
+	aux_req.length = payload->length;
+	aux_req.data = payload->data;
+
+	submit_channel_request(aux_engine, &aux_req);
+	operation_result = get_channel_status(aux_engine, &returned_bytes);
+
+	switch (operation_result) {
+	case AUX_CHANNEL_OPERATION_SUCCEEDED:
+		res = read_channel_reply(aux_engine, payload->length,
+							payload->data, payload->reply,
+							&status);
+		break;
+	case AUX_CHANNEL_OPERATION_FAILED_HPD_DISCON:
+		res = 0;
+		break;
+	case AUX_CHANNEL_OPERATION_FAILED_REASON_UNKNOWN:
+	case AUX_CHANNEL_OPERATION_FAILED_INVALID_REPLY:
+	case AUX_CHANNEL_OPERATION_FAILED_TIMEOUT:
+		res = -1;
+		break;
+	}
+	release_engine(aux_engine);
+	return res;
+}
+
+#define AUX_RETRY_MAX 7
+
+bool dce_aux_transfer_with_retries(struct ddc_service *ddc,
+		struct aux_payload *payload)
+{
+	int i, ret = 0;
+	uint8_t reply;
+	bool payload_reply = true;
+
+	if (!payload->reply) {
+		payload_reply = false;
+		payload->reply = &reply;
+	}
+
+	for (i = 0; i < AUX_RETRY_MAX; i++) {
+		ret = dce_aux_transfer(ddc, payload);
+
+		if (ret >= 0) {
+			if (*payload->reply == 0) {
+				if (!payload_reply)
+					payload->reply = NULL;
+				return true;
+			}
+		}
+
+		udelay(1000);
+	}
+	return false;
+}
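
dce_aux_transfer_with_retries() treats a non-negative return whose reply code is 0 (AUX ACK) as success, and otherwise retries up to AUX_RETRY_MAX times with roughly 1 ms between attempts. A hedged userspace sketch of that loop (dummy_transfer() and its defer-then-ack pattern are invented for illustration):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define AUX_RETRY_MAX 7

/* stand-in for dce_aux_transfer(): returns < 0 on failure and, on
 * success, stores the AUX reply code (0 == ACK) through *reply */
static int dummy_transfer(int attempt, uint8_t *reply)
{
	*reply = (attempt < 3) ? 2 /* pretend DEFER */ : 0 /* ACK */;
	return 0;
}

static bool transfer_with_retries(void)
{
	uint8_t reply;
	int i;

	for (i = 0; i < AUX_RETRY_MAX; i++) {
		if (dummy_transfer(i, &reply) >= 0 && reply == 0)
			return true;
		usleep(1000);	/* mirrors the udelay(1000) between tries */
	}
	return false;
}

int main(void)
{
	printf("%s\n", transfer_with_retries() ? "ack" : "gave up");
	return 0;
}
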
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.h b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.h
index f7caab85dc80..d27f22c05e4b 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.h
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.h
@@ -25,7 +25,9 @@
 
 #ifndef __DAL_AUX_ENGINE_DCE110_H__
 #define __DAL_AUX_ENGINE_DCE110_H__
-#include "aux_engine.h"
+
+#include "i2caux_interface.h"
+#include "inc/hw/aux_engine.h"
 
 #define AUX_COMMON_REG_LIST(id)\
 	SRI(AUX_CONTROL, DP_AUX, id), \
@@ -75,8 +77,20 @@ enum {	/* This is the timeout as defined in DP 1.2a,
 	 */
 	SW_AUX_TIMEOUT_PERIOD_MULTIPLIER = 4
 };
+
+struct dce_aux {
+	uint32_t inst;
+	struct ddc *ddc;
+	struct dc_context *ctx;
+	/* following values are expressed in milliseconds */
+	uint32_t delay;
+	uint32_t max_defer_write_retry;
+
+	bool acquire_reset;
+};
+
 struct aux_engine_dce110 {
-	struct aux_engine base;
+	struct dce_aux base;
 	const struct dce110_aux_registers *regs;
 	struct {
 		uint32_t aux_control;
@@ -96,16 +110,22 @@ struct aux_engine_dce110_init_data {
 	const struct dce110_aux_registers *regs;
 };
 
-struct aux_engine *dce110_aux_engine_construct(
+struct dce_aux *dce110_aux_engine_construct(
 		struct aux_engine_dce110 *aux_engine110,
 		struct dc_context *ctx,
 		uint32_t inst,
 		uint32_t timeout_period,
 		const struct dce110_aux_registers *regs);
 
-void dce110_engine_destroy(struct aux_engine **engine);
+void dce110_engine_destroy(struct dce_aux **engine);
 
 bool dce110_aux_engine_acquire(
-	struct aux_engine *aux_engine,
+	struct dce_aux *aux_engine,
 	struct ddc *ddc);
+
+int dce_aux_transfer(struct ddc_service *ddc,
+		struct aux_payload *cmd);
+
+bool dce_aux_transfer_with_retries(struct ddc_service *ddc,
+		struct aux_payload *cmd);
 #endif
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/dce/dce_clk_mgr.c
index 7a72ee46f14b..6e142c2db986 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_clk_mgr.c
@@ -194,8 +194,8 @@ static uint32_t get_max_pixel_clock_for_all_paths(struct dc_state *context)
 		if (pipe_ctx->top_pipe)
 			continue;
 
-		if (pipe_ctx->stream_res.pix_clk_params.requested_pix_clk > max_pix_clk)
-			max_pix_clk = pipe_ctx->stream_res.pix_clk_params.requested_pix_clk;
+		if (pipe_ctx->stream_res.pix_clk_params.requested_pix_clk_100hz / 10 > max_pix_clk)
+			max_pix_clk = pipe_ctx->stream_res.pix_clk_params.requested_pix_clk_100hz / 10;
 
 		/* raise clock state for HBR3/2 if required. Confirmed with HW DCE/DPCS
 		 * logic for HBR3 still needs Nominal (0.8V) on VDDC rail
@@ -257,7 +257,7 @@ static int dce_set_clock(
 				clk_mgr_dce->dentist_vco_freq_khz / 64);
 
 	/* Prepare to program display clock*/
-	pxl_clk_params.target_pixel_clock = requested_clk_khz;
+	pxl_clk_params.target_pixel_clock_100hz = requested_clk_khz * 10;
 	pxl_clk_params.pll_id = CLOCK_SOURCE_ID_DFS;
 
 	if (clk_mgr_dce->dfs_bypass_active)
@@ -450,6 +450,42 @@ void dce_clock_read_ss_info(struct dce_clk_mgr *clk_mgr_dce)
 	}
 }
 
+/**
+ * dce121_clock_patch_xgmi_ss_info() - Save XGMI spread spectrum info
+ * @clk_mgr: clock manager base structure
+ *
+ * Reads from VBIOS the XGMI spread spectrum info and saves it within
+ * the dce clock manager. This operation will overwrite the existing dprefclk
+ * SS values if the vBIOS query succeeds. Otherwise, it does nothing. It also
+ * sets the ->xgmi_enabled flag.
+ */
+void dce121_clock_patch_xgmi_ss_info(struct clk_mgr *clk_mgr)
+{
+	struct dce_clk_mgr *clk_mgr_dce = TO_DCE_CLK_MGR(clk_mgr);
+	enum bp_result result;
+	struct spread_spectrum_info info = { { 0 } };
+	struct dc_bios *bp = clk_mgr_dce->base.ctx->dc_bios;
+
+	clk_mgr_dce->xgmi_enabled = false;
+
+	result = bp->funcs->get_spread_spectrum_info(bp, AS_SIGNAL_TYPE_XGMI,
+						     0, &info);
+	if (result == BP_RESULT_OK && info.spread_spectrum_percentage != 0) {
+		clk_mgr_dce->xgmi_enabled = true;
+		clk_mgr_dce->ss_on_dprefclk = true;
+		clk_mgr_dce->dprefclk_ss_divider =
+				info.spread_percentage_divider;
+
+		if (info.type.CENTER_MODE == 0) {
+			/* Currently for DP Reference clock we
+			 * need only SS percentage for
+			 * downspread */
+			clk_mgr_dce->dprefclk_ss_percentage =
+					info.spread_spectrum_percentage;
+		}
+	}
+}
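
The spread-spectrum bookkeeping keeps the percentage as an integer plus a divider (100 or 1000), so spread_spectrum_percentage = 35 with divider = 1000 encodes a 0.035 % downspread. A sketch of the kind of adjustment clk_mgr_adjust_dp_ref_freq_for_ss() presumably applies (that helper is not part of this hunk; only the arithmetic is shown, with invented values):

#include <stdio.h>

/* clk_khz reduced by a downspread of (pct / divider) percent */
static int adjust_for_ss(int clk_khz, int pct, int divider)
{
	/* 64-bit intermediate so large clocks cannot overflow */
	long long adjusted = (long long)clk_khz * (100LL * divider - pct);
	return (int)(adjusted / (100LL * divider));
}

int main(void)
{
	/* 600 MHz dprefclk with a 0.035 % downspread -> 599790 kHz */
	printf("%d kHz\n", adjust_for_ss(600000, 35, 1000));
	return 0;
}
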
+
 void dce110_fill_display_configs(
 	const struct dc_state *context,
 	struct dm_pp_display_configuration *pp_display_cfg)
@@ -483,18 +519,18 @@ void dce110_fill_display_configs(
 		cfg->src_height = stream->src.height;
 		cfg->src_width = stream->src.width;
 		cfg->ddi_channel_mapping =
-			stream->sink->link->ddi_channel_mapping.raw;
+			stream->link->ddi_channel_mapping.raw;
 		cfg->transmitter =
-			stream->sink->link->link_enc->transmitter;
+			stream->link->link_enc->transmitter;
 		cfg->link_settings.lane_count =
-			stream->sink->link->cur_link_settings.lane_count;
+			stream->link->cur_link_settings.lane_count;
 		cfg->link_settings.link_rate =
-			stream->sink->link->cur_link_settings.link_rate;
+			stream->link->cur_link_settings.link_rate;
 		cfg->link_settings.link_spread =
-			stream->sink->link->cur_link_settings.link_spread;
+			stream->link->cur_link_settings.link_spread;
 		cfg->sym_clock = stream->phy_pix_clk;
 		/* Round v_refresh*/
-		cfg->v_refresh = stream->timing.pix_clk_khz * 1000;
+		cfg->v_refresh = stream->timing.pix_clk_100hz * 100;
 		cfg->v_refresh /= stream->timing.h_total;
 		cfg->v_refresh = (cfg->v_refresh + stream->timing.v_total / 2)
 							/ stream->timing.v_total;
@@ -518,7 +554,7 @@ static uint32_t dce110_get_min_vblank_time_us(const struct dc_state *context)
 			 - stream->timing.v_addressable);
 
 		vertical_blank_time = vertical_blank_in_pixels
-			* 1000 / stream->timing.pix_clk_khz;
+			* 10000 / stream->timing.pix_clk_100hz;
 
 		if (min_vertical_blank_time > vertical_blank_time)
 			min_vertical_blank_time = vertical_blank_time;
@@ -620,7 +656,7 @@ static void dce11_pplib_apply_display_requirements(
 
 		pp_display_cfg->crtc_index =
 			pp_display_cfg->disp_configs[0].pipe_idx;
-		pp_display_cfg->line_time_in_us = timing->h_total * 1000 / timing->pix_clk_khz;
+		pp_display_cfg->line_time_in_us = timing->h_total * 10000 / timing->pix_clk_100hz;
 	}
 
 	if (memcmp(&dc->current_state->pp_display_cfg, pp_display_cfg, sizeof(*pp_display_cfg)) !=  0)
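
The recurring change in this file is the switch from kHz to 100 Hz units, which multiplies every conversion constant by ten: pix_clk_khz * 1000 becomes pix_clk_100hz * 100, and h_total * 1000 / pix_clk_khz becomes h_total * 10000 / pix_clk_100hz. A standalone check of both formulas with CEA 1080p60 timing (148.5 MHz, 2200 x 1125):

#include <stdio.h>

int main(void)
{
	unsigned int pix_clk_100hz = 1485000;	/* 148.5 MHz in 100 Hz units */
	unsigned int h_total = 2200, v_total = 1125;

	/* v_refresh with rounding, as in dce110_fill_display_configs() */
	unsigned int v_refresh = pix_clk_100hz * 100 / h_total;
	v_refresh = (v_refresh + v_total / 2) / v_total;

	/* line time, as in dce11_pplib_apply_display_requirements() */
	unsigned int line_time_us = h_total * 10000 / pix_clk_100hz;

	printf("v_refresh = %u Hz, line_time = %u us\n", v_refresh, line_time_us);
	/* expected: 60 Hz, 14 us */
	return 0;
}
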
@@ -633,11 +669,11 @@ static void dce_update_clocks(struct clk_mgr *clk_mgr,
 {
 	struct dce_clk_mgr *clk_mgr_dce = TO_DCE_CLK_MGR(clk_mgr);
 	struct dm_pp_power_level_change_request level_change_req;
-	int unpatched_disp_clk = context->bw.dce.dispclk_khz;
+	int patched_disp_clk = context->bw.dce.dispclk_khz;
 
 	/*TODO: W/A for dal3 linux, investigate why this works */
 	if (!clk_mgr_dce->dfs_bypass_active)
-		context->bw.dce.dispclk_khz = context->bw.dce.dispclk_khz * 115 / 100;
+		patched_disp_clk = patched_disp_clk * 115 / 100;
 
 	level_change_req.power_level = dce_get_required_clocks_state(clk_mgr, context);
 	/* get max clock state from PPLIB */
@@ -647,13 +683,11 @@ static void dce_update_clocks(struct clk_mgr *clk_mgr,
 			clk_mgr_dce->cur_min_clks_state = level_change_req.power_level;
 	}
 
-	if (should_set_clock(safe_to_lower, context->bw.dce.dispclk_khz, clk_mgr->clks.dispclk_khz)) {
-		context->bw.dce.dispclk_khz = dce_set_clock(clk_mgr, context->bw.dce.dispclk_khz);
-		clk_mgr->clks.dispclk_khz = context->bw.dce.dispclk_khz; 
+	if (should_set_clock(safe_to_lower, patched_disp_clk, clk_mgr->clks.dispclk_khz)) {
+		patched_disp_clk = dce_set_clock(clk_mgr, patched_disp_clk);
+		clk_mgr->clks.dispclk_khz = patched_disp_clk;
 	}
 	dce_pplib_apply_display_requirements(clk_mgr->ctx->dc, context);
-
-	context->bw.dce.dispclk_khz = unpatched_disp_clk;
 }
 
 static void dce11_update_clocks(struct clk_mgr *clk_mgr,
@@ -689,11 +723,11 @@ static void dce112_update_clocks(struct clk_mgr *clk_mgr,
 {
 	struct dce_clk_mgr *clk_mgr_dce = TO_DCE_CLK_MGR(clk_mgr);
 	struct dm_pp_power_level_change_request level_change_req;
-	int unpatched_disp_clk = context->bw.dce.dispclk_khz;
+	int patched_disp_clk = context->bw.dce.dispclk_khz;
 
 	/*TODO: W/A for dal3 linux, investigate why this works */
 	if (!clk_mgr_dce->dfs_bypass_active)
-		context->bw.dce.dispclk_khz = context->bw.dce.dispclk_khz * 115 / 100;
+		patched_disp_clk = patched_disp_clk * 115 / 100;
 
 	level_change_req.power_level = dce_get_required_clocks_state(clk_mgr, context);
 	/* get max clock state from PPLIB */
@@ -703,13 +737,11 @@ static void dce112_update_clocks(struct clk_mgr *clk_mgr,
 			clk_mgr_dce->cur_min_clks_state = level_change_req.power_level;
 	}
 
-	if (should_set_clock(safe_to_lower, context->bw.dce.dispclk_khz, clk_mgr->clks.dispclk_khz)) {
-		context->bw.dce.dispclk_khz = dce112_set_clock(clk_mgr, context->bw.dce.dispclk_khz);
-		clk_mgr->clks.dispclk_khz = context->bw.dce.dispclk_khz;
+	if (should_set_clock(safe_to_lower, patched_disp_clk, clk_mgr->clks.dispclk_khz)) {
+		patched_disp_clk = dce112_set_clock(clk_mgr, patched_disp_clk);
+		clk_mgr->clks.dispclk_khz = patched_disp_clk;
 	}
 	dce11_pplib_apply_display_requirements(clk_mgr->ctx->dc, context);
-
-	context->bw.dce.dispclk_khz = unpatched_disp_clk;
 }
 
 static void dce12_update_clocks(struct clk_mgr *clk_mgr,
@@ -719,17 +751,23 @@ static void dce12_update_clocks(struct clk_mgr *clk_mgr,
 	struct dce_clk_mgr *clk_mgr_dce = TO_DCE_CLK_MGR(clk_mgr);
 	struct dm_pp_clock_for_voltage_req clock_voltage_req = {0};
 	int max_pix_clk = get_max_pixel_clock_for_all_paths(context);
-	int unpatched_disp_clk = context->bw.dce.dispclk_khz;
+	int patched_disp_clk = context->bw.dce.dispclk_khz;
 
 	/*TODO: W/A for dal3 linux, investigate why this works */
 	if (!clk_mgr_dce->dfs_bypass_active)
-		context->bw.dce.dispclk_khz = context->bw.dce.dispclk_khz * 115 / 100;
+		patched_disp_clk = patched_disp_clk * 115 / 100;
 
-	if (should_set_clock(safe_to_lower, context->bw.dce.dispclk_khz, clk_mgr->clks.dispclk_khz)) {
+	if (should_set_clock(safe_to_lower, patched_disp_clk, clk_mgr->clks.dispclk_khz)) {
 		clock_voltage_req.clk_type = DM_PP_CLOCK_TYPE_DISPLAY_CLK;
-		clock_voltage_req.clocks_in_khz = context->bw.dce.dispclk_khz;
-		context->bw.dce.dispclk_khz = dce112_set_clock(clk_mgr, context->bw.dce.dispclk_khz);
-		clk_mgr->clks.dispclk_khz = context->bw.dce.dispclk_khz;
+		/*
+		 * When xGMI is enabled, the display clk needs to be adjusted
+		 * with the WAFL link's SS percentage.
+		 */
+		if (clk_mgr_dce->xgmi_enabled)
+			patched_disp_clk = clk_mgr_adjust_dp_ref_freq_for_ss(
+					clk_mgr_dce, patched_disp_clk);
+		clock_voltage_req.clocks_in_khz = patched_disp_clk;
+		clk_mgr->clks.dispclk_khz = dce112_set_clock(clk_mgr, patched_disp_clk);
 
 		dm_pp_apply_clock_for_voltage_request(clk_mgr->ctx, &clock_voltage_req);
 	}
@@ -742,8 +780,6 @@ static void dce12_update_clocks(struct clk_mgr *clk_mgr,
 		dm_pp_apply_clock_for_voltage_request(clk_mgr->ctx, &clock_voltage_req);
 	}
 	dce11_pplib_apply_display_requirements(clk_mgr->ctx->dc, context);
-
-	context->bw.dce.dispclk_khz = unpatched_disp_clk;
 }
 
 static const struct clk_mgr_funcs dce120_funcs = {
@@ -895,6 +931,27 @@ struct clk_mgr *dce120_clk_mgr_create(struct dc_context *ctx)
 	return &clk_mgr_dce->base;
 }
 
+struct clk_mgr *dce121_clk_mgr_create(struct dc_context *ctx)
+{
+	struct dce_clk_mgr *clk_mgr_dce = kzalloc(sizeof(*clk_mgr_dce),
+						  GFP_KERNEL);
+
+	if (clk_mgr_dce == NULL) {
+		BREAK_TO_DEBUGGER();
+		return NULL;
+	}
+
+	memcpy(clk_mgr_dce->max_clks_by_state, dce120_max_clks_by_state,
+	       sizeof(dce120_max_clks_by_state));
+
+	dce_clk_mgr_construct(clk_mgr_dce, ctx, NULL, NULL, NULL);
+
+	clk_mgr_dce->dprefclk_khz = 625000;
+	clk_mgr_dce->base.funcs = &dce120_funcs;
+
+	return &clk_mgr_dce->base;
+}
+
 void dce_clk_mgr_destroy(struct clk_mgr **clk_mgr)
 {
 	struct dce_clk_mgr *clk_mgr_dce = TO_DCE_CLK_MGR(*clk_mgr);
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_clk_mgr.h b/drivers/gpu/drm/amd/display/dc/dce/dce_clk_mgr.h
index 3bceb31d910d..c8f8c442142a 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_clk_mgr.h
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_clk_mgr.h
@@ -94,11 +94,37 @@ struct dce_clk_mgr {
 	 * This is basically "Crystal Frequency In KHz" (XTALIN) frequency */
 	int dfs_bypass_disp_clk;
 
-	/* Flag for Enabled SS on DPREFCLK */
+	/**
+	 * @ss_on_dprefclk:
+	 *
+	 * True if spread spectrum is enabled on the DP ref clock.
+	 */
 	bool ss_on_dprefclk;
-	/* DPREFCLK SS percentage (if down-spread enabled) */
+
+	/**
+	 * @xgmi_enabled:
+	 *
+	 * True if xGMI is enabled. On VG20, both audio and display clocks need
+	 * to be adjusted with the WAFL link's SS info if xGMI is enabled.
+	 */
+	bool xgmi_enabled;
+
+	/**
+	 * @dprefclk_ss_percentage:
+	 *
+	 * DPREFCLK SS percentage (if down-spread enabled).
+	 *
+	 * Note that if XGMI is enabled, the SS info (percentage and divider)
+	 * from the WAFL link is used instead. This is decided during
+	 * dce_clk_mgr initialization.
+	 */
 	int dprefclk_ss_percentage;
-	/* DPREFCLK SS percentage Divider (100 or 1000) */
+
+	/**
+	 * @dprefclk_ss_divider:
+	 *
+	 * DPREFCLK SS percentage Divider (100 or 1000).
+	 */
 	int dprefclk_ss_divider;
 	int dprefclk_khz;
 
@@ -163,6 +189,9 @@ struct clk_mgr *dce112_clk_mgr_create(
 
 struct clk_mgr *dce120_clk_mgr_create(struct dc_context *ctx);
 
+struct clk_mgr *dce121_clk_mgr_create(struct dc_context *ctx);
+void dce121_clock_patch_xgmi_ss_info(struct clk_mgr *clk_mgr);
+
 void dce_clk_mgr_destroy(struct clk_mgr **clk_mgr);
 
 int dentist_get_divider_from_did(int did);
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c b/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c
index 723ce80ed89c..71d5777de961 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_clock_source.c
@@ -108,28 +108,28 @@ static const struct spread_spectrum_data *get_ss_data_entry(
 }
 
 /**
-* Function: calculate_fb_and_fractional_fb_divider
-*
-* * DESCRIPTION: Calculates feedback and fractional feedback dividers values
-*
-*PARAMETERS:
-* targetPixelClock             Desired frequency in 10 KHz
-* ref_divider                  Reference divider (already known)
-* postDivider                  Post Divider (already known)
-* feedback_divider_param       Pointer where to store
-*					calculated feedback divider value
-* fract_feedback_divider_param Pointer where to store
-*					calculated fract feedback divider value
-*
-*RETURNS:
-* It fills the locations pointed by feedback_divider_param
-*					and fract_feedback_divider_param
-* It returns	- true if feedback divider not 0
-*		- false should never happen)
-*/
+ * Function: calculate_fb_and_fractional_fb_divider
+ *
+ * DESCRIPTION: Calculates feedback and fractional feedback divider values
+ *
+ * PARAMETERS:
+ * targetPixelClock             Desired frequency in 100 Hz
+ * ref_divider                  Reference divider (already known)
+ * postDivider                  Post Divider (already known)
+ * feedback_divider_param       Pointer where to store
+ *					calculated feedback divider value
+ * fract_feedback_divider_param Pointer where to store
+ *					calculated fract feedback divider value
+ *
+ * RETURNS:
+ * It fills the locations pointed to by feedback_divider_param
+ *					and fract_feedback_divider_param
+ * It returns	- true if feedback divider is not 0
+ *		- false (should never happen)
+ */
 static bool calculate_fb_and_fractional_fb_divider(
 		struct calc_pll_clock_source *calc_pll_cs,
-		uint32_t target_pix_clk_khz,
+		uint32_t target_pix_clk_100hz,
 		uint32_t ref_divider,
 		uint32_t post_divider,
 		uint32_t *feedback_divider_param,
@@ -138,11 +138,11 @@ static bool calculate_fb_and_fractional_fb_divider(
 	uint64_t feedback_divider;
 
 	feedback_divider =
-		(uint64_t)target_pix_clk_khz * ref_divider * post_divider;
+		(uint64_t)target_pix_clk_100hz * ref_divider * post_divider;
 	feedback_divider *= 10;
 	/* additional factor, since we divide by 10 afterwards */
 	feedback_divider *= (uint64_t)(calc_pll_cs->fract_fb_divider_factor);
-	feedback_divider = div_u64(feedback_divider, calc_pll_cs->ref_freq_khz);
+	feedback_divider = div_u64(feedback_divider, calc_pll_cs->ref_freq_khz * 10ull);
 
 /*Round to the number of precision
  * The following code replace the old code (ullfeedbackDivider + 5)/10
@@ -195,36 +195,36 @@ static bool calc_fb_divider_checking_tolerance(
 {
 	uint32_t feedback_divider;
 	uint32_t fract_feedback_divider;
-	uint32_t actual_calculated_clock_khz;
+	uint32_t actual_calculated_clock_100hz;
 	uint32_t abs_err;
-	uint64_t actual_calc_clk_khz;
+	uint64_t actual_calc_clk_100hz;
 
 	calculate_fb_and_fractional_fb_divider(
 			calc_pll_cs,
-			pll_settings->adjusted_pix_clk,
+			pll_settings->adjusted_pix_clk_100hz,
 			ref_divider,
 			post_divider,
 			&feedback_divider,
 			&fract_feedback_divider);
 
 	/*Actual calculated value*/
-	actual_calc_clk_khz = (uint64_t)feedback_divider *
+	actual_calc_clk_100hz = (uint64_t)feedback_divider *
 					calc_pll_cs->fract_fb_divider_factor +
 							fract_feedback_divider;
-	actual_calc_clk_khz *= calc_pll_cs->ref_freq_khz;
-	actual_calc_clk_khz =
-		div_u64(actual_calc_clk_khz,
+	actual_calc_clk_100hz *= calc_pll_cs->ref_freq_khz * 10;
+	actual_calc_clk_100hz =
+		div_u64(actual_calc_clk_100hz,
 			ref_divider * post_divider *
 				calc_pll_cs->fract_fb_divider_factor);
 
-	actual_calculated_clock_khz = (uint32_t)(actual_calc_clk_khz);
+	actual_calculated_clock_100hz = (uint32_t)(actual_calc_clk_100hz);
 
-	abs_err = (actual_calculated_clock_khz >
-					pll_settings->adjusted_pix_clk)
-			? actual_calculated_clock_khz -
-					pll_settings->adjusted_pix_clk
-			: pll_settings->adjusted_pix_clk -
-						actual_calculated_clock_khz;
+	abs_err = (actual_calculated_clock_100hz >
+					pll_settings->adjusted_pix_clk_100hz)
+			? actual_calculated_clock_100hz -
+					pll_settings->adjusted_pix_clk_100hz
+			: pll_settings->adjusted_pix_clk_100hz -
+						actual_calculated_clock_100hz;
 
 	if (abs_err <= tolerance) {
 		/*found good values*/
@@ -233,10 +233,10 @@ static bool calc_fb_divider_checking_tolerance(
 		pll_settings->feedback_divider = feedback_divider;
 		pll_settings->fract_feedback_divider = fract_feedback_divider;
 		pll_settings->pix_clk_post_divider = post_divider;
-		pll_settings->calculated_pix_clk =
-			actual_calculated_clock_khz;
+		pll_settings->calculated_pix_clk_100hz =
+			actual_calculated_clock_100hz;
 		pll_settings->vco_freq =
-			actual_calculated_clock_khz * post_divider;
+			actual_calculated_clock_100hz * post_divider / 10;
 		return true;
 	}
 	return false;
@@ -257,8 +257,8 @@ static bool calc_pll_dividers_in_range(
 
 /* This is err_tolerance / 10000 = 0.0025 - acceptable error of 0.25%
  * This is errorTolerance / 10000 = 0.0001 - acceptable error of 0.01%*/
-	tolerance = (pll_settings->adjusted_pix_clk * err_tolerance) /
-									10000;
+	tolerance = (pll_settings->adjusted_pix_clk_100hz * err_tolerance) /
+									100000;
 	if (tolerance < CALC_PLL_CLK_SRC_ERR_TOLERANCE)
 		tolerance = CALC_PLL_CLK_SRC_ERR_TOLERANCE;
 
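
To see the divider arithmetic above in isolation: the actual PLL output in 100 Hz units is (fb * factor + fract_fb) * ref_freq_khz * 10 / (ref_div * post_div * factor). A standalone check with invented divider values chosen to land on 148.5 MHz (the real fract_fb_divider_factor comes from the calc_pll_cs setup, which is not in this hunk):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* hypothetical values: 100 MHz reference, fractional factor 1000000 */
	uint32_t ref_freq_khz = 100000, ref_div = 2, post_div = 20;
	uint32_t fb = 59, fract_fb = 400000, factor = 1000000;

	/* same shape as calc_fb_divider_checking_tolerance() above */
	uint64_t clk = (uint64_t)fb * factor + fract_fb;
	clk *= ref_freq_khz * 10ull;
	clk /= (uint64_t)ref_div * post_div * factor;

	printf("actual clock = %llu (100 Hz units) = %llu kHz\n",
	       (unsigned long long)clk, (unsigned long long)(clk / 10));
	/* expected: 1485000 (100 Hz units) = 148500 kHz */
	return 0;
}
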
@@ -294,7 +294,7 @@ static uint32_t calculate_pixel_clock_pll_dividers(
 	uint32_t min_ref_divider;
 	uint32_t max_ref_divider;
 
-	if (pll_settings->adjusted_pix_clk == 0) {
+	if (pll_settings->adjusted_pix_clk_100hz == 0) {
 		DC_LOG_ERROR(
 			"%s Bad requested pixel clock", __func__);
 		return MAX_PLL_CALC_ERROR;
@@ -306,21 +306,21 @@ static uint32_t calculate_pixel_clock_pll_dividers(
 		max_post_divider = pll_settings->pix_clk_post_divider;
 	} else {
 		min_post_divider = calc_pll_cs->min_pix_clock_pll_post_divider;
-		if (min_post_divider * pll_settings->adjusted_pix_clk <
-						calc_pll_cs->min_vco_khz) {
-			min_post_divider = calc_pll_cs->min_vco_khz /
-					pll_settings->adjusted_pix_clk;
+		if (min_post_divider * pll_settings->adjusted_pix_clk_100hz <
+						calc_pll_cs->min_vco_khz * 10) {
+			min_post_divider = calc_pll_cs->min_vco_khz * 10 /
+					pll_settings->adjusted_pix_clk_100hz;
 			if ((min_post_divider *
-					pll_settings->adjusted_pix_clk) <
-						calc_pll_cs->min_vco_khz)
+					pll_settings->adjusted_pix_clk_100hz) <
+						calc_pll_cs->min_vco_khz * 10)
 				min_post_divider++;
 		}
 
 		max_post_divider = calc_pll_cs->max_pix_clock_pll_post_divider;
-		if (max_post_divider * pll_settings->adjusted_pix_clk
-				> calc_pll_cs->max_vco_khz)
-			max_post_divider = calc_pll_cs->max_vco_khz /
-					pll_settings->adjusted_pix_clk;
+		if (max_post_divider * pll_settings->adjusted_pix_clk_100hz
+				> calc_pll_cs->max_vco_khz * 10)
+			max_post_divider = calc_pll_cs->max_vco_khz * 10 /
+					pll_settings->adjusted_pix_clk_100hz;
 	}
 
 /* 2) Find Reference divider ranges
@@ -392,47 +392,47 @@ static bool pll_adjust_pix_clk(
 		struct pixel_clk_params *pix_clk_params,
 		struct pll_settings *pll_settings)
 {
-	uint32_t actual_pix_clk_khz = 0;
-	uint32_t requested_clk_khz = 0;
+	uint32_t actual_pix_clk_100hz = 0;
+	uint32_t requested_clk_100hz = 0;
 	struct bp_adjust_pixel_clock_parameters bp_adjust_pixel_clock_params = {
 							0 };
 	enum bp_result bp_result;
 	switch (pix_clk_params->signal_type) {
 	case SIGNAL_TYPE_HDMI_TYPE_A: {
-		requested_clk_khz = pix_clk_params->requested_pix_clk;
+		requested_clk_100hz = pix_clk_params->requested_pix_clk_100hz;
 		if (pix_clk_params->pixel_encoding != PIXEL_ENCODING_YCBCR422) {
 			switch (pix_clk_params->color_depth) {
 			case COLOR_DEPTH_101010:
-				requested_clk_khz = (requested_clk_khz * 5) >> 2;
+				requested_clk_100hz = (requested_clk_100hz * 5) >> 2;
 				break; /* x1.25*/
 			case COLOR_DEPTH_121212:
-				requested_clk_khz = (requested_clk_khz * 6) >> 2;
+				requested_clk_100hz = (requested_clk_100hz * 6) >> 2;
 				break; /* x1.5*/
 			case COLOR_DEPTH_161616:
-				requested_clk_khz = requested_clk_khz * 2;
+				requested_clk_100hz = requested_clk_100hz * 2;
 				break; /* x2.0*/
 			default:
 				break;
 			}
 		}
-		actual_pix_clk_khz = requested_clk_khz;
+		actual_pix_clk_100hz = requested_clk_100hz;
 	}
 		break;
 
 	case SIGNAL_TYPE_DISPLAY_PORT:
 	case SIGNAL_TYPE_DISPLAY_PORT_MST:
 	case SIGNAL_TYPE_EDP:
-		requested_clk_khz = pix_clk_params->requested_sym_clk;
-		actual_pix_clk_khz = pix_clk_params->requested_pix_clk;
+		requested_clk_100hz = pix_clk_params->requested_sym_clk * 10;
+		actual_pix_clk_100hz = pix_clk_params->requested_pix_clk_100hz;
 		break;
 
 	default:
-		requested_clk_khz = pix_clk_params->requested_pix_clk;
-		actual_pix_clk_khz = pix_clk_params->requested_pix_clk;
+		requested_clk_100hz = pix_clk_params->requested_pix_clk_100hz;
+		actual_pix_clk_100hz = pix_clk_params->requested_pix_clk_100hz;
 		break;
 	}
 
-	bp_adjust_pixel_clock_params.pixel_clock = requested_clk_khz;
+	bp_adjust_pixel_clock_params.pixel_clock = requested_clk_100hz / 10;
 	bp_adjust_pixel_clock_params.
 		encoder_object_id = pix_clk_params->encoder_object_id;
 	bp_adjust_pixel_clock_params.signal_type = pix_clk_params->signal_type;
@@ -441,9 +441,9 @@ static bool pll_adjust_pix_clk(
 	bp_result = clk_src->bios->funcs->adjust_pixel_clock(
 			clk_src->bios, &bp_adjust_pixel_clock_params);
 	if (bp_result == BP_RESULT_OK) {
-		pll_settings->actual_pix_clk = actual_pix_clk_khz;
-		pll_settings->adjusted_pix_clk =
-			bp_adjust_pixel_clock_params.adjusted_pixel_clock;
+		pll_settings->actual_pix_clk_100hz = actual_pix_clk_100hz;
+		pll_settings->adjusted_pix_clk_100hz =
+			bp_adjust_pixel_clock_params.adjusted_pixel_clock * 10;
 		pll_settings->reference_divider =
 			bp_adjust_pixel_clock_params.reference_divider;
 		pll_settings->pix_clk_post_divider =
@@ -490,7 +490,7 @@ static uint32_t dce110_get_pix_clk_dividers_helper (
 		const struct spread_spectrum_data *ss_data = get_ss_data_entry(
 					clk_src,
 					pix_clk_params->signal_type,
-					pll_settings->adjusted_pix_clk);
+					pll_settings->adjusted_pix_clk_100hz / 10);
 
 		if (NULL != ss_data)
 			pll_settings->ss_percentage = ss_data->percentage;
@@ -502,13 +502,13 @@ static uint32_t dce110_get_pix_clk_dividers_helper (
 		 * to continue. */
 		DC_LOG_ERROR(
 			"%s: Failed to adjust pixel clock!!", __func__);
-		pll_settings->actual_pix_clk =
-				pix_clk_params->requested_pix_clk;
-		pll_settings->adjusted_pix_clk =
-				pix_clk_params->requested_pix_clk;
+		pll_settings->actual_pix_clk_100hz =
+				pix_clk_params->requested_pix_clk_100hz;
+		pll_settings->adjusted_pix_clk_100hz =
+				pix_clk_params->requested_pix_clk_100hz;
 
 		if (dc_is_dp_signal(pix_clk_params->signal_type))
-			pll_settings->adjusted_pix_clk = 100000;
+			pll_settings->adjusted_pix_clk_100hz = 1000000;
 	}
 
 	/* Calculate Dividers */
@@ -533,28 +533,28 @@ static void dce112_get_pix_clk_dividers_helper (
 		struct pll_settings *pll_settings,
 		struct pixel_clk_params *pix_clk_params)
 {
-	uint32_t actualPixelClockInKHz;
+	uint32_t actual_pixel_clock_100hz;
 
-	actualPixelClockInKHz = pix_clk_params->requested_pix_clk;
+	actual_pixel_clock_100hz = pix_clk_params->requested_pix_clk_100hz;
 	/* Calculate Dividers */
 	if (pix_clk_params->signal_type == SIGNAL_TYPE_HDMI_TYPE_A) {
 		switch (pix_clk_params->color_depth) {
 		case COLOR_DEPTH_101010:
-			actualPixelClockInKHz = (actualPixelClockInKHz * 5) >> 2;
+			actual_pixel_clock_100hz = (actual_pixel_clock_100hz * 5) >> 2;
 			break;
 		case COLOR_DEPTH_121212:
-			actualPixelClockInKHz = (actualPixelClockInKHz * 6) >> 2;
+			actual_pixel_clock_100hz = (actual_pixel_clock_100hz * 6) >> 2;
 			break;
 		case COLOR_DEPTH_161616:
-			actualPixelClockInKHz = actualPixelClockInKHz * 2;
+			actual_pixel_clock_100hz = actual_pixel_clock_100hz * 2;
 			break;
 		default:
 			break;
 		}
 	}
-	pll_settings->actual_pix_clk = actualPixelClockInKHz;
-	pll_settings->adjusted_pix_clk = actualPixelClockInKHz;
-	pll_settings->calculated_pix_clk = pix_clk_params->requested_pix_clk;
+	pll_settings->actual_pix_clk_100hz = actual_pixel_clock_100hz;
+	pll_settings->adjusted_pix_clk_100hz = actual_pixel_clock_100hz;
+	pll_settings->calculated_pix_clk_100hz = pix_clk_params->requested_pix_clk_100hz;
 }
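
The HDMI deep-color scaling stays pure integer math after the unit change: x1.25 is (clk * 5) >> 2, x1.5 is (clk * 6) >> 2, and 16 bpc simply doubles the clock. A quick standalone check with a 297 MHz clock in 100 Hz units:

#include <stdio.h>

int main(void)
{
	unsigned int clk = 2970000;	/* 297 MHz in 100 Hz units */

	printf("10 bpc: %u\n", (clk * 5) >> 2);	/* x1.25 -> 3712500 */
	printf("12 bpc: %u\n", (clk * 6) >> 2);	/* x1.5  -> 4455000 */
	printf("16 bpc: %u\n", clk * 2);	/* x2.0  -> 5940000 */
	return 0;
}
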
 
 static uint32_t dce110_get_pix_clk_dividers(
@@ -567,7 +567,7 @@ static uint32_t dce110_get_pix_clk_dividers(
 	DC_LOGGER_INIT();
 
 	if (pix_clk_params == NULL || pll_settings == NULL
-			|| pix_clk_params->requested_pix_clk == 0) {
+			|| pix_clk_params->requested_pix_clk_100hz == 0) {
 		DC_LOG_ERROR(
 			"%s: Invalid parameters!!\n", __func__);
 		return pll_calc_error;
@@ -577,10 +577,10 @@ static uint32_t dce110_get_pix_clk_dividers(
 
 	if (cs->id == CLOCK_SOURCE_ID_DP_DTO ||
 			cs->id == CLOCK_SOURCE_ID_EXTERNAL) {
-		pll_settings->adjusted_pix_clk = clk_src->ext_clk_khz;
-		pll_settings->calculated_pix_clk = clk_src->ext_clk_khz;
-		pll_settings->actual_pix_clk =
-					pix_clk_params->requested_pix_clk;
+		pll_settings->adjusted_pix_clk_100hz = clk_src->ext_clk_khz * 10;
+		pll_settings->calculated_pix_clk_100hz = clk_src->ext_clk_khz * 10;
+		pll_settings->actual_pix_clk_100hz =
+					pix_clk_params->requested_pix_clk_100hz;
 		return 0;
 	}
 
@@ -599,7 +599,7 @@ static uint32_t dce112_get_pix_clk_dividers(
 	DC_LOGGER_INIT();
 
 	if (pix_clk_params == NULL || pll_settings == NULL
-			|| pix_clk_params->requested_pix_clk == 0) {
+			|| pix_clk_params->requested_pix_clk_100hz == 0) {
 		DC_LOG_ERROR(
 			"%s: Invalid parameters!!\n", __func__);
 		return -1;
@@ -609,10 +609,10 @@ static uint32_t dce112_get_pix_clk_dividers(
 
 	if (cs->id == CLOCK_SOURCE_ID_DP_DTO ||
 			cs->id == CLOCK_SOURCE_ID_EXTERNAL) {
-		pll_settings->adjusted_pix_clk = clk_src->ext_clk_khz;
-		pll_settings->calculated_pix_clk = clk_src->ext_clk_khz;
-		pll_settings->actual_pix_clk =
-					pix_clk_params->requested_pix_clk;
+		pll_settings->adjusted_pix_clk_100hz = clk_src->ext_clk_khz * 10;
+		pll_settings->calculated_pix_clk_100hz = clk_src->ext_clk_khz * 10;
+		pll_settings->actual_pix_clk_100hz =
+					pix_clk_params->requested_pix_clk_100hz;
 		return -1;
 	}
 
@@ -714,7 +714,7 @@ static bool enable_spread_spectrum(
 	ss_data = get_ss_data_entry(
 			clk_src,
 			signal,
-			pll_settings->calculated_pix_clk);
+			pll_settings->calculated_pix_clk_100hz / 10);
 
 /* Pixel clock PLL has been programmed to generate desired pixel clock,
  * now enable SS on pixel clock */
@@ -853,7 +853,7 @@ static bool dce110_program_pix_clk(
 	/*ATOMBIOS expects pixel rate adjusted by deep color ratio)*/
 	bp_pc_params.controller_id = pix_clk_params->controller_id;
 	bp_pc_params.pll_id = clock_source->id;
-	bp_pc_params.target_pixel_clock = pll_settings->actual_pix_clk;
+	bp_pc_params.target_pixel_clock_100hz = pll_settings->actual_pix_clk_100hz;
 	bp_pc_params.encoder_object_id = pix_clk_params->encoder_object_id;
 	bp_pc_params.signal_type = pix_clk_params->signal_type;
 
@@ -903,12 +903,12 @@ static bool dce112_program_pix_clk(
 #if defined(CONFIG_DRM_AMD_DC_DCN1_0)
 	if (IS_FPGA_MAXIMUS_DC(clock_source->ctx->dce_environment)) {
 		unsigned int inst = pix_clk_params->controller_id - CONTROLLER_ID_D0;
-		unsigned dp_dto_ref_kHz = 700000;
-		unsigned clock_kHz = pll_settings->actual_pix_clk;
+		unsigned dp_dto_ref_100hz = 7000000;
+		unsigned clock_100hz = pll_settings->actual_pix_clk_100hz;
 
 		/* Set DTO values: phase = target clock, modulo = reference clock */
-		REG_WRITE(PHASE[inst], clock_kHz);
-		REG_WRITE(MODULO[inst], dp_dto_ref_kHz);
+		REG_WRITE(PHASE[inst], clock_100hz);
+		REG_WRITE(MODULO[inst], dp_dto_ref_100hz);
 
 		/* Enable DTO */
 		REG_UPDATE(PIXEL_RATE_CNTL[inst], DP_DTO0_ENABLE, 1);
@@ -927,7 +927,7 @@ static bool dce112_program_pix_clk(
 	/*ATOMBIOS expects pixel rate adjusted by deep color ratio)*/
 	bp_pc_params.controller_id = pix_clk_params->controller_id;
 	bp_pc_params.pll_id = clock_source->id;
-	bp_pc_params.target_pixel_clock = pll_settings->actual_pix_clk;
+	bp_pc_params.target_pixel_clock_100hz = pll_settings->actual_pix_clk_100hz;
 	bp_pc_params.encoder_object_id = pix_clk_params->encoder_object_id;
 	bp_pc_params.signal_type = pix_clk_params->signal_type;
 
@@ -977,6 +977,28 @@ static bool dce110_clock_source_power_down(
 	return bp_result == BP_RESULT_OK;
 }
 
+static bool get_pixel_clk_frequency_100hz(
+		struct clock_source *clock_source,
+		unsigned int inst,
+		unsigned int *pixel_clk_khz)
+{
+	struct dce110_clk_src *clk_src = TO_DCE110_CLK_SRC(clock_source);
+	unsigned int clock_hz = 0;
+
+	if (clock_source->id == CLOCK_SOURCE_ID_DP_DTO) {
+		clock_hz = REG_READ(PHASE[inst]);
+
+		/* NOTE: There is agreement with VBIOS here that MODULO is
+		 * programmed equal to DPREFCLK, in which case PHASE will be
+		 * equivalent to pixel clock.
+		 */
+		*pixel_clk_khz = clock_hz / 100;
+		return true;
+	}
+
+	return false;
+}
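
get_pixel_clk_frequency_100hz() leans on the DTO convention pixel_clk = dprefclk * PHASE / MODULO: with MODULO programmed equal to DPREFCLK, PHASE alone is the pixel clock in Hz, and dividing by 100 yields the 100 Hz units used elsewhere. A numeric check with invented register values (700 MHz reference, matching the FPGA path in dce112_program_pix_clk() above):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t dprefclk_hz = 700000000;	/* 700 MHz reference */
	uint64_t phase = 148500000, modulo = dprefclk_hz;

	uint64_t pixel_clk_hz = dprefclk_hz * phase / modulo;
	printf("pixel clock = %llu Hz = %llu (100 Hz units)\n",
	       (unsigned long long)pixel_clk_hz,
	       (unsigned long long)(pixel_clk_hz / 100));
	/* expected: 148500000 Hz = 1485000 (100 Hz units) */
	return 0;
}
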
+
 /*****************************************/
 /* Constructor                           */
 /*****************************************/
@@ -984,12 +1006,14 @@ static bool dce110_clock_source_power_down(
 static const struct clock_source_funcs dce112_clk_src_funcs = {
 	.cs_power_down = dce110_clock_source_power_down,
 	.program_pix_clk = dce112_program_pix_clk,
-	.get_pix_clk_dividers = dce112_get_pix_clk_dividers
+	.get_pix_clk_dividers = dce112_get_pix_clk_dividers,
+	.get_pixel_clk_frequency_100hz = get_pixel_clk_frequency_100hz
 };
 static const struct clock_source_funcs dce110_clk_src_funcs = {
 	.cs_power_down = dce110_clock_source_power_down,
 	.program_pix_clk = dce110_program_pix_clk,
-	.get_pix_clk_dividers = dce110_get_pix_clk_dividers
+	.get_pix_clk_dividers = dce110_get_pix_clk_dividers,
+	.get_pixel_clk_frequency_100hz = get_pixel_clk_frequency_100hz
 };
 
 
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_dmcu.c b/drivers/gpu/drm/amd/display/dc/dce/dce_dmcu.c
index dea40b322191..c2926cf19dee 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_dmcu.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_dmcu.c
@@ -51,7 +51,6 @@
 #define PSR_SET_WAITLOOP 0x31
 #define MCP_INIT_DMCU 0x88
 #define MCP_INIT_IRAM 0x89
-#define MCP_DMCU_VERSION 0x90
 #define MASTER_COMM_CNTL_REG__MASTER_COMM_INTERRUPT_MASK   0x00000001L
 
 static bool dce_dmcu_init(struct dmcu *dmcu)
@@ -317,38 +316,11 @@ static void dce_get_psr_wait_loop(
 }
 
 #if defined(CONFIG_DRM_AMD_DC_DCN1_0)
-static void dcn10_get_dmcu_state(struct dmcu *dmcu)
-{
-	struct dce_dmcu *dmcu_dce = TO_DCE_DMCU(dmcu);
-	uint32_t dmcu_state_offset = 0xf6;
-
-	/* Enable write access to IRAM */
-	REG_UPDATE_2(DMCU_RAM_ACCESS_CTRL,
-			IRAM_HOST_ACCESS_EN, 1,
-			IRAM_RD_ADDR_AUTO_INC, 1);
-
-	REG_WAIT(DMU_MEM_PWR_CNTL, DMCU_IRAM_MEM_PWR_STATE, 0, 2, 10);
-
-	/* Write address to IRAM_RD_ADDR in DMCU_IRAM_RD_CTRL */
-	REG_WRITE(DMCU_IRAM_RD_CTRL, dmcu_state_offset);
-
-	/* Read data from IRAM_RD_DATA in DMCU_IRAM_RD_DATA*/
-	dmcu->dmcu_state = REG_READ(DMCU_IRAM_RD_DATA);
-
-	/* Disable write access to IRAM to allow dynamic sleep state */
-	REG_UPDATE_2(DMCU_RAM_ACCESS_CTRL,
-			IRAM_HOST_ACCESS_EN, 0,
-			IRAM_RD_ADDR_AUTO_INC, 0);
-}
-
 static void dcn10_get_dmcu_version(struct dmcu *dmcu)
 {
 	struct dce_dmcu *dmcu_dce = TO_DCE_DMCU(dmcu);
 	uint32_t dmcu_version_offset = 0xf1;
 
-	/* Clear scratch */
-	REG_WRITE(DC_DMCU_SCRATCH, 0);
-
 	/* Enable write access to IRAM */
 	REG_UPDATE_2(DMCU_RAM_ACCESS_CTRL,
 			IRAM_HOST_ACCESS_EN, 1,
@@ -359,85 +331,74 @@ static void dcn10_get_dmcu_version(struct dmcu *dmcu)
 	/* Write address to IRAM_RD_ADDR and read from DATA register */
 	REG_WRITE(DMCU_IRAM_RD_CTRL, dmcu_version_offset);
 	dmcu->dmcu_version.interface_version = REG_READ(DMCU_IRAM_RD_DATA);
-	dmcu->dmcu_version.year = ((REG_READ(DMCU_IRAM_RD_DATA) << 8) |
+	dmcu->dmcu_version.abm_version = REG_READ(DMCU_IRAM_RD_DATA);
+	dmcu->dmcu_version.psr_version = REG_READ(DMCU_IRAM_RD_DATA);
+	dmcu->dmcu_version.build_version = ((REG_READ(DMCU_IRAM_RD_DATA) << 8) |
 						REG_READ(DMCU_IRAM_RD_DATA));
-	dmcu->dmcu_version.month = REG_READ(DMCU_IRAM_RD_DATA);
-	dmcu->dmcu_version.date = REG_READ(DMCU_IRAM_RD_DATA);
 
 	/* Disable write access to IRAM to allow dynamic sleep state */
 	REG_UPDATE_2(DMCU_RAM_ACCESS_CTRL,
 			IRAM_HOST_ACCESS_EN, 0,
 			IRAM_RD_ADDR_AUTO_INC, 0);
-
-	/* Send MCP command message to DMCU to get version reply from FW.
-	 * We expect this version should match the one in IRAM, otherwise
-	 * something is wrong with DMCU and we should fail and disable UC.
-	 */
-	REG_WAIT(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 0, 100, 800);
-
-	/* Set command to get DMCU version from microcontroller */
-	REG_UPDATE(MASTER_COMM_CMD_REG, MASTER_COMM_CMD_REG_BYTE0,
-			MCP_DMCU_VERSION);
-
-	/* Notify microcontroller of new command */
-	REG_UPDATE(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 1);
-
-	/* Ensure command has been executed before continuing */
-	REG_WAIT(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 0, 100, 800);
-
-	/* Somehow version does not match, so fail and return version 0 */
-	if (dmcu->dmcu_version.interface_version != REG_READ(DC_DMCU_SCRATCH))
-		dmcu->dmcu_version.interface_version = 0;
 }
 
 static bool dcn10_dmcu_init(struct dmcu *dmcu)
 {
 	struct dce_dmcu *dmcu_dce = TO_DCE_DMCU(dmcu);
+	bool status = false;
 
-	/* DMCU FW should populate the scratch register if running */
-	if (REG_READ(DC_DMCU_SCRATCH) == 0)
-		return false;
-
-	/* Check state is uninitialized */
-	dcn10_get_dmcu_state(dmcu);
-
-	/* If microcontroller is already initialized, do nothing */
-	if (dmcu->dmcu_state == DMCU_RUNNING)
-		return true;
-
-	/* Retrieve and cache the DMCU firmware version. */
-	dcn10_get_dmcu_version(dmcu);
-
-	/* Check interface version to confirm firmware is loaded and running */
-	if (dmcu->dmcu_version.interface_version == 0)
-		return false;
+	/* Definition of DC_DMCU_SCRATCH
+	 * 0 : firmware not loaded
+	 * 1 : PSP loaded DMCU FW but it is not initialized
+	 * 2 : firmware already initialized
+	 */
+	dmcu->dmcu_state = REG_READ(DC_DMCU_SCRATCH);
 
-	/* Wait until microcontroller is ready to process interrupt */
-	REG_WAIT(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 0, 100, 800);
+	switch (dmcu->dmcu_state) {
+	case DMCU_UNLOADED:
+		status = false;
+		break;
+	case DMCU_LOADED_UNINITIALIZED:
+		/* Wait until microcontroller is ready to process interrupt */
+		REG_WAIT(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 0, 100, 800);
 
-	/* Set initialized ramping boundary value */
-	REG_WRITE(MASTER_COMM_DATA_REG1, 0xFFFF);
+		/* Set initialized ramping boundary value */
+		REG_WRITE(MASTER_COMM_DATA_REG1, 0xFFFF);
 
-	/* Set command to initialize microcontroller */
-	REG_UPDATE(MASTER_COMM_CMD_REG, MASTER_COMM_CMD_REG_BYTE0,
+		/* Set command to initialize microcontroller */
+		REG_UPDATE(MASTER_COMM_CMD_REG, MASTER_COMM_CMD_REG_BYTE0,
 			MCP_INIT_DMCU);
 
-	/* Notify microcontroller of new command */
-	REG_UPDATE(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 1);
+		/* Notify microcontroller of new command */
+		REG_UPDATE(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 1);
 
-	/* Ensure command has been executed before continuing */
-	REG_WAIT(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 0, 100, 800);
+		/* Ensure command has been executed before continuing */
+		REG_WAIT(MASTER_COMM_CNTL_REG, MASTER_COMM_INTERRUPT, 0, 100, 800);
 
-	// Check state is initialized
-	dcn10_get_dmcu_state(dmcu);
+		// Check state is initialized
+		dmcu->dmcu_state = REG_READ(DC_DMCU_SCRATCH);
 
-	// If microcontroller is not in running state, fail
-	if (dmcu->dmcu_state != DMCU_RUNNING)
-		return false;
+		// If microcontroller is not in running state, fail
+		if (dmcu->dmcu_state == DMCU_RUNNING) {
+			/* Retrieve and cache the DMCU firmware version. */
+			dcn10_get_dmcu_version(dmcu);
+			status = true;
+		} else
+			status = false;
 
-	return true;
+		break;
+	case DMCU_RUNNING:
+		status = true;
+		break;
+	default:
+		status = false;
+		break;
+	}
+
+	return status;
 }
 
+
 static bool dcn10_dmcu_load_iram(struct dmcu *dmcu,
 		unsigned int start_offset,
 		const char *src,
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_hwseq.h b/drivers/gpu/drm/amd/display/dc/dce/dce_hwseq.h
index c83a7f05f14c..956bdf14503f 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_hwseq.h
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_hwseq.h
@@ -133,6 +133,10 @@
 	SR(DCHUB_AGP_TOP), \
 	BL_REG_LIST()
 
+#define HWSEQ_VG20_REG_LIST() \
+	HWSEQ_DCE120_REG_LIST(),\
+	MMHUB_SR(MC_VM_XGMI_LFB_CNTL)
+
 #define HWSEQ_DCE112_REG_LIST() \
 	HWSEQ_DCE10_REG_LIST(), \
 	HWSEQ_PIXEL_RATE_REG_LIST(CRTC), \
@@ -298,6 +302,7 @@ struct dce_hwseq_registers {
 	uint32_t MC_VM_SYSTEM_APERTURE_DEFAULT_ADDR_LSB;
 	uint32_t MC_VM_SYSTEM_APERTURE_LOW_ADDR;
 	uint32_t MC_VM_SYSTEM_APERTURE_HIGH_ADDR;
+	uint32_t MC_VM_XGMI_LFB_CNTL;
 	uint32_t AZALIA_AUDIO_DTO;
 	uint32_t AZALIA_CONTROLLER_CLOCK_GATING;
 };
@@ -382,6 +387,11 @@ struct dce_hwseq_registers {
 	HWS_SF(, LVTMA_PWRSEQ_CNTL, LVTMA_BLON, mask_sh), \
 	HWS_SF(, LVTMA_PWRSEQ_STATE, LVTMA_PWRSEQ_TARGET_STATE_R, mask_sh)
 
+#define HWSEQ_VG20_MASK_SH_LIST(mask_sh)\
+	HWSEQ_DCE12_MASK_SH_LIST(mask_sh),\
+	HWS_SF(, MC_VM_XGMI_LFB_CNTL, PF_LFB_REGION, mask_sh),\
+	HWS_SF(, MC_VM_XGMI_LFB_CNTL, PF_MAX_REGION, mask_sh)
+
 #define HWSEQ_DCN_MASK_SH_LIST(mask_sh)\
 	HWSEQ_PIXEL_RATE_MASK_SH_LIST(mask_sh, OTG0_),\
 	HWS_SF1(OTG0_, PHYPLL_PIXEL_RATE_CNTL, PHYPLL_PIXEL_RATE_SOURCE, mask_sh), \
@@ -470,6 +480,8 @@ struct dce_hwseq_registers {
 	type PHYSICAL_PAGE_NUMBER_MSB;\
 	type PHYSICAL_PAGE_NUMBER_LSB;\
 	type LOGICAL_ADDR; \
+	type PF_LFB_REGION;\
+	type PF_MAX_REGION;\
 	type ENABLE_L1_TLB;\
 	type SYSTEM_ACCESS_MODE;\
 	type LVTMA_BLON;\
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c b/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c
index 3e18ea84b1f9..314c04a915d2 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c
@@ -599,12 +599,12 @@ bool dce110_link_encoder_validate_dvi_output(
 	if ((connector_signal == SIGNAL_TYPE_DVI_SINGLE_LINK ||
 		connector_signal == SIGNAL_TYPE_HDMI_TYPE_A) &&
 		signal != SIGNAL_TYPE_HDMI_TYPE_A &&
-		crtc_timing->pix_clk_khz > TMDS_MAX_PIXEL_CLOCK)
+		crtc_timing->pix_clk_100hz > (TMDS_MAX_PIXEL_CLOCK * 10))
 		return false;
-	if (crtc_timing->pix_clk_khz < TMDS_MIN_PIXEL_CLOCK)
+	if (crtc_timing->pix_clk_100hz < (TMDS_MIN_PIXEL_CLOCK * 10))
 		return false;
 
-	if (crtc_timing->pix_clk_khz > max_pixel_clock)
+	if (crtc_timing->pix_clk_100hz > (max_pixel_clock * 10))
 		return false;
 
 	/* DVI supports 6/8bpp single-link and 10/16bpp dual-link */
@@ -788,7 +788,7 @@ bool dce110_link_encoder_validate_output_with_stream(
 	case SIGNAL_TYPE_DVI_DUAL_LINK:
 		is_valid = dce110_link_encoder_validate_dvi_output(
 			enc110,
-			stream->sink->link->connector_signal,
+			stream->link->connector_signal,
 			stream->signal,
 			&stream->timing);
 	break;
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.c b/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.c
index 85686d917636..a24a2bda8656 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_mem_input.c
@@ -479,7 +479,7 @@ static void program_grph_pixel_format(
 	case SURFACE_PIXEL_FORMAT_GRPH_ABGR16161616F:
 		sign = 1;
 		floating = 1;
-		/* no break */
+		/* fall through */
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616F: /* shouldn't this get float too? */
 	case SURFACE_PIXEL_FORMAT_GRPH_ARGB16161616:
 		grph_depth = 3;
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_stream_encoder.c b/drivers/gpu/drm/amd/display/dc/dce/dce_stream_encoder.c
index cce0d18f91da..1fa2d4fd7a35 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_stream_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_stream_encoder.c
@@ -288,9 +288,18 @@ static void dce110_stream_encoder_dp_set_stream_attribute(
 #endif
 
 	struct dce110_stream_encoder *enc110 = DCE110STRENC_FROM_STRENC(enc);
-
+	struct dc_crtc_timing hw_crtc_timing = *crtc_timing;
+	if (hw_crtc_timing.flags.INTERLACE) {
+		/* the input timing is in VESA spec format with the interlace flag set to 1 */
+		hw_crtc_timing.v_total /= 2;
+		hw_crtc_timing.v_border_top /= 2;
+		hw_crtc_timing.v_addressable /= 2;
+		hw_crtc_timing.v_border_bottom /= 2;
+		hw_crtc_timing.v_front_porch /= 2;
+		hw_crtc_timing.v_sync_width /= 2;
+	}
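
For interlaced inputs the encoder is now programmed with per-field vertical values, so the VESA frame numbers are simply halved with integer division. A standalone check with 1080i-style numbers (the porch and sync-width values are illustrative):

#include <stdio.h>

struct timing { unsigned v_total, v_addressable, v_front_porch, v_sync_width; };

int main(void)
{
	struct timing t = { 1125, 1080, 4, 5 };	/* VESA-style frame values */

	/* halve to per-field values, as in the interlace branch above */
	t.v_total /= 2;
	t.v_addressable /= 2;
	t.v_front_porch /= 2;
	t.v_sync_width /= 2;

	printf("v_total=%u v_addressable=%u v_front_porch=%u v_sync_width=%u\n",
	       t.v_total, t.v_addressable, t.v_front_porch, t.v_sync_width);
	/* expected: v_total=562 v_addressable=540 v_front_porch=2 v_sync_width=2 */
	return 0;
}
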
 	/* set pixel encoding */
-	switch (crtc_timing->pixel_encoding) {
+	switch (hw_crtc_timing.pixel_encoding) {
 	case PIXEL_ENCODING_YCBCR422:
 		REG_UPDATE(DP_PIXEL_FORMAT, DP_PIXEL_ENCODING,
 				DP_PIXEL_ENCODING_TYPE_YCBCR422);
@@ -299,8 +308,8 @@ static void dce110_stream_encoder_dp_set_stream_attribute(
 		REG_UPDATE(DP_PIXEL_FORMAT, DP_PIXEL_ENCODING,
 				DP_PIXEL_ENCODING_TYPE_YCBCR444);
 
-		if (crtc_timing->flags.Y_ONLY)
-			if (crtc_timing->display_color_depth != COLOR_DEPTH_666)
+		if (hw_crtc_timing.flags.Y_ONLY)
+			if (hw_crtc_timing.display_color_depth != COLOR_DEPTH_666)
 				/* HW testing only, no use case yet.
 				 * Color depth of Y-only could be
 				 * 8, 10, 12, 16 bits */
@@ -335,7 +344,7 @@ static void dce110_stream_encoder_dp_set_stream_attribute(
 
 	/* set color depth */
 
-	switch (crtc_timing->display_color_depth) {
+	switch (hw_crtc_timing.display_color_depth) {
 	case COLOR_DEPTH_666:
 		REG_UPDATE(DP_PIXEL_FORMAT, DP_COMPONENT_DEPTH,
 				0);
@@ -363,7 +372,7 @@ static void dce110_stream_encoder_dp_set_stream_attribute(
 
 
 #if defined(CONFIG_DRM_AMD_DC_DCN1_0)
-	switch (crtc_timing->display_color_depth) {
+	switch (hw_crtc_timing.display_color_depth) {
 	case COLOR_DEPTH_666:
 		colorimetry_bpc = 0;
 		break;
@@ -401,9 +410,9 @@ static void dce110_stream_encoder_dp_set_stream_attribute(
 			misc0 = misc0 | 0x8; /* bit3=1, bit4=0 */
 			misc1 = misc1 & ~0x80; /* bit7 = 0*/
 			dynamic_range_ycbcr = 0; /*bt601*/
-			if (crtc_timing->pixel_encoding == PIXEL_ENCODING_YCBCR422)
+			if (hw_crtc_timing.pixel_encoding == PIXEL_ENCODING_YCBCR422)
 				misc0 = misc0 | 0x2; /* bit2=0, bit1=1 */
-			else if (crtc_timing->pixel_encoding == PIXEL_ENCODING_YCBCR444)
+			else if (hw_crtc_timing.pixel_encoding == PIXEL_ENCODING_YCBCR444)
 				misc0 = misc0 | 0x4; /* bit2=1, bit1=0 */
 			break;
 		case COLOR_SPACE_YCBCR709:
@@ -411,9 +420,9 @@ static void dce110_stream_encoder_dp_set_stream_attribute(
 			misc0 = misc0 | 0x18; /* bit3=1, bit4=1 */
 			misc1 = misc1 & ~0x80; /* bit7 = 0*/
 			dynamic_range_ycbcr = 1; /*bt709*/
-			if (crtc_timing->pixel_encoding == PIXEL_ENCODING_YCBCR422)
+			if (hw_crtc_timing.pixel_encoding == PIXEL_ENCODING_YCBCR422)
 				misc0 = misc0 | 0x2; /* bit2=0, bit1=1 */
-			else if (crtc_timing->pixel_encoding == PIXEL_ENCODING_YCBCR444)
+			else if (hw_crtc_timing.pixel_encoding == PIXEL_ENCODING_YCBCR444)
 				misc0 = misc0 | 0x4; /* bit2=1, bit1=0 */
 			break;
 		case COLOR_SPACE_2020_RGB_LIMITEDRANGE:
@@ -453,27 +462,27 @@ static void dce110_stream_encoder_dp_set_stream_attribute(
 	 */
 		if (REG(DP_MSA_TIMING_PARAM1))
 			REG_SET_2(DP_MSA_TIMING_PARAM1, 0,
-					DP_MSA_HTOTAL, crtc_timing->h_total,
-					DP_MSA_VTOTAL, crtc_timing->v_total);
+					DP_MSA_HTOTAL, hw_crtc_timing.h_total,
+					DP_MSA_VTOTAL, hw_crtc_timing.v_total);
 #endif
 
 		/* calculate from VESA timing parameters
 		 * h_active_start related to leading edge of sync
 		 */
 
-		h_blank = crtc_timing->h_total - crtc_timing->h_border_left -
-				crtc_timing->h_addressable - crtc_timing->h_border_right;
+		h_blank = hw_crtc_timing.h_total - hw_crtc_timing.h_border_left -
+				hw_crtc_timing.h_addressable - hw_crtc_timing.h_border_right;
 
-		h_back_porch = h_blank - crtc_timing->h_front_porch -
-				crtc_timing->h_sync_width;
+		h_back_porch = h_blank - hw_crtc_timing.h_front_porch -
+				hw_crtc_timing.h_sync_width;
 
 		/* start at beginning of left border */
-		h_active_start = crtc_timing->h_sync_width + h_back_porch;
+		h_active_start = hw_crtc_timing.h_sync_width + h_back_porch;
 
 
-		v_active_start = crtc_timing->v_total - crtc_timing->v_border_top -
-				crtc_timing->v_addressable - crtc_timing->v_border_bottom -
-				crtc_timing->v_front_porch;
+		v_active_start = hw_crtc_timing.v_total - hw_crtc_timing.v_border_top -
+				hw_crtc_timing.v_addressable - hw_crtc_timing.v_border_bottom -
+				hw_crtc_timing.v_front_porch;
 
 
 #if defined(CONFIG_DRM_AMD_DC_DCN1_0)
@@ -486,21 +495,21 @@ static void dce110_stream_encoder_dp_set_stream_attribute(
 		if (REG(DP_MSA_TIMING_PARAM3))
 			REG_SET_4(DP_MSA_TIMING_PARAM3, 0,
 					DP_MSA_HSYNCWIDTH,
-					crtc_timing->h_sync_width,
+					hw_crtc_timing.h_sync_width,
 					DP_MSA_HSYNCPOLARITY,
-					!crtc_timing->flags.HSYNC_POSITIVE_POLARITY,
+					!hw_crtc_timing.flags.HSYNC_POSITIVE_POLARITY,
 					DP_MSA_VSYNCWIDTH,
-					crtc_timing->v_sync_width,
+					hw_crtc_timing.v_sync_width,
 					DP_MSA_VSYNCPOLARITY,
-					!crtc_timing->flags.VSYNC_POSITIVE_POLARITY);
+					!hw_crtc_timing.flags.VSYNC_POSITIVE_POLARITY);
 
 		/* HWIDTH includes border or overscan */
 		if (REG(DP_MSA_TIMING_PARAM4))
 			REG_SET_2(DP_MSA_TIMING_PARAM4, 0,
-				DP_MSA_HWIDTH, crtc_timing->h_border_left +
-				crtc_timing->h_addressable + crtc_timing->h_border_right,
-				DP_MSA_VHEIGHT, crtc_timing->v_border_top +
-				crtc_timing->v_addressable + crtc_timing->v_border_bottom);
+				DP_MSA_HWIDTH, hw_crtc_timing.h_border_left +
+				hw_crtc_timing.h_addressable + hw_crtc_timing.h_border_right,
+				DP_MSA_VHEIGHT, hw_crtc_timing.v_border_top +
+				hw_crtc_timing.v_addressable + hw_crtc_timing.v_border_bottom);
 #endif
 	}
 #endif
@@ -662,7 +671,7 @@ static void dce110_stream_encoder_dvi_set_stream_attribute(
 	cntl.signal = is_dual_link ?
 			SIGNAL_TYPE_DVI_DUAL_LINK : SIGNAL_TYPE_DVI_SINGLE_LINK;
 	cntl.enable_dp_audio = false;
-	cntl.pixel_clock = crtc_timing->pix_clk_khz;
+	cntl.pixel_clock = crtc_timing->pix_clk_100hz / 10;
 	cntl.lanes_number = (is_dual_link) ? LANE_COUNT_EIGHT : LANE_COUNT_FOUR;
 
 	if (enc110->base.bp->funcs->encoder_control(
@@ -686,7 +695,7 @@ static void dce110_stream_encoder_lvds_set_stream_attribute(
 	cntl.engine_id = enc110->base.id;
 	cntl.signal = SIGNAL_TYPE_LVDS;
 	cntl.enable_dp_audio = false;
-	cntl.pixel_clock = crtc_timing->pix_clk_khz;
+	cntl.pixel_clock = crtc_timing->pix_clk_100hz / 10;
 	cntl.lanes_number = LANE_COUNT_FOUR;
 
 	if (enc110->base.bp->funcs->encoder_control(
@@ -1575,6 +1584,14 @@ static void setup_stereo_sync(
 	REG_UPDATE(DIG_FE_CNTL, DIG_STEREOSYNC_GATE_EN, !enable);
 }
 
+static void dig_connect_to_otg(
+	struct stream_encoder *enc,
+	int tg_inst)
+{
+	struct dce110_stream_encoder *enc110 = DCE110STRENC_FROM_STRENC(enc);
+
+	REG_UPDATE(DIG_FE_CNTL, DIG_SOURCE_SELECT, tg_inst);
+}
 
 static const struct stream_encoder_funcs dce110_str_enc_funcs = {
 	.dp_set_stream_attribute =
@@ -1609,7 +1626,7 @@ static const struct stream_encoder_funcs dce110_str_enc_funcs = {
 	.hdmi_audio_disable = dce110_se_hdmi_audio_disable,
 	.setup_stereo_sync  = setup_stereo_sync,
 	.set_avmute = dce110_stream_encoder_set_avmute,
-
+	.dig_connect_to_otg  = dig_connect_to_otg,
 };
 
 void dce110_stream_encoder_construct(
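
The stream encoder hunks above switch stored pixel clocks from kHz to units of 100 Hz; call sites that still want kHz divide by 10. A minimal sketch of the conversion (helper name is illustrative, not part of this patch):

#include <linux/types.h>

static inline u32 pix_clk_100hz_to_khz(u32 pix_clk_100hz)
{
	/* timings now carry pix_clk_100hz (units of 100 Hz); legacy
	 * consumers expecting kHz divide by 10, e.g. 148.5 MHz ->
	 * pix_clk_100hz = 1485000 -> 148500 kHz
	 */
	return pix_clk_100hz / 10;
}
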
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_stream_encoder.h b/drivers/gpu/drm/amd/display/dc/dce/dce_stream_encoder.h
index 6c28229c76eb..f9cdf2b5242c 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_stream_encoder.h
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_stream_encoder.h
@@ -199,7 +199,8 @@
 	SE_SF(DP_SEC_CNTL, DP_SEC_ATP_ENABLE, mask_sh),\
 	SE_SF(DP_SEC_CNTL, DP_SEC_AIP_ENABLE, mask_sh),\
 	SE_SF(DP_SEC_CNTL, DP_SEC_ACM_ENABLE, mask_sh),\
-	SE_SF(AFMT_AUDIO_PACKET_CONTROL, AFMT_AUDIO_SAMPLE_SEND, mask_sh)
+	SE_SF(AFMT_AUDIO_PACKET_CONTROL, AFMT_AUDIO_SAMPLE_SEND, mask_sh),\
+	SE_SF(DIG_FE_CNTL, DIG_SOURCE_SELECT, mask_sh)
 
 #define SE_COMMON_MASK_SH_LIST_DCE_COMMON(mask_sh)\
 	SE_COMMON_MASK_SH_LIST_DCE_COMMON_BASE(mask_sh)
@@ -284,7 +285,8 @@
 	SE_SF(DIG0_DIG_FE_CNTL, TMDS_PIXEL_ENCODING, mask_sh),\
 	SE_SF(DIG0_DIG_FE_CNTL, TMDS_COLOR_FORMAT, mask_sh),\
 	SE_SF(DIG0_DIG_FE_CNTL, DIG_STEREOSYNC_SELECT, mask_sh),\
-	SE_SF(DIG0_DIG_FE_CNTL, DIG_STEREOSYNC_GATE_EN, mask_sh)
+	SE_SF(DIG0_DIG_FE_CNTL, DIG_STEREOSYNC_GATE_EN, mask_sh),\
+	SE_SF(DIG0_DIG_FE_CNTL, DIG_SOURCE_SELECT, mask_sh)
 
 #define SE_COMMON_MASK_SH_LIST_SOC(mask_sh)\
 	SE_COMMON_MASK_SH_LIST_SOC_BASE(mask_sh)
@@ -494,6 +496,7 @@ struct dce_stream_encoder_shift {
 	uint8_t HDMI_DB_DISABLE;
 	uint8_t DP_VID_N_MUL;
 	uint8_t DP_VID_M_DOUBLE_VALUE_EN;
+	uint8_t DIG_SOURCE_SELECT;
 };
 
 struct dce_stream_encoder_mask {
@@ -624,6 +627,7 @@ struct dce_stream_encoder_mask {
 	uint32_t HDMI_DB_DISABLE;
 	uint32_t DP_VID_N_MUL;
 	uint32_t DP_VID_M_DOUBLE_VALUE_EN;
+	uint32_t DIG_SOURCE_SELECT;
 };
 
 struct dce110_stream_enc_registers {
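
The new dig_connect_to_otg() hook is a single REG_UPDATE() on DIG_FE_CNTL using the DIG_SOURCE_SELECT shift/mask entries added above. REG_UPDATE() is a read-modify-write of one register field; roughly, assuming the shift and mask are looked up from the per-ASIC tables (the sketch below is illustrative, not the real macro expansion):

#include <linux/types.h>

static u32 set_reg_field(u32 reg_val, u32 shift, u32 mask, u32 field_val)
{
	reg_val &= ~mask;                       /* clear the field */
	reg_val |= (field_val << shift) & mask; /* select the OTG instance */
	return reg_val;
}
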
diff --git a/drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c b/drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c
index 6ae51a5dfc04..23044e6723e8 100644
--- a/drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce100/dce100_resource.c
@@ -76,6 +76,7 @@
 
 #ifndef mmBIOS_SCRATCH_2
 	#define mmBIOS_SCRATCH_2 0x05CB
+	#define mmBIOS_SCRATCH_3 0x05CC
 	#define mmBIOS_SCRATCH_6 0x05CF
 #endif
 
@@ -365,6 +366,7 @@ static const struct dce_abm_mask abm_mask = {
 #define DCFE_MEM_PWR_CTRL_REG_BASE 0x1b03
 
 static const struct bios_registers bios_regs = {
+	.BIOS_SCRATCH_3 = mmBIOS_SCRATCH_3,
 	.BIOS_SCRATCH_6 = mmBIOS_SCRATCH_6
 };
 
@@ -587,7 +589,7 @@ struct output_pixel_processor *dce100_opp_create(
 	return &opp->base;
 }
 
-struct aux_engine *dce100_aux_engine_create(
+struct dce_aux *dce100_aux_engine_create(
 	struct dc_context *ctx,
 	uint32_t inst)
 {
diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_compressor.c b/drivers/gpu/drm/amd/display/dc/dce110/dce110_compressor.c
index 52d50e24a995..7b23239d33fe 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_compressor.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_compressor.c
@@ -62,8 +62,6 @@ static const struct dce110_compressor_reg_offsets reg_offsets[] = {
 }
 };
 
-static const uint32_t dce11_one_lpt_channel_max_resolution = 2560 * 1600;
-
 static uint32_t align_to_chunks_number_per_line(uint32_t pixels)
 {
 	return 256 * ((pixels + 255) / 256);
diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index 8f09b8625c5d..5e4db3712eef 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -614,55 +614,6 @@ dce110_set_output_transfer_func(struct pipe_ctx *pipe_ctx,
 	return true;
 }
 
-static enum dc_status bios_parser_crtc_source_select(
-		struct pipe_ctx *pipe_ctx)
-{
-	struct dc_bios *dcb;
-	/* call VBIOS table to set CRTC source for the HW
-	 * encoder block
-	 * note: video bios clears all FMT setting here. */
-	struct bp_crtc_source_select crtc_source_select = {0};
-	const struct dc_sink *sink = pipe_ctx->stream->sink;
-
-	crtc_source_select.engine_id = pipe_ctx->stream_res.stream_enc->id;
-	crtc_source_select.controller_id = pipe_ctx->stream_res.tg->inst + 1;
-	/*TODO: Need to un-hardcode color depth, dp_audio and account for
-	 * the case where signal and sink signal is different (translator
-	 * encoder)*/
-	crtc_source_select.signal = pipe_ctx->stream->signal;
-	crtc_source_select.enable_dp_audio = false;
-	crtc_source_select.sink_signal = pipe_ctx->stream->signal;
-
-	switch (pipe_ctx->stream->timing.display_color_depth) {
-	case COLOR_DEPTH_666:
-		crtc_source_select.display_output_bit_depth = PANEL_6BIT_COLOR;
-		break;
-	case COLOR_DEPTH_888:
-		crtc_source_select.display_output_bit_depth = PANEL_8BIT_COLOR;
-		break;
-	case COLOR_DEPTH_101010:
-		crtc_source_select.display_output_bit_depth = PANEL_10BIT_COLOR;
-		break;
-	case COLOR_DEPTH_121212:
-		crtc_source_select.display_output_bit_depth = PANEL_12BIT_COLOR;
-		break;
-	default:
-		BREAK_TO_DEBUGGER();
-		crtc_source_select.display_output_bit_depth = PANEL_8BIT_COLOR;
-		break;
-	}
-
-	dcb = sink->ctx->dc_bios;
-
-	if (BP_RESULT_OK != dcb->funcs->crtc_source_select(
-		dcb,
-		&crtc_source_select)) {
-		return DC_ERROR_UNEXPECTED;
-	}
-
-	return DC_OK;
-}
-
 void dce110_update_info_frame(struct pipe_ctx *pipe_ctx)
 {
 	bool is_hdmi;
@@ -692,10 +643,10 @@ void dce110_update_info_frame(struct pipe_ctx *pipe_ctx)
 void dce110_enable_stream(struct pipe_ctx *pipe_ctx)
 {
 	enum dc_lane_count lane_count =
-		pipe_ctx->stream->sink->link->cur_link_settings.lane_count;
+		pipe_ctx->stream->link->cur_link_settings.lane_count;
 
 	struct dc_crtc_timing *timing = &pipe_ctx->stream->timing;
-	struct dc_link *link = pipe_ctx->stream->sink->link;
+	struct dc_link *link = pipe_ctx->stream->link;
 
 
 	uint32_t active_total_with_borders;
@@ -1053,7 +1004,7 @@ void dce110_disable_audio_stream(struct pipe_ctx *pipe_ctx, int option)
 void dce110_disable_stream(struct pipe_ctx *pipe_ctx, int option)
 {
 	struct dc_stream_state *stream = pipe_ctx->stream;
-	struct dc_link *link = stream->sink->link;
+	struct dc_link *link = stream->link;
 	struct dc *dc = pipe_ctx->stream->ctx->dc;
 
 	if (dc_is_hdmi_signal(pipe_ctx->stream->signal))
@@ -1078,11 +1029,10 @@ void dce110_unblank_stream(struct pipe_ctx *pipe_ctx,
 {
 	struct encoder_unblank_param params = { { 0 } };
 	struct dc_stream_state *stream = pipe_ctx->stream;
-	struct dc_link *link = stream->sink->link;
+	struct dc_link *link = stream->link;
 
 	/* only 3 items below are used by unblank */
-	params.pixel_clk_khz =
-		pipe_ctx->stream->timing.pix_clk_khz;
+	params.pixel_clk_khz = pipe_ctx->stream->timing.pix_clk_100hz / 10;
 	params.link_settings.link_rate = link_settings->link_rate;
 
 	if (dc_is_dp_signal(pipe_ctx->stream->signal))
@@ -1092,10 +1042,11 @@ void dce110_unblank_stream(struct pipe_ctx *pipe_ctx,
 		link->dc->hwss.edp_backlight_control(link, true);
 	}
 }
+
 void dce110_blank_stream(struct pipe_ctx *pipe_ctx)
 {
 	struct dc_stream_state *stream = pipe_ctx->stream;
-	struct dc_link *link = stream->sink->link;
+	struct dc_link *link = stream->link;
 
 	if (link->local_sink && link->local_sink->sink_signal == SIGNAL_TYPE_EDP) {
 		link->dc->hwss.edp_backlight_control(link, false);
@@ -1168,27 +1119,27 @@ static void build_audio_output(
 			stream->timing.flags.INTERLACE;
 
 	audio_output->crtc_info.refresh_rate =
-		(stream->timing.pix_clk_khz*1000)/
+		(stream->timing.pix_clk_100hz*100)/
 		(stream->timing.h_total*stream->timing.v_total);
 
 	audio_output->crtc_info.color_depth =
 		stream->timing.display_color_depth;
 
 	audio_output->crtc_info.requested_pixel_clock =
-			pipe_ctx->stream_res.pix_clk_params.requested_pix_clk;
+			pipe_ctx->stream_res.pix_clk_params.requested_pix_clk_100hz / 10;
 
 	audio_output->crtc_info.calculated_pixel_clock =
-			pipe_ctx->stream_res.pix_clk_params.requested_pix_clk;
+			pipe_ctx->stream_res.pix_clk_params.requested_pix_clk_100hz / 10;
 
 /* for HDMI, audio ACR is scaled by the deep color ratio factor */
 	if (dc_is_hdmi_signal(pipe_ctx->stream->signal) &&
 		audio_output->crtc_info.requested_pixel_clock ==
-				stream->timing.pix_clk_khz) {
+				(stream->timing.pix_clk_100hz / 10)) {
 		if (pipe_ctx->stream_res.pix_clk_params.pixel_encoding == PIXEL_ENCODING_YCBCR420) {
 			audio_output->crtc_info.requested_pixel_clock =
 					audio_output->crtc_info.requested_pixel_clock/2;
 			audio_output->crtc_info.calculated_pixel_clock =
-					pipe_ctx->stream_res.pix_clk_params.requested_pix_clk/2;
+					pipe_ctx->stream_res.pix_clk_params.requested_pix_clk_100hz/20;
 
 		}
 	}
@@ -1299,8 +1250,6 @@ static enum dc_status dce110_enable_stream_timing(
 	struct pipe_ctx *pipe_ctx_old = &dc->current_state->res_ctx.
 			pipe_ctx[pipe_ctx->pipe_idx];
 	struct tg_color black_color = {0};
-	struct drr_params params = {0};
-	unsigned int event_triggers = 0;
 
 	if (!pipe_ctx_old->stream) {
 
@@ -1329,20 +1278,6 @@ static enum dc_status dce110_enable_stream_timing(
 				pipe_ctx->stream_res.tg,
 				&stream->timing,
 				true);
-
-		params.vertical_total_min = stream->adjust.v_total_min;
-		params.vertical_total_max = stream->adjust.v_total_max;
-		if (pipe_ctx->stream_res.tg->funcs->set_drr)
-			pipe_ctx->stream_res.tg->funcs->set_drr(
-				pipe_ctx->stream_res.tg, &params);
-
-		// DRR should set trigger event to monitor surface update event
-		if (stream->adjust.v_total_min != 0 &&
-				stream->adjust.v_total_max != 0)
-			event_triggers = 0x80;
-		if (pipe_ctx->stream_res.tg->funcs->set_static_screen_control)
-			pipe_ctx->stream_res.tg->funcs->set_static_screen_control(
-				pipe_ctx->stream_res.tg, event_triggers);
 	}
 
 	if (!pipe_ctx_old->stream) {
@@ -1362,6 +1297,12 @@ static enum dc_status apply_single_controller_ctx_to_hw(
 		struct dc *dc)
 {
 	struct dc_stream_state *stream = pipe_ctx->stream;
+	struct drr_params params = {0};
+	unsigned int event_triggers = 0;
+
+	if (dc->hwss.disable_stream_gating) {
+		dc->hwss.disable_stream_gating(dc, pipe_ctx);
+	}
 
 	if (pipe_ctx->stream_res.audio != NULL) {
 		struct audio_output audio_output;
@@ -1388,14 +1329,30 @@ static enum dc_status apply_single_controller_ctx_to_hw(
 	}
 
 	/*  */
-	dc->hwss.enable_stream_timing(pipe_ctx, context, dc);
+	/* Do not touch stream timing on seamless boot optimization. */
+	if (!pipe_ctx->stream->apply_seamless_boot_optimization)
+		dc->hwss.enable_stream_timing(pipe_ctx, context, dc);
+
+	if (dc->hwss.setup_vupdate_interrupt)
+		dc->hwss.setup_vupdate_interrupt(pipe_ctx);
+
+	params.vertical_total_min = stream->adjust.v_total_min;
+	params.vertical_total_max = stream->adjust.v_total_max;
+	if (pipe_ctx->stream_res.tg->funcs->set_drr)
+		pipe_ctx->stream_res.tg->funcs->set_drr(
+			pipe_ctx->stream_res.tg, &params);
+
+	// For DRR, set the trigger event to monitor the surface update event
+	if (stream->adjust.v_total_min != 0 && stream->adjust.v_total_max != 0)
+		event_triggers = 0x80;
+	if (pipe_ctx->stream_res.tg->funcs->set_static_screen_control)
+		pipe_ctx->stream_res.tg->funcs->set_static_screen_control(
+				pipe_ctx->stream_res.tg, event_triggers);
 
-	/* TODO: move to stream encoder */
 	if (pipe_ctx->stream->signal != SIGNAL_TYPE_VIRTUAL)
-		if (DC_OK != bios_parser_crtc_source_select(pipe_ctx)) {
-			BREAK_TO_DEBUGGER();
-			return DC_ERROR_UNEXPECTED;
-		}
+		pipe_ctx->stream_res.stream_enc->funcs->dig_connect_to_otg(
+			pipe_ctx->stream_res.stream_enc,
+			pipe_ctx->stream_res.tg->inst);
 
 	pipe_ctx->stream_res.opp->funcs->opp_set_dyn_expansion(
 			pipe_ctx->stream_res.opp,
@@ -1413,7 +1370,7 @@ static enum dc_status apply_single_controller_ctx_to_hw(
 
 	pipe_ctx->plane_res.scl_data.lb_params.alpha_en = pipe_ctx->bottom_pipe != 0;
 
-	pipe_ctx->stream->sink->link->psr_enabled = false;
+	pipe_ctx->stream->link->psr_enabled = false;
 
 	return DC_OK;
 }
@@ -1523,7 +1480,7 @@ static struct dc_link *get_link_for_edp(struct dc *dc)
 	return NULL;
 }
 
-static struct dc_link *get_link_for_edp_not_in_use(
+static struct dc_link *get_link_for_edp_to_turn_off(
 		struct dc *dc,
 		struct dc_state *context)
 {
@@ -1532,8 +1489,12 @@ static struct dc_link *get_link_for_edp_not_in_use(
 
 	/* check if an eDP panel is supposed to be set to a mode; if yes, no need to disable */
 	for (i = 0; i < context->stream_count; i++) {
-		if (context->streams[i]->signal == SIGNAL_TYPE_EDP)
-			return NULL;
+		if (context->streams[i]->signal == SIGNAL_TYPE_EDP) {
+			if (context->streams[i]->dpms_off == true)
+				return context->streams[i]->sink->link;
+			else
+				return NULL;
+		}
 	}
 
 	/* check if there is an eDP panel not in use */
@@ -1560,9 +1521,16 @@ void dce110_enable_accelerated_mode(struct dc *dc, struct dc_state *context)
 	int i;
 	struct dc_link *edp_link_to_turnoff = NULL;
 	struct dc_link *edp_link = get_link_for_edp(dc);
-	struct dc_bios *bios = dc->ctx->dc_bios;
 	bool can_edp_fast_boot_optimize = false;
 	bool apply_edp_fast_boot_optimization = false;
+	bool can_apply_seamless_boot = false;
+
+	for (i = 0; i < context->stream_count; i++) {
+		if (context->streams[i]->apply_seamless_boot_optimization) {
+			can_apply_seamless_boot = true;
+			break;
+		}
+	}
 
 	if (edp_link) {
 		/* this seems to cause blank screens on DCE8 */
@@ -1576,7 +1544,7 @@ void dce110_enable_accelerated_mode(struct dc *dc, struct dc_state *context)
 	}
 
 	if (can_edp_fast_boot_optimize)
-		edp_link_to_turnoff = get_link_for_edp_not_in_use(dc, context);
+		edp_link_to_turnoff = get_link_for_edp_to_turn_off(dc, context);
 
 	/* if the OS doesn't light up eDP and an eDP link is available, we want
 	 * to disable it. If resuming from S4/S5, the optimization should apply.
@@ -1587,25 +1555,11 @@ void dce110_enable_accelerated_mode(struct dc *dc, struct dc_state *context)
 			if (context->streams[i]->signal == SIGNAL_TYPE_EDP) {
 				context->streams[i]->apply_edp_fast_boot_optimization = true;
 				apply_edp_fast_boot_optimization = true;
-
-				/* When after S4 and S5, vbios may post edp and previous dpms_off
-				 * doesn't make sense.
-				 * Update dpms_off state to align hw and sw state via check
-				 * vBios scratch register.
-				 */
-				if (bios->funcs->is_active_display)	{
-					const struct connector_device_tag_info *device_tag = &(edp_link->device_tag);
-
-					if (bios->funcs->is_active_display(bios,
-							context->streams[i]->signal,
-							device_tag))
-						context->streams[i]->dpms_off = false;
-				}
 			}
 		}
 	}
 
-	if (!apply_edp_fast_boot_optimization) {
+	if (!apply_edp_fast_boot_optimization && !can_apply_seamless_boot) {
 		if (edp_link_to_turnoff) {
 			/*turn off backlight before DP_blank and encoder powered down*/
 			dc->hwss.edp_backlight_control(edp_link_to_turnoff, false);
@@ -1629,8 +1583,8 @@ static uint32_t compute_pstate_blackout_duration(
 	pstate_blackout_duration_ns = 1000 * blackout_duration.value >> 24;
 
 	total_dest_line_time_ns = 1000000UL *
-		stream->timing.h_total /
-		stream->timing.pix_clk_khz +
+		(stream->timing.h_total * 10) /
+		stream->timing.pix_clk_100hz +
 		pstate_blackout_duration_ns;
 
 	return total_dest_line_time_ns;
@@ -1818,18 +1772,15 @@ static bool should_enable_fbc(struct dc *dc,
 	if (i == dc->res_pool->pipe_count)
 		return false;
 
-	if (!pipe_ctx->stream->sink)
-		return false;
-
-	if (!pipe_ctx->stream->sink->link)
+	if (!pipe_ctx->stream->link)
 		return false;
 
 	/* Only supports eDP */
-	if (pipe_ctx->stream->sink->link->connector_signal != SIGNAL_TYPE_EDP)
+	if (pipe_ctx->stream->link->connector_signal != SIGNAL_TYPE_EDP)
 		return false;
 
 	/* PSR should not be enabled */
-	if (pipe_ctx->stream->sink->link->psr_enabled)
+	if (pipe_ctx->stream->link->psr_enabled)
 		return false;
 
 	/* Nothing to compress */
@@ -2334,6 +2285,11 @@ static void dce110_enable_per_frame_crtc_position_reset(
 
 }
 
+static void init_pipes(struct dc *dc, struct dc_state *context)
+{
+	// Do nothing
+}
+
 static void init_hw(struct dc *dc)
 {
 	int i;
@@ -2578,7 +2534,7 @@ static void dce110_apply_ctx_for_surface(
 				pipe_ctx->plane_res.mi,
 				pipe_ctx->stream->timing.h_total,
 				pipe_ctx->stream->timing.v_total,
-				pipe_ctx->stream->timing.pix_clk_khz,
+				pipe_ctx->stream->timing.pix_clk_100hz / 10,
 				context->stream_count);
 
 		dce110_program_front_end_for_pipe(dc, pipe_ctx);
@@ -2600,7 +2556,7 @@ static void dce110_apply_ctx_for_surface(
 	}
 
 	if (dc->fbc_compressor)
-		enable_fbc(dc, dc->current_state);
+		enable_fbc(dc, context);
 }
 
 static void dce110_power_down_fe(struct dc *dc, struct pipe_ctx *pipe_ctx)
@@ -2627,13 +2583,35 @@ static void dce110_wait_for_mpcc_disconnect(
 	/* do nothing*/
 }
 
+static void program_output_csc(struct dc *dc,
+		struct pipe_ctx *pipe_ctx,
+		enum dc_color_space colorspace,
+		uint16_t *matrix,
+		int opp_id)
+{
+	int i;
+	struct out_csc_color_matrix tbl_entry;
+
+	if (pipe_ctx->stream->csc_color_matrix.enable_adjustment == true) {
+		enum dc_color_space color_space = pipe_ctx->stream->output_color_space;
+
+		for (i = 0; i < 12; i++)
+			tbl_entry.regval[i] = pipe_ctx->stream->csc_color_matrix.matrix[i];
+
+		tbl_entry.color_space = color_space;
+
+		pipe_ctx->plane_res.xfm->funcs->opp_set_csc_adjustment(
+				pipe_ctx->plane_res.xfm, &tbl_entry);
+	}
+}
+
 void dce110_set_cursor_position(struct pipe_ctx *pipe_ctx)
 {
 	struct dc_cursor_position pos_cpy = pipe_ctx->stream->cursor_position;
 	struct input_pixel_processor *ipp = pipe_ctx->plane_res.ipp;
 	struct mem_input *mi = pipe_ctx->plane_res.mi;
 	struct dc_cursor_mi_param param = {
-		.pixel_clk_khz = pipe_ctx->stream->timing.pix_clk_khz,
+		.pixel_clk_khz = pipe_ctx->stream->timing.pix_clk_100hz / 10,
 		.ref_clk_khz = pipe_ctx->stream->ctx->dc->res_pool->ref_clock_inKhz,
 		.viewport = pipe_ctx->plane_res.scl_data.viewport,
 		.h_scale_ratio = pipe_ctx->plane_res.scl_data.ratios.horz,
@@ -2677,7 +2655,9 @@ void dce110_set_cursor_attribute(struct pipe_ctx *pipe_ctx)
 
 static const struct hw_sequencer_funcs dce110_funcs = {
 	.program_gamut_remap = program_gamut_remap,
+	.program_output_csc = program_output_csc,
 	.init_hw = init_hw,
+	.init_pipes = init_pipes,
 	.apply_ctx_to_hw = dce110_apply_ctx_to_hw,
 	.apply_ctx_for_surface = dce110_apply_ctx_for_surface,
 	.update_plane_addr = update_plane_addr,
@@ -2706,6 +2686,8 @@ static const struct hw_sequencer_funcs dce110_funcs = {
 	.set_static_screen_control = set_static_screen_control,
 	.reset_hw_ctx_wrap = dce110_reset_hw_ctx_wrap,
 	.enable_stream_timing = dce110_enable_stream_timing,
+	.disable_stream_gating = NULL,
+	.enable_stream_gating = NULL,
 	.setup_stereo = NULL,
 	.set_avmute = dce110_set_avmute,
 	.wait_for_mpcc_disconnect = dce110_wait_for_mpcc_disconnect,
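
With the 100 Hz units, the refresh rate in build_audio_output() above is pixel_clock_hz / (h_total * v_total), where pixel_clock_hz = pix_clk_100hz * 100. A worked example with an assumed CEA 1080p60 timing (not taken from this patch):

/* pix_clk_100hz = 1485000 (148.5 MHz), h_total = 2200, v_total = 1125:
 * refresh = 1485000 * 100 / (2200 * 1125)
 *         = 148500000 / 2475000
 *         = 60 Hz
 */
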
diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c b/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
index e33d11785b1f..7549adaa1542 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
@@ -84,6 +84,7 @@
 
 #ifndef mmBIOS_SCRATCH_2
 	#define mmBIOS_SCRATCH_2 0x05CB
+	#define mmBIOS_SCRATCH_3 0x05CC
 	#define mmBIOS_SCRATCH_6 0x05CF
 #endif
 
@@ -369,6 +370,7 @@ static const struct dce110_clk_src_mask cs_mask = {
 };
 
 static const struct bios_registers bios_regs = {
+	.BIOS_SCRATCH_3 = mmBIOS_SCRATCH_3,
 	.BIOS_SCRATCH_6 = mmBIOS_SCRATCH_6
 };
 
@@ -606,7 +608,7 @@ static struct output_pixel_processor *dce110_opp_create(
 	return &opp->base;
 }
 
-struct aux_engine *dce110_aux_engine_create(
+struct dce_aux *dce110_aux_engine_create(
 	struct dc_context *ctx,
 	uint32_t inst)
 {
@@ -779,8 +781,8 @@ static void get_pixel_clock_parameters(
 	 * the pixel clock normalization for hdmi up to here instead of doing it
 	 * in pll_adjust_pix_clk
 	 */
-	pixel_clk_params->requested_pix_clk = stream->timing.pix_clk_khz;
-	pixel_clk_params->encoder_object_id = stream->sink->link->link_enc->id;
+	pixel_clk_params->requested_pix_clk_100hz = stream->timing.pix_clk_100hz;
+	pixel_clk_params->encoder_object_id = stream->link->link_enc->id;
 	pixel_clk_params->signal_type = pipe_ctx->stream->signal;
 	pixel_clk_params->controller_id = pipe_ctx->stream_res.tg->inst + 1;
 	/* TODO: un-hardcode*/
@@ -797,10 +799,10 @@ static void get_pixel_clock_parameters(
 		pixel_clk_params->color_depth = COLOR_DEPTH_888;
 	}
 	if (stream->timing.pixel_encoding == PIXEL_ENCODING_YCBCR420) {
-		pixel_clk_params->requested_pix_clk  = pixel_clk_params->requested_pix_clk / 2;
+		pixel_clk_params->requested_pix_clk_100hz  = pixel_clk_params->requested_pix_clk_100hz / 2;
 	}
 	if (stream->timing.timing_3d_format == TIMING_3D_FORMAT_HW_FRAME_PACKING)
-		pixel_clk_params->requested_pix_clk *= 2;
+		pixel_clk_params->requested_pix_clk_100hz *= 2;
 
 }
 
@@ -874,7 +876,7 @@ static bool dce110_validate_bandwidth(
 			__func__,
 			context->streams[0]->timing.h_addressable,
 			context->streams[0]->timing.v_addressable,
-			context->streams[0]->timing.pix_clk_khz);
+			context->streams[0]->timing.pix_clk_100hz / 10);
 
 	if (memcmp(&dc->current_state->bw.dce,
 			&context->bw.dce, sizeof(context->bw.dce))) {
@@ -1055,7 +1057,7 @@ static struct pipe_ctx *dce110_acquire_underlay(
 		pipe_ctx->plane_res.mi->funcs->allocate_mem_input(pipe_ctx->plane_res.mi,
 				stream->timing.h_total,
 				stream->timing.v_total,
-				stream->timing.pix_clk_khz,
+				stream->timing.pix_clk_100hz / 10,
 				context->stream_count);
 
 		color_space_to_black_color(dc,
diff --git a/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c b/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
index 969d4e72dc94..ea3065d63372 100644
--- a/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
@@ -76,6 +76,7 @@
 
 #ifndef mmBIOS_SCRATCH_2
 	#define mmBIOS_SCRATCH_2 0x05CB
+	#define mmBIOS_SCRATCH_3 0x05CC
 	#define mmBIOS_SCRATCH_6 0x05CF
 #endif
 
@@ -376,6 +377,7 @@ static const struct dce110_clk_src_mask cs_mask = {
 };
 
 static const struct bios_registers bios_regs = {
+	.BIOS_SCRATCH_3 = mmBIOS_SCRATCH_3,
 	.BIOS_SCRATCH_6 = mmBIOS_SCRATCH_6
 };
 
@@ -607,7 +609,7 @@ struct output_pixel_processor *dce112_opp_create(
 	return &opp->base;
 }
 
-struct aux_engine *dce112_aux_engine_create(
+struct dce_aux *dce112_aux_engine_create(
 	struct dc_context *ctx,
 	uint32_t inst)
 {
@@ -763,7 +765,7 @@ static struct clock_source *find_matching_pll(
 		const struct resource_pool *pool,
 		const struct dc_stream_state *const stream)
 {
-	switch (stream->sink->link->link_enc->transmitter) {
+	switch (stream->link->link_enc->transmitter) {
 	case TRANSMITTER_UNIPHY_A:
 		return pool->clock_sources[DCE112_CLK_SRC_PLL0];
 	case TRANSMITTER_UNIPHY_B:
diff --git a/drivers/gpu/drm/amd/display/dc/dce120/dce120_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dce120/dce120_hw_sequencer.c
index eb0f5f9a973b..1ca30928025e 100644
--- a/drivers/gpu/drm/amd/display/dc/dce120/dce120_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce120/dce120_hw_sequencer.c
@@ -244,6 +244,21 @@ static void dce120_update_dchub(
 	dh_data->dchub_info_valid = false;
 }
 
+/**
+ * dce121_xgmi_enabled() - Check if xGMI is enabled
+ * @hws: DCE hardware sequencer object
+ *
+ * Return: true if xGMI is enabled, false otherwise.
+ */
+bool dce121_xgmi_enabled(struct dce_hwseq *hws)
+{
+	uint32_t pf_max_region;
+
+	REG_GET(MC_VM_XGMI_LFB_CNTL, PF_MAX_REGION, &pf_max_region);
+	/* PF_MAX_REGION == 0 means xgmi is disabled */
+	return !!pf_max_region;
+}
+
 void dce120_hw_sequencer_construct(struct dc *dc)
 {
 	/* All registers used by dce11.2 match those in dce11 in offset and
diff --git a/drivers/gpu/drm/amd/display/dc/dce120/dce120_hw_sequencer.h b/drivers/gpu/drm/amd/display/dc/dce120/dce120_hw_sequencer.h
index 77a6b86d7606..c51afbd0b012 100644
--- a/drivers/gpu/drm/amd/display/dc/dce120/dce120_hw_sequencer.h
+++ b/drivers/gpu/drm/amd/display/dc/dce120/dce120_hw_sequencer.h
@@ -30,6 +30,7 @@
 
 struct dc;
 
+bool dce121_xgmi_enabled(struct dce_hwseq *hws);
 void dce120_hw_sequencer_construct(struct dc *dc);
 
 #endif /* __DC_HWSS_DCE112_H__ */
diff --git a/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c b/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
index f12696674eb0..312a0aebf91f 100644
--- a/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce120/dce120_resource.c
@@ -62,6 +62,8 @@
 #include "soc15_hw_ip.h"
 #include "vega10_ip_offset.h"
 #include "nbio/nbio_6_1_offset.h"
+#include "mmhub/mmhub_9_4_0_offset.h"
+#include "mmhub/mmhub_9_4_0_sh_mask.h"
 #include "reg_helper.h"
 
 #include "dce100/dce100_resource.h"
@@ -139,6 +141,17 @@ static const struct dce110_timing_generator_offsets dce120_tg_offsets[] = {
 	.reg_name = BASE(mm ## block ## id ## _ ## reg_name ## _BASE_IDX) + \
 					mm ## block ## id ## _ ## reg_name
 
+/* MMHUB */
+#define MMHUB_BASE_INNER(seg) \
+	MMHUB_BASE__INST0_SEG ## seg
+
+#define MMHUB_BASE(seg) \
+	MMHUB_BASE_INNER(seg)
+
+#define MMHUB_SR(reg_name)\
+		.reg_name = MMHUB_BASE(mm ## reg_name ## _BASE_IDX) +  \
+					mm ## reg_name
+
 /* macros to expand register list macros defined in HW object header files
  * end *********************/
 
@@ -378,7 +391,7 @@ struct output_pixel_processor *dce120_opp_create(
 			     ctx, inst, &opp_regs[inst], &opp_shift, &opp_mask);
 	return &opp->base;
 }
-struct aux_engine *dce120_aux_engine_create(
+struct dce_aux *dce120_aux_engine_create(
 	struct dc_context *ctx,
 	uint32_t inst)
 {
@@ -429,6 +442,7 @@ struct dce_i2c_hw *dce120_i2c_hw_create(
 	return dce_i2c_hw;
 }
 static const struct bios_registers bios_regs = {
+	.BIOS_SCRATCH_3 = mmBIOS_SCRATCH_3 + NBIO_BASE(mmBIOS_SCRATCH_3_BASE_IDX),
 	.BIOS_SCRATCH_6 = mmBIOS_SCRATCH_6 + NBIO_BASE(mmBIOS_SCRATCH_6_BASE_IDX)
 };
 
@@ -681,6 +695,19 @@ static const struct dce_hwseq_mask hwseq_mask = {
 		HWSEQ_DCE12_MASK_SH_LIST(_MASK)
 };
 
+/* HWSEQ regs for VG20 */
+static const struct dce_hwseq_registers dce121_hwseq_reg = {
+		HWSEQ_VG20_REG_LIST()
+};
+
+static const struct dce_hwseq_shift dce121_hwseq_shift = {
+		HWSEQ_VG20_MASK_SH_LIST(__SHIFT)
+};
+
+static const struct dce_hwseq_mask dce121_hwseq_mask = {
+		HWSEQ_VG20_MASK_SH_LIST(_MASK)
+};
+
 static struct dce_hwseq *dce120_hwseq_create(
 	struct dc_context *ctx)
 {
@@ -695,6 +722,20 @@ static struct dce_hwseq *dce120_hwseq_create(
 	return hws;
 }
 
+static struct dce_hwseq *dce121_hwseq_create(
+	struct dc_context *ctx)
+{
+	struct dce_hwseq *hws = kzalloc(sizeof(struct dce_hwseq), GFP_KERNEL);
+
+	if (hws) {
+		hws->ctx = ctx;
+		hws->regs = &dce121_hwseq_reg;
+		hws->shifts = &dce121_hwseq_shift;
+		hws->masks = &dce121_hwseq_mask;
+	}
+	return hws;
+}
+
 static const struct resource_create_funcs res_create_funcs = {
 	.read_dce_straps = read_dce_straps,
 	.create_audio = create_audio,
@@ -702,6 +743,14 @@ static const struct resource_create_funcs res_create_funcs = {
 	.create_hwseq = dce120_hwseq_create,
 };
 
+static const struct resource_create_funcs dce121_res_create_funcs = {
+	.read_dce_straps = read_dce_straps,
+	.create_audio = create_audio,
+	.create_stream_encoder = dce120_stream_encoder_create,
+	.create_hwseq = dce121_hwseq_create,
+};
+
+
 #define mi_inst_regs(id) { MI_DCE12_REG_LIST(id) }
 static const struct dce_mem_input_registers mi_regs[] = {
 		mi_inst_regs(0),
@@ -911,7 +960,8 @@ static bool construct(
 	int j;
 	struct dc_context *ctx = dc->ctx;
 	struct irq_service_init_data irq_init_data;
-	bool harvest_enabled = ASICREV_IS_VEGA20_P(ctx->asic_id.hw_internal_rev);
+	static const struct resource_create_funcs *res_funcs;
+	bool is_vg20 = ASICREV_IS_VEGA20_P(ctx->asic_id.hw_internal_rev);
 	uint32_t pipe_fuses;
 
 	ctx->dc_bios->regs = &bios_regs;
@@ -975,7 +1025,11 @@ static bool construct(
 		}
 	}
 
-	pool->base.clk_mgr = dce120_clk_mgr_create(ctx);
+	if (is_vg20)
+		pool->base.clk_mgr = dce121_clk_mgr_create(ctx);
+	else
+		pool->base.clk_mgr = dce120_clk_mgr_create(ctx);
+
 	if (pool->base.clk_mgr == NULL) {
 		dm_error("DC: failed to create display clock!\n");
 		BREAK_TO_DEBUGGER();
@@ -1008,14 +1062,14 @@ static bool construct(
 	if (!pool->base.irqs)
 		goto irqs_create_fail;
 
-	/* retrieve valid pipe fuses */
-	if (harvest_enabled)
+	/* VG20: Pipe harvesting enabled, retrieve valid pipe fuses */
+	if (is_vg20)
 		pipe_fuses = read_pipe_fuses(ctx);
 
 	/* index to valid pipe resource */
 	j = 0;
 	for (i = 0; i < pool->base.pipe_count; i++) {
-		if (harvest_enabled) {
+		if (is_vg20) {
 			if ((pipe_fuses & (1 << i)) != 0) {
 				dm_error("DC: skip invalid pipe %d!\n", i);
 				continue;
@@ -1093,10 +1147,24 @@ static bool construct(
 	pool->base.pipe_count = j;
 	pool->base.timing_generator_count = j;
 
-	if (!resource_construct(num_virtual_links, dc, &pool->base,
-			 &res_create_funcs))
+	if (is_vg20)
+		res_funcs = &dce121_res_create_funcs;
+	else
+		res_funcs = &res_create_funcs;
+
+	if (!resource_construct(num_virtual_links, dc, &pool->base, res_funcs))
 		goto res_create_fail;
 
+	/*
+	 * This is a bit of a hack. The xGMI enabled info is used to determine
+	 * if audio and display clocks need to be adjusted with the WAFL link's
+ * SS info. This is a responsibility of the clk_mgr. But since MMHUB is
+	 * under hwseq, and the relevant register is in MMHUB, we have to do it
+	 * here.
+	 */
+	if (is_vg20 && dce121_xgmi_enabled(dc->hwseq))
+		dce121_clock_patch_xgmi_ss_info(pool->base.clk_mgr);
+
 	/* Create hardware sequencer */
 	if (!dce120_hw_sequencer_create(dc))
 		goto controller_create_fail;
diff --git a/drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c b/drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c
index 4e9ea50141bd..c109ace96be9 100644
--- a/drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce80/dce80_resource.c
@@ -77,6 +77,7 @@
 
 #ifndef mmBIOS_SCRATCH_2
 	#define mmBIOS_SCRATCH_2 0x05CB
+	#define mmBIOS_SCRATCH_3 0x05CC
 	#define mmBIOS_SCRATCH_6 0x05CF
 #endif
 
@@ -358,6 +359,7 @@ static const struct dce110_clk_src_mask cs_mask = {
 };
 
 static const struct bios_registers bios_regs = {
+	.BIOS_SCRATCH_3 = mmBIOS_SCRATCH_3,
 	.BIOS_SCRATCH_6 = mmBIOS_SCRATCH_6
 };
 
@@ -467,7 +469,7 @@ static struct output_pixel_processor *dce80_opp_create(
 	return &opp->base;
 }
 
-struct aux_engine *dce80_aux_engine_create(
+struct dce_aux *dce80_aux_engine_create(
 	struct dc_context *ctx,
 	uint32_t inst)
 {
diff --git a/drivers/gpu/drm/amd/display/dc/dce80/dce80_timing_generator.c b/drivers/gpu/drm/amd/display/dc/dce80/dce80_timing_generator.c
index 3ba4712a35ab..8b5ce557ee71 100644
--- a/drivers/gpu/drm/amd/display/dc/dce80/dce80_timing_generator.c
+++ b/drivers/gpu/drm/amd/display/dc/dce80/dce80_timing_generator.c
@@ -84,17 +84,17 @@ static const struct dce110_timing_generator_offsets reg_offsets[] = {
 #define DCP_REG(reg) (reg + tg110->offsets.dcp)
 #define DMIF_REG(reg) (reg + tg110->offsets.dmif)
 
-static void program_pix_dur(struct timing_generator *tg, uint32_t pix_clk_khz)
+static void program_pix_dur(struct timing_generator *tg, uint32_t pix_clk_100hz)
 {
 	uint64_t pix_dur;
 	uint32_t addr = mmDMIF_PG0_DPG_PIPE_ARBITRATION_CONTROL1
 					+ DCE110TG_FROM_TG(tg)->offsets.dmif;
 	uint32_t value = dm_read_reg(tg->ctx, addr);
 
-	if (pix_clk_khz == 0)
+	if (pix_clk_100hz == 0)
 		return;
 
-	pix_dur = 1000000000 / pix_clk_khz;
+	pix_dur = div_u64(10000000000ull, pix_clk_100hz);
 
 	set_reg_field_value(
 		value,
@@ -110,7 +110,7 @@ static void program_timing(struct timing_generator *tg,
 	bool use_vbios)
 {
 	if (!use_vbios)
-		program_pix_dur(tg, timing->pix_clk_khz);
+		program_pix_dur(tg, timing->pix_clk_100hz);
 
 	dce110_tg_program_timing(tg, timing, use_vbios);
 }
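
program_pix_dur() above is why the conversion needs div_u64(): the dividend 10^10 no longer fits in 32 bits, and a plain '/' on a u64 would pull in __udivdi3 on 32-bit kernels. A self-contained sketch of the same computation (function name illustrative):

#include <linux/math64.h>

static u64 pix_dur_from_100hz(u32 pix_clk_100hz)
{
	/* same scale as the old 1000000000 / pix_clk_khz:
	 * e.g. 148.5 MHz -> div_u64(10000000000ull, 1485000) = 6734
	 */
	return div_u64(10000000000ull, pix_clk_100hz);
}
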
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_clk_mgr.c
index 54abedbf1b43..afe8c42211cd 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_clk_mgr.c
@@ -161,69 +161,17 @@ static int get_active_display_cnt(
 	return display_count;
 }
 
-static void notify_deep_sleep_dcfclk_to_smu(
-		struct pp_smu_funcs_rv *pp_smu, int min_dcef_deep_sleep_clk_khz)
-{
-	int min_dcef_deep_sleep_clk_mhz; //minimum required DCEF Deep Sleep clock in mhz
-	/*
-	 * if function pointer not set up, this message is
-	 * sent as part of pplib_apply_display_requirements.
-	 * So just return.
-	 */
-	if (!pp_smu || !pp_smu->set_min_deep_sleep_dcfclk)
-		return;
-
-	min_dcef_deep_sleep_clk_mhz = (min_dcef_deep_sleep_clk_khz + 999) / 1000; //Round up
-	pp_smu->set_min_deep_sleep_dcfclk(&pp_smu->pp_smu, min_dcef_deep_sleep_clk_mhz);
-}
-
-static void notify_hard_min_dcfclk_to_smu(
-		struct pp_smu_funcs_rv *pp_smu, int min_dcf_clk_khz)
-{
-	int min_dcf_clk_mhz; //minimum required DCF clock in mhz
-
-	/*
-	 * if function pointer not set up, this message is
-	 * sent as part of pplib_apply_display_requirements.
-	 * So just return.
-	 */
-	if (!pp_smu || !pp_smu->set_hard_min_dcfclk_by_freq)
-		return;
-
-	min_dcf_clk_mhz = min_dcf_clk_khz / 1000;
-
-	pp_smu->set_hard_min_dcfclk_by_freq(&pp_smu->pp_smu, min_dcf_clk_mhz);
-}
-
-static void notify_hard_min_fclk_to_smu(
-		struct pp_smu_funcs_rv *pp_smu, int min_f_clk_khz)
-{
-	int min_f_clk_mhz; //minimum required F clock in mhz
-
-	/*
-	 * if function pointer not set up, this message is
-	 * sent as part of pplib_apply_display_requirements.
-	 * So just return.
-	 */
-	if (!pp_smu || !pp_smu->set_hard_min_fclk_by_freq)
-		return;
-
-	min_f_clk_mhz = min_f_clk_khz / 1000;
-
-	pp_smu->set_hard_min_fclk_by_freq(&pp_smu->pp_smu, min_f_clk_mhz);
-}
-
 static void dcn1_update_clocks(struct clk_mgr *clk_mgr,
 			struct dc_state *context,
 			bool safe_to_lower)
 {
 	struct dc *dc = clk_mgr->ctx->dc;
+	struct dc_debug_options *debug = &dc->debug;
 	struct dc_clocks *new_clocks = &context->bw.dcn.clk;
 	struct pp_smu_display_requirement_rv *smu_req_cur =
 			&dc->res_pool->pp_smu_req;
 	struct pp_smu_display_requirement_rv smu_req = *smu_req_cur;
 	struct pp_smu_funcs_rv *pp_smu = dc->res_pool->pp_smu;
-	uint32_t requested_dcf_clock_in_khz = 0;
 	bool send_request_to_increase = false;
 	bool send_request_to_lower = false;
 	int display_count;
@@ -243,9 +191,8 @@ static void dcn1_update_clocks(struct clk_mgr *clk_mgr,
 		 */
 		if (pp_smu->set_display_count)
 			pp_smu->set_display_count(&pp_smu->pp_smu, display_count);
-		else
-			smu_req.display_count = display_count;
 
+		smu_req.display_count = display_count;
 	}
 
 	if (new_clocks->dispclk_khz > clk_mgr->clks.dispclk_khz
@@ -261,12 +208,13 @@ static void dcn1_update_clocks(struct clk_mgr *clk_mgr,
 	}
 
 	// F Clock
+	if (debug->force_fclk_khz != 0)
+		new_clocks->fclk_khz = debug->force_fclk_khz;
+
 	if (should_set_clock(safe_to_lower, new_clocks->fclk_khz, clk_mgr->clks.fclk_khz)) {
 		clk_mgr->clks.fclk_khz = new_clocks->fclk_khz;
 		smu_req.hard_min_fclk_mhz = new_clocks->fclk_khz / 1000;
 
-		notify_hard_min_fclk_to_smu(pp_smu, new_clocks->fclk_khz);
-
 		send_request_to_lower = true;
 	}
 
@@ -281,7 +229,7 @@ static void dcn1_update_clocks(struct clk_mgr *clk_mgr,
 	if (should_set_clock(safe_to_lower,
 			new_clocks->dcfclk_deep_sleep_khz, clk_mgr->clks.dcfclk_deep_sleep_khz)) {
 		clk_mgr->clks.dcfclk_deep_sleep_khz = new_clocks->dcfclk_deep_sleep_khz;
-		smu_req.min_deep_sleep_dcefclk_mhz = new_clocks->dcfclk_deep_sleep_khz / 1000;
+		smu_req.min_deep_sleep_dcefclk_mhz = (new_clocks->dcfclk_deep_sleep_khz + 999) / 1000;
 
 		send_request_to_lower = true;
 	}
@@ -291,15 +239,18 @@ static void dcn1_update_clocks(struct clk_mgr *clk_mgr,
 	 */
 	if (send_request_to_increase) {
 		/*use dcfclk to request voltage*/
-		requested_dcf_clock_in_khz = dcn_find_dcfclk_suits_all(dc, new_clocks);
-
-		notify_hard_min_dcfclk_to_smu(pp_smu, requested_dcf_clock_in_khz);
-
-		if (pp_smu->set_display_requirement)
-			pp_smu->set_display_requirement(&pp_smu->pp_smu, &smu_req);
-
-		notify_deep_sleep_dcfclk_to_smu(pp_smu, clk_mgr->clks.dcfclk_deep_sleep_khz);
-		dcn1_pplib_apply_display_requirements(dc, context);
+		if (pp_smu->set_hard_min_fclk_by_freq &&
+				pp_smu->set_hard_min_dcfclk_by_freq &&
+				pp_smu->set_min_deep_sleep_dcfclk) {
+
+			pp_smu->set_hard_min_fclk_by_freq(&pp_smu->pp_smu, smu_req.hard_min_fclk_mhz);
+			pp_smu->set_hard_min_dcfclk_by_freq(&pp_smu->pp_smu, smu_req.hard_min_dcefclk_mhz);
+			pp_smu->set_min_deep_sleep_dcfclk(&pp_smu->pp_smu, smu_req.min_deep_sleep_dcefclk_mhz);
+		} else {
+			if (pp_smu->set_display_requirement)
+				pp_smu->set_display_requirement(&pp_smu->pp_smu, &smu_req);
+			dcn1_pplib_apply_display_requirements(dc, context);
+		}
 	}
 
 	/* dcn1 dppclk is tied to dispclk */
@@ -314,18 +265,20 @@ static void dcn1_update_clocks(struct clk_mgr *clk_mgr,
 
 	if (!send_request_to_increase && send_request_to_lower) {
 		/*use dcfclk to request voltage*/
-		requested_dcf_clock_in_khz = dcn_find_dcfclk_suits_all(dc, new_clocks);
-
-		notify_hard_min_dcfclk_to_smu(pp_smu, requested_dcf_clock_in_khz);
-
-		if (pp_smu->set_display_requirement)
-			pp_smu->set_display_requirement(&pp_smu->pp_smu, &smu_req);
-
-		notify_deep_sleep_dcfclk_to_smu(pp_smu, clk_mgr->clks.dcfclk_deep_sleep_khz);
-		dcn1_pplib_apply_display_requirements(dc, context);
+		if (pp_smu->set_hard_min_fclk_by_freq &&
+				pp_smu->set_hard_min_dcfclk_by_freq &&
+				pp_smu->set_min_deep_sleep_dcfclk) {
+
+			pp_smu->set_hard_min_fclk_by_freq(&pp_smu->pp_smu, smu_req.hard_min_fclk_mhz);
+			pp_smu->set_hard_min_dcfclk_by_freq(&pp_smu->pp_smu, smu_req.hard_min_dcefclk_mhz);
+			pp_smu->set_min_deep_sleep_dcfclk(&pp_smu->pp_smu, smu_req.min_deep_sleep_dcefclk_mhz);
+		} else {
+			if (pp_smu->set_display_requirement)
+				pp_smu->set_display_requirement(&pp_smu->pp_smu, &smu_req);
+			dcn1_pplib_apply_display_requirements(dc, context);
+		}
 	}
 
-
 	*smu_req_cur = smu_req;
 }
 static const struct clk_mgr_funcs dcn1_funcs = {
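
Note the subtle fix above: min_deep_sleep_dcefclk_mhz now rounds up instead of truncating, so the SMU is never asked for a deep-sleep DCEF clock below the computed minimum. The idiom in isolation (helper name illustrative):

static inline int khz_to_mhz_round_up(int khz)
{
	/* truncation could under-request, e.g. 300500 kHz -> 300 MHz;
	 * rounding up gives (300500 + 999) / 1000 = 301 MHz
	 */
	return (khz + 999) / 1000;
}
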
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_cm.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_cm.c
index 116977eb24e2..41f0f4c912e7 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_cm.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_cm.c
@@ -51,10 +51,6 @@
 
 #define NUM_ELEMENTS(a) (sizeof(a) / sizeof((a)[0]))
 
-struct dcn10_input_csc_matrix {
-	enum dc_color_space color_space;
-	uint16_t regval[12];
-};
 
 enum dcn10_coef_filter_type_sel {
 	SCL_COEF_LUMA_VERT_FILTER = 0,
@@ -99,7 +95,7 @@ enum gamut_remap_select {
 	GAMUT_REMAP_COMB_COEFF
 };
 
-static const struct dcn10_input_csc_matrix dcn10_input_csc_matrix[] = {
+static const struct dpp_input_csc_matrix dpp_input_csc_matrix[] = {
 	{COLOR_SPACE_SRGB,
 		{0x2000, 0, 0, 0, 0, 0x2000, 0, 0, 0, 0, 0x2000, 0} },
 	{COLOR_SPACE_SRGB_LIMITED,
@@ -454,7 +450,7 @@ void dpp1_program_input_csc(
 {
 	struct dcn10_dpp *dpp = TO_DCN10_DPP(dpp_base);
 	int i;
-	int arr_size = sizeof(dcn10_input_csc_matrix)/sizeof(struct dcn10_input_csc_matrix);
+	int arr_size = sizeof(dpp_input_csc_matrix)/sizeof(struct dpp_input_csc_matrix);
 	const uint16_t *regval = NULL;
 	uint32_t cur_select = 0;
 	enum dcn10_input_csc_select select;
@@ -467,8 +463,8 @@ void dpp1_program_input_csc(
 
 	if (tbl_entry == NULL) {
 		for (i = 0; i < arr_size; i++)
-			if (dcn10_input_csc_matrix[i].color_space == color_space) {
-				regval = dcn10_input_csc_matrix[i].regval;
+			if (dpp_input_csc_matrix[i].color_space == color_space) {
+				regval = dpp_input_csc_matrix[i].regval;
 				break;
 			}
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_dscl.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_dscl.c
index 4a863a5dab41..c7642e748297 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_dscl.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_dscl.c
@@ -597,11 +597,13 @@ static void dpp1_dscl_set_manual_ratio_init(
 		SCL_V_INIT_FRAC, init_frac,
 		SCL_V_INIT_INT, init_int);
 
-	init_frac = dc_fixpt_u0d19(data->inits.v_bot) << 5;
-	init_int = dc_fixpt_floor(data->inits.v_bot);
-	REG_SET_2(SCL_VERT_FILTER_INIT_BOT, 0,
-		SCL_V_INIT_FRAC_BOT, init_frac,
-		SCL_V_INIT_INT_BOT, init_int);
+	if (REG(SCL_VERT_FILTER_INIT_BOT)) {
+		init_frac = dc_fixpt_u0d19(data->inits.v_bot) << 5;
+		init_int = dc_fixpt_floor(data->inits.v_bot);
+		REG_SET_2(SCL_VERT_FILTER_INIT_BOT, 0,
+			SCL_V_INIT_FRAC_BOT, init_frac,
+			SCL_V_INIT_INT_BOT, init_int);
+	}
 
 	init_frac = dc_fixpt_u0d19(data->inits.v_c) << 5;
 	init_int = dc_fixpt_floor(data->inits.v_c);
@@ -609,11 +611,13 @@ static void dpp1_dscl_set_manual_ratio_init(
 		SCL_V_INIT_FRAC_C, init_frac,
 		SCL_V_INIT_INT_C, init_int);
 
-	init_frac = dc_fixpt_u0d19(data->inits.v_c_bot) << 5;
-	init_int = dc_fixpt_floor(data->inits.v_c_bot);
-	REG_SET_2(SCL_VERT_FILTER_INIT_BOT_C, 0,
-		SCL_V_INIT_FRAC_BOT_C, init_frac,
-		SCL_V_INIT_INT_BOT_C, init_int);
+	if (REG(SCL_VERT_FILTER_INIT_BOT_C)) {
+		init_frac = dc_fixpt_u0d19(data->inits.v_c_bot) << 5;
+		init_int = dc_fixpt_floor(data->inits.v_c_bot);
+		REG_SET_2(SCL_VERT_FILTER_INIT_BOT_C, 0,
+			SCL_V_INIT_FRAC_BOT_C, init_frac,
+			SCL_V_INIT_INT_BOT_C, init_int);
+	}
 }
 
 
@@ -688,15 +692,17 @@ void dpp1_dscl_set_scaler_manual_scale(
 		return;
 
 	/* Black offsets */
-	if (ycbcr)
-		REG_SET_2(SCL_BLACK_OFFSET, 0,
-				SCL_BLACK_OFFSET_RGB_Y, BLACK_OFFSET_RGB_Y,
-				SCL_BLACK_OFFSET_CBCR, BLACK_OFFSET_CBCR);
-	else
+	if (REG(SCL_BLACK_OFFSET)) {
+		if (ycbcr)
+			REG_SET_2(SCL_BLACK_OFFSET, 0,
+					SCL_BLACK_OFFSET_RGB_Y, BLACK_OFFSET_RGB_Y,
+					SCL_BLACK_OFFSET_CBCR, BLACK_OFFSET_CBCR);
+		else
 
-		REG_SET_2(SCL_BLACK_OFFSET, 0,
-				SCL_BLACK_OFFSET_RGB_Y, BLACK_OFFSET_RGB_Y,
-				SCL_BLACK_OFFSET_CBCR, BLACK_OFFSET_RGB_Y);
+			REG_SET_2(SCL_BLACK_OFFSET, 0,
+					SCL_BLACK_OFFSET_RGB_Y, BLACK_OFFSET_RGB_Y,
+					SCL_BLACK_OFFSET_CBCR, BLACK_OFFSET_RGB_Y);
+	}
 
 	/* Manually calculate scale ratio and init values */
 	dpp1_dscl_set_manual_ratio_init(dpp, scl_data);
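
The SCL_VERT_FILTER_INIT_BOT* and SCL_BLACK_OFFSET writes are now wrapped in REG(...) checks, DC's convention for registers that exist only on some ASICs: the per-ASIC offset table holds 0 for an absent register, so REG(NAME) doubles as a presence test. A hedged sketch of the shape of this pattern (struct and names illustrative, not the real macro expansion):

#include <linux/types.h>

struct dscl_regs_sketch {
	u32 SCL_BLACK_OFFSET; /* 0 when the register is absent */
};

static void program_black_offset(const struct dscl_regs_sketch *regs)
{
	if (!regs->SCL_BLACK_OFFSET)
		return; /* register not present on this ASIC; skip */
	/* ... the REG_SET_2(SCL_BLACK_OFFSET, ...) writes go here ... */
}
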
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c
index c7d1e678ebf5..e161ad836812 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c
@@ -29,19 +29,20 @@
 #include "reg_helper.h"
 
 #define CTX \
-	hubbub->ctx
+	hubbub1->base.ctx
 #define DC_LOGGER \
-	hubbub->ctx->logger
+	hubbub1->base.ctx->logger
 #define REG(reg)\
-	hubbub->regs->reg
+	hubbub1->regs->reg
 
 #undef FN
 #define FN(reg_name, field_name) \
-	hubbub->shifts->field_name, hubbub->masks->field_name
+	hubbub1->shifts->field_name, hubbub1->masks->field_name
 
 void hubbub1_wm_read_state(struct hubbub *hubbub,
 		struct dcn_hubbub_wm *wm)
 {
+	struct dcn10_hubbub *hubbub1 = TO_DCN10_HUBBUB(hubbub);
 	struct dcn_hubbub_wm_set *s;
 
 	memset(wm, 0, sizeof(struct dcn_hubbub_wm));
@@ -87,14 +88,23 @@ void hubbub1_wm_read_state(struct hubbub *hubbub,
 	s->dram_clk_chanage = REG_READ(DCHUBBUB_ARB_ALLOW_DRAM_CLK_CHANGE_WATERMARK_D);
 }
 
-void hubbub1_disable_allow_self_refresh(struct hubbub *hubbub)
+void hubbub1_allow_self_refresh_control(struct hubbub *hubbub, bool allow)
 {
-	REG_UPDATE(DCHUBBUB_ARB_DRAM_STATE_CNTL,
-			DCHUBBUB_ARB_ALLOW_SELF_REFRESH_FORCE_ENABLE, 0);
+	struct dcn10_hubbub *hubbub1 = TO_DCN10_HUBBUB(hubbub);
+
+	/*
+	 * DCHUBBUB_ARB_ALLOW_SELF_REFRESH_FORCE_ENABLE = 1 means do not allow stutter
+	 * DCHUBBUB_ARB_ALLOW_SELF_REFRESH_FORCE_ENABLE = 0 means allow stutter
+	 */
+
+	REG_UPDATE_2(DCHUBBUB_ARB_DRAM_STATE_CNTL,
+			DCHUBBUB_ARB_ALLOW_SELF_REFRESH_FORCE_VALUE, 0,
+			DCHUBBUB_ARB_ALLOW_SELF_REFRESH_FORCE_ENABLE, !allow);
 }
 
 bool hububu1_is_allow_self_refresh_enabled(struct hubbub *hubbub)
 {
+	struct dcn10_hubbub *hubbub1 = TO_DCN10_HUBBUB(hubbub);
 	uint32_t enable = 0;
 
 	REG_GET(DCHUBBUB_ARB_DRAM_STATE_CNTL,
@@ -107,6 +117,8 @@ bool hububu1_is_allow_self_refresh_enabled(struct hubbub *hubbub)
 bool hubbub1_verify_allow_pstate_change_high(
 	struct hubbub *hubbub)
 {
+	struct dcn10_hubbub *hubbub1 = TO_DCN10_HUBBUB(hubbub);
+
 	/* pstate latency is ~20us so if we wait over 40us and pstate allow
 	 * still not asserted, we are probably stuck and going to hang
 	 *
@@ -193,7 +205,7 @@ bool hubbub1_verify_allow_pstate_change_high(
 	 * 31:    SOC pstate change request
 	 */
 
-	REG_WRITE(DCHUBBUB_TEST_DEBUG_INDEX, hubbub->debug_test_index_pstate);
+	REG_WRITE(DCHUBBUB_TEST_DEBUG_INDEX, hubbub1->debug_test_index_pstate);
 
 	for (i = 0; i < pstate_wait_timeout_us; i++) {
 		debug_data = REG_READ(DCHUBBUB_TEST_DEBUG_DATA);
@@ -244,6 +256,8 @@ static uint32_t convert_and_clamp(
 
 void hubbub1_wm_change_req_wa(struct hubbub *hubbub)
 {
+	struct dcn10_hubbub *hubbub1 = TO_DCN10_HUBBUB(hubbub);
+
 	REG_UPDATE_SEQ(DCHUBBUB_ARB_WATERMARK_CHANGE_CNTL,
 			DCHUBBUB_ARB_WATERMARK_CHANGE_REQUEST, 0, 1);
 }
@@ -254,7 +268,7 @@ void hubbub1_program_watermarks(
 		unsigned int refclk_mhz,
 		bool safe_to_lower)
 {
-	uint32_t force_en = hubbub->ctx->dc->debug.disable_stutter ? 1 : 0;
+	struct dcn10_hubbub *hubbub1 = TO_DCN10_HUBBUB(hubbub);
 	/*
 	 * Need to clamp to max of the register values (i.e. no wrap)
 	 * for dcn1, all wm registers are 21-bit wide
@@ -264,8 +278,8 @@ void hubbub1_program_watermarks(
 
 	/* Repeat for water mark set A, B, C and D. */
 	/* clock state A */
-	if (safe_to_lower || watermarks->a.urgent_ns > hubbub->watermarks.a.urgent_ns) {
-		hubbub->watermarks.a.urgent_ns = watermarks->a.urgent_ns;
+	if (safe_to_lower || watermarks->a.urgent_ns > hubbub1->watermarks.a.urgent_ns) {
+		hubbub1->watermarks.a.urgent_ns = watermarks->a.urgent_ns;
 		prog_wm_value = convert_and_clamp(watermarks->a.urgent_ns,
 				refclk_mhz, 0x1fffff);
 		REG_WRITE(DCHUBBUB_ARB_DATA_URGENCY_WATERMARK_A, prog_wm_value);
@@ -275,20 +289,22 @@ void hubbub1_program_watermarks(
 			watermarks->a.urgent_ns, prog_wm_value);
 	}
 
-	if (safe_to_lower || watermarks->a.pte_meta_urgent_ns > hubbub->watermarks.a.pte_meta_urgent_ns) {
-		hubbub->watermarks.a.pte_meta_urgent_ns = watermarks->a.pte_meta_urgent_ns;
-		prog_wm_value = convert_and_clamp(watermarks->a.pte_meta_urgent_ns,
-				refclk_mhz, 0x1fffff);
-		REG_WRITE(DCHUBBUB_ARB_PTE_META_URGENCY_WATERMARK_A, prog_wm_value);
-		DC_LOG_BANDWIDTH_CALCS("PTE_META_URGENCY_WATERMARK_A calculated =%d\n"
-			"HW register value = 0x%x\n",
-			watermarks->a.pte_meta_urgent_ns, prog_wm_value);
+	if (REG(DCHUBBUB_ARB_PTE_META_URGENCY_WATERMARK_A)) {
+		if (safe_to_lower || watermarks->a.pte_meta_urgent_ns > hubbub1->watermarks.a.pte_meta_urgent_ns) {
+			hubbub1->watermarks.a.pte_meta_urgent_ns = watermarks->a.pte_meta_urgent_ns;
+			prog_wm_value = convert_and_clamp(watermarks->a.pte_meta_urgent_ns,
+					refclk_mhz, 0x1fffff);
+			REG_WRITE(DCHUBBUB_ARB_PTE_META_URGENCY_WATERMARK_A, prog_wm_value);
+			DC_LOG_BANDWIDTH_CALCS("PTE_META_URGENCY_WATERMARK_A calculated =%d\n"
+				"HW register value = 0x%x\n",
+				watermarks->a.pte_meta_urgent_ns, prog_wm_value);
+		}
 	}
 
 	if (REG(DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_A)) {
 		if (safe_to_lower || watermarks->a.cstate_pstate.cstate_enter_plus_exit_ns
-				> hubbub->watermarks.a.cstate_pstate.cstate_enter_plus_exit_ns) {
-			hubbub->watermarks.a.cstate_pstate.cstate_enter_plus_exit_ns =
+				> hubbub1->watermarks.a.cstate_pstate.cstate_enter_plus_exit_ns) {
+			hubbub1->watermarks.a.cstate_pstate.cstate_enter_plus_exit_ns =
 					watermarks->a.cstate_pstate.cstate_enter_plus_exit_ns;
 			prog_wm_value = convert_and_clamp(
 					watermarks->a.cstate_pstate.cstate_enter_plus_exit_ns,
@@ -300,8 +316,8 @@ void hubbub1_program_watermarks(
 		}
 
 		if (safe_to_lower || watermarks->a.cstate_pstate.cstate_exit_ns
-				> hubbub->watermarks.a.cstate_pstate.cstate_exit_ns) {
-			hubbub->watermarks.a.cstate_pstate.cstate_exit_ns =
+				> hubbub1->watermarks.a.cstate_pstate.cstate_exit_ns) {
+			hubbub1->watermarks.a.cstate_pstate.cstate_exit_ns =
 					watermarks->a.cstate_pstate.cstate_exit_ns;
 			prog_wm_value = convert_and_clamp(
 					watermarks->a.cstate_pstate.cstate_exit_ns,
@@ -314,8 +330,8 @@ void hubbub1_program_watermarks(
 	}
 
 	if (safe_to_lower || watermarks->a.cstate_pstate.pstate_change_ns
-			> hubbub->watermarks.a.cstate_pstate.pstate_change_ns) {
-		hubbub->watermarks.a.cstate_pstate.pstate_change_ns =
+			> hubbub1->watermarks.a.cstate_pstate.pstate_change_ns) {
+		hubbub1->watermarks.a.cstate_pstate.pstate_change_ns =
 				watermarks->a.cstate_pstate.pstate_change_ns;
 		prog_wm_value = convert_and_clamp(
 				watermarks->a.cstate_pstate.pstate_change_ns,
@@ -327,8 +343,8 @@ void hubbub1_program_watermarks(
 	}
 
 	/* clock state B */
-	if (safe_to_lower || watermarks->b.urgent_ns > hubbub->watermarks.b.urgent_ns) {
-		hubbub->watermarks.b.urgent_ns = watermarks->b.urgent_ns;
+	if (safe_to_lower || watermarks->b.urgent_ns > hubbub1->watermarks.b.urgent_ns) {
+		hubbub1->watermarks.b.urgent_ns = watermarks->b.urgent_ns;
 		prog_wm_value = convert_and_clamp(watermarks->b.urgent_ns,
 				refclk_mhz, 0x1fffff);
 		REG_WRITE(DCHUBBUB_ARB_DATA_URGENCY_WATERMARK_B, prog_wm_value);
@@ -338,20 +354,22 @@ void hubbub1_program_watermarks(
 			watermarks->b.urgent_ns, prog_wm_value);
 	}
 
-	if (safe_to_lower || watermarks->b.pte_meta_urgent_ns > hubbub->watermarks.b.pte_meta_urgent_ns) {
-		hubbub->watermarks.b.pte_meta_urgent_ns = watermarks->b.pte_meta_urgent_ns;
-		prog_wm_value = convert_and_clamp(watermarks->b.pte_meta_urgent_ns,
-				refclk_mhz, 0x1fffff);
-		REG_WRITE(DCHUBBUB_ARB_PTE_META_URGENCY_WATERMARK_B, prog_wm_value);
-		DC_LOG_BANDWIDTH_CALCS("PTE_META_URGENCY_WATERMARK_B calculated =%d\n"
-			"HW register value = 0x%x\n",
-			watermarks->b.pte_meta_urgent_ns, prog_wm_value);
+	if (REG(DCHUBBUB_ARB_PTE_META_URGENCY_WATERMARK_B)) {
+		if (safe_to_lower || watermarks->b.pte_meta_urgent_ns > hubbub1->watermarks.b.pte_meta_urgent_ns) {
+			hubbub1->watermarks.b.pte_meta_urgent_ns = watermarks->b.pte_meta_urgent_ns;
+			prog_wm_value = convert_and_clamp(watermarks->b.pte_meta_urgent_ns,
+					refclk_mhz, 0x1fffff);
+			REG_WRITE(DCHUBBUB_ARB_PTE_META_URGENCY_WATERMARK_B, prog_wm_value);
+			DC_LOG_BANDWIDTH_CALCS("PTE_META_URGENCY_WATERMARK_B calculated =%d\n"
+				"HW register value = 0x%x\n",
+				watermarks->b.pte_meta_urgent_ns, prog_wm_value);
+		}
 	}
 
 	if (REG(DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_B)) {
 		if (safe_to_lower || watermarks->b.cstate_pstate.cstate_enter_plus_exit_ns
-				> hubbub->watermarks.b.cstate_pstate.cstate_enter_plus_exit_ns) {
-			hubbub->watermarks.b.cstate_pstate.cstate_enter_plus_exit_ns =
+				> hubbub1->watermarks.b.cstate_pstate.cstate_enter_plus_exit_ns) {
+			hubbub1->watermarks.b.cstate_pstate.cstate_enter_plus_exit_ns =
 					watermarks->b.cstate_pstate.cstate_enter_plus_exit_ns;
 			prog_wm_value = convert_and_clamp(
 					watermarks->b.cstate_pstate.cstate_enter_plus_exit_ns,
@@ -363,8 +381,8 @@ void hubbub1_program_watermarks(
 		}
 
 		if (safe_to_lower || watermarks->b.cstate_pstate.cstate_exit_ns
-				> hubbub->watermarks.b.cstate_pstate.cstate_exit_ns) {
-			hubbub->watermarks.b.cstate_pstate.cstate_exit_ns =
+				> hubbub1->watermarks.b.cstate_pstate.cstate_exit_ns) {
+			hubbub1->watermarks.b.cstate_pstate.cstate_exit_ns =
 					watermarks->b.cstate_pstate.cstate_exit_ns;
 			prog_wm_value = convert_and_clamp(
 					watermarks->b.cstate_pstate.cstate_exit_ns,
@@ -377,8 +395,8 @@ void hubbub1_program_watermarks(
 	}
 
 	if (safe_to_lower || watermarks->b.cstate_pstate.pstate_change_ns
-			> hubbub->watermarks.b.cstate_pstate.pstate_change_ns) {
-		hubbub->watermarks.b.cstate_pstate.pstate_change_ns =
+			> hubbub1->watermarks.b.cstate_pstate.pstate_change_ns) {
+		hubbub1->watermarks.b.cstate_pstate.pstate_change_ns =
 				watermarks->b.cstate_pstate.pstate_change_ns;
 		prog_wm_value = convert_and_clamp(
 				watermarks->b.cstate_pstate.pstate_change_ns,
@@ -390,8 +408,8 @@ void hubbub1_program_watermarks(
 	}
 
 	/* clock state C */
-	if (safe_to_lower || watermarks->c.urgent_ns > hubbub->watermarks.c.urgent_ns) {
-		hubbub->watermarks.c.urgent_ns = watermarks->c.urgent_ns;
+	if (safe_to_lower || watermarks->c.urgent_ns > hubbub1->watermarks.c.urgent_ns) {
+		hubbub1->watermarks.c.urgent_ns = watermarks->c.urgent_ns;
 		prog_wm_value = convert_and_clamp(watermarks->c.urgent_ns,
 				refclk_mhz, 0x1fffff);
 		REG_WRITE(DCHUBBUB_ARB_DATA_URGENCY_WATERMARK_C, prog_wm_value);
@@ -401,20 +419,22 @@ void hubbub1_program_watermarks(
 			watermarks->c.urgent_ns, prog_wm_value);
 	}
 
-	if (safe_to_lower || watermarks->c.pte_meta_urgent_ns > hubbub->watermarks.c.pte_meta_urgent_ns) {
-		hubbub->watermarks.c.pte_meta_urgent_ns = watermarks->c.pte_meta_urgent_ns;
-		prog_wm_value = convert_and_clamp(watermarks->c.pte_meta_urgent_ns,
-				refclk_mhz, 0x1fffff);
-		REG_WRITE(DCHUBBUB_ARB_PTE_META_URGENCY_WATERMARK_C, prog_wm_value);
-		DC_LOG_BANDWIDTH_CALCS("PTE_META_URGENCY_WATERMARK_C calculated =%d\n"
-			"HW register value = 0x%x\n",
-			watermarks->c.pte_meta_urgent_ns, prog_wm_value);
+	if (REG(DCHUBBUB_ARB_PTE_META_URGENCY_WATERMARK_C)) {
+		if (safe_to_lower || watermarks->c.pte_meta_urgent_ns > hubbub1->watermarks.c.pte_meta_urgent_ns) {
+			hubbub1->watermarks.c.pte_meta_urgent_ns = watermarks->c.pte_meta_urgent_ns;
+			prog_wm_value = convert_and_clamp(watermarks->c.pte_meta_urgent_ns,
+					refclk_mhz, 0x1fffff);
+			REG_WRITE(DCHUBBUB_ARB_PTE_META_URGENCY_WATERMARK_C, prog_wm_value);
+			DC_LOG_BANDWIDTH_CALCS("PTE_META_URGENCY_WATERMARK_C calculated =%d\n"
+				"HW register value = 0x%x\n",
+				watermarks->c.pte_meta_urgent_ns, prog_wm_value);
+		}
 	}
 
 	if (REG(DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_C)) {
 		if (safe_to_lower || watermarks->c.cstate_pstate.cstate_enter_plus_exit_ns
-				> hubbub->watermarks.c.cstate_pstate.cstate_enter_plus_exit_ns) {
-			hubbub->watermarks.c.cstate_pstate.cstate_enter_plus_exit_ns =
+				> hubbub1->watermarks.c.cstate_pstate.cstate_enter_plus_exit_ns) {
+			hubbub1->watermarks.c.cstate_pstate.cstate_enter_plus_exit_ns =
 					watermarks->c.cstate_pstate.cstate_enter_plus_exit_ns;
 			prog_wm_value = convert_and_clamp(
 					watermarks->c.cstate_pstate.cstate_enter_plus_exit_ns,
@@ -426,8 +446,8 @@ void hubbub1_program_watermarks(
 		}
 
 		if (safe_to_lower || watermarks->c.cstate_pstate.cstate_exit_ns
-				> hubbub->watermarks.c.cstate_pstate.cstate_exit_ns) {
-			hubbub->watermarks.c.cstate_pstate.cstate_exit_ns =
+				> hubbub1->watermarks.c.cstate_pstate.cstate_exit_ns) {
+			hubbub1->watermarks.c.cstate_pstate.cstate_exit_ns =
 					watermarks->c.cstate_pstate.cstate_exit_ns;
 			prog_wm_value = convert_and_clamp(
 					watermarks->c.cstate_pstate.cstate_exit_ns,
@@ -440,8 +460,8 @@ void hubbub1_program_watermarks(
 	}
 
 	if (safe_to_lower || watermarks->c.cstate_pstate.pstate_change_ns
-			> hubbub->watermarks.c.cstate_pstate.pstate_change_ns) {
-		hubbub->watermarks.c.cstate_pstate.pstate_change_ns =
+			> hubbub1->watermarks.c.cstate_pstate.pstate_change_ns) {
+		hubbub1->watermarks.c.cstate_pstate.pstate_change_ns =
 				watermarks->c.cstate_pstate.pstate_change_ns;
 		prog_wm_value = convert_and_clamp(
 				watermarks->c.cstate_pstate.pstate_change_ns,
@@ -453,8 +473,8 @@ void hubbub1_program_watermarks(
 	}
 
 	/* clock state D */
-	if (safe_to_lower || watermarks->d.urgent_ns > hubbub->watermarks.d.urgent_ns) {
-		hubbub->watermarks.d.urgent_ns = watermarks->d.urgent_ns;
+	if (safe_to_lower || watermarks->d.urgent_ns > hubbub1->watermarks.d.urgent_ns) {
+		hubbub1->watermarks.d.urgent_ns = watermarks->d.urgent_ns;
 		prog_wm_value = convert_and_clamp(watermarks->d.urgent_ns,
 				refclk_mhz, 0x1fffff);
 		REG_WRITE(DCHUBBUB_ARB_DATA_URGENCY_WATERMARK_D, prog_wm_value);
@@ -464,20 +484,22 @@ void hubbub1_program_watermarks(
 			watermarks->d.urgent_ns, prog_wm_value);
 	}
 
-	if (safe_to_lower || watermarks->d.pte_meta_urgent_ns > hubbub->watermarks.d.pte_meta_urgent_ns) {
-		hubbub->watermarks.d.pte_meta_urgent_ns = watermarks->d.pte_meta_urgent_ns;
-		prog_wm_value = convert_and_clamp(watermarks->d.pte_meta_urgent_ns,
-				refclk_mhz, 0x1fffff);
-		REG_WRITE(DCHUBBUB_ARB_PTE_META_URGENCY_WATERMARK_D, prog_wm_value);
-		DC_LOG_BANDWIDTH_CALCS("PTE_META_URGENCY_WATERMARK_D calculated =%d\n"
-			"HW register value = 0x%x\n",
-			watermarks->d.pte_meta_urgent_ns, prog_wm_value);
+	if (REG(DCHUBBUB_ARB_PTE_META_URGENCY_WATERMARK_D)) {
+		if (safe_to_lower || watermarks->d.pte_meta_urgent_ns > hubbub1->watermarks.d.pte_meta_urgent_ns) {
+			hubbub1->watermarks.d.pte_meta_urgent_ns = watermarks->d.pte_meta_urgent_ns;
+			prog_wm_value = convert_and_clamp(watermarks->d.pte_meta_urgent_ns,
+					refclk_mhz, 0x1fffff);
+			REG_WRITE(DCHUBBUB_ARB_PTE_META_URGENCY_WATERMARK_D, prog_wm_value);
+			DC_LOG_BANDWIDTH_CALCS("PTE_META_URGENCY_WATERMARK_D calculated =%d\n"
+				"HW register value = 0x%x\n",
+				watermarks->d.pte_meta_urgent_ns, prog_wm_value);
+		}
 	}
 
 	if (REG(DCHUBBUB_ARB_ALLOW_SR_ENTER_WATERMARK_D)) {
 		if (safe_to_lower || watermarks->d.cstate_pstate.cstate_enter_plus_exit_ns
-				> hubbub->watermarks.d.cstate_pstate.cstate_enter_plus_exit_ns) {
-			hubbub->watermarks.d.cstate_pstate.cstate_enter_plus_exit_ns =
+				> hubbub1->watermarks.d.cstate_pstate.cstate_enter_plus_exit_ns) {
+			hubbub1->watermarks.d.cstate_pstate.cstate_enter_plus_exit_ns =
 					watermarks->d.cstate_pstate.cstate_enter_plus_exit_ns;
 			prog_wm_value = convert_and_clamp(
 					watermarks->d.cstate_pstate.cstate_enter_plus_exit_ns,
@@ -489,8 +511,8 @@ void hubbub1_program_watermarks(
 		}
 
 		if (safe_to_lower || watermarks->d.cstate_pstate.cstate_exit_ns
-				> hubbub->watermarks.d.cstate_pstate.cstate_exit_ns) {
-			hubbub->watermarks.d.cstate_pstate.cstate_exit_ns =
+				> hubbub1->watermarks.d.cstate_pstate.cstate_exit_ns) {
+			hubbub1->watermarks.d.cstate_pstate.cstate_exit_ns =
 					watermarks->d.cstate_pstate.cstate_exit_ns;
 			prog_wm_value = convert_and_clamp(
 					watermarks->d.cstate_pstate.cstate_exit_ns,
@@ -503,8 +525,8 @@ void hubbub1_program_watermarks(
 	}
 
 	if (safe_to_lower || watermarks->d.cstate_pstate.pstate_change_ns
-			> hubbub->watermarks.d.cstate_pstate.pstate_change_ns) {
-		hubbub->watermarks.d.cstate_pstate.pstate_change_ns =
+			> hubbub1->watermarks.d.cstate_pstate.pstate_change_ns) {
+		hubbub1->watermarks.d.cstate_pstate.pstate_change_ns =
 				watermarks->d.cstate_pstate.pstate_change_ns;
 		prog_wm_value = convert_and_clamp(
 				watermarks->d.cstate_pstate.pstate_change_ns,
@@ -520,9 +542,7 @@ void hubbub1_program_watermarks(
 	REG_UPDATE(DCHUBBUB_ARB_DF_REQ_OUTSTAND,
 			DCHUBBUB_ARB_MIN_REQ_OUTSTAND, 68);
 
-	REG_UPDATE_2(DCHUBBUB_ARB_DRAM_STATE_CNTL,
-			DCHUBBUB_ARB_ALLOW_SELF_REFRESH_FORCE_VALUE, 0,
-			DCHUBBUB_ARB_ALLOW_SELF_REFRESH_FORCE_ENABLE, force_en);
+	hubbub1_allow_self_refresh_control(hubbub, !hubbub->ctx->dc->debug.disable_stutter);
 
 #if 0
 	REG_UPDATE_2(DCHUBBUB_ARB_WATERMARK_CHANGE_CNTL,
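
hubbub1_allow_self_refresh_control() replaces the open-coded REG_UPDATE_2 above. Judging from the fields it replaces, "allow" most likely maps onto the force-enable bit: forcing ALLOW_SELF_REFRESH to a value of 0 is how stutter is disallowed, and allowing it simply drops the force. A hedged sketch; dram_state and the helper below are illustrative, not the driver's API.

/* Presumed behavior, inferred from the REG_UPDATE_2 this call replaces. */
#include <stdbool.h>

struct dram_state { unsigned int force_value, force_enable; };

static void allow_self_refresh_control(struct dram_state *s, bool allow)
{
	s->force_value  = 0;             /* forced value: self refresh not allowed */
	s->force_enable = allow ? 0 : 1; /* only apply the force when disallowing */
}
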
@@ -535,6 +555,8 @@ void hubbub1_update_dchub(
 	struct hubbub *hubbub,
 	struct dchub_init_data *dh_data)
 {
+	struct dcn10_hubbub *hubbub1 = TO_DCN10_HUBBUB(hubbub);
+
 	if (REG(DCHUBBUB_SDPIF_FB_TOP) == 0) {
 		ASSERT(false);
 		/*should not come here*/
@@ -594,6 +616,8 @@ void hubbub1_update_dchub(
 
 void hubbub1_toggle_watermark_change_req(struct hubbub *hubbub)
 {
+	struct dcn10_hubbub *hubbub1 = TO_DCN10_HUBBUB(hubbub);
+
 	uint32_t watermark_change_req;
 
 	REG_GET(DCHUBBUB_ARB_WATERMARK_CHANGE_CNTL,
@@ -610,6 +634,8 @@ void hubbub1_toggle_watermark_change_req(struct hubbub *hubbub)
 
 void hubbub1_soft_reset(struct hubbub *hubbub, bool reset)
 {
+	struct dcn10_hubbub *hubbub1 = TO_DCN10_HUBBUB(hubbub);
+
 	uint32_t reset_en = reset ? 1 : 0;
 
 	REG_UPDATE(DCHUBBUB_SOFT_RESET,
@@ -752,7 +778,9 @@ static bool hubbub1_get_dcc_compression_cap(struct hubbub *hubbub,
 		const struct dc_dcc_surface_param *input,
 		struct dc_surface_dcc_cap *output)
 {
-	struct dc *dc = hubbub->ctx->dc;
+	struct dcn10_hubbub *hubbub1 = TO_DCN10_HUBBUB(hubbub);
+	struct dc *dc = hubbub1->base.ctx->dc;
+
 	/* implement section 1.6.2.1 of DCN1_Programming_Guide.docx */
 	enum dcc_control dcc_control;
 	unsigned int bpe;
@@ -764,10 +792,10 @@ static bool hubbub1_get_dcc_compression_cap(struct hubbub *hubbub,
 	if (dc->debug.disable_dcc == DCC_DISABLE)
 		return false;
 
-	if (!hubbub->funcs->dcc_support_pixel_format(input->format, &bpe))
+	if (!hubbub1->base.funcs->dcc_support_pixel_format(input->format, &bpe))
 		return false;
 
-	if (!hubbub->funcs->dcc_support_swizzle(input->swizzle_mode, bpe,
+	if (!hubbub1->base.funcs->dcc_support_swizzle(input->swizzle_mode, bpe,
 			&segment_order_horz, &segment_order_vert))
 		return false;
 
@@ -837,6 +865,7 @@ static const struct hubbub_funcs hubbub1_funcs = {
 	.dcc_support_swizzle = hubbub1_dcc_support_swizzle,
 	.dcc_support_pixel_format = hubbub1_dcc_support_pixel_format,
 	.get_dcc_compression_cap = hubbub1_get_dcc_compression_cap,
+	.wm_read_state = hubbub1_wm_read_state,
 };
 
 void hubbub1_construct(struct hubbub *hubbub,
@@ -845,18 +874,20 @@ void hubbub1_construct(struct hubbub *hubbub,
 	const struct dcn_hubbub_shift *hubbub_shift,
 	const struct dcn_hubbub_mask *hubbub_mask)
 {
-	hubbub->ctx = ctx;
+	struct dcn10_hubbub *hubbub1 = TO_DCN10_HUBBUB(hubbub);
+
+	hubbub1->base.ctx = ctx;
 
-	hubbub->funcs = &hubbub1_funcs;
+	hubbub1->base.funcs = &hubbub1_funcs;
 
-	hubbub->regs = hubbub_regs;
-	hubbub->shifts = hubbub_shift;
-	hubbub->masks = hubbub_mask;
+	hubbub1->regs = hubbub_regs;
+	hubbub1->shifts = hubbub_shift;
+	hubbub1->masks = hubbub_mask;
 
-	hubbub->debug_test_index_pstate = 0x7;
+	hubbub1->debug_test_index_pstate = 0x7;
 #if defined(CONFIG_DRM_AMD_DC_DCN1_01)
 	if (ctx->dce_version == DCN_VERSION_1_01)
-		hubbub->debug_test_index_pstate = 0xB;
+		hubbub1->debug_test_index_pstate = 0xB;
 #endif
 }
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.h b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.h
index d0f03d152913..9cd4a5194154 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.h
@@ -29,6 +29,9 @@
 #include "core_types.h"
 #include "dchubbub.h"
 
+#define TO_DCN10_HUBBUB(hubbub)\
+	container_of(hubbub, struct dcn10_hubbub, base)
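
TO_DCN10_HUBBUB() is the usual kernel subclassing idiom: struct dcn10_hubbub embeds the generic struct hubbub as its member "base", and container_of() recovers the derived pointer from a base pointer. A standalone userspace illustration of the same layout:

/* Minimal sketch of the container_of() downcast behind TO_DCN10_HUBBUB(). */
#include <stddef.h>
#include <stdio.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct hubbub { int dummy; };
struct dcn10_hubbub {
	struct hubbub base;              /* embedded generic object */
	int debug_test_index_pstate;
};

int main(void)
{
	struct dcn10_hubbub hb1 = { .debug_test_index_pstate = 0x7 };
	struct hubbub *base = &hb1.base;             /* generic handle */
	struct dcn10_hubbub *derived =
		container_of(base, struct dcn10_hubbub, base);

	printf("0x%x\n", derived->debug_test_index_pstate);  /* prints 0x7 */
	return 0;
}
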
+
 #define HUBHUB_REG_LIST_DCN()\
 	SR(DCHUBBUB_ARB_DATA_URGENCY_WATERMARK_A),\
 	SR(DCHUBBUB_ARB_PTE_META_URGENCY_WATERMARK_A),\
@@ -107,6 +110,12 @@ struct dcn_hubbub_registers {
 	uint32_t DCHUBBUB_SDPIF_AGP_TOP;
 	uint32_t DCHUBBUB_CRC_CTRL;
 	uint32_t DCHUBBUB_SOFT_RESET;
+	uint32_t DCN_VM_FB_LOCATION_BASE;
+	uint32_t DCN_VM_FB_LOCATION_TOP;
+	uint32_t DCN_VM_FB_OFFSET;
+	uint32_t DCN_VM_AGP_BOT;
+	uint32_t DCN_VM_AGP_TOP;
+	uint32_t DCN_VM_AGP_BASE;
 };
 
 /* set field name */
@@ -152,7 +161,13 @@ struct dcn_hubbub_registers {
 		type SDPIF_FB_OFFSET;\
 		type SDPIF_AGP_BASE;\
 		type SDPIF_AGP_BOT;\
-		type SDPIF_AGP_TOP
+		type SDPIF_AGP_TOP;\
+		type FB_BASE;\
+		type FB_TOP;\
+		type FB_OFFSET;\
+		type AGP_BOT;\
+		type AGP_TOP;\
+		type AGP_BASE
 
 
 struct dcn_hubbub_shift {
@@ -165,22 +180,8 @@ struct dcn_hubbub_mask {
 
 struct dc;
 
-struct dcn_hubbub_wm_set {
-	uint32_t wm_set;
-	uint32_t data_urgent;
-	uint32_t pte_meta_urgent;
-	uint32_t sr_enter;
-	uint32_t sr_exit;
-	uint32_t dram_clk_chanage;
-};
-
-struct dcn_hubbub_wm {
-	struct dcn_hubbub_wm_set sets[4];
-};
-
-struct hubbub {
-	const struct hubbub_funcs *funcs;
-	struct dc_context *ctx;
+struct dcn10_hubbub {
+	struct hubbub base;
 	const struct dcn_hubbub_registers *regs;
 	const struct dcn_hubbub_shift *shifts;
 	const struct dcn_hubbub_mask *masks;
@@ -203,7 +204,7 @@ void hubbub1_program_watermarks(
 		unsigned int refclk_mhz,
 		bool safe_to_lower);
 
-void hubbub1_disable_allow_self_refresh(struct hubbub *hubbub);
+void hubbub1_allow_self_refresh_control(struct hubbub *hubbub, bool allow);
 
 bool hububu1_is_allow_self_refresh_enabled(struct hubbub *hubub);
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c
index d1acd7165bc8..683829466a44 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.c
@@ -115,7 +115,7 @@ static void hubp1_set_hubp_blank_en(struct hubp *hubp, bool blank)
 	REG_UPDATE(DCHUBP_CNTL, HUBP_BLANK_EN, blank_en);
 }
 
-static void hubp1_vready_workaround(struct hubp *hubp,
+void hubp1_vready_workaround(struct hubp *hubp,
 		struct _vcs_dpi_display_pipe_dest_params_st *pipe_dest)
 {
 	uint32_t value = 0;
@@ -317,7 +317,8 @@ void hubp1_program_pixel_format(
 bool hubp1_program_surface_flip_and_addr(
 	struct hubp *hubp,
 	const struct dc_plane_address *address,
-	bool flip_immediate)
+	bool flip_immediate,
+	uint8_t vmid)
 {
 	struct dcn10_hubp *hubp1 = TO_DCN10_HUBP(hubp);
 
@@ -1149,9 +1150,28 @@ void hubp1_cursor_set_position(
 	REG_UPDATE(CURSOR_CONTROL,
 			CURSOR_ENABLE, cur_en);
 
-	REG_SET_2(CURSOR_POSITION, 0,
-			CURSOR_X_POSITION, pos->x,
+	// account for cases where we see a negative offset relative to the overlay plane
+	if (src_x_offset < 0 && src_y_offset < 0) {
+		REG_SET_2(CURSOR_POSITION, 0,
+			CURSOR_X_POSITION, 0,
+			CURSOR_Y_POSITION, 0);
+		x_hotspot -= src_x_offset;
+		y_hotspot -= src_y_offset;
+	} else if (src_x_offset < 0) {
+		REG_SET_2(CURSOR_POSITION, 0,
+			CURSOR_X_POSITION, 0,
 			CURSOR_Y_POSITION, pos->y);
+		x_hotspot -= src_x_offset;
+	} else if (src_y_offset < 0) {
+		REG_SET_2(CURSOR_POSITION, 0,
+			CURSOR_X_POSITION, pos->x,
+			CURSOR_Y_POSITION, 0);
+		y_hotspot -= src_y_offset;
+	} else {
+		REG_SET_2(CURSOR_POSITION, 0,
+				CURSOR_X_POSITION, pos->x,
+				CURSOR_Y_POSITION, pos->y);
+	}
 
 	REG_SET_2(CURSOR_HOT_SPOT, 0,
 			CURSOR_HOT_SPOT_X, x_hotspot,
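
The four-way branch added above can be read as two independent clamps: whenever the cursor origin falls off the left or top edge of the plane, the programmed position is pinned to 0 and the negative overshoot is folded into the hotspot, so the visible cursor image stays put. An equivalent, branch-collapsed sketch; all names here are hypothetical.

/* Equivalent restatement of the cursor clamping above (not driver code). */
static void clamp_cursor_position(int src_x_offset, int src_y_offset,
				  int pos_x, int pos_y,
				  int *x_hotspot, int *y_hotspot,
				  int *reg_x, int *reg_y)
{
	*reg_x = pos_x;
	*reg_y = pos_y;

	if (src_x_offset < 0) {
		*reg_x = 0;
		*x_hotspot -= src_x_offset;  /* subtracting a negative grows it */
	}
	if (src_y_offset < 0) {
		*reg_y = 0;
		*y_hotspot -= src_y_offset;
	}
}
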
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h
index 62d4232e7796..a6d6dfe00617 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubp.h
@@ -707,11 +707,6 @@ void hubp1_dcc_control(struct hubp *hubp,
 		bool enable,
 		bool independent_64b_blks);
 
-bool hubp1_program_surface_flip_and_addr(
-	struct hubp *hubp,
-	const struct dc_plane_address *address,
-	bool flip_immediate);
-
 bool hubp1_is_flip_pending(struct hubp *hubp);
 
 void hubp1_cursor_set_attributes(
@@ -745,5 +740,7 @@ void hubp1_clear_underflow(struct hubp *hubp);
 
 enum cursor_pitch hubp1_get_cursor_pitch(unsigned int pitch);
 
+void hubp1_vready_workaround(struct hubp *hubp,
+		struct _vcs_dpi_display_pipe_dest_params_st *pipe_dest);
 
 #endif
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 41883c981789..d1a8f1c302a9 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -40,7 +40,6 @@
 #include "ipp.h"
 #include "mpc.h"
 #include "reg_helper.h"
-#include "custom_float.h"
 #include "dcn10_hubp.h"
 #include "dcn10_hubbub.h"
 #include "dcn10_cm_common.h"
@@ -92,10 +91,11 @@ static void log_mpc_crc(struct dc *dc,
 void dcn10_log_hubbub_state(struct dc *dc, struct dc_log_buffer_ctx *log_ctx)
 {
 	struct dc_context *dc_ctx = dc->ctx;
-	struct dcn_hubbub_wm wm = {0};
+	struct dcn_hubbub_wm wm;
 	int i;
 
-	hubbub1_wm_read_state(dc->res_pool->hubbub, &wm);
+	memset(&wm, 0, sizeof(struct dcn_hubbub_wm));
+	dc->res_pool->hubbub->funcs->wm_read_state(dc->res_pool->hubbub, &wm);
 
 	DTN_INFO("HUBBUB WM:      data_urgent  pte_meta_urgent"
 			"         sr_enter          sr_exit  dram_clk_change\n");
@@ -636,8 +636,6 @@ static enum dc_status dcn10_enable_stream_timing(
 	struct dc_stream_state *stream = pipe_ctx->stream;
 	enum dc_color_space color_space;
 	struct tg_color black_color = {0};
-	struct drr_params params = {0};
-	unsigned int event_triggers = 0;
 
 	/* by upper caller loop, pipe0 is parent pipe and be called first.
 	 * back end is set up by for pipe0. Other children pipe share back end
@@ -705,19 +703,6 @@ static enum dc_status dcn10_enable_stream_timing(
 		return DC_ERROR_UNEXPECTED;
 	}
 
-	params.vertical_total_min = stream->adjust.v_total_min;
-	params.vertical_total_max = stream->adjust.v_total_max;
-	if (pipe_ctx->stream_res.tg->funcs->set_drr)
-		pipe_ctx->stream_res.tg->funcs->set_drr(
-			pipe_ctx->stream_res.tg, &params);
-
-	// DRR should set trigger event to monitor surface update event
-	if (stream->adjust.v_total_min != 0 && stream->adjust.v_total_max != 0)
-		event_triggers = 0x80;
-	if (pipe_ctx->stream_res.tg->funcs->set_static_screen_control)
-		pipe_ctx->stream_res.tg->funcs->set_static_screen_control(
-				pipe_ctx->stream_res.tg, event_triggers);
-
 	/* TODO program crtc source select for non-virtual signal*/
 	/* TODO program FMT */
 	/* TODO setup link_enc */
@@ -971,92 +956,62 @@ static void dcn10_disable_plane(struct dc *dc, struct pipe_ctx *pipe_ctx)
 					pipe_ctx->pipe_idx);
 }
 
-static void dcn10_init_hw(struct dc *dc)
+static void dcn10_init_pipes(struct dc *dc, struct dc_state *context)
 {
 	int i;
-	struct abm *abm = dc->res_pool->abm;
-	struct dmcu *dmcu = dc->res_pool->dmcu;
-	struct dce_hwseq *hws = dc->hwseq;
-	struct dc_bios *dcb = dc->ctx->dc_bios;
-	struct dc_state  *context = dc->current_state;
-
-	if (IS_FPGA_MAXIMUS_DC(dc->ctx->dce_environment)) {
-		REG_WRITE(REFCLK_CNTL, 0);
-		REG_UPDATE(DCHUBBUB_GLOBAL_TIMER_CNTL, DCHUBBUB_GLOBAL_TIMER_ENABLE, 1);
-		REG_WRITE(DIO_MEM_PWR_CTRL, 0);
-
-		if (!dc->debug.disable_clock_gate) {
-			/* enable all DCN clock gating */
-			REG_WRITE(DCCG_GATE_DISABLE_CNTL, 0);
-
-			REG_WRITE(DCCG_GATE_DISABLE_CNTL2, 0);
-
-			REG_UPDATE(DCFCLK_CNTL, DCFCLK_GATE_DIS, 0);
-		}
-
-		enable_power_gating_plane(dc->hwseq, true);
-	} else {
-
-		if (!dcb->funcs->is_accelerated_mode(dcb)) {
-			bool allow_self_fresh_force_enable =
-					hububu1_is_allow_self_refresh_enabled(dc->res_pool->hubbub);
-
-			bios_golden_init(dc);
-
-			/* WA for making DF sleep when idle after resume from S0i3.
-			 * DCHUBBUB_ARB_ALLOW_SELF_REFRESH_FORCE_ENABLE is set to 1 by
-			 * command table, if DCHUBBUB_ARB_ALLOW_SELF_REFRESH_FORCE_ENABLE = 0
-			 * before calling command table and it changed to 1 after,
-			 * it should be set back to 0.
-			 */
-			if (allow_self_fresh_force_enable == false &&
-					hububu1_is_allow_self_refresh_enabled(dc->res_pool->hubbub))
-				hubbub1_disable_allow_self_refresh(dc->res_pool->hubbub);
-
-			disable_vga(dc->hwseq);
-		}
+	bool can_apply_seamless_boot = false;
 
-		for (i = 0; i < dc->link_count; i++) {
-			/* Power up AND update implementation according to the
-			 * required signal (which may be different from the
-			 * default signal on connector).
-			 */
-			struct dc_link *link = dc->links[i];
-
-			if (link->link_enc->connector.id == CONNECTOR_ID_EDP)
-				dc->hwss.edp_power_control(link, true);
-
-			link->link_enc->funcs->hw_init(link->link_enc);
+	for (i = 0; i < context->stream_count; i++) {
+		if (context->streams[i]->apply_seamless_boot_optimization) {
+			can_apply_seamless_boot = true;
+			break;
 		}
 	}
 
 	for (i = 0; i < dc->res_pool->pipe_count; i++) {
 		struct timing_generator *tg = dc->res_pool->timing_generators[i];
+		struct pipe_ctx *pipe_ctx = &context->res_ctx.pipe_ctx[i];
+
+		/* There is an assumption that pipe_ctx is not mapped irregularly
+		 * to a non-preferred front end. If pipe_ctx->stream is not NULL,
+		 * we will use the pipe, so don't disable it.
+		 */
+		if (pipe_ctx->stream != NULL)
+			continue;
 
 		if (tg->funcs->is_tg_enabled(tg))
 			tg->funcs->lock(tg);
-	}
-
-	/* Blank controller using driver code instead of
-	 * command table.
-	 */
-	for (i = 0; i < dc->res_pool->pipe_count; i++) {
-		struct timing_generator *tg = dc->res_pool->timing_generators[i];
 
+		/* Blank controller using driver code instead of
+		 * command table.
+		 */
 		if (tg->funcs->is_tg_enabled(tg)) {
 			tg->funcs->set_blank(tg, true);
 			hwss_wait_for_blank_complete(tg);
 		}
 	}
 
-	/* Reset all MPCC muxes */
-	dc->res_pool->mpc->funcs->mpc_init(dc->res_pool->mpc);
+	/* Cannot reset the MPC mux if seamless boot */
+	if (!can_apply_seamless_boot)
+		dc->res_pool->mpc->funcs->mpc_init(dc->res_pool->mpc);
 
-	for (i = 0; i < dc->res_pool->timing_generator_count; i++) {
+	for (i = 0; i < dc->res_pool->pipe_count; i++) {
 		struct timing_generator *tg = dc->res_pool->timing_generators[i];
-		struct pipe_ctx *pipe_ctx = &context->res_ctx.pipe_ctx[i];
 		struct hubp *hubp = dc->res_pool->hubps[i];
 		struct dpp *dpp = dc->res_pool->dpps[i];
+		struct pipe_ctx *pipe_ctx = &context->res_ctx.pipe_ctx[i];
+
+		// W/A for issue with dc_post_update_surfaces_to_stream
+		hubp->power_gated = true;
+
+		/* There is an assumption that pipe_ctx is not mapped irregularly
+		 * to a non-preferred front end. If pipe_ctx->stream is not NULL,
+		 * we will use the pipe, so don't disable it.
+		 */
+		if (pipe_ctx->stream != NULL)
+			continue;
+
+		dpp->funcs->dpp_reset(dpp);
 
 		pipe_ctx->stream_res.tg = tg;
 		pipe_ctx->pipe_idx = i;
@@ -1074,18 +1029,9 @@ static void dcn10_init_hw(struct dc *dc)
 		pipe_ctx->stream_res.opp = dc->res_pool->opps[i];
 
 		hwss1_plane_atomic_disconnect(dc, pipe_ctx);
-	}
-
-	for (i = 0; i < dc->res_pool->pipe_count; i++) {
-		struct timing_generator *tg = dc->res_pool->timing_generators[i];
 
 		if (tg->funcs->is_tg_enabled(tg))
 			tg->funcs->unlock(tg);
-	}
-
-	for (i = 0; i < dc->res_pool->pipe_count; i++) {
-		struct timing_generator *tg = dc->res_pool->timing_generators[i];
-		struct pipe_ctx *pipe_ctx = &context->res_ctx.pipe_ctx[i];
 
 		dcn10_disable_plane(dc, pipe_ctx);
 
@@ -1094,10 +1040,73 @@ static void dcn10_init_hw(struct dc *dc)
 
 		tg->funcs->tg_init(tg);
 	}
+}
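
dcn10_init_pipes() now carries the per-pipe teardown that used to live inline in dcn10_init_hw(), and it skips both pipes that already drive a stream and, when any stream requests seamless boot, the MPC mux reset, so a firmware-lit display survives the handoff without a glitch. A sketch of the gating scan; types and names below are illustrative.

/* Sketch (not driver code) of the seamless-boot scan added above. */
#include <stdbool.h>

struct dc_stream { bool apply_seamless_boot_optimization; };

static bool can_apply_seamless_boot(struct dc_stream *const *streams, int n)
{
	for (int i = 0; i < n; i++)
		if (streams[i]->apply_seamless_boot_optimization)
			return true;   /* firmware-lit display must not glitch */
	return false;
}

/* caller: if (!can_apply_seamless_boot(streams, n)) reset the MPC tree */
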
 
-	/* end of FPGA. Below if real ASIC */
-	if (IS_FPGA_MAXIMUS_DC(dc->ctx->dce_environment))
+static void dcn10_init_hw(struct dc *dc)
+{
+	int i;
+	struct abm *abm = dc->res_pool->abm;
+	struct dmcu *dmcu = dc->res_pool->dmcu;
+	struct dce_hwseq *hws = dc->hwseq;
+	struct dc_bios *dcb = dc->ctx->dc_bios;
+
+	if (IS_FPGA_MAXIMUS_DC(dc->ctx->dce_environment)) {
+		REG_WRITE(REFCLK_CNTL, 0);
+		REG_UPDATE(DCHUBBUB_GLOBAL_TIMER_CNTL, DCHUBBUB_GLOBAL_TIMER_ENABLE, 1);
+		REG_WRITE(DIO_MEM_PWR_CTRL, 0);
+
+		if (!dc->debug.disable_clock_gate) {
+			/* enable all DCN clock gating */
+			REG_WRITE(DCCG_GATE_DISABLE_CNTL, 0);
+
+			REG_WRITE(DCCG_GATE_DISABLE_CNTL2, 0);
+
+			REG_UPDATE(DCFCLK_CNTL, DCFCLK_GATE_DIS, 0);
+		}
+
+		enable_power_gating_plane(dc->hwseq, true);
+
+		/* end of FPGA. Below if real ASIC */
 		return;
+	}
+
+	if (!dcb->funcs->is_accelerated_mode(dcb)) {
+		bool allow_self_fresh_force_enable =
+			hububu1_is_allow_self_refresh_enabled(
+						dc->res_pool->hubbub);
+
+		bios_golden_init(dc);
+
+		/* WA for making DF sleep when idle after resume from S0i3.
+		 * DCHUBBUB_ARB_ALLOW_SELF_REFRESH_FORCE_ENABLE is set to 1 by the
+		 * command table; if DCHUBBUB_ARB_ALLOW_SELF_REFRESH_FORCE_ENABLE was 0
+		 * before calling the command table and changed to 1 after,
+		 * it should be set back to 0.
+		 */
+		if (allow_self_fresh_force_enable == false &&
+				hububu1_is_allow_self_refresh_enabled(dc->res_pool->hubbub))
+			hubbub1_allow_self_refresh_control(dc->res_pool->hubbub, true);
+
+		disable_vga(dc->hwseq);
+	}
+
+	for (i = 0; i < dc->link_count; i++) {
+		/* Power up AND update implementation according to the
+		 * required signal (which may be different from the
+		 * default signal on connector).
+		 */
+		struct dc_link *link = dc->links[i];
+
+		if (link->link_enc->connector.id == CONNECTOR_ID_EDP)
+			dc->hwss.edp_power_control(link, true);
+
+		link->link_enc->funcs->hw_init(link->link_enc);
+
+		/* Check for enabled DIG to identify enabled display */
+		if (link->link_enc->funcs->is_dig_enabled &&
+			link->link_enc->funcs->is_dig_enabled(link->link_enc))
+			link->link_status.link_active = true;
+	}
 
 	for (i = 0; i < dc->res_pool->audio_count; i++) {
 		struct audio *audio = dc->res_pool->audios[i];
@@ -1128,6 +1137,9 @@ static void dcn10_init_hw(struct dc *dc)
 	enable_power_gating_plane(dc->hwseq, true);
 
 	memset(&dc->res_pool->clk_mgr->clks, 0, sizeof(dc->res_pool->clk_mgr->clks));
+
+	if (dc->hwss.init_pipes)
+		dc->hwss.init_pipes(dc, dc->current_state);
 }
 
 static void reset_hw_ctx_wrap(
@@ -1153,11 +1165,13 @@ static void reset_hw_ctx_wrap(
 			struct clock_source *old_clk = pipe_ctx_old->clock_source;
 
 			reset_back_end_for_pipe(dc, pipe_ctx_old, dc->current_state);
+			if (dc->hwss.enable_stream_gating) {
+				dc->hwss.enable_stream_gating(dc, pipe_ctx);
+			}
 			if (old_clk)
 				old_clk->funcs->cs_power_down(old_clk);
 		}
 	}
-
 }
 
 static bool patch_address_for_sbs_tb_stereo(
@@ -1202,7 +1216,8 @@ static void dcn10_update_plane_addr(const struct dc *dc, struct pipe_ctx *pipe_c
 	pipe_ctx->plane_res.hubp->funcs->hubp_program_surface_flip_and_addr(
 			pipe_ctx->plane_res.hubp,
 			&plane_state->address,
-			plane_state->flip_immediate);
+			plane_state->flip_immediate,
+			0);
 
 	plane_state->status.requested_address = plane_state->address;
 
@@ -2048,7 +2063,7 @@ void update_dchubp_dpp(
 			dc->res_pool->dccg->funcs->update_dpp_dto(
 					dc->res_pool->dccg,
 					dpp->inst,
-					pipe_ctx->plane_res.bw.calc.dppclk_khz);
+					pipe_ctx->plane_res.bw.dppclk_khz);
 		else
 			dc->res_pool->clk_mgr->clks.dppclk_khz = should_divided_by_2 ?
 						dc->res_pool->clk_mgr->clks.dispclk_khz / 2 :
@@ -2125,7 +2140,8 @@ void update_dchubp_dpp(
 		plane_state->update_flags.bits.swizzle_change ||
 		plane_state->update_flags.bits.dcc_change ||
 		plane_state->update_flags.bits.bpp_change ||
-		plane_state->update_flags.bits.scaling_change) {
+		plane_state->update_flags.bits.scaling_change ||
+		plane_state->update_flags.bits.plane_size_change) {
 		hubp->funcs->hubp_program_surface_config(
 			hubp,
 			plane_state->format,
@@ -2176,8 +2192,10 @@ static void dcn10_blank_pixel_data(
 	if (!blank) {
 		if (stream_res->tg->funcs->set_blank)
 			stream_res->tg->funcs->set_blank(stream_res->tg, blank);
-		if (stream_res->abm)
+		if (stream_res->abm) {
+			stream_res->abm->funcs->set_pipe(stream_res->abm, stream_res->tg->inst + 1);
 			stream_res->abm->funcs->set_abm_level(stream_res->abm, stream->abm_level);
+		}
 	} else if (blank) {
 		if (stream_res->abm)
 			stream_res->abm->funcs->set_abm_immediate_disable(stream_res->abm);
@@ -2252,13 +2270,11 @@ static void program_all_pipe_in_tree(
 
 	}
 
-	if (pipe_ctx->plane_state != NULL) {
+	if (pipe_ctx->plane_state != NULL)
 		dcn10_program_pipe(dc, pipe_ctx, context);
-	}
 
-	if (pipe_ctx->bottom_pipe != NULL && pipe_ctx->bottom_pipe != pipe_ctx) {
+	if (pipe_ctx->bottom_pipe != NULL && pipe_ctx->bottom_pipe != pipe_ctx)
 		program_all_pipe_in_tree(dc, pipe_ctx->bottom_pipe, context);
-	}
 }
 
 struct pipe_ctx *find_top_pipe_for_stream(
@@ -2334,9 +2350,10 @@ static void dcn10_apply_ctx_for_surface(
 			}
 		}
 
-		if (!pipe_ctx->plane_state &&
-			old_pipe_ctx->plane_state &&
-			old_pipe_ctx->stream_res.tg == tg) {
+		if ((!pipe_ctx->plane_state ||
+		     pipe_ctx->stream_res.tg != old_pipe_ctx->stream_res.tg) &&
+		    old_pipe_ctx->plane_state &&
+		    old_pipe_ctx->stream_res.tg == tg) {
 
 			dc->hwss.plane_atomic_disconnect(dc, old_pipe_ctx);
 			removed_pipe[i] = true;
@@ -2383,6 +2400,22 @@ static void dcn10_apply_ctx_for_surface(
 		hubbub1_wm_change_req_wa(dc->res_pool->hubbub);
 }
 
+static void dcn10_stereo_hw_frame_pack_wa(struct dc *dc, struct dc_state *context)
+{
+	uint8_t i;
+
+	for (i = 0; i < context->stream_count; i++) {
+		if (context->streams[i]->timing.timing_3d_format
+				== TIMING_3D_FORMAT_HW_FRAME_PACKING) {
+			/*
+			 * Disable stutter
+			 */
+			hubbub1_allow_self_refresh_control(dc->res_pool->hubbub, false);
+			break;
+		}
+	}
+}
+
 static void dcn10_prepare_bandwidth(
 		struct dc *dc,
 		struct dc_state *context)
@@ -2404,6 +2437,7 @@ static void dcn10_prepare_bandwidth(
 			&context->bw.dcn.watermarks,
 			dc->res_pool->ref_clock_inKhz / 1000,
 			true);
+	dcn10_stereo_hw_frame_pack_wa(dc, context);
 
 	if (dc->debug.pplib_wm_report_mode == WM_REPORT_OVERRIDE)
 		dcn_bw_notify_pplib_of_wm_ranges(dc);
@@ -2433,6 +2467,7 @@ static void dcn10_optimize_bandwidth(
 			&context->bw.dcn.watermarks,
 			dc->res_pool->ref_clock_inKhz / 1000,
 			true);
+	dcn10_stereo_hw_frame_pack_wa(dc, context);
 
 	if (dc->debug.pplib_wm_report_mode == WM_REPORT_OVERRIDE)
 		dcn_bw_notify_pplib_of_wm_ranges(dc);
@@ -2518,7 +2553,7 @@ static void dcn10_config_stereo_parameters(
 			timing_3d_format == TIMING_3D_FORMAT_DP_HDMI_INBAND_FA ||
 			timing_3d_format == TIMING_3D_FORMAT_SIDEBAND_FA) {
 			enum display_dongle_type dongle = \
-					stream->sink->link->ddc->dongle_type;
+					stream->link->ddc->dongle_type;
 			if (dongle == DISPLAY_DONGLE_DP_VGA_CONVERTER ||
 				dongle == DISPLAY_DONGLE_DP_DVI_CONVERTER ||
 				dongle == DISPLAY_DONGLE_DP_HDMI_CONVERTER)
@@ -2649,7 +2684,7 @@ static void dcn10_set_cursor_position(struct pipe_ctx *pipe_ctx)
 	struct hubp *hubp = pipe_ctx->plane_res.hubp;
 	struct dpp *dpp = pipe_ctx->plane_res.dpp;
 	struct dc_cursor_mi_param param = {
-		.pixel_clk_khz = pipe_ctx->stream->timing.pix_clk_khz,
+		.pixel_clk_khz = pipe_ctx->stream->timing.pix_clk_100hz / 10,
 		.ref_clk_khz = pipe_ctx->stream->ctx->dc->res_pool->ref_clock_inKhz,
 		.viewport = pipe_ctx->plane_res.scl_data.viewport,
 		.h_scale_ratio = pipe_ctx->plane_res.scl_data.ratios.horz,
@@ -2706,9 +2741,151 @@ static void dcn10_set_cursor_sdr_white_level(struct pipe_ctx *pipe_ctx)
 			pipe_ctx->plane_res.dpp, &opt_attr);
 }
 
+/**
+* apply_front_porch_workaround  TODO FPGA still need?
+*
+* This is a workaround for a bug that has existed since R5xx and has not been
+* fixed: keep the front porch at a minimum of 2 for interlaced mode or 1 for
+* progressive.
+*/
+static void apply_front_porch_workaround(
+	struct dc_crtc_timing *timing)
+{
+	if (timing->flags.INTERLACE == 1) {
+		if (timing->v_front_porch < 2)
+			timing->v_front_porch = 2;
+	} else {
+		if (timing->v_front_porch < 1)
+			timing->v_front_porch = 1;
+	}
+}
+
+int get_vupdate_offset_from_vsync(struct pipe_ctx *pipe_ctx)
+{
+	struct timing_generator *optc = pipe_ctx->stream_res.tg;
+	const struct dc_crtc_timing *dc_crtc_timing = &pipe_ctx->stream->timing;
+	struct dc_crtc_timing patched_crtc_timing;
+	int vesa_sync_start;
+	int asic_blank_end;
+	int interlace_factor;
+	int vertical_line_start;
+
+	patched_crtc_timing = *dc_crtc_timing;
+	apply_front_porch_workaround(&patched_crtc_timing);
+
+	interlace_factor = patched_crtc_timing.flags.INTERLACE ? 2 : 1;
+
+	vesa_sync_start = patched_crtc_timing.v_addressable +
+			patched_crtc_timing.v_border_bottom +
+			patched_crtc_timing.v_front_porch;
+
+	asic_blank_end = (patched_crtc_timing.v_total -
+			vesa_sync_start -
+			patched_crtc_timing.v_border_top)
+			* interlace_factor;
+
+	vertical_line_start = asic_blank_end -
+			optc->dlg_otg_param.vstartup_start + 1;
+
+	return vertical_line_start;
+}
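
Plugging hypothetical 1080p-class numbers into get_vupdate_offset_from_vsync() makes the arithmetic concrete. The timing values below are made up for illustration; vstartup_start normally comes from DML.

/* Worked example of the VUPDATE line computed above (assumed numbers). */
#include <stdio.h>

int main(void)
{
	int v_total = 1125, v_addressable = 1080;
	int v_border_bottom = 0, v_border_top = 0, v_front_porch = 4;
	int vstartup_start = 26;   /* hypothetical DML result */
	int interlace_factor = 1;  /* progressive */

	int vesa_sync_start = v_addressable + v_border_bottom
			+ v_front_porch;                       /* 1084 */
	int asic_blank_end = (v_total - vesa_sync_start - v_border_top)
			* interlace_factor;                    /* 41 */
	int vertical_line_start = asic_blank_end - vstartup_start + 1;

	printf("VUPDATE fires at line %d\n", vertical_line_start); /* 16 */
	return 0;
}
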
+
+static void calc_vupdate_position(
+		struct pipe_ctx *pipe_ctx,
+		uint32_t *start_line,
+		uint32_t *end_line)
+{
+	const struct dc_crtc_timing *dc_crtc_timing = &pipe_ctx->stream->timing;
+	int vline_int_offset_from_vupdate =
+			pipe_ctx->stream->periodic_interrupt0.lines_offset;
+	int vupdate_offset_from_vsync = get_vupdate_offset_from_vsync(pipe_ctx);
+	int start_position;
+
+	if (vline_int_offset_from_vupdate > 0)
+		vline_int_offset_from_vupdate--;
+	else if (vline_int_offset_from_vupdate < 0)
+		vline_int_offset_from_vupdate++;
+
+	start_position = vline_int_offset_from_vupdate + vupdate_offset_from_vsync;
+
+	if (start_position >= 0)
+		*start_line = start_position;
+	else
+		*start_line = dc_crtc_timing->v_total + start_position - 1;
+
+	*end_line = *start_line + 2;
+
+	if (*end_line >= dc_crtc_timing->v_total)
+		*end_line = 2;
+}
+
+static void cal_vline_position(
+		struct pipe_ctx *pipe_ctx,
+		enum vline_select vline,
+		uint32_t *start_line,
+		uint32_t *end_line)
+{
+	enum vertical_interrupt_ref_point ref_point = INVALID_POINT;
+
+	if (vline == VLINE0)
+		ref_point = pipe_ctx->stream->periodic_interrupt0.ref_point;
+	else if (vline == VLINE1)
+		ref_point = pipe_ctx->stream->periodic_interrupt1.ref_point;
+
+	switch (ref_point) {
+	case START_V_UPDATE:
+		calc_vupdate_position(
+				pipe_ctx,
+				start_line,
+				end_line);
+		break;
+	case START_V_SYNC:
+		// Supposed to do nothing because vsync is 0;
+		break;
+	default:
+		ASSERT(0);
+		break;
+	}
+}
+
+static void dcn10_setup_periodic_interrupt(
+		struct pipe_ctx *pipe_ctx,
+		enum vline_select vline)
+{
+	struct timing_generator *tg = pipe_ctx->stream_res.tg;
+
+	if (vline == VLINE0) {
+		uint32_t start_line = 0;
+		uint32_t end_line = 0;
+
+		cal_vline_position(pipe_ctx, vline, &start_line, &end_line);
+
+		tg->funcs->setup_vertical_interrupt0(tg, start_line, end_line);
+
+	} else if (vline == VLINE1) {
+		pipe_ctx->stream_res.tg->funcs->setup_vertical_interrupt1(
+				tg,
+				pipe_ctx->stream->periodic_interrupt1.lines_offset);
+	}
+}
+
+static void dcn10_setup_vupdate_interrupt(struct pipe_ctx *pipe_ctx)
+{
+	struct timing_generator *tg = pipe_ctx->stream_res.tg;
+	int start_line = get_vupdate_offset_from_vsync(pipe_ctx);
+
+	if (start_line < 0) {
+		ASSERT(0);
+		start_line = 0;
+	}
+
+	if (tg->funcs->setup_vertical_interrupt2)
+		tg->funcs->setup_vertical_interrupt2(tg, start_line);
+}
+
 static const struct hw_sequencer_funcs dcn10_funcs = {
 	.program_gamut_remap = program_gamut_remap,
 	.init_hw = dcn10_init_hw,
+	.init_pipes = dcn10_init_pipes,
 	.apply_ctx_to_hw = dce110_apply_ctx_to_hw,
 	.apply_ctx_for_surface = dcn10_apply_ctx_for_surface,
 	.update_plane_addr = dcn10_update_plane_addr,
@@ -2752,7 +2929,11 @@ static const struct hw_sequencer_funcs dcn10_funcs = {
 	.edp_wait_for_hpd_ready = hwss_edp_wait_for_hpd_ready,
 	.set_cursor_position = dcn10_set_cursor_position,
 	.set_cursor_attribute = dcn10_set_cursor_attribute,
-	.set_cursor_sdr_white_level = dcn10_set_cursor_sdr_white_level
+	.set_cursor_sdr_white_level = dcn10_set_cursor_sdr_white_level,
+	.disable_stream_gating = NULL,
+	.enable_stream_gating = NULL,
+	.setup_periodic_interrupt = dcn10_setup_periodic_interrupt,
+	.setup_vupdate_interrupt = dcn10_setup_vupdate_interrupt
 };
 
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.h b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.h
index f8eea10e4c64..6d66084df55f 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.h
@@ -81,4 +81,6 @@ struct pipe_ctx *find_top_pipe_for_stream(
 		struct dc_state *context,
 		const struct dc_stream_state *stream);
 
+int get_vupdate_offset_from_vsync(struct pipe_ctx *pipe_ctx);
+
 #endif /* __DC_HWSS_DCN10_H__ */
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer_debug.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer_debug.c
index cd469014baa3..98f41d250978 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer_debug.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer_debug.c
@@ -40,7 +40,6 @@
 #include "ipp.h"
 #include "mpc.h"
 #include "reg_helper.h"
-#include "custom_float.h"
 #include "dcn10_hubp.h"
 #include "dcn10_hubbub.h"
 #include "dcn10_cm_common.h"
@@ -72,7 +71,7 @@ static unsigned int snprintf_count(char *pBuf, unsigned int bufSize, char *fmt,
 static unsigned int dcn10_get_hubbub_state(struct dc *dc, char *pBuf, unsigned int bufSize)
 {
 	struct dc_context *dc_ctx = dc->ctx;
-	struct dcn_hubbub_wm wm = {0};
+	struct dcn_hubbub_wm wm;
 	int i;
 
 	unsigned int chars_printed = 0;
@@ -81,7 +80,8 @@ static unsigned int dcn10_get_hubbub_state(struct dc *dc, char *pBuf, unsigned i
 	const uint32_t ref_clk_mhz = dc_ctx->dc->res_pool->ref_clock_inKhz / 1000;
 	static const unsigned int frac = 1000;
 
-	hubbub1_wm_read_state(dc->res_pool->hubbub, &wm);
+	memset(&wm, 0, sizeof(struct dcn_hubbub_wm));
+	dc->res_pool->hubbub->funcs->wm_read_state(dc->res_pool->hubbub, &wm);
 
 	chars_printed = snprintf_count(pBuf, remaining_buffer, "wm_set_index,data_urgent,pte_meta_urgent,sr_enter,sr_exit,dram_clk_chanage\n");
 	remaining_buffer -= chars_printed;
@@ -419,20 +419,22 @@ static unsigned int dcn10_get_otg_states(struct dc *dc, char *pBuf, unsigned int
 	unsigned int remaining_buffer = bufSize;
 
 	chars_printed = snprintf_count(pBuf, remaining_buffer, "instance,v_bs,v_be,v_ss,v_se,vpol,vmax,vmin,vmax_sel,vmin_sel,"
-			"h_bs,h_be,h_ss,h_se,hpol,htot,vtot,underflow\n");
+			"h_bs,h_be,h_ss,h_se,hpol,htot,vtot,underflow,pixelclk[khz]\n");
 	remaining_buffer -= chars_printed;
 	pBuf += chars_printed;
 
 	for (i = 0; i < pool->timing_generator_count; i++) {
 		struct timing_generator *tg = pool->timing_generators[i];
 		struct dcn_otg_state s = {0};
+		int pix_clk = 0;
 
 		optc1_read_otg_state(DCN10TG_FROM_TG(tg), &s);
+		pix_clk = dc->current_state->res_ctx.pipe_ctx[i].stream_res.pix_clk_params.requested_pix_clk_100hz / 10;
 
 		//only print if OTG master is enabled
 		if (s.otg_enabled & 1) {
 			chars_printed = snprintf_count(pBuf, remaining_buffer, "%x,%d,%d,%d,%d,%d,%d,%d,%d,%d,"
-				"%d,%d,%d,%d,%d,%d,%d,%d"
+				"%d,%d,%d,%d,%d,%d,%d,%d,%d"
 				"\n",
 				tg->inst,
 				s.v_blank_start,
@@ -451,7 +453,8 @@ static unsigned int dcn10_get_otg_states(struct dc *dc, char *pBuf, unsigned int
 				s.h_sync_a_pol,
 				s.h_total,
 				s.v_total,
-				s.underflow_occurred_status);
+				s.underflow_occurred_status,
+				pix_clk);
 
 			remaining_buffer -= chars_printed;
 			pBuf += chars_printed;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_link_encoder.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_link_encoder.c
index 477ab9222216..a9db372688ff 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_link_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_link_encoder.c
@@ -85,6 +85,7 @@ static const struct link_encoder_funcs dcn10_lnk_enc_funcs = {
 	.enable_hpd = dcn10_link_encoder_enable_hpd,
 	.disable_hpd = dcn10_link_encoder_disable_hpd,
 	.is_dig_enabled = dcn10_is_dig_enabled,
+	.get_dig_frontend = dcn10_get_dig_frontend,
 	.destroy = dcn10_link_encoder_destroy
 };
 
@@ -440,7 +441,7 @@ static uint8_t get_frontend_source(
 	}
 }
 
-void configure_encoder(
+void enc1_configure_encoder(
 	struct dcn10_link_encoder *enc10,
 	const struct dc_link_settings *link_settings)
 {
@@ -495,6 +496,15 @@ bool dcn10_is_dig_enabled(struct link_encoder *enc)
 	return value;
 }
 
+unsigned int dcn10_get_dig_frontend(struct link_encoder *enc)
+{
+	struct dcn10_link_encoder *enc10 = TO_DCN10_LINK_ENC(enc);
+	uint32_t value;
+
+	REG_GET(DIG_BE_CNTL, DIG_FE_SOURCE_SELECT, &value);
+	return value;
+}
+
 static void link_encoder_disable(struct dcn10_link_encoder *enc10)
 {
 	/* reset training pattern */
@@ -543,12 +553,12 @@ bool dcn10_link_encoder_validate_dvi_output(
 	if ((connector_signal == SIGNAL_TYPE_DVI_SINGLE_LINK ||
 		connector_signal == SIGNAL_TYPE_HDMI_TYPE_A) &&
 		signal != SIGNAL_TYPE_HDMI_TYPE_A &&
-		crtc_timing->pix_clk_khz > TMDS_MAX_PIXEL_CLOCK)
+		crtc_timing->pix_clk_100hz > (TMDS_MAX_PIXEL_CLOCK * 10))
 		return false;
-	if (crtc_timing->pix_clk_khz < TMDS_MIN_PIXEL_CLOCK)
+	if (crtc_timing->pix_clk_100hz < (TMDS_MIN_PIXEL_CLOCK * 10))
 		return false;
 
-	if (crtc_timing->pix_clk_khz > max_pixel_clock)
+	if (crtc_timing->pix_clk_100hz > (max_pixel_clock * 10))
 		return false;
 
 	/* DVI supports 6/8bpp single-link and 10/16bpp dual-link */
@@ -571,7 +581,7 @@ bool dcn10_link_encoder_validate_dvi_output(
 static bool dcn10_link_encoder_validate_hdmi_output(
 	const struct dcn10_link_encoder *enc10,
 	const struct dc_crtc_timing *crtc_timing,
-	int adjusted_pix_clk_khz)
+	int adjusted_pix_clk_100hz)
 {
 	enum dc_color_depth max_deep_color =
 			enc10->base.features.max_hdmi_deep_color;
@@ -581,11 +591,11 @@ static bool dcn10_link_encoder_validate_hdmi_output(
 
 	if (crtc_timing->display_color_depth < COLOR_DEPTH_888)
 		return false;
-	if (adjusted_pix_clk_khz < TMDS_MIN_PIXEL_CLOCK)
+	if (adjusted_pix_clk_100hz < (TMDS_MIN_PIXEL_CLOCK * 10))
 		return false;
 
-	if ((adjusted_pix_clk_khz == 0) ||
-		(adjusted_pix_clk_khz > enc10->base.features.max_hdmi_pixel_clock))
+	if ((adjusted_pix_clk_100hz == 0) ||
+		(adjusted_pix_clk_100hz > (enc10->base.features.max_hdmi_pixel_clock * 10)))
 		return false;
 
 	/* DCE11 HW does not support 420 */
@@ -594,7 +604,7 @@ static bool dcn10_link_encoder_validate_hdmi_output(
 		return false;
 
 	if (!enc10->base.features.flags.bits.HDMI_6GB_EN &&
-		adjusted_pix_clk_khz >= 300000)
+		adjusted_pix_clk_100hz >= 3000000)
 		return false;
 	if (enc10->base.ctx->dc->debug.hdmi20_disable &&
 		crtc_timing->pixel_encoding == PIXEL_ENCODING_YCBCR420)
@@ -738,7 +748,7 @@ bool dcn10_link_encoder_validate_output_with_stream(
 	case SIGNAL_TYPE_DVI_DUAL_LINK:
 		is_valid = dcn10_link_encoder_validate_dvi_output(
 			enc10,
-			stream->sink->link->connector_signal,
+			stream->link->connector_signal,
 			stream->signal,
 			&stream->timing);
 	break;
@@ -746,7 +756,7 @@ bool dcn10_link_encoder_validate_output_with_stream(
 		is_valid = dcn10_link_encoder_validate_hdmi_output(
 				enc10,
 				&stream->timing,
-				stream->phy_pix_clk);
+				stream->phy_pix_clk * 10);
 	break;
 	case SIGNAL_TYPE_DISPLAY_PORT:
 	case SIGNAL_TYPE_DISPLAY_PORT_MST:
@@ -910,7 +920,7 @@ void dcn10_link_encoder_enable_dp_output(
 	 * but it's not passed to asic_control.
 	 * We need to set number of lanes manually.
 	 */
-	configure_encoder(enc10, link_settings);
+	enc1_configure_encoder(enc10, link_settings);
 
 	cntl.action = TRANSMITTER_CONTROL_ENABLE;
 	cntl.engine_id = enc->preferred_engine;
@@ -949,7 +959,7 @@ void dcn10_link_encoder_enable_dp_mst_output(
 	 * but it's not passed to asic_control.
 	 * We need to set number of lanes manually.
 	 */
-	configure_encoder(enc10, link_settings);
+	enc1_configure_encoder(enc10, link_settings);
 
 	cntl.action = TRANSMITTER_CONTROL_ENABLE;
 	cntl.engine_id = ENGINE_ID_UNKNOWN;
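
The pix_clk_khz to pix_clk_100hz renames in this file are part of a series-wide unit change: pixel clocks are now stored in units of 100 Hz, so fractional-kHz rates such as 25.175 MHz stay exact, and every old kHz constant is scaled by 10 (TMDS_MAX_PIXEL_CLOCK * 10; the 3000000 threshold is the former 300000 kHz). Illustrative conversion helpers, not driver API:

/* Sketch of the kHz <-> 100 Hz unit conversions implied by this series. */
#include <stdio.h>

static unsigned int khz_to_100hz(unsigned int khz)   { return khz * 10; }
static unsigned int hz100_to_khz(unsigned int hz100) { return hz100 / 10; }

int main(void)
{
	unsigned int pix_clk_100hz = 251750;   /* 25.175 MHz, exact in 100 Hz units */

	printf("%u kHz\n", hz100_to_khz(pix_clk_100hz));   /* 25175 */
	printf("min ok: %d\n",
	       pix_clk_100hz >= khz_to_100hz(25000 /* hypothetical TMDS min, kHz */));
	return 0;
}
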
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_link_encoder.h b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_link_encoder.h
index 49ead12b2532..b74b80a247ec 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_link_encoder.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_link_encoder.h
@@ -271,7 +271,7 @@ void dcn10_link_encoder_setup(
 	struct link_encoder *enc,
 	enum signal_type signal);
 
-void configure_encoder(
+void enc1_configure_encoder(
 	struct dcn10_link_encoder *enc10,
 	const struct dc_link_settings *link_settings);
 
@@ -336,6 +336,8 @@ void dcn10_psr_program_secondary_packet(struct link_encoder *enc,
 
 bool dcn10_is_dig_enabled(struct link_encoder *enc);
 
+unsigned int dcn10_get_dig_frontend(struct link_encoder *enc);
+
 void dcn10_aux_initialize(struct dcn10_link_encoder *enc10);
 
 #endif /* __DC_LINK_ENCODER__DCN10_H__ */
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c
index 7c138615f17d..0345d51e9d6f 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c
@@ -92,75 +92,36 @@ static void optc1_disable_stereo(struct timing_generator *optc)
 		OTG_3D_STRUCTURE_STEREO_SEL_OVR, 0);
 }
 
-static uint32_t get_start_vline(struct timing_generator *optc, const struct dc_crtc_timing *dc_crtc_timing)
+void optc1_setup_vertical_interrupt0(
+		struct timing_generator *optc,
+		uint32_t start_line,
+		uint32_t end_line)
 {
-	struct dc_crtc_timing patched_crtc_timing;
-	int vesa_sync_start;
-	int asic_blank_end;
-	int vertical_line_start;
-
-	patched_crtc_timing = *dc_crtc_timing;
-	optc1_apply_front_porch_workaround(optc, &patched_crtc_timing);
-
-	vesa_sync_start = patched_crtc_timing.h_addressable +
-			patched_crtc_timing.h_border_right +
-			patched_crtc_timing.h_front_porch;
-
-	asic_blank_end = patched_crtc_timing.h_total -
-			vesa_sync_start -
-			patched_crtc_timing.h_border_left;
-
-	vesa_sync_start = patched_crtc_timing.v_addressable +
-			patched_crtc_timing.v_border_bottom +
-			patched_crtc_timing.v_front_porch;
-
-	asic_blank_end = (patched_crtc_timing.v_total -
-			vesa_sync_start -
-			patched_crtc_timing.v_border_top);
-
-	vertical_line_start = asic_blank_end - optc->dlg_otg_param.vstartup_start + 1;
-	if (vertical_line_start < 0) {
-		ASSERT(0);
-		vertical_line_start = 0;
-	}
+	struct optc *optc1 = DCN10TG_FROM_TG(optc);
 
-	return vertical_line_start;
+	REG_SET_2(OTG_VERTICAL_INTERRUPT0_POSITION, 0,
+			OTG_VERTICAL_INTERRUPT0_LINE_START, start_line,
+			OTG_VERTICAL_INTERRUPT0_LINE_END, end_line);
 }
 
-void optc1_program_vline_interrupt(
+void optc1_setup_vertical_interrupt1(
 		struct timing_generator *optc,
-		const struct dc_crtc_timing *dc_crtc_timing,
-		unsigned long long vsync_delta)
+		uint32_t start_line)
 {
-
 	struct optc *optc1 = DCN10TG_FROM_TG(optc);
 
-	unsigned long long req_delta_tens_of_usec = div64_u64((vsync_delta + 9999), 10000);
-	unsigned long long pix_clk_hundreds_khz = div64_u64((dc_crtc_timing->pix_clk_khz + 99), 100);
-	uint32_t req_delta_lines = (uint32_t) div64_u64(
-			(req_delta_tens_of_usec * pix_clk_hundreds_khz + dc_crtc_timing->h_total - 1),
-								dc_crtc_timing->h_total);
-
-	uint32_t vsync_line = get_start_vline(optc, dc_crtc_timing);
-	uint32_t start_line = 0;
-	uint32_t endLine = 0;
-
-	if (req_delta_lines != 0)
-		req_delta_lines--;
-
-	if (req_delta_lines > vsync_line)
-		start_line = dc_crtc_timing->v_total - (req_delta_lines - vsync_line) + 2;
-	else
-		start_line = vsync_line - req_delta_lines;
-
-	endLine = start_line + 2;
+	REG_SET(OTG_VERTICAL_INTERRUPT1_POSITION, 0,
+				OTG_VERTICAL_INTERRUPT1_LINE_START, start_line);
+}
 
-	if (endLine >= dc_crtc_timing->v_total)
-		endLine = 2;
+void optc1_setup_vertical_interrupt2(
+		struct timing_generator *optc,
+		uint32_t start_line)
+{
+	struct optc *optc1 = DCN10TG_FROM_TG(optc);
 
-	REG_SET_2(OTG_VERTICAL_INTERRUPT0_POSITION, 0,
-			OTG_VERTICAL_INTERRUPT0_LINE_START, start_line,
-			OTG_VERTICAL_INTERRUPT0_LINE_END, endLine);
+	REG_SET(OTG_VERTICAL_INTERRUPT2_POSITION, 0,
+			OTG_VERTICAL_INTERRUPT2_LINE_START, start_line);
 }
 
 /**
@@ -265,22 +226,14 @@ void optc1_program_timing(
 			patched_crtc_timing.v_addressable +
 			patched_crtc_timing.v_border_bottom);
 
-	REG_UPDATE_2(OTG_V_BLANK_START_END,
-			OTG_V_BLANK_START, asic_blank_start,
-			OTG_V_BLANK_END, asic_blank_end);
-
-	/* Use OTG_VERTICAL_INTERRUPT2 replace VUPDATE interrupt,
-	 * program the reg for interrupt postition.
-	 */
 	vertical_line_start = asic_blank_end - optc->dlg_otg_param.vstartup_start + 1;
 	v_fp2 = 0;
 	if (vertical_line_start < 0)
 		v_fp2 = -vertical_line_start;
-	if (vertical_line_start < 0)
-		vertical_line_start = 0;
 
-	REG_SET(OTG_VERTICAL_INTERRUPT2_POSITION, 0,
-			OTG_VERTICAL_INTERRUPT2_LINE_START, vertical_line_start);
+	REG_UPDATE_2(OTG_V_BLANK_START_END,
+			OTG_V_BLANK_START, asic_blank_start,
+			OTG_V_BLANK_END, asic_blank_end);
 
 	/* v_sync polarity */
 	v_sync_polarity = patched_crtc_timing.flags.VSYNC_POSITIVE_POLARITY ?
@@ -299,16 +252,17 @@ void optc1_program_timing(
 	}
 
 	/* Interlace */
-	if (patched_crtc_timing.flags.INTERLACE == 1) {
-		REG_UPDATE(OTG_INTERLACE_CONTROL,
-				OTG_INTERLACE_ENABLE, 1);
-		v_init = v_init / 2;
-		if ((optc->dlg_otg_param.vstartup_start/2)*2 > asic_blank_end)
-			v_fp2 = v_fp2 / 2;
-	} else
-		REG_UPDATE(OTG_INTERLACE_CONTROL,
-				OTG_INTERLACE_ENABLE, 0);
-
+	if (REG(OTG_INTERLACE_CONTROL)) {
+		if (patched_crtc_timing.flags.INTERLACE == 1) {
+			REG_UPDATE(OTG_INTERLACE_CONTROL,
+					OTG_INTERLACE_ENABLE, 1);
+			v_init = v_init / 2;
+			if ((optc->dlg_otg_param.vstartup_start/2)*2 > asic_blank_end)
+				v_fp2 = v_fp2 / 2;
+		} else
+			REG_UPDATE(OTG_INTERLACE_CONTROL,
+					OTG_INTERLACE_ENABLE, 0);
+	}
 
 	/* VTG enable set to 0 first VInit */
 	REG_UPDATE(CONTROL,
@@ -338,7 +292,7 @@ void optc1_program_timing(
 
 	h_div_2 = optc1_is_two_pixels_per_containter(&patched_crtc_timing);
 	REG_UPDATE(OTG_H_TIMING_CNTL,
-			OTG_H_TIMING_DIV_BY2, h_div_2);
+			OTG_H_TIMING_DIV_BY2, h_div_2 || optc1->comb_opp_id != 0xf);
 
 }
 
@@ -1184,6 +1138,64 @@ bool optc1_is_stereo_left_eye(struct timing_generator *optc)
 	return ret;
 }
 
+bool optc1_is_matching_timing(struct timing_generator *tg,
+		const struct dc_crtc_timing *otg_timing)
+{
+	struct dc_crtc_timing hw_crtc_timing = {0};
+	struct dcn_otg_state s = {0};
+
+	if (tg == NULL || otg_timing == NULL)
+		return false;
+
+	optc1_read_otg_state(DCN10TG_FROM_TG(tg), &s);
+
+	hw_crtc_timing.h_total = s.h_total + 1;
+	hw_crtc_timing.h_addressable = s.h_total - ((s.h_total - s.h_blank_start) + s.h_blank_end);
+	hw_crtc_timing.h_front_porch = s.h_total + 1 - s.h_blank_start;
+	hw_crtc_timing.h_sync_width = s.h_sync_a_end - s.h_sync_a_start;
+
+	hw_crtc_timing.v_total = s.v_total + 1;
+	hw_crtc_timing.v_addressable = s.v_total - ((s.v_total - s.v_blank_start) + s.v_blank_end);
+	hw_crtc_timing.v_front_porch = s.v_total + 1 - s.v_blank_start;
+	hw_crtc_timing.v_sync_width = s.v_sync_a_end - s.v_sync_a_start;
+
+	if (otg_timing->h_total != hw_crtc_timing.h_total)
+		return false;
+
+	if (otg_timing->h_border_left != hw_crtc_timing.h_border_left)
+		return false;
+
+	if (otg_timing->h_addressable != hw_crtc_timing.h_addressable)
+		return false;
+
+	if (otg_timing->h_border_right != hw_crtc_timing.h_border_right)
+		return false;
+
+	if (otg_timing->h_front_porch != hw_crtc_timing.h_front_porch)
+		return false;
+
+	if (otg_timing->h_sync_width != hw_crtc_timing.h_sync_width)
+		return false;
+
+	if (otg_timing->v_total != hw_crtc_timing.v_total)
+		return false;
+
+	if (otg_timing->v_border_top != hw_crtc_timing.v_border_top)
+		return false;
+
+	if (otg_timing->v_addressable != hw_crtc_timing.v_addressable)
+		return false;
+
+	if (otg_timing->v_border_bottom != hw_crtc_timing.v_border_bottom)
+		return false;
+
+	if (otg_timing->v_sync_width != hw_crtc_timing.v_sync_width)
+		return false;
+
+	return true;
+}
+
+
 void optc1_read_otg_state(struct optc *optc1,
 		struct dcn_otg_state *s)
 {
@@ -1370,7 +1382,9 @@ bool optc1_get_crc(struct timing_generator *optc,
 static const struct timing_generator_funcs dcn10_tg_funcs = {
 		.validate_timing = optc1_validate_timing,
 		.program_timing = optc1_program_timing,
-		.program_vline_interrupt = optc1_program_vline_interrupt,
+		.setup_vertical_interrupt0 = optc1_setup_vertical_interrupt0,
+		.setup_vertical_interrupt1 = optc1_setup_vertical_interrupt1,
+		.setup_vertical_interrupt2 = optc1_setup_vertical_interrupt2,
 		.program_global_sync = optc1_program_global_sync,
 		.enable_crtc = optc1_enable_crtc,
 		.disable_crtc = optc1_disable_crtc,
@@ -1380,6 +1394,7 @@ static const struct timing_generator_funcs dcn10_tg_funcs = {
 		.get_frame_count = optc1_get_vblank_counter,
 		.get_scanoutpos = optc1_get_crtc_scanoutpos,
 		.get_otg_active_size = optc1_get_otg_active_size,
+		.is_matching_timing = optc1_is_matching_timing,
 		.set_early_control = optc1_set_early_control,
 		/* used by enable_timing_synchronization. Not need for FPGA */
 		.wait_for_state = optc1_wait_for_state,
@@ -1419,10 +1434,13 @@ void dcn10_timing_generator_init(struct optc *optc1)
 	optc1->min_v_blank_interlace = 5;
 	optc1->min_h_sync_width = 8;
 	optc1->min_v_sync_width = 1;
+	optc1->comb_opp_id = 0xf;
 }
 
 bool optc1_is_two_pixels_per_containter(const struct dc_crtc_timing *timing)
 {
-	return timing->pixel_encoding == PIXEL_ENCODING_YCBCR420;
+	bool two_pix = timing->pixel_encoding == PIXEL_ENCODING_YCBCR420;
+
+	return two_pix;
 }
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h
index 8bacf0b6e27e..4eb9a898c237 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h
@@ -67,6 +67,8 @@
 	SRI(OTG_CLOCK_CONTROL, OTG, inst),\
 	SRI(OTG_VERTICAL_INTERRUPT0_CONTROL, OTG, inst),\
 	SRI(OTG_VERTICAL_INTERRUPT0_POSITION, OTG, inst),\
+	SRI(OTG_VERTICAL_INTERRUPT1_CONTROL, OTG, inst),\
+	SRI(OTG_VERTICAL_INTERRUPT1_POSITION, OTG, inst),\
 	SRI(OTG_VERTICAL_INTERRUPT2_CONTROL, OTG, inst),\
 	SRI(OTG_VERTICAL_INTERRUPT2_POSITION, OTG, inst),\
 	SRI(OPTC_INPUT_CLOCK_CONTROL, ODM, inst),\
@@ -135,6 +137,8 @@ struct dcn_optc_registers {
 	uint32_t OTG_CLOCK_CONTROL;
 	uint32_t OTG_VERTICAL_INTERRUPT0_CONTROL;
 	uint32_t OTG_VERTICAL_INTERRUPT0_POSITION;
+	uint32_t OTG_VERTICAL_INTERRUPT1_CONTROL;
+	uint32_t OTG_VERTICAL_INTERRUPT1_POSITION;
 	uint32_t OTG_VERTICAL_INTERRUPT2_CONTROL;
 	uint32_t OTG_VERTICAL_INTERRUPT2_POSITION;
 	uint32_t OPTC_INPUT_CLOCK_CONTROL;
@@ -227,6 +231,8 @@ struct dcn_optc_registers {
 	SF(OTG0_OTG_VERTICAL_INTERRUPT0_CONTROL, OTG_VERTICAL_INTERRUPT0_INT_ENABLE, mask_sh),\
 	SF(OTG0_OTG_VERTICAL_INTERRUPT0_POSITION, OTG_VERTICAL_INTERRUPT0_LINE_START, mask_sh),\
 	SF(OTG0_OTG_VERTICAL_INTERRUPT0_POSITION, OTG_VERTICAL_INTERRUPT0_LINE_END, mask_sh),\
+	SF(OTG0_OTG_VERTICAL_INTERRUPT1_CONTROL, OTG_VERTICAL_INTERRUPT1_INT_ENABLE, mask_sh),\
+	SF(OTG0_OTG_VERTICAL_INTERRUPT1_POSITION, OTG_VERTICAL_INTERRUPT1_LINE_START, mask_sh),\
 	SF(OTG0_OTG_VERTICAL_INTERRUPT2_CONTROL, OTG_VERTICAL_INTERRUPT2_INT_ENABLE, mask_sh),\
 	SF(OTG0_OTG_VERTICAL_INTERRUPT2_POSITION, OTG_VERTICAL_INTERRUPT2_LINE_START, mask_sh),\
 	SF(ODM0_OPTC_INPUT_CLOCK_CONTROL, OPTC_INPUT_CLK_EN, mask_sh),\
@@ -361,6 +367,8 @@ struct dcn_optc_registers {
 	type OTG_VERTICAL_INTERRUPT0_INT_ENABLE;\
 	type OTG_VERTICAL_INTERRUPT0_LINE_START;\
 	type OTG_VERTICAL_INTERRUPT0_LINE_END;\
+	type OTG_VERTICAL_INTERRUPT1_INT_ENABLE;\
+	type OTG_VERTICAL_INTERRUPT1_LINE_START;\
 	type OTG_VERTICAL_INTERRUPT2_INT_ENABLE;\
 	type OTG_VERTICAL_INTERRUPT2_LINE_START;\
 	type OPTC_INPUT_CLK_EN;\
@@ -427,7 +435,7 @@ struct optc {
 	const struct dcn_optc_shift *tg_shift;
 	const struct dcn_optc_mask *tg_mask;
 
-	enum controller_id controller_id;
+	int comb_opp_id;
 
 	uint32_t max_h_total;
 	uint32_t max_v_total;
@@ -475,9 +483,16 @@ void optc1_program_timing(
 	const struct dc_crtc_timing *dc_crtc_timing,
 	bool use_vbios);
 
-void optc1_program_vline_interrupt(struct timing_generator *optc,
-		const struct dc_crtc_timing *dc_crtc_timing,
-		unsigned long long vsync_delta);
+void optc1_setup_vertical_interrupt0(
+		struct timing_generator *optc,
+		uint32_t start_line,
+		uint32_t end_line);
+void optc1_setup_vertical_interrupt1(
+		struct timing_generator *optc,
+		uint32_t start_line);
+void optc1_setup_vertical_interrupt2(
+		struct timing_generator *optc,
+		uint32_t start_line);
 
 void optc1_program_global_sync(
 		struct timing_generator *optc);
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
index 5d4772dec0ba..09d74070a49b 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
@@ -70,7 +70,7 @@
 const struct _vcs_dpi_ip_params_st dcn1_0_ip = {
 	.rob_buffer_size_kbytes = 64,
 	.det_buffer_size_kbytes = 164,
-	.dpte_buffer_size_in_pte_reqs = 42,
+	.dpte_buffer_size_in_pte_reqs_luma = 42,
 	.dpp_output_buffer_pixels = 2560,
 	.opp_output_buffer_lines = 1,
 	.pixel_chunk_size_kbytes = 8,
@@ -436,7 +436,6 @@ static const struct dcn_optc_mask tg_mask = {
 };
 
 static const struct bios_registers bios_regs = {
-		NBIO_SR(BIOS_SCRATCH_0),
 		NBIO_SR(BIOS_SCRATCH_3),
 		NBIO_SR(BIOS_SCRATCH_6)
 };
@@ -609,7 +608,7 @@ static struct output_pixel_processor *dcn10_opp_create(
 	return &opp->base;
 }
 
-struct aux_engine *dcn10_aux_engine_create(
+struct dce_aux *dcn10_aux_engine_create(
 	struct dc_context *ctx,
 	uint32_t inst)
 {
@@ -678,18 +677,18 @@ static struct mpc *dcn10_mpc_create(struct dc_context *ctx)
 
 static struct hubbub *dcn10_hubbub_create(struct dc_context *ctx)
 {
-	struct hubbub *hubbub = kzalloc(sizeof(struct hubbub),
+	struct dcn10_hubbub *dcn10_hubbub = kzalloc(sizeof(struct dcn10_hubbub),
 					  GFP_KERNEL);
 
-	if (!hubbub)
+	if (!dcn10_hubbub)
 		return NULL;
 
-	hubbub1_construct(hubbub, ctx,
+	hubbub1_construct(&dcn10_hubbub->base, ctx,
 			&hubbub_reg,
 			&hubbub_shift,
 			&hubbub_mask);
 
-	return hubbub;
+	return &dcn10_hubbub->base;
 }
 
 static struct timing_generator *dcn10_timing_generator_create(
@@ -911,7 +910,7 @@ static void destruct(struct dcn10_resource_pool *pool)
 
 	for (i = 0; i < pool->base.res_cap->num_ddc; i++) {
 		if (pool->base.engines[i] != NULL)
-			pool->base.engines[i]->funcs->destroy_engine(&pool->base.engines[i]);
+			dce110_engine_destroy(&pool->base.engines[i]);
 		if (pool->base.hw_i2cs[i] != NULL) {
 			kfree(pool->base.hw_i2cs[i]);
 			pool->base.hw_i2cs[i] = NULL;
@@ -974,8 +973,8 @@ static void get_pixel_clock_parameters(
 	struct pixel_clk_params *pixel_clk_params)
 {
 	const struct dc_stream_state *stream = pipe_ctx->stream;
-	pixel_clk_params->requested_pix_clk = stream->timing.pix_clk_khz;
-	pixel_clk_params->encoder_object_id = stream->sink->link->link_enc->id;
+	pixel_clk_params->requested_pix_clk_100hz = stream->timing.pix_clk_100hz;
+	pixel_clk_params->encoder_object_id = stream->link->link_enc->id;
 	pixel_clk_params->signal_type = pipe_ctx->stream->signal;
 	pixel_clk_params->controller_id = pipe_ctx->stream_res.tg->inst + 1;
 	/* TODO: un-hardcode*/
@@ -991,9 +990,9 @@ static void get_pixel_clock_parameters(
 		pixel_clk_params->color_depth = COLOR_DEPTH_888;
 
 	if (stream->timing.pixel_encoding == PIXEL_ENCODING_YCBCR420)
-		pixel_clk_params->requested_pix_clk  /= 2;
+		pixel_clk_params->requested_pix_clk_100hz  /= 2;
 	if (stream->timing.timing_3d_format == TIMING_3D_FORMAT_HW_FRAME_PACKING)
-		pixel_clk_params->requested_pix_clk *= 2;
+		pixel_clk_params->requested_pix_clk_100hz *= 2;
 
 }
 
@@ -1131,6 +1130,56 @@ static enum dc_status dcn10_validate_plane(const struct dc_plane_state *plane_st
 	return DC_OK;
 }
 
+static enum dc_status dcn10_validate_global(struct dc *dc, struct dc_state *context)
+{
+	int i, j;
+	bool video_down_scaled = false;
+	bool video_large = false;
+	bool desktop_large = false;
+	bool dcc_disabled = false;
+
+	for (i = 0; i < context->stream_count; i++) {
+		if (context->stream_status[i].plane_count == 0)
+			continue;
+
+		if (context->stream_status[i].plane_count > 2)
+			return DC_FAIL_SURFACE_VALIDATE;
+
+		for (j = 0; j < context->stream_status[i].plane_count; j++) {
+			struct dc_plane_state *plane =
+				context->stream_status[i].plane_states[j];
+
+
+			if (plane->format >= SURFACE_PIXEL_FORMAT_VIDEO_BEGIN) {
+
+				if (plane->src_rect.width > plane->dst_rect.width ||
+						plane->src_rect.height > plane->dst_rect.height)
+					video_down_scaled = true;
+
+				if (plane->src_rect.width >= 3840)
+					video_large = true;
+
+			} else {
+				if (plane->src_rect.width >= 3840)
+					desktop_large = true;
+				if (!plane->dcc.enable)
+					dcc_disabled = true;
+			}
+		}
+	}
+
+	/*
+	 * Workaround: On DCN10 there is a UMC issue that causes underflow when
+	 * playing 4k video on a 4k desktop with the video downscaled and
+	 * single-channel memory.
+	 */
+	if (video_large && desktop_large && video_down_scaled && dcc_disabled &&
+			dc->dcn_soc->number_of_channels == 1)
+		return DC_FAIL_SURFACE_VALIDATE;
+
+	return DC_OK;
+}
+
 static enum dc_status dcn10_get_default_swizzle_mode(struct dc_plane_state *plane_state)
 {
 	enum dc_status result = DC_OK;
@@ -1159,6 +1208,7 @@ static const struct resource_funcs dcn10_res_pool_funcs = {
 	.validate_bandwidth = dcn_validate_bandwidth,
 	.acquire_idle_pipe_for_layer = dcn10_acquire_idle_pipe_for_layer,
 	.validate_plane = dcn10_validate_plane,
+	.validate_global = dcn10_validate_global,
 	.add_stream_to_ctx = dcn10_add_stream_to_ctx,
 	.get_default_swizzle_mode = dcn10_get_default_swizzle_mode
 };
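
A note on the hubbub change at the top of this file: DC models each HW block as a base struct embedded in an ASIC-specific wrapper. create() allocates the wrapper, constructs into the embedded base, and returns a pointer to that base; code that needs the wrapper recovers it with container_of(). A minimal sketch of the idiom, using hypothetical foo/foo_dcn10 names in place of the real structs:

	#include <linux/kernel.h>	/* container_of() */
	#include <linux/slab.h>	/* kzalloc() */

	struct foo {
		int inst;
	};

	struct foo_dcn10 {
		struct foo base;		/* embedded base object */
		unsigned int extra_state;	/* DCN1.0-only state */
	};

	static struct foo *foo_dcn10_create(void)
	{
		struct foo_dcn10 *wrapper = kzalloc(sizeof(*wrapper), GFP_KERNEL);

		if (!wrapper)
			return NULL;

		wrapper->base.inst = 0;
		return &wrapper->base;	/* callers only ever see the base */
	}

	static void foo_dcn10_touch(struct foo *base)
	{
		/* recover the wrapper from the base pointer */
		struct foo_dcn10 *wrapper =
			container_of(base, struct foo_dcn10, base);

		wrapper->extra_state++;
	}
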
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.c
index b8b5525a389a..b08254121251 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.c
@@ -261,17 +261,29 @@ void enc1_stream_encoder_dp_set_stream_attribute(
 	uint8_t dp_component_depth = 0;
 
 	struct dcn10_stream_encoder *enc1 = DCN10STRENC_FROM_STRENC(enc);
+	struct dc_crtc_timing hw_crtc_timing = *crtc_timing;
+
+	if (hw_crtc_timing.flags.INTERLACE) {
+		/* the input timing is in VESA spec format with the Interlace flag = 1 */
+		hw_crtc_timing.v_total /= 2;
+		hw_crtc_timing.v_border_top /= 2;
+		hw_crtc_timing.v_addressable /= 2;
+		hw_crtc_timing.v_border_bottom /= 2;
+		hw_crtc_timing.v_front_porch /= 2;
+		hw_crtc_timing.v_sync_width /= 2;
+	}
+
 
 	/* set pixel encoding */
-	switch (crtc_timing->pixel_encoding) {
+	switch (hw_crtc_timing.pixel_encoding) {
 	case PIXEL_ENCODING_YCBCR422:
 		dp_pixel_encoding = DP_PIXEL_ENCODING_TYPE_YCBCR422;
 		break;
 	case PIXEL_ENCODING_YCBCR444:
 		dp_pixel_encoding = DP_PIXEL_ENCODING_TYPE_YCBCR444;
 
-		if (crtc_timing->flags.Y_ONLY)
-			if (crtc_timing->display_color_depth != COLOR_DEPTH_666)
+		if (hw_crtc_timing.flags.Y_ONLY)
+			if (hw_crtc_timing.display_color_depth != COLOR_DEPTH_666)
 				/* HW testing only, no use case yet.
 				 * Color depth of Y-only could be
 				 * 8, 10, 12, 16 bits
@@ -299,7 +311,7 @@ void enc1_stream_encoder_dp_set_stream_attribute(
 	 * Pixel Encoding/Colorimetry Format and that a Sink device shall ignore MISC1, bit 7,
 	 * and MISC0, bits 7:1 (MISC1, bit 7, and MISC0, bits 7:1, become "don't care").
 	 */
-	if ((crtc_timing->pixel_encoding == PIXEL_ENCODING_YCBCR420) ||
+	if ((hw_crtc_timing.pixel_encoding == PIXEL_ENCODING_YCBCR420) ||
 			(output_color_space == COLOR_SPACE_2020_YCBCR) ||
 			(output_color_space == COLOR_SPACE_2020_RGB_FULLRANGE) ||
 			(output_color_space == COLOR_SPACE_2020_RGB_LIMITEDRANGE))
@@ -308,7 +320,7 @@ void enc1_stream_encoder_dp_set_stream_attribute(
 		misc1 = misc1 & ~0x40;
 
 	/* set color depth */
-	switch (crtc_timing->display_color_depth) {
+	switch (hw_crtc_timing.display_color_depth) {
 	case COLOR_DEPTH_666:
 		dp_component_depth = DP_COMPONENT_PIXEL_DEPTH_6BPC;
 		break;
@@ -336,7 +348,7 @@ void enc1_stream_encoder_dp_set_stream_attribute(
 
 	/* set dynamic range and YCbCr range */
 
-	switch (crtc_timing->display_color_depth) {
+	switch (hw_crtc_timing.display_color_depth) {
 	case COLOR_DEPTH_666:
 		colorimetry_bpc = 0;
 		break;
@@ -372,9 +384,9 @@ void enc1_stream_encoder_dp_set_stream_attribute(
 		misc0 = misc0 | 0x8; /* bit3=1, bit4=0 */
 		misc1 = misc1 & ~0x80; /* bit7 = 0*/
 		dynamic_range_ycbcr = 0; /*bt601*/
-		if (crtc_timing->pixel_encoding == PIXEL_ENCODING_YCBCR422)
+		if (hw_crtc_timing.pixel_encoding == PIXEL_ENCODING_YCBCR422)
 			misc0 = misc0 | 0x2; /* bit2=0, bit1=1 */
-		else if (crtc_timing->pixel_encoding == PIXEL_ENCODING_YCBCR444)
+		else if (hw_crtc_timing.pixel_encoding == PIXEL_ENCODING_YCBCR444)
 			misc0 = misc0 | 0x4; /* bit2=1, bit1=0 */
 		break;
 	case COLOR_SPACE_YCBCR709:
@@ -382,9 +394,9 @@ void enc1_stream_encoder_dp_set_stream_attribute(
 		misc0 = misc0 | 0x18; /* bit3=1, bit4=1 */
 		misc1 = misc1 & ~0x80; /* bit7 = 0*/
 		dynamic_range_ycbcr = 1; /*bt709*/
-		if (crtc_timing->pixel_encoding == PIXEL_ENCODING_YCBCR422)
+		if (hw_crtc_timing.pixel_encoding == PIXEL_ENCODING_YCBCR422)
 			misc0 = misc0 | 0x2; /* bit2=0, bit1=1 */
-		else if (crtc_timing->pixel_encoding == PIXEL_ENCODING_YCBCR444)
+		else if (hw_crtc_timing.pixel_encoding == PIXEL_ENCODING_YCBCR444)
 			misc0 = misc0 | 0x4; /* bit2=1, bit1=0 */
 		break;
 	case COLOR_SPACE_2020_RGB_LIMITEDRANGE:
@@ -414,26 +426,26 @@ void enc1_stream_encoder_dp_set_stream_attribute(
 	 * dc_crtc_timing is a VESA DMT struct; the data comes from the EDID
 	 */
 	REG_SET_2(DP_MSA_TIMING_PARAM1, 0,
-			DP_MSA_HTOTAL, crtc_timing->h_total,
-			DP_MSA_VTOTAL, crtc_timing->v_total);
+			DP_MSA_HTOTAL, hw_crtc_timing.h_total,
+			DP_MSA_VTOTAL, hw_crtc_timing.v_total);
 
 	/* calculate from VESA timing parameters;
 	 * h_active_start is relative to the leading edge of sync
 	 */
 
-	h_blank = crtc_timing->h_total - crtc_timing->h_border_left -
-			crtc_timing->h_addressable - crtc_timing->h_border_right;
+	h_blank = hw_crtc_timing.h_total - hw_crtc_timing.h_border_left -
+			hw_crtc_timing.h_addressable - hw_crtc_timing.h_border_right;
 
-	h_back_porch = h_blank - crtc_timing->h_front_porch -
-			crtc_timing->h_sync_width;
+	h_back_porch = h_blank - hw_crtc_timing.h_front_porch -
+			hw_crtc_timing.h_sync_width;
 
 	/* start at beginning of left border */
-	h_active_start = crtc_timing->h_sync_width + h_back_porch;
+	h_active_start = hw_crtc_timing.h_sync_width + h_back_porch;
 
 
-	v_active_start = crtc_timing->v_total - crtc_timing->v_border_top -
-			crtc_timing->v_addressable - crtc_timing->v_border_bottom -
-			crtc_timing->v_front_porch;
+	v_active_start = hw_crtc_timing.v_total - hw_crtc_timing.v_border_top -
+			hw_crtc_timing.v_addressable - hw_crtc_timing.v_border_bottom -
+			hw_crtc_timing.v_front_porch;
 
 
 	/* start at beginning of left border */
@@ -443,20 +455,20 @@ void enc1_stream_encoder_dp_set_stream_attribute(
 
 	REG_SET_4(DP_MSA_TIMING_PARAM3, 0,
 			DP_MSA_HSYNCWIDTH,
-			crtc_timing->h_sync_width,
+			hw_crtc_timing.h_sync_width,
 			DP_MSA_HSYNCPOLARITY,
-			!crtc_timing->flags.HSYNC_POSITIVE_POLARITY,
+			!hw_crtc_timing.flags.HSYNC_POSITIVE_POLARITY,
 			DP_MSA_VSYNCWIDTH,
-			crtc_timing->v_sync_width,
+			hw_crtc_timing.v_sync_width,
 			DP_MSA_VSYNCPOLARITY,
-			!crtc_timing->flags.VSYNC_POSITIVE_POLARITY);
+			!hw_crtc_timing.flags.VSYNC_POSITIVE_POLARITY);
 
 	/* HWIDTH includes border or overscan */
 	REG_SET_2(DP_MSA_TIMING_PARAM4, 0,
-		DP_MSA_HWIDTH, crtc_timing->h_border_left +
-		crtc_timing->h_addressable + crtc_timing->h_border_right,
-		DP_MSA_VHEIGHT, crtc_timing->v_border_top +
-		crtc_timing->v_addressable + crtc_timing->v_border_bottom);
+		DP_MSA_HWIDTH, hw_crtc_timing.h_border_left +
+		hw_crtc_timing.h_addressable + hw_crtc_timing.h_border_right,
+		DP_MSA_VHEIGHT, hw_crtc_timing.v_border_top +
+		hw_crtc_timing.v_addressable + hw_crtc_timing.v_border_bottom);
 }
 
 static void enc1_stream_encoder_set_stream_attribute_helper(
@@ -594,7 +606,7 @@ void enc1_stream_encoder_dvi_set_stream_attribute(
 	cntl.signal = is_dual_link ?
 			SIGNAL_TYPE_DVI_DUAL_LINK : SIGNAL_TYPE_DVI_SINGLE_LINK;
 	cntl.enable_dp_audio = false;
-	cntl.pixel_clock = crtc_timing->pix_clk_khz;
+	cntl.pixel_clock = crtc_timing->pix_clk_100hz / 10;
 	cntl.lanes_number = (is_dual_link) ? LANE_COUNT_EIGHT : LANE_COUNT_FOUR;
 
 	if (enc1->base.bp->funcs->encoder_control(
@@ -1413,6 +1425,14 @@ void enc1_setup_stereo_sync(
 	REG_UPDATE(DIG_FE_CNTL, DIG_STEREOSYNC_GATE_EN, !enable);
 }
 
+void enc1_dig_connect_to_otg(
+	struct stream_encoder *enc,
+	int tg_inst)
+{
+	struct dcn10_stream_encoder *enc1 = DCN10STRENC_FROM_STRENC(enc);
+
+	REG_UPDATE(DIG_FE_CNTL, DIG_SOURCE_SELECT, tg_inst);
+}
 
 static const struct stream_encoder_funcs dcn10_str_enc_funcs = {
 	.dp_set_stream_attribute =
@@ -1445,6 +1465,7 @@ static const struct stream_encoder_funcs dcn10_str_enc_funcs = {
 	.hdmi_audio_disable = enc1_se_hdmi_audio_disable,
 	.setup_stereo_sync  = enc1_setup_stereo_sync,
 	.set_avmute = enc1_stream_encoder_set_avmute,
+	.dig_connect_to_otg  = enc1_dig_connect_to_otg,
 };
 
 void dcn10_stream_encoder_construct(
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.h b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.h
index 67f3e4dd95c1..b7c800e10a32 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_stream_encoder.h
@@ -274,7 +274,8 @@ struct dcn10_stream_enc_registers {
 	SE_SF(DP0_DP_MSA_TIMING_PARAM4, DP_MSA_HWIDTH, mask_sh),\
 	SE_SF(DP0_DP_MSA_TIMING_PARAM4, DP_MSA_VHEIGHT, mask_sh),\
 	SE_SF(DIG0_HDMI_DB_CONTROL, HDMI_DB_DISABLE, mask_sh),\
-	SE_SF(DP0_DP_VID_TIMING, DP_VID_N_MUL, mask_sh)
+	SE_SF(DP0_DP_VID_TIMING, DP_VID_N_MUL, mask_sh),\
+	SE_SF(DIG0_DIG_FE_CNTL, DIG_SOURCE_SELECT, mask_sh)
 
 #define SE_COMMON_MASK_SH_LIST_SOC(mask_sh)\
 	SE_COMMON_MASK_SH_LIST_SOC_BASE(mask_sh)
@@ -426,7 +427,8 @@ struct dcn10_stream_enc_registers {
 	type DP_MSA_VHEIGHT;\
 	type HDMI_DB_DISABLE;\
 	type DP_VID_N_MUL;\
-	type DP_VID_M_DOUBLE_VALUE_EN
+	type DP_VID_M_DOUBLE_VALUE_EN;\
+	type DIG_SOURCE_SELECT
 
 struct dcn10_stream_encoder_shift {
 	SE_REG_FIELD_LIST_DCN1_0(uint8_t);
@@ -523,4 +525,8 @@ void enc1_se_hdmi_audio_setup(
 void enc1_se_hdmi_audio_disable(
 	struct stream_encoder *enc);
 
+void enc1_dig_connect_to_otg(
+	struct stream_encoder *enc,
+	int tg_inst);
+
 #endif /* __DC_STREAM_ENCODER_DCN10_H__ */
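
The pix_clk_khz to pix_clk_100hz conversions in the encoder above belong to a driver-wide switch of the pixel clock to 100 Hz units: one kHz is ten units of 100 Hz, which preserves the sub-kHz precision that fractional rates such as 59.94 Hz modes need and that the old kHz field truncated. A small sketch of the arithmetic, with hypothetical helper names:

	#include <stdint.h>

	/* 1 kHz == 10 * 100 Hz, so this direction is lossless */
	static inline uint32_t khz_to_100hz(uint32_t khz)
	{
		return khz * 10;
	}

	/* the reverse truncates any residual sub-kHz precision */
	static inline uint32_t hz100_to_khz(uint32_t hz100)
	{
		return hz100 / 10;
	}

	/* 148.3516 MHz (1080p at 59.94 Hz) is exactly 1483516 units of
	 * 100 Hz; hz100_to_khz(1483516) == 148351, dropping the fraction
	 * just as the DVI path above does with its /10.
	 */
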
diff --git a/drivers/gpu/drm/amd/display/dc/dm_helpers.h b/drivers/gpu/drm/amd/display/dc/dm_helpers.h
index 5d4527d03045..e81b24374bcb 100644
--- a/drivers/gpu/drm/amd/display/dc/dm_helpers.h
+++ b/drivers/gpu/drm/amd/display/dc/dm_helpers.h
@@ -58,6 +58,13 @@ bool dm_helpers_dp_mst_write_payload_allocation_table(
 		bool enable);
 
 /*
+ * Poll the pending down reply before clearing the payload allocation table.
+ */
+void dm_helpers_dp_mst_poll_pending_down_reply(
+	struct dc_context *ctx,
+	const struct dc_link *link);
+
+/*
  * Clear the payload allocation table before enabling the MST DP link.
  */
 void dm_helpers_dp_mst_clear_payload_allocation_table(
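
The new poll helper lets the DM layer drain any outstanding MST down-replies before the payload allocation table is reset, so ordering at the call site matters. A hedged sketch of the expected pairing; the wrapper function here is illustrative, not part of the interface:

	static void illustrative_mst_teardown(struct dc_context *ctx,
					      const struct dc_link *link)
	{
		/* quiesce sideband traffic first, then reset the payloads */
		dm_helpers_dp_mst_poll_pending_down_reply(ctx, link);
		dm_helpers_dp_mst_clear_payload_allocation_table(ctx, link);
	}
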
diff --git a/drivers/gpu/drm/amd/display/dc/dm_pp_smu.h b/drivers/gpu/drm/amd/display/dc/dm_pp_smu.h
index 0029a39efb1c..14bed5b1fa97 100644
--- a/drivers/gpu/drm/amd/display/dc/dm_pp_smu.h
+++ b/drivers/gpu/drm/amd/display/dc/dm_pp_smu.h
@@ -38,7 +38,8 @@ enum pp_smu_ver {
 	 * of interface sharing between families of ASIcs.
 	 */
 	PP_SMU_UNSUPPORTED,
-	PP_SMU_VER_RV
+	PP_SMU_VER_RV,
+	PP_SMU_VER_MAX
 };
 
 struct pp_smu {
diff --git a/drivers/gpu/drm/amd/display/dc/dm_services_types.h b/drivers/gpu/drm/amd/display/dc/dm_services_types.h
index 1af8c777b3ac..77200711abbe 100644
--- a/drivers/gpu/drm/amd/display/dc/dm_services_types.h
+++ b/drivers/gpu/drm/amd/display/dc/dm_services_types.h
@@ -82,9 +82,17 @@ enum dm_pp_clock_type {
 #define DC_DECODE_PP_CLOCK_TYPE(clk_type) \
 	(clk_type) == DM_PP_CLOCK_TYPE_DISPLAY_CLK ? "Display" : \
 	(clk_type) == DM_PP_CLOCK_TYPE_ENGINE_CLK ? "Engine" : \
-	(clk_type) == DM_PP_CLOCK_TYPE_MEMORY_CLK ? "Memory" : "Invalid"
-
-#define DM_PP_MAX_CLOCK_LEVELS 8
+	(clk_type) == DM_PP_CLOCK_TYPE_MEMORY_CLK ? "Memory" : \
+	(clk_type) == DM_PP_CLOCK_TYPE_DCFCLK ? "DCF" : \
+	(clk_type) == DM_PP_CLOCK_TYPE_DCEFCLK ? "DCEF" : \
+	(clk_type) == DM_PP_CLOCK_TYPE_SOCCLK ? "SoC" : \
+	(clk_type) == DM_PP_CLOCK_TYPE_PIXELCLK ? "Pixel" : \
+	(clk_type) == DM_PP_CLOCK_TYPE_DISPLAYPHYCLK ? "Display PHY" : \
+	(clk_type) == DM_PP_CLOCK_TYPE_DPPCLK ? "DPP" : \
+	(clk_type) == DM_PP_CLOCK_TYPE_FCLK ? "F" : \
+	"Invalid"
+
+#define DM_PP_MAX_CLOCK_LEVELS 16
 
 struct dm_pp_clock_levels {
 	uint32_t num_levels;
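
DC_DECODE_PP_CLOCK_TYPE expands to a single chained conditional expression, so it can sit directly in a printf-style argument list. A trimmed-down, self-contained illustration of the same pattern (simplified enum values, not the real ones):

	#include <stdio.h>

	enum clk { CLK_DISPLAY, CLK_FCLK, CLK_OTHER };

	#define DECODE_CLK(t) \
		((t) == CLK_DISPLAY ? "Display" : \
		 (t) == CLK_FCLK ? "F" : "Invalid")

	int main(void)
	{
		printf("clock: %s\n", DECODE_CLK(CLK_FCLK)); /* prints "clock: F" */
		return 0;
	}
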
diff --git a/drivers/gpu/drm/amd/display/dc/dml/display_mode_enums.h b/drivers/gpu/drm/amd/display/dc/dml/display_mode_enums.h
index bea4e61b94c7..c59e582c1f40 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/display_mode_enums.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/display_mode_enums.h
@@ -121,4 +121,30 @@ enum self_refresh_affinity {
 	dm_neither_self_refresh_nor_mclk_switch
 };
 
+enum dm_validation_status {
+	DML_VALIDATION_OK,
+	DML_FAIL_SCALE_RATIO_TAP,
+	DML_FAIL_SOURCE_PIXEL_FORMAT,
+	DML_FAIL_VIEWPORT_SIZE,
+	DML_FAIL_TOTAL_V_ACTIVE_BW,
+	DML_FAIL_DIO_SUPPORT,
+	DML_FAIL_NOT_ENOUGH_DSC,
+	DML_FAIL_DSC_CLK_REQUIRED,
+	DML_FAIL_URGENT_LATENCY,
+	DML_FAIL_REORDERING_BUFFER,
+	DML_FAIL_DISPCLK_DPPCLK,
+	DML_FAIL_TOTAL_AVAILABLE_PIPES,
+	DML_FAIL_NUM_OTG,
+	DML_FAIL_WRITEBACK_MODE,
+	DML_FAIL_WRITEBACK_LATENCY,
+	DML_FAIL_WRITEBACK_SCALE_RATIO_TAP,
+	DML_FAIL_CURSOR_SUPPORT,
+	DML_FAIL_PITCH_SUPPORT,
+	DML_FAIL_PTE_BUFFER_SIZE,
+	DML_FAIL_HOST_VM_IMMEDIATE_FLIP,
+	DML_FAIL_DSC_INPUT_BPC,
+	DML_FAIL_PREFETCH_SUPPORT,
+	DML_FAIL_V_RATIO_PREFETCH,
+};
+
 #endif
diff --git a/drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.c b/drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.c
index dddeb0d4db8f..d303b789adfe 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.c
@@ -62,3 +62,31 @@ void dml_init_instance(struct display_mode_lib *lib, enum dml_project project)
 	}
 }
 
+const char *dml_get_status_message(enum dm_validation_status status)
+{
+	switch (status) {
+	case DML_VALIDATION_OK:                   return "Validation OK";
+	case DML_FAIL_SCALE_RATIO_TAP:            return "Scale ratio/tap";
+	case DML_FAIL_SOURCE_PIXEL_FORMAT:        return "Source pixel format";
+	case DML_FAIL_VIEWPORT_SIZE:              return "Viewport size";
+	case DML_FAIL_TOTAL_V_ACTIVE_BW:          return "Total vertical active bandwidth";
+	case DML_FAIL_DIO_SUPPORT:                return "DIO support";
+	case DML_FAIL_NOT_ENOUGH_DSC:             return "Not enough DSC Units";
+	case DML_FAIL_DSC_CLK_REQUIRED:           return "DSC clock required";
+	case DML_FAIL_URGENT_LATENCY:             return "Urgent latency";
+	case DML_FAIL_REORDERING_BUFFER:          return "Re-ordering buffer";
+	case DML_FAIL_DISPCLK_DPPCLK:             return "Dispclk and Dppclk";
+	case DML_FAIL_TOTAL_AVAILABLE_PIPES:      return "Total available pipes";
+	case DML_FAIL_NUM_OTG:                    return "Number of OTG";
+	case DML_FAIL_WRITEBACK_MODE:             return "Writeback mode";
+	case DML_FAIL_WRITEBACK_LATENCY:          return "Writeback latency";
+	case DML_FAIL_WRITEBACK_SCALE_RATIO_TAP:  return "Writeback scale ratio/tap";
+	case DML_FAIL_CURSOR_SUPPORT:             return "Cursor support";
+	case DML_FAIL_PITCH_SUPPORT:              return "Pitch support";
+	case DML_FAIL_PTE_BUFFER_SIZE:            return "PTE buffer size";
+	case DML_FAIL_DSC_INPUT_BPC:              return "DSC input bpc";
+	case DML_FAIL_PREFETCH_SUPPORT:           return "Prefetch support";
+	case DML_FAIL_V_RATIO_PREFETCH:           return "Vertical ratio prefetch";
+	default:                                  return "Unknown Status";
+	}
+}
diff --git a/drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.h b/drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.h
index 635206248889..a730e0209c05 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.h
@@ -43,4 +43,6 @@ struct display_mode_lib {
 
 void dml_init_instance(struct display_mode_lib *lib, enum dml_project project);
 
+const char *dml_get_status_message(enum dm_validation_status status);
+
 #endif
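
dml_get_status_message pairs with the dm_validation_status enum added earlier so validation failures can be logged as text rather than raw enum values. A hedged sketch of a call site; the pr_warn-based reporting is illustrative (DC normally logs through its own DC_LOG macros):

	#include <linux/printk.h>

	static void illustrative_report(enum dm_validation_status status)
	{
		if (status != DML_VALIDATION_OK)
			pr_warn("dml: mode validation failed: %s\n",
				dml_get_status_message(status));
	}
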
diff --git a/drivers/gpu/drm/amd/display/dc/dml/display_mode_structs.h b/drivers/gpu/drm/amd/display/dc/dml/display_mode_structs.h
index 5dd04520ceca..391183e3428f 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/display_mode_structs.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/display_mode_structs.h
@@ -30,22 +30,15 @@ typedef struct _vcs_dpi_soc_bounding_box_st soc_bounding_box_st;
 typedef struct _vcs_dpi_ip_params_st ip_params_st;
 typedef struct _vcs_dpi_display_pipe_source_params_st display_pipe_source_params_st;
 typedef struct _vcs_dpi_display_output_params_st display_output_params_st;
-typedef struct _vcs_dpi_display_bandwidth_st display_bandwidth_st;
 typedef struct _vcs_dpi_scaler_ratio_depth_st scaler_ratio_depth_st;
 typedef struct _vcs_dpi_scaler_taps_st scaler_taps_st;
 typedef struct _vcs_dpi_display_pipe_dest_params_st display_pipe_dest_params_st;
 typedef struct _vcs_dpi_display_pipe_params_st display_pipe_params_st;
 typedef struct _vcs_dpi_display_clocks_and_cfg_st display_clocks_and_cfg_st;
 typedef struct _vcs_dpi_display_e2e_pipe_params_st display_e2e_pipe_params_st;
-typedef struct _vcs_dpi_dchub_buffer_sizing_st dchub_buffer_sizing_st;
-typedef struct _vcs_dpi_watermarks_perf_st watermarks_perf_st;
-typedef struct _vcs_dpi_cstate_pstate_watermarks_st cstate_pstate_watermarks_st;
-typedef struct _vcs_dpi_wm_calc_pipe_params_st wm_calc_pipe_params_st;
-typedef struct _vcs_dpi_vratio_pre_st vratio_pre_st;
 typedef struct _vcs_dpi_display_data_rq_misc_params_st display_data_rq_misc_params_st;
 typedef struct _vcs_dpi_display_data_rq_sizing_params_st display_data_rq_sizing_params_st;
 typedef struct _vcs_dpi_display_data_rq_dlg_params_st display_data_rq_dlg_params_st;
-typedef struct _vcs_dpi_display_cur_rq_dlg_params_st display_cur_rq_dlg_params_st;
 typedef struct _vcs_dpi_display_rq_dlg_params_st display_rq_dlg_params_st;
 typedef struct _vcs_dpi_display_rq_sizing_params_st display_rq_sizing_params_st;
 typedef struct _vcs_dpi_display_rq_misc_params_st display_rq_misc_params_st;
@@ -55,8 +48,6 @@ typedef struct _vcs_dpi_display_ttu_regs_st display_ttu_regs_st;
 typedef struct _vcs_dpi_display_data_rq_regs_st display_data_rq_regs_st;
 typedef struct _vcs_dpi_display_rq_regs_st display_rq_regs_st;
 typedef struct _vcs_dpi_display_dlg_sys_params_st display_dlg_sys_params_st;
-typedef struct _vcs_dpi_display_dlg_prefetch_param_st display_dlg_prefetch_param_st;
-typedef struct _vcs_dpi_display_pipe_clock_st display_pipe_clock_st;
 typedef struct _vcs_dpi_display_arb_params_st display_arb_params_st;
 
 struct _vcs_dpi_voltage_scaling_st {
@@ -111,8 +102,6 @@ struct _vcs_dpi_soc_bounding_box_st {
 	double xfc_bus_transport_time_us;
 	double xfc_xbuf_latency_tolerance_us;
 	int use_urgent_burst_bw;
-	double max_hscl_ratio;
-	double max_vscl_ratio;
 	unsigned int num_states;
 	struct _vcs_dpi_voltage_scaling_st clock_limits[8];
 };
@@ -129,7 +118,8 @@ struct _vcs_dpi_ip_params_st {
 	unsigned int odm_capable;
 	unsigned int rob_buffer_size_kbytes;
 	unsigned int det_buffer_size_kbytes;
-	unsigned int dpte_buffer_size_in_pte_reqs;
+	unsigned int dpte_buffer_size_in_pte_reqs_luma;
+	unsigned int dpte_buffer_size_in_pte_reqs_chroma;
 	unsigned int pde_proc_buffer_size_64k_reqs;
 	unsigned int dpp_output_buffer_pixels;
 	unsigned int opp_output_buffer_lines;
@@ -192,7 +182,6 @@ struct _vcs_dpi_display_xfc_params_st {
 struct _vcs_dpi_display_pipe_source_params_st {
 	int source_format;
 	unsigned char dcc;
-	unsigned int dcc_override;
 	unsigned int dcc_rate;
 	unsigned char dcc_use_global;
 	unsigned char vm;
@@ -205,7 +194,6 @@ struct _vcs_dpi_display_pipe_source_params_st {
 	int source_scan;
 	int sw_mode;
 	int macro_tile_size;
-	unsigned char is_display_sw;
 	unsigned int viewport_width;
 	unsigned int viewport_height;
 	unsigned int viewport_y_y;
@@ -252,16 +240,10 @@ struct _vcs_dpi_display_output_params_st {
 	int output_bpc;
 	int output_type;
 	int output_format;
-	int output_standard;
 	int dsc_slices;
 	struct writeback_st wb;
 };
 
-struct _vcs_dpi_display_bandwidth_st {
-	double total_bw_consumed_gbps;
-	double guaranteed_urgent_return_bw_gbps;
-};
-
 struct _vcs_dpi_scaler_ratio_depth_st {
 	double hscl_ratio;
 	double vscl_ratio;
@@ -300,11 +282,9 @@ struct _vcs_dpi_display_pipe_dest_params_st {
 	unsigned int vupdate_width;
 	unsigned int vready_offset;
 	unsigned char interlaced;
-	unsigned char underscan;
 	double pixel_rate_mhz;
 	unsigned char synchronized_vblank_all_planes;
 	unsigned char otg_inst;
-	unsigned char odm_split_cnt;
 	unsigned char odm_combine;
 	unsigned char use_maximum_vstartup;
 };
@@ -331,65 +311,6 @@ struct _vcs_dpi_display_e2e_pipe_params_st {
 	display_clocks_and_cfg_st clks_cfg;
 };
 
-struct _vcs_dpi_dchub_buffer_sizing_st {
-	unsigned int swath_width_y;
-	unsigned int swath_height_y;
-	unsigned int swath_height_c;
-	unsigned int detail_buffer_size_y;
-};
-
-struct _vcs_dpi_watermarks_perf_st {
-	double stutter_eff_in_active_region_percent;
-	double urgent_latency_supported_us;
-	double non_urgent_latency_supported_us;
-	double dram_clock_change_margin_us;
-	double dram_access_eff_percent;
-};
-
-struct _vcs_dpi_cstate_pstate_watermarks_st {
-	double cstate_exit_us;
-	double cstate_enter_plus_exit_us;
-	double pstate_change_us;
-};
-
-struct _vcs_dpi_wm_calc_pipe_params_st {
-	unsigned int num_dpp;
-	int voltage;
-	int output_type;
-	double dcfclk_mhz;
-	double socclk_mhz;
-	double dppclk_mhz;
-	double pixclk_mhz;
-	unsigned char interlace_en;
-	unsigned char pte_enable;
-	unsigned char dcc_enable;
-	double dcc_rate;
-	double bytes_per_pixel_c;
-	double bytes_per_pixel_y;
-	unsigned int swath_width_y;
-	unsigned int swath_height_y;
-	unsigned int swath_height_c;
-	unsigned int det_buffer_size_y;
-	double h_ratio;
-	double v_ratio;
-	unsigned int h_taps;
-	unsigned int h_total;
-	unsigned int v_total;
-	unsigned int v_active;
-	unsigned int e2e_index;
-	double display_pipe_line_delivery_time;
-	double read_bw;
-	unsigned int lines_in_det_y;
-	unsigned int lines_in_det_y_rounded_down_to_swath;
-	double full_det_buffering_time;
-	double dcfclk_deepsleep_mhz_per_plane;
-};
-
-struct _vcs_dpi_vratio_pre_st {
-	double vratio_pre_l;
-	double vratio_pre_c;
-};
-
 struct _vcs_dpi_display_data_rq_misc_params_st {
 	unsigned int full_swath_bytes;
 	unsigned int stored_swath_bytes;
@@ -423,16 +344,9 @@ struct _vcs_dpi_display_data_rq_dlg_params_st {
 	unsigned int meta_bytes_per_row_ub;
 };
 
-struct _vcs_dpi_display_cur_rq_dlg_params_st {
-	unsigned char enable;
-	unsigned int swath_height;
-	unsigned int req_per_line;
-};
-
 struct _vcs_dpi_display_rq_dlg_params_st {
 	display_data_rq_dlg_params_st rq_l;
 	display_data_rq_dlg_params_st rq_c;
-	display_cur_rq_dlg_params_st rq_cur0;
 };
 
 struct _vcs_dpi_display_rq_sizing_params_st {
@@ -498,6 +412,10 @@ struct _vcs_dpi_display_dlg_regs_st {
 	unsigned int xfc_reg_remote_surface_flip_latency;
 	unsigned int xfc_reg_prefetch_margin;
 	unsigned int dst_y_delta_drq_limit;
+	unsigned int refcyc_per_vm_group_vblank;
+	unsigned int refcyc_per_vm_group_flip;
+	unsigned int refcyc_per_vm_req_vblank;
+	unsigned int refcyc_per_vm_req_flip;
 };
 
 struct _vcs_dpi_display_ttu_regs_st {
@@ -556,19 +474,6 @@ struct _vcs_dpi_display_dlg_sys_params_st {
 	unsigned int total_flip_bytes;
 };
 
-struct _vcs_dpi_display_dlg_prefetch_param_st {
-	double prefetch_bw;
-	unsigned int flip_bytes;
-};
-
-struct _vcs_dpi_display_pipe_clock_st {
-	double dcfclk_mhz;
-	double dispclk_mhz;
-	double socclk_mhz;
-	double dscclk_mhz[6];
-	double dppclk_mhz[6];
-};
-
 struct _vcs_dpi_display_arb_params_st {
 	int max_req_outstanding;
 	int min_req_outstanding;
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.c b/drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.c
index c2037daa8e66..ad8571f5a142 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.c
@@ -459,7 +459,7 @@ static void dml1_rq_dlg_get_row_heights(
 	/* dpte   */
 	/* ------ */
 	log2_vmpg_bytes = dml_log2(mode_lib->soc.vmm_page_size_bytes);
-	dpte_buf_in_pte_reqs = mode_lib->ip.dpte_buffer_size_in_pte_reqs;
+	dpte_buf_in_pte_reqs = mode_lib->ip.dpte_buffer_size_in_pte_reqs_luma;
 
 	log2_vmpg_height = 0;
 	log2_vmpg_width = 0;
@@ -776,7 +776,7 @@ static void get_surf_rq_param(
 	/* dpte   */
 	/* ------ */
 	log2_vmpg_bytes = dml_log2(mode_lib->soc.vmm_page_size_bytes);
-	dpte_buf_in_pte_reqs = mode_lib->ip.dpte_buffer_size_in_pte_reqs;
+	dpte_buf_in_pte_reqs = mode_lib->ip.dpte_buffer_size_in_pte_reqs_luma;
 
 	log2_vmpg_height = 0;
 	log2_vmpg_width = 0;
@@ -881,7 +881,7 @@ static void get_surf_rq_param(
 	/* dpte_group_bytes is reduced for the specific case of vertical
 	 * access to a tiled surface whose dpte request is 8x1 ptes.
 	 */
-	if (!surf_linear & (log2_dpte_req_height_ptes == 0) & surf_vert) /*reduced, in this case, will have page fault within a group */
+	if (!surf_linear && (log2_dpte_req_height_ptes == 0) && surf_vert) /*reduced, in this case, will have page fault within a group */
 		rq_sizing_param->dpte_group_bytes = 512;
 	else
 		/* full size */
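
The & to && change above alters no truth value, since every operand in that condition is already 0 or 1; the gain is semantic. && short-circuits and is immune to the precedence trap where a & b == c parses as a & (b == c). A tiny self-contained illustration:

	#include <assert.h>
	#include <stdbool.h>

	int main(void)
	{
		bool surf_linear = false, surf_vert = true;
		int log2_h = 0;

		/* identical result here: each operand is already 0 or 1 */
		assert((!surf_linear & (log2_h == 0) & surf_vert) ==
		       (!surf_linear && (log2_h == 0) && surf_vert));

		/* but & evaluates all operands and binds tighter than ==,
		 * so a & b == c means a & (b == c); && avoids both traps
		 */
		return 0;
	}
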
diff --git a/drivers/gpu/drm/amd/display/dc/gpio/gpio_base.c b/drivers/gpu/drm/amd/display/dc/gpio/gpio_base.c
index 1d1efd72b291..cf76ea2d9f5a 100644
--- a/drivers/gpu/drm/amd/display/dc/gpio/gpio_base.c
+++ b/drivers/gpu/drm/amd/display/dc/gpio/gpio_base.c
@@ -101,6 +101,18 @@ enum gpio_mode dal_gpio_get_mode(
 	return gpio->mode;
 }
 
+enum gpio_result dal_gpio_lock_pin(
+	struct gpio *gpio)
+{
+	return dal_gpio_service_lock(gpio->service, gpio->id, gpio->en);
+}
+
+enum gpio_result dal_gpio_unlock_pin(
+	struct gpio *gpio)
+{
+	return dal_gpio_service_unlock(gpio->service, gpio->id, gpio->en);
+}
+
 enum gpio_result dal_gpio_change_mode(
 	struct gpio *gpio,
 	enum gpio_mode mode)
diff --git a/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.c b/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.c
index dada04296025..3c63a3c04dbb 100644
--- a/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.c
+++ b/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.c
@@ -192,6 +192,34 @@ static void set_pin_free(
 	service->busyness[id][en] = false;
 }
 
+enum gpio_result dal_gpio_service_lock(
+	struct gpio_service *service,
+	enum gpio_id id,
+	uint32_t en)
+{
+	if (!service->busyness[id]) {
+		ASSERT_CRITICAL(false);
+		return GPIO_RESULT_OPEN_FAILED;
+	}
+
+	set_pin_busy(service, id, en);
+	return GPIO_RESULT_OK;
+}
+
+enum gpio_result dal_gpio_service_unlock(
+	struct gpio_service *service,
+	enum gpio_id id,
+	uint32_t en)
+{
+	if (!service->busyness[id]) {
+		ASSERT_CRITICAL(false);
+		return GPIO_RESULT_OPEN_FAILED;
+	}
+
+	set_pin_free(service, id, en);
+	return GPIO_RESULT_OK;
+}
+
 enum gpio_result dal_gpio_service_open(
 	struct gpio_service *service,
 	enum gpio_id id,
diff --git a/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.h b/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.h
index 1d501a43d13b..0c678af75331 100644
--- a/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.h
+++ b/drivers/gpu/drm/amd/display/dc/gpio/gpio_service.h
@@ -52,4 +52,14 @@ void dal_gpio_service_close(
 	struct gpio_service *service,
 	struct hw_gpio_pin **ptr);
 
+enum gpio_result dal_gpio_service_lock(
+	struct gpio_service *service,
+	enum gpio_id id,
+	uint32_t en);
+
+enum gpio_result dal_gpio_service_unlock(
+	struct gpio_service *service,
+	enum gpio_id id,
+	uint32_t en);
+
 #endif
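
The lock/unlock service entry points mirror the existing open/close pair but only flip the busyness state of an already-opened pin, so a client can hold a pin across an operation without re-running the open path. A hedged sketch of the expected pairing; the caller below is illustrative:

	static enum gpio_result illustrative_locked_access(struct gpio *gpio)
	{
		enum gpio_result result = dal_gpio_lock_pin(gpio);

		if (result != GPIO_RESULT_OK)
			return result;

		/* ... HW access that needs exclusive pin ownership ... */

		return dal_gpio_unlock_pin(gpio);
	}
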
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/Makefile b/drivers/gpu/drm/amd/display/dc/i2caux/Makefile
deleted file mode 100644
index 352885cb4d07..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/Makefile
+++ /dev/null
@@ -1,99 +0,0 @@
-#
-# Copyright 2017 Advanced Micro Devices, Inc.
-#
-# Permission is hereby granted, free of charge, to any person obtaining a
-# copy of this software and associated documentation files (the "Software"),
-# to deal in the Software without restriction, including without limitation
-# the rights to use, copy, modify, merge, publish, distribute, sublicense,
-# and/or sell copies of the Software, and to permit persons to whom the
-# Software is furnished to do so, subject to the following conditions:
-#
-# The above copyright notice and this permission notice shall be included in
-# all copies or substantial portions of the Software.
-#
-# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
-# THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
-# OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
-# ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
-# OTHER DEALINGS IN THE SOFTWARE.
-#
-#
-# Makefile for the 'i2c' sub-component of DAL.
-# It provides the control and status of HW i2c engine of the adapter.
-
-I2CAUX = aux_engine.o engine_base.o i2caux.o i2c_engine.o \
-	 i2c_generic_hw_engine.o i2c_hw_engine.o i2c_sw_engine.o
-
-AMD_DAL_I2CAUX = $(addprefix $(AMDDALPATH)/dc/i2caux/,$(I2CAUX))
-
-AMD_DISPLAY_FILES += $(AMD_DAL_I2CAUX)
-
-###############################################################################
-# DCE 8x family
-###############################################################################
-I2CAUX_DCE80 = i2caux_dce80.o i2c_hw_engine_dce80.o \
-	i2c_sw_engine_dce80.o
-
-AMD_DAL_I2CAUX_DCE80 = $(addprefix $(AMDDALPATH)/dc/i2caux/dce80/,$(I2CAUX_DCE80))
-
-AMD_DISPLAY_FILES += $(AMD_DAL_I2CAUX_DCE80)
-
-###############################################################################
-# DCE 100 family
-###############################################################################
-I2CAUX_DCE100 = i2caux_dce100.o
-
-AMD_DAL_I2CAUX_DCE100 = $(addprefix $(AMDDALPATH)/dc/i2caux/dce100/,$(I2CAUX_DCE100))
-
-AMD_DISPLAY_FILES += $(AMD_DAL_I2CAUX_DCE100)
-
-###############################################################################
-# DCE 110 family
-###############################################################################
-I2CAUX_DCE110 = i2caux_dce110.o i2c_sw_engine_dce110.o i2c_hw_engine_dce110.o \
-	aux_engine_dce110.o
-
-AMD_DAL_I2CAUX_DCE110 = $(addprefix $(AMDDALPATH)/dc/i2caux/dce110/,$(I2CAUX_DCE110))
-
-AMD_DISPLAY_FILES += $(AMD_DAL_I2CAUX_DCE110)
-
-###############################################################################
-# DCE 112 family
-###############################################################################
-I2CAUX_DCE112 = i2caux_dce112.o
-
-AMD_DAL_I2CAUX_DCE112 = $(addprefix $(AMDDALPATH)/dc/i2caux/dce112/,$(I2CAUX_DCE112))
-
-AMD_DISPLAY_FILES += $(AMD_DAL_I2CAUX_DCE112)
-
-###############################################################################
-# DCN 1.0 family
-###############################################################################
-ifdef CONFIG_DRM_AMD_DC_DCN1_0
-I2CAUX_DCN1 = i2caux_dcn10.o
-
-AMD_DAL_I2CAUX_DCN1 = $(addprefix $(AMDDALPATH)/dc/i2caux/dcn10/,$(I2CAUX_DCN1))
-
-AMD_DISPLAY_FILES += $(AMD_DAL_I2CAUX_DCN1)
-endif
-
-###############################################################################
-# DCE 120 family
-###############################################################################
-I2CAUX_DCE120 = i2caux_dce120.o
-
-AMD_DAL_I2CAUX_DCE120 = $(addprefix $(AMDDALPATH)/dc/i2caux/dce120/,$(I2CAUX_DCE120))
-
-AMD_DISPLAY_FILES += $(AMD_DAL_I2CAUX_DCE120)
-
-###############################################################################
-# Diagnostics on FPGA
-###############################################################################
-I2CAUX_DIAG = i2caux_diag.o
-
-AMD_DAL_I2CAUX_DIAG = $(addprefix $(AMDDALPATH)/dc/i2caux/diagnostics/,$(I2CAUX_DIAG))
-
-AMD_DISPLAY_FILES += $(AMD_DAL_I2CAUX_DIAG)
-
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/aux_engine.c b/drivers/gpu/drm/amd/display/dc/i2caux/aux_engine.c
deleted file mode 100644
index 8cbf38b2470d..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/aux_engine.c
+++ /dev/null
@@ -1,606 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-#include "dm_event_log.h"
-
-/*
- * Pre-requisites: headers required by header of this unit
- */
-#include "include/i2caux_interface.h"
-#include "engine.h"
-
-/*
- * Header of this unit
- */
-
-#include "aux_engine.h"
-
-/*
- * Post-requisites: headers required by this unit
- */
-
-#include "include/link_service_types.h"
-
-/*
- * This unit
- */
-
-enum {
-	AUX_INVALID_REPLY_RETRY_COUNTER = 1,
-	AUX_TIMED_OUT_RETRY_COUNTER = 2,
-	AUX_DEFER_RETRY_COUNTER = 6
-};
-
-#define FROM_ENGINE(ptr) \
-	container_of((ptr), struct aux_engine, base)
-#define DC_LOGGER \
-	engine->base.ctx->logger
-
-enum i2caux_engine_type dal_aux_engine_get_engine_type(
-	const struct engine *engine)
-{
-	return I2CAUX_ENGINE_TYPE_AUX;
-}
-
-bool dal_aux_engine_acquire(
-	struct engine *engine,
-	struct ddc *ddc)
-{
-	struct aux_engine *aux_engine = FROM_ENGINE(engine);
-
-	enum gpio_result result;
-	if (aux_engine->funcs->is_engine_available) {
-		/*check whether SW could use the engine*/
-		if (!aux_engine->funcs->is_engine_available(aux_engine)) {
-			return false;
-		}
-	}
-
-	result = dal_ddc_open(ddc, GPIO_MODE_HARDWARE,
-		GPIO_DDC_CONFIG_TYPE_MODE_AUX);
-
-	if (result != GPIO_RESULT_OK)
-		return false;
-
-	if (!aux_engine->funcs->acquire_engine(aux_engine)) {
-		dal_ddc_close(ddc);
-		return false;
-	}
-
-	engine->ddc = ddc;
-
-	return true;
-}
-
-struct read_command_context {
-	uint8_t *buffer;
-	uint32_t current_read_length;
-	uint32_t offset;
-	enum i2caux_transaction_status status;
-
-	struct aux_request_transaction_data request;
-	struct aux_reply_transaction_data reply;
-
-	uint8_t returned_byte;
-
-	uint32_t timed_out_retry_aux;
-	uint32_t invalid_reply_retry_aux;
-	uint32_t defer_retry_aux;
-	uint32_t defer_retry_i2c;
-	uint32_t invalid_reply_retry_aux_on_ack;
-
-	bool transaction_complete;
-	bool operation_succeeded;
-};
-
-static void process_read_reply(
-	struct aux_engine *engine,
-	struct read_command_context *ctx)
-{
-	engine->funcs->process_channel_reply(engine, &ctx->reply);
-
-	switch (ctx->reply.status) {
-	case AUX_TRANSACTION_REPLY_AUX_ACK:
-		ctx->defer_retry_aux = 0;
-		if (ctx->returned_byte > ctx->current_read_length) {
-			ctx->status =
-				I2CAUX_TRANSACTION_STATUS_FAILED_PROTOCOL_ERROR;
-			ctx->operation_succeeded = false;
-		} else if (ctx->returned_byte < ctx->current_read_length) {
-			ctx->current_read_length -= ctx->returned_byte;
-
-			ctx->offset += ctx->returned_byte;
-
-			++ctx->invalid_reply_retry_aux_on_ack;
-
-			if (ctx->invalid_reply_retry_aux_on_ack >
-				AUX_INVALID_REPLY_RETRY_COUNTER) {
-				ctx->status =
-				I2CAUX_TRANSACTION_STATUS_FAILED_PROTOCOL_ERROR;
-				ctx->operation_succeeded = false;
-			}
-		} else {
-			ctx->status = I2CAUX_TRANSACTION_STATUS_SUCCEEDED;
-			ctx->transaction_complete = true;
-			ctx->operation_succeeded = true;
-		}
-	break;
-	case AUX_TRANSACTION_REPLY_AUX_NACK:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_NACK;
-		ctx->operation_succeeded = false;
-	break;
-	case AUX_TRANSACTION_REPLY_AUX_DEFER:
-		++ctx->defer_retry_aux;
-
-		if (ctx->defer_retry_aux > AUX_DEFER_RETRY_COUNTER) {
-			ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-			ctx->operation_succeeded = false;
-		}
-	break;
-	case AUX_TRANSACTION_REPLY_I2C_DEFER:
-		ctx->defer_retry_aux = 0;
-
-		++ctx->defer_retry_i2c;
-
-		if (ctx->defer_retry_i2c > AUX_DEFER_RETRY_COUNTER) {
-			ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-			ctx->operation_succeeded = false;
-		}
-	break;
-	case AUX_TRANSACTION_REPLY_HPD_DISCON:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_HPD_DISCON;
-		ctx->operation_succeeded = false;
-	break;
-	default:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_UNKNOWN;
-		ctx->operation_succeeded = false;
-	}
-}
-
-static void process_read_request(
-	struct aux_engine *engine,
-	struct read_command_context *ctx)
-{
-	enum aux_channel_operation_result operation_result;
-
-	engine->funcs->submit_channel_request(engine, &ctx->request);
-
-	operation_result = engine->funcs->get_channel_status(
-		engine, &ctx->returned_byte);
-
-	switch (operation_result) {
-	case AUX_CHANNEL_OPERATION_SUCCEEDED:
-		if (ctx->returned_byte > ctx->current_read_length) {
-			ctx->status =
-				I2CAUX_TRANSACTION_STATUS_FAILED_PROTOCOL_ERROR;
-			ctx->operation_succeeded = false;
-		} else {
-			ctx->timed_out_retry_aux = 0;
-			ctx->invalid_reply_retry_aux = 0;
-
-			ctx->reply.length = ctx->returned_byte;
-			ctx->reply.data = ctx->buffer;
-
-			process_read_reply(engine, ctx);
-		}
-	break;
-	case AUX_CHANNEL_OPERATION_FAILED_INVALID_REPLY:
-		++ctx->invalid_reply_retry_aux;
-
-		if (ctx->invalid_reply_retry_aux >
-			AUX_INVALID_REPLY_RETRY_COUNTER) {
-			ctx->status =
-				I2CAUX_TRANSACTION_STATUS_FAILED_PROTOCOL_ERROR;
-			ctx->operation_succeeded = false;
-		} else
-			udelay(400);
-	break;
-	case AUX_CHANNEL_OPERATION_FAILED_TIMEOUT:
-		++ctx->timed_out_retry_aux;
-
-		if (ctx->timed_out_retry_aux > AUX_TIMED_OUT_RETRY_COUNTER) {
-			ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-			ctx->operation_succeeded = false;
-		} else {
-			/* DP 1.2a, table 2-58:
-			 * "S3: AUX Request CMD PENDING:
-			 * retry 3 times, with 400usec wait on each"
-			 * The HW timeout is set to 550usec,
-			 * so we should not wait here */
-		}
-	break;
-	case AUX_CHANNEL_OPERATION_FAILED_HPD_DISCON:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_HPD_DISCON;
-		ctx->operation_succeeded = false;
-	break;
-	default:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_UNKNOWN;
-		ctx->operation_succeeded = false;
-	}
-}
-
-static bool read_command(
-	struct aux_engine *engine,
-	struct i2caux_transaction_request *request,
-	bool middle_of_transaction)
-{
-	struct read_command_context ctx;
-
-	ctx.buffer = request->payload.data;
-	ctx.current_read_length = request->payload.length;
-	ctx.offset = 0;
-	ctx.timed_out_retry_aux = 0;
-	ctx.invalid_reply_retry_aux = 0;
-	ctx.defer_retry_aux = 0;
-	ctx.defer_retry_i2c = 0;
-	ctx.invalid_reply_retry_aux_on_ack = 0;
-	ctx.transaction_complete = false;
-	ctx.operation_succeeded = true;
-
-	if (request->payload.address_space ==
-		I2CAUX_TRANSACTION_ADDRESS_SPACE_DPCD) {
-		ctx.request.type = AUX_TRANSACTION_TYPE_DP;
-		ctx.request.action = I2CAUX_TRANSACTION_ACTION_DP_READ;
-		ctx.request.address = request->payload.address;
-	} else if (request->payload.address_space ==
-		I2CAUX_TRANSACTION_ADDRESS_SPACE_I2C) {
-		ctx.request.type = AUX_TRANSACTION_TYPE_I2C;
-		ctx.request.action = middle_of_transaction ?
-			I2CAUX_TRANSACTION_ACTION_I2C_READ_MOT :
-			I2CAUX_TRANSACTION_ACTION_I2C_READ;
-		ctx.request.address = request->payload.address >> 1;
-	} else {
-		/* in DAL2, there was no return in such case */
-		BREAK_TO_DEBUGGER();
-		return false;
-	}
-
-	ctx.request.delay = 0;
-
-	do {
-		memset(ctx.buffer + ctx.offset, 0, ctx.current_read_length);
-
-		ctx.request.data = ctx.buffer + ctx.offset;
-		ctx.request.length = ctx.current_read_length;
-
-		process_read_request(engine, &ctx);
-
-		request->status = ctx.status;
-
-		if (ctx.operation_succeeded && !ctx.transaction_complete)
-			if (ctx.request.type == AUX_TRANSACTION_TYPE_I2C)
-				msleep(engine->delay);
-	} while (ctx.operation_succeeded && !ctx.transaction_complete);
-
-	if (request->payload.address_space ==
-		I2CAUX_TRANSACTION_ADDRESS_SPACE_DPCD) {
-		DC_LOG_I2C_AUX("READ: addr:0x%x  value:0x%x Result:%d",
-				request->payload.address,
-				request->payload.data[0],
-				ctx.operation_succeeded);
-	}
-
-	return ctx.operation_succeeded;
-}
-
-struct write_command_context {
-	bool mot;
-
-	uint8_t *buffer;
-	uint32_t current_write_length;
-	enum i2caux_transaction_status status;
-
-	struct aux_request_transaction_data request;
-	struct aux_reply_transaction_data reply;
-
-	uint8_t returned_byte;
-
-	uint32_t timed_out_retry_aux;
-	uint32_t invalid_reply_retry_aux;
-	uint32_t defer_retry_aux;
-	uint32_t defer_retry_i2c;
-	uint32_t max_defer_retry;
-	uint32_t ack_m_retry;
-
-	uint8_t reply_data[DEFAULT_AUX_MAX_DATA_SIZE];
-
-	bool transaction_complete;
-	bool operation_succeeded;
-};
-
-static void process_write_reply(
-	struct aux_engine *engine,
-	struct write_command_context *ctx)
-{
-	engine->funcs->process_channel_reply(engine, &ctx->reply);
-
-	switch (ctx->reply.status) {
-	case AUX_TRANSACTION_REPLY_AUX_ACK:
-		ctx->operation_succeeded = true;
-
-		if (ctx->returned_byte) {
-			ctx->request.action = ctx->mot ?
-			I2CAUX_TRANSACTION_ACTION_I2C_STATUS_REQUEST_MOT :
-			I2CAUX_TRANSACTION_ACTION_I2C_STATUS_REQUEST;
-
-			ctx->current_write_length = 0;
-
-			++ctx->ack_m_retry;
-
-			if (ctx->ack_m_retry > AUX_DEFER_RETRY_COUNTER) {
-				ctx->status =
-				I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-				ctx->operation_succeeded = false;
-			} else
-				udelay(300);
-		} else {
-			ctx->status = I2CAUX_TRANSACTION_STATUS_SUCCEEDED;
-			ctx->defer_retry_aux = 0;
-			ctx->ack_m_retry = 0;
-			ctx->transaction_complete = true;
-		}
-	break;
-	case AUX_TRANSACTION_REPLY_AUX_NACK:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_NACK;
-		ctx->operation_succeeded = false;
-	break;
-	case AUX_TRANSACTION_REPLY_AUX_DEFER:
-		++ctx->defer_retry_aux;
-
-		if (ctx->defer_retry_aux > ctx->max_defer_retry) {
-			ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-			ctx->operation_succeeded = false;
-		}
-	break;
-	case AUX_TRANSACTION_REPLY_I2C_DEFER:
-		ctx->defer_retry_aux = 0;
-		ctx->current_write_length = 0;
-
-		ctx->request.action = ctx->mot ?
-			I2CAUX_TRANSACTION_ACTION_I2C_STATUS_REQUEST_MOT :
-			I2CAUX_TRANSACTION_ACTION_I2C_STATUS_REQUEST;
-
-		++ctx->defer_retry_i2c;
-
-		if (ctx->defer_retry_i2c > ctx->max_defer_retry) {
-			ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-			ctx->operation_succeeded = false;
-		}
-	break;
-	case AUX_TRANSACTION_REPLY_HPD_DISCON:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_HPD_DISCON;
-		ctx->operation_succeeded = false;
-	break;
-	default:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_UNKNOWN;
-		ctx->operation_succeeded = false;
-	}
-}
-
-static void process_write_request(
-	struct aux_engine *engine,
-	struct write_command_context *ctx)
-{
-	enum aux_channel_operation_result operation_result;
-
-	engine->funcs->submit_channel_request(engine, &ctx->request);
-
-	operation_result = engine->funcs->get_channel_status(
-		engine, &ctx->returned_byte);
-
-	switch (operation_result) {
-	case AUX_CHANNEL_OPERATION_SUCCEEDED:
-		ctx->timed_out_retry_aux = 0;
-		ctx->invalid_reply_retry_aux = 0;
-
-		ctx->reply.length = ctx->returned_byte;
-		ctx->reply.data = ctx->reply_data;
-
-		process_write_reply(engine, ctx);
-	break;
-	case AUX_CHANNEL_OPERATION_FAILED_INVALID_REPLY:
-		++ctx->invalid_reply_retry_aux;
-
-		if (ctx->invalid_reply_retry_aux >
-			AUX_INVALID_REPLY_RETRY_COUNTER) {
-			ctx->status =
-				I2CAUX_TRANSACTION_STATUS_FAILED_PROTOCOL_ERROR;
-			ctx->operation_succeeded = false;
-		} else
-			udelay(400);
-	break;
-	case AUX_CHANNEL_OPERATION_FAILED_TIMEOUT:
-		++ctx->timed_out_retry_aux;
-
-		if (ctx->timed_out_retry_aux > AUX_TIMED_OUT_RETRY_COUNTER) {
-			ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-			ctx->operation_succeeded = false;
-		} else {
-			/* DP 1.2a, table 2-58:
-			 * "S3: AUX Request CMD PENDING:
-			 * retry 3 times, with 400usec wait on each"
-			 * The HW timeout is set to 550usec,
-			 * so we should not wait here */
-		}
-	break;
-	case AUX_CHANNEL_OPERATION_FAILED_HPD_DISCON:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_FAILED_HPD_DISCON;
-		ctx->operation_succeeded = false;
-	break;
-	default:
-		ctx->status = I2CAUX_TRANSACTION_STATUS_UNKNOWN;
-		ctx->operation_succeeded = false;
-	}
-}
-
-static bool write_command(
-	struct aux_engine *engine,
-	struct i2caux_transaction_request *request,
-	bool middle_of_transaction)
-{
-	struct write_command_context ctx;
-
-	ctx.mot = middle_of_transaction;
-	ctx.buffer = request->payload.data;
-	ctx.current_write_length = request->payload.length;
-	ctx.timed_out_retry_aux = 0;
-	ctx.invalid_reply_retry_aux = 0;
-	ctx.defer_retry_aux = 0;
-	ctx.defer_retry_i2c = 0;
-	ctx.ack_m_retry = 0;
-	ctx.transaction_complete = false;
-	ctx.operation_succeeded = true;
-
-	if (request->payload.address_space ==
-		I2CAUX_TRANSACTION_ADDRESS_SPACE_DPCD) {
-		ctx.request.type = AUX_TRANSACTION_TYPE_DP;
-		ctx.request.action = I2CAUX_TRANSACTION_ACTION_DP_WRITE;
-		ctx.request.address = request->payload.address;
-	} else if (request->payload.address_space ==
-		I2CAUX_TRANSACTION_ADDRESS_SPACE_I2C) {
-		ctx.request.type = AUX_TRANSACTION_TYPE_I2C;
-		ctx.request.action = middle_of_transaction ?
-			I2CAUX_TRANSACTION_ACTION_I2C_WRITE_MOT :
-			I2CAUX_TRANSACTION_ACTION_I2C_WRITE;
-		ctx.request.address = request->payload.address >> 1;
-	} else {
-		/* in DAL2, there was no return in such case */
-		BREAK_TO_DEBUGGER();
-		return false;
-	}
-
-	ctx.request.delay = 0;
-
-	ctx.max_defer_retry =
-		(engine->max_defer_write_retry > AUX_DEFER_RETRY_COUNTER) ?
-			engine->max_defer_write_retry : AUX_DEFER_RETRY_COUNTER;
-
-	do {
-		ctx.request.data = ctx.buffer;
-		ctx.request.length = ctx.current_write_length;
-
-		process_write_request(engine, &ctx);
-
-		request->status = ctx.status;
-
-		if (ctx.operation_succeeded && !ctx.transaction_complete)
-			if (ctx.request.type == AUX_TRANSACTION_TYPE_I2C)
-				msleep(engine->delay);
-	} while (ctx.operation_succeeded && !ctx.transaction_complete);
-
-	if (request->payload.address_space ==
-		I2CAUX_TRANSACTION_ADDRESS_SPACE_DPCD) {
-		DC_LOG_I2C_AUX("WRITE: addr:0x%x  value:0x%x Result:%d",
-				request->payload.address,
-				request->payload.data[0],
-				ctx.operation_succeeded);
-	}
-
-	return ctx.operation_succeeded;
-}
-
-static bool end_of_transaction_command(
-	struct aux_engine *engine,
-	struct i2caux_transaction_request *request)
-{
-	struct i2caux_transaction_request dummy_request;
-	uint8_t dummy_data;
-
-	/* [tcheng] We only need to send the stop (read with MOT = 0)
-	 * for I2C-over-Aux, not native AUX */
-
-	if (request->payload.address_space !=
-		I2CAUX_TRANSACTION_ADDRESS_SPACE_I2C)
-		return false;
-
-	dummy_request.operation = request->operation;
-	dummy_request.payload.address_space = request->payload.address_space;
-	dummy_request.payload.address = request->payload.address;
-
-	/*
-	 * Add a dummy byte due to some receiver quirk
-	 * where one byte is sent along with MOT = 0.
-	 * Ideally this should be 0.
-	 */
-
-	dummy_request.payload.length = 0;
-	dummy_request.payload.data = &dummy_data;
-
-	if (request->operation == I2CAUX_TRANSACTION_READ)
-		return read_command(engine, &dummy_request, false);
-	else
-		return write_command(engine, &dummy_request, false);
-
-	/* according Syed, it does not need now DoDummyMOT */
-}
-
-bool dal_aux_engine_submit_request(
-	struct engine *engine,
-	struct i2caux_transaction_request *request,
-	bool middle_of_transaction)
-{
-	struct aux_engine *aux_engine = FROM_ENGINE(engine);
-
-	bool result;
-	bool mot_used = true;
-
-	switch (request->operation) {
-	case I2CAUX_TRANSACTION_READ:
-		result = read_command(aux_engine, request, mot_used);
-	break;
-	case I2CAUX_TRANSACTION_WRITE:
-		result = write_command(aux_engine, request, mot_used);
-	break;
-	default:
-		result = false;
-	}
-
-	/* [tcheng]
-	 * need to send stop for the last transaction to free up the AUX
-	 * if the above command fails, this would be the last transaction */
-
-	if (!middle_of_transaction || !result)
-		end_of_transaction_command(aux_engine, request);
-
-	/* mask AUX interrupt */
-
-	return result;
-}
-
-void dal_aux_engine_construct(
-	struct aux_engine *engine,
-	struct dc_context *ctx)
-{
-	dal_i2caux_construct_engine(&engine->base, ctx);
-	engine->delay = 0;
-	engine->max_defer_write_retry = 0;
-}
-
-void dal_aux_engine_destruct(
-	struct aux_engine *engine)
-{
-	dal_i2caux_destruct_engine(&engine->base);
-}
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/aux_engine.h b/drivers/gpu/drm/amd/display/dc/i2caux/aux_engine.h
deleted file mode 100644
index c33a2898d967..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/aux_engine.h
+++ /dev/null
@@ -1,86 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#ifndef __DAL_AUX_ENGINE_H__
-#define __DAL_AUX_ENGINE_H__
-
-#include "dc_ddc_types.h"
-
-struct aux_engine;
-
-struct aux_engine_funcs {
-	void (*destroy)(
-		struct aux_engine **ptr);
-	bool (*acquire_engine)(
-		struct aux_engine *engine);
-	void (*configure)(
-		struct aux_engine *engine,
-		union aux_config cfg);
-	void (*submit_channel_request)(
-		struct aux_engine *engine,
-		struct aux_request_transaction_data *request);
-	void (*process_channel_reply)(
-		struct aux_engine *engine,
-		struct aux_reply_transaction_data *reply);
-	int (*read_channel_reply)(
-		struct aux_engine *engine,
-		uint32_t size,
-		uint8_t *buffer,
-		uint8_t *reply_result,
-		uint32_t *sw_status);
-	enum aux_channel_operation_result (*get_channel_status)(
-		struct aux_engine *engine,
-		uint8_t *returned_bytes);
-	bool (*is_engine_available) (
-		struct aux_engine *engine);
-};
-
-struct aux_engine {
-	struct engine base;
-	const struct aux_engine_funcs *funcs;
-	/* following values are expressed in milliseconds */
-	uint32_t delay;
-	uint32_t max_defer_write_retry;
-
-	bool acquire_reset;
-};
-
-void dal_aux_engine_construct(
-	struct aux_engine *engine,
-	struct dc_context *ctx);
-
-void dal_aux_engine_destruct(
-	struct aux_engine *engine);
-bool dal_aux_engine_submit_request(
-	struct engine *ptr,
-	struct i2caux_transaction_request *request,
-	bool middle_of_transaction);
-bool dal_aux_engine_acquire(
-	struct engine *ptr,
-	struct ddc *ddc);
-enum i2caux_engine_type dal_aux_engine_get_engine_type(
-	const struct engine *engine);
-
-#endif
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce100/i2caux_dce100.c b/drivers/gpu/drm/amd/display/dc/i2caux/dce100/i2caux_dce100.c
deleted file mode 100644
index 8b704ab0471c..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce100/i2caux_dce100.c
+++ /dev/null
@@ -1,106 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-
-#include "include/i2caux_interface.h"
-#include "../i2caux.h"
-#include "../engine.h"
-#include "../i2c_engine.h"
-#include "../i2c_sw_engine.h"
-#include "../i2c_hw_engine.h"
-
-#include "../dce110/aux_engine_dce110.h"
-#include "../dce110/i2c_hw_engine_dce110.h"
-#include "../dce110/i2caux_dce110.h"
-
-#include "dce/dce_10_0_d.h"
-#include "dce/dce_10_0_sh_mask.h"
-
-/* set register offset */
-#define SR(reg_name)\
-	.reg_name = mm ## reg_name
-
-/* set register offset with instance */
-#define SRI(reg_name, block, id)\
-	.reg_name = mm ## block ## id ## _ ## reg_name
-
-#define aux_regs(id)\
-[id] = {\
-	AUX_COMMON_REG_LIST(id), \
-	.AUX_RESET_MASK = 0 \
-}
-
-#define hw_engine_regs(id)\
-{\
-		I2C_HW_ENGINE_COMMON_REG_LIST(id) \
-}
-
-static const struct dce110_aux_registers dce100_aux_regs[] = {
-		aux_regs(0),
-		aux_regs(1),
-		aux_regs(2),
-		aux_regs(3),
-		aux_regs(4),
-		aux_regs(5),
-};
-
-static const struct dce110_i2c_hw_engine_registers dce100_hw_engine_regs[] = {
-		hw_engine_regs(1),
-		hw_engine_regs(2),
-		hw_engine_regs(3),
-		hw_engine_regs(4),
-		hw_engine_regs(5),
-		hw_engine_regs(6)
-};
-
-static const struct dce110_i2c_hw_engine_shift i2c_shift = {
-		I2C_COMMON_MASK_SH_LIST_DCE100(__SHIFT)
-};
-
-static const struct dce110_i2c_hw_engine_mask i2c_mask = {
-		I2C_COMMON_MASK_SH_LIST_DCE100(_MASK)
-};
-
-struct i2caux *dal_i2caux_dce100_create(
-	struct dc_context *ctx)
-{
-	struct i2caux_dce110 *i2caux_dce110 =
-		kzalloc(sizeof(struct i2caux_dce110), GFP_KERNEL);
-
-	if (!i2caux_dce110) {
-		ASSERT_CRITICAL(false);
-		return NULL;
-	}
-
-	dal_i2caux_dce110_construct(i2caux_dce110,
-				    ctx,
-				    ARRAY_SIZE(dce100_aux_regs),
-				    dce100_aux_regs,
-				    dce100_hw_engine_regs,
-				    &i2c_shift,
-				    &i2c_mask);
-	return &i2caux_dce110->base;
-}
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce110/aux_engine_dce110.c b/drivers/gpu/drm/amd/display/dc/i2caux/dce110/aux_engine_dce110.c
deleted file mode 100644
index 59c3ed43d609..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce110/aux_engine_dce110.c
+++ /dev/null
@@ -1,505 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-#include "dm_event_log.h"
-
-/*
- * Pre-requisites: headers required by header of this unit
- */
-#include "include/i2caux_interface.h"
-#include "../engine.h"
-#include "../aux_engine.h"
-
-/*
- * Header of this unit
- */
-
-#include "aux_engine_dce110.h"
-
-/*
- * Post-requisites: headers required by this unit
- */
-#include "dce/dce_11_0_sh_mask.h"
-
-#define CTX \
-	aux110->base.base.ctx
-#define REG(reg_name)\
-	(aux110->regs->reg_name)
-#include "reg_helper.h"
-
-/*
- * This unit
- */
-
-/*
- * @brief
- * Cast 'struct aux_engine *'
- * to 'struct aux_engine_dce110 *'
- */
-#define FROM_AUX_ENGINE(ptr) \
-	container_of((ptr), struct aux_engine_dce110, base)
-
-/*
- * @brief
- * Cast 'struct engine *'
- * to 'struct aux_engine_dce110 *'
- */
-#define FROM_ENGINE(ptr) \
-	FROM_AUX_ENGINE(container_of((ptr), struct aux_engine, base))
-
-static void release_engine(
-	struct engine *engine)
-{
-	struct aux_engine_dce110 *aux110 = FROM_ENGINE(engine);
-
-	REG_UPDATE(AUX_ARB_CONTROL, AUX_SW_DONE_USING_AUX_REG, 1);
-}
-
-static void destruct(
-	struct aux_engine_dce110 *engine);
-
-static void destroy(
-	struct aux_engine **aux_engine)
-{
-	struct aux_engine_dce110 *engine = FROM_AUX_ENGINE(*aux_engine);
-
-	destruct(engine);
-
-	kfree(engine);
-
-	*aux_engine = NULL;
-}
-
-#define SW_CAN_ACCESS_AUX 1
-#define DMCU_CAN_ACCESS_AUX 2
-
-static bool is_engine_available(
-	struct aux_engine *engine)
-{
-	struct aux_engine_dce110 *aux110 = FROM_AUX_ENGINE(engine);
-
-	uint32_t value = REG_READ(AUX_ARB_CONTROL);
-	uint32_t field = get_reg_field_value(
-			value,
-			AUX_ARB_CONTROL,
-			AUX_REG_RW_CNTL_STATUS);
-
-	return (field != DMCU_CAN_ACCESS_AUX);
-}
-
-static bool acquire_engine(
-	struct aux_engine *engine)
-{
-	struct aux_engine_dce110 *aux110 = FROM_AUX_ENGINE(engine);
-
-	uint32_t value = REG_READ(AUX_ARB_CONTROL);
-	uint32_t field = get_reg_field_value(
-			value,
-			AUX_ARB_CONTROL,
-			AUX_REG_RW_CNTL_STATUS);
-	if (field == DMCU_CAN_ACCESS_AUX)
-		return false;
-	/* enable AUX before request SW to access AUX */
-	value = REG_READ(AUX_CONTROL);
-	field = get_reg_field_value(value,
-				AUX_CONTROL,
-				AUX_EN);
-
-	if (field == 0) {
-		set_reg_field_value(
-				value,
-				1,
-				AUX_CONTROL,
-				AUX_EN);
-
-		if (REG(AUX_RESET_MASK)) {
-			/* reset the DP_AUX block as part of the enable sequence */
-			set_reg_field_value(
-				value,
-				1,
-				AUX_CONTROL,
-				AUX_RESET);
-		}
-
-		REG_WRITE(AUX_CONTROL, value);
-
-		if (REG(AUX_RESET_MASK)) {
-			/* poll HW to make sure the reset is done */
-
-			REG_WAIT(AUX_CONTROL, AUX_RESET_DONE, 1,
-					1, 11);
-
-			set_reg_field_value(
-				value,
-				0,
-				AUX_CONTROL,
-				AUX_RESET);
-
-			REG_WRITE(AUX_CONTROL, value);
-
-			REG_WAIT(AUX_CONTROL, AUX_RESET_DONE, 0,
-					1, 11);
-		}
-	} /* if (field == 0) */
-
-	/* request SW to access AUX */
-	REG_UPDATE(AUX_ARB_CONTROL, AUX_SW_USE_AUX_REG_REQ, 1);
-
-	value = REG_READ(AUX_ARB_CONTROL);
-	field = get_reg_field_value(
-			value,
-			AUX_ARB_CONTROL,
-			AUX_REG_RW_CNTL_STATUS);
-
-	return (field == SW_CAN_ACCESS_AUX);
-}
-
-#define COMPOSE_AUX_SW_DATA_16_20(command, address) \
-	((command) | ((0xF0000 & (address)) >> 16))
-
-#define COMPOSE_AUX_SW_DATA_8_15(address) \
-	((0xFF00 & (address)) >> 8)
-
-#define COMPOSE_AUX_SW_DATA_0_7(address) \
-	(0xFF & (address))
-
-static void submit_channel_request(
-	struct aux_engine *engine,
-	struct aux_request_transaction_data *request)
-{
-	struct aux_engine_dce110 *aux110 = FROM_AUX_ENGINE(engine);
-	uint32_t value;
-	uint32_t length;
-
-	bool is_write =
-		((request->type == AUX_TRANSACTION_TYPE_DP) &&
-		 (request->action == I2CAUX_TRANSACTION_ACTION_DP_WRITE)) ||
-		((request->type == AUX_TRANSACTION_TYPE_I2C) &&
-		((request->action == I2CAUX_TRANSACTION_ACTION_I2C_WRITE) ||
-		 (request->action == I2CAUX_TRANSACTION_ACTION_I2C_WRITE_MOT)));
-	if (REG(AUXN_IMPCAL)) {
-		/* clear_aux_error */
-		REG_UPDATE_SEQ(AUXN_IMPCAL, AUXN_CALOUT_ERROR_AK,
-				1,
-				0);
-
-		REG_UPDATE_SEQ(AUXP_IMPCAL, AUXP_CALOUT_ERROR_AK,
-				1,
-				0);
-
-		/* force_default_calibrate */
-		REG_UPDATE_1BY1_2(AUXN_IMPCAL,
-				AUXN_IMPCAL_ENABLE, 1,
-				AUXN_IMPCAL_OVERRIDE_ENABLE, 0);
-
-		/* bug? why does AUXN update EN and OVERRIDE_EN one by one while AUXP only toggles OVERRIDE? */
-
-		REG_UPDATE_SEQ(AUXP_IMPCAL, AUXP_IMPCAL_OVERRIDE_ENABLE,
-				1,
-				0);
-	}
-	/* set the delay and the number of bytes to write */
-
-	/* The length includes
-	 * the 4-bit header and the 20-bit address
-	 * (that is, 3 bytes).
-	 * If the requested length is non-zero,
-	 * an additional byte specifying the length is required. */
-
-	length = request->length ? 4 : 3;
-	if (is_write)
-		length += request->length;
-
-	REG_UPDATE_2(AUX_SW_CONTROL,
-			AUX_SW_START_DELAY, request->delay,
-			AUX_SW_WR_BYTES, length);
-
-	/* program action and address and payload data (if 'is_write') */
-	value = REG_UPDATE_4(AUX_SW_DATA,
-			AUX_SW_INDEX, 0,
-			AUX_SW_DATA_RW, 0,
-			AUX_SW_AUTOINCREMENT_DISABLE, 1,
-			AUX_SW_DATA, COMPOSE_AUX_SW_DATA_16_20(request->action, request->address));
-
-	value = REG_SET_2(AUX_SW_DATA, value,
-			AUX_SW_AUTOINCREMENT_DISABLE, 0,
-			AUX_SW_DATA, COMPOSE_AUX_SW_DATA_8_15(request->address));
-
-	value = REG_SET(AUX_SW_DATA, value,
-			AUX_SW_DATA, COMPOSE_AUX_SW_DATA_0_7(request->address));
-
-	if (request->length) {
-		value = REG_SET(AUX_SW_DATA, value,
-				AUX_SW_DATA, request->length - 1);
-	}
-
-	if (is_write) {
-		/* Load the HW buffer with the data to be sent.
-		 * This is only relevant for a write operation;
-		 * for a read, the received data will be
-		 * processed in process_channel_reply(). */
-		uint32_t i = 0;
-
-		while (i < request->length) {
-			value = REG_SET(AUX_SW_DATA, value,
-					AUX_SW_DATA, request->data[i]);
-
-			++i;
-		}
-	}
-
-	REG_UPDATE(AUX_INTERRUPT_CONTROL, AUX_SW_DONE_ACK, 1);
-	REG_WAIT(AUX_SW_STATUS, AUX_SW_DONE, 0,
-				10, aux110->timeout_period/10);
-	REG_UPDATE(AUX_SW_CONTROL, AUX_SW_GO, 1);
-	EVENT_LOG_AUX_REQ(engine->base.ddc->pin_data->en, EVENT_LOG_AUX_ORIGIN_NATIVE,
-					request->action, request->address, request->length, request->data);
-}
-
-static int read_channel_reply(struct aux_engine *engine, uint32_t size,
-			      uint8_t *buffer, uint8_t *reply_result,
-			      uint32_t *sw_status)
-{
-	struct aux_engine_dce110 *aux110 = FROM_AUX_ENGINE(engine);
-	uint32_t bytes_replied;
-	uint32_t reply_result_32;
-
-	*sw_status = REG_GET(AUX_SW_STATUS, AUX_SW_REPLY_BYTE_COUNT,
-			     &bytes_replied);
-
-	/* In case HPD is LOW, exit AUX transaction */
-	if ((*sw_status & AUX_SW_STATUS__AUX_SW_HPD_DISCON_MASK))
-		return -1;
-
-	/* Need at least the status byte */
-	if (!bytes_replied)
-		return -1;
-
-	REG_UPDATE_1BY1_3(AUX_SW_DATA,
-			  AUX_SW_INDEX, 0,
-			  AUX_SW_AUTOINCREMENT_DISABLE, 1,
-			  AUX_SW_DATA_RW, 1);
-
-	REG_GET(AUX_SW_DATA, AUX_SW_DATA, &reply_result_32);
-	reply_result_32 = reply_result_32 >> 4;
-	*reply_result = (uint8_t)reply_result_32;
-
-	if (reply_result_32 == 0) { /* ACK */
-		uint32_t i = 0;
-
-		/* First byte was already used to get the command status */
-		--bytes_replied;
-
-		/* Do not overflow buffer */
-		if (bytes_replied > size)
-			return -1;
-
-		while (i < bytes_replied) {
-			uint32_t aux_sw_data_val;
-
-			REG_GET(AUX_SW_DATA, AUX_SW_DATA, &aux_sw_data_val);
-			buffer[i] = aux_sw_data_val;
-			++i;
-		}
-
-		return i;
-	}
-
-	return 0;
-}
-
-static void process_channel_reply(
-	struct aux_engine *engine,
-	struct aux_reply_transaction_data *reply)
-{
-	int bytes_replied;
-	uint8_t reply_result;
-	uint32_t sw_status;
-
-	bytes_replied = read_channel_reply(engine, reply->length, reply->data,
-						&reply_result, &sw_status);
-	EVENT_LOG_AUX_REP(engine->base.ddc->pin_data->en,
-					EVENT_LOG_AUX_ORIGIN_NATIVE, reply_result,
-					bytes_replied, reply->data);
-
-	/* in case HPD is LOW, exit AUX transaction */
-	if ((sw_status & AUX_SW_STATUS__AUX_SW_HPD_DISCON_MASK)) {
-		reply->status = AUX_TRANSACTION_REPLY_HPD_DISCON;
-		return;
-	}
-
-	if (bytes_replied < 0) {
-		/* Need to handle an error case...
-		 * Ideally, the upper layer won't call this function when
-		 * the number of bytes in the reply was 0, because an error
-		 * must already have been asserted; outside of the hot-plug
-		 * case handled above, this should not happen.
-		 */
-		if (!(sw_status & AUX_SW_STATUS__AUX_SW_HPD_DISCON_MASK)) {
-			reply->status = AUX_TRANSACTION_REPLY_INVALID;
-			ASSERT_CRITICAL(false);
-			return;
-		}
-	} else {
-
-		switch (reply_result) {
-		case 0: /* ACK */
-			reply->status = AUX_TRANSACTION_REPLY_AUX_ACK;
-		break;
-		case 1: /* NACK */
-			reply->status = AUX_TRANSACTION_REPLY_AUX_NACK;
-		break;
-		case 2: /* DEFER */
-			reply->status = AUX_TRANSACTION_REPLY_AUX_DEFER;
-		break;
-		case 4: /* AUX ACK / I2C NACK */
-			reply->status = AUX_TRANSACTION_REPLY_I2C_NACK;
-		break;
-		case 8: /* AUX ACK / I2C DEFER */
-			reply->status = AUX_TRANSACTION_REPLY_I2C_DEFER;
-		break;
-		default:
-			reply->status = AUX_TRANSACTION_REPLY_INVALID;
-		}
-	}
-}
-
-static enum aux_channel_operation_result get_channel_status(
-	struct aux_engine *engine,
-	uint8_t *returned_bytes)
-{
-	struct aux_engine_dce110 *aux110 = FROM_AUX_ENGINE(engine);
-
-	uint32_t value;
-
-	if (returned_bytes == NULL) {
-		/*caller pass NULL pointer*/
-		ASSERT_CRITICAL(false);
-		return AUX_CHANNEL_OPERATION_FAILED_REASON_UNKNOWN;
-	}
-	*returned_bytes = 0;
-
-	/* poll to make sure that SW_DONE is asserted */
-	value = REG_WAIT(AUX_SW_STATUS, AUX_SW_DONE, 1,
-				10, aux110->timeout_period/10);
-
-	/* in case HPD is LOW, exit AUX transaction */
-	if ((value & AUX_SW_STATUS__AUX_SW_HPD_DISCON_MASK))
-		return AUX_CHANNEL_OPERATION_FAILED_HPD_DISCON;
-
-	/* Note that the following bits are set in 'status.bits'
-	 * during CTS 4.2.1.2 (FW 3.3.1):
-	 * AUX_SW_RX_MIN_COUNT_VIOL, AUX_SW_RX_INVALID_STOP,
-	 * AUX_SW_RX_RECV_NO_DET, AUX_SW_RX_RECV_INVALID_H.
-	 *
-	 * AUX_SW_RX_MIN_COUNT_VIOL is an internal,
-	 * HW debugging bit and should be ignored. */
-	if (value & AUX_SW_STATUS__AUX_SW_DONE_MASK) {
-		if ((value & AUX_SW_STATUS__AUX_SW_RX_TIMEOUT_STATE_MASK) ||
-			(value & AUX_SW_STATUS__AUX_SW_RX_TIMEOUT_MASK))
-			return AUX_CHANNEL_OPERATION_FAILED_TIMEOUT;
-
-		else if ((value & AUX_SW_STATUS__AUX_SW_RX_INVALID_STOP_MASK) ||
-			(value & AUX_SW_STATUS__AUX_SW_RX_RECV_NO_DET_MASK) ||
-			(value &
-				AUX_SW_STATUS__AUX_SW_RX_RECV_INVALID_H_MASK) ||
-			(value & AUX_SW_STATUS__AUX_SW_RX_RECV_INVALID_L_MASK))
-			return AUX_CHANNEL_OPERATION_FAILED_INVALID_REPLY;
-
-		*returned_bytes = get_reg_field_value(value,
-				AUX_SW_STATUS,
-				AUX_SW_REPLY_BYTE_COUNT);
-
-		if (*returned_bytes == 0)
-			return
-			AUX_CHANNEL_OPERATION_FAILED_INVALID_REPLY;
-		else {
-			*returned_bytes -= 1;
-			return AUX_CHANNEL_OPERATION_SUCCEEDED;
-		}
-	} else {
-		/* time_elapsed >= aux_engine->timeout_period;
-		 * AUX_SW_STATUS__AUX_SW_HPD_DISCON was already ruled out
-		 * at this point
-		 */
-		ASSERT_CRITICAL(false);
-		return AUX_CHANNEL_OPERATION_FAILED_TIMEOUT;
-	}
-}
-
-static const struct aux_engine_funcs aux_engine_funcs = {
-	.destroy = destroy,
-	.acquire_engine = acquire_engine,
-	.submit_channel_request = submit_channel_request,
-	.process_channel_reply = process_channel_reply,
-	.read_channel_reply = read_channel_reply,
-	.get_channel_status = get_channel_status,
-	.is_engine_available = is_engine_available,
-};
-
-static const struct engine_funcs engine_funcs = {
-	.release_engine = release_engine,
-	.submit_request = dal_aux_engine_submit_request,
-	.get_engine_type = dal_aux_engine_get_engine_type,
-	.acquire = dal_aux_engine_acquire,
-};
-
-static void construct(
-	struct aux_engine_dce110 *engine,
-	const struct aux_engine_dce110_init_data *aux_init_data)
-{
-	dal_aux_engine_construct(&engine->base, aux_init_data->ctx);
-	engine->base.base.funcs = &engine_funcs;
-	engine->base.funcs = &aux_engine_funcs;
-
-	engine->timeout_period = aux_init_data->timeout_period;
-	engine->regs = aux_init_data->regs;
-}
-
-static void destruct(
-	struct aux_engine_dce110 *engine)
-{
-	dal_aux_engine_destruct(&engine->base);
-}
-
-struct aux_engine *dal_aux_engine_dce110_create(
-	const struct aux_engine_dce110_init_data *aux_init_data)
-{
-	struct aux_engine_dce110 *engine;
-
-	if (!aux_init_data) {
-		ASSERT_CRITICAL(false);
-		return NULL;
-	}
-
-	engine = kzalloc(sizeof(*engine), GFP_KERNEL);
-
-	if (!engine) {
-		ASSERT_CRITICAL(false);
-		return NULL;
-	}
-
-	construct(engine, aux_init_data);
-	return &engine->base;
-}
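
The COMPOSE_AUX_SW_DATA_* macros and the length calculation in submit_channel_request() together define the native AUX framing: a 4-bit command nibble plus a 20-bit address packed into three bytes, with a fourth byte carrying length - 1 when a payload length is requested. A minimal standalone sketch of that framing follows; the helper name, the 0x90 action value, and the printf harness are illustrative assumptions, not driver API.

#include <stdint.h>
#include <stdio.h>

/* Mirrors COMPOSE_AUX_SW_DATA_16_20/_8_15/_0_7: byte 0 holds the
 * command bits plus address bits 19:16, bytes 1-2 hold address
 * bits 15:8 and 7:0, and an optional byte 3 holds length - 1. */
static int compose_aux_header(uint8_t action, uint32_t address,
			      uint32_t length, uint8_t out[4])
{
	out[0] = action | ((0xF0000 & address) >> 16);
	out[1] = (0xFF00 & address) >> 8;
	out[2] = 0xFF & address;
	if (length) {			/* extra length byte required */
		out[3] = length - 1;
		return 4;
	}
	return 3;			/* header + address only */
}

int main(void)
{
	uint8_t hdr[4];
	/* 0x90 stands in for a native-read action; real values come
	 * from the I2CAUX_TRANSACTION_ACTION_* encoding. */
	int n = compose_aux_header(0x90, 0x00100, 16, hdr);

	for (int i = 0; i < n; ++i)
		printf("%02x ", hdr[i]);
	printf("\n");			/* prints: 90 01 00 0f */
	return 0;
}
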
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce110/aux_engine_dce110.h b/drivers/gpu/drm/amd/display/dc/i2caux/dce110/aux_engine_dce110.h
deleted file mode 100644
index 85ee82162590..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce110/aux_engine_dce110.h
+++ /dev/null
@@ -1,78 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#ifndef __DAL_AUX_ENGINE_DCE110_H__
-#define __DAL_AUX_ENGINE_DCE110_H__
-
-#include "../aux_engine.h"
-
-#define AUX_COMMON_REG_LIST(id)\
-	SRI(AUX_CONTROL, DP_AUX, id), \
-	SRI(AUX_ARB_CONTROL, DP_AUX, id), \
-	SRI(AUX_SW_DATA, DP_AUX, id), \
-	SRI(AUX_SW_CONTROL, DP_AUX, id), \
-	SRI(AUX_INTERRUPT_CONTROL, DP_AUX, id), \
-	SRI(AUX_SW_STATUS, DP_AUX, id), \
-	SR(AUXN_IMPCAL), \
-	SR(AUXP_IMPCAL)
-
-struct dce110_aux_registers {
-	uint32_t AUX_CONTROL;
-	uint32_t AUX_ARB_CONTROL;
-	uint32_t AUX_SW_DATA;
-	uint32_t AUX_SW_CONTROL;
-	uint32_t AUX_INTERRUPT_CONTROL;
-	uint32_t AUX_SW_STATUS;
-	uint32_t AUXN_IMPCAL;
-	uint32_t AUXP_IMPCAL;
-
-	uint32_t AUX_RESET_MASK;
-};
-
-struct aux_engine_dce110 {
-	struct aux_engine base;
-	const struct dce110_aux_registers *regs;
-	struct {
-		uint32_t aux_control;
-		uint32_t aux_arb_control;
-		uint32_t aux_sw_data;
-		uint32_t aux_sw_control;
-		uint32_t aux_interrupt_control;
-		uint32_t aux_sw_status;
-	} addr;
-	uint32_t timeout_period;
-};
-
-struct aux_engine_dce110_init_data {
-	uint32_t engine_id;
-	uint32_t timeout_period;
-	struct dc_context *ctx;
-	const struct dce110_aux_registers *regs;
-};
-
-struct aux_engine *dal_aux_engine_dce110_create(
-	const struct aux_engine_dce110_init_data *aux_init_data);
-
-#endif
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2c_hw_engine_dce110.c b/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2c_hw_engine_dce110.c
deleted file mode 100644
index 9cbe1a7a6bcb..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2c_hw_engine_dce110.c
+++ /dev/null
@@ -1,574 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-#include "include/logger_interface.h"
-/*
- * Pre-requisites: headers required by header of this unit
- */
-
-#include "include/i2caux_interface.h"
-#include "../engine.h"
-#include "../i2c_engine.h"
-#include "../i2c_hw_engine.h"
-#include "../i2c_generic_hw_engine.h"
-/*
- * Header of this unit
- */
-
-#include "i2c_hw_engine_dce110.h"
-
-/*
- * Post-requisites: headers required by this unit
- */
-#include "reg_helper.h"
-
-/*
- * This unit
- */
-#define DC_LOGGER \
-		hw_engine->base.base.base.ctx->logger
-
-enum dc_i2c_status {
-	DC_I2C_STATUS__DC_I2C_STATUS_IDLE,
-	DC_I2C_STATUS__DC_I2C_STATUS_USED_BY_SW,
-	DC_I2C_STATUS__DC_I2C_STATUS_USED_BY_HW
-};
-
-enum dc_i2c_arbitration {
-	DC_I2C_ARBITRATION__DC_I2C_SW_PRIORITY_NORMAL,
-	DC_I2C_ARBITRATION__DC_I2C_SW_PRIORITY_HIGH
-};
-
-/*
- * @brief
- * Cast pointer to 'struct i2c_hw_engine *'
- * to pointer 'struct i2c_hw_engine_dce110 *'
- */
-#define FROM_I2C_HW_ENGINE(ptr) \
-	container_of((ptr), struct i2c_hw_engine_dce110, base)
-/*
- * @brief
- * Cast pointer to 'struct i2c_engine *'
- * to pointer to 'struct i2c_hw_engine_dce110 *'
- */
-#define FROM_I2C_ENGINE(ptr) \
-	FROM_I2C_HW_ENGINE(container_of((ptr), struct i2c_hw_engine, base))
-
-/*
- * @brief
- * Cast pointer to 'struct engine *'
- * to pointer to 'struct i2c_hw_engine_dce110 *'
- */
-#define FROM_ENGINE(ptr) \
-	FROM_I2C_ENGINE(container_of((ptr), struct i2c_engine, base))
-
-#define CTX \
-		hw_engine->base.base.base.ctx
-
-#define REG(reg_name)\
-	(hw_engine->regs->reg_name)
-
-#undef FN
-#define FN(reg_name, field_name) \
-	hw_engine->i2c_shift->field_name, hw_engine->i2c_mask->field_name
-
-#include "reg_helper.h"
-
-static void disable_i2c_hw_engine(
-	struct i2c_hw_engine_dce110 *hw_engine)
-{
-	REG_UPDATE_N(SETUP, 1, FN(DC_I2C_DDC1_SETUP, DC_I2C_DDC1_ENABLE), 0);
-}
-
-static void release_engine(
-	struct engine *engine)
-{
-	struct i2c_hw_engine_dce110 *hw_engine = FROM_ENGINE(engine);
-
-	struct i2c_engine *base = NULL;
-	bool safe_to_reset;
-
-	base = &hw_engine->base.base;
-
-	/* Restore original HW engine speed */
-
-	base->funcs->set_speed(base, hw_engine->base.original_speed);
-
-	/* Release I2C */
-	REG_UPDATE(DC_I2C_ARBITRATION, DC_I2C_SW_DONE_USING_I2C_REG, 1);
-
-	/* Reset HW engine */
-	{
-		uint32_t i2c_sw_status = 0;
-		REG_GET(DC_I2C_SW_STATUS, DC_I2C_SW_STATUS, &i2c_sw_status);
-		/* if used by SW, safe to reset */
-		safe_to_reset = (i2c_sw_status == 1);
-	}
-
-	if (safe_to_reset)
-		REG_UPDATE_2(
-			DC_I2C_CONTROL,
-			DC_I2C_SOFT_RESET, 1,
-			DC_I2C_SW_STATUS_RESET, 1);
-	else
-		REG_UPDATE(DC_I2C_CONTROL, DC_I2C_SW_STATUS_RESET, 1);
-
-	/* HW I2c engine - clock gating feature */
-	if (!hw_engine->engine_keep_power_up_count)
-		disable_i2c_hw_engine(hw_engine);
-}
-
-static bool setup_engine(
-	struct i2c_engine *i2c_engine)
-{
-	struct i2c_hw_engine_dce110 *hw_engine = FROM_I2C_ENGINE(i2c_engine);
-	uint32_t i2c_setup_limit = I2C_SETUP_TIME_LIMIT_DCE;
-	uint32_t reset_length = 0;
-
-	if (hw_engine->base.base.setup_limit != 0)
-		i2c_setup_limit = hw_engine->base.base.setup_limit;
-
-	/* Program pin select */
-	REG_UPDATE_6(
-			DC_I2C_CONTROL,
-			DC_I2C_GO, 0,
-			DC_I2C_SOFT_RESET, 0,
-			DC_I2C_SEND_RESET, 0,
-			DC_I2C_SW_STATUS_RESET, 1,
-			DC_I2C_TRANSACTION_COUNT, 0,
-			DC_I2C_DDC_SELECT, hw_engine->engine_id);
-
-	/* Program time limit */
-	if (hw_engine->base.base.send_reset_length == 0) {
-		/*pre-dcn*/
-		REG_UPDATE_N(
-				SETUP, 2,
-				FN(DC_I2C_DDC1_SETUP, DC_I2C_DDC1_TIME_LIMIT), i2c_setup_limit,
-				FN(DC_I2C_DDC1_SETUP, DC_I2C_DDC1_ENABLE), 1);
-	} else {
-		reset_length = hw_engine->base.base.send_reset_length;
-	}
-	/* Program HW priority:
-	 * set to High - interrupt software I2C at any time,
-	 * enable restart of SW I2C that was interrupted by HW,
-	 * disable queuing of software while I2C is in use by HW */
-	REG_UPDATE_2(
-			DC_I2C_ARBITRATION,
-			DC_I2C_NO_QUEUED_SW_GO, 0,
-			DC_I2C_SW_PRIORITY, DC_I2C_ARBITRATION__DC_I2C_SW_PRIORITY_NORMAL);
-
-	return true;
-}
-
-static uint32_t get_speed(
-	const struct i2c_engine *i2c_engine)
-{
-	const struct i2c_hw_engine_dce110 *hw_engine = FROM_I2C_ENGINE(i2c_engine);
-	uint32_t pre_scale = 0;
-
-	REG_GET(SPEED, DC_I2C_DDC1_PRESCALE, &pre_scale);
-
-	/* [anaumov] it seems the following is unnecessary */
-	/*ASSERT(value.bits.DC_I2C_DDC1_PRESCALE);*/
-	return pre_scale ?
-		hw_engine->reference_frequency / pre_scale :
-		hw_engine->base.default_speed;
-}
-
-static void set_speed(
-	struct i2c_engine *i2c_engine,
-	uint32_t speed)
-{
-	struct i2c_hw_engine_dce110 *hw_engine = FROM_I2C_ENGINE(i2c_engine);
-
-	if (speed) {
-		if (hw_engine->i2c_mask->DC_I2C_DDC1_START_STOP_TIMING_CNTL)
-			REG_UPDATE_N(
-				SPEED, 3,
-				FN(DC_I2C_DDC1_SPEED, DC_I2C_DDC1_PRESCALE), hw_engine->reference_frequency / speed,
-				FN(DC_I2C_DDC1_SPEED, DC_I2C_DDC1_THRESHOLD), 2,
-				FN(DC_I2C_DDC1_SPEED, DC_I2C_DDC1_START_STOP_TIMING_CNTL), speed > 50 ? 2:1);
-		else
-			REG_UPDATE_N(
-				SPEED, 2,
-				FN(DC_I2C_DDC1_SPEED, DC_I2C_DDC1_PRESCALE), hw_engine->reference_frequency / speed,
-				FN(DC_I2C_DDC1_SPEED, DC_I2C_DDC1_THRESHOLD), 2);
-	}
-}
-
-static inline void reset_hw_engine(struct engine *engine)
-{
-	struct i2c_hw_engine_dce110 *hw_engine = FROM_ENGINE(engine);
-
-	REG_UPDATE_2(
-			DC_I2C_CONTROL,
-			DC_I2C_SOFT_RESET, 1,
-			DC_I2C_SW_STATUS_RESET, 1);
-}
-
-static bool is_hw_busy(struct engine *engine)
-{
-	struct i2c_hw_engine_dce110 *hw_engine = FROM_ENGINE(engine);
-	uint32_t i2c_sw_status = 0;
-
-	REG_GET(DC_I2C_SW_STATUS, DC_I2C_SW_STATUS, &i2c_sw_status);
-	if (i2c_sw_status == DC_I2C_STATUS__DC_I2C_STATUS_IDLE)
-		return false;
-
-	reset_hw_engine(engine);
-
-	REG_GET(DC_I2C_SW_STATUS, DC_I2C_SW_STATUS, &i2c_sw_status);
-	return i2c_sw_status != DC_I2C_STATUS__DC_I2C_STATUS_IDLE;
-}
-
-#define STOP_TRANS_PREDICAT \
-		((hw_engine->transaction_count == 3) ||	\
-				(request->action == I2CAUX_TRANSACTION_ACTION_I2C_WRITE) ||	\
-				(request->action & I2CAUX_TRANSACTION_ACTION_I2C_READ))
-
-#define SET_I2C_TRANSACTION(id)	\
-		do {	\
-			REG_UPDATE_N(DC_I2C_TRANSACTION##id, 5,	\
-				FN(DC_I2C_TRANSACTION0, DC_I2C_STOP_ON_NACK0), 1,	\
-				FN(DC_I2C_TRANSACTION0, DC_I2C_START0), 1,	\
-				FN(DC_I2C_TRANSACTION0, DC_I2C_STOP0), STOP_TRANS_PREDICAT ? 1:0,	\
-				FN(DC_I2C_TRANSACTION0, DC_I2C_RW0), (0 != (request->action & I2CAUX_TRANSACTION_ACTION_I2C_READ)),	\
-				FN(DC_I2C_TRANSACTION0, DC_I2C_COUNT0), length);	\
-				if (STOP_TRANS_PREDICAT)	\
-					last_transaction = true;	\
-		} while (false)
-
-static bool process_transaction(
-	struct i2c_hw_engine_dce110 *hw_engine,
-	struct i2c_request_transaction_data *request)
-{
-	uint32_t length = request->length;
-	uint8_t *buffer = request->data;
-	uint32_t value = 0;
-
-	bool last_transaction = false;
-
-	struct dc_context *ctx = NULL;
-
-	ctx = hw_engine->base.base.base.ctx;
-
-	switch (hw_engine->transaction_count) {
-	case 0:
-		SET_I2C_TRANSACTION(0);
-		break;
-	case 1:
-		SET_I2C_TRANSACTION(1);
-		break;
-	case 2:
-		SET_I2C_TRANSACTION(2);
-		break;
-	case 3:
-		SET_I2C_TRANSACTION(3);
-		break;
-	default:
-		/* TODO Warning ? */
-		break;
-	}
-
-	/* Write the I2C address and I2C data
-	 * into the hardware circular buffer, one byte per entry.
-	 * As an example, the 7-bit I2C slave address of a CRT monitor
-	 * for reading DDC/EDID information is 0b1010001.
-	 * For an I2C send operation, the LSB must be programmed to 0;
-	 * for an I2C receive operation, the LSB must be programmed to 1. */
-	if (hw_engine->transaction_count == 0) {
-		value = REG_SET_4(DC_I2C_DATA, 0,
-				  DC_I2C_DATA_RW, false,
-				  DC_I2C_DATA, request->address,
-				  DC_I2C_INDEX, 0,
-				  DC_I2C_INDEX_WRITE, 1);
-		hw_engine->buffer_used_write = 0;
-	} else
-		value = REG_SET_2(DC_I2C_DATA, 0,
-				  DC_I2C_DATA_RW, false,
-				  DC_I2C_DATA, request->address);
-
-	hw_engine->buffer_used_write++;
-
-	if (!(request->action & I2CAUX_TRANSACTION_ACTION_I2C_READ)) {
-		while (length) {
-			REG_SET_2(DC_I2C_DATA, value,
-					DC_I2C_INDEX_WRITE, 0,
-					DC_I2C_DATA, *buffer++);
-			hw_engine->buffer_used_write++;
-			--length;
-		}
-	}
-
-	++hw_engine->transaction_count;
-	hw_engine->buffer_used_bytes += length + 1;
-
-	return last_transaction;
-}
-
-static void execute_transaction(
-	struct i2c_hw_engine_dce110 *hw_engine)
-{
-	REG_UPDATE_N(SETUP, 5,
-		FN(DC_I2C_DDC1_SETUP, DC_I2C_DDC1_DATA_DRIVE_EN), 0,
-		FN(DC_I2C_DDC1_SETUP, DC_I2C_DDC1_CLK_DRIVE_EN), 0,
-		FN(DC_I2C_DDC1_SETUP, DC_I2C_DDC1_DATA_DRIVE_SEL), 0,
-		FN(DC_I2C_DDC1_SETUP, DC_I2C_DDC1_INTRA_TRANSACTION_DELAY), 0,
-		FN(DC_I2C_DDC1_SETUP, DC_I2C_DDC1_INTRA_BYTE_DELAY), 0);
-
-	REG_UPDATE_5(DC_I2C_CONTROL,
-		DC_I2C_SOFT_RESET, 0,
-		DC_I2C_SW_STATUS_RESET, 0,
-		DC_I2C_SEND_RESET, 0,
-		DC_I2C_GO, 0,
-		DC_I2C_TRANSACTION_COUNT, hw_engine->transaction_count - 1);
-
-	/* start I2C transfer */
-	REG_UPDATE(DC_I2C_CONTROL, DC_I2C_GO, 1);
-
-	/* all transactions were executed and the HW buffer became empty
-	 * (even though this actually happens only once status becomes DONE) */
-	hw_engine->transaction_count = 0;
-	hw_engine->buffer_used_bytes = 0;
-}
-
-static void submit_channel_request(
-	struct i2c_engine *engine,
-	struct i2c_request_transaction_data *request)
-{
-	request->status = I2C_CHANNEL_OPERATION_SUCCEEDED;
-
-	if (!process_transaction(FROM_I2C_ENGINE(engine), request))
-		return;
-
-	if (is_hw_busy(&engine->base)) {
-		request->status = I2C_CHANNEL_OPERATION_ENGINE_BUSY;
-		return;
-	}
-
-	execute_transaction(FROM_I2C_ENGINE(engine));
-}
-
-static void process_channel_reply(
-	struct i2c_engine *engine,
-	struct i2c_reply_transaction_data *reply)
-{
-	uint32_t length = reply->length;
-	uint8_t *buffer = reply->data;
-
-	struct i2c_hw_engine_dce110 *hw_engine =
-		FROM_I2C_ENGINE(engine);
-
-	REG_SET_3(DC_I2C_DATA, 0,
-			DC_I2C_INDEX, hw_engine->buffer_used_write,
-			DC_I2C_DATA_RW, 1,
-			DC_I2C_INDEX_WRITE, 1);
-
-	while (length) {
-		/* after reading the status,
-		 * if the I2C operation executed successfully
-		 * (i.e. DC_I2C_STATUS_DONE = 1) then the I2C controller
-		 * should read the data bytes from the I2C circular data buffer */
-
-		uint32_t i2c_data;
-
-		REG_GET(DC_I2C_DATA, DC_I2C_DATA, &i2c_data);
-		*buffer++ = i2c_data;
-
-		--length;
-	}
-}
-
-static enum i2c_channel_operation_result get_channel_status(
-	struct i2c_engine *i2c_engine,
-	uint8_t *returned_bytes)
-{
-	uint32_t i2c_sw_status = 0;
-	struct i2c_hw_engine_dce110 *hw_engine = FROM_I2C_ENGINE(i2c_engine);
-	uint32_t value =
-			REG_GET(DC_I2C_SW_STATUS, DC_I2C_SW_STATUS, &i2c_sw_status);
-
-	if (i2c_sw_status == DC_I2C_STATUS__DC_I2C_STATUS_USED_BY_SW)
-		return I2C_CHANNEL_OPERATION_ENGINE_BUSY;
-	else if (value & hw_engine->i2c_mask->DC_I2C_SW_STOPPED_ON_NACK)
-		return I2C_CHANNEL_OPERATION_NO_RESPONSE;
-	else if (value & hw_engine->i2c_mask->DC_I2C_SW_TIMEOUT)
-		return I2C_CHANNEL_OPERATION_TIMEOUT;
-	else if (value & hw_engine->i2c_mask->DC_I2C_SW_ABORTED)
-		return I2C_CHANNEL_OPERATION_FAILED;
-	else if (value & hw_engine->i2c_mask->DC_I2C_SW_DONE)
-		return I2C_CHANNEL_OPERATION_SUCCEEDED;
-
-	/*
-	 * this is the case when HW is used for communication;
-	 * I2C_SW_STATUS could be zero
-	 */
-	return I2C_CHANNEL_OPERATION_SUCCEEDED;
-}
-
-static uint32_t get_hw_buffer_available_size(
-	const struct i2c_hw_engine *engine)
-{
-	return I2C_HW_BUFFER_SIZE -
-		FROM_I2C_HW_ENGINE(engine)->buffer_used_bytes;
-}
-
-static uint32_t get_transaction_timeout(
-	const struct i2c_hw_engine *engine,
-	uint32_t length)
-{
-	uint32_t speed = engine->base.funcs->get_speed(&engine->base);
-
-	uint32_t period_timeout;
-	uint32_t num_of_clock_stretches;
-
-	if (!speed)
-		return 0;
-
-	period_timeout = (1000 * TRANSACTION_TIMEOUT_IN_I2C_CLOCKS) / speed;
-
-	num_of_clock_stretches = 1 + (length << 3) + 1;
-	num_of_clock_stretches +=
-		(FROM_I2C_HW_ENGINE(engine)->buffer_used_bytes << 3) +
-		(FROM_I2C_HW_ENGINE(engine)->transaction_count << 1);
-
-	return period_timeout * num_of_clock_stretches;
-}
-
-static void destroy(
-	struct i2c_engine **i2c_engine)
-{
-	struct i2c_hw_engine_dce110 *engine_dce110 =
-			FROM_I2C_ENGINE(*i2c_engine);
-
-	dal_i2c_hw_engine_destruct(&engine_dce110->base);
-
-	kfree(engine_dce110);
-
-	*i2c_engine = NULL;
-}
-
-static const struct i2c_engine_funcs i2c_engine_funcs = {
-	.destroy = destroy,
-	.get_speed = get_speed,
-	.set_speed = set_speed,
-	.setup_engine = setup_engine,
-	.submit_channel_request = submit_channel_request,
-	.process_channel_reply = process_channel_reply,
-	.get_channel_status = get_channel_status,
-	.acquire_engine = dal_i2c_hw_engine_acquire_engine,
-};
-
-static const struct engine_funcs engine_funcs = {
-	.release_engine = release_engine,
-	.get_engine_type = dal_i2c_hw_engine_get_engine_type,
-	.acquire = dal_i2c_engine_acquire,
-	.submit_request = dal_i2c_hw_engine_submit_request,
-};
-
-static const struct i2c_hw_engine_funcs i2c_hw_engine_funcs = {
-	.get_hw_buffer_available_size = get_hw_buffer_available_size,
-	.get_transaction_timeout = get_transaction_timeout,
-	.wait_on_operation_result = dal_i2c_hw_engine_wait_on_operation_result,
-};
-
-static void construct(
-	struct i2c_hw_engine_dce110 *hw_engine,
-	const struct i2c_hw_engine_dce110_create_arg *arg)
-{
-	uint32_t xtal_ref_div = 0;
-
-	dal_i2c_hw_engine_construct(&hw_engine->base, arg->ctx);
-
-	hw_engine->base.base.base.funcs = &engine_funcs;
-	hw_engine->base.base.funcs = &i2c_engine_funcs;
-	hw_engine->base.funcs = &i2c_hw_engine_funcs;
-	hw_engine->base.default_speed = arg->default_speed;
-
-	hw_engine->regs = arg->regs;
-	hw_engine->i2c_shift = arg->i2c_shift;
-	hw_engine->i2c_mask = arg->i2c_mask;
-
-	hw_engine->engine_id = arg->engine_id;
-
-	hw_engine->buffer_used_bytes = 0;
-	hw_engine->transaction_count = 0;
-	hw_engine->engine_keep_power_up_count = 1;
-
-	REG_GET(MICROSECOND_TIME_BASE_DIV, XTAL_REF_DIV, &xtal_ref_div);
-
-	if (xtal_ref_div == 0) {
-		DC_LOG_WARNING("Invalid base timer divider [%s]\n",
-				__func__);
-		xtal_ref_div = 2;
-	}
-
-	/* Calculate the reference clock by dividing the original frequency
-	 * by XTAL_REF_DIV.
-	 * At the upper level, uint32_t reference_frequency =
-	 *  dal_i2caux_get_reference_clock(as) >> 1,
-	 *  i.e. already divided by 2, so we need to multiply by 2 to get
-	 *  the original reference clock from ppll_info
-	 */
-	hw_engine->reference_frequency =
-		(arg->reference_frequency * 2) / xtal_ref_div;
-}
-
-struct i2c_engine *dal_i2c_hw_engine_dce110_create(
-	const struct i2c_hw_engine_dce110_create_arg *arg)
-{
-	struct i2c_hw_engine_dce110 *engine_dce10;
-
-	if (!arg) {
-		ASSERT_CRITICAL(false);
-		return NULL;
-	}
-	if (!arg->reference_frequency) {
-		ASSERT_CRITICAL(false);
-		return NULL;
-	}
-
-	engine_dce10 = kzalloc(sizeof(struct i2c_hw_engine_dce110),
-			       GFP_KERNEL);
-
-	if (!engine_dce10) {
-		ASSERT_CRITICAL(false);
-		return NULL;
-	}
-
-	construct(engine_dce10, arg);
-	return &engine_dce10->base.base;
-}
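
Two derived quantities in this file reward a worked example: construct() doubles the half-rate reference clock and divides by XTAL_REF_DIV, set_speed() turns that into the SPEED prescaler, and get_transaction_timeout() budgets one clock per start/stop bit plus eight per data byte. The numbers below (48 MHz crystal, divider of 2, 100 kHz bus) are illustrative assumptions, not values the driver guarantees.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* construct(): recover the full reference clock from the
	 * half-rate value handed down by the upper layer. */
	uint32_t half_ref_khz = 24000;	/* assumed: 48 MHz xtal / 2 */
	uint32_t xtal_ref_div = 2;	/* assumed divider */
	uint32_t ref_khz = (half_ref_khz * 2) / xtal_ref_div;

	/* set_speed(): prescale value for a 100 kHz bus. */
	uint32_t speed_khz = 100;
	uint32_t prescale = ref_khz / speed_khz;

	/* get_transaction_timeout() clock budget for one 8-byte
	 * transfer with an empty HW buffer:
	 * start + 8 bits per byte + stop. */
	uint32_t length = 8;
	uint32_t clocks = 1 + (length << 3) + 1;

	/* prints: ref=24000 kHz prescale=240 clocks=66 */
	printf("ref=%u kHz prescale=%u clocks=%u\n",
	       ref_khz, prescale, clocks);
	return 0;
}
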
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2c_hw_engine_dce110.h b/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2c_hw_engine_dce110.h
deleted file mode 100644
index fea2946906ed..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2c_hw_engine_dce110.h
+++ /dev/null
@@ -1,218 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#ifndef __DAL_I2C_HW_ENGINE_DCE110_H__
-#define __DAL_I2C_HW_ENGINE_DCE110_H__
-
-#define I2C_HW_ENGINE_COMMON_REG_LIST(id)\
-	SRI(SETUP, DC_I2C_DDC, id),\
-	SRI(SPEED, DC_I2C_DDC, id),\
-	SR(DC_I2C_ARBITRATION),\
-	SR(DC_I2C_CONTROL),\
-	SR(DC_I2C_SW_STATUS),\
-	SR(DC_I2C_TRANSACTION0),\
-	SR(DC_I2C_TRANSACTION1),\
-	SR(DC_I2C_TRANSACTION2),\
-	SR(DC_I2C_TRANSACTION3),\
-	SR(DC_I2C_DATA),\
-	SR(MICROSECOND_TIME_BASE_DIV)
-
-#define I2C_SF(reg_name, field_name, post_fix)\
-	.field_name = reg_name ## __ ## field_name ## post_fix
-
-#define I2C_COMMON_MASK_SH_LIST_DCE_COMMON_BASE(mask_sh)\
-	I2C_SF(DC_I2C_DDC1_SETUP, DC_I2C_DDC1_ENABLE, mask_sh),\
-	I2C_SF(DC_I2C_DDC1_SETUP, DC_I2C_DDC1_TIME_LIMIT, mask_sh),\
-	I2C_SF(DC_I2C_DDC1_SETUP, DC_I2C_DDC1_DATA_DRIVE_EN, mask_sh),\
-	I2C_SF(DC_I2C_DDC1_SETUP, DC_I2C_DDC1_CLK_DRIVE_EN, mask_sh),\
-	I2C_SF(DC_I2C_DDC1_SETUP, DC_I2C_DDC1_DATA_DRIVE_SEL, mask_sh),\
-	I2C_SF(DC_I2C_DDC1_SETUP, DC_I2C_DDC1_INTRA_TRANSACTION_DELAY, mask_sh),\
-	I2C_SF(DC_I2C_DDC1_SETUP, DC_I2C_DDC1_INTRA_BYTE_DELAY, mask_sh),\
-	I2C_SF(DC_I2C_ARBITRATION, DC_I2C_SW_DONE_USING_I2C_REG, mask_sh),\
-	I2C_SF(DC_I2C_ARBITRATION, DC_I2C_NO_QUEUED_SW_GO, mask_sh),\
-	I2C_SF(DC_I2C_ARBITRATION, DC_I2C_SW_PRIORITY, mask_sh),\
-	I2C_SF(DC_I2C_CONTROL, DC_I2C_SOFT_RESET, mask_sh),\
-	I2C_SF(DC_I2C_CONTROL, DC_I2C_SW_STATUS_RESET, mask_sh),\
-	I2C_SF(DC_I2C_CONTROL, DC_I2C_GO, mask_sh),\
-	I2C_SF(DC_I2C_CONTROL, DC_I2C_SEND_RESET, mask_sh),\
-	I2C_SF(DC_I2C_CONTROL, DC_I2C_TRANSACTION_COUNT, mask_sh),\
-	I2C_SF(DC_I2C_CONTROL, DC_I2C_DDC_SELECT, mask_sh),\
-	I2C_SF(DC_I2C_DDC1_SPEED, DC_I2C_DDC1_PRESCALE, mask_sh),\
-	I2C_SF(DC_I2C_DDC1_SPEED, DC_I2C_DDC1_THRESHOLD, mask_sh),\
-	I2C_SF(DC_I2C_SW_STATUS, DC_I2C_SW_STOPPED_ON_NACK, mask_sh),\
-	I2C_SF(DC_I2C_SW_STATUS, DC_I2C_SW_TIMEOUT, mask_sh),\
-	I2C_SF(DC_I2C_SW_STATUS, DC_I2C_SW_ABORTED, mask_sh),\
-	I2C_SF(DC_I2C_SW_STATUS, DC_I2C_SW_DONE, mask_sh),\
-	I2C_SF(DC_I2C_SW_STATUS, DC_I2C_SW_STATUS, mask_sh),\
-	I2C_SF(DC_I2C_TRANSACTION0, DC_I2C_STOP_ON_NACK0, mask_sh),\
-	I2C_SF(DC_I2C_TRANSACTION0, DC_I2C_START0, mask_sh),\
-	I2C_SF(DC_I2C_TRANSACTION0, DC_I2C_RW0, mask_sh),\
-	I2C_SF(DC_I2C_TRANSACTION0, DC_I2C_STOP0, mask_sh),\
-	I2C_SF(DC_I2C_TRANSACTION0, DC_I2C_COUNT0, mask_sh),\
-	I2C_SF(DC_I2C_DATA, DC_I2C_DATA_RW, mask_sh),\
-	I2C_SF(DC_I2C_DATA, DC_I2C_DATA, mask_sh),\
-	I2C_SF(DC_I2C_DATA, DC_I2C_INDEX, mask_sh),\
-	I2C_SF(DC_I2C_DATA, DC_I2C_INDEX_WRITE, mask_sh),\
-	I2C_SF(MICROSECOND_TIME_BASE_DIV, XTAL_REF_DIV, mask_sh)
-
-#define I2C_COMMON_MASK_SH_LIST_DCE100(mask_sh)\
-	I2C_COMMON_MASK_SH_LIST_DCE_COMMON_BASE(mask_sh)
-
-#define I2C_COMMON_MASK_SH_LIST_DCE110(mask_sh)\
-	I2C_COMMON_MASK_SH_LIST_DCE_COMMON_BASE(mask_sh),\
-	I2C_SF(DC_I2C_DDC1_SPEED, DC_I2C_DDC1_START_STOP_TIMING_CNTL, mask_sh)
-
-struct dce110_i2c_hw_engine_shift {
-	uint8_t DC_I2C_DDC1_ENABLE;
-	uint8_t DC_I2C_DDC1_TIME_LIMIT;
-	uint8_t DC_I2C_DDC1_DATA_DRIVE_EN;
-	uint8_t DC_I2C_DDC1_CLK_DRIVE_EN;
-	uint8_t DC_I2C_DDC1_DATA_DRIVE_SEL;
-	uint8_t DC_I2C_DDC1_INTRA_TRANSACTION_DELAY;
-	uint8_t DC_I2C_DDC1_INTRA_BYTE_DELAY;
-	uint8_t DC_I2C_SW_DONE_USING_I2C_REG;
-	uint8_t DC_I2C_NO_QUEUED_SW_GO;
-	uint8_t DC_I2C_SW_PRIORITY;
-	uint8_t DC_I2C_SOFT_RESET;
-	uint8_t DC_I2C_SW_STATUS_RESET;
-	uint8_t DC_I2C_GO;
-	uint8_t DC_I2C_SEND_RESET;
-	uint8_t DC_I2C_TRANSACTION_COUNT;
-	uint8_t DC_I2C_DDC_SELECT;
-	uint8_t DC_I2C_DDC1_PRESCALE;
-	uint8_t DC_I2C_DDC1_THRESHOLD;
-	uint8_t DC_I2C_DDC1_START_STOP_TIMING_CNTL;
-	uint8_t DC_I2C_SW_STOPPED_ON_NACK;
-	uint8_t DC_I2C_SW_TIMEOUT;
-	uint8_t DC_I2C_SW_ABORTED;
-	uint8_t DC_I2C_SW_DONE;
-	uint8_t DC_I2C_SW_STATUS;
-	uint8_t DC_I2C_STOP_ON_NACK0;
-	uint8_t DC_I2C_START0;
-	uint8_t DC_I2C_RW0;
-	uint8_t DC_I2C_STOP0;
-	uint8_t DC_I2C_COUNT0;
-	uint8_t DC_I2C_DATA_RW;
-	uint8_t DC_I2C_DATA;
-	uint8_t DC_I2C_INDEX;
-	uint8_t DC_I2C_INDEX_WRITE;
-	uint8_t XTAL_REF_DIV;
-};
-
-struct dce110_i2c_hw_engine_mask {
-	uint32_t DC_I2C_DDC1_ENABLE;
-	uint32_t DC_I2C_DDC1_TIME_LIMIT;
-	uint32_t DC_I2C_DDC1_DATA_DRIVE_EN;
-	uint32_t DC_I2C_DDC1_CLK_DRIVE_EN;
-	uint32_t DC_I2C_DDC1_DATA_DRIVE_SEL;
-	uint32_t DC_I2C_DDC1_INTRA_TRANSACTION_DELAY;
-	uint32_t DC_I2C_DDC1_INTRA_BYTE_DELAY;
-	uint32_t DC_I2C_SW_DONE_USING_I2C_REG;
-	uint32_t DC_I2C_NO_QUEUED_SW_GO;
-	uint32_t DC_I2C_SW_PRIORITY;
-	uint32_t DC_I2C_SOFT_RESET;
-	uint32_t DC_I2C_SW_STATUS_RESET;
-	uint32_t DC_I2C_GO;
-	uint32_t DC_I2C_SEND_RESET;
-	uint32_t DC_I2C_TRANSACTION_COUNT;
-	uint32_t DC_I2C_DDC_SELECT;
-	uint32_t DC_I2C_DDC1_PRESCALE;
-	uint32_t DC_I2C_DDC1_THRESHOLD;
-	uint32_t DC_I2C_DDC1_START_STOP_TIMING_CNTL;
-	uint32_t DC_I2C_SW_STOPPED_ON_NACK;
-	uint32_t DC_I2C_SW_TIMEOUT;
-	uint32_t DC_I2C_SW_ABORTED;
-	uint32_t DC_I2C_SW_DONE;
-	uint32_t DC_I2C_SW_STATUS;
-	uint32_t DC_I2C_STOP_ON_NACK0;
-	uint32_t DC_I2C_START0;
-	uint32_t DC_I2C_RW0;
-	uint32_t DC_I2C_STOP0;
-	uint32_t DC_I2C_COUNT0;
-	uint32_t DC_I2C_DATA_RW;
-	uint32_t DC_I2C_DATA;
-	uint32_t DC_I2C_INDEX;
-	uint32_t DC_I2C_INDEX_WRITE;
-	uint32_t XTAL_REF_DIV;
-};
-
-struct dce110_i2c_hw_engine_registers {
-	uint32_t SETUP;
-	uint32_t SPEED;
-	uint32_t DC_I2C_ARBITRATION;
-	uint32_t DC_I2C_CONTROL;
-	uint32_t DC_I2C_SW_STATUS;
-	uint32_t DC_I2C_TRANSACTION0;
-	uint32_t DC_I2C_TRANSACTION1;
-	uint32_t DC_I2C_TRANSACTION2;
-	uint32_t DC_I2C_TRANSACTION3;
-	uint32_t DC_I2C_DATA;
-	uint32_t MICROSECOND_TIME_BASE_DIV;
-};
-
-struct i2c_hw_engine_dce110 {
-	struct i2c_hw_engine base;
-	const struct dce110_i2c_hw_engine_registers *regs;
-	const struct dce110_i2c_hw_engine_shift *i2c_shift;
-	const struct dce110_i2c_hw_engine_mask *i2c_mask;
-	struct {
-		uint32_t DC_I2C_DDCX_SETUP;
-		uint32_t DC_I2C_DDCX_SPEED;
-	} addr;
-	uint32_t engine_id;
-	/* expressed in kilohertz */
-	uint32_t reference_frequency;
-	/* number of bytes currently used in HW buffer */
-	uint32_t buffer_used_bytes;
-	/* number of bytes used for the write transaction in the HW buffer
-	 * - this will be used as the index to read from */
-	uint32_t buffer_used_write;
-	/* number of pending transactions (before GO) */
-	uint32_t transaction_count;
-	uint32_t engine_keep_power_up_count;
-	uint32_t i2_setup_time_limit;
-};
-
-struct i2c_hw_engine_dce110_create_arg {
-	uint32_t engine_id;
-	uint32_t reference_frequency;
-	uint32_t default_speed;
-	struct dc_context *ctx;
-	const struct dce110_i2c_hw_engine_registers *regs;
-	const struct dce110_i2c_hw_engine_shift *i2c_shift;
-	const struct dce110_i2c_hw_engine_mask *i2c_mask;
-};
-
-struct i2c_engine *dal_i2c_hw_engine_dce110_create(
-	const struct i2c_hw_engine_dce110_create_arg *arg);
-
-enum {
-	I2C_SETUP_TIME_LIMIT_DCE = 255,
-	I2C_SETUP_TIME_LIMIT_DCN = 3,
-	I2C_HW_BUFFER_SIZE = 538,
-	I2C_SEND_RESET_LENGTH_9 = 9,
-	I2C_SEND_RESET_LENGTH_10 = 10,
-};
-#endif
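
The paired *_shift/*_mask structs filled in by I2C_SF() with the __SHIFT and _MASK postfixes exist so the reg_helper FN() macro can locate a field inside a 32-bit register at runtime. Stripped of that plumbing, every REG_UPDATE_* call reduces to the read-modify-write sketched below; the bit position used here is an invented example, not the real DC_I2C_DDC1_ENABLE layout.

#include <stdint.h>
#include <stdio.h>

/* What FN(reg, field) resolves to: a (shift, mask) pair taken from
 * the shift/mask tables, applied as a masked insert. */
static uint32_t set_field(uint32_t reg, uint32_t value,
			  uint8_t shift, uint32_t mask)
{
	return (reg & ~mask) | ((value << shift) & mask);
}

int main(void)
{
	/* assumed layout: a 1-bit enable field at bit 6 */
	uint8_t  enable_shift = 6;
	uint32_t enable_mask  = 1u << 6;

	uint32_t setup = 0;
	setup = set_field(setup, 1, enable_shift, enable_mask);
	printf("SETUP = 0x%08x\n", setup);	/* 0x00000040 */
	return 0;
}
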
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2c_sw_engine_dce110.c b/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2c_sw_engine_dce110.c
deleted file mode 100644
index 3aa7f791e523..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2c_sw_engine_dce110.c
+++ /dev/null
@@ -1,160 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-
-/*
- * Pre-requisites: headers required by header of this unit
- */
-#include "include/i2caux_interface.h"
-#include "../engine.h"
-#include "../i2c_engine.h"
-#include "../i2c_sw_engine.h"
-
-/*
- * Header of this unit
- */
-
-#include "i2c_sw_engine_dce110.h"
-
-/*
- * Post-requisites: headers required by this unit
- */
-
-/*
- * This unit
- */
-
-/*
- * @brief
- * Cast 'struct i2c_sw_engine *'
- * to 'struct i2c_sw_engine_dce110 *'
- */
-#define FROM_I2C_SW_ENGINE(ptr) \
-	container_of((ptr), struct i2c_sw_engine_dce110, base)
-/*
- * @brief
- * Cast 'struct i2c_engine *'
- * to 'struct i2c_sw_engine_dce110 *'
- */
-#define FROM_I2C_ENGINE(ptr) \
-	FROM_I2C_SW_ENGINE(container_of((ptr), struct i2c_sw_engine, base))
-
-/*
- * @brief
- * Cast 'struct engine *'
- * to 'struct i2c_sw_engine_dce110 *'
- */
-#define FROM_ENGINE(ptr) \
-	FROM_I2C_ENGINE(container_of((ptr), struct i2c_engine, base))
-
-static void release_engine(
-	struct engine *engine)
-{
-}
-
-static void destruct(
-	struct i2c_sw_engine_dce110 *engine)
-{
-	dal_i2c_sw_engine_destruct(&engine->base);
-}
-
-static void destroy(
-	struct i2c_engine **engine)
-{
-	struct i2c_sw_engine_dce110 *sw_engine = FROM_I2C_ENGINE(*engine);
-
-	destruct(sw_engine);
-
-	kfree(sw_engine);
-
-	*engine = NULL;
-}
-
-static bool acquire_engine(
-	struct i2c_engine *engine,
-	struct ddc *ddc_handle)
-{
-	return dal_i2caux_i2c_sw_engine_acquire_engine(engine, ddc_handle);
-}
-
-static const struct i2c_engine_funcs i2c_engine_funcs = {
-	.acquire_engine = acquire_engine,
-	.destroy = destroy,
-	.get_speed = dal_i2c_sw_engine_get_speed,
-	.set_speed = dal_i2c_sw_engine_set_speed,
-	.setup_engine = dal_i2c_engine_setup_i2c_engine,
-	.submit_channel_request = dal_i2c_sw_engine_submit_channel_request,
-	.process_channel_reply = dal_i2c_engine_process_channel_reply,
-	.get_channel_status = dal_i2c_sw_engine_get_channel_status,
-};
-
-static const struct engine_funcs engine_funcs = {
-	.release_engine = release_engine,
-	.get_engine_type = dal_i2c_sw_engine_get_engine_type,
-	.acquire = dal_i2c_engine_acquire,
-	.submit_request = dal_i2c_sw_engine_submit_request,
-};
-
-static void construct(
-	struct i2c_sw_engine_dce110 *engine_dce110,
-	const struct i2c_sw_engine_dce110_create_arg *arg_dce110)
-{
-	struct i2c_sw_engine_create_arg arg_base;
-
-	arg_base.ctx = arg_dce110->ctx;
-	arg_base.default_speed = arg_dce110->default_speed;
-
-	dal_i2c_sw_engine_construct(&engine_dce110->base, &arg_base);
-
-	/* struct engine -> struct engine_funcs */
-	engine_dce110->base.base.base.funcs = &engine_funcs;
-	/* struct i2c_engine -> struct i2c_engine_funcs */
-	engine_dce110->base.base.funcs = &i2c_engine_funcs;
-	engine_dce110->base.default_speed = arg_dce110->default_speed;
-	engine_dce110->engine_id = arg_dce110->engine_id;
-}
-
-struct i2c_engine *dal_i2c_sw_engine_dce110_create(
-	const struct i2c_sw_engine_dce110_create_arg *arg)
-{
-	struct i2c_sw_engine_dce110 *engine_dce110;
-
-	if (!arg) {
-		ASSERT_CRITICAL(false);
-		return NULL;
-	}
-
-	engine_dce110 = kzalloc(sizeof(struct i2c_sw_engine_dce110),
-				GFP_KERNEL);
-
-	if (!engine_dce110) {
-		ASSERT_CRITICAL(false);
-		return NULL;
-	}
-
-	construct(engine_dce110, arg);
-	return &engine_dce110->base.base;
-}
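
The FROM_ENGINE/FROM_I2C_ENGINE/FROM_I2C_SW_ENGINE macros above chain container_of() down the base-struct hierarchy (engine -> i2c_engine -> i2c_sw_engine -> i2c_sw_engine_dce110). The standalone sketch below reproduces the idiom with simplified stand-in structs; only the shape of the cast matches the driver, not the real struct contents.

#include <stddef.h>
#include <stdio.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-ins for the nested engine structs. */
struct engine { int dummy; };
struct i2c_engine { struct engine base; };
struct i2c_sw_engine { struct i2c_engine base; };
struct i2c_sw_engine_dce110 { struct i2c_sw_engine base; int engine_id; };

/* Same downcast chain as the driver's FROM_ENGINE(). */
#define FROM_ENGINE_SKETCH(ptr)					\
	container_of(container_of(container_of((ptr),		\
		struct i2c_engine, base),			\
		struct i2c_sw_engine, base),			\
		struct i2c_sw_engine_dce110, base)

int main(void)
{
	struct i2c_sw_engine_dce110 e = { .engine_id = 3 };
	struct engine *base = &e.base.base.base;

	printf("engine_id = %d\n", FROM_ENGINE_SKETCH(base)->engine_id);
	return 0;
}
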
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2caux_dce110.c b/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2caux_dce110.c
deleted file mode 100644
index 1d748ac1d6d6..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2caux_dce110.c
+++ /dev/null
@@ -1,329 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-
-/*
- * Pre-requisites: headers required by header of this unit
- */
-#include "include/i2caux_interface.h"
-#include "../i2caux.h"
-#include "../engine.h"
-#include "../i2c_engine.h"
-#include "../i2c_sw_engine.h"
-#include "../i2c_hw_engine.h"
-
-/*
- * Header of this unit
- */
-#include "i2caux_dce110.h"
-
-#include "i2c_sw_engine_dce110.h"
-#include "i2c_hw_engine_dce110.h"
-#include "aux_engine_dce110.h"
-#include "../../dc.h"
-#include "dc_types.h"
-
-/*
- * Post-requisites: headers required by this unit
- */
-
-/*
- * This unit
- */
-/* cast pointer to struct i2caux to pointer to struct i2caux_dce110 */
-#define FROM_I2C_AUX(ptr) \
-	container_of((ptr), struct i2caux_dce110, base)
-
-static void destruct(
-	struct i2caux_dce110 *i2caux_dce110)
-{
-	dal_i2caux_destruct(&i2caux_dce110->base);
-}
-
-static void destroy(
-	struct i2caux **i2c_engine)
-{
-	struct i2caux_dce110 *i2caux_dce110 = FROM_I2C_AUX(*i2c_engine);
-
-	destruct(i2caux_dce110);
-
-	kfree(i2caux_dce110);
-
-	*i2c_engine = NULL;
-}
-
-static struct i2c_engine *acquire_i2c_hw_engine(
-	struct i2caux *i2caux,
-	struct ddc *ddc)
-{
-	struct i2caux_dce110 *i2caux_dce110 = FROM_I2C_AUX(i2caux);
-
-	struct i2c_engine *engine = NULL;
-	/* The generic HW engine is not used for EDID reads.
-	 * It may be needed for an external I2C device, like a thermal chip;
-	 * TODO: will be implemented when needed.
-	 * See the dce80 bool non_generic for the generic HW engine.
-	 */
-
-	if (!ddc)
-		return NULL;
-
-	if (ddc->hw_info.hw_supported) {
-		enum gpio_ddc_line line = dal_ddc_get_line(ddc);
-
-		if (line < GPIO_DDC_LINE_COUNT)
-			engine = i2caux->i2c_hw_engines[line];
-	}
-
-	if (!engine)
-		return NULL;
-
-	if (!i2caux_dce110->i2c_hw_buffer_in_use &&
-		engine->base.funcs->acquire(&engine->base, ddc)) {
-		i2caux_dce110->i2c_hw_buffer_in_use = true;
-		return engine;
-	}
-
-	return NULL;
-}
-
-static void release_engine(
-	struct i2caux *i2caux,
-	struct engine *engine)
-{
-	struct i2caux_dce110 *i2caux_dce110 = FROM_I2C_AUX(i2caux);
-
-	if (engine->funcs->get_engine_type(engine) ==
-		I2CAUX_ENGINE_TYPE_I2C_DDC_HW)
-		i2caux_dce110->i2c_hw_buffer_in_use = false;
-
-	dal_i2caux_release_engine(i2caux, engine);
-}
-
-static const enum gpio_ddc_line hw_ddc_lines[] = {
-	GPIO_DDC_LINE_DDC1,
-	GPIO_DDC_LINE_DDC2,
-	GPIO_DDC_LINE_DDC3,
-	GPIO_DDC_LINE_DDC4,
-	GPIO_DDC_LINE_DDC5,
-	GPIO_DDC_LINE_DDC6,
-};
-
-static const enum gpio_ddc_line hw_aux_lines[] = {
-	GPIO_DDC_LINE_DDC1,
-	GPIO_DDC_LINE_DDC2,
-	GPIO_DDC_LINE_DDC3,
-	GPIO_DDC_LINE_DDC4,
-	GPIO_DDC_LINE_DDC5,
-	GPIO_DDC_LINE_DDC6,
-};
-
-/* function table */
-static const struct i2caux_funcs i2caux_funcs = {
-	.destroy = destroy,
-	.acquire_i2c_hw_engine = acquire_i2c_hw_engine,
-	.release_engine = release_engine,
-	.acquire_i2c_sw_engine = dal_i2caux_acquire_i2c_sw_engine,
-	.acquire_aux_engine = dal_i2caux_acquire_aux_engine,
-};
-
-#include "dce/dce_11_0_d.h"
-#include "dce/dce_11_0_sh_mask.h"
-
-/* set register offset */
-#define SR(reg_name)\
-	.reg_name = mm ## reg_name
-
-/* set register offset with instance */
-#define SRI(reg_name, block, id)\
-	.reg_name = mm ## block ## id ## _ ## reg_name
-
-#define aux_regs(id)\
-[id] = {\
-	AUX_COMMON_REG_LIST(id), \
-	.AUX_RESET_MASK = AUX_CONTROL__AUX_RESET_MASK \
-}
-
-#define hw_engine_regs(id)\
-{\
-		I2C_HW_ENGINE_COMMON_REG_LIST(id) \
-}
-
-static const struct dce110_aux_registers dce110_aux_regs[] = {
-		aux_regs(0),
-		aux_regs(1),
-		aux_regs(2),
-		aux_regs(3),
-		aux_regs(4),
-		aux_regs(5)
-};
-
-static const struct dce110_i2c_hw_engine_registers i2c_hw_engine_regs[] = {
-		hw_engine_regs(1),
-		hw_engine_regs(2),
-		hw_engine_regs(3),
-		hw_engine_regs(4),
-		hw_engine_regs(5),
-		hw_engine_regs(6)
-};
-
-static const struct dce110_i2c_hw_engine_shift i2c_shift = {
-		I2C_COMMON_MASK_SH_LIST_DCE110(__SHIFT)
-};
-
-static const struct dce110_i2c_hw_engine_mask i2c_mask = {
-		I2C_COMMON_MASK_SH_LIST_DCE110(_MASK)
-};
-
-void dal_i2caux_dce110_construct(
-	struct i2caux_dce110 *i2caux_dce110,
-	struct dc_context *ctx,
-	unsigned int num_i2caux_inst,
-	const struct dce110_aux_registers aux_regs[],
-	const struct dce110_i2c_hw_engine_registers i2c_hw_engine_regs[],
-	const struct dce110_i2c_hw_engine_shift *i2c_shift,
-	const struct dce110_i2c_hw_engine_mask *i2c_mask)
-{
-	uint32_t i = 0;
-	uint32_t reference_frequency = 0;
-	bool use_i2c_sw_engine = false;
-	struct i2caux *base = NULL;
-	/* TODO: For CZ bring-up, if dal_i2caux_get_reference_clock
-	 * does not return 48KHz, we need to hard-code it to 48KHz;
-	 * an incorrect BIOS setting can cause this.
-	 * For production, we always get the value from the BIOS. */
-	reference_frequency =
-		dal_i2caux_get_reference_clock(ctx->dc_bios) >> 1;
-
-	base = &i2caux_dce110->base;
-
-	dal_i2caux_construct(base, ctx);
-
-	i2caux_dce110->base.funcs = &i2caux_funcs;
-	i2caux_dce110->i2c_hw_buffer_in_use = false;
-	/* Create I2C engines (one per connector DDC line) to cover the
-	 * different I2C/AUX usage cases: DDC, generic GPIO, AUX.
-	 */
-	do {
-		enum gpio_ddc_line line_id = hw_ddc_lines[i];
-
-		struct i2c_hw_engine_dce110_create_arg hw_arg_dce110;
-
-		if (use_i2c_sw_engine) {
-			struct i2c_sw_engine_dce110_create_arg sw_arg;
-
-			sw_arg.engine_id = i;
-			sw_arg.default_speed = base->default_i2c_sw_speed;
-			sw_arg.ctx = ctx;
-			base->i2c_sw_engines[line_id] =
-				dal_i2c_sw_engine_dce110_create(&sw_arg);
-		}
-
-		hw_arg_dce110.engine_id = i;
-		hw_arg_dce110.reference_frequency = reference_frequency;
-		hw_arg_dce110.default_speed = base->default_i2c_hw_speed;
-		hw_arg_dce110.ctx = ctx;
-		hw_arg_dce110.regs = &i2c_hw_engine_regs[i];
-		hw_arg_dce110.i2c_shift = i2c_shift;
-		hw_arg_dce110.i2c_mask = i2c_mask;
-
-		base->i2c_hw_engines[line_id] =
-			dal_i2c_hw_engine_dce110_create(&hw_arg_dce110);
-		if (base->i2c_hw_engines[line_id] != NULL) {
-			switch (ctx->dce_version) {
-			case DCN_VERSION_1_0:
-				base->i2c_hw_engines[line_id]->setup_limit =
-					I2C_SETUP_TIME_LIMIT_DCN;
-				base->i2c_hw_engines[line_id]->send_reset_length  = 0;
-			break;
-			default:
-				base->i2c_hw_engines[line_id]->setup_limit =
-					I2C_SETUP_TIME_LIMIT_DCE;
-				base->i2c_hw_engines[line_id]->send_reset_length  = 0;
-				break;
-			}
-		}
-		++i;
-	} while (i < num_i2caux_inst);
-
-	/* Create AUX engines for all lines that have an assisting HW AUX;
-	 * 'i' (the loop counter) is used as the DDC/AUX engine_id */
-
-	i = 0;
-
-	do {
-		enum gpio_ddc_line line_id = hw_aux_lines[i];
-
-		struct aux_engine_dce110_init_data aux_init_data;
-
-		aux_init_data.engine_id = i;
-		aux_init_data.timeout_period = base->aux_timeout_period;
-		aux_init_data.ctx = ctx;
-		aux_init_data.regs = &aux_regs[i];
-
-		base->aux_engines[line_id] =
-			dal_aux_engine_dce110_create(&aux_init_data);
-
-		++i;
-	} while (i < num_i2caux_inst);
-
-	/*TODO Generic I2C SW and HW*/
-}
-
-/*
- * dal_i2caux_dce110_create
- *
- * @brief
- * public interface to allocate memory for DCE11 I2CAUX
- *
- * @param
- * struct dc_context *ctx - [in]
- *
- * @return
- * pointer to the base struct of DCE11 I2CAUX
- */
-struct i2caux *dal_i2caux_dce110_create(
-	struct dc_context *ctx)
-{
-	struct i2caux_dce110 *i2caux_dce110 =
-		kzalloc(sizeof(struct i2caux_dce110), GFP_KERNEL);
-
-	if (!i2caux_dce110) {
-		ASSERT_CRITICAL(false);
-		return NULL;
-	}
-
-	dal_i2caux_dce110_construct(i2caux_dce110,
-				    ctx,
-				    ARRAY_SIZE(dce110_aux_regs),
-				    dce110_aux_regs,
-				    i2c_hw_engine_regs,
-				    &i2c_shift,
-				    &i2c_mask);
-	return &i2caux_dce110->base;
-}
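
acquire_i2c_hw_engine() and release_engine() above encode the locking discipline for the single HW circular buffer shared by all six DDC lines: map the line to an engine, refuse the acquire while another line owns the buffer, and clear the flag on release. A reduced, self-contained model of that guard follows; the struct and function names here are invented for illustration.

#include <stdbool.h>
#include <stdio.h>

/* Models the i2c_hw_buffer_in_use flag in struct i2caux_dce110. */
struct i2caux_model {
	bool i2c_hw_buffer_in_use;
};

static bool acquire_hw_engine(struct i2caux_model *aux)
{
	if (aux->i2c_hw_buffer_in_use)
		return false;	/* another DDC line owns the buffer */
	aux->i2c_hw_buffer_in_use = true;
	return true;
}

static void release_hw_engine(struct i2caux_model *aux)
{
	aux->i2c_hw_buffer_in_use = false;
}

int main(void)
{
	struct i2caux_model aux = { false };

	printf("first acquire:  %d\n", acquire_hw_engine(&aux)); /* 1 */
	printf("second acquire: %d\n", acquire_hw_engine(&aux)); /* 0 */
	release_hw_engine(&aux);
	printf("after release:  %d\n", acquire_hw_engine(&aux)); /* 1 */
	return 0;
}
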
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2caux_dce110.h b/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2caux_dce110.h
deleted file mode 100644
index d3d8cc58666a..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2caux_dce110.h
+++ /dev/null
@@ -1,54 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#ifndef __DAL_I2C_AUX_DCE110_H__
-#define __DAL_I2C_AUX_DCE110_H__
-
-#include "../i2caux.h"
-
-struct i2caux_dce110 {
-	struct i2caux base;
-	/* indicates that the I2C HW circular buffer is in use */
-	bool i2c_hw_buffer_in_use;
-};
-
-struct dce110_aux_registers;
-struct dce110_i2c_hw_engine_registers;
-struct dce110_i2c_hw_engine_shift;
-struct dce110_i2c_hw_engine_mask;
-
-struct i2caux *dal_i2caux_dce110_create(
-	struct dc_context *ctx);
-
-void dal_i2caux_dce110_construct(
-	struct i2caux_dce110 *i2caux_dce110,
-	struct dc_context *ctx,
-	unsigned int num_i2caux_inst,
-	const struct dce110_aux_registers *aux_regs,
-	const struct dce110_i2c_hw_engine_registers *i2c_hw_engine_regs,
-	const struct dce110_i2c_hw_engine_shift *i2c_shift,
-	const struct dce110_i2c_hw_engine_mask *i2c_mask);
-
-#endif /* __DAL_I2C_AUX_DCE110_H__ */
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce112/i2caux_dce112.c b/drivers/gpu/drm/amd/display/dc/i2caux/dce112/i2caux_dce112.c
deleted file mode 100644
index a9db04738724..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce112/i2caux_dce112.c
+++ /dev/null
@@ -1,129 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-
-#include "include/i2caux_interface.h"
-#include "../i2caux.h"
-#include "../engine.h"
-#include "../i2c_engine.h"
-#include "../i2c_sw_engine.h"
-#include "../i2c_hw_engine.h"
-
-#include "../dce110/i2caux_dce110.h"
-#include "i2caux_dce112.h"
-
-#include "../dce110/aux_engine_dce110.h"
-
-#include "../dce110/i2c_hw_engine_dce110.h"
-
-#include "dce/dce_11_2_d.h"
-#include "dce/dce_11_2_sh_mask.h"
-
-/* set register offset */
-#define SR(reg_name)\
-	.reg_name = mm ## reg_name
-
-/* set register offset with instance */
-#define SRI(reg_name, block, id)\
-	.reg_name = mm ## block ## id ## _ ## reg_name
-
-#define aux_regs(id)\
-[id] = {\
-	AUX_COMMON_REG_LIST(id), \
-	.AUX_RESET_MASK = AUX_CONTROL__AUX_RESET_MASK \
-}
-
-#define hw_engine_regs(id)\
-{\
-		I2C_HW_ENGINE_COMMON_REG_LIST(id) \
-}
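
The SR()/SRI() helpers above build register-struct initializers purely by token pasting: the preprocessor glues the block name, instance id, and register name into one of the generated mm* constants. A minimal, self-contained sketch of the same expansion, using a placeholder offset rather than a real value from dce_11_2_d.h:

#include <stdio.h>

/* Placeholder register offset; real values come from the generated
 * dce_11_2_d.h header. */
#define mmDP_AUX0_AUX_CONTROL 0x1234

struct aux_registers {
	unsigned int AUX_CONTROL;
};

/* Same shape as the driver's SRI(reg_name, block, id). */
#define SRI(reg_name, block, id) \
	.reg_name = mm ## block ## id ## _ ## reg_name

static const struct aux_registers regs = {
	/* Expands to: .AUX_CONTROL = mmDP_AUX0_AUX_CONTROL */
	SRI(AUX_CONTROL, DP_AUX, 0)
};

int main(void)
{
	printf("AUX_CONTROL: 0x%x\n", regs.AUX_CONTROL);
	return 0;
}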
-
-static const struct dce110_aux_registers dce112_aux_regs[] = {
-		aux_regs(0),
-		aux_regs(1),
-		aux_regs(2),
-		aux_regs(3),
-		aux_regs(4),
-		aux_regs(5),
-};
-
-static const struct dce110_i2c_hw_engine_registers dce112_hw_engine_regs[] = {
-		hw_engine_regs(1),
-		hw_engine_regs(2),
-		hw_engine_regs(3),
-		hw_engine_regs(4),
-		hw_engine_regs(5),
-		hw_engine_regs(6)
-};
-
-static const struct dce110_i2c_hw_engine_shift i2c_shift = {
-		I2C_COMMON_MASK_SH_LIST_DCE110(__SHIFT)
-};
-
-static const struct dce110_i2c_hw_engine_mask i2c_mask = {
-		I2C_COMMON_MASK_SH_LIST_DCE110(_MASK)
-};
-
-static void construct(
-	struct i2caux_dce110 *i2caux_dce110,
-	struct dc_context *ctx)
-{
-	dal_i2caux_dce110_construct(i2caux_dce110,
-				    ctx,
-				    ARRAY_SIZE(dce112_aux_regs),
-				    dce112_aux_regs,
-				    dce112_hw_engine_regs,
-				    &i2c_shift,
-				    &i2c_mask);
-}
-
-/*
- * dal_i2caux_dce112_create
- *
- * @brief
- * public interface to allocate memory for DCE11.2 I2CAUX
- *
- * @param
- * struct dc_context *ctx - [in]
- *
- * @return
- * pointer to the base struct of DCE11.2 I2CAUX
- */
-struct i2caux *dal_i2caux_dce112_create(
-	struct dc_context *ctx)
-{
-	struct i2caux_dce110 *i2caux_dce110 =
-		kzalloc(sizeof(struct i2caux_dce110), GFP_KERNEL);
-
-	if (!i2caux_dce110) {
-		ASSERT_CRITICAL(false);
-		return NULL;
-	}
-
-	construct(i2caux_dce110, ctx);
-	return &i2caux_dce110->base;
-}
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce112/i2caux_dce112.h b/drivers/gpu/drm/amd/display/dc/i2caux/dce112/i2caux_dce112.h
deleted file mode 100644
index 8d35453c25b6..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce112/i2caux_dce112.h
+++ /dev/null
@@ -1,32 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#ifndef __DAL_I2C_AUX_DCE112_H__
-#define __DAL_I2C_AUX_DCE112_H__
-
-struct i2caux *dal_i2caux_dce112_create(
-	struct dc_context *ctx);
-
-#endif /* __DAL_I2C_AUX_DCE112_H__ */
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce120/i2caux_dce120.c b/drivers/gpu/drm/amd/display/dc/i2caux/dce120/i2caux_dce120.c
deleted file mode 100644
index 6a4f344c1db4..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce120/i2caux_dce120.c
+++ /dev/null
@@ -1,120 +0,0 @@
-/*
- * Copyright 2012-16 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-
-#include "include/i2caux_interface.h"
-#include "../i2caux.h"
-#include "../engine.h"
-#include "../i2c_engine.h"
-#include "../i2c_sw_engine.h"
-#include "../i2c_hw_engine.h"
-
-#include "../dce110/i2c_hw_engine_dce110.h"
-#include "../dce110/aux_engine_dce110.h"
-#include "../dce110/i2caux_dce110.h"
-
-#include "dce/dce_12_0_offset.h"
-#include "dce/dce_12_0_sh_mask.h"
-#include "soc15_hw_ip.h"
-#include "vega10_ip_offset.h"
-
-/* begin *********************
- * macros to expand register list macros defined in HW object header file */
-
-#define BASE_INNER(seg) \
-	DCE_BASE__INST0_SEG ## seg
-
-/* compile time expand base address. */
-#define BASE(seg) \
-	BASE_INNER(seg)
-
-#define SR(reg_name)\
-		.reg_name = BASE(mm ## reg_name ## _BASE_IDX) +  \
-					mm ## reg_name
-
-#define SRI(reg_name, block, id)\
-	.reg_name = BASE(mm ## block ## id ## _ ## reg_name ## _BASE_IDX) + \
-					mm ## block ## id ## _ ## reg_name
-/* macros to expand register list macros defined in HW object header file
- * end *********************/
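
The BASE()/BASE_INNER() pair above is the usual two-step preprocessor indirection: ## pastes its operands before expanding them, so the extra level forces the _BASE_IDX argument to be expanded to its numeric value first. A sketch of the mechanism under placeholder values:

/* Why BASE() forwards to BASE_INNER(): ## pastes its arguments before
 * they are macro-expanded, so one extra level of indirection is needed
 * when the argument (here a _BASE_IDX constant) is itself a macro.
 * All numeric values below are placeholders. */
#include <stdio.h>

#define DCE_BASE__INST0_SEG2 0x000034c0  /* placeholder segment base */

#define BASE_INNER(seg) DCE_BASE__INST0_SEG ## seg
#define BASE(seg) BASE_INNER(seg)        /* expands 'seg' first */

#define mmDC_I2C_CONTROL_BASE_IDX 2      /* placeholder index */
#define mmDC_I2C_CONTROL 0x16f4          /* placeholder offset */

/* Same shape as the driver's SR(reg_name) for soc15 parts. */
#define SR(reg_name) \
	(BASE(mm ## reg_name ## _BASE_IDX) + mm ## reg_name)

int main(void)
{
	printf("DC_I2C_CONTROL: 0x%x\n", SR(DC_I2C_CONTROL));
	return 0;
}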
-
-#define aux_regs(id)\
-[id] = {\
-	AUX_COMMON_REG_LIST(id), \
-	.AUX_RESET_MASK = DP_AUX0_AUX_CONTROL__AUX_RESET_MASK \
-}
-
-static const struct dce110_aux_registers dce120_aux_regs[] = {
-		aux_regs(0),
-		aux_regs(1),
-		aux_regs(2),
-		aux_regs(3),
-		aux_regs(4),
-		aux_regs(5),
-};
-
-#define hw_engine_regs(id)\
-{\
-		I2C_HW_ENGINE_COMMON_REG_LIST(id) \
-}
-
-static const struct dce110_i2c_hw_engine_registers dce120_hw_engine_regs[] = {
-		hw_engine_regs(1),
-		hw_engine_regs(2),
-		hw_engine_regs(3),
-		hw_engine_regs(4),
-		hw_engine_regs(5),
-		hw_engine_regs(6)
-};
-
-static const struct dce110_i2c_hw_engine_shift i2c_shift = {
-		I2C_COMMON_MASK_SH_LIST_DCE110(__SHIFT)
-};
-
-static const struct dce110_i2c_hw_engine_mask i2c_mask = {
-		I2C_COMMON_MASK_SH_LIST_DCE110(_MASK)
-};
-
-struct i2caux *dal_i2caux_dce120_create(
-	struct dc_context *ctx)
-{
-	struct i2caux_dce110 *i2caux_dce110 =
-		kzalloc(sizeof(struct i2caux_dce110), GFP_KERNEL);
-
-	if (!i2caux_dce110) {
-		ASSERT_CRITICAL(false);
-		return NULL;
-	}
-
-	dal_i2caux_dce110_construct(i2caux_dce110,
-				    ctx,
-				    ARRAY_SIZE(dce120_aux_regs),
-				    dce120_aux_regs,
-				    dce120_hw_engine_regs,
-				    &i2c_shift,
-				    &i2c_mask);
-	return &i2caux_dce110->base;
-}
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2c_hw_engine_dce80.c b/drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2c_hw_engine_dce80.c
deleted file mode 100644
index fd0832dd2c75..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2c_hw_engine_dce80.c
+++ /dev/null
@@ -1,875 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-
-/*
- * Pre-requisites: headers required by header of this unit
- */
-#include "include/i2caux_interface.h"
-#include "../engine.h"
-#include "../i2c_engine.h"
-#include "../i2c_hw_engine.h"
-#include "../i2c_generic_hw_engine.h"
-/*
- * Header of this unit
- */
-
-#include "i2c_hw_engine_dce80.h"
-
-/*
- * Post-requisites: headers required by this unit
- */
-
-#include "dce/dce_8_0_d.h"
-#include "dce/dce_8_0_sh_mask.h"
-/*
- * This unit
- */
-
-enum dc_i2c_status {
-	DC_I2C_STATUS__DC_I2C_STATUS_IDLE,
-	DC_I2C_STATUS__DC_I2C_STATUS_USED_BY_SW,
-	DC_I2C_STATUS__DC_I2C_STATUS_USED_BY_HW
-};
-
-enum dc_i2c_arbitration {
-	DC_I2C_ARBITRATION__DC_I2C_SW_PRIORITY_NORMAL,
-	DC_I2C_ARBITRATION__DC_I2C_SW_PRIORITY_HIGH
-};
-
-enum {
-	/* No timeout in HW
-	 * (timeout implemented in SW by querying status) */
-	I2C_SETUP_TIME_LIMIT = 255,
-	I2C_HW_BUFFER_SIZE = 144
-};
-
-/*
- * @brief
- * Cast 'struct i2c_hw_engine *'
- * to 'struct i2c_hw_engine_dce80 *'
- */
-#define FROM_I2C_HW_ENGINE(ptr) \
-	container_of((ptr), struct i2c_hw_engine_dce80, base)
-
-/*
- * @brief
- * Cast pointer to 'struct i2c_engine *'
- * to pointer to 'struct i2c_hw_engine_dce80 *'
- */
-#define FROM_I2C_ENGINE(ptr) \
-	FROM_I2C_HW_ENGINE(container_of((ptr), struct i2c_hw_engine, base))
-
-/*
- * @brief
- * Cast pointer to 'struct engine *'
- * to pointer to 'struct i2c_hw_engine_dce80 *'
- */
-#define FROM_ENGINE(ptr) \
-	FROM_I2C_ENGINE(container_of((ptr), struct i2c_engine, base))
-
-static void disable_i2c_hw_engine(
-	struct i2c_hw_engine_dce80 *engine)
-{
-	const uint32_t addr = engine->addr.DC_I2C_DDCX_SETUP;
-	uint32_t value = 0;
-
-	struct dc_context *ctx = NULL;
-
-	ctx = engine->base.base.base.ctx;
-
-	value = dm_read_reg(ctx, addr);
-
-	set_reg_field_value(
-		value,
-		0,
-		DC_I2C_DDC1_SETUP,
-		DC_I2C_DDC1_ENABLE);
-
-	dm_write_reg(ctx, addr, value);
-}
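
disable_i2c_hw_engine() above shows the read-modify-write idiom used throughout this file: read the register, patch one field, write it back. In the real driver, set_reg_field_value() derives mask and shift from generated <REG>__<FIELD>_MASK/__SHIFT constants; the sketch below uses hypothetical values for a single 1-bit field:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical field layout: a 1-bit ENABLE field at bit 6. */
#define DDC_SETUP__ENABLE_MASK   0x00000040u
#define DDC_SETUP__ENABLE_SHIFT  6

/* Generic read-modify-write of one register field. */
static uint32_t set_field(uint32_t reg, uint32_t mask,
			  uint32_t shift, uint32_t val)
{
	reg &= ~mask;                  /* clear the old field */
	reg |= (val << shift) & mask;  /* merge in the new value */
	return reg;
}

int main(void)
{
	uint32_t value = 0xffffffffu;  /* pretend this came from dm_read_reg() */

	value = set_field(value, DDC_SETUP__ENABLE_MASK,
			  DDC_SETUP__ENABLE_SHIFT, 0); /* disable */
	printf("0x%08x\n", value);     /* bit 6 cleared, others intact */
	return 0;                      /* real code would dm_write_reg() here */
}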
-
-static void release_engine(
-	struct engine *engine)
-{
-	struct i2c_hw_engine_dce80 *hw_engine = FROM_ENGINE(engine);
-
-	struct i2c_engine *base = NULL;
-	bool safe_to_reset;
-	uint32_t value = 0;
-
-	base = &hw_engine->base.base;
-
-	/* Restore original HW engine speed */
-
-	base->funcs->set_speed(base, hw_engine->base.original_speed);
-
-	/* Release I2C */
-	{
-		value = dm_read_reg(engine->ctx, mmDC_I2C_ARBITRATION);
-
-		set_reg_field_value(
-				value,
-				1,
-				DC_I2C_ARBITRATION,
-				DC_I2C_SW_DONE_USING_I2C_REG);
-
-		dm_write_reg(engine->ctx, mmDC_I2C_ARBITRATION, value);
-	}
-
-	/* Reset HW engine */
-	{
-		uint32_t i2c_sw_status = 0;
-
-		value = dm_read_reg(engine->ctx, mmDC_I2C_SW_STATUS);
-
-		i2c_sw_status = get_reg_field_value(
-				value,
-				DC_I2C_SW_STATUS,
-				DC_I2C_SW_STATUS);
-		/* if used by SW, safe to reset */
-		safe_to_reset = (i2c_sw_status == 1);
-	}
-	{
-		value = dm_read_reg(engine->ctx, mmDC_I2C_CONTROL);
-
-		if (safe_to_reset)
-			set_reg_field_value(
-				value,
-				1,
-				DC_I2C_CONTROL,
-				DC_I2C_SOFT_RESET);
-
-		set_reg_field_value(
-			value,
-			1,
-			DC_I2C_CONTROL,
-			DC_I2C_SW_STATUS_RESET);
-
-		dm_write_reg(engine->ctx, mmDC_I2C_CONTROL, value);
-	}
-
-	/* HW I2c engine - clock gating feature */
-	if (!hw_engine->engine_keep_power_up_count)
-		disable_i2c_hw_engine(hw_engine);
-}
-
-static void destruct(
-	struct i2c_hw_engine_dce80 *engine)
-{
-	dal_i2c_hw_engine_destruct(&engine->base);
-}
-
-static void destroy(
-	struct i2c_engine **i2c_engine)
-{
-	struct i2c_hw_engine_dce80 *engine = FROM_I2C_ENGINE(*i2c_engine);
-
-	destruct(engine);
-
-	kfree(engine);
-
-	*i2c_engine = NULL;
-}
-
-static bool setup_engine(
-	struct i2c_engine *i2c_engine)
-{
-	uint32_t value = 0;
-	struct i2c_hw_engine_dce80 *engine = FROM_I2C_ENGINE(i2c_engine);
-
-	/* Program pin select */
-	{
-		const uint32_t addr = mmDC_I2C_CONTROL;
-
-		value = dm_read_reg(i2c_engine->base.ctx, addr);
-
-		set_reg_field_value(
-			value,
-			0,
-			DC_I2C_CONTROL,
-			DC_I2C_GO);
-
-		set_reg_field_value(
-			value,
-			0,
-			DC_I2C_CONTROL,
-			DC_I2C_SOFT_RESET);
-
-		set_reg_field_value(
-			value,
-			0,
-			DC_I2C_CONTROL,
-			DC_I2C_SEND_RESET);
-
-		set_reg_field_value(
-			value,
-			0,
-			DC_I2C_CONTROL,
-			DC_I2C_SW_STATUS_RESET);
-
-		set_reg_field_value(
-			value,
-			0,
-			DC_I2C_CONTROL,
-			DC_I2C_TRANSACTION_COUNT);
-
-		set_reg_field_value(
-			value,
-			engine->engine_id,
-			DC_I2C_CONTROL,
-			DC_I2C_DDC_SELECT);
-
-		dm_write_reg(i2c_engine->base.ctx, addr, value);
-	}
-
-	/* Program time limit */
-	{
-		const uint32_t addr = engine->addr.DC_I2C_DDCX_SETUP;
-
-		value = dm_read_reg(i2c_engine->base.ctx, addr);
-
-		set_reg_field_value(
-			value,
-			I2C_SETUP_TIME_LIMIT,
-			DC_I2C_DDC1_SETUP,
-			DC_I2C_DDC1_TIME_LIMIT);
-
-		set_reg_field_value(
-			value,
-			1,
-			DC_I2C_DDC1_SETUP,
-			DC_I2C_DDC1_ENABLE);
-
-		dm_write_reg(i2c_engine->base.ctx, addr, value);
-	}
-
-	/* Program HW priority:
-	 * set to High - interrupt software I2C at any time;
-	 * enable restart of SW I2C that was interrupted by HW;
-	 * disable queuing of SW requests while I2C is in use by HW */
-	{
-		value = dm_read_reg(i2c_engine->base.ctx,
-				mmDC_I2C_ARBITRATION);
-
-		set_reg_field_value(
-			value,
-			0,
-			DC_I2C_ARBITRATION,
-			DC_I2C_NO_QUEUED_SW_GO);
-
-		set_reg_field_value(
-			value,
-			DC_I2C_ARBITRATION__DC_I2C_SW_PRIORITY_NORMAL,
-			DC_I2C_ARBITRATION,
-			DC_I2C_SW_PRIORITY);
-
-		dm_write_reg(i2c_engine->base.ctx,
-				mmDC_I2C_ARBITRATION, value);
-	}
-
-	return true;
-}
-
-static uint32_t get_speed(
-	const struct i2c_engine *i2c_engine)
-{
-	const struct i2c_hw_engine_dce80 *engine = FROM_I2C_ENGINE(i2c_engine);
-
-	const uint32_t addr = engine->addr.DC_I2C_DDCX_SPEED;
-
-	uint32_t pre_scale = 0;
-
-	uint32_t value = dm_read_reg(i2c_engine->base.ctx, addr);
-
-	pre_scale = get_reg_field_value(
-			value,
-			DC_I2C_DDC1_SPEED,
-			DC_I2C_DDC1_PRESCALE);
-
-	/* [anaumov] it seems the following is unnecessary */
-	/*ASSERT(value.bits.DC_I2C_DDC1_PRESCALE);*/
-
-	return pre_scale ?
-		engine->reference_frequency / pre_scale :
-		engine->base.default_speed;
-}
-
-static void set_speed(
-	struct i2c_engine *i2c_engine,
-	uint32_t speed)
-{
-	struct i2c_hw_engine_dce80 *engine = FROM_I2C_ENGINE(i2c_engine);
-
-	if (speed) {
-		const uint32_t addr = engine->addr.DC_I2C_DDCX_SPEED;
-
-		uint32_t value = dm_read_reg(i2c_engine->base.ctx, addr);
-
-		set_reg_field_value(
-			value,
-			engine->reference_frequency / speed,
-			DC_I2C_DDC1_SPEED,
-			DC_I2C_DDC1_PRESCALE);
-
-		set_reg_field_value(
-			value,
-			2,
-			DC_I2C_DDC1_SPEED,
-			DC_I2C_DDC1_THRESHOLD);
-
-		dm_write_reg(i2c_engine->base.ctx, addr, value);
-	}
-}
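
get_speed()/set_speed() relate bus speed and prescaler by plain integer division against the reference clock. A worked sketch, assuming the 13.5 MHz (13500 kHz) DCE8 reference noted in the i2caux constructor below and a standard-mode 100 kHz target:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t reference_khz = 13500;  /* assumed: XTALIN/2 for DCE8 */
	uint32_t target_khz = 100;       /* standard-mode I2C */

	/* set_speed(): value programmed into DC_I2C_DDC1_PRESCALE */
	uint32_t prescale = reference_khz / target_khz;  /* 135 */

	/* get_speed(): recovered speed = reference / prescale */
	uint32_t readback_khz = prescale ? reference_khz / prescale : 0;

	printf("prescale=%u readback=%u kHz\n", prescale, readback_khz);
	return 0;
}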
-
-static inline void reset_hw_engine(struct engine *engine)
-{
-	uint32_t value = dm_read_reg(engine->ctx, mmDC_I2C_CONTROL);
-
-	set_reg_field_value(
-		value,
-		1,
-		DC_I2C_CONTROL,
-		DC_I2C_SOFT_RESET);
-
-	set_reg_field_value(
-		value,
-		1,
-		DC_I2C_CONTROL,
-		DC_I2C_SW_STATUS_RESET);
-
-	dm_write_reg(engine->ctx, mmDC_I2C_CONTROL, value);
-}
-
-static bool is_hw_busy(struct engine *engine)
-{
-	uint32_t i2c_sw_status = 0;
-
-	uint32_t value = dm_read_reg(engine->ctx, mmDC_I2C_SW_STATUS);
-
-	i2c_sw_status = get_reg_field_value(
-			value,
-			DC_I2C_SW_STATUS,
-			DC_I2C_SW_STATUS);
-
-	if (i2c_sw_status == DC_I2C_STATUS__DC_I2C_STATUS_IDLE)
-		return false;
-
-	reset_hw_engine(engine);
-
-	value = dm_read_reg(engine->ctx, mmDC_I2C_SW_STATUS);
-
-	i2c_sw_status = get_reg_field_value(
-			value,
-			DC_I2C_SW_STATUS,
-			DC_I2C_SW_STATUS);
-
-	return i2c_sw_status != DC_I2C_STATUS__DC_I2C_STATUS_IDLE;
-}
-
-/*
- * @brief
- * DC_I2C_TRANSACTION0..3 MM register addresses
- */
-static const uint32_t transaction_addr[] = {
-	mmDC_I2C_TRANSACTION0,
-	mmDC_I2C_TRANSACTION1,
-	mmDC_I2C_TRANSACTION2,
-	mmDC_I2C_TRANSACTION3
-};
-
-static bool process_transaction(
-	struct i2c_hw_engine_dce80 *engine,
-	struct i2c_request_transaction_data *request)
-{
-	uint32_t length = request->length;
-	uint8_t *buffer = request->data;
-
-	bool last_transaction = false;
-	uint32_t value = 0;
-
-	struct dc_context *ctx = NULL;
-
-	ctx = engine->base.base.base.ctx;
-
-	{
-		const uint32_t addr =
-			transaction_addr[engine->transaction_count];
-
-		value = dm_read_reg(ctx, addr);
-
-		set_reg_field_value(
-			value,
-			1,
-			DC_I2C_TRANSACTION0,
-			DC_I2C_STOP_ON_NACK0);
-
-		set_reg_field_value(
-			value,
-			1,
-			DC_I2C_TRANSACTION0,
-			DC_I2C_START0);
-
-		if ((engine->transaction_count == 3) ||
-		(request->action == I2CAUX_TRANSACTION_ACTION_I2C_WRITE) ||
-		(request->action & I2CAUX_TRANSACTION_ACTION_I2C_READ)) {
-
-			set_reg_field_value(
-				value,
-				1,
-				DC_I2C_TRANSACTION0,
-				DC_I2C_STOP0);
-
-			last_transaction = true;
-		} else
-			set_reg_field_value(
-				value,
-				0,
-				DC_I2C_TRANSACTION0,
-				DC_I2C_STOP0);
-
-		set_reg_field_value(
-			value,
-			(0 != (request->action &
-					I2CAUX_TRANSACTION_ACTION_I2C_READ)),
-			DC_I2C_TRANSACTION0,
-			DC_I2C_RW0);
-
-		set_reg_field_value(
-			value,
-			length,
-			DC_I2C_TRANSACTION0,
-			DC_I2C_COUNT0);
-
-		dm_write_reg(ctx, addr, value);
-	}
-
-	/* Write the I2C address and I2C data
-	 * into the hardware circular buffer, one byte per entry.
-	 * As an example, the 7-bit I2C slave address for a CRT monitor
-	 * for reading DDC/EDID information is 0b1010001.
-	 * For an I2C send operation, the LSB must be programmed to 0;
-	 * for an I2C receive operation, the LSB must be programmed to 1. */
-
-	{
-		value = 0;
-
-		set_reg_field_value(
-			value,
-			false,
-			DC_I2C_DATA,
-			DC_I2C_DATA_RW);
-
-		set_reg_field_value(
-			value,
-			request->address,
-			DC_I2C_DATA,
-			DC_I2C_DATA);
-
-		if (engine->transaction_count == 0) {
-			set_reg_field_value(
-				value,
-				0,
-				DC_I2C_DATA,
-				DC_I2C_INDEX);
-
-			/*enable index write*/
-			set_reg_field_value(
-				value,
-				1,
-				DC_I2C_DATA,
-				DC_I2C_INDEX_WRITE);
-		}
-
-		dm_write_reg(ctx, mmDC_I2C_DATA, value);
-
-		if (!(request->action & I2CAUX_TRANSACTION_ACTION_I2C_READ)) {
-
-			set_reg_field_value(
-				value,
-				0,
-				DC_I2C_DATA,
-				DC_I2C_INDEX_WRITE);
-
-			while (length) {
-
-				set_reg_field_value(
-					value,
-					*buffer++,
-					DC_I2C_DATA,
-					DC_I2C_DATA);
-
-				dm_write_reg(ctx, mmDC_I2C_DATA, value);
-				--length;
-			}
-		}
-	}
-
-	++engine->transaction_count;
-	engine->buffer_used_bytes += length + 1;
-
-	return last_transaction;
-}
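
The circular-buffer comment in process_transaction() implies that request->address already carries the 8-bit wire format: the 7-bit slave address in bits [7:1] and the read/write flag in bit 0. A small illustration of that convention (0x50 is the 7-bit address commonly used for DDC/EDID):

#include <stdint.h>
#include <stdio.h>

/* Build the 8-bit address byte the HW buffer expects: the 7-bit address
 * in bits [7:1], direction in bit 0 (0 = write/send, 1 = read/receive). */
static uint8_t i2c_addr_byte(uint8_t addr7, int is_read)
{
	return (uint8_t)((addr7 << 1) | (is_read ? 1 : 0));
}

int main(void)
{
	uint8_t edid = 0x50;  /* common DDC/EDID 7-bit slave address */

	printf("write: 0x%02x, read: 0x%02x\n",
	       i2c_addr_byte(edid, 0),   /* 0xa0 */
	       i2c_addr_byte(edid, 1));  /* 0xa1 */
	return 0;
}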
-
-static void execute_transaction(
-	struct i2c_hw_engine_dce80 *engine)
-{
-	uint32_t value = 0;
-	struct dc_context *ctx = NULL;
-
-	ctx = engine->base.base.base.ctx;
-
-	{
-		const uint32_t addr = engine->addr.DC_I2C_DDCX_SETUP;
-
-		value = dm_read_reg(ctx, addr);
-
-		set_reg_field_value(
-			value,
-			0,
-			DC_I2C_DDC1_SETUP,
-			DC_I2C_DDC1_DATA_DRIVE_EN);
-
-		set_reg_field_value(
-			value,
-			0,
-			DC_I2C_DDC1_SETUP,
-			DC_I2C_DDC1_CLK_DRIVE_EN);
-
-		set_reg_field_value(
-			value,
-			0,
-			DC_I2C_DDC1_SETUP,
-			DC_I2C_DDC1_DATA_DRIVE_SEL);
-
-		set_reg_field_value(
-			value,
-			0,
-			DC_I2C_DDC1_SETUP,
-			DC_I2C_DDC1_INTRA_TRANSACTION_DELAY);
-
-		set_reg_field_value(
-			value,
-			0,
-			DC_I2C_DDC1_SETUP,
-			DC_I2C_DDC1_INTRA_BYTE_DELAY);
-
-		dm_write_reg(ctx, addr, value);
-	}
-
-	{
-		const uint32_t addr = mmDC_I2C_CONTROL;
-
-		value = dm_read_reg(ctx, addr);
-
-		set_reg_field_value(
-			value,
-			0,
-			DC_I2C_CONTROL,
-			DC_I2C_SOFT_RESET);
-
-		set_reg_field_value(
-			value,
-			0,
-			DC_I2C_CONTROL,
-			DC_I2C_SW_STATUS_RESET);
-
-		set_reg_field_value(
-			value,
-			0,
-			DC_I2C_CONTROL,
-			DC_I2C_SEND_RESET);
-
-		set_reg_field_value(
-			value,
-			0,
-			DC_I2C_CONTROL,
-			DC_I2C_GO);
-
-		set_reg_field_value(
-			value,
-			engine->transaction_count - 1,
-			DC_I2C_CONTROL,
-			DC_I2C_TRANSACTION_COUNT);
-
-		dm_write_reg(ctx, addr, value);
-	}
-
-	/* start I2C transfer */
-	{
-		const uint32_t addr = mmDC_I2C_CONTROL;
-
-		value	= dm_read_reg(ctx, addr);
-
-		set_reg_field_value(
-			value,
-			1,
-			DC_I2C_CONTROL,
-			DC_I2C_GO);
-
-		dm_write_reg(ctx, addr, value);
-	}
-
-	/* All transactions were executed and the HW buffer became empty
-	 * (even though this actually happens only once the status becomes DONE) */
-	engine->transaction_count = 0;
-	engine->buffer_used_bytes = 0;
-}
-
-static void submit_channel_request(
-	struct i2c_engine *engine,
-	struct i2c_request_transaction_data *request)
-{
-	request->status = I2C_CHANNEL_OPERATION_SUCCEEDED;
-
-	if (!process_transaction(FROM_I2C_ENGINE(engine), request))
-		return;
-
-	if (is_hw_busy(&engine->base)) {
-		request->status = I2C_CHANNEL_OPERATION_ENGINE_BUSY;
-		return;
-	}
-
-	execute_transaction(FROM_I2C_ENGINE(engine));
-}
-
-static void process_channel_reply(
-	struct i2c_engine *engine,
-	struct i2c_reply_transaction_data *reply)
-{
-	uint32_t length = reply->length;
-	uint8_t *buffer = reply->data;
-
-	uint32_t value = 0;
-
-	/*set index*/
-	set_reg_field_value(
-		value,
-		length - 1,
-		DC_I2C_DATA,
-		DC_I2C_INDEX);
-
-	set_reg_field_value(
-		value,
-		1,
-		DC_I2C_DATA,
-		DC_I2C_DATA_RW);
-
-	set_reg_field_value(
-		value,
-		1,
-		DC_I2C_DATA,
-		DC_I2C_INDEX_WRITE);
-
-	dm_write_reg(engine->base.ctx, mmDC_I2C_DATA, value);
-
-	while (length) {
-		/* after reading the status,
-		 * if the I2C operation executed successfully
-		 * (i.e. DC_I2C_STATUS_DONE = 1) then the I2C controller
-		 * should read data bytes from I2C circular data buffer */
-
-		value = dm_read_reg(engine->base.ctx, mmDC_I2C_DATA);
-
-		*buffer++ = get_reg_field_value(
-				value,
-				DC_I2C_DATA,
-				DC_I2C_DATA);
-
-		--length;
-	}
-}
-
-static enum i2c_channel_operation_result get_channel_status(
-	struct i2c_engine *engine,
-	uint8_t *returned_bytes)
-{
-	uint32_t i2c_sw_status = 0;
-	uint32_t value = dm_read_reg(engine->base.ctx, mmDC_I2C_SW_STATUS);
-
-	i2c_sw_status = get_reg_field_value(
-			value,
-			DC_I2C_SW_STATUS,
-			DC_I2C_SW_STATUS);
-
-	if (i2c_sw_status == DC_I2C_STATUS__DC_I2C_STATUS_USED_BY_SW)
-		return I2C_CHANNEL_OPERATION_ENGINE_BUSY;
-	else if (value & DC_I2C_SW_STATUS__DC_I2C_SW_STOPPED_ON_NACK_MASK)
-		return I2C_CHANNEL_OPERATION_NO_RESPONSE;
-	else if (value & DC_I2C_SW_STATUS__DC_I2C_SW_TIMEOUT_MASK)
-		return I2C_CHANNEL_OPERATION_TIMEOUT;
-	else if (value & DC_I2C_SW_STATUS__DC_I2C_SW_ABORTED_MASK)
-		return I2C_CHANNEL_OPERATION_FAILED;
-	else if (value & DC_I2C_SW_STATUS__DC_I2C_SW_DONE_MASK)
-		return I2C_CHANNEL_OPERATION_SUCCEEDED;
-
-	/*
-	 * This is the case when the HW engine was used for the communication;
-	 * I2C_SW_STATUS can then be zero
-	 */
-	return I2C_CHANNEL_OPERATION_SUCCEEDED;
-}
-
-static uint32_t get_hw_buffer_available_size(
-	const struct i2c_hw_engine *engine)
-{
-	return I2C_HW_BUFFER_SIZE -
-		FROM_I2C_HW_ENGINE(engine)->buffer_used_bytes;
-}
-
-static uint32_t get_transaction_timeout(
-	const struct i2c_hw_engine *engine,
-	uint32_t length)
-{
-	uint32_t speed = engine->base.funcs->get_speed(&engine->base);
-
-	uint32_t period_timeout;
-	uint32_t num_of_clock_stretches;
-
-	if (!speed)
-		return 0;
-
-	period_timeout = (1000 * TRANSACTION_TIMEOUT_IN_I2C_CLOCKS) / speed;
-
-	num_of_clock_stretches = 1 + (length << 3) + 1;
-	num_of_clock_stretches +=
-		(FROM_I2C_HW_ENGINE(engine)->buffer_used_bytes << 3) +
-		(FROM_I2C_HW_ENGINE(engine)->transaction_count << 1);
-
-	return period_timeout * num_of_clock_stretches;
-}
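
get_transaction_timeout() budgets the timeout in I2C clock periods: roughly one start bit, eight clocks per data byte, one stop bit, plus clocks for bytes and transactions already queued in the HW buffer. A worked sketch; the value of TRANSACTION_TIMEOUT_IN_I2C_CLOCKS is assumed here, since its real definition lives outside this hunk, and with speed in kHz the result comes out in microseconds:

#include <stdint.h>
#include <stdio.h>

#define TRANSACTION_TIMEOUT_IN_I2C_CLOCKS 32  /* assumed, not the real value */

int main(void)
{
	uint32_t speed_khz = 100;
	uint32_t length = 16;            /* bytes in this transaction */
	uint32_t buffer_used_bytes = 8;  /* bytes already queued */
	uint32_t transaction_count = 1;

	/* (1000 * clocks) / kHz -> microseconds per stretch window */
	uint32_t period_timeout =
		(1000 * TRANSACTION_TIMEOUT_IN_I2C_CLOCKS) / speed_khz;

	/* start bit + 8 clocks per byte + stop bit, plus queued work */
	uint32_t stretches = 1 + (length << 3) + 1
		+ (buffer_used_bytes << 3)
		+ (transaction_count << 1);

	printf("timeout = %u us\n", period_timeout * stretches);
	return 0;
}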
-
-/*
- * @brief
- * DC_I2C_DDC1_SETUP MM register offsets
- *
- * @note
- * The indices of this offset array are DDC engine IDs
- */
-static const int32_t ddc_setup_offset[] = {
-
-	mmDC_I2C_DDC1_SETUP - mmDC_I2C_DDC1_SETUP, /* DDC Engine 1 */
-	mmDC_I2C_DDC2_SETUP - mmDC_I2C_DDC1_SETUP, /* DDC Engine 2 */
-	mmDC_I2C_DDC3_SETUP - mmDC_I2C_DDC1_SETUP, /* DDC Engine 3 */
-	mmDC_I2C_DDC4_SETUP - mmDC_I2C_DDC1_SETUP, /* DDC Engine 4 */
-	mmDC_I2C_DDC5_SETUP - mmDC_I2C_DDC1_SETUP, /* DDC Engine 5 */
-	mmDC_I2C_DDC6_SETUP - mmDC_I2C_DDC1_SETUP, /* DDC Engine 6 */
-	mmDC_I2C_DDCVGA_SETUP - mmDC_I2C_DDC1_SETUP /* DDC Engine 7 */
-};
-
-/*
- * @brief
- * DC_I2C_DDC1_SPEED MM register offsets
- *
- * @note
- * The indices of this offset array are DDC engine IDs
- */
-static const int32_t ddc_speed_offset[] = {
-	mmDC_I2C_DDC1_SPEED - mmDC_I2C_DDC1_SPEED, /* DDC Engine 1 */
-	mmDC_I2C_DDC2_SPEED - mmDC_I2C_DDC1_SPEED, /* DDC Engine 2 */
-	mmDC_I2C_DDC3_SPEED - mmDC_I2C_DDC1_SPEED, /* DDC Engine 3 */
-	mmDC_I2C_DDC4_SPEED - mmDC_I2C_DDC1_SPEED, /* DDC Engine 4 */
-	mmDC_I2C_DDC5_SPEED - mmDC_I2C_DDC1_SPEED, /* DDC Engine 5 */
-	mmDC_I2C_DDC6_SPEED - mmDC_I2C_DDC1_SPEED, /* DDC Engine 6 */
-	mmDC_I2C_DDCVGA_SPEED - mmDC_I2C_DDC1_SPEED /* DDC Engine 7 */
-};
-
-static const struct i2c_engine_funcs i2c_engine_funcs = {
-	.destroy = destroy,
-	.get_speed = get_speed,
-	.set_speed = set_speed,
-	.setup_engine = setup_engine,
-	.submit_channel_request = submit_channel_request,
-	.process_channel_reply = process_channel_reply,
-	.get_channel_status = get_channel_status,
-	.acquire_engine = dal_i2c_hw_engine_acquire_engine,
-};
-
-static const struct engine_funcs engine_funcs = {
-	.release_engine = release_engine,
-	.get_engine_type = dal_i2c_hw_engine_get_engine_type,
-	.acquire = dal_i2c_engine_acquire,
-	.submit_request = dal_i2c_hw_engine_submit_request,
-};
-
-static const struct i2c_hw_engine_funcs i2c_hw_engine_funcs = {
-	.get_hw_buffer_available_size =
-		get_hw_buffer_available_size,
-	.get_transaction_timeout =
-		get_transaction_timeout,
-	.wait_on_operation_result =
-		dal_i2c_hw_engine_wait_on_operation_result,
-};
-
-static void construct(
-	struct i2c_hw_engine_dce80 *engine,
-	const struct i2c_hw_engine_dce80_create_arg *arg)
-{
-	dal_i2c_hw_engine_construct(&engine->base, arg->ctx);
-
-	engine->base.base.base.funcs = &engine_funcs;
-	engine->base.base.funcs = &i2c_engine_funcs;
-	engine->base.funcs = &i2c_hw_engine_funcs;
-	engine->base.default_speed = arg->default_speed;
-	engine->addr.DC_I2C_DDCX_SETUP =
-		mmDC_I2C_DDC1_SETUP + ddc_setup_offset[arg->engine_id];
-	engine->addr.DC_I2C_DDCX_SPEED =
-		mmDC_I2C_DDC1_SPEED + ddc_speed_offset[arg->engine_id];
-
-	engine->engine_id = arg->engine_id;
-	engine->reference_frequency = arg->reference_frequency;
-	engine->buffer_used_bytes = 0;
-	engine->transaction_count = 0;
-	engine->engine_keep_power_up_count = 1;
-}
-
-struct i2c_engine *dal_i2c_hw_engine_dce80_create(
-	const struct i2c_hw_engine_dce80_create_arg *arg)
-{
-	struct i2c_hw_engine_dce80 *engine;
-
-	if (!arg) {
-		BREAK_TO_DEBUGGER();
-		return NULL;
-	}
-
-	if ((arg->engine_id >= ARRAY_SIZE(ddc_setup_offset)) ||
-	    (arg->engine_id >= ARRAY_SIZE(ddc_speed_offset)) ||
-	    !arg->reference_frequency) {
-		BREAK_TO_DEBUGGER();
-		return NULL;
-	}
-
-	engine = kzalloc(sizeof(struct i2c_hw_engine_dce80), GFP_KERNEL);
-
-	if (!engine) {
-		BREAK_TO_DEBUGGER();
-		return NULL;
-	}
-
-	construct(engine, arg);
-	return &engine->base.base;
-}
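
The FROM_ENGINE()/FROM_I2C_ENGINE()/FROM_I2C_HW_ENGINE() macros deleted above are chained container_of() downcasts: given a pointer to an embedded base struct, they recover the enclosing derived struct. A self-contained sketch of the same two-level chain, with simplified, hypothetical struct names:

#include <stddef.h>
#include <stdio.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct engine     { int id; };
struct i2c_engine { struct engine base; };
struct i2c_hw_engine_x {          /* hypothetical leaf type */
	struct i2c_engine base;
	int prescale;
};

#define FROM_I2C_ENGINE(p) container_of((p), struct i2c_hw_engine_x, base)
#define FROM_ENGINE(p) \
	FROM_I2C_ENGINE(container_of((p), struct i2c_engine, base))

int main(void)
{
	struct i2c_hw_engine_x hw = { .base.base.id = 3, .prescale = 135 };
	struct engine *e = &hw.base.base;  /* only the base is passed around */

	printf("prescale via downcast: %d\n", FROM_ENGINE(e)->prescale);
	return 0;
}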
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2c_hw_engine_dce80.h b/drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2c_hw_engine_dce80.h
deleted file mode 100644
index 5c6116fb5479..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2c_hw_engine_dce80.h
+++ /dev/null
@@ -1,54 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#ifndef __DAL_I2C_HW_ENGINE_DCE80_H__
-#define __DAL_I2C_HW_ENGINE_DCE80_H__
-
-struct i2c_hw_engine_dce80 {
-	struct i2c_hw_engine base;
-	struct {
-		uint32_t DC_I2C_DDCX_SETUP;
-		uint32_t DC_I2C_DDCX_SPEED;
-	} addr;
-	uint32_t engine_id;
-	/* expressed in kilohertz */
-	uint32_t reference_frequency;
-	/* number of bytes currently used in HW buffer */
-	uint32_t buffer_used_bytes;
-	/* number of pending transactions (before GO) */
-	uint32_t transaction_count;
-	uint32_t engine_keep_power_up_count;
-};
-
-struct i2c_hw_engine_dce80_create_arg {
-	uint32_t engine_id;
-	uint32_t reference_frequency;
-	uint32_t default_speed;
-	struct dc_context *ctx;
-};
-
-struct i2c_engine *dal_i2c_hw_engine_dce80_create(
-	const struct i2c_hw_engine_dce80_create_arg *arg);
-#endif
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2c_sw_engine_dce80.c b/drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2c_sw_engine_dce80.c
deleted file mode 100644
index 4853ee26096a..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2c_sw_engine_dce80.c
+++ /dev/null
@@ -1,173 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-
-/*
- * Pre-requisites: headers required by header of this unit
- */
-#include "include/i2caux_interface.h"
-#include "../engine.h"
-#include "../i2c_engine.h"
-#include "../i2c_sw_engine.h"
-
-/*
- * Header of this unit
- */
-
-#include "i2c_sw_engine_dce80.h"
-
-/*
- * Post-requisites: headers required by this unit
- */
-
-#include "dce/dce_8_0_d.h"
-#include "dce/dce_8_0_sh_mask.h"
-
-/*
- * This unit
- */
-
-static const uint32_t ddc_hw_status_addr[] = {
-	mmDC_I2C_DDC1_HW_STATUS,
-	mmDC_I2C_DDC2_HW_STATUS,
-	mmDC_I2C_DDC3_HW_STATUS,
-	mmDC_I2C_DDC4_HW_STATUS,
-	mmDC_I2C_DDC5_HW_STATUS,
-	mmDC_I2C_DDC6_HW_STATUS,
-	mmDC_I2C_DDCVGA_HW_STATUS
-};
-
-/*
- * @brief
- * Cast 'struct i2c_sw_engine *'
- * to 'struct i2c_sw_engine_dce80 *'
- */
-#define FROM_I2C_SW_ENGINE(ptr) \
-	container_of((ptr), struct i2c_sw_engine_dce80, base)
-
-/*
- * @brief
- * Cast 'struct i2c_engine *'
- * to 'struct i2c_sw_engine_dce80 *'
- */
-#define FROM_I2C_ENGINE(ptr) \
-	FROM_I2C_SW_ENGINE(container_of((ptr), struct i2c_sw_engine, base))
-
-/*
- * @brief
- * Cast 'struct engine *'
- * to 'struct i2c_sw_engine_dce80 *'
- */
-#define FROM_ENGINE(ptr) \
-	FROM_I2C_ENGINE(container_of((ptr), struct i2c_engine, base))
-
-static void release_engine(
-	struct engine *engine)
-{
-
-}
-
-static void destruct(
-	struct i2c_sw_engine_dce80 *engine)
-{
-	dal_i2c_sw_engine_destruct(&engine->base);
-}
-
-static void destroy(
-	struct i2c_engine **engine)
-{
-	struct i2c_sw_engine_dce80 *sw_engine = FROM_I2C_ENGINE(*engine);
-
-	destruct(sw_engine);
-
-	kfree(sw_engine);
-
-	*engine = NULL;
-}
-
-static bool acquire_engine(
-	struct i2c_engine *engine,
-	struct ddc *ddc_handle)
-{
-	return dal_i2caux_i2c_sw_engine_acquire_engine(engine, ddc_handle);
-}
-
-static const struct i2c_engine_funcs i2c_engine_funcs = {
-	.acquire_engine = acquire_engine,
-	.destroy = destroy,
-	.get_speed = dal_i2c_sw_engine_get_speed,
-	.set_speed = dal_i2c_sw_engine_set_speed,
-	.setup_engine = dal_i2c_engine_setup_i2c_engine,
-	.submit_channel_request = dal_i2c_sw_engine_submit_channel_request,
-	.process_channel_reply = dal_i2c_engine_process_channel_reply,
-	.get_channel_status = dal_i2c_sw_engine_get_channel_status,
-};
-
-static const struct engine_funcs engine_funcs = {
-	.release_engine = release_engine,
-	.get_engine_type = dal_i2c_sw_engine_get_engine_type,
-	.acquire = dal_i2c_engine_acquire,
-	.submit_request = dal_i2c_sw_engine_submit_request,
-};
-
-static void construct(
-	struct i2c_sw_engine_dce80 *engine,
-	const struct i2c_sw_engine_dce80_create_arg *arg)
-{
-	struct i2c_sw_engine_create_arg arg_base;
-
-	arg_base.ctx = arg->ctx;
-	arg_base.default_speed = arg->default_speed;
-
-	dal_i2c_sw_engine_construct(&engine->base, &arg_base);
-
-	engine->base.base.base.funcs = &engine_funcs;
-	engine->base.base.funcs = &i2c_engine_funcs;
-	engine->base.default_speed = arg->default_speed;
-	engine->engine_id = arg->engine_id;
-}
-
-struct i2c_engine *dal_i2c_sw_engine_dce80_create(
-	const struct i2c_sw_engine_dce80_create_arg *arg)
-{
-	struct i2c_sw_engine_dce80 *engine;
-
-	if (!arg) {
-		BREAK_TO_DEBUGGER();
-		return NULL;
-	}
-
-	engine = kzalloc(sizeof(struct i2c_sw_engine_dce80), GFP_KERNEL);
-
-	if (!engine) {
-		BREAK_TO_DEBUGGER();
-		return NULL;
-	}
-
-	construct(engine, arg);
-	return &engine->base.base;
-}
-
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2caux_dce80.c b/drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2caux_dce80.c
deleted file mode 100644
index ed48596dd2a5..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2caux_dce80.c
+++ /dev/null
@@ -1,284 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-
-/*
- * Pre-requisites: headers required by header of this unit
- */
-#include "include/i2caux_interface.h"
-#include "../i2caux.h"
-
-/*
- * Header of this unit
- */
-
-#include "i2caux_dce80.h"
-
-/*
- * Post-requisites: headers required by this unit
- */
-
-#include "../engine.h"
-#include "../i2c_engine.h"
-#include "../i2c_sw_engine.h"
-#include "i2c_sw_engine_dce80.h"
-#include "../i2c_hw_engine.h"
-#include "i2c_hw_engine_dce80.h"
-#include "../i2c_generic_hw_engine.h"
-#include "../aux_engine.h"
-
-
-#include "../dce110/aux_engine_dce110.h"
-#include "../dce110/i2caux_dce110.h"
-
-#include "dce/dce_8_0_d.h"
-#include "dce/dce_8_0_sh_mask.h"
-
-
-/* set register offset */
-#define SR(reg_name)\
-	.reg_name = mm ## reg_name
-
-/* set register offset with instance */
-#define SRI(reg_name, block, id)\
-	.reg_name = mm ## block ## id ## _ ## reg_name
-
-#define aux_regs(id)\
-[id] = {\
-	AUX_COMMON_REG_LIST(id), \
-	.AUX_RESET_MASK = 0 \
-}
-
-static const struct dce110_aux_registers dce80_aux_regs[] = {
-		aux_regs(0),
-		aux_regs(1),
-		aux_regs(2),
-		aux_regs(3),
-		aux_regs(4),
-		aux_regs(5)
-};
-
-/*
- * This unit
- */
-
-#define FROM_I2C_AUX(ptr) \
-	container_of((ptr), struct i2caux_dce80, base)
-
-static void destruct(
-	struct i2caux_dce80 *i2caux_dce80)
-{
-	dal_i2caux_destruct(&i2caux_dce80->base);
-}
-
-static void destroy(
-	struct i2caux **i2c_engine)
-{
-	struct i2caux_dce80 *i2caux_dce80 = FROM_I2C_AUX(*i2c_engine);
-
-	destruct(i2caux_dce80);
-
-	kfree(i2caux_dce80);
-
-	*i2c_engine = NULL;
-}
-
-static struct i2c_engine *acquire_i2c_hw_engine(
-	struct i2caux *i2caux,
-	struct ddc *ddc)
-{
-	struct i2caux_dce80 *i2caux_dce80 = FROM_I2C_AUX(i2caux);
-
-	struct i2c_engine *engine = NULL;
-	bool non_generic;
-
-	if (!ddc)
-		return NULL;
-
-	if (ddc->hw_info.hw_supported) {
-		enum gpio_ddc_line line = dal_ddc_get_line(ddc);
-
-		if (line < GPIO_DDC_LINE_COUNT) {
-			non_generic = true;
-			engine = i2caux->i2c_hw_engines[line];
-		}
-	}
-
-	if (!engine) {
-		non_generic = false;
-		engine = i2caux->i2c_generic_hw_engine;
-	}
-
-	if (!engine)
-		return NULL;
-
-	if (non_generic) {
-		if (!i2caux_dce80->i2c_hw_buffer_in_use &&
-			engine->base.funcs->acquire(&engine->base, ddc)) {
-			i2caux_dce80->i2c_hw_buffer_in_use = true;
-			return engine;
-		}
-	} else {
-		if (engine->base.funcs->acquire(&engine->base, ddc))
-			return engine;
-	}
-
-	return NULL;
-}
-
-static void release_engine(
-	struct i2caux *i2caux,
-	struct engine *engine)
-{
-	if (engine->funcs->get_engine_type(engine) ==
-		I2CAUX_ENGINE_TYPE_I2C_DDC_HW)
-		FROM_I2C_AUX(i2caux)->i2c_hw_buffer_in_use = false;
-
-	dal_i2caux_release_engine(i2caux, engine);
-}
-
-static const enum gpio_ddc_line hw_ddc_lines[] = {
-	GPIO_DDC_LINE_DDC1,
-	GPIO_DDC_LINE_DDC2,
-	GPIO_DDC_LINE_DDC3,
-	GPIO_DDC_LINE_DDC4,
-	GPIO_DDC_LINE_DDC5,
-	GPIO_DDC_LINE_DDC6,
-	GPIO_DDC_LINE_DDC_VGA
-};
-
-static const enum gpio_ddc_line hw_aux_lines[] = {
-	GPIO_DDC_LINE_DDC1,
-	GPIO_DDC_LINE_DDC2,
-	GPIO_DDC_LINE_DDC3,
-	GPIO_DDC_LINE_DDC4,
-	GPIO_DDC_LINE_DDC5,
-	GPIO_DDC_LINE_DDC6
-};
-
-static const struct i2caux_funcs i2caux_funcs = {
-	.destroy = destroy,
-	.acquire_i2c_hw_engine = acquire_i2c_hw_engine,
-	.release_engine = release_engine,
-	.acquire_i2c_sw_engine = dal_i2caux_acquire_i2c_sw_engine,
-	.acquire_aux_engine = dal_i2caux_acquire_aux_engine,
-};
-
-static void construct(
-	struct i2caux_dce80 *i2caux_dce80,
-	struct dc_context *ctx)
-{
-	/* The entire family has the I2C engine reference clock frequency
-	 * changed from XTALIN (27 MHz) to XTALIN/2 (13.5 MHz) */
-
-	struct i2caux *base = &i2caux_dce80->base;
-
-	uint32_t reference_frequency =
-		dal_i2caux_get_reference_clock(ctx->dc_bios) >> 1;
-
-	/*bool use_i2c_sw_engine = dal_adapter_service_is_feature_supported(as,
-		FEATURE_RESTORE_USAGE_I2C_SW_ENGINE);*/
-
-	/* Use SW I2C for DCE8 currently, since we have a bug with HW I2C */
-	bool use_i2c_sw_engine = true;
-
-	uint32_t i;
-
-	dal_i2caux_construct(base, ctx);
-
-	i2caux_dce80->base.funcs = &i2caux_funcs;
-	i2caux_dce80->i2c_hw_buffer_in_use = false;
-
-	/* Create I2C HW engines (HW + SW pairs)
-	 * for all lines which have assisted HW DDC;
-	 * 'i' (the loop counter) is used as the DDC/AUX engine_id */
-
-	i = 0;
-
-	do {
-		enum gpio_ddc_line line_id = hw_ddc_lines[i];
-
-		struct i2c_hw_engine_dce80_create_arg hw_arg;
-
-		if (use_i2c_sw_engine) {
-			struct i2c_sw_engine_dce80_create_arg sw_arg;
-
-			sw_arg.engine_id = i;
-			sw_arg.default_speed = base->default_i2c_sw_speed;
-			sw_arg.ctx = ctx;
-			base->i2c_sw_engines[line_id] =
-				dal_i2c_sw_engine_dce80_create(&sw_arg);
-		}
-
-		hw_arg.engine_id = i;
-		hw_arg.reference_frequency = reference_frequency;
-		hw_arg.default_speed = base->default_i2c_hw_speed;
-		hw_arg.ctx = ctx;
-
-		base->i2c_hw_engines[line_id] =
-			dal_i2c_hw_engine_dce80_create(&hw_arg);
-
-		++i;
-	} while (i < ARRAY_SIZE(hw_ddc_lines));
-
-	/* Create AUX engines for all lines which have assisted HW AUX;
-	 * 'i' (the loop counter) is used as the DDC/AUX engine_id */
-
-	i = 0;
-
-	do {
-		enum gpio_ddc_line line_id = hw_aux_lines[i];
-
-		struct aux_engine_dce110_init_data arg;
-
-		arg.engine_id = i;
-		arg.timeout_period = base->aux_timeout_period;
-		arg.ctx = ctx;
-		arg.regs = &dce80_aux_regs[i];
-
-		base->aux_engines[line_id] =
-			dal_aux_engine_dce110_create(&arg);
-
-		++i;
-	} while (i < ARRAY_SIZE(hw_aux_lines));
-
-	/* TODO Generic I2C SW and HW */
-}
-
-struct i2caux *dal_i2caux_dce80_create(
-	struct dc_context *ctx)
-{
-	struct i2caux_dce80 *i2caux_dce80 =
-		kzalloc(sizeof(struct i2caux_dce80), GFP_KERNEL);
-
-	if (!i2caux_dce80) {
-		BREAK_TO_DEBUGGER();
-		return NULL;
-	}
-
-	construct(i2caux_dce80, ctx);
-	return &i2caux_dce80->base;
-}
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dcn10/i2caux_dcn10.c b/drivers/gpu/drm/amd/display/dc/i2caux/dcn10/i2caux_dcn10.c
deleted file mode 100644
index a59c1f50c1e8..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dcn10/i2caux_dcn10.c
+++ /dev/null
@@ -1,120 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-
-#include "include/i2caux_interface.h"
-#include "../i2caux.h"
-#include "../engine.h"
-#include "../i2c_engine.h"
-#include "../i2c_sw_engine.h"
-#include "../i2c_hw_engine.h"
-
-#include "../dce110/aux_engine_dce110.h"
-#include "../dce110/i2c_hw_engine_dce110.h"
-#include "../dce110/i2caux_dce110.h"
-
-#include "dcn/dcn_1_0_offset.h"
-#include "dcn/dcn_1_0_sh_mask.h"
-#include "soc15_hw_ip.h"
-#include "vega10_ip_offset.h"
-
-/* begin *********************
- * macros to expand register list macros defined in HW object header file */
-
-#define BASE_INNER(seg) \
-	DCE_BASE__INST0_SEG ## seg
-
-/* compile time expand base address. */
-#define BASE(seg) \
-	BASE_INNER(seg)
-
-#define SR(reg_name)\
-		.reg_name = BASE(mm ## reg_name ## _BASE_IDX) +  \
-					mm ## reg_name
-
-#define SRI(reg_name, block, id)\
-	.reg_name = BASE(mm ## block ## id ## _ ## reg_name ## _BASE_IDX) + \
-					mm ## block ## id ## _ ## reg_name
-/* macros to expand register list macros defined in HW object header file
- * end *********************/
-
-#define aux_regs(id)\
-[id] = {\
-	AUX_COMMON_REG_LIST(id), \
-	.AUX_RESET_MASK = DP_AUX0_AUX_CONTROL__AUX_RESET_MASK \
-}
-
-#define hw_engine_regs(id)\
-{\
-		I2C_HW_ENGINE_COMMON_REG_LIST(id) \
-}
-
-static const struct dce110_aux_registers dcn10_aux_regs[] = {
-		aux_regs(0),
-		aux_regs(1),
-		aux_regs(2),
-		aux_regs(3),
-		aux_regs(4),
-		aux_regs(5),
-};
-
-static const struct dce110_i2c_hw_engine_registers dcn10_hw_engine_regs[] = {
-		hw_engine_regs(1),
-		hw_engine_regs(2),
-		hw_engine_regs(3),
-		hw_engine_regs(4),
-		hw_engine_regs(5),
-		hw_engine_regs(6)
-};
-
-static const struct dce110_i2c_hw_engine_shift i2c_shift = {
-		I2C_COMMON_MASK_SH_LIST_DCE110(__SHIFT)
-};
-
-static const struct dce110_i2c_hw_engine_mask i2c_mask = {
-		I2C_COMMON_MASK_SH_LIST_DCE110(_MASK)
-};
-
-struct i2caux *dal_i2caux_dcn10_create(
-	struct dc_context *ctx)
-{
-	struct i2caux_dce110 *i2caux_dce110 =
-		kzalloc(sizeof(struct i2caux_dce110), GFP_KERNEL);
-
-	if (!i2caux_dce110) {
-		ASSERT_CRITICAL(false);
-		return NULL;
-	}
-
-	dal_i2caux_dce110_construct(i2caux_dce110,
-				    ctx,
-				    ARRAY_SIZE(dcn10_aux_regs),
-				    dcn10_aux_regs,
-				    dcn10_hw_engine_regs,
-				    &i2c_shift,
-				    &i2c_mask);
-	return &i2caux_dce110->base;
-}
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dcn10/i2caux_dcn10.h b/drivers/gpu/drm/amd/display/dc/i2caux/dcn10/i2caux_dcn10.h
deleted file mode 100644
index aeb4a86463d4..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dcn10/i2caux_dcn10.h
+++ /dev/null
@@ -1,32 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#ifndef __DAL_I2C_AUX_DCN10_H__
-#define __DAL_I2C_AUX_DCN10_H__
-
-struct i2caux *dal_i2caux_dcn10_create(
-	struct dc_context *ctx);
-
-#endif /* __DAL_I2C_AUX_DCN10_H__ */
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/diagnostics/i2caux_diag.c b/drivers/gpu/drm/amd/display/dc/i2caux/diagnostics/i2caux_diag.c
deleted file mode 100644
index e6408f644086..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/diagnostics/i2caux_diag.c
+++ /dev/null
@@ -1,97 +0,0 @@
-/*
- * Copyright 2012-16 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-
-/*
- * Pre-requisites: headers required by header of this unit
- */
-#include "include/i2caux_interface.h"
-#include "../i2caux.h"
-#include "../engine.h"
-#include "../i2c_engine.h"
-#include "../i2c_sw_engine.h"
-#include "../i2c_hw_engine.h"
-
-/*
- * Header of this unit
- */
-#include "i2caux_diag.h"
-
-/*
- * Post-requisites: headers required by this unit
- */
-
-/*
- * This unit
- */
-
-static void destruct(
-	struct i2caux *i2caux)
-{
-	dal_i2caux_destruct(i2caux);
-}
-
-static void destroy(
-	struct i2caux **i2c_engine)
-{
-	destruct(*i2c_engine);
-
-	kfree(*i2c_engine);
-
-	*i2c_engine = NULL;
-}
-
-/* function table */
-static const struct i2caux_funcs i2caux_funcs = {
-	.destroy = destroy,
-	.acquire_i2c_hw_engine = NULL,
-	.release_engine = NULL,
-	.acquire_i2c_sw_engine = NULL,
-	.acquire_aux_engine = NULL,
-};
-
-static void construct(
-	struct i2caux *i2caux,
-	struct dc_context *ctx)
-{
-	dal_i2caux_construct(i2caux, ctx);
-	i2caux->funcs = &i2caux_funcs;
-}
-
-struct i2caux *dal_i2caux_diag_fpga_create(
-	struct dc_context *ctx)
-{
-	struct i2caux *i2caux = kzalloc(sizeof(struct i2caux), GFP_KERNEL);
-
-	if (!i2caux) {
-		ASSERT_CRITICAL(false);
-		return NULL;
-	}
-
-	construct(i2caux, ctx);
-	return i2caux;
-}
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/diagnostics/i2caux_diag.h b/drivers/gpu/drm/amd/display/dc/i2caux/diagnostics/i2caux_diag.h
deleted file mode 100644
index a83eeb748283..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/diagnostics/i2caux_diag.h
+++ /dev/null
@@ -1,32 +0,0 @@
-/*
- * Copyright 2012-16 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#ifndef __DAL_I2C_AUX_DIAG_FPGA_H__
-#define __DAL_I2C_AUX_DIAG_FPGA_H__
-
-struct i2caux *dal_i2caux_diag_fpga_create(
-	struct dc_context *ctx);
-
-#endif /* __DAL_I2C_AUX_DIAG_FPGA_H__ */
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/engine.h b/drivers/gpu/drm/amd/display/dc/i2caux/engine.h
deleted file mode 100644
index b16fb1ff687d..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/engine.h
+++ /dev/null
@@ -1,111 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#ifndef __DAL_ENGINE_H__
-#define __DAL_ENGINE_H__
-
-#include "dc_ddc_types.h"
-
-enum i2caux_transaction_operation {
-	I2CAUX_TRANSACTION_READ,
-	I2CAUX_TRANSACTION_WRITE
-};
-
-enum i2caux_transaction_address_space {
-	I2CAUX_TRANSACTION_ADDRESS_SPACE_I2C = 1,
-	I2CAUX_TRANSACTION_ADDRESS_SPACE_DPCD
-};
-
-struct i2caux_transaction_payload {
-	enum i2caux_transaction_address_space address_space;
-	uint32_t address;
-	uint32_t length;
-	uint8_t *data;
-};
-
-enum i2caux_transaction_status {
-	I2CAUX_TRANSACTION_STATUS_UNKNOWN = (-1L),
-	I2CAUX_TRANSACTION_STATUS_SUCCEEDED,
-	I2CAUX_TRANSACTION_STATUS_FAILED_CHANNEL_BUSY,
-	I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT,
-	I2CAUX_TRANSACTION_STATUS_FAILED_PROTOCOL_ERROR,
-	I2CAUX_TRANSACTION_STATUS_FAILED_NACK,
-	I2CAUX_TRANSACTION_STATUS_FAILED_INCOMPLETE,
-	I2CAUX_TRANSACTION_STATUS_FAILED_OPERATION,
-	I2CAUX_TRANSACTION_STATUS_FAILED_INVALID_OPERATION,
-	I2CAUX_TRANSACTION_STATUS_FAILED_BUFFER_OVERFLOW,
-	I2CAUX_TRANSACTION_STATUS_FAILED_HPD_DISCON
-};
-
-struct i2caux_transaction_request {
-	enum i2caux_transaction_operation operation;
-	struct i2caux_transaction_payload payload;
-	enum i2caux_transaction_status status;
-};
-
-enum i2caux_engine_type {
-	I2CAUX_ENGINE_TYPE_UNKNOWN = (-1L),
-	I2CAUX_ENGINE_TYPE_AUX,
-	I2CAUX_ENGINE_TYPE_I2C_DDC_HW,
-	I2CAUX_ENGINE_TYPE_I2C_GENERIC_HW,
-	I2CAUX_ENGINE_TYPE_I2C_SW
-};
-
-enum i2c_default_speed {
-	I2CAUX_DEFAULT_I2C_HW_SPEED = 50,
-	I2CAUX_DEFAULT_I2C_SW_SPEED = 50
-};
-
-struct engine;
-
-struct engine_funcs {
-	enum i2caux_engine_type (*get_engine_type)(
-		const struct engine *engine);
-	bool (*acquire)(
-		struct engine *engine,
-		struct ddc *ddc);
-	bool (*submit_request)(
-		struct engine *engine,
-		struct i2caux_transaction_request *request,
-		bool middle_of_transaction);
-	void (*release_engine)(
-		struct engine *engine);
-};
-
-struct engine {
-	const struct engine_funcs *funcs;
-	uint32_t inst;
-	struct ddc *ddc;
-	struct dc_context *ctx;
-};
-
-void dal_i2caux_construct_engine(
-	struct engine *engine,
-	struct dc_context *ctx);
-
-void dal_i2caux_destruct_engine(
-	struct engine *engine);
-
-#endif
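
engine.h above implements polymorphism in C: every backend fills in an engine_funcs table and callers dispatch through the funcs pointer without knowing the concrete engine type. A minimal sketch of that dispatch pattern, with a hypothetical software backend:

#include <stdbool.h>
#include <stdio.h>

struct engine;

struct engine_funcs {
	bool (*submit_request)(struct engine *e);
};

struct engine {
	const struct engine_funcs *funcs;
	const char *name;
};

/* A hypothetical backend supplies concrete implementations... */
static bool sw_submit(struct engine *e)
{
	printf("%s: bit-banged transfer\n", e->name);
	return true;
}

static const struct engine_funcs sw_funcs = {
	.submit_request = sw_submit,
};

int main(void)
{
	struct engine e = { .funcs = &sw_funcs, .name = "i2c-sw" };

	/* ...and callers dispatch without knowing the backend. */
	return e.funcs->submit_request(&e) ? 0 : 1;
}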
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/i2c_engine.c b/drivers/gpu/drm/amd/display/dc/i2caux/i2c_engine.c
deleted file mode 100644
index 70e20bd47ce4..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/i2c_engine.c
+++ /dev/null
@@ -1,118 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-
-/*
- * Pre-requisites: headers required by header of this unit
- */
-#include "include/i2caux_interface.h"
-#include "engine.h"
-
-/*
- * Header of this unit
- */
-
-#include "i2c_engine.h"
-
-/*
- * Post-requisites: headers required by this unit
- */
-
-/*
- * This unit
- */
-
-#define FROM_ENGINE(ptr) \
-	container_of((ptr), struct i2c_engine, base)
-
-bool dal_i2c_engine_acquire(
-	struct engine *engine,
-	struct ddc *ddc_handle)
-{
-	struct i2c_engine *i2c_engine = FROM_ENGINE(engine);
-
-	uint32_t counter = 0;
-	bool result;
-
-	do {
-		result = i2c_engine->funcs->acquire_engine(
-			i2c_engine, ddc_handle);
-
-		if (result)
-			break;
-
-		/* i2c_engine is held busy by VBIOS; wait and retry */
-
-		udelay(10);
-
-		++counter;
-	} while (counter < 2);
-
-	if (result) {
-		if (!i2c_engine->funcs->setup_engine(i2c_engine)) {
-			engine->funcs->release_engine(engine);
-			result = false;
-		}
-	}
-
-	return result;
-}
-
-bool dal_i2c_engine_setup_i2c_engine(
-	struct i2c_engine *engine)
-{
-	/* Derivative classes do not have to override this */
-
-	return true;
-}
-
-void dal_i2c_engine_submit_channel_request(
-	struct i2c_engine *engine,
-	struct i2c_request_transaction_data *request)
-{
-
-}
-
-void dal_i2c_engine_process_channel_reply(
-	struct i2c_engine *engine,
-	struct i2c_reply_transaction_data *reply)
-{
-
-}
-
-void dal_i2c_engine_construct(
-	struct i2c_engine *engine,
-	struct dc_context *ctx)
-{
-	dal_i2caux_construct_engine(&engine->base, ctx);
-	engine->timeout_delay = 0;
-}
-
-void dal_i2c_engine_destruct(
-	struct i2c_engine *engine)
-{
-	dal_i2caux_destruct_engine(&engine->base);
-}
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/i2c_engine.h b/drivers/gpu/drm/amd/display/dc/i2caux/i2c_engine.h
deleted file mode 100644
index ded6ea34b714..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/i2c_engine.h
+++ /dev/null
@@ -1,115 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#ifndef __DAL_I2C_ENGINE_H__
-#define __DAL_I2C_ENGINE_H__
-
-enum i2c_channel_operation_result {
-	I2C_CHANNEL_OPERATION_SUCCEEDED,
-	I2C_CHANNEL_OPERATION_FAILED,
-	I2C_CHANNEL_OPERATION_NOT_GRANTED,
-	I2C_CHANNEL_OPERATION_IS_BUSY,
-	I2C_CHANNEL_OPERATION_NO_HANDLE_PROVIDED,
-	I2C_CHANNEL_OPERATION_CHANNEL_IN_USE,
-	I2C_CHANNEL_OPERATION_CHANNEL_CLIENT_MAX_ALLOWED,
-	I2C_CHANNEL_OPERATION_ENGINE_BUSY,
-	I2C_CHANNEL_OPERATION_TIMEOUT,
-	I2C_CHANNEL_OPERATION_NO_RESPONSE,
-	I2C_CHANNEL_OPERATION_HW_REQUEST_I2C_BUS,
-	I2C_CHANNEL_OPERATION_WRONG_PARAMETER,
-	I2C_CHANNEL_OPERATION_OUT_NB_OF_RETRIES,
-	I2C_CHANNEL_OPERATION_NOT_STARTED
-};
-
-struct i2c_request_transaction_data {
-	enum i2caux_transaction_action action;
-	enum i2c_channel_operation_result status;
-	uint8_t address;
-	uint32_t length;
-	uint8_t *data;
-};
-
-struct i2c_reply_transaction_data {
-	uint32_t length;
-	uint8_t *data;
-};
-
-struct i2c_engine;
-
-struct i2c_engine_funcs {
-	void (*destroy)(
-		struct i2c_engine **ptr);
-	uint32_t (*get_speed)(
-		const struct i2c_engine *engine);
-	void (*set_speed)(
-		struct i2c_engine *engine,
-		uint32_t speed);
-	bool (*acquire_engine)(
-		struct i2c_engine *engine,
-		struct ddc *ddc);
-	bool (*setup_engine)(
-		struct i2c_engine *engine);
-	void (*submit_channel_request)(
-		struct i2c_engine *engine,
-		struct i2c_request_transaction_data *request);
-	void (*process_channel_reply)(
-		struct i2c_engine *engine,
-		struct i2c_reply_transaction_data *reply);
-	enum i2c_channel_operation_result (*get_channel_status)(
-		struct i2c_engine *engine,
-		uint8_t *returned_bytes);
-};
-
-struct i2c_engine {
-	struct engine base;
-	const struct i2c_engine_funcs *funcs;
-	uint32_t timeout_delay;
-	uint32_t setup_limit;
-	uint32_t send_reset_length;
-};
-
-void dal_i2c_engine_construct(
-	struct i2c_engine *engine,
-	struct dc_context *ctx);
-
-void dal_i2c_engine_destruct(
-	struct i2c_engine *engine);
-
-bool dal_i2c_engine_setup_i2c_engine(
-	struct i2c_engine *engine);
-
-void dal_i2c_engine_submit_channel_request(
-	struct i2c_engine *engine,
-	struct i2c_request_transaction_data *request);
-
-void dal_i2c_engine_process_channel_reply(
-	struct i2c_engine *engine,
-	struct i2c_reply_transaction_data *reply);
-
-bool dal_i2c_engine_acquire(
-	struct engine *ptr,
-	struct ddc *ddc_handle);
-
-#endif
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/i2c_generic_hw_engine.c b/drivers/gpu/drm/amd/display/dc/i2caux/i2c_generic_hw_engine.c
deleted file mode 100644
index 5a4295e0fae5..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/i2c_generic_hw_engine.c
+++ /dev/null
@@ -1,284 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-
-/*
- * Pre-requisites: headers required by header of this unit
- */
-#include "include/i2caux_interface.h"
-#include "engine.h"
-#include "i2c_engine.h"
-#include "i2c_hw_engine.h"
-
-/*
- * Header of this unit
- */
-
-#include "i2c_generic_hw_engine.h"
-
-/*
- * Post-requisites: headers required by this unit
- */
-
-/*
- * This unit
- */
-
-/*
- * @brief
- * Cast 'struct i2c_hw_engine *'
- * to 'struct i2c_generic_hw_engine *'
- */
-#define FROM_I2C_HW_ENGINE(ptr) \
-	container_of((ptr), struct i2c_generic_hw_engine, base)
-
-/*
- * @brief
- * Cast 'struct i2c_engine *'
- * to 'struct i2c_generic_hw_engine *'
- */
-#define FROM_I2C_ENGINE(ptr) \
-	FROM_I2C_HW_ENGINE(container_of((ptr), struct i2c_hw_engine, base))
-
-/*
- * @brief
- * Cast 'struct engine *'
- * to 'struct i2c_generic_hw_engine *'
- */
-#define FROM_ENGINE(ptr) \
-	FROM_I2C_ENGINE(container_of((ptr), struct i2c_engine, base))
-
-enum i2caux_engine_type dal_i2c_generic_hw_engine_get_engine_type(
-	const struct engine *engine)
-{
-	return I2CAUX_ENGINE_TYPE_I2C_GENERIC_HW;
-}
-
-/*
- * @brief
- * Single transaction handling.
- * Since transaction may be bigger than HW buffer size,
- * it divides transaction to sub-transactions
- * and uses batch transaction feature of the engine.
- */
-bool dal_i2c_generic_hw_engine_submit_request(
-	struct engine *engine,
-	struct i2caux_transaction_request *i2caux_request,
-	bool middle_of_transaction)
-{
-	struct i2c_generic_hw_engine *hw_engine = FROM_ENGINE(engine);
-
-	struct i2c_hw_engine *base = &hw_engine->base;
-
-	uint32_t max_payload_size =
-		base->funcs->get_hw_buffer_available_size(base);
-
-	bool initial_stop_bit = !middle_of_transaction;
-
-	struct i2c_generic_transaction_attributes attributes;
-
-	enum i2c_channel_operation_result operation_result =
-		I2C_CHANNEL_OPERATION_FAILED;
-
-	bool result = false;
-
-	/* setup transaction initial properties */
-
-	uint8_t address = i2caux_request->payload.address;
-	uint8_t *current_payload = i2caux_request->payload.data;
-	uint32_t remaining_payload_size = i2caux_request->payload.length;
-
-	bool first_iteration = true;
-
-	if (i2caux_request->operation == I2CAUX_TRANSACTION_READ)
-		attributes.action = I2CAUX_TRANSACTION_ACTION_I2C_READ;
-	else if (i2caux_request->operation == I2CAUX_TRANSACTION_WRITE)
-		attributes.action = I2CAUX_TRANSACTION_ACTION_I2C_WRITE;
-	else {
-		i2caux_request->status =
-			I2CAUX_TRANSACTION_STATUS_FAILED_INVALID_OPERATION;
-		return false;
-	}
-
-	/* Do batch transaction.
-	 * Divide read/write data into payloads which fit HW buffer size.
-	 * 1. Single transaction:
-	 *    start_bit = 1, stop_bit depends on session state, ack_on_read = 0;
-	 * 2. Start of batch transaction:
-	 *    start_bit = 1, stop_bit = 0, ack_on_read = 1;
-	 * 3. Middle of batch transaction:
-	 *    start_bit = 0, stop_bit = 0, ack_on_read = 1;
-	 * 4. End of batch transaction:
-	 *    start_bit = 0, stop_bit depends on session state, ack_on_read = 0.
-	 * Session stop bit is set if 'middle_of_transaction' = 0. */
-
-	while (remaining_payload_size) {
-		uint32_t current_transaction_size;
-		uint32_t current_payload_size;
-
-		bool last_iteration;
-		bool stop_bit;
-
-		/* Calculate current transaction size and payload size.
-		 * Transaction size = total number of bytes in transaction,
-		 * including slave's address;
-		 * Payload size = number of data bytes in transaction. */
-
-		if (first_iteration) {
-			/* The first sub-transaction sends the slave's address,
-			 * so reserve one byte of the buffer for it */
-			current_transaction_size =
-				(remaining_payload_size > max_payload_size - 1) ?
-				max_payload_size :
-				remaining_payload_size + 1;
-
-			current_payload_size = current_transaction_size - 1;
-		} else {
-			/* Second and further sub-transactions have the
-			 * entire buffer available for data */
-			current_transaction_size =
-				(remaining_payload_size > max_payload_size) ?
-				max_payload_size :
-				remaining_payload_size;
-
-			current_payload_size = current_transaction_size;
-		}
-
-		last_iteration =
-			(remaining_payload_size == current_payload_size);
-
-		stop_bit = last_iteration ? initial_stop_bit : false;
-
-		/* write slave device address */
-
-		if (first_iteration)
-			hw_engine->funcs->write_address(hw_engine, address);
-
-		/* write current portion of data, if requested */
-
-		if (i2caux_request->operation == I2CAUX_TRANSACTION_WRITE)
-			hw_engine->funcs->write_data(
-				hw_engine,
-				current_payload,
-				current_payload_size);
-
-		/* execute transaction */
-
-		attributes.start_bit = first_iteration;
-		attributes.stop_bit = stop_bit;
-		attributes.last_read = last_iteration;
-		attributes.transaction_size = current_transaction_size;
-
-		hw_engine->funcs->execute_transaction(hw_engine, &attributes);
-
-		/* wait until transaction is processed; if it fails - quit */
-
-		operation_result = base->funcs->wait_on_operation_result(
-			base,
-			base->funcs->get_transaction_timeout(
-				base, current_transaction_size),
-			I2C_CHANNEL_OPERATION_ENGINE_BUSY);
-
-		if (operation_result != I2C_CHANNEL_OPERATION_SUCCEEDED)
-			break;
-
-		/* read current portion of data, if requested */
-
-		/* the read offset should be 1 for first sub-transaction,
-		 * and 0 for any next one */
-
-		if (i2caux_request->operation == I2CAUX_TRANSACTION_READ)
-			hw_engine->funcs->read_data(hw_engine, current_payload,
-				current_payload_size, first_iteration ? 1 : 0);
-
-		/* update loop variables */
-
-		first_iteration = false;
-		current_payload += current_payload_size;
-		remaining_payload_size -= current_payload_size;
-	}
-
-	/* update transaction status */
-
-	switch (operation_result) {
-	case I2C_CHANNEL_OPERATION_SUCCEEDED:
-		i2caux_request->status =
-			I2CAUX_TRANSACTION_STATUS_SUCCEEDED;
-		result = true;
-	break;
-	case I2C_CHANNEL_OPERATION_NO_RESPONSE:
-		i2caux_request->status =
-			I2CAUX_TRANSACTION_STATUS_FAILED_NACK;
-	break;
-	case I2C_CHANNEL_OPERATION_TIMEOUT:
-		i2caux_request->status =
-			I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-	break;
-	case I2C_CHANNEL_OPERATION_FAILED:
-		i2caux_request->status =
-			I2CAUX_TRANSACTION_STATUS_FAILED_INCOMPLETE;
-	break;
-	default:
-		i2caux_request->status =
-			I2CAUX_TRANSACTION_STATUS_FAILED_OPERATION;
-	}
-
-	return result;
-}
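
To make the chunking above concrete, a worked example (the buffer and payload sizes are hypothetical): with a 16-byte HW buffer, a 40-byte write splits into three sub-transactions, because the first one spends a buffer byte on the slave address:

	/* max_payload_size = 16, payload length = 40 (hypothetical):
	 * iteration 1: transaction_size = 16 (addr + 15 data), payload = 15
	 * iteration 2: transaction_size = 16,                  payload = 16
	 * iteration 3: transaction_size =  9,                  payload =  9
	 * 15 + 16 + 9 = 40; only the last iteration may carry the
	 * session stop bit. */
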
-
-/*
- * @brief
- * Returns the number of microseconds to wait before the transaction
- * is considered timed out
- */
-uint32_t dal_i2c_generic_hw_engine_get_transaction_timeout(
-	const struct i2c_hw_engine *engine,
-	uint32_t length)
-{
-	const struct i2c_engine *base = &engine->base;
-
-	uint32_t speed = base->funcs->get_speed(base);
-
-	if (!speed)
-		return 0;
-
-	/* total timeout = period_timeout * (start + data bits count + stop) */
-
-	return ((1000 * TRANSACTION_TIMEOUT_IN_I2C_CLOCKS) / speed) *
-		(1 + (length << 3) + 1);
-}
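
A worked instance of this formula (the speed and length are hypothetical):

	/* speed = 100 kHz, length = 2, TRANSACTION_TIMEOUT_IN_I2C_CLOCKS = 32:
	 *   per-bit budget = (1000 * 32) / 100 = 320 us
	 *   bit slots      = 1 + (2 << 3) + 1  = 18
	 *   total timeout  = 320 * 18          = 5760 us */
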
-
-void dal_i2c_generic_hw_engine_construct(
-	struct i2c_generic_hw_engine *engine,
-	struct dc_context *ctx)
-{
-	dal_i2c_hw_engine_construct(&engine->base, ctx);
-}
-
-void dal_i2c_generic_hw_engine_destruct(
-	struct i2c_generic_hw_engine *engine)
-{
-	dal_i2c_hw_engine_destruct(&engine->base);
-}
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/i2c_generic_hw_engine.h b/drivers/gpu/drm/amd/display/dc/i2caux/i2c_generic_hw_engine.h
deleted file mode 100644
index 1da0397b04a2..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/i2c_generic_hw_engine.h
+++ /dev/null
@@ -1,77 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#ifndef __DAL_I2C_GENERIC_HW_ENGINE_H__
-#define __DAL_I2C_GENERIC_HW_ENGINE_H__
-
-struct i2c_generic_transaction_attributes {
-	enum i2caux_transaction_action action;
-	uint32_t transaction_size;
-	bool start_bit;
-	bool stop_bit;
-	bool last_read;
-};
-
-struct i2c_generic_hw_engine;
-
-struct i2c_generic_hw_engine_funcs {
-	void (*write_address)(
-		struct i2c_generic_hw_engine *engine,
-		uint8_t address);
-	void (*write_data)(
-		struct i2c_generic_hw_engine *engine,
-		const uint8_t *buffer,
-		uint32_t length);
-	void (*read_data)(
-		struct i2c_generic_hw_engine *engine,
-		uint8_t *buffer,
-		uint32_t length,
-		uint32_t offset);
-	void (*execute_transaction)(
-		struct i2c_generic_hw_engine *engine,
-		struct i2c_generic_transaction_attributes *attributes);
-};
-
-struct i2c_generic_hw_engine {
-	struct i2c_hw_engine base;
-	const struct i2c_generic_hw_engine_funcs *funcs;
-};
-
-void dal_i2c_generic_hw_engine_construct(
-	struct i2c_generic_hw_engine *engine,
-	struct dc_context *ctx);
-
-void dal_i2c_generic_hw_engine_destruct(
-	struct i2c_generic_hw_engine *engine);
-enum i2caux_engine_type dal_i2c_generic_hw_engine_get_engine_type(
-	const struct engine *engine);
-bool dal_i2c_generic_hw_engine_submit_request(
-	struct engine *ptr,
-	struct i2caux_transaction_request *i2caux_request,
-	bool middle_of_transaction);
-uint32_t dal_i2c_generic_hw_engine_get_transaction_timeout(
-	const struct i2c_hw_engine *engine,
-	uint32_t length);
-#endif
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/i2c_hw_engine.c b/drivers/gpu/drm/amd/display/dc/i2caux/i2c_hw_engine.c
deleted file mode 100644
index 141898533e8e..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/i2c_hw_engine.c
+++ /dev/null
@@ -1,251 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-#include "dm_event_log.h"
-
-/*
- * Pre-requisites: headers required by header of this unit
- */
-#include "include/i2caux_interface.h"
-#include "engine.h"
-#include "i2c_engine.h"
-
-/*
- * Header of this unit
- */
-
-#include "i2c_hw_engine.h"
-
-/*
- * Post-requisites: headers required by this unit
- */
-
-/*
- * This unit
- */
-
-/*
- * @brief
- * Cast 'struct i2c_engine *'
- * to 'struct i2c_hw_engine *'
- */
-#define FROM_I2C_ENGINE(ptr) \
-	container_of((ptr), struct i2c_hw_engine, base)
-
-/*
- * @brief
- * Cast 'struct engine *'
- * to 'struct i2c_hw_engine *'
- */
-#define FROM_ENGINE(ptr) \
-	FROM_I2C_ENGINE(container_of((ptr), struct i2c_engine, base))
-
-enum i2caux_engine_type dal_i2c_hw_engine_get_engine_type(
-	const struct engine *engine)
-{
-	return I2CAUX_ENGINE_TYPE_I2C_DDC_HW;
-}
-
-bool dal_i2c_hw_engine_submit_request(
-	struct engine *engine,
-	struct i2caux_transaction_request *i2caux_request,
-	bool middle_of_transaction)
-{
-	struct i2c_hw_engine *hw_engine = FROM_ENGINE(engine);
-
-	struct i2c_request_transaction_data request;
-
-	uint32_t transaction_timeout;
-
-	enum i2c_channel_operation_result operation_result;
-
-	bool result = false;
-
-	/* We require the following:
-	 * the transaction length must not exceed the number of free
-	 * bytes in the HW buffer (minus one for the address) */
-
-	if (i2caux_request->payload.length >=
-		hw_engine->funcs->get_hw_buffer_available_size(hw_engine)) {
-		i2caux_request->status =
-			I2CAUX_TRANSACTION_STATUS_FAILED_BUFFER_OVERFLOW;
-		return false;
-	}
-
-	if (i2caux_request->operation == I2CAUX_TRANSACTION_READ)
-		request.action = middle_of_transaction ?
-			I2CAUX_TRANSACTION_ACTION_I2C_READ_MOT :
-			I2CAUX_TRANSACTION_ACTION_I2C_READ;
-	else if (i2caux_request->operation == I2CAUX_TRANSACTION_WRITE)
-		request.action = middle_of_transaction ?
-			I2CAUX_TRANSACTION_ACTION_I2C_WRITE_MOT :
-			I2CAUX_TRANSACTION_ACTION_I2C_WRITE;
-	else {
-		i2caux_request->status =
-			I2CAUX_TRANSACTION_STATUS_FAILED_INVALID_OPERATION;
-		/* [anaumov] in DAL2, there was no "return false" */
-		return false;
-	}
-
-	request.address = (uint8_t)i2caux_request->payload.address;
-	request.length = i2caux_request->payload.length;
-	request.data = i2caux_request->payload.data;
-
-	/* obtain timeout value before submitting request */
-
-	transaction_timeout = hw_engine->funcs->get_transaction_timeout(
-		hw_engine, i2caux_request->payload.length + 1);
-
-	hw_engine->base.funcs->submit_channel_request(
-		&hw_engine->base, &request);
-	/* EVENT_LOG_AUX_REQ(engine->ddc->pin_data->en, EVENT_LOG_AUX_ORIGIN_I2C, */
-	/* request.action, request.address, request.length, request.data); */
-
-	if ((request.status == I2C_CHANNEL_OPERATION_FAILED) ||
-		(request.status == I2C_CHANNEL_OPERATION_ENGINE_BUSY)) {
-		i2caux_request->status =
-			I2CAUX_TRANSACTION_STATUS_FAILED_CHANNEL_BUSY;
-		return false;
-	}
-
-	/* wait until the transaction completes */
-
-	operation_result = hw_engine->funcs->wait_on_operation_result(
-		hw_engine,
-		transaction_timeout,
-		I2C_CHANNEL_OPERATION_ENGINE_BUSY);
-
-	/* update transaction status */
-
-	switch (operation_result) {
-	case I2C_CHANNEL_OPERATION_SUCCEEDED:
-		i2caux_request->status =
-			I2CAUX_TRANSACTION_STATUS_SUCCEEDED;
-		result = true;
-	break;
-	case I2C_CHANNEL_OPERATION_NO_RESPONSE:
-		i2caux_request->status =
-			I2CAUX_TRANSACTION_STATUS_FAILED_NACK;
-	break;
-	case I2C_CHANNEL_OPERATION_TIMEOUT:
-		i2caux_request->status =
-			I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-	break;
-	case I2C_CHANNEL_OPERATION_FAILED:
-		i2caux_request->status =
-			I2CAUX_TRANSACTION_STATUS_FAILED_INCOMPLETE;
-	break;
-	default:
-		i2caux_request->status =
-			I2CAUX_TRANSACTION_STATUS_FAILED_OPERATION;
-	}
-
-	if (result && (i2caux_request->operation == I2CAUX_TRANSACTION_READ)) {
-		struct i2c_reply_transaction_data reply;
-
-		reply.data = i2caux_request->payload.data;
-		reply.length = i2caux_request->payload.length;
-
-		hw_engine->base.funcs->
-			process_channel_reply(&hw_engine->base, &reply);
-		/* EVENT_LOG_AUX_REP(engine->ddc->pin_data->en, EVENT_LOG_AUX_ORIGIN_I2C, */
-		/* AUX_TRANSACTION_REPLY_I2C_ACK, reply.length, reply.data); */
-	}
-
-	return result;
-}
-
-bool dal_i2c_hw_engine_acquire_engine(
-	struct i2c_engine *engine,
-	struct ddc *ddc)
-{
-	enum gpio_result result;
-	uint32_t current_speed;
-
-	result = dal_ddc_open(ddc, GPIO_MODE_HARDWARE,
-		GPIO_DDC_CONFIG_TYPE_MODE_I2C);
-
-	if (result != GPIO_RESULT_OK)
-		return false;
-
-	engine->base.ddc = ddc;
-
-	current_speed = engine->funcs->get_speed(engine);
-
-	if (current_speed)
-		FROM_I2C_ENGINE(engine)->original_speed = current_speed;
-
-	return true;
-}
-/*
- * @brief
- * Polls the current engine status in a loop until it no longer matches
- * 'expected_result' (typically the busy status), or the timeout expires.
- * The timeout is given in microseconds
- * and the status is sampled once per microsecond.
- */
-enum i2c_channel_operation_result dal_i2c_hw_engine_wait_on_operation_result(
-	struct i2c_hw_engine *engine,
-	uint32_t timeout,
-	enum i2c_channel_operation_result expected_result)
-{
-	enum i2c_channel_operation_result result;
-	uint32_t i = 0;
-
-	if (!timeout)
-		return I2C_CHANNEL_OPERATION_SUCCEEDED;
-
-	do {
-		result = engine->base.funcs->get_channel_status(
-			&engine->base, NULL);
-
-		if (result != expected_result)
-			break;
-
-		udelay(1);
-
-		++i;
-	} while (i < timeout);
-
-	return result;
-}
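
A typical call site (a sketch; 'hw_engine' and 'transaction_timeout' are illustrative) waits out the busy status before classifying the result:

	enum i2c_channel_operation_result r =
		dal_i2c_hw_engine_wait_on_operation_result(
			hw_engine,
			transaction_timeout,	/* microseconds */
			I2C_CHANNEL_OPERATION_ENGINE_BUSY);

	if (r != I2C_CHANNEL_OPERATION_SUCCEEDED) {
		/* map r onto an i2caux_transaction_status, as the
		 * submit_request implementations above do */
	}
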
-
-void dal_i2c_hw_engine_construct(
-	struct i2c_hw_engine *engine,
-	struct dc_context *ctx)
-{
-	dal_i2c_engine_construct(&engine->base, ctx);
-	engine->original_speed = I2CAUX_DEFAULT_I2C_HW_SPEED;
-	engine->default_speed = I2CAUX_DEFAULT_I2C_HW_SPEED;
-}
-
-void dal_i2c_hw_engine_destruct(
-	struct i2c_hw_engine *engine)
-{
-	dal_i2c_engine_destruct(&engine->base);
-}
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/i2c_hw_engine.h b/drivers/gpu/drm/amd/display/dc/i2caux/i2c_hw_engine.h
deleted file mode 100644
index 8936a994804a..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/i2c_hw_engine.h
+++ /dev/null
@@ -1,80 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#ifndef __DAL_I2C_HW_ENGINE_H__
-#define __DAL_I2C_HW_ENGINE_H__
-
-enum {
-	TRANSACTION_TIMEOUT_IN_I2C_CLOCKS = 32
-};
-
-struct i2c_hw_engine;
-
-struct i2c_hw_engine_funcs {
-	uint32_t (*get_hw_buffer_available_size)(
-		const struct i2c_hw_engine *engine);
-	enum i2c_channel_operation_result (*wait_on_operation_result)(
-		struct i2c_hw_engine *engine,
-		uint32_t timeout,
-		enum i2c_channel_operation_result expected_result);
-	uint32_t (*get_transaction_timeout)(
-		const struct i2c_hw_engine *engine,
-		uint32_t length);
-};
-
-struct i2c_hw_engine {
-	struct i2c_engine base;
-	const struct i2c_hw_engine_funcs *funcs;
-
-	/* Values below are in kilohertz */
-	uint32_t original_speed;
-	uint32_t default_speed;
-};
-
-void dal_i2c_hw_engine_construct(
-	struct i2c_hw_engine *engine,
-	struct dc_context *ctx);
-
-void dal_i2c_hw_engine_destruct(
-	struct i2c_hw_engine *engine);
-
-enum i2c_channel_operation_result dal_i2c_hw_engine_wait_on_operation_result(
-	struct i2c_hw_engine *engine,
-	uint32_t timeout,
-	enum i2c_channel_operation_result expected_result);
-
-bool dal_i2c_hw_engine_acquire_engine(
-	struct i2c_engine *engine,
-	struct ddc *ddc);
-
-bool dal_i2c_hw_engine_submit_request(
-	struct engine *ptr,
-	struct i2caux_transaction_request *i2caux_request,
-	bool middle_of_transaction);
-
-enum i2caux_engine_type dal_i2c_hw_engine_get_engine_type(
-	const struct engine *engine);
-
-#endif
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/i2c_sw_engine.c b/drivers/gpu/drm/amd/display/dc/i2caux/i2c_sw_engine.c
deleted file mode 100644
index 8e19bb629394..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/i2c_sw_engine.c
+++ /dev/null
@@ -1,601 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-
-/*
- * Pre-requisites: headers required by header of this unit
- */
-#include "include/i2caux_interface.h"
-#include "engine.h"
-#include "i2c_engine.h"
-
-/*
- * Header of this unit
- */
-
-#include "i2c_sw_engine.h"
-
-/*
- * Post-requisites: headers required by this unit
- */
-
-/*
- * This unit
- */
-
-#define SCL false
-#define SDA true
-
-static inline bool read_bit_from_ddc(
-	struct ddc *ddc,
-	bool data_nor_clock)
-{
-	uint32_t value = 0;
-
-	if (data_nor_clock)
-		dal_gpio_get_value(ddc->pin_data, &value);
-	else
-		dal_gpio_get_value(ddc->pin_clock, &value);
-
-	return (value != 0);
-}
-
-static inline void write_bit_to_ddc(
-	struct ddc *ddc,
-	bool data_nor_clock,
-	bool bit)
-{
-	uint32_t value = bit ? 1 : 0;
-
-	if (data_nor_clock)
-		dal_gpio_set_value(ddc->pin_data, value);
-	else
-		dal_gpio_set_value(ddc->pin_clock, value);
-}
-
-static bool wait_for_scl_high(
-	struct dc_context *ctx,
-	struct ddc *ddc,
-	uint16_t clock_delay_div_4)
-{
-	uint32_t scl_retry = 0;
-	uint32_t scl_retry_max = I2C_SW_TIMEOUT_DELAY / clock_delay_div_4;
-
-	udelay(clock_delay_div_4);
-
-	/* Poll for up to 3 milliseconds (I2C_SW_TIMEOUT_DELAY) -
-	 * some displays need that long to wake up from a "low power" state.
-	 */
-
-	do {
-		if (read_bit_from_ddc(ddc, SCL))
-			return true;
-
-		udelay(clock_delay_div_4);
-
-		++scl_retry;
-	} while (scl_retry <= scl_retry_max);
-
-	return false;
-}
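
The retry bound gives slaves room for I2C clock stretching (holding SCL low). A worked example with assumed numbers:

	/* At 50 kHz (hypothetical): clock_delay = 1000 / 50 = 20 us,
	 * so clock_delay_div_4 = 5 us and
	 * scl_retry_max = I2C_SW_TIMEOUT_DELAY / 5 = 3000 / 5 = 600 polls,
	 * i.e. up to ~3 ms waiting for SCL to be released. */
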
-
-static bool start_sync(
-	struct dc_context *ctx,
-	struct ddc *ddc_handle,
-	uint16_t clock_delay_div_4)
-{
-	uint32_t retry = 0;
-
-	/* The I2C communications start signal is:
-	 * the SDA going low from high, while the SCL is high. */
-
-	write_bit_to_ddc(ddc_handle, SCL, true);
-
-	udelay(clock_delay_div_4);
-
-	do {
-		write_bit_to_ddc(ddc_handle, SDA, true);
-
-		if (!read_bit_from_ddc(ddc_handle, SDA)) {
-			++retry;
-			continue;
-		}
-
-		udelay(clock_delay_div_4);
-
-		write_bit_to_ddc(ddc_handle, SCL, true);
-
-		if (!wait_for_scl_high(ctx, ddc_handle, clock_delay_div_4))
-			break;
-
-		write_bit_to_ddc(ddc_handle, SDA, false);
-
-		udelay(clock_delay_div_4);
-
-		write_bit_to_ddc(ddc_handle, SCL, false);
-
-		udelay(clock_delay_div_4);
-
-		return true;
-	} while (retry <= I2C_SW_RETRIES);
-
-	return false;
-}
-
-static bool stop_sync(
-	struct dc_context *ctx,
-	struct ddc *ddc_handle,
-	uint16_t clock_delay_div_4)
-{
-	uint32_t retry = 0;
-
-	/* The I2C communications stop signal is:
-	 * the SDA going high from low, while the SCL is high. */
-
-	write_bit_to_ddc(ddc_handle, SCL, false);
-
-	udelay(clock_delay_div_4);
-
-	write_bit_to_ddc(ddc_handle, SDA, false);
-
-	udelay(clock_delay_div_4);
-
-	write_bit_to_ddc(ddc_handle, SCL, true);
-
-	if (!wait_for_scl_high(ctx, ddc_handle, clock_delay_div_4))
-		return false;
-
-	write_bit_to_ddc(ddc_handle, SDA, true);
-
-	do {
-		udelay(clock_delay_div_4);
-
-		if (read_bit_from_ddc(ddc_handle, SDA))
-			return true;
-
-		++retry;
-	} while (retry <= 2);
-
-	return false;
-}
-
-static bool write_byte(
-	struct dc_context *ctx,
-	struct ddc *ddc_handle,
-	uint16_t clock_delay_div_4,
-	uint8_t byte)
-{
-	int32_t shift = 7;
-	bool ack;
-
-	/* bits are transmitted serially, starting from MSB */
-
-	do {
-		udelay(clock_delay_div_4);
-
-		write_bit_to_ddc(ddc_handle, SDA, (byte >> shift) & 1);
-
-		udelay(clock_delay_div_4);
-
-		write_bit_to_ddc(ddc_handle, SCL, true);
-
-		if (!wait_for_scl_high(ctx, ddc_handle, clock_delay_div_4))
-			return false;
-
-		write_bit_to_ddc(ddc_handle, SCL, false);
-
-		--shift;
-	} while (shift >= 0);
-
-	/* The display sends ACK by preventing the SDA from going high
-	 * after the SCL pulse we use to send our last data bit.
-	 * If the SDA goes high after that bit, it's a NACK */
-
-	udelay(clock_delay_div_4);
-
-	write_bit_to_ddc(ddc_handle, SDA, true);
-
-	udelay(clock_delay_div_4);
-
-	write_bit_to_ddc(ddc_handle, SCL, true);
-
-	if (!wait_for_scl_high(ctx, ddc_handle, clock_delay_div_4))
-		return false;
-
-	/* read ACK bit */
-
-	ack = !read_bit_from_ddc(ddc_handle, SDA);
-
-	udelay(clock_delay_div_4 << 1);
-
-	write_bit_to_ddc(ddc_handle, SCL, false);
-
-	udelay(clock_delay_div_4 << 1);
-
-	return ack;
-}
-
-static bool read_byte(
-	struct dc_context *ctx,
-	struct ddc *ddc_handle,
-	uint16_t clock_delay_div_4,
-	uint8_t *byte,
-	bool more)
-{
-	int32_t shift = 7;
-
-	uint8_t data = 0;
-
-	/* The data bits are read from MSB to LSB;
-	 * bit is read while SCL is high */
-
-	do {
-		write_bit_to_ddc(ddc_handle, SCL, true);
-
-		if (!wait_for_scl_high(ctx, ddc_handle, clock_delay_div_4))
-			return false;
-
-		if (read_bit_from_ddc(ddc_handle, SDA))
-			data |= (1 << shift);
-
-		write_bit_to_ddc(ddc_handle, SCL, false);
-
-		udelay(clock_delay_div_4 << 1);
-
-		--shift;
-	} while (shift >= 0);
-
-	/* store the fully assembled byte */
-
-	*byte = data;
-
-	udelay(clock_delay_div_4);
-
-	/* send the acknowledge bit:
-	 * SDA low means ACK, SDA high means NACK */
-
-	write_bit_to_ddc(ddc_handle, SDA, !more);
-
-	udelay(clock_delay_div_4);
-
-	write_bit_to_ddc(ddc_handle, SCL, true);
-
-	if (!wait_for_scl_high(ctx, ddc_handle, clock_delay_div_4))
-		return false;
-
-	write_bit_to_ddc(ddc_handle, SCL, false);
-
-	udelay(clock_delay_div_4);
-
-	write_bit_to_ddc(ddc_handle, SDA, true);
-
-	udelay(clock_delay_div_4);
-
-	return true;
-}
-
-static bool i2c_write(
-	struct dc_context *ctx,
-	struct ddc *ddc_handle,
-	uint16_t clock_delay_div_4,
-	uint8_t address,
-	uint32_t length,
-	const uint8_t *data)
-{
-	uint32_t i = 0;
-
-	if (!write_byte(ctx, ddc_handle, clock_delay_div_4, address))
-		return false;
-
-	while (i < length) {
-		if (!write_byte(ctx, ddc_handle, clock_delay_div_4, data[i]))
-			return false;
-		++i;
-	}
-
-	return true;
-}
-
-static bool i2c_read(
-	struct dc_context *ctx,
-	struct ddc *ddc_handle,
-	uint16_t clock_delay_div_4,
-	uint8_t address,
-	uint32_t length,
-	uint8_t *data)
-{
-	uint32_t i = 0;
-
-	if (!write_byte(ctx, ddc_handle, clock_delay_div_4, address))
-		return false;
-
-	while (i < length) {
-		if (!read_byte(ctx, ddc_handle, clock_delay_div_4, data + i,
-			i < length - 1))
-			return false;
-		++i;
-	}
-
-	return true;
-}
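
Putting these helpers together (a sketch only: the functions are static to this file, and 'ctx', 'ddc', 'clock_delay_div_4' and the 0x50 address are illustrative), a complete read transaction is framed by start_sync()/stop_sync() around the addressed data phase:

	uint8_t data[16];
	bool ok = start_sync(ctx, ddc, clock_delay_div_4) &&
		  i2c_read(ctx, ddc, clock_delay_div_4,
			   (0x50 << 1) | 1,	/* address byte with read bit */
			   sizeof(data), data);

	if (!stop_sync(ctx, ddc, clock_delay_div_4))
		ok = false;
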
-
-/*
- * @brief
- * Cast 'struct i2c_engine *'
- * to 'struct i2c_sw_engine *'
- */
-#define FROM_I2C_ENGINE(ptr) \
-	container_of((ptr), struct i2c_sw_engine, base)
-
-/*
- * @brief
- * Cast 'struct engine *'
- * to 'struct i2c_sw_engine *'
- */
-#define FROM_ENGINE(ptr) \
-	FROM_I2C_ENGINE(container_of((ptr), struct i2c_engine, base))
-
-enum i2caux_engine_type dal_i2c_sw_engine_get_engine_type(
-	const struct engine *engine)
-{
-	return I2CAUX_ENGINE_TYPE_I2C_SW;
-}
-
-bool dal_i2c_sw_engine_submit_request(
-	struct engine *engine,
-	struct i2caux_transaction_request *i2caux_request,
-	bool middle_of_transaction)
-{
-	struct i2c_sw_engine *sw_engine = FROM_ENGINE(engine);
-
-	struct i2c_engine *base = &sw_engine->base;
-
-	struct i2c_request_transaction_data request;
-	bool operation_succeeded = false;
-
-	if (i2caux_request->operation == I2CAUX_TRANSACTION_READ)
-		request.action = middle_of_transaction ?
-			I2CAUX_TRANSACTION_ACTION_I2C_READ_MOT :
-			I2CAUX_TRANSACTION_ACTION_I2C_READ;
-	else if (i2caux_request->operation == I2CAUX_TRANSACTION_WRITE)
-		request.action = middle_of_transaction ?
-			I2CAUX_TRANSACTION_ACTION_I2C_WRITE_MOT :
-			I2CAUX_TRANSACTION_ACTION_I2C_WRITE;
-	else {
-		i2caux_request->status =
-			I2CAUX_TRANSACTION_STATUS_FAILED_INVALID_OPERATION;
-		/* in DAL2, there was no "return false" */
-		return false;
-	}
-
-	request.address = (uint8_t)i2caux_request->payload.address;
-	request.length = i2caux_request->payload.length;
-	request.data = i2caux_request->payload.data;
-
-	base->funcs->submit_channel_request(base, &request);
-
-	if ((request.status == I2C_CHANNEL_OPERATION_ENGINE_BUSY) ||
-		(request.status == I2C_CHANNEL_OPERATION_FAILED))
-		i2caux_request->status =
-			I2CAUX_TRANSACTION_STATUS_FAILED_CHANNEL_BUSY;
-	else {
-		enum i2c_channel_operation_result operation_result;
-
-		do {
-			operation_result =
-				base->funcs->get_channel_status(base, NULL);
-
-			switch (operation_result) {
-			case I2C_CHANNEL_OPERATION_SUCCEEDED:
-				i2caux_request->status =
-					I2CAUX_TRANSACTION_STATUS_SUCCEEDED;
-				operation_succeeded = true;
-			break;
-			case I2C_CHANNEL_OPERATION_NO_RESPONSE:
-				i2caux_request->status =
-					I2CAUX_TRANSACTION_STATUS_FAILED_NACK;
-			break;
-			case I2C_CHANNEL_OPERATION_TIMEOUT:
-				i2caux_request->status =
-				I2CAUX_TRANSACTION_STATUS_FAILED_TIMEOUT;
-			break;
-			case I2C_CHANNEL_OPERATION_FAILED:
-				i2caux_request->status =
-				I2CAUX_TRANSACTION_STATUS_FAILED_INCOMPLETE;
-			break;
-			default:
-				i2caux_request->status =
-				I2CAUX_TRANSACTION_STATUS_FAILED_OPERATION;
-			break;
-			}
-		} while (operation_result == I2C_CHANNEL_OPERATION_ENGINE_BUSY);
-	}
-
-	return operation_succeeded;
-}
-
-uint32_t dal_i2c_sw_engine_get_speed(
-	const struct i2c_engine *engine)
-{
-	return FROM_I2C_ENGINE(engine)->speed;
-}
-
-void dal_i2c_sw_engine_set_speed(
-	struct i2c_engine *engine,
-	uint32_t speed)
-{
-	struct i2c_sw_engine *sw_engine = FROM_I2C_ENGINE(engine);
-
-	ASSERT(speed);
-
-	sw_engine->speed = speed ? speed : I2CAUX_DEFAULT_I2C_SW_SPEED;
-
-	sw_engine->clock_delay = 1000 / sw_engine->speed;
-
-	if (sw_engine->clock_delay < 12)
-		sw_engine->clock_delay = 12;
-}
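
Worked numbers for the clamp (speeds are hypothetical):

	/* speed = 50 kHz  -> clock_delay = 1000 / 50  = 20 us per period
	 * speed = 100 kHz -> clock_delay = 1000 / 100 = 10, clamped to 12 us,
	 * so the effective maximum SW bit-bang speed is ~1000 / 12 = 83 kHz. */
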
-
-bool dal_i2caux_i2c_sw_engine_acquire_engine(
-	struct i2c_engine *engine,
-	struct ddc *ddc)
-{
-	enum gpio_result result;
-
-	result = dal_ddc_open(ddc, GPIO_MODE_FAST_OUTPUT,
-		GPIO_DDC_CONFIG_TYPE_MODE_I2C);
-
-	if (result != GPIO_RESULT_OK)
-		return false;
-
-	engine->base.ddc = ddc;
-
-	return true;
-}
-
-void dal_i2c_sw_engine_submit_channel_request(
-	struct i2c_engine *engine,
-	struct i2c_request_transaction_data *req)
-{
-	struct i2c_sw_engine *sw_engine = FROM_I2C_ENGINE(engine);
-
-	struct ddc *ddc = engine->base.ddc;
-	uint16_t clock_delay_div_4 = sw_engine->clock_delay >> 2;
-
-	/* send sync (start / repeated start) */
-
-	bool result = start_sync(engine->base.ctx, ddc, clock_delay_div_4);
-
-	/* process payload */
-
-	if (result) {
-		switch (req->action) {
-		case I2CAUX_TRANSACTION_ACTION_I2C_WRITE:
-		case I2CAUX_TRANSACTION_ACTION_I2C_WRITE_MOT:
-			result = i2c_write(engine->base.ctx, ddc, clock_delay_div_4,
-				req->address, req->length, req->data);
-		break;
-		case I2CAUX_TRANSACTION_ACTION_I2C_READ:
-		case I2CAUX_TRANSACTION_ACTION_I2C_READ_MOT:
-			result = i2c_read(engine->base.ctx, ddc, clock_delay_div_4,
-				req->address, req->length, req->data);
-		break;
-		default:
-			result = false;
-		break;
-		}
-	}
-
-	/* send stop if not 'mot' or operation failed */
-
-	if (!result ||
-		(req->action == I2CAUX_TRANSACTION_ACTION_I2C_WRITE) ||
-		(req->action == I2CAUX_TRANSACTION_ACTION_I2C_READ))
-		if (!stop_sync(engine->base.ctx, ddc, clock_delay_div_4))
-			result = false;
-
-	req->status = result ?
-		I2C_CHANNEL_OPERATION_SUCCEEDED :
-		I2C_CHANNEL_OPERATION_FAILED;
-}
-
-enum i2c_channel_operation_result dal_i2c_sw_engine_get_channel_status(
-	struct i2c_engine *engine,
-	uint8_t *returned_bytes)
-{
-	/* No arbitration with VBIOS is performed since DCE 6.0 */
-	return I2C_CHANNEL_OPERATION_SUCCEEDED;
-}
-
-void dal_i2c_sw_engine_destruct(
-	struct i2c_sw_engine *engine)
-{
-	dal_i2c_engine_destruct(&engine->base);
-}
-
-static void destroy(
-	struct i2c_engine **ptr)
-{
-	dal_i2c_sw_engine_destruct(FROM_I2C_ENGINE(*ptr));
-
-	kfree(*ptr);
-	*ptr = NULL;
-}
-
-static const struct i2c_engine_funcs i2c_engine_funcs = {
-	.acquire_engine = dal_i2caux_i2c_sw_engine_acquire_engine,
-	.destroy = destroy,
-	.get_speed = dal_i2c_sw_engine_get_speed,
-	.set_speed = dal_i2c_sw_engine_set_speed,
-	.setup_engine = dal_i2c_engine_setup_i2c_engine,
-	.submit_channel_request = dal_i2c_sw_engine_submit_channel_request,
-	.process_channel_reply = dal_i2c_engine_process_channel_reply,
-	.get_channel_status = dal_i2c_sw_engine_get_channel_status,
-};
-
-static void release_engine(
-	struct engine *engine)
-{
-
-}
-
-static const struct engine_funcs engine_funcs = {
-	.release_engine = release_engine,
-	.get_engine_type = dal_i2c_sw_engine_get_engine_type,
-	.acquire = dal_i2c_engine_acquire,
-	.submit_request = dal_i2c_sw_engine_submit_request,
-};
-
-void dal_i2c_sw_engine_construct(
-	struct i2c_sw_engine *engine,
-	const struct i2c_sw_engine_create_arg *arg)
-{
-	dal_i2c_engine_construct(&engine->base, arg->ctx);
-	dal_i2c_sw_engine_set_speed(&engine->base, arg->default_speed);
-	engine->base.funcs = &i2c_engine_funcs;
-	engine->base.base.funcs = &engine_funcs;
-}
-
-struct i2c_engine *dal_i2c_sw_engine_create(
-	const struct i2c_sw_engine_create_arg *arg)
-{
-	struct i2c_sw_engine *engine;
-
-	if (!arg) {
-		BREAK_TO_DEBUGGER();
-		return NULL;
-	}
-
-	engine = kzalloc(sizeof(struct i2c_sw_engine), GFP_KERNEL);
-
-	if (!engine) {
-		BREAK_TO_DEBUGGER();
-		return NULL;
-	}
-
-	dal_i2c_sw_engine_construct(engine, arg);
-	return &engine->base;
-}
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/i2c_sw_engine.h b/drivers/gpu/drm/amd/display/dc/i2caux/i2c_sw_engine.h
deleted file mode 100644
index 546f15b0d3f1..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/i2c_sw_engine.h
+++ /dev/null
@@ -1,81 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#ifndef __DAL_I2C_SW_ENGINE_H__
-#define __DAL_I2C_SW_ENGINE_H__
-
-enum {
-	I2C_SW_RETRIES = 10,
-	I2C_SW_SCL_READ_RETRIES = 128,
-	/* following value is in microseconds */
-	I2C_SW_TIMEOUT_DELAY = 3000
-};
-
-struct i2c_sw_engine;
-
-struct i2c_sw_engine {
-	struct i2c_engine base;
-	uint32_t clock_delay;
-	/* Values below are in KHz */
-	uint32_t speed;
-	uint32_t default_speed;
-};
-
-struct i2c_sw_engine_create_arg {
-	uint32_t default_speed;
-	struct dc_context *ctx;
-};
-
-void dal_i2c_sw_engine_construct(
-	struct i2c_sw_engine *engine,
-	const struct i2c_sw_engine_create_arg *arg);
-
-bool dal_i2caux_i2c_sw_engine_acquire_engine(
-	struct i2c_engine *engine,
-	struct ddc *ddc_handle);
-
-void dal_i2c_sw_engine_destruct(
-	struct i2c_sw_engine *engine);
-
-struct i2c_engine *dal_i2c_sw_engine_create(
-	const struct i2c_sw_engine_create_arg *arg);
-enum i2caux_engine_type dal_i2c_sw_engine_get_engine_type(
-	const struct engine *engine);
-bool dal_i2c_sw_engine_submit_request(
-	struct engine *ptr,
-	struct i2caux_transaction_request *i2caux_request,
-	bool middle_of_transaction);
-uint32_t dal_i2c_sw_engine_get_speed(
-	const struct i2c_engine *engine);
-void dal_i2c_sw_engine_set_speed(
-	struct i2c_engine *ptr,
-	uint32_t speed);
-void dal_i2c_sw_engine_submit_channel_request(
-	struct i2c_engine *ptr,
-	struct i2c_request_transaction_data *req);
-enum i2c_channel_operation_result dal_i2c_sw_engine_get_channel_status(
-	struct i2c_engine *engine,
-	uint8_t *returned_bytes);
-#endif
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/i2caux.c b/drivers/gpu/drm/amd/display/dc/i2caux/i2caux.c
deleted file mode 100644
index 1ad6e49102ff..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/i2caux.c
+++ /dev/null
@@ -1,491 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dm_services.h"
-
-/*
- * Pre-requisites: headers required by header of this unit
- */
-#include "include/i2caux_interface.h"
-#include "dc_bios_types.h"
-
-/*
- * Header of this unit
- */
-
-#include "i2caux.h"
-
-/*
- * Post-requisites: headers required by this unit
- */
-
-#include "engine.h"
-#include "i2c_engine.h"
-#include "aux_engine.h"
-
-/*
- * This unit
- */
-
-#include "dce80/i2caux_dce80.h"
-
-#include "dce100/i2caux_dce100.h"
-
-#include "dce110/i2caux_dce110.h"
-
-#include "dce112/i2caux_dce112.h"
-
-#include "dce120/i2caux_dce120.h"
-
-#if defined(CONFIG_DRM_AMD_DC_DCN1_0)
-#include "dcn10/i2caux_dcn10.h"
-#endif
-
-#include "diagnostics/i2caux_diag.h"
-
-/*
- * @brief
- * Plain API, available publicly
- */
-
-struct i2caux *dal_i2caux_create(
-	struct dc_context *ctx)
-{
-	if (IS_FPGA_MAXIMUS_DC(ctx->dce_environment)) {
-		return dal_i2caux_diag_fpga_create(ctx);
-	}
-
-	switch (ctx->dce_version) {
-	case DCE_VERSION_8_0:
-	case DCE_VERSION_8_1:
-	case DCE_VERSION_8_3:
-		return dal_i2caux_dce80_create(ctx);
-	case DCE_VERSION_11_2:
-	case DCE_VERSION_11_22:
-		return dal_i2caux_dce112_create(ctx);
-	case DCE_VERSION_11_0:
-		return dal_i2caux_dce110_create(ctx);
-	case DCE_VERSION_10_0:
-		return dal_i2caux_dce100_create(ctx);
-	case DCE_VERSION_12_0:
-	case DCE_VERSION_12_1:
-		return dal_i2caux_dce120_create(ctx);
-#if defined(CONFIG_DRM_AMD_DC_DCN1_0)
-	case DCN_VERSION_1_0:
-		return dal_i2caux_dcn10_create(ctx);
-#endif
-
-#if defined(CONFIG_DRM_AMD_DC_DCN1_01)
-	case DCN_VERSION_1_01:
-		return dal_i2caux_dcn10_create(ctx);
-#endif
-	default:
-		BREAK_TO_DEBUGGER();
-		return NULL;
-	}
-}
-
-bool dal_i2caux_submit_i2c_command(
-	struct i2caux *i2caux,
-	struct ddc *ddc,
-	struct i2c_command *cmd)
-{
-	struct i2c_engine *engine;
-	uint8_t index_of_payload = 0;
-	bool result;
-
-	if (!ddc) {
-		BREAK_TO_DEBUGGER();
-		return false;
-	}
-
-	if (!cmd) {
-		BREAK_TO_DEBUGGER();
-		return false;
-	}
-
-	/*
-	 * The default engine is SW; however, a feature flag in the adapter
-	 * service determines whether a SW i2c_engine is available at all.
-	 * If SW i2c is not available, we fall back to HW. Currently this
-	 * feature flag disables SW i2c engine creation on every DCE except
-	 * DCE 8.0.
-	 */
-	switch (cmd->engine) {
-	case I2C_COMMAND_ENGINE_DEFAULT:
-	case I2C_COMMAND_ENGINE_SW:
-		/* try to acquire SW engine first,
-		 * acquire HW engine if SW engine not available */
-		engine = i2caux->funcs->acquire_i2c_sw_engine(i2caux, ddc);
-
-		if (!engine)
-			engine = i2caux->funcs->acquire_i2c_hw_engine(
-				i2caux, ddc);
-	break;
-	case I2C_COMMAND_ENGINE_HW:
-	default:
-		/* try to acquire HW engine first,
-		 * acquire SW engine if HW engine not available */
-		engine = i2caux->funcs->acquire_i2c_hw_engine(i2caux, ddc);
-
-		if (!engine)
-			engine = i2caux->funcs->acquire_i2c_sw_engine(
-				i2caux, ddc);
-	}
-
-	if (!engine)
-		return false;
-
-	engine->funcs->set_speed(engine, cmd->speed);
-
-	result = true;
-
-	while (index_of_payload < cmd->number_of_payloads) {
-		bool mot = (index_of_payload != cmd->number_of_payloads - 1);
-
-		struct i2c_payload *payload = cmd->payloads + index_of_payload;
-
-		struct i2caux_transaction_request request = { 0 };
-
-		request.operation = payload->write ?
-			I2CAUX_TRANSACTION_WRITE :
-			I2CAUX_TRANSACTION_READ;
-
-		request.payload.address_space =
-			I2CAUX_TRANSACTION_ADDRESS_SPACE_I2C;
-		request.payload.address = (payload->address << 1) |
-			!payload->write;
-		request.payload.length = payload->length;
-		request.payload.data = payload->data;
-
-		if (!engine->base.funcs->submit_request(
-			&engine->base, &request, mot)) {
-			result = false;
-			break;
-		}
-
-		++index_of_payload;
-	}
-
-	i2caux->funcs->release_engine(i2caux, &engine->base);
-
-	return result;
-}
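
A caller-side sketch (hedged: the 0x50 address and buffer are hypothetical; the field names follow the i2c_command/i2c_payload members this function dereferences). Each payload carries a 7-bit address; this function adds the R/W bit and sets MOT on every payload except the last:

	uint8_t buf[128];
	struct i2c_payload payload = {
		.write = false,
		.address = 0x50,	/* 7-bit; becomes (0x50 << 1) | 1 */
		.length = sizeof(buf),
		.data = buf,
	};
	struct i2c_command cmd = {
		.payloads = &payload,
		.number_of_payloads = 1,
		.engine = I2C_COMMAND_ENGINE_DEFAULT,
		.speed = 100,	/* kHz */
	};

	dal_i2caux_submit_i2c_command(i2caux, ddc, &cmd);
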
-
-bool dal_i2caux_submit_aux_command(
-	struct i2caux *i2caux,
-	struct ddc *ddc,
-	struct aux_command *cmd)
-{
-	struct aux_engine *engine;
-	uint8_t index_of_payload = 0;
-	bool result;
-	bool mot;
-
-	if (!ddc) {
-		BREAK_TO_DEBUGGER();
-		return false;
-	}
-
-	if (!cmd) {
-		BREAK_TO_DEBUGGER();
-		return false;
-	}
-
-	engine = i2caux->funcs->acquire_aux_engine(i2caux, ddc);
-
-	if (!engine)
-		return false;
-
-	engine->delay = cmd->defer_delay;
-	engine->max_defer_write_retry = cmd->max_defer_write_retry;
-
-	result = true;
-
-	while (index_of_payload < cmd->number_of_payloads) {
-		struct aux_payload *payload = cmd->payloads + index_of_payload;
-		struct i2caux_transaction_request request = { 0 };
-
-		if (cmd->mot == I2C_MOT_UNDEF)
-			mot = (index_of_payload != cmd->number_of_payloads - 1);
-		else
-			mot = (cmd->mot == I2C_MOT_TRUE);
-
-		request.operation = payload->write ?
-			I2CAUX_TRANSACTION_WRITE :
-			I2CAUX_TRANSACTION_READ;
-
-		if (payload->i2c_over_aux) {
-			request.payload.address_space =
-				I2CAUX_TRANSACTION_ADDRESS_SPACE_I2C;
-
-			request.payload.address = (payload->address << 1) |
-				!payload->write;
-		} else {
-			request.payload.address_space =
-				I2CAUX_TRANSACTION_ADDRESS_SPACE_DPCD;
-
-			request.payload.address = payload->address;
-		}
-
-		request.payload.length = payload->length;
-		request.payload.data = payload->data;
-
-		if (!engine->base.funcs->submit_request(
-			&engine->base, &request, mot)) {
-			result = false;
-			break;
-		}
-
-		++index_of_payload;
-	}
-
-	i2caux->funcs->release_engine(i2caux, &engine->base);
-
-	return result;
-}
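
For comparison, a native DPCD access keeps its raw address, while i2c-over-aux re-encodes the address exactly as on the I2C path. A sketch (field names follow the aux_command/aux_payload members dereferenced above; the address and buffer are illustrative):

	uint8_t dpcd[16];
	struct aux_payload payload = {
		.i2c_over_aux = false,	/* DPCD address space */
		.write = false,
		.address = 0x00000,	/* DPCD receiver capability */
		.length = sizeof(dpcd),
		.data = dpcd,
	};
	struct aux_command cmd = {
		.payloads = &payload,
		.number_of_payloads = 1,
		.defer_delay = 0,
		.max_defer_write_retry = 0,
		.mot = I2C_MOT_UNDEF,
	};

	dal_i2caux_submit_aux_command(i2caux, ddc, &cmd);
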
-
-static bool get_hw_supported_ddc_line(
-	struct ddc *ddc,
-	enum gpio_ddc_line *line)
-{
-	enum gpio_ddc_line line_found;
-
-	*line = GPIO_DDC_LINE_UNKNOWN;
-
-	if (!ddc) {
-		BREAK_TO_DEBUGGER();
-		return false;
-	}
-
-	if (!ddc->hw_info.hw_supported)
-		return false;
-
-	line_found = dal_ddc_get_line(ddc);
-
-	if (line_found >= GPIO_DDC_LINE_COUNT)
-		return false;
-
-	*line = line_found;
-
-	return true;
-}
-
-void dal_i2caux_configure_aux(
-	struct i2caux *i2caux,
-	struct ddc *ddc,
-	union aux_config cfg)
-{
-	struct aux_engine *engine =
-		i2caux->funcs->acquire_aux_engine(i2caux, ddc);
-
-	if (!engine)
-		return;
-
-	engine->funcs->configure(engine, cfg);
-
-	i2caux->funcs->release_engine(i2caux, &engine->base);
-}
-
-void dal_i2caux_destroy(
-	struct i2caux **i2caux)
-{
-	if (!i2caux || !*i2caux) {
-		BREAK_TO_DEBUGGER();
-		return;
-	}
-
-	(*i2caux)->funcs->destroy(i2caux);
-
-	*i2caux = NULL;
-}
-
-/*
- * @brief
- * A utility function used by 'struct i2caux' and its descendants
- */
-
-uint32_t dal_i2caux_get_reference_clock(
-		struct dc_bios *bios)
-{
-	struct dc_firmware_info info = { { 0 } };
-
-	if (bios->funcs->get_firmware_info(bios, &info) != BP_RESULT_OK)
-		return 0;
-
-	return info.pll_info.crystal_frequency;
-}
-
-/*
- * @brief
- * i2caux
- */
-
-enum {
-	/* following are expressed in KHz */
-	DEFAULT_I2C_SW_SPEED = 50,
-	DEFAULT_I2C_HW_SPEED = 50,
-
-	DEFAULT_I2C_SW_SPEED_100KHZ = 100,
-	DEFAULT_I2C_HW_SPEED_100KHZ = 100,
-
-	/* This is the timeout as defined in DP 1.2a,
-	 * 2.3.4 "Detailed uPacket TX AUX CH State Description". */
-	AUX_TIMEOUT_PERIOD = 400,
-
-	/* Ideally, the SW timeout should be just above 550usec
-	 * which is programmed in HW.
-	 * But the SW timeout of 600usec is not reliable,
-	 * because on some systems, delay_in_microseconds()
-	 * returns faster than it should.
-	 * EPR #379763: by trial-and-error on different systems,
-	 * 700usec is the minimum reliable SW timeout for polling
-	 * the AUX_SW_STATUS.AUX_SW_DONE bit.
-	 * This timeout expires *only* under AUX Error or AUX Timeout
-	 * conditions - not during normal operation.
-	 * During normal operation, AUX_SW_STATUS.AUX_SW_DONE bit is set
-	 * at most within ~240usec. That means,
-	 * increasing this timeout will not affect normal operation,
-	 * and we'll timeout after
-	 * SW_AUX_TIMEOUT_PERIOD_MULTIPLIER * AUX_TIMEOUT_PERIOD = 1600usec.
-	 * This timeout is especially important for
-	 * resume from S3 and CTS. */
-	SW_AUX_TIMEOUT_PERIOD_MULTIPLIER = 4
-};
-
-struct i2c_engine *dal_i2caux_acquire_i2c_sw_engine(
-	struct i2caux *i2caux,
-	struct ddc *ddc)
-{
-	enum gpio_ddc_line line;
-	struct i2c_engine *engine = NULL;
-
-	if (get_hw_supported_ddc_line(ddc, &line))
-		engine = i2caux->i2c_sw_engines[line];
-
-	if (!engine)
-		engine = i2caux->i2c_generic_sw_engine;
-
-	if (!engine)
-		return NULL;
-
-	if (!engine->base.funcs->acquire(&engine->base, ddc))
-		return NULL;
-
-	return engine;
-}
-
-struct aux_engine *dal_i2caux_acquire_aux_engine(
-	struct i2caux *i2caux,
-	struct ddc *ddc)
-{
-	enum gpio_ddc_line line;
-	struct aux_engine *engine;
-
-	if (!get_hw_supported_ddc_line(ddc, &line))
-		return NULL;
-
-	engine = i2caux->aux_engines[line];
-
-	if (!engine)
-		return NULL;
-
-	if (!engine->base.funcs->acquire(&engine->base, ddc))
-		return NULL;
-
-	return engine;
-}
-
-void dal_i2caux_release_engine(
-	struct i2caux *i2caux,
-	struct engine *engine)
-{
-	engine->funcs->release_engine(engine);
-
-	dal_ddc_close(engine->ddc);
-
-	engine->ddc = NULL;
-}
-
-void dal_i2caux_construct(
-	struct i2caux *i2caux,
-	struct dc_context *ctx)
-{
-	uint32_t i = 0;
-
-	i2caux->ctx = ctx;
-	do {
-		i2caux->i2c_sw_engines[i] = NULL;
-		i2caux->i2c_hw_engines[i] = NULL;
-		i2caux->aux_engines[i] = NULL;
-
-		++i;
-	} while (i < GPIO_DDC_LINE_COUNT);
-
-	i2caux->i2c_generic_sw_engine = NULL;
-	i2caux->i2c_generic_hw_engine = NULL;
-
-	i2caux->aux_timeout_period =
-		SW_AUX_TIMEOUT_PERIOD_MULTIPLIER * AUX_TIMEOUT_PERIOD;
-
-	if (ctx->dce_version >= DCE_VERSION_11_2) {
-		i2caux->default_i2c_hw_speed = DEFAULT_I2C_HW_SPEED_100KHZ;
-		i2caux->default_i2c_sw_speed = DEFAULT_I2C_SW_SPEED_100KHZ;
-	} else {
-		i2caux->default_i2c_hw_speed = DEFAULT_I2C_HW_SPEED;
-		i2caux->default_i2c_sw_speed = DEFAULT_I2C_SW_SPEED;
-	}
-}
-
-void dal_i2caux_destruct(
-	struct i2caux *i2caux)
-{
-	uint32_t i = 0;
-
-	if (i2caux->i2c_generic_hw_engine)
-		i2caux->i2c_generic_hw_engine->funcs->destroy(
-			&i2caux->i2c_generic_hw_engine);
-
-	if (i2caux->i2c_generic_sw_engine)
-		i2caux->i2c_generic_sw_engine->funcs->destroy(
-			&i2caux->i2c_generic_sw_engine);
-
-	do {
-		if (i2caux->aux_engines[i])
-			i2caux->aux_engines[i]->funcs->destroy(
-				&i2caux->aux_engines[i]);
-
-		if (i2caux->i2c_hw_engines[i])
-			i2caux->i2c_hw_engines[i]->funcs->destroy(
-				&i2caux->i2c_hw_engines[i]);
-
-		if (i2caux->i2c_sw_engines[i])
-			i2caux->i2c_sw_engines[i]->funcs->destroy(
-				&i2caux->i2c_sw_engines[i]);
-
-		++i;
-	} while (i < GPIO_DDC_LINE_COUNT);
-}
-
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/i2caux.h b/drivers/gpu/drm/amd/display/dc/i2caux/i2caux.h
deleted file mode 100644
index 64f51bb06915..000000000000
--- a/drivers/gpu/drm/amd/display/dc/i2caux/i2caux.h
+++ /dev/null
@@ -1,122 +0,0 @@
-/*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#ifndef __DAL_I2C_AUX_H__
-#define __DAL_I2C_AUX_H__
-
-uint32_t dal_i2caux_get_reference_clock(
-	struct dc_bios *bios);
-
-struct i2caux;
-
-struct engine;
-
-struct i2caux_funcs {
-	void (*destroy)(struct i2caux **ptr);
-	struct i2c_engine * (*acquire_i2c_sw_engine)(
-		struct i2caux *i2caux,
-		struct ddc *ddc);
-	struct i2c_engine * (*acquire_i2c_hw_engine)(
-		struct i2caux *i2caux,
-		struct ddc *ddc);
-	struct aux_engine * (*acquire_aux_engine)(
-		struct i2caux *i2caux,
-		struct ddc *ddc);
-	void (*release_engine)(
-		struct i2caux *i2caux,
-		struct engine *engine);
-};
-
-struct i2c_engine;
-struct aux_engine;
-
-struct i2caux {
-	struct dc_context *ctx;
-	const struct i2caux_funcs *funcs;
-	/* On ASIC we have certain amount of lines with HW DDC engine
-	 * (4, 6, or maybe more in the future).
-	 * For every such line, we create separate HW DDC engine
-	 * (since we have these engines in HW) and separate SW DDC engine
-	 * (to allow concurrent use of few lines).
-	 * In similar way we have AUX engines. */
-
-	/* I2C SW engines, per DDC line.
-	 * Only lines with HW DDC support will be initialized */
-	struct i2c_engine *i2c_sw_engines[GPIO_DDC_LINE_COUNT];
-
-	/* I2C HW engines, per DDC line.
-	 * Only lines with HW DDC support will be initialized */
-	struct i2c_engine *i2c_hw_engines[GPIO_DDC_LINE_COUNT];
-
-	/* AUX engines, per DDC line.
-	 * Only lines with HW AUX support will be initialized */
-	struct aux_engine *aux_engines[GPIO_DDC_LINE_COUNT];
-
-	/* For all other lines, we can use
-	 * single instance of generic I2C HW engine
-	 * (since in HW, there is single instance of it)
-	 * or single instance of generic I2C SW engine.
-	 * AUX is not supported for other lines. */
-
-	/* General-purpose I2C SW engine.
-	 * Can be assigned dynamically to any line per transaction */
-	struct i2c_engine *i2c_generic_sw_engine;
-
-	/* General-purpose I2C generic HW engine.
-	 * Can be assigned dynamically to almost any line per transaction */
-	struct i2c_engine *i2c_generic_hw_engine;
-
-	/* [anaumov] in DAL2, there is a Mutex */
-
-	uint32_t aux_timeout_period;
-
-	/* expressed in KHz */
-	uint32_t default_i2c_sw_speed;
-	uint32_t default_i2c_hw_speed;
-};
-
-void dal_i2caux_construct(
-	struct i2caux *i2caux,
-	struct dc_context *ctx);
-
-void dal_i2caux_release_engine(
-	struct i2caux *i2caux,
-	struct engine *engine);
-
-void dal_i2caux_destruct(
-	struct i2caux *i2caux);
-
-void dal_i2caux_destroy(
-	struct i2caux **ptr);
-
-struct i2c_engine *dal_i2caux_acquire_i2c_sw_engine(
-	struct i2caux *i2caux,
-	struct ddc *ddc);
-
-struct aux_engine *dal_i2caux_acquire_aux_engine(
-	struct i2caux *i2caux,
-	struct ddc *ddc);
-
-#endif
diff --git a/drivers/gpu/drm/amd/display/dc/inc/clock_source.h b/drivers/gpu/drm/amd/display/dc/inc/clock_source.h
index 47ef90495376..fe6301cb8681 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/clock_source.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/clock_source.h
@@ -78,7 +78,7 @@ struct csdp_ref_clk_ds_params {
 };
 
 struct pixel_clk_params {
-	uint32_t requested_pix_clk; /* in KHz */
+	uint32_t requested_pix_clk_100hz;
 /*> Requested Pixel Clock
  * (based on Video Timing standard used for requested mode)*/
 	uint32_t requested_sym_clk; /* in KHz */
@@ -104,9 +104,9 @@ struct pixel_clk_params {
  *  with actually calculated Clock and reference Crystal frequency
  */
 struct pll_settings {
-	uint32_t actual_pix_clk;
-	uint32_t adjusted_pix_clk;
-	uint32_t calculated_pix_clk;
+	uint32_t actual_pix_clk_100hz;
+	uint32_t adjusted_pix_clk_100hz;
+	uint32_t calculated_pix_clk_100hz;
 	uint32_t vco_freq;
 	uint32_t reference_freq;
 	uint32_t reference_divider;
@@ -166,6 +166,10 @@ struct clock_source_funcs {
 			struct clock_source *,
 			struct pixel_clk_params *,
 			struct pll_settings *);
+	bool (*get_pixel_clk_frequency_100hz)(
+			struct clock_source *clock_source,
+			unsigned int inst,
+			unsigned int *pixel_clk_khz);
 };
 
 struct clock_source {
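
Editor's note: the pixel-clock fields in this header move from kHz to 100 Hz
units so that fractional timings (e.g. 59.94 Hz modes) keep sub-kHz precision.
A minimal standalone sketch of the conversion involved, with hypothetical
values:

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		/* 148.5 MHz pixel clock, exactly representable in 100 Hz units */
		uint32_t pix_clk_100hz = 1485000;
		uint32_t pix_clk_khz = pix_clk_100hz / 10; /* lossy: drops sub-kHz */

		printf("%u x 100 Hz = %u kHz\n", pix_clk_100hz, pix_clk_khz);
		return 0;
	}
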
diff --git a/drivers/gpu/drm/amd/display/dc/inc/core_status.h b/drivers/gpu/drm/amd/display/dc/inc/core_status.h
index 94fc31080fda..2e61a22ef4b2 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/core_status.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/core_status.h
@@ -30,7 +30,7 @@ enum dc_status {
 	DC_OK = 1,
 
 	DC_NO_CONTROLLER_RESOURCE = 2,
-	DC_NO_STREAM_ENG_RESOURCE = 3,
+	DC_NO_STREAM_ENC_RESOURCE = 3,
 	DC_NO_CLOCK_SOURCE_RESOURCE = 4,
 	DC_FAIL_CONTROLLER_VALIDATE = 5,
 	DC_FAIL_ENC_VALIDATE = 6,
diff --git a/drivers/gpu/drm/amd/display/dc/inc/core_types.h b/drivers/gpu/drm/amd/display/dc/inc/core_types.h
index b168a5e9dd9d..986ed1728644 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/core_types.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/core_types.h
@@ -146,7 +146,7 @@ struct resource_pool {
 	struct mpc *mpc;
 	struct pp_smu_funcs_rv *pp_smu;
 	struct pp_smu_display_requirement_rv pp_smu_req;
-	struct aux_engine *engines[MAX_PIPES];
+	struct dce_aux *engines[MAX_PIPES];
 	struct dce_i2c_hw *hw_i2cs[MAX_PIPES];
 	struct dce_i2c_sw *sw_i2cs[MAX_PIPES];
 	bool i2c_hw_buffer_in_use;
@@ -180,13 +180,8 @@ struct resource_pool {
 	const struct resource_caps *res_cap;
 };
 
-struct dcn_fe_clocks {
-	int dppclk_khz;
-};
-
 struct dcn_fe_bandwidth {
-	struct dcn_fe_clocks calc;
-	struct dcn_fe_clocks cur;
+	int dppclk_khz;
 };
 
 struct stream_resource {
diff --git a/drivers/gpu/drm/amd/display/dc/inc/dc_link_ddc.h b/drivers/gpu/drm/amd/display/dc/inc/dc_link_ddc.h
index 538b83303b86..16fd4dc6c4dd 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/dc_link_ddc.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/dc_link_ddc.h
@@ -64,13 +64,6 @@ void dal_ddc_i2c_payloads_add(
 		uint8_t *data,
 		bool write);
 
-void dal_ddc_aux_payloads_add(
-		struct aux_payloads *payloads,
-		uint32_t address,
-		uint32_t len,
-		uint8_t *data,
-		bool write);
-
 struct ddc_service_init_data {
 	struct graphics_object_id id;
 	struct dc_context *ctx;
@@ -103,12 +96,10 @@ bool dal_ddc_service_query_ddc_data(
 		uint32_t read_size);
 
 int dc_link_aux_transfer(struct ddc_service *ddc,
-			     unsigned int address,
-			     uint8_t *reply,
-			     void *buffer,
-			     unsigned int size,
-			     enum aux_transaction_type type,
-			     enum i2caux_transaction_action action);
+		struct aux_payload *payload);
+
+bool dc_link_aux_transfer_with_retries(struct ddc_service *ddc,
+		struct aux_payload *payload);
 
 void dal_ddc_service_write_scdc_data(
 		struct ddc_service *ddc_service,
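
Editor's note: dc_link_aux_transfer() now takes a single struct aux_payload in
place of six scalar arguments (the payload fields appear in the
i2caux_interface.h hunk further below). A hedged usage sketch of the retrying
variant; the wrapper function and the DPCD address are hypothetical:

	static bool read_one_dpcd_byte(struct ddc_service *ddc, uint32_t addr,
				       uint8_t *out)
	{
		struct aux_payload payload = {
			.write = false,    /* read transaction */
			.mot = false,      /* no middle-of-transaction chaining */
			.address = addr,   /* hypothetical DPCD address */
			.length = 1,
			.data = out,
			.reply = NULL,     /* reply type not inspected here */
			.defer_delay = 0,  /* 0 means "use default value" */
		};

		return dc_link_aux_transfer_with_retries(ddc, &payload);
	}
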
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/abm.h b/drivers/gpu/drm/amd/display/dc/inc/hw/abm.h
index abc961c0906e..86dc39a02408 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/abm.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/abm.h
@@ -46,6 +46,7 @@ struct abm_funcs {
 	void (*abm_init)(struct abm *abm);
 	bool (*set_abm_level)(struct abm *abm, unsigned int abm_level);
 	bool (*set_abm_immediate_disable)(struct abm *abm);
+	bool (*set_pipe)(struct abm *abm, unsigned int controller_id);
 	bool (*init_backlight)(struct abm *abm);
 
 	/* backlight_pwm_u16_16 is unsigned 32 bit,
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/dchubbub.h b/drivers/gpu/drm/amd/display/dc/inc/hw/dchubbub.h
index 02f757dd70d4..9d2d8e51306c 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/dchubbub.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/dchubbub.h
@@ -39,6 +39,18 @@ enum segment_order {
 	segment_order__non_contiguous,
 };
 
+struct dcn_hubbub_wm_set {
+	uint32_t wm_set;
+	uint32_t data_urgent;
+	uint32_t pte_meta_urgent;
+	uint32_t sr_enter;
+	uint32_t sr_exit;
+	uint32_t dram_clk_chanage;
+};
+
+struct dcn_hubbub_wm {
+	struct dcn_hubbub_wm_set sets[4];
+};
 
 struct hubbub_funcs {
 	void (*update_dchub)(
@@ -58,7 +70,14 @@ struct hubbub_funcs {
 	bool (*dcc_support_pixel_format)(
 			enum surface_pixel_format format,
 			unsigned int *bytes_per_element);
+
+	void (*wm_read_state)(struct hubbub *hubbub,
+			struct dcn_hubbub_wm *wm);
 };
 
+struct hubbub {
+	const struct hubbub_funcs *funcs;
+	struct dc_context *ctx;
+};
 
 #endif
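
Editor's note: the new dcn_hubbub_wm structures let a caller snapshot all four
HW watermark sets through the wm_read_state hook. A minimal, hedged example of
dumping the result (the wrapper itself is illustrative):

	static void dump_hubbub_watermarks(struct hubbub *hubbub)
	{
		struct dcn_hubbub_wm wm = { 0 };
		int i;

		hubbub->funcs->wm_read_state(hubbub, &wm);

		for (i = 0; i < 4; i++)
			pr_info("set %u: urgent=%u sr_enter=%u sr_exit=%u\n",
				wm.sets[i].wm_set, wm.sets[i].data_urgent,
				wm.sets[i].sr_enter, wm.sets[i].sr_exit);
	}
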
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/dmcu.h b/drivers/gpu/drm/amd/display/dc/inc/hw/dmcu.h
index cb85eaa9857f..cbaa43853611 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/dmcu.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/dmcu.h
@@ -27,16 +27,22 @@
 
 #include "dm_services_types.h"
 
+/* If the HW itself has ever powered down, this will be 0.
+ * fwDmcuInit will set it to 1.
+ * The driver will only call MCP init if the current state is 1,
+ * and the MCP command will transition this to 2.
+ */
 enum dmcu_state {
-	DMCU_NOT_INITIALIZED = 0,
-	DMCU_RUNNING = 1
+	DMCU_UNLOADED = 0,
+	DMCU_LOADED_UNINITIALIZED = 1,
+	DMCU_RUNNING = 2,
 };
 
 struct dmcu_version {
-	unsigned int date;
-	unsigned int month;
-	unsigned int year;
 	unsigned int interface_version;
+	unsigned int abm_version;
+	unsigned int psr_version;
+	unsigned int build_version;
 };
 
 struct dmcu {
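
Editor's note: the widened enum distinguishes "firmware not loaded" from
"loaded but uninitialized", per the comment above. A hedged sketch of the
driver-side gate this implies; both helper calls are hypothetical:

	static bool dmcu_try_mcp_init(struct dmcu *dmcu)
	{
		/* hypothetical helper reading the state byte back from IRAM */
		enum dmcu_state state = read_dmcu_state(dmcu);

		if (state != DMCU_LOADED_UNINITIALIZED)
			return false; /* powered down, or init already done */

		send_mcp_init(dmcu); /* hypothetical MCP command */
		return read_dmcu_state(dmcu) == DMCU_RUNNING;
	}
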
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h b/drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h
index e894e649ce5a..fb7967b39edb 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h
@@ -39,6 +39,11 @@ struct dpp {
 
 };
 
+struct dpp_input_csc_matrix {
+	enum dc_color_space color_space;
+	uint16_t regval[12];
+};
+
 struct dpp_grph_csc_adjustment {
 	struct fixed31_32 temperature_matrix[CSC_TEMPERATURE_MATRIX_SIZE];
 	enum graphics_gamut_adjust_type gamut_adjust_type;
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h b/drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h
index 04c6989aac58..1cd07e94ee63 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h
@@ -78,7 +78,8 @@ struct hubp_funcs {
 	bool (*hubp_program_surface_flip_and_addr)(
 		struct hubp *hubp,
 		const struct dc_plane_address *address,
-		bool flip_immediate);
+		bool flip_immediate,
+		uint8_t vmid);
 
 	void (*hubp_program_pte_vm)(
 		struct hubp *hubp,
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h b/drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h
index c20fdcaac53b..c9d3e37e9531 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/link_encoder.h
@@ -153,6 +153,7 @@ struct link_encoder_funcs {
 	void (*enable_hpd)(struct link_encoder *enc);
 	void (*disable_hpd)(struct link_encoder *enc);
 	bool (*is_dig_enabled)(struct link_encoder *enc);
+	unsigned int (*get_dig_frontend)(struct link_encoder *enc);
 	void (*destroy)(struct link_encoder **enc);
 };
 
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/mem_input.h b/drivers/gpu/drm/amd/display/dc/inc/hw/mem_input.h
index 06df02ddff6a..da89c2edb07c 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/mem_input.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/mem_input.h
@@ -31,7 +31,7 @@
 #include "dml/display_mode_structs.h"
 
 struct dchub_init_data;
-struct cstate_pstate_watermarks_st1 {
+struct cstate_pstate_watermarks_st {
 	uint32_t cstate_exit_ns;
 	uint32_t cstate_enter_plus_exit_ns;
 	uint32_t pstate_change_ns;
@@ -40,7 +40,7 @@ struct cstate_pstate_watermarks_st1 {
 struct dcn_watermarks {
 	uint32_t pte_meta_urgent_ns;
 	uint32_t urgent_ns;
-	struct cstate_pstate_watermarks_st1 cstate_pstate;
+	struct cstate_pstate_watermarks_st cstate_pstate;
 };
 
 struct dcn_watermark_set {
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/stream_encoder.h b/drivers/gpu/drm/amd/display/dc/inc/hw/stream_encoder.h
index 53a9b64df11a..4051493557bc 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/stream_encoder.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/stream_encoder.h
@@ -161,6 +161,10 @@ struct stream_encoder_funcs {
 	void (*set_avmute)(
 		struct stream_encoder *enc, bool enable);
 
+	void (*dig_connect_to_otg)(
+		struct stream_encoder *enc,
+		int tg_inst);
+
 };
 
 #endif /* STREAM_ENCODER_H_ */
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h b/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h
index af700c7dac50..c25f7df7b5e3 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h
@@ -134,15 +134,24 @@ struct dc_crtc_timing;
 
 struct drr_params;
 
+
 struct timing_generator_funcs {
 	bool (*validate_timing)(struct timing_generator *tg,
 							const struct dc_crtc_timing *timing);
 	void (*program_timing)(struct timing_generator *tg,
 							const struct dc_crtc_timing *timing,
 							bool use_vbios);
-	void (*program_vline_interrupt)(struct timing_generator *optc,
-			const struct dc_crtc_timing *dc_crtc_timing,
-			unsigned long long vsync_delta);
+	void (*setup_vertical_interrupt0)(
+			struct timing_generator *optc,
+			uint32_t start_line,
+			uint32_t end_line);
+	void (*setup_vertical_interrupt1)(
+			struct timing_generator *optc,
+			uint32_t start_line);
+	void (*setup_vertical_interrupt2)(
+			struct timing_generator *optc,
+			uint32_t start_line);
+
 	bool (*enable_crtc)(struct timing_generator *tg);
 	bool (*disable_crtc)(struct timing_generator *tg);
 	bool (*is_counter_moving)(struct timing_generator *tg);
@@ -159,6 +168,8 @@ struct timing_generator_funcs {
 	bool (*get_otg_active_size)(struct timing_generator *optc,
 			uint32_t *otg_active_width,
 			uint32_t *otg_active_height);
+	bool (*is_matching_timing)(struct timing_generator *tg,
+			const struct dc_crtc_timing *otg_timing);
 	void (*set_early_control)(struct timing_generator *tg,
 							   uint32_t early_cntl);
 	void (*wait_for_state)(struct timing_generator *tg,
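
Editor's note: the single program_vline_interrupt hook is split into three
setup_vertical_interrupt* hooks that take scanline positions directly, so
callers no longer pass a vsync delta; interrupts 1 and 2 take only a start
line. A hedged sketch of arming VLINE0 over a window (line numbers are
hypothetical):

	static void arm_vline0_window(struct timing_generator *optc,
				      uint32_t start_line, uint32_t end_line)
	{
		/* asserts while the vertical position is inside the window */
		optc->funcs->setup_vertical_interrupt0(optc, start_line,
						       end_line);
	}
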
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/engine_base.c b/drivers/gpu/drm/amd/display/dc/inc/hw/vmid.h
index 5d155d36d353..037beb0a2a27 100644
--- a/drivers/gpu/drm/amd/display/dc/i2caux/engine_base.c
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/vmid.h
@@ -1,5 +1,5 @@
 /*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
+ * Copyright 2018 Advanced Micro Devices, Inc.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -23,30 +23,27 @@
  *
  */
 
-#include "dm_services.h"
+#ifndef DAL_DC_INC_HW_VMID_H_
+#define DAL_DC_INC_HW_VMID_H_
 
-/*
- * Pre-requisites: headers required by header of this unit
- */
-#include "include/i2caux_interface.h"
-
-/*
- * Header of this unit
- */
-
-#include "engine.h"
+#include "core_types.h"
+#include "dchubbub.h"
 
-void dal_i2caux_construct_engine(
-	struct engine *engine,
-	struct dc_context *ctx)
-{
-	engine->ddc = NULL;
-	engine->ctx = ctx;
-}
+struct dcn_vmid_registers {
+	uint32_t CNTL;
+	uint32_t PAGE_TABLE_BASE_ADDR_HI32;
+	uint32_t PAGE_TABLE_BASE_ADDR_LO32;
+	uint32_t PAGE_TABLE_START_ADDR_HI32;
+	uint32_t PAGE_TABLE_START_ADDR_LO32;
+	uint32_t PAGE_TABLE_END_ADDR_HI32;
+	uint32_t PAGE_TABLE_END_ADDR_LO32;
+};
 
-void dal_i2caux_destruct_engine(
-	struct engine *engine)
-{
-	/* nothing to do */
-}
+struct dcn_vmid_page_table_config {
+	uint64_t	page_table_start_addr;
+	uint64_t	page_table_end_addr;
+	enum dcn_hubbub_page_table_depth	depth;
+	enum dcn_hubbub_page_table_block_size	block_size;
+};
 
+#endif /* DAL_DC_INC_HW_VMID_H_ */
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h b/drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h
index d6a85f48b6d1..7676f25216b1 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw_sequencer.h
@@ -38,6 +38,11 @@ enum pipe_gating_control {
 	PIPE_GATING_CONTROL_INIT
 };
 
+enum vline_select {
+	VLINE0,
+	VLINE1
+};
+
 struct dce_hwseq_wa {
 	bool blnd_crtc_trigger;
 	bool DEGVIDCN10_253;
@@ -68,8 +73,14 @@ struct stream_resource;
 
 struct hw_sequencer_funcs {
 
+	void (*disable_stream_gating)(struct dc *dc, struct pipe_ctx *pipe_ctx);
+
+	void (*enable_stream_gating)(struct dc *dc, struct pipe_ctx *pipe_ctx);
+
 	void (*init_hw)(struct dc *dc);
 
+	void (*init_pipes)(struct dc *dc, struct dc_state *context);
+
 	enum dc_status (*apply_ctx_to_hw)(
 			struct dc *dc, struct dc_state *context);
 
@@ -218,6 +229,9 @@ struct hw_sequencer_funcs {
 	void (*set_cursor_attribute)(struct pipe_ctx *pipe);
 	void (*set_cursor_sdr_white_level)(struct pipe_ctx *pipe);
 
+	void (*setup_periodic_interrupt)(struct pipe_ctx *pipe_ctx, enum vline_select vline);
+	void (*setup_vupdate_interrupt)(struct pipe_ctx *pipe_ctx);
+
 };
 
 void color_space_to_black_color(
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2c_sw_engine_dce110.h b/drivers/gpu/drm/amd/display/dc/inc/vm_helper.h
index c48c61f540a8..193407f76a80 100644
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce110/i2c_sw_engine_dce110.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/vm_helper.h
@@ -1,5 +1,5 @@
 /*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
+ * Copyright 2018 Advanced Micro Devices, Inc.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -23,21 +23,34 @@
  *
  */
 
-#ifndef __DAL_I2C_SW_ENGINE_DCE110_H__
-#define __DAL_I2C_SW_ENGINE_DCE110_H__
+#ifndef DC_INC_VM_HELPER_H_
+#define DC_INC_VM_HELPER_H_
 
-struct i2c_sw_engine_dce110 {
-	struct i2c_sw_engine base;
-	uint32_t engine_id;
+#include "dc_types.h"
+
+#define MAX_VMID 16
+#define MAX_HUBP 6
+
+struct vmid_usage {
+	uint16_t vmid_usage[2];
 };
 
-struct i2c_sw_engine_dce110_create_arg {
-	uint32_t engine_id;
-	uint32_t default_speed;
-	struct dc_context *ctx;
+struct vm_helper {
+	unsigned int num_vmid;
+	unsigned int num_hubp;
+	unsigned int num_vmids_available;
+	uint64_t ptb_assigned_to_vmid[MAX_VMID];
+	struct vmid_usage hubp_vmid_usage[MAX_HUBP];
 };
 
-struct i2c_engine *dal_i2c_sw_engine_dce110_create(
-	const struct i2c_sw_engine_dce110_create_arg *arg);
+uint8_t get_vmid_for_ptb(
+		struct vm_helper *vm_helper,
+		int64_t ptb,
+		uint8_t pipe_idx);
+
+void init_vm_helper(
+	struct vm_helper *vm_helper,
+	unsigned int num_vmid,
+	unsigned int num_hubp);
 
-#endif
+#endif /* DC_INC_VM_HELPER_H_ */
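
Editor's note: vm_helper tracks which VMID currently backs which page-table
base (PTB) on each HUBP. A hedged sketch of the call order implied by the
prototypes above; the PTB value and pipe index are hypothetical:

	static uint8_t example_assign_vmid(struct vm_helper *vm_helper)
	{
		init_vm_helper(vm_helper, MAX_VMID, MAX_HUBP);

		/* returns the VMID now mapping this PTB for pipe 0 */
		return get_vmid_for_ptb(vm_helper, 0x100000, 0);
	}
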
diff --git a/drivers/gpu/drm/amd/display/dc/irq_types.h b/drivers/gpu/drm/amd/display/dc/irq_types.h
index 0b5f3a278c22..d0ccd81ad5b4 100644
--- a/drivers/gpu/drm/amd/display/dc/irq_types.h
+++ b/drivers/gpu/drm/amd/display/dc/irq_types.h
@@ -144,6 +144,14 @@ enum dc_irq_source {
 	DC_IRQ_SOURCE_DC5_VLINE0,
 	DC_IRQ_SOURCE_DC6_VLINE0,
 
+	DC_IRQ_SOURCE_DC1_VLINE1,
+	DC_IRQ_SOURCE_DC2_VLINE1,
+	DC_IRQ_SOURCE_DC3_VLINE1,
+	DC_IRQ_SOURCE_DC4_VLINE1,
+	DC_IRQ_SOURCE_DC5_VLINE1,
+	DC_IRQ_SOURCE_DC6_VLINE1,
+
+
 	DAL_IRQ_SOURCES_NUMBER
 };
 
diff --git a/drivers/gpu/drm/amd/display/include/bios_parser_types.h b/drivers/gpu/drm/amd/display/include/bios_parser_types.h
index 7fd78a696800..01bf01a34a08 100644
--- a/drivers/gpu/drm/amd/display/include/bios_parser_types.h
+++ b/drivers/gpu/drm/amd/display/include/bios_parser_types.h
@@ -211,8 +211,8 @@ struct bp_pixel_clock_parameters {
 	/* signal_type -> Encoder Mode - needed by VBIOS Exec table */
 	enum signal_type signal_type;
 	/* Adjusted Pixel Clock (after VBIOS exec table)
-	 * that becomes Target Pixel Clock (KHz) */
-	uint32_t target_pixel_clock;
+	 * that becomes Target Pixel Clock (100 Hz units) */
+	uint32_t target_pixel_clock_100hz;
 	/* Calculated Reference divider of Display PLL */
 	uint32_t reference_divider;
 	/* Calculated Feedback divider of Display PLL */
diff --git a/drivers/gpu/drm/amd/display/include/dal_asic_id.h b/drivers/gpu/drm/amd/display/include/dal_asic_id.h
index 4f501ddcfb8d..34d6fdcb32e2 100644
--- a/drivers/gpu/drm/amd/display/include/dal_asic_id.h
+++ b/drivers/gpu/drm/amd/display/include/dal_asic_id.h
@@ -131,6 +131,7 @@
 #define INTERNAL_REV_RAVEN_A0             0x00    /* First spin of Raven */
 #define RAVEN_A0 0x01
 #define RAVEN_B0 0x21
+#define PICASSO_A0 0x41
 #if defined(CONFIG_DRM_AMD_DC_DCN1_01)
 /* DCN1_01 */
 #define RAVEN2_A0 0x81
diff --git a/drivers/gpu/drm/amd/display/include/gpio_interface.h b/drivers/gpu/drm/amd/display/include/gpio_interface.h
index e4fd31024b92..7de64195dc33 100644
--- a/drivers/gpu/drm/amd/display/include/gpio_interface.h
+++ b/drivers/gpu/drm/amd/display/include/gpio_interface.h
@@ -59,6 +59,14 @@ enum gpio_result dal_gpio_change_mode(
 	struct gpio *gpio,
 	enum gpio_mode mode);
 
+/* Lock Pin */
+enum gpio_result dal_gpio_lock_pin(
+	struct gpio *gpio);
+
+/* Unlock Pin */
+enum gpio_result dal_gpio_unlock_pin(
+	struct gpio *gpio);
+
 /* Get the GPIO id */
 enum gpio_id dal_gpio_get_id(
 	const struct gpio *gpio);
diff --git a/drivers/gpu/drm/amd/display/include/i2caux_interface.h b/drivers/gpu/drm/amd/display/include/i2caux_interface.h
index 13a3c82d118f..bb012cb1a9f5 100644
--- a/drivers/gpu/drm/amd/display/include/i2caux_interface.h
+++ b/drivers/gpu/drm/amd/display/include/i2caux_interface.h
@@ -40,9 +40,19 @@ struct aux_payload {
 	/* set following flag to write data,
 	 * reset it to read data */
 	bool write;
+	bool mot;
 	uint32_t address;
 	uint8_t length;
 	uint8_t *data;
+	/*
+	 * used to return the reply type of the transaction
+	 * ignored if NULL
+	 */
+	uint8_t *reply;
+	/* expressed in milliseconds
+	 * zero means "use default value"
+	 */
+	uint32_t defer_delay;
 };
 
 struct aux_command {
@@ -66,27 +76,4 @@ union aux_config {
 	uint32_t raw;
 };
 
-struct i2caux;
-
-struct i2caux *dal_i2caux_create(
-	struct dc_context *ctx);
-
-bool dal_i2caux_submit_i2c_command(
-	struct i2caux *i2caux,
-	struct ddc *ddc,
-	struct i2c_command *cmd);
-
-bool dal_i2caux_submit_aux_command(
-	struct i2caux *i2caux,
-	struct ddc *ddc,
-	struct aux_command *cmd);
-
-void dal_i2caux_configure_aux(
-	struct i2caux *i2caux,
-	struct ddc *ddc,
-	union aux_config cfg);
-
-void dal_i2caux_destroy(
-	struct i2caux **ptr);
-
 #endif
diff --git a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
index 479b77c2e89e..0fbc8fbc3541 100644
--- a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
+++ b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
@@ -823,7 +823,7 @@ static bool build_freesync_hdr(struct pwl_float_data_ex *rgb_regamma,
 	bool is_clipped = false;
 	struct fixed31_32 sdr_white_level;
 
-	if (fs_params == NULL || fs_params->max_content == 0 ||
+	if (fs_params->max_content == 0 ||
 			fs_params->max_display == 0)
 		return false;
 
@@ -1508,7 +1508,7 @@ static bool map_regamma_hw_to_x_user(
 	struct hw_x_point *coords = coords_x;
 	const struct pwl_float_data_ex *regamma = rgb_regamma;
 
-	if (mapUserRamp) {
+	if (ramp && mapUserRamp) {
 		copy_rgb_regamma_to_coordinates_x(coords,
 				hw_points_num,
 				rgb_regamma);
@@ -1545,7 +1545,7 @@ bool mod_color_calculate_regamma_params(struct dc_transfer_func *output_tf,
 
 	struct pwl_float_data *rgb_user = NULL;
 	struct pwl_float_data_ex *rgb_regamma = NULL;
-	struct gamma_pixel *axix_x = NULL;
+	struct gamma_pixel *axis_x = NULL;
 	struct pixel_gamma_point *coeff = NULL;
 	enum dc_transfer_func_predefined tf = TRANSFER_FUNCTION_SRGB;
 	bool ret = false;
@@ -1555,47 +1555,54 @@ bool mod_color_calculate_regamma_params(struct dc_transfer_func *output_tf,
 
 	/* we can use hardcoded curve for plain SRGB TF */
 	if (output_tf->type == TF_TYPE_PREDEFINED && canRomBeUsed == true &&
-			output_tf->tf == TRANSFER_FUNCTION_SRGB &&
-			(ramp->is_identity || (!mapUserRamp && ramp->type == GAMMA_RGB_256)))
-		return true;
+			output_tf->tf == TRANSFER_FUNCTION_SRGB) {
+		if (ramp == NULL)
+			return true;
+		if (ramp->is_identity || (!mapUserRamp && ramp->type == GAMMA_RGB_256))
+			return true;
+	}
 
 	output_tf->type = TF_TYPE_DISTRIBUTED_POINTS;
 
-	rgb_user = kvcalloc(ramp->num_entries + _EXTRA_POINTS,
+	if (ramp && (mapUserRamp || ramp->type != GAMMA_RGB_256)) {
+		rgb_user = kvcalloc(ramp->num_entries + _EXTRA_POINTS,
 			    sizeof(*rgb_user),
 			    GFP_KERNEL);
-	if (!rgb_user)
-		goto rgb_user_alloc_fail;
+		if (!rgb_user)
+			goto rgb_user_alloc_fail;
+
+		axis_x = kvcalloc(ramp->num_entries + 3, sizeof(*axis_x),
+				GFP_KERNEL);
+		if (!axis_x)
+			goto axis_x_alloc_fail;
+
+		dividers.divider1 = dc_fixpt_from_fraction(3, 2);
+		dividers.divider2 = dc_fixpt_from_int(2);
+		dividers.divider3 = dc_fixpt_from_fraction(5, 2);
+
+		build_evenly_distributed_points(
+				axis_x,
+				ramp->num_entries,
+				dividers);
+
+		if (ramp->type == GAMMA_RGB_256 && mapUserRamp)
+			scale_gamma(rgb_user, ramp, dividers);
+		else if (ramp->type == GAMMA_RGB_FLOAT_1024)
+			scale_gamma_dx(rgb_user, ramp, dividers);
+	}
+
 	rgb_regamma = kvcalloc(MAX_HW_POINTS + _EXTRA_POINTS,
 			       sizeof(*rgb_regamma),
 			       GFP_KERNEL);
 	if (!rgb_regamma)
 		goto rgb_regamma_alloc_fail;
-	axix_x = kvcalloc(ramp->num_entries + 3, sizeof(*axix_x),
-			  GFP_KERNEL);
-	if (!axix_x)
-		goto axix_x_alloc_fail;
+
 	coeff = kvcalloc(MAX_HW_POINTS + _EXTRA_POINTS, sizeof(*coeff),
 			 GFP_KERNEL);
 	if (!coeff)
 		goto coeff_alloc_fail;
 
-	dividers.divider1 = dc_fixpt_from_fraction(3, 2);
-	dividers.divider2 = dc_fixpt_from_int(2);
-	dividers.divider3 = dc_fixpt_from_fraction(5, 2);
-
 	tf = output_tf->tf;
-
-	build_evenly_distributed_points(
-			axix_x,
-			ramp->num_entries,
-			dividers);
-
-	if (ramp->type == GAMMA_RGB_256 && mapUserRamp)
-		scale_gamma(rgb_user, ramp, dividers);
-	else if (ramp->type == GAMMA_RGB_FLOAT_1024)
-		scale_gamma_dx(rgb_user, ramp, dividers);
-
 	if (tf == TRANSFER_FUNCTION_PQ) {
 		tf_pts->end_exponent = 7;
 		tf_pts->x_point_at_y1_red = 125;
@@ -1623,22 +1630,22 @@ bool mod_color_calculate_regamma_params(struct dc_transfer_func *output_tf,
 				coordinates_x, tf == TRANSFER_FUNCTION_SRGB ? true:false);
 	}
 	map_regamma_hw_to_x_user(ramp, coeff, rgb_user,
-			coordinates_x, axix_x, rgb_regamma,
+			coordinates_x, axis_x, rgb_regamma,
 			MAX_HW_POINTS, tf_pts,
-			(mapUserRamp || ramp->type != GAMMA_RGB_256) &&
-			ramp->type != GAMMA_CS_TFM_1D);
+			(mapUserRamp || (ramp && ramp->type != GAMMA_RGB_256)) &&
+			(ramp && ramp->type != GAMMA_CS_TFM_1D));
 
-	if (ramp->type == GAMMA_CS_TFM_1D)
+	if (ramp && ramp->type == GAMMA_CS_TFM_1D)
 		apply_lut_1d(ramp, MAX_HW_POINTS, tf_pts);
 
 	ret = true;
 
 	kvfree(coeff);
 coeff_alloc_fail:
-	kvfree(axix_x);
-axix_x_alloc_fail:
 	kvfree(rgb_regamma);
 rgb_regamma_alloc_fail:
+	kvfree(axis_x);
+axis_x_alloc_fail:
 	kvfree(rgb_user);
 rgb_user_alloc_fail:
 	return ret;
@@ -1758,69 +1765,85 @@ bool mod_color_calculate_degamma_params(struct dc_transfer_func *input_tf,
 {
 	struct dc_transfer_func_distributed_points *tf_pts = &input_tf->tf_pts;
 	struct dividers dividers;
-
 	struct pwl_float_data *rgb_user = NULL;
 	struct pwl_float_data_ex *curve = NULL;
 	struct gamma_pixel *axis_x = NULL;
 	struct pixel_gamma_point *coeff = NULL;
 	enum dc_transfer_func_predefined tf = TRANSFER_FUNCTION_SRGB;
+	uint32_t i;
 	bool ret = false;
 
 	if (input_tf->type == TF_TYPE_BYPASS)
 		return false;
 
-	/* we can use hardcoded curve for plain SRGB TF */
+	/* We can use the hardcoded curve for plain SRGB TF.
+	 * If linear and not mapping a user ramp, it is a bypass.
+	 */
 	if (input_tf->type == TF_TYPE_PREDEFINED &&
-			input_tf->tf == TRANSFER_FUNCTION_SRGB &&
-			(!mapUserRamp &&
-			(ramp->type == GAMMA_RGB_256 || ramp->num_entries == 0)))
+			(input_tf->tf == TRANSFER_FUNCTION_SRGB ||
+					input_tf->tf == TRANSFER_FUNCTION_LINEAR) &&
+					!mapUserRamp)
 		return true;
 
 	input_tf->type = TF_TYPE_DISTRIBUTED_POINTS;
 
-	rgb_user = kvcalloc(ramp->num_entries + _EXTRA_POINTS,
-			    sizeof(*rgb_user),
-			    GFP_KERNEL);
-	if (!rgb_user)
-		goto rgb_user_alloc_fail;
+	if (mapUserRamp && ramp && ramp->type == GAMMA_RGB_256) {
+		rgb_user = kvcalloc(ramp->num_entries + _EXTRA_POINTS,
+				sizeof(*rgb_user),
+				GFP_KERNEL);
+		if (!rgb_user)
+			goto rgb_user_alloc_fail;
+
+		axis_x = kvcalloc(ramp->num_entries + _EXTRA_POINTS, sizeof(*axis_x),
+				GFP_KERNEL);
+		if (!axis_x)
+			goto axis_x_alloc_fail;
+
+		dividers.divider1 = dc_fixpt_from_fraction(3, 2);
+		dividers.divider2 = dc_fixpt_from_int(2);
+		dividers.divider3 = dc_fixpt_from_fraction(5, 2);
+
+		build_evenly_distributed_points(
+				axis_x,
+				ramp->num_entries,
+				dividers);
+
+		scale_gamma(rgb_user, ramp, dividers);
+	}
+
 	curve = kvcalloc(MAX_HW_POINTS + _EXTRA_POINTS, sizeof(*curve),
-			 GFP_KERNEL);
+			GFP_KERNEL);
 	if (!curve)
 		goto curve_alloc_fail;
-	axis_x = kvcalloc(ramp->num_entries + _EXTRA_POINTS, sizeof(*axis_x),
-			  GFP_KERNEL);
-	if (!axis_x)
-		goto axis_x_alloc_fail;
+
 	coeff = kvcalloc(MAX_HW_POINTS + _EXTRA_POINTS, sizeof(*coeff),
-			 GFP_KERNEL);
+			GFP_KERNEL);
 	if (!coeff)
 		goto coeff_alloc_fail;
 
-	dividers.divider1 = dc_fixpt_from_fraction(3, 2);
-	dividers.divider2 = dc_fixpt_from_int(2);
-	dividers.divider3 = dc_fixpt_from_fraction(5, 2);
-
 	tf = input_tf->tf;
 
-	build_evenly_distributed_points(
-			axis_x,
-			ramp->num_entries,
-			dividers);
-
-	if (ramp->type == GAMMA_RGB_256 && mapUserRamp)
-		scale_gamma(rgb_user, ramp, dividers);
-	else if (ramp->type == GAMMA_RGB_FLOAT_1024)
-		scale_gamma_dx(rgb_user, ramp, dividers);
-
 	if (tf == TRANSFER_FUNCTION_PQ)
 		build_de_pq(curve,
 				MAX_HW_POINTS,
 				coordinates_x);
-	else
+	else if (tf == TRANSFER_FUNCTION_SRGB ||
+			tf == TRANSFER_FUNCTION_BT709)
 		build_degamma(curve,
 				MAX_HW_POINTS,
 				coordinates_x,
-				tf == TRANSFER_FUNCTION_SRGB ? true:false);
+				tf == TRANSFER_FUNCTION_SRGB ? true : false);
+	else if (tf == TRANSFER_FUNCTION_LINEAR) {
+		// just copy coordinates_x into curve
+		i = 0;
+		while (i != MAX_HW_POINTS + 1) {
+			curve[i].r = coordinates_x[i].x;
+			curve[i].g = curve[i].r;
+			curve[i].b = curve[i].r;
+			i++;
+		}
+	} else
+		goto invalid_tf_fail;
 
 	tf_pts->end_exponent = 0;
 	tf_pts->x_point_at_y1_red = 1;
@@ -1830,23 +1853,21 @@ bool mod_color_calculate_degamma_params(struct dc_transfer_func *input_tf,
 	map_regamma_hw_to_x_user(ramp, coeff, rgb_user,
 			coordinates_x, axis_x, curve,
 			MAX_HW_POINTS, tf_pts,
-			mapUserRamp && ramp->type != GAMMA_CUSTOM);
-	if (ramp->type == GAMMA_CUSTOM)
-		apply_lut_1d(ramp, MAX_HW_POINTS, tf_pts);
+			mapUserRamp && ramp && ramp->type == GAMMA_RGB_256);
 
 	ret = true;
 
+invalid_tf_fail:
 	kvfree(coeff);
 coeff_alloc_fail:
-	kvfree(axis_x);
-axis_x_alloc_fail:
 	kvfree(curve);
 curve_alloc_fail:
+	kvfree(axis_x);
+axis_x_alloc_fail:
 	kvfree(rgb_user);
 rgb_user_alloc_fail:
 
 	return ret;
-
 }
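
Editor's note: both the regamma and degamma paths above keep the kernel's
goto-unwind idiom, with the kvfree labels reordered so each failure path frees
exactly the allocations that succeeded, in reverse order, and the success path
falls through every free. A condensed sketch of the pattern with illustrative
names:

	static bool alloc_two_tables(size_t n)
	{
		bool ret = false;
		int *a, *b;

		a = kvcalloc(n, sizeof(*a), GFP_KERNEL);
		if (!a)
			goto a_alloc_fail;
		b = kvcalloc(n, sizeof(*b), GFP_KERNEL);
		if (!b)
			goto b_alloc_fail;

		/* ... use a and b ... */
		ret = true;

		kvfree(b); /* success also falls through the frees below */
	b_alloc_fail:
		kvfree(a);
	a_alloc_fail:
		return ret;
	}
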
 
 
diff --git a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
index 1544ed3f1747..94a84bc57c7a 100644
--- a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
+++ b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
@@ -108,8 +108,8 @@ static unsigned int calc_duration_in_us_from_v_total(
 {
 	unsigned int duration_in_us =
 			(unsigned int)(div64_u64(((unsigned long long)(v_total)
-				* 1000) * stream->timing.h_total,
-					stream->timing.pix_clk_khz));
+				* 10000) * stream->timing.h_total,
+					stream->timing.pix_clk_100hz));
 
 	return duration_in_us;
 }
@@ -126,7 +126,7 @@ static unsigned int calc_v_total_from_refresh(
 					refresh_in_uhz)));
 
 	v_total = div64_u64(div64_u64(((unsigned long long)(
-			frame_duration_in_ns) * stream->timing.pix_clk_khz),
+			frame_duration_in_ns) * (stream->timing.pix_clk_100hz / 10)),
 			stream->timing.h_total), 1000000);
 
 	/* v_total cannot be less than nominal */
@@ -152,7 +152,7 @@ static unsigned int calc_v_total_from_duration(
 		duration_in_us = vrr->max_duration_in_us;
 
 	v_total = div64_u64(div64_u64(((unsigned long long)(
-				duration_in_us) * stream->timing.pix_clk_khz),
+				duration_in_us) * (stream->timing.pix_clk_100hz / 10)),
 				stream->timing.h_total), 1000);
 
 	/* v_total cannot be less than nominal */
@@ -227,7 +227,7 @@ static void update_v_total_for_static_ramp(
 	}
 
 	v_total = div64_u64(div64_u64(((unsigned long long)(
-			current_duration_in_us) * stream->timing.pix_clk_khz),
+			current_duration_in_us) * (stream->timing.pix_clk_100hz / 10)),
 				stream->timing.h_total), 1000);
 
 	in_out_vrr->adjust.v_total_min = v_total;
@@ -461,6 +461,26 @@ bool mod_freesync_get_v_position(struct mod_freesync *mod_freesync,
 	return false;
 }
 
+static void build_vrr_infopacket_header_vtem(enum signal_type signal,
+		struct dc_info_packet *infopacket)
+{
+	// HEADER
+
+	// HB0, HB1, HB2 indicate PacketType VTEMPacket
+	infopacket->hb0 = 0x7F;
+	infopacket->hb1 = 0xC0;
+	infopacket->hb2 = 0x00;
+	/* HB3 Bit Fields
+	 * Reserved :1 = 0
+	 * Sync     :1 = 0
+	 * VFR      :1 = 1
+	 * Ds_Type  :2 = 0
+	 * End      :1 = 0
+	 * New      :1 = 0
+	 */
+	infopacket->hb3 = 0x20;
+}
+
 static void build_vrr_infopacket_header_v1(enum signal_type signal,
 		struct dc_info_packet *infopacket,
 		unsigned int *payload_size)
@@ -559,6 +579,54 @@ static void build_vrr_infopacket_header_v2(enum signal_type signal,
 	}
 }
 
+static void build_vrr_vtem_infopacket_data(const struct dc_stream_state *stream,
+		const struct mod_vrr_params *vrr,
+		struct dc_info_packet *infopacket)
+{
+	/* Translation of dc_info_packet bit-fields to the VTEM packet:
+	 * SB[6]
+	 * unsigned char VRR_EN        :1
+	 * unsigned char M_CONST       :1
+	 * unsigned char Reserved2     :2
+	 * unsigned char FVA_Factor_M1 :4
+	 * SB[7]
+	 * unsigned char Base_Vfront   :8
+	 * SB[8]
+	 * unsigned char Base_Refresh_Rate_98 :2
+	 * unsigned char RB                   :1
+	 * unsigned char Reserved3            :5
+	 * SB[9]
+	 * unsigned char Base_RefreshRate_07  :8
+	 */
+	unsigned int fieldRateInHz;
+
+	if (vrr->state == VRR_STATE_ACTIVE_VARIABLE ||
+				vrr->state == VRR_STATE_ACTIVE_FIXED) {
+		infopacket->sb[6] |= 0x80; //VRR_EN Bit = 1
+	} else {
+		infopacket->sb[6] &= 0x7F; //VRR_EN Bit = 0
+	}
+
+	if (!stream->timing.vic) {
+		infopacket->sb[7] = stream->timing.v_front_porch;
+
+		/* TODO: In dal2, we check mode flags for a reduced blanking timing.
+		 * Need a way to relay that information to this function.
+		 * if("ReducedBlanking")
+		 * {
+		 *   infopacket->sb[8] |= 0x20; //Set 3rd bit to 1
+		 * }
+		 */
+		fieldRateInHz = (stream->timing.pix_clk_100hz * 100) /
+				(stream->timing.h_total * stream->timing.v_total);
+
+		infopacket->sb[8] |= ((fieldRateInHz & 0x300) >> 2);
+		infopacket->sb[9] |= fieldRateInHz & 0xFF;
+
+	}
+	infopacket->valid = true;
+}
+
 static void build_vrr_infopacket_data(const struct mod_vrr_params *vrr,
 		struct dc_info_packet *infopacket)
 {
@@ -672,6 +740,19 @@ static void build_vrr_infopacket_v2(enum signal_type signal,
 	infopacket->valid = true;
 }
 
+static void build_vrr_infopacket_vtem(const struct dc_stream_state *stream,
+		const struct mod_vrr_params *vrr,
+		struct dc_info_packet *infopacket)
+{
+	//VTEM info packet for HdmiVrr
+
+	//VTEM Packet is structured differently
+	build_vrr_infopacket_header_vtem(stream->signal, infopacket);
+	build_vrr_vtem_infopacket_data(stream, vrr, infopacket);
+
+	infopacket->valid = true;
+}
+
 void mod_freesync_build_vrr_infopacket(struct mod_freesync *mod_freesync,
 		const struct dc_stream_state *stream,
 		const struct mod_vrr_params *vrr,
@@ -679,18 +760,21 @@ void mod_freesync_build_vrr_infopacket(struct mod_freesync *mod_freesync,
 		const enum color_transfer_func *app_tf,
 		struct dc_info_packet *infopacket)
 {
-	/* SPD info packet for FreeSync */
-
-	/* Check if Freesync is supported. Return if false. If true,
+	/* SPD info packet for FreeSync
+	 * VTEM info packet for HdmiVRR
+	 * Check if Freesync is supported. Return if false. If true,
 	 * set the corresponding bit in the info packet
 	 */
-	if (!vrr->supported || !vrr->send_vsif)
+	if (!vrr->supported || (!vrr->send_info_frame && packet_type != PACKET_TYPE_VTEM))
 		return;
 
 	switch (packet_type) {
 	case PACKET_TYPE_FS2:
 		build_vrr_infopacket_v2(stream->signal, vrr, app_tf, infopacket);
 		break;
+	case PACKET_TYPE_VTEM:
+		build_vrr_infopacket_vtem(stream, vrr, infopacket);
+		break;
 	case PACKET_TYPE_VRR:
 	case PACKET_TYPE_FS1:
 	default:
@@ -739,7 +823,7 @@ void mod_freesync_build_vrr_params(struct mod_freesync *mod_freesync,
 		return;
 
 	in_out_vrr->state = in_config->state;
-	in_out_vrr->send_vsif = in_config->vsif_supported;
+	in_out_vrr->send_info_frame = in_config->vsif_supported;
 
 	if (in_config->state == VRR_STATE_UNSUPPORTED) {
 		in_out_vrr->state = VRR_STATE_UNSUPPORTED;
@@ -972,7 +1056,7 @@ unsigned long long mod_freesync_calc_nominal_field_rate(
 	unsigned long long nominal_field_rate_in_uhz = 0;
 
 	/* Calculate nominal field rate for stream */
-	nominal_field_rate_in_uhz = stream->timing.pix_clk_khz;
+	nominal_field_rate_in_uhz = stream->timing.pix_clk_100hz / 10;
 	nominal_field_rate_in_uhz *= 1000ULL * 1000ULL * 1000ULL;
 	nominal_field_rate_in_uhz = div_u64(nominal_field_rate_in_uhz,
 						stream->timing.h_total);
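
Editor's note: with timings in 100 Hz units, the nominal field rate above is
pix_clk_100hz * 100 / (h_total * v_total), and the duration formulas substitute
pix_clk_100hz / 10 where the old code used pix_clk_khz. A standalone arithmetic
check with hypothetical 1080p60 numbers:

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		uint64_t pix_clk_100hz = 1485000; /* 148.5 MHz in 100 Hz units */
		uint64_t h_total = 2200, v_total = 1125;
		uint64_t rate_hz = (pix_clk_100hz * 100) / (h_total * v_total);

		printf("nominal field rate: %llu Hz\n",
		       (unsigned long long)rate_hz); /* prints 60 */
		return 0;
	}
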
diff --git a/drivers/gpu/drm/amd/display/modules/inc/mod_freesync.h b/drivers/gpu/drm/amd/display/modules/inc/mod_freesync.h
index 949a8b62aa98..4222e403b151 100644
--- a/drivers/gpu/drm/amd/display/modules/inc/mod_freesync.h
+++ b/drivers/gpu/drm/amd/display/modules/inc/mod_freesync.h
@@ -104,7 +104,7 @@ struct mod_vrr_params_fixed_refresh {
 
 struct mod_vrr_params {
 	bool supported;
-	bool send_vsif;
+	bool send_info_frame;
 	enum mod_vrr_state state;
 
 	uint32_t min_refresh_in_uhz;
diff --git a/drivers/gpu/drm/amd/display/modules/inc/mod_shared.h b/drivers/gpu/drm/amd/display/modules/inc/mod_shared.h
index 1bd02c0ac30c..b711e7e6c204 100644
--- a/drivers/gpu/drm/amd/display/modules/inc/mod_shared.h
+++ b/drivers/gpu/drm/amd/display/modules/inc/mod_shared.h
@@ -41,7 +41,8 @@ enum color_transfer_func {
 enum vrr_packet_type {
 	PACKET_TYPE_VRR,
 	PACKET_TYPE_FS1,
-	PACKET_TYPE_FS2
+	PACKET_TYPE_FS2,
+	PACKET_TYPE_VTEM
 };
 
 
diff --git a/drivers/gpu/drm/amd/display/modules/power/power_helpers.c b/drivers/gpu/drm/amd/display/modules/power/power_helpers.c
index c11a443dcbc8..038b88221c5f 100644
--- a/drivers/gpu/drm/amd/display/modules/power/power_helpers.c
+++ b/drivers/gpu/drm/amd/display/modules/power/power_helpers.c
@@ -41,6 +41,17 @@ static const unsigned char min_reduction_table[13] = {
 static const unsigned char max_reduction_table[13] = {
 0xf5, 0xe5, 0xd9, 0xcd, 0xb1, 0xa5, 0xa5, 0x80, 0x65, 0x4d, 0x4d, 0x4d, 0x32};
 
+/* ABM 2.2 Min Reduction effectively disabled (100% for all configs) */
+static const unsigned char min_reduction_table_v_2_2[13] = {
+0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
+
+/* Possible ABM 2.2 Max Reduction configs from least aggressive to most aggressive
+ *  0    1     2     3     4     5     6     7     8     9     10    11   12
+ * 96.1 89.8 74.9  69.4  64.7  52.2  48.6  39.6  30.2  25.1  19.6  12.5  12.5 %
+ */
+static const unsigned char max_reduction_table_v_2_2[13] = {
+0xf5, 0xe5, 0xbf, 0xb1, 0xa5, 0x85, 0x7c, 0x65, 0x4d, 0x40, 0x32, 0x20, 0x20};
+
 /* Predefined ABM configuration sets. We may have different configuration sets
  * in order to satisfy different power/quality requirements.
  */
@@ -56,6 +67,13 @@ static const unsigned char abm_config[abm_defines_max_config][abm_defines_max_le
 #define NUM_AGGR_LEVEL    4
 #define NUM_POWER_FN_SEGS 8
 #define NUM_BL_CURVE_SEGS 16
+#define IRAM_SIZE 256
+
+#define IRAM_RESERVE_AREA_START_V2 0xF0  // 0xF0~0xF6 are reserved; written by DMCU only
+#define IRAM_RESERVE_AREA_END_V2 0xF6  // 0xF0~0xF6 are reserved; written by DMCU only
+
+#define IRAM_RESERVE_AREA_START_V2_2 0xF0  // 0xF0~0xFF are reserved; written by DMCU only
+#define IRAM_RESERVE_AREA_END_V2_2 0xFF  // 0xF0~0xFF are reserved; written by DMCU only
 
 #pragma pack(push, 1)
 /* NOTE: iRAM is 256B in size */
@@ -86,11 +104,10 @@ struct iram_table_v_2 {
 
 	/* For reading PSR State directly from IRAM */
 	uint8_t psr_state;						/* 0xf0       */
-	uint8_t dmcu_interface_version;					/* 0xf1       */
-	uint8_t dmcu_date_version_year_b0;				/* 0xf2       */
-	uint8_t dmcu_date_version_year_b1;				/* 0xf3       */
-	uint8_t dmcu_date_version_month;				/* 0xf4       */
-	uint8_t dmcu_date_version_day;					/* 0xf5       */
+	uint8_t dmcu_mcp_interface_version;							/* 0xf1       */
+	uint8_t dmcu_abm_feature_version;							/* 0xf2       */
+	uint8_t dmcu_psr_feature_version;							/* 0xf3       */
+	uint16_t dmcu_version;										/* 0xf4       */
 	uint8_t dmcu_state;						/* 0xf6       */
 
 	uint16_t blRampReduction;					/* 0xf7       */
@@ -101,20 +118,58 @@ struct iram_table_v_2 {
 	uint8_t dummy8;							/* 0xfe       */
 	uint8_t dummy9;							/* 0xff       */
 };
-#pragma pack(pop)
 
-static uint16_t backlight_8_to_16(unsigned int backlight_8bit)
-{
-	return (uint16_t)(backlight_8bit * 0x101);
-}
+struct iram_table_v_2_2 {
+	/* flags                      */
+	uint16_t flags;							/* 0x00 U16  */
+
+	/* parameters for ABM2.2 algorithm */
+	uint8_t min_reduction[NUM_AMBI_LEVEL][NUM_AGGR_LEVEL];		/* 0x02 U0.8 */
+	uint8_t max_reduction[NUM_AMBI_LEVEL][NUM_AGGR_LEVEL];		/* 0x16 U0.8 */
+	uint8_t bright_pos_gain[NUM_AMBI_LEVEL][NUM_AGGR_LEVEL];	/* 0x2a U2.6 */
+	uint8_t dark_pos_gain[NUM_AMBI_LEVEL][NUM_AGGR_LEVEL];		/* 0x3e U2.6 */
+	uint8_t hybridFactor[NUM_AGGR_LEVEL];						/* 0x52 U0.8 */
+	uint8_t contrastFactor[NUM_AGGR_LEVEL];						/* 0x56 U0.8 */
+	uint8_t deviation_gain[NUM_AGGR_LEVEL];						/* 0x5a U0.8 */
+	uint8_t iir_curve[NUM_AMBI_LEVEL];							/* 0x5e U0.8 */
+	uint8_t pad[29];											/* 0x63 U0.8 */
+
+	/* parameters for crgb conversion */
+	uint16_t crgb_thresh[NUM_POWER_FN_SEGS];					/* 0x80 U3.13 */
+	uint16_t crgb_offset[NUM_POWER_FN_SEGS];					/* 0x90 U1.15 */
+	uint16_t crgb_slope[NUM_POWER_FN_SEGS];						/* 0xa0 U4.12 */
+
+	/* parameters for custom curve */
+	/* thresholds for brightness --> backlight */
+	uint16_t backlight_thresholds[NUM_BL_CURVE_SEGS];			/* 0xb0 U16.0 */
+	/* offsets for brightness --> backlight */
+	uint16_t backlight_offsets[NUM_BL_CURVE_SEGS];				/* 0xd0 U16.0 */
+
+	/* For reading PSR State directly from IRAM */
+	uint8_t psr_state;											/* 0xf0       */
+	uint8_t dmcu_mcp_interface_version;							/* 0xf1       */
+	uint8_t dmcu_abm_feature_version;							/* 0xf2       */
+	uint8_t dmcu_psr_feature_version;							/* 0xf3       */
+	uint16_t dmcu_version;										/* 0xf4       */
+	uint8_t dmcu_state;											/* 0xf6       */
+
+	uint8_t dummy1;												/* 0xf7       */
+	uint8_t dummy2;												/* 0xf8       */
+	uint8_t dummy3;												/* 0xf9       */
+	uint8_t dummy4;												/* 0xfa       */
+	uint8_t dummy5;												/* 0xfb       */
+	uint8_t dummy6;												/* 0xfc       */
+	uint8_t dummy7;												/* 0xfd       */
+	uint8_t dummy8;												/* 0xfe       */
+	uint8_t dummy9;												/* 0xff       */
+};
+#pragma pack(pop)
 
 static void fill_backlight_transform_table(struct dmcu_iram_parameters params,
 		struct iram_table_v_2 *table)
 {
 	unsigned int i;
 	unsigned int num_entries = NUM_BL_CURVE_SEGS;
-	unsigned int query_input_8bit;
-	unsigned int query_output_8bit;
 	unsigned int lut_index;
 
 	table->backlight_thresholds[0] = 0;
@@ -132,24 +187,368 @@ static void fill_backlight_transform_table(struct dmcu_iram_parameters params,
 	 * format U4.10.
 	 */
 	for (i = 1; i+1 < num_entries; i++) {
-		query_input_8bit = DIV_ROUNDUP((i * 256), num_entries);
+		lut_index = (params.backlight_lut_array_size - 1) * i / (num_entries - 1);
+		ASSERT(lut_index < params.backlight_lut_array_size);
+
+		table->backlight_thresholds[i] =
+			cpu_to_be16(DIV_ROUNDUP((i * 65536), num_entries));
+		table->backlight_offsets[i] =
+			cpu_to_be16(params.backlight_lut_array[lut_index]);
+	}
+}
 
+static void fill_backlight_transform_table_v_2_2(struct dmcu_iram_parameters params,
+		struct iram_table_v_2_2 *table)
+{
+	unsigned int i;
+	unsigned int num_entries = NUM_BL_CURVE_SEGS;
+	unsigned int lut_index;
+
+	table->backlight_thresholds[0] = 0;
+	table->backlight_offsets[0] = params.backlight_lut_array[0];
+	table->backlight_thresholds[num_entries-1] = 0xFFFF;
+	table->backlight_offsets[num_entries-1] =
+		params.backlight_lut_array[params.backlight_lut_array_size - 1];
+
+	/* Set up all brightness levels between 0% and 100% exclusive.
+	 * Fills the brightness-to-backlight transform table. The backlight
+	 * custom curve describes the transform from brightness to backlight.
+	 * It is defined as a set of thresholds and a set of offsets which,
+	 * together, imply extrapolation of the custom curve into 16 uniformly
+	 * spanned linear segments. Each threshold/offset is a 16-bit entry in
+	 * U16.0 format.
+	 */
+	for (i = 1; i+1 < num_entries; i++) {
 		lut_index = (params.backlight_lut_array_size - 1) * i / (num_entries - 1);
 		ASSERT(lut_index < params.backlight_lut_array_size);
-		query_output_8bit = params.backlight_lut_array[lut_index] >> 8;
 
 		table->backlight_thresholds[i] =
-				backlight_8_to_16(query_input_8bit);
+			cpu_to_be16(DIV_ROUNDUP((i * 65536), num_entries));
 		table->backlight_offsets[i] =
-				backlight_8_to_16(query_output_8bit);
+			cpu_to_be16(params.backlight_lut_array[lut_index]);
 	}
 }
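
Editor's note: both transform tables above space 16 thresholds uniformly over
the full 16-bit range (threshold[i] = roundup(i * 65536 / num_entries)) and
sample the backlight LUT proportionally. A standalone sketch of that mapping;
DIV_ROUNDUP is a local stand-in for the kernel macro and the LUT size is
hypothetical:

	#include <stdio.h>

	#define DIV_ROUNDUP(a, b) (((a) + (b) - 1) / (b))

	int main(void)
	{
		unsigned int num_entries = 16, lut_array_size = 256, i;

		for (i = 1; i + 1 < num_entries; i++) {
			unsigned int threshold =
				DIV_ROUNDUP(i * 65536, num_entries);
			unsigned int lut_index =
				(lut_array_size - 1) * i / (num_entries - 1);
			printf("seg %2u: threshold=0x%04x lut=%u\n",
			       i, threshold, lut_index);
		}
		return 0;
	}
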
 
+void fill_iram_v_2(struct iram_table_v_2 *ram_table, struct dmcu_iram_parameters params)
+{
+	unsigned int set = params.set;
+
+	ram_table->flags = 0x0;
+	ram_table->deviation_gain = 0xb3;
+
+	ram_table->blRampReduction =
+		cpu_to_be16(params.backlight_ramping_reduction);
+	ram_table->blRampStart =
+		cpu_to_be16(params.backlight_ramping_start);
+
+	ram_table->min_reduction[0][0] = min_reduction_table[abm_config[set][0]];
+	ram_table->min_reduction[1][0] = min_reduction_table[abm_config[set][0]];
+	ram_table->min_reduction[2][0] = min_reduction_table[abm_config[set][0]];
+	ram_table->min_reduction[3][0] = min_reduction_table[abm_config[set][0]];
+	ram_table->min_reduction[4][0] = min_reduction_table[abm_config[set][0]];
+	ram_table->max_reduction[0][0] = max_reduction_table[abm_config[set][0]];
+	ram_table->max_reduction[1][0] = max_reduction_table[abm_config[set][0]];
+	ram_table->max_reduction[2][0] = max_reduction_table[abm_config[set][0]];
+	ram_table->max_reduction[3][0] = max_reduction_table[abm_config[set][0]];
+	ram_table->max_reduction[4][0] = max_reduction_table[abm_config[set][0]];
+
+	ram_table->min_reduction[0][1] = min_reduction_table[abm_config[set][1]];
+	ram_table->min_reduction[1][1] = min_reduction_table[abm_config[set][1]];
+	ram_table->min_reduction[2][1] = min_reduction_table[abm_config[set][1]];
+	ram_table->min_reduction[3][1] = min_reduction_table[abm_config[set][1]];
+	ram_table->min_reduction[4][1] = min_reduction_table[abm_config[set][1]];
+	ram_table->max_reduction[0][1] = max_reduction_table[abm_config[set][1]];
+	ram_table->max_reduction[1][1] = max_reduction_table[abm_config[set][1]];
+	ram_table->max_reduction[2][1] = max_reduction_table[abm_config[set][1]];
+	ram_table->max_reduction[3][1] = max_reduction_table[abm_config[set][1]];
+	ram_table->max_reduction[4][1] = max_reduction_table[abm_config[set][1]];
+
+	ram_table->min_reduction[0][2] = min_reduction_table[abm_config[set][2]];
+	ram_table->min_reduction[1][2] = min_reduction_table[abm_config[set][2]];
+	ram_table->min_reduction[2][2] = min_reduction_table[abm_config[set][2]];
+	ram_table->min_reduction[3][2] = min_reduction_table[abm_config[set][2]];
+	ram_table->min_reduction[4][2] = min_reduction_table[abm_config[set][2]];
+	ram_table->max_reduction[0][2] = max_reduction_table[abm_config[set][2]];
+	ram_table->max_reduction[1][2] = max_reduction_table[abm_config[set][2]];
+	ram_table->max_reduction[2][2] = max_reduction_table[abm_config[set][2]];
+	ram_table->max_reduction[3][2] = max_reduction_table[abm_config[set][2]];
+	ram_table->max_reduction[4][2] = max_reduction_table[abm_config[set][2]];
+
+	ram_table->min_reduction[0][3] = min_reduction_table[abm_config[set][3]];
+	ram_table->min_reduction[1][3] = min_reduction_table[abm_config[set][3]];
+	ram_table->min_reduction[2][3] = min_reduction_table[abm_config[set][3]];
+	ram_table->min_reduction[3][3] = min_reduction_table[abm_config[set][3]];
+	ram_table->min_reduction[4][3] = min_reduction_table[abm_config[set][3]];
+	ram_table->max_reduction[0][3] = max_reduction_table[abm_config[set][3]];
+	ram_table->max_reduction[1][3] = max_reduction_table[abm_config[set][3]];
+	ram_table->max_reduction[2][3] = max_reduction_table[abm_config[set][3]];
+	ram_table->max_reduction[3][3] = max_reduction_table[abm_config[set][3]];
+	ram_table->max_reduction[4][3] = max_reduction_table[abm_config[set][3]];
+
+	ram_table->bright_pos_gain[0][0] = 0x20;
+	ram_table->bright_pos_gain[0][1] = 0x20;
+	ram_table->bright_pos_gain[0][2] = 0x20;
+	ram_table->bright_pos_gain[0][3] = 0x20;
+	ram_table->bright_pos_gain[1][0] = 0x20;
+	ram_table->bright_pos_gain[1][1] = 0x20;
+	ram_table->bright_pos_gain[1][2] = 0x20;
+	ram_table->bright_pos_gain[1][3] = 0x20;
+	ram_table->bright_pos_gain[2][0] = 0x20;
+	ram_table->bright_pos_gain[2][1] = 0x20;
+	ram_table->bright_pos_gain[2][2] = 0x20;
+	ram_table->bright_pos_gain[2][3] = 0x20;
+	ram_table->bright_pos_gain[3][0] = 0x20;
+	ram_table->bright_pos_gain[3][1] = 0x20;
+	ram_table->bright_pos_gain[3][2] = 0x20;
+	ram_table->bright_pos_gain[3][3] = 0x20;
+	ram_table->bright_pos_gain[4][0] = 0x20;
+	ram_table->bright_pos_gain[4][1] = 0x20;
+	ram_table->bright_pos_gain[4][2] = 0x20;
+	ram_table->bright_pos_gain[4][3] = 0x20;
+	ram_table->bright_neg_gain[0][1] = 0x00;
+	ram_table->bright_neg_gain[0][2] = 0x00;
+	ram_table->bright_neg_gain[0][3] = 0x00;
+	ram_table->bright_neg_gain[1][0] = 0x00;
+	ram_table->bright_neg_gain[1][1] = 0x00;
+	ram_table->bright_neg_gain[1][2] = 0x00;
+	ram_table->bright_neg_gain[1][3] = 0x00;
+	ram_table->bright_neg_gain[2][0] = 0x00;
+	ram_table->bright_neg_gain[2][1] = 0x00;
+	ram_table->bright_neg_gain[2][2] = 0x00;
+	ram_table->bright_neg_gain[2][3] = 0x00;
+	ram_table->bright_neg_gain[3][0] = 0x00;
+	ram_table->bright_neg_gain[3][1] = 0x00;
+	ram_table->bright_neg_gain[3][2] = 0x00;
+	ram_table->bright_neg_gain[3][3] = 0x00;
+	ram_table->bright_neg_gain[4][0] = 0x00;
+	ram_table->bright_neg_gain[4][1] = 0x00;
+	ram_table->bright_neg_gain[4][2] = 0x00;
+	ram_table->bright_neg_gain[4][3] = 0x00;
+	ram_table->dark_pos_gain[0][0] = 0x00;
+	ram_table->dark_pos_gain[0][1] = 0x00;
+	ram_table->dark_pos_gain[0][2] = 0x00;
+	ram_table->dark_pos_gain[0][3] = 0x00;
+	ram_table->dark_pos_gain[1][0] = 0x00;
+	ram_table->dark_pos_gain[1][1] = 0x00;
+	ram_table->dark_pos_gain[1][2] = 0x00;
+	ram_table->dark_pos_gain[1][3] = 0x00;
+	ram_table->dark_pos_gain[2][0] = 0x00;
+	ram_table->dark_pos_gain[2][1] = 0x00;
+	ram_table->dark_pos_gain[2][2] = 0x00;
+	ram_table->dark_pos_gain[2][3] = 0x00;
+	ram_table->dark_pos_gain[3][0] = 0x00;
+	ram_table->dark_pos_gain[3][1] = 0x00;
+	ram_table->dark_pos_gain[3][2] = 0x00;
+	ram_table->dark_pos_gain[3][3] = 0x00;
+	ram_table->dark_pos_gain[4][0] = 0x00;
+	ram_table->dark_pos_gain[4][1] = 0x00;
+	ram_table->dark_pos_gain[4][2] = 0x00;
+	ram_table->dark_pos_gain[4][3] = 0x00;
+	ram_table->dark_neg_gain[0][0] = 0x00;
+	ram_table->dark_neg_gain[0][1] = 0x00;
+	ram_table->dark_neg_gain[0][2] = 0x00;
+	ram_table->dark_neg_gain[0][3] = 0x00;
+	ram_table->dark_neg_gain[1][0] = 0x00;
+	ram_table->dark_neg_gain[1][1] = 0x00;
+	ram_table->dark_neg_gain[1][2] = 0x00;
+	ram_table->dark_neg_gain[1][3] = 0x00;
+	ram_table->dark_neg_gain[2][0] = 0x00;
+	ram_table->dark_neg_gain[2][1] = 0x00;
+	ram_table->dark_neg_gain[2][2] = 0x00;
+	ram_table->dark_neg_gain[2][3] = 0x00;
+	ram_table->dark_neg_gain[3][0] = 0x00;
+	ram_table->dark_neg_gain[3][1] = 0x00;
+	ram_table->dark_neg_gain[3][2] = 0x00;
+	ram_table->dark_neg_gain[3][3] = 0x00;
+	ram_table->dark_neg_gain[4][0] = 0x00;
+	ram_table->dark_neg_gain[4][1] = 0x00;
+	ram_table->dark_neg_gain[4][2] = 0x00;
+	ram_table->dark_neg_gain[4][3] = 0x00;
+
+	ram_table->iir_curve[0] = 0x65;
+	ram_table->iir_curve[1] = 0x65;
+	ram_table->iir_curve[2] = 0x65;
+	ram_table->iir_curve[3] = 0x65;
+	ram_table->iir_curve[4] = 0x65;
+
+	//Gamma 2.4
+	ram_table->crgb_thresh[0] = cpu_to_be16(0x13b6);
+	ram_table->crgb_thresh[1] = cpu_to_be16(0x1648);
+	ram_table->crgb_thresh[2] = cpu_to_be16(0x18e3);
+	ram_table->crgb_thresh[3] = cpu_to_be16(0x1b41);
+	ram_table->crgb_thresh[4] = cpu_to_be16(0x1d46);
+	ram_table->crgb_thresh[5] = cpu_to_be16(0x1f21);
+	ram_table->crgb_thresh[6] = cpu_to_be16(0x2167);
+	ram_table->crgb_thresh[7] = cpu_to_be16(0x2384);
+	ram_table->crgb_offset[0] = cpu_to_be16(0x2999);
+	ram_table->crgb_offset[1] = cpu_to_be16(0x3999);
+	ram_table->crgb_offset[2] = cpu_to_be16(0x4666);
+	ram_table->crgb_offset[3] = cpu_to_be16(0x5999);
+	ram_table->crgb_offset[4] = cpu_to_be16(0x6333);
+	ram_table->crgb_offset[5] = cpu_to_be16(0x7800);
+	ram_table->crgb_offset[6] = cpu_to_be16(0x8c00);
+	ram_table->crgb_offset[7] = cpu_to_be16(0xa000);
+	ram_table->crgb_slope[0]  = cpu_to_be16(0x3147);
+	ram_table->crgb_slope[1]  = cpu_to_be16(0x2978);
+	ram_table->crgb_slope[2]  = cpu_to_be16(0x23a2);
+	ram_table->crgb_slope[3]  = cpu_to_be16(0x1f55);
+	ram_table->crgb_slope[4]  = cpu_to_be16(0x1c63);
+	ram_table->crgb_slope[5]  = cpu_to_be16(0x1a0f);
+	ram_table->crgb_slope[6]  = cpu_to_be16(0x178d);
+	ram_table->crgb_slope[7]  = cpu_to_be16(0x15ab);
+
+	fill_backlight_transform_table(
+			params, ram_table);
+}
+
+void fill_iram_v_2_2(struct iram_table_v_2_2 *ram_table, struct dmcu_iram_parameters params)
+{
+	unsigned int set = params.set;
+
+	ram_table->flags = 0x0;
+
+	ram_table->deviation_gain[0] = 0xb3;
+	ram_table->deviation_gain[1] = 0xb3;
+	ram_table->deviation_gain[2] = 0xb3;
+	ram_table->deviation_gain[3] = 0xb3;
+
+	ram_table->min_reduction[0][0] = min_reduction_table_v_2_2[abm_config[set][0]];
+	ram_table->min_reduction[1][0] = min_reduction_table_v_2_2[abm_config[set][0]];
+	ram_table->min_reduction[2][0] = min_reduction_table_v_2_2[abm_config[set][0]];
+	ram_table->min_reduction[3][0] = min_reduction_table_v_2_2[abm_config[set][0]];
+	ram_table->min_reduction[4][0] = min_reduction_table_v_2_2[abm_config[set][0]];
+	ram_table->max_reduction[0][0] = max_reduction_table_v_2_2[abm_config[set][0]];
+	ram_table->max_reduction[1][0] = max_reduction_table_v_2_2[abm_config[set][0]];
+	ram_table->max_reduction[2][0] = max_reduction_table_v_2_2[abm_config[set][0]];
+	ram_table->max_reduction[3][0] = max_reduction_table_v_2_2[abm_config[set][0]];
+	ram_table->max_reduction[4][0] = max_reduction_table_v_2_2[abm_config[set][0]];
+
+	ram_table->min_reduction[0][1] = min_reduction_table_v_2_2[abm_config[set][1]];
+	ram_table->min_reduction[1][1] = min_reduction_table_v_2_2[abm_config[set][1]];
+	ram_table->min_reduction[2][1] = min_reduction_table_v_2_2[abm_config[set][1]];
+	ram_table->min_reduction[3][1] = min_reduction_table_v_2_2[abm_config[set][1]];
+	ram_table->min_reduction[4][1] = min_reduction_table_v_2_2[abm_config[set][1]];
+	ram_table->max_reduction[0][1] = max_reduction_table_v_2_2[abm_config[set][1]];
+	ram_table->max_reduction[1][1] = max_reduction_table_v_2_2[abm_config[set][1]];
+	ram_table->max_reduction[2][1] = max_reduction_table_v_2_2[abm_config[set][1]];
+	ram_table->max_reduction[3][1] = max_reduction_table_v_2_2[abm_config[set][1]];
+	ram_table->max_reduction[4][1] = max_reduction_table_v_2_2[abm_config[set][1]];
+
+	ram_table->min_reduction[0][2] = min_reduction_table_v_2_2[abm_config[set][2]];
+	ram_table->min_reduction[1][2] = min_reduction_table_v_2_2[abm_config[set][2]];
+	ram_table->min_reduction[2][2] = min_reduction_table_v_2_2[abm_config[set][2]];
+	ram_table->min_reduction[3][2] = min_reduction_table_v_2_2[abm_config[set][2]];
+	ram_table->min_reduction[4][2] = min_reduction_table_v_2_2[abm_config[set][2]];
+	ram_table->max_reduction[0][2] = max_reduction_table_v_2_2[abm_config[set][2]];
+	ram_table->max_reduction[1][2] = max_reduction_table_v_2_2[abm_config[set][2]];
+	ram_table->max_reduction[2][2] = max_reduction_table_v_2_2[abm_config[set][2]];
+	ram_table->max_reduction[3][2] = max_reduction_table_v_2_2[abm_config[set][2]];
+	ram_table->max_reduction[4][2] = max_reduction_table_v_2_2[abm_config[set][2]];
+
+	ram_table->min_reduction[0][3] = min_reduction_table_v_2_2[abm_config[set][3]];
+	ram_table->min_reduction[1][3] = min_reduction_table_v_2_2[abm_config[set][3]];
+	ram_table->min_reduction[2][3] = min_reduction_table_v_2_2[abm_config[set][3]];
+	ram_table->min_reduction[3][3] = min_reduction_table_v_2_2[abm_config[set][3]];
+	ram_table->min_reduction[4][3] = min_reduction_table_v_2_2[abm_config[set][3]];
+	ram_table->max_reduction[0][3] = max_reduction_table_v_2_2[abm_config[set][3]];
+	ram_table->max_reduction[1][3] = max_reduction_table_v_2_2[abm_config[set][3]];
+	ram_table->max_reduction[2][3] = max_reduction_table_v_2_2[abm_config[set][3]];
+	ram_table->max_reduction[3][3] = max_reduction_table_v_2_2[abm_config[set][3]];
+	ram_table->max_reduction[4][3] = max_reduction_table_v_2_2[abm_config[set][3]];
+
+	ram_table->bright_pos_gain[0][0] = 0x20;
+	ram_table->bright_pos_gain[0][1] = 0x20;
+	ram_table->bright_pos_gain[0][2] = 0x20;
+	ram_table->bright_pos_gain[0][3] = 0x20;
+	ram_table->bright_pos_gain[1][0] = 0x20;
+	ram_table->bright_pos_gain[1][1] = 0x20;
+	ram_table->bright_pos_gain[1][2] = 0x20;
+	ram_table->bright_pos_gain[1][3] = 0x20;
+	ram_table->bright_pos_gain[2][0] = 0x20;
+	ram_table->bright_pos_gain[2][1] = 0x20;
+	ram_table->bright_pos_gain[2][2] = 0x20;
+	ram_table->bright_pos_gain[2][3] = 0x20;
+	ram_table->bright_pos_gain[3][0] = 0x20;
+	ram_table->bright_pos_gain[3][1] = 0x20;
+	ram_table->bright_pos_gain[3][2] = 0x20;
+	ram_table->bright_pos_gain[3][3] = 0x20;
+	ram_table->bright_pos_gain[4][0] = 0x20;
+	ram_table->bright_pos_gain[4][1] = 0x20;
+	ram_table->bright_pos_gain[4][2] = 0x20;
+	ram_table->bright_pos_gain[4][3] = 0x20;
+
+	ram_table->dark_pos_gain[0][0] = 0x00;
+	ram_table->dark_pos_gain[0][1] = 0x00;
+	ram_table->dark_pos_gain[0][2] = 0x00;
+	ram_table->dark_pos_gain[0][3] = 0x00;
+	ram_table->dark_pos_gain[1][0] = 0x00;
+	ram_table->dark_pos_gain[1][1] = 0x00;
+	ram_table->dark_pos_gain[1][2] = 0x00;
+	ram_table->dark_pos_gain[1][3] = 0x00;
+	ram_table->dark_pos_gain[2][0] = 0x00;
+	ram_table->dark_pos_gain[2][1] = 0x00;
+	ram_table->dark_pos_gain[2][2] = 0x00;
+	ram_table->dark_pos_gain[2][3] = 0x00;
+	ram_table->dark_pos_gain[3][0] = 0x00;
+	ram_table->dark_pos_gain[3][1] = 0x00;
+	ram_table->dark_pos_gain[3][2] = 0x00;
+	ram_table->dark_pos_gain[3][3] = 0x00;
+	ram_table->dark_pos_gain[4][0] = 0x00;
+	ram_table->dark_pos_gain[4][1] = 0x00;
+	ram_table->dark_pos_gain[4][2] = 0x00;
+	ram_table->dark_pos_gain[4][3] = 0x00;
+
+	ram_table->hybridFactor[0] = 0xff;
+	ram_table->hybridFactor[1] = 0xff;
+	ram_table->hybridFactor[2] = 0xff;
+	ram_table->hybridFactor[3] = 0xc0;
+
+	ram_table->contrastFactor[0] = 0x99;
+	ram_table->contrastFactor[1] = 0x99;
+	ram_table->contrastFactor[2] = 0x99;
+	ram_table->contrastFactor[3] = 0x80;
+
+	ram_table->iir_curve[0] = 0x65;
+	ram_table->iir_curve[1] = 0x65;
+	ram_table->iir_curve[2] = 0x65;
+	ram_table->iir_curve[3] = 0x65;
+	ram_table->iir_curve[4] = 0x65;
+
+	//Gamma 2.2
+	ram_table->crgb_thresh[0] = cpu_to_be16(0x127c);
+	ram_table->crgb_thresh[1] = cpu_to_be16(0x151b);
+	ram_table->crgb_thresh[2] = cpu_to_be16(0x17d5);
+	ram_table->crgb_thresh[3] = cpu_to_be16(0x1a56);
+	ram_table->crgb_thresh[4] = cpu_to_be16(0x1c83);
+	ram_table->crgb_thresh[5] = cpu_to_be16(0x1e72);
+	ram_table->crgb_thresh[6] = cpu_to_be16(0x20f0);
+	ram_table->crgb_thresh[7] = cpu_to_be16(0x232b);
+	ram_table->crgb_offset[0] = cpu_to_be16(0x2999);
+	ram_table->crgb_offset[1] = cpu_to_be16(0x3999);
+	ram_table->crgb_offset[2] = cpu_to_be16(0x4666);
+	ram_table->crgb_offset[3] = cpu_to_be16(0x5999);
+	ram_table->crgb_offset[4] = cpu_to_be16(0x6333);
+	ram_table->crgb_offset[5] = cpu_to_be16(0x7800);
+	ram_table->crgb_offset[6] = cpu_to_be16(0x8c00);
+	ram_table->crgb_offset[7] = cpu_to_be16(0xa000);
+	ram_table->crgb_slope[0]  = cpu_to_be16(0x3609);
+	ram_table->crgb_slope[1]  = cpu_to_be16(0x2dfa);
+	ram_table->crgb_slope[2]  = cpu_to_be16(0x27ea);
+	ram_table->crgb_slope[3]  = cpu_to_be16(0x235d);
+	ram_table->crgb_slope[4]  = cpu_to_be16(0x2042);
+	ram_table->crgb_slope[5]  = cpu_to_be16(0x1dc3);
+	ram_table->crgb_slope[6]  = cpu_to_be16(0x1b1a);
+	ram_table->crgb_slope[7]  = cpu_to_be16(0x1910);
+
+	fill_backlight_transform_table_v_2_2(
+			params, ram_table);
+}
+
 bool dmcu_load_iram(struct dmcu *dmcu,
 	struct dmcu_iram_parameters params)
 {
-	struct iram_table_v_2 ram_table;
-	unsigned int set = params.set;
+	unsigned char ram_table[IRAM_SIZE];
+	bool result = false;
 
 	if (dmcu == NULL)
 		return false;
@@ -159,170 +558,23 @@ bool dmcu_load_iram(struct dmcu *dmcu,
 
 	memset(&ram_table, 0, sizeof(ram_table));
 
-	ram_table.flags = 0x0;
-	ram_table.deviation_gain = 0xb3;
+	if (dmcu->dmcu_version.abm_version == 0x22) {
+		fill_iram_v_2_2((struct iram_table_v_2_2 *)ram_table, params);
 
-	ram_table.blRampReduction =
-		cpu_to_be16(params.backlight_ramping_reduction);
-	ram_table.blRampStart =
-		cpu_to_be16(params.backlight_ramping_start);
+		result = dmcu->funcs->load_iram(
+				dmcu, 0, (char *)(&ram_table), IRAM_RESERVE_AREA_START_V2_2);
+	} else {
+		fill_iram_v_2((struct iram_table_v_2 *)ram_table, params);
 
-	ram_table.min_reduction[0][0] = min_reduction_table[abm_config[set][0]];
-	ram_table.min_reduction[1][0] = min_reduction_table[abm_config[set][0]];
-	ram_table.min_reduction[2][0] = min_reduction_table[abm_config[set][0]];
-	ram_table.min_reduction[3][0] = min_reduction_table[abm_config[set][0]];
-	ram_table.min_reduction[4][0] = min_reduction_table[abm_config[set][0]];
-	ram_table.max_reduction[0][0] = max_reduction_table[abm_config[set][0]];
-	ram_table.max_reduction[1][0] = max_reduction_table[abm_config[set][0]];
-	ram_table.max_reduction[2][0] = max_reduction_table[abm_config[set][0]];
-	ram_table.max_reduction[3][0] = max_reduction_table[abm_config[set][0]];
-	ram_table.max_reduction[4][0] = max_reduction_table[abm_config[set][0]];
-
-	ram_table.min_reduction[0][1] = min_reduction_table[abm_config[set][1]];
-	ram_table.min_reduction[1][1] = min_reduction_table[abm_config[set][1]];
-	ram_table.min_reduction[2][1] = min_reduction_table[abm_config[set][1]];
-	ram_table.min_reduction[3][1] = min_reduction_table[abm_config[set][1]];
-	ram_table.min_reduction[4][1] = min_reduction_table[abm_config[set][1]];
-	ram_table.max_reduction[0][1] = max_reduction_table[abm_config[set][1]];
-	ram_table.max_reduction[1][1] = max_reduction_table[abm_config[set][1]];
-	ram_table.max_reduction[2][1] = max_reduction_table[abm_config[set][1]];
-	ram_table.max_reduction[3][1] = max_reduction_table[abm_config[set][1]];
-	ram_table.max_reduction[4][1] = max_reduction_table[abm_config[set][1]];
-
-	ram_table.min_reduction[0][2] = min_reduction_table[abm_config[set][2]];
-	ram_table.min_reduction[1][2] = min_reduction_table[abm_config[set][2]];
-	ram_table.min_reduction[2][2] = min_reduction_table[abm_config[set][2]];
-	ram_table.min_reduction[3][2] = min_reduction_table[abm_config[set][2]];
-	ram_table.min_reduction[4][2] = min_reduction_table[abm_config[set][2]];
-	ram_table.max_reduction[0][2] = max_reduction_table[abm_config[set][2]];
-	ram_table.max_reduction[1][2] = max_reduction_table[abm_config[set][2]];
-	ram_table.max_reduction[2][2] = max_reduction_table[abm_config[set][2]];
-	ram_table.max_reduction[3][2] = max_reduction_table[abm_config[set][2]];
-	ram_table.max_reduction[4][2] = max_reduction_table[abm_config[set][2]];
-
-	ram_table.min_reduction[0][3] = min_reduction_table[abm_config[set][3]];
-	ram_table.min_reduction[1][3] = min_reduction_table[abm_config[set][3]];
-	ram_table.min_reduction[2][3] = min_reduction_table[abm_config[set][3]];
-	ram_table.min_reduction[3][3] = min_reduction_table[abm_config[set][3]];
-	ram_table.min_reduction[4][3] = min_reduction_table[abm_config[set][3]];
-	ram_table.max_reduction[0][3] = max_reduction_table[abm_config[set][3]];
-	ram_table.max_reduction[1][3] = max_reduction_table[abm_config[set][3]];
-	ram_table.max_reduction[2][3] = max_reduction_table[abm_config[set][3]];
-	ram_table.max_reduction[3][3] = max_reduction_table[abm_config[set][3]];
-	ram_table.max_reduction[4][3] = max_reduction_table[abm_config[set][3]];
-
-	ram_table.bright_pos_gain[0][0] = 0x20;
-	ram_table.bright_pos_gain[0][1] = 0x20;
-	ram_table.bright_pos_gain[0][2] = 0x20;
-	ram_table.bright_pos_gain[0][3] = 0x20;
-	ram_table.bright_pos_gain[1][0] = 0x20;
-	ram_table.bright_pos_gain[1][1] = 0x20;
-	ram_table.bright_pos_gain[1][2] = 0x20;
-	ram_table.bright_pos_gain[1][3] = 0x20;
-	ram_table.bright_pos_gain[2][0] = 0x20;
-	ram_table.bright_pos_gain[2][1] = 0x20;
-	ram_table.bright_pos_gain[2][2] = 0x20;
-	ram_table.bright_pos_gain[2][3] = 0x20;
-	ram_table.bright_pos_gain[3][0] = 0x20;
-	ram_table.bright_pos_gain[3][1] = 0x20;
-	ram_table.bright_pos_gain[3][2] = 0x20;
-	ram_table.bright_pos_gain[3][3] = 0x20;
-	ram_table.bright_pos_gain[4][0] = 0x20;
-	ram_table.bright_pos_gain[4][1] = 0x20;
-	ram_table.bright_pos_gain[4][2] = 0x20;
-	ram_table.bright_pos_gain[4][3] = 0x20;
-	ram_table.bright_neg_gain[0][1] = 0x00;
-	ram_table.bright_neg_gain[0][2] = 0x00;
-	ram_table.bright_neg_gain[0][3] = 0x00;
-	ram_table.bright_neg_gain[1][0] = 0x00;
-	ram_table.bright_neg_gain[1][1] = 0x00;
-	ram_table.bright_neg_gain[1][2] = 0x00;
-	ram_table.bright_neg_gain[1][3] = 0x00;
-	ram_table.bright_neg_gain[2][0] = 0x00;
-	ram_table.bright_neg_gain[2][1] = 0x00;
-	ram_table.bright_neg_gain[2][2] = 0x00;
-	ram_table.bright_neg_gain[2][3] = 0x00;
-	ram_table.bright_neg_gain[3][0] = 0x00;
-	ram_table.bright_neg_gain[3][1] = 0x00;
-	ram_table.bright_neg_gain[3][2] = 0x00;
-	ram_table.bright_neg_gain[3][3] = 0x00;
-	ram_table.bright_neg_gain[4][0] = 0x00;
-	ram_table.bright_neg_gain[4][1] = 0x00;
-	ram_table.bright_neg_gain[4][2] = 0x00;
-	ram_table.bright_neg_gain[4][3] = 0x00;
-	ram_table.dark_pos_gain[0][0] = 0x00;
-	ram_table.dark_pos_gain[0][1] = 0x00;
-	ram_table.dark_pos_gain[0][2] = 0x00;
-	ram_table.dark_pos_gain[0][3] = 0x00;
-	ram_table.dark_pos_gain[1][0] = 0x00;
-	ram_table.dark_pos_gain[1][1] = 0x00;
-	ram_table.dark_pos_gain[1][2] = 0x00;
-	ram_table.dark_pos_gain[1][3] = 0x00;
-	ram_table.dark_pos_gain[2][0] = 0x00;
-	ram_table.dark_pos_gain[2][1] = 0x00;
-	ram_table.dark_pos_gain[2][2] = 0x00;
-	ram_table.dark_pos_gain[2][3] = 0x00;
-	ram_table.dark_pos_gain[3][0] = 0x00;
-	ram_table.dark_pos_gain[3][1] = 0x00;
-	ram_table.dark_pos_gain[3][2] = 0x00;
-	ram_table.dark_pos_gain[3][3] = 0x00;
-	ram_table.dark_pos_gain[4][0] = 0x00;
-	ram_table.dark_pos_gain[4][1] = 0x00;
-	ram_table.dark_pos_gain[4][2] = 0x00;
-	ram_table.dark_pos_gain[4][3] = 0x00;
-	ram_table.dark_neg_gain[0][0] = 0x00;
-	ram_table.dark_neg_gain[0][1] = 0x00;
-	ram_table.dark_neg_gain[0][2] = 0x00;
-	ram_table.dark_neg_gain[0][3] = 0x00;
-	ram_table.dark_neg_gain[1][0] = 0x00;
-	ram_table.dark_neg_gain[1][1] = 0x00;
-	ram_table.dark_neg_gain[1][2] = 0x00;
-	ram_table.dark_neg_gain[1][3] = 0x00;
-	ram_table.dark_neg_gain[2][0] = 0x00;
-	ram_table.dark_neg_gain[2][1] = 0x00;
-	ram_table.dark_neg_gain[2][2] = 0x00;
-	ram_table.dark_neg_gain[2][3] = 0x00;
-	ram_table.dark_neg_gain[3][0] = 0x00;
-	ram_table.dark_neg_gain[3][1] = 0x00;
-	ram_table.dark_neg_gain[3][2] = 0x00;
-	ram_table.dark_neg_gain[3][3] = 0x00;
-	ram_table.dark_neg_gain[4][0] = 0x00;
-	ram_table.dark_neg_gain[4][1] = 0x00;
-	ram_table.dark_neg_gain[4][2] = 0x00;
-	ram_table.dark_neg_gain[4][3] = 0x00;
-	ram_table.iir_curve[0] = 0x65;
-	ram_table.iir_curve[1] = 0x65;
-	ram_table.iir_curve[2] = 0x65;
-	ram_table.iir_curve[3] = 0x65;
-	ram_table.iir_curve[4] = 0x65;
-	ram_table.crgb_thresh[0] = cpu_to_be16(0x13b6);
-	ram_table.crgb_thresh[1] = cpu_to_be16(0x1648);
-	ram_table.crgb_thresh[2] = cpu_to_be16(0x18e3);
-	ram_table.crgb_thresh[3] = cpu_to_be16(0x1b41);
-	ram_table.crgb_thresh[4] = cpu_to_be16(0x1d46);
-	ram_table.crgb_thresh[5] = cpu_to_be16(0x1f21);
-	ram_table.crgb_thresh[6] = cpu_to_be16(0x2167);
-	ram_table.crgb_thresh[7] = cpu_to_be16(0x2384);
-	ram_table.crgb_offset[0] = cpu_to_be16(0x2999);
-	ram_table.crgb_offset[1] = cpu_to_be16(0x3999);
-	ram_table.crgb_offset[2] = cpu_to_be16(0x4666);
-	ram_table.crgb_offset[3] = cpu_to_be16(0x5999);
-	ram_table.crgb_offset[4] = cpu_to_be16(0x6333);
-	ram_table.crgb_offset[5] = cpu_to_be16(0x7800);
-	ram_table.crgb_offset[6] = cpu_to_be16(0x8c00);
-	ram_table.crgb_offset[7] = cpu_to_be16(0xa000);
-	ram_table.crgb_slope[0]  = cpu_to_be16(0x3147);
-	ram_table.crgb_slope[1]  = cpu_to_be16(0x2978);
-	ram_table.crgb_slope[2]  = cpu_to_be16(0x23a2);
-	ram_table.crgb_slope[3]  = cpu_to_be16(0x1f55);
-	ram_table.crgb_slope[4]  = cpu_to_be16(0x1c63);
-	ram_table.crgb_slope[5]  = cpu_to_be16(0x1a0f);
-	ram_table.crgb_slope[6]  = cpu_to_be16(0x178d);
-	ram_table.crgb_slope[7]  = cpu_to_be16(0x15ab);
+		result = dmcu->funcs->load_iram(
+				dmcu, 0, (char *)(&ram_table), IRAM_RESERVE_AREA_START_V2);
 
-	fill_backlight_transform_table(
-			params, &ram_table);
+		if (result)
+			result = dmcu->funcs->load_iram(
+					dmcu, IRAM_RESERVE_AREA_END_V2 + 1,
+					(char *)(&ram_table) + IRAM_RESERVE_AREA_END_V2 + 1,
+					sizeof(ram_table) - IRAM_RESERVE_AREA_END_V2 - 1);
+	}
 
-	return dmcu->funcs->load_iram(
-			dmcu, 0, (char *)(&ram_table), sizeof(ram_table));
+	return result;
 }
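
The rewritten dmcu_load_iram() above dispatches on abm_version: the v2.2 table fits below its reserved area and goes out in a single load_iram() call, while the legacy v2 path writes the bytes before IRAM_RESERVE_AREA_START_V2 and then resumes after IRAM_RESERVE_AREA_END_V2, so the firmware-owned window is never overwritten. Below is a minimal userspace sketch of that split-write pattern; the constants and the write_iram() callback are illustrative stand-ins, not the driver's API.

#include <stdio.h>
#include <string.h>

/* Illustrative stand-ins for the driver's constants. */
#define IRAM_SIZE		256
#define RESERVE_AREA_START	0x80	/* cf. IRAM_RESERVE_AREA_START_V2 */
#define RESERVE_AREA_END	0xbf	/* cf. IRAM_RESERVE_AREA_END_V2 */

/* Hypothetical write callback standing in for dmcu->funcs->load_iram(). */
static int write_iram(unsigned int start, const char *src, unsigned int bytes)
{
	printf("write %u bytes at offset 0x%02x\n", bytes, start);
	return 1;	/* pretend the hardware accepted the write */
}

/* Upload the table in two chunks, skipping the reserved window. */
static int load_iram_skipping_reserved(const char *table)
{
	if (!write_iram(0, table, RESERVE_AREA_START))
		return 0;
	return write_iram(RESERVE_AREA_END + 1,
			  table + RESERVE_AREA_END + 1,
			  IRAM_SIZE - RESERVE_AREA_END - 1);
}

int main(void)
{
	char table[IRAM_SIZE] = { 0 };

	return load_iram_skipping_reserved(table) ? 0 : 1;
}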
diff --git a/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_offset.h b/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_offset.h
index 13d4de645190..d8e0dd192fdd 100644
--- a/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_offset.h
+++ b/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_offset.h
@@ -2247,6 +2247,8 @@
 
 // addressBlock: nbio_nbif_rcc_strap_BIFDEC1[13440..14975]
 // base address: 0x3480
+#define mmRCC_BIF_STRAP0                                                                               0x0000
+#define mmRCC_BIF_STRAP0_BASE_IDX                                                                      2
 #define mmRCC_DEV0_EPF0_STRAP0                                                                         0x000f
 #define mmRCC_DEV0_EPF0_STRAP0_BASE_IDX                                                                2
 
diff --git a/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_sh_mask.h b/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_sh_mask.h
index a02b67943372..29af5167cd00 100644
--- a/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_sh_mask.h
+++ b/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_sh_mask.h
@@ -16838,6 +16838,10 @@
 
 
 // addressBlock: nbio_nbif_rcc_strap_BIFDEC1[13440..14975]
+//RCC_BIF_STRAP0
+#define RCC_BIF_STRAP0__STRAP_PX_CAPABLE__SHIFT                                                               0x7
+#define RCC_BIF_STRAP0__STRAP_PX_CAPABLE_MASK                                                                 0x00000080L
+
 //RCC_DEV0_EPF0_STRAP0
 #define RCC_DEV0_EPF0_STRAP0__STRAP_DEVICE_ID_DEV0_F0__SHIFT                                                  0x0
 #define RCC_DEV0_EPF0_STRAP0__STRAP_MAJOR_REV_ID_DEV0_F0__SHIFT                                               0x10
diff --git a/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_smn.h b/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_smn.h
new file mode 100644
index 000000000000..8c75669eb500
--- /dev/null
+++ b/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_6_1_smn.h
@@ -0,0 +1,58 @@
+/*
+ * Copyright (C) 2019  Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN
+ * AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef _nbio_6_1_SMN_HEADER
+#define _nbio_6_1_SMN_HEADER
+
+
+#define smnCPM_CONTROL					0x11180460
+#define smnPCIE_CNTL2					0x11180070
+#define smnPCIE_CONFIG_CNTL				0x11180044
+#define smnPCIE_CI_CNTL					0x11180080
+
+
+#define smnPCIE_PERF_COUNT_CNTL				0x11180200
+#define smnPCIE_PERF_CNTL_TXCLK				0x11180204
+#define smnPCIE_PERF_COUNT0_TXCLK			0x11180208
+#define smnPCIE_PERF_COUNT1_TXCLK			0x1118020c
+#define smnPCIE_PERF_CNTL_MST_R_CLK			0x11180210
+#define smnPCIE_PERF_COUNT0_MST_R_CLK			0x11180214
+#define smnPCIE_PERF_COUNT1_MST_R_CLK			0x11180218
+#define smnPCIE_PERF_CNTL_MST_C_CLK			0x1118021c
+#define smnPCIE_PERF_COUNT0_MST_C_CLK			0x11180220
+#define smnPCIE_PERF_COUNT1_MST_C_CLK			0x11180224
+#define smnPCIE_PERF_CNTL_SLV_R_CLK			0x11180228
+#define smnPCIE_PERF_COUNT0_SLV_R_CLK			0x1118022c
+#define smnPCIE_PERF_COUNT1_SLV_R_CLK			0x11180230
+#define smnPCIE_PERF_CNTL_SLV_S_C_CLK			0x11180234
+#define smnPCIE_PERF_COUNT0_SLV_S_C_CLK			0x11180238
+#define smnPCIE_PERF_COUNT1_SLV_S_C_CLK			0x1118023c
+#define smnPCIE_PERF_CNTL_SLV_NS_C_CLK			0x11180240
+#define smnPCIE_PERF_COUNT0_SLV_NS_C_CLK		0x11180244
+#define smnPCIE_PERF_COUNT1_SLV_NS_C_CLK		0x11180248
+#define smnPCIE_PERF_CNTL_EVENT0_PORT_SEL		0x1118024c
+#define smnPCIE_PERF_CNTL_EVENT1_PORT_SEL		0x11180250
+#define smnPCIE_PERF_CNTL_TXCLK2			0x11180254
+#define smnPCIE_PERF_COUNT0_TXCLK2			0x11180258
+#define smnPCIE_PERF_COUNT1_TXCLK2			0x1118025c
+
+#endif	// _nbio_6_1_SMN_HEADER
+
diff --git a/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_0_smn.h b/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_0_smn.h
new file mode 100644
index 000000000000..5563f0715896
--- /dev/null
+++ b/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_0_smn.h
@@ -0,0 +1,54 @@
+/*
+ * Copyright (C) 2019  Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN
+ * AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef _nbio_7_0_SMN_HEADER
+#define _nbio_7_0_SMN_HEADER
+
+
+#define smnCPM_CONTROL					0x11180460
+#define smnPCIE_CNTL2					0x11180070
+
+#define smnPCIE_PERF_COUNT_CNTL				0x11180200
+#define smnPCIE_PERF_CNTL_TXCLK				0x11180204
+#define smnPCIE_PERF_COUNT0_TXCLK			0x11180208
+#define smnPCIE_PERF_COUNT1_TXCLK			0x1118020c
+#define smnPCIE_PERF_CNTL_MST_R_CLK			0x11180210
+#define smnPCIE_PERF_COUNT0_MST_R_CLK			0x11180214
+#define smnPCIE_PERF_COUNT1_MST_R_CLK			0x11180218
+#define smnPCIE_PERF_CNTL_MST_C_CLK			0x1118021c
+#define smnPCIE_PERF_COUNT0_MST_C_CLK			0x11180220
+#define smnPCIE_PERF_COUNT1_MST_C_CLK			0x11180224
+#define smnPCIE_PERF_CNTL_SLV_R_CLK			0x11180228
+#define smnPCIE_PERF_COUNT0_SLV_R_CLK			0x1118022c
+#define smnPCIE_PERF_COUNT1_SLV_R_CLK			0x11180230
+#define smnPCIE_PERF_CNTL_SLV_S_C_CLK			0x11180234
+#define smnPCIE_PERF_COUNT0_SLV_S_C_CLK			0x11180238
+#define smnPCIE_PERF_COUNT1_SLV_S_C_CLK			0x1118023c
+#define smnPCIE_PERF_CNTL_SLV_NS_C_CLK			0x11180240
+#define smnPCIE_PERF_COUNT0_SLV_NS_C_CLK		0x11180244
+#define smnPCIE_PERF_COUNT1_SLV_NS_C_CLK		0x11180248
+#define smnPCIE_PERF_CNTL_EVENT0_PORT_SEL		0x1118024c
+#define smnPCIE_PERF_CNTL_EVENT1_PORT_SEL		0x11180250
+#define smnPCIE_PERF_CNTL_TXCLK2			0x11180254
+#define smnPCIE_PERF_COUNT0_TXCLK2			0x11180258
+#define smnPCIE_PERF_COUNT1_TXCLK2			0x1118025c
+
+#endif	// _nbio_7_0_SMN_HEADER
diff --git a/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_4_0_smn.h b/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_4_0_smn.h
new file mode 100644
index 000000000000..c1457d880c4d
--- /dev/null
+++ b/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_4_0_smn.h
@@ -0,0 +1,53 @@
+/*
+ * Copyright (C) 2019  Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN
+ * AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef _nbio_7_4_0_SMN_HEADER
+#define _nbio_7_4_0_SMN_HEADER
+
+
+#define smnNBIF_MGCG_CTRL_LCLK				0x1013a21c
+#define smnCPM_CONTROL					0x11180460
+#define smnPCIE_CNTL2					0x11180070
+#define smnPCIE_CI_CNTL					0x11180080
+
+#define smnPCIE_PERF_COUNT_CNTL				0x11180200
+#define smnPCIE_PERF_CNTL_TXCLK1			0x11180204
+#define smnPCIE_PERF_COUNT0_TXCLK1			0x11180208
+#define smnPCIE_PERF_COUNT1_TXCLK1			0x1118020c
+#define smnPCIE_PERF_CNTL_TXCLK2			0x11180210
+#define smnPCIE_PERF_COUNT0_TXCLK2			0x11180214
+#define smnPCIE_PERF_COUNT1_TXCLK2			0x11180218
+#define smnPCIE_PERF_CNTL_TXCLK3			0x1118021c
+#define smnPCIE_PERF_COUNT0_TXCLK3			0x11180220
+#define smnPCIE_PERF_COUNT1_TXCLK3			0x11180224
+#define smnPCIE_PERF_CNTL_TXCLK4			0x11180228
+#define smnPCIE_PERF_COUNT0_TXCLK4			0x1118022c
+#define smnPCIE_PERF_COUNT1_TXCLK4			0x11180230
+#define smnPCIE_PERF_CNTL_SCLK1				0x11180234
+#define smnPCIE_PERF_COUNT0_SCLK1			0x11180238
+#define smnPCIE_PERF_COUNT1_SCLK1			0x1118023c
+#define smnPCIE_PERF_CNTL_SCLK2				0x11180240
+#define smnPCIE_PERF_COUNT0_SCLK2			0x11180244
+#define smnPCIE_PERF_COUNT1_SCLK2			0x11180248
+#define smnPCIE_PERF_CNTL_EVENT_LC_PORT_SEL		0x1118024c
+#define smnPCIE_PERF_CNTL_EVENT_CI_PORT_SEL		0x11180250
+
+#endif	// _nbio_7_4_0_SMN_HEADER
diff --git a/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_4_offset.h b/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_4_offset.h
index e932213f87f0..994e796a28d7 100644
--- a/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_4_offset.h
+++ b/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_4_offset.h
@@ -2567,6 +2567,8 @@
 
 // addressBlock: nbio_nbif0_rcc_strap_BIFDEC1
 // base address: 0x0
+#define mmRCC_BIF_STRAP0                                                                               0x0000
+#define mmRCC_BIF_STRAP0_BASE_IDX                                                                      2
 #define mmRCC_DEV0_EPF0_STRAP0                                                                         0x0011
 #define mmRCC_DEV0_EPF0_STRAP0_BASE_IDX                                                                2
 
diff --git a/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_4_sh_mask.h b/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_4_sh_mask.h
index d3704b438f2d..d467b939c971 100644
--- a/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_4_sh_mask.h
+++ b/drivers/gpu/drm/amd/include/asic_reg/nbio/nbio_7_4_sh_mask.h
@@ -19690,6 +19690,9 @@
 
 
 // addressBlock: nbio_nbif0_rcc_strap_BIFDEC1
+//RCC_BIF_STRAP0
+#define RCC_BIF_STRAP0__STRAP_PX_CAPABLE__SHIFT                                                               0x7
+#define RCC_BIF_STRAP0__STRAP_PX_CAPABLE_MASK                                                                 0x00000080L
 //RCC_DEV0_EPF0_STRAP0
 #define RCC_DEV0_EPF0_STRAP0__STRAP_DEVICE_ID_DEV0_F0__SHIFT                                                  0x0
 #define RCC_DEV0_EPF0_STRAP0__STRAP_MAJOR_REV_ID_DEV0_F0__SHIFT                                               0x10
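
Both sh_mask headers add the same STRAP_PX_CAPABLE field, a single bit at position 7 of RCC_BIF_STRAP0. The SHIFT/MASK pairs follow the usual convention: mask the field out, then shift it down to bit 0. A small self-contained sketch (the helper name is ours, the macros are from the patch):

#include <stdint.h>
#include <stdio.h>

#define RCC_BIF_STRAP0__STRAP_PX_CAPABLE__SHIFT	0x7
#define RCC_BIF_STRAP0__STRAP_PX_CAPABLE_MASK	0x00000080L

/* Standard field extraction for these SHIFT/MASK pairs. */
static uint32_t strap_px_capable(uint32_t strap0)
{
	return (strap0 & RCC_BIF_STRAP0__STRAP_PX_CAPABLE_MASK) >>
	       RCC_BIF_STRAP0__STRAP_PX_CAPABLE__SHIFT;
}

int main(void)
{
	printf("%u\n", strap_px_capable(0x80));	/* 1: PX (hybrid) capable */
	return 0;
}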
diff --git a/drivers/gpu/drm/amd/include/asic_reg/thm/thm_11_0_2_offset.h b/drivers/gpu/drm/amd/include/asic_reg/thm/thm_11_0_2_offset.h
index a9eb57a53e59..a485526f3a51 100644
--- a/drivers/gpu/drm/amd/include/asic_reg/thm/thm_11_0_2_offset.h
+++ b/drivers/gpu/drm/amd/include/asic_reg/thm/thm_11_0_2_offset.h
@@ -46,4 +46,7 @@
 #define mmTHM_TCON_THERM_TRIP                                                                          0x0002
 #define mmTHM_TCON_THERM_TRIP_BASE_IDX                                                                 0
 
+#define mmTHM_BACO_CNTL                                                                                0x0081
+#define mmTHM_BACO_CNTL_BASE_IDX                                                                       0
+
 #endif
diff --git a/drivers/gpu/drm/amd/include/atombios.h b/drivers/gpu/drm/amd/include/atombios.h
index 7931502fa54f..8ba21747b40a 100644
--- a/drivers/gpu/drm/amd/include/atombios.h
+++ b/drivers/gpu/drm/amd/include/atombios.h
@@ -4106,7 +4106,7 @@ typedef struct  _ATOM_LCD_MODE_CONTROL_CAP
 typedef struct _ATOM_FAKE_EDID_PATCH_RECORD
 {
   UCHAR ucRecordType;
-  UCHAR ucFakeEDIDLength;       // = 128 means EDID lenght is 128 bytes, otherwise the EDID length = ucFakeEDIDLength*128
+  UCHAR ucFakeEDIDLength;       // = 128 means EDID length is 128 bytes, otherwise the EDID length = ucFakeEDIDLength*128
   UCHAR ucFakeEDIDString[1];    // This actually has ucFakeEdidLength elements.
 } ATOM_FAKE_EDID_PATCH_RECORD;
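
The corrected comment spells out a slightly unusual encoding: a stored value of exactly 128 means a 128-byte EDID, while any other value counts 128-byte blocks. A sketch of that decoding rule as stated (not driver code):

#include <stdio.h>

/* Decode ucFakeEDIDLength per the comment above: exactly 128 means a
 * 128-byte EDID; any other value is a count of 128-byte blocks. */
static unsigned int fake_edid_bytes(unsigned char len_field)
{
	return len_field == 128 ? 128u : len_field * 128u;
}

int main(void)
{
	printf("%u\n", fake_edid_bytes(128));	/* 128 */
	printf("%u\n", fake_edid_bytes(2));	/* 256 */
	return 0;
}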
 
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index 8154d67388cc..5f3c10ebff08 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -34,7 +34,6 @@
 
 struct pci_dev;
 
-#define KFD_INTERFACE_VERSION 2
 #define KGD_MAX_QUEUES 128
 
 struct kfd_dev;
@@ -138,20 +137,17 @@ struct kgd2kfd_shared_resources {
 	/* Bit n == 1 means Queue n is available for KFD */
 	DECLARE_BITMAP(queue_bitmap, KGD_MAX_QUEUES);
 
-	/* Doorbell assignments (SOC15 and later chips only). Only
+	/* SDMA doorbell assignments (SOC15 and later chips only). Only
 	 * specific doorbells are routed to each SDMA engine. Others
 	 * are routed to IH and VCN. They are not usable by the CP.
-	 *
-	 * Any doorbell number D that satisfies the following condition
-	 * is reserved: (D & reserved_doorbell_mask) == reserved_doorbell_val
-	 *
-	 * KFD currently uses 1024 (= 0x3ff) doorbells per process. If
-	 * doorbells 0x0e0-0x0ff and 0x2e0-0x2ff are reserved, that means
-	 * mask would be set to 0x1e0 and val set to 0x0e0.
 	 */
-	unsigned int sdma_doorbell[2][8];
-	unsigned int reserved_doorbell_mask;
-	unsigned int reserved_doorbell_val;
+	uint32_t *sdma_doorbell_idx;
+
+	/* From SOC15 onward, the doorbell index range not usable for CP
+	 * queues.
+	 */
+	uint32_t non_cp_doorbells_start;
+	uint32_t non_cp_doorbells_end;
 
 	/* Base address of doorbell aperture. */
 	phys_addr_t doorbell_physical_address;
@@ -330,56 +326,4 @@ struct kfd2kgd_calls {
 
 };
 
-/**
- * struct kgd2kfd_calls
- *
- * @exit: Notifies amdkfd that kgd module is unloaded
- *
- * @probe: Notifies amdkfd about a probe done on a device in the kgd driver.
- *
- * @device_init: Initialize the newly probed device (if it is a device that
- * amdkfd supports)
- *
- * @device_exit: Notifies amdkfd about a removal of a kgd device
- *
- * @suspend: Notifies amdkfd about a suspend action done to a kgd device
- *
- * @resume: Notifies amdkfd about a resume action done to a kgd device
- *
- * @quiesce_mm: Quiesce all user queue access to specified MM address space
- *
- * @resume_mm: Resume user queue access to specified MM address space
- *
- * @schedule_evict_and_restore_process: Schedules work queue that will prepare
- * for safe eviction of KFD BOs that belong to the specified process.
- *
- * @pre_reset: Notifies amdkfd that amdgpu about to reset the gpu
- *
- * @post_reset: Notify amdkfd that amgpu successfully reseted the gpu
- *
- * This structure contains function callback pointers so the kgd driver
- * will notify to the amdkfd about certain status changes.
- *
- */
-struct kgd2kfd_calls {
-	void (*exit)(void);
-	struct kfd_dev* (*probe)(struct kgd_dev *kgd, struct pci_dev *pdev,
-		const struct kfd2kgd_calls *f2g);
-	bool (*device_init)(struct kfd_dev *kfd,
-			const struct kgd2kfd_shared_resources *gpu_resources);
-	void (*device_exit)(struct kfd_dev *kfd);
-	void (*interrupt)(struct kfd_dev *kfd, const void *ih_ring_entry);
-	void (*suspend)(struct kfd_dev *kfd);
-	int (*resume)(struct kfd_dev *kfd);
-	int (*quiesce_mm)(struct mm_struct *mm);
-	int (*resume_mm)(struct mm_struct *mm);
-	int (*schedule_evict_and_restore_process)(struct mm_struct *mm,
-			struct dma_fence *fence);
-	int  (*pre_reset)(struct kfd_dev *kfd);
-	int  (*post_reset)(struct kfd_dev *kfd);
-};
-
-int kgd2kfd_init(unsigned interface_version,
-		const struct kgd2kfd_calls **g2f);
-
 #endif	/* KGD_KFD_INTERFACE_H_INCLUDED */
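
The doorbell bookkeeping in kgd2kfd_shared_resources changes shape here: per-engine SDMA doorbell indices move into the sdma_doorbell_idx array, and the old mask/value reservation scheme gives way to an explicit [non_cp_doorbells_start, non_cp_doorbells_end] window that CP queues must avoid. A minimal sketch of the resulting usability check, with a hypothetical helper name and an example window:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical predicate (the helper name is ours): a doorbell index can
 * back a CP queue only if it lies outside the reserved non-CP window. */
static bool doorbell_usable_by_cp(uint32_t idx, uint32_t non_cp_start,
				  uint32_t non_cp_end)
{
	return idx < non_cp_start || idx > non_cp_end;
}

int main(void)
{
	/* Example window [0x1f0, 0x1ff]; the real bounds come from
	 * kgd2kfd_shared_resources. */
	printf("%d\n", doorbell_usable_by_cp(0x100, 0x1f0, 0x1ff));	/* 1 */
	printf("%d\n", doorbell_usable_by_cp(0x1f4, 0x1f0, 0x1ff));	/* 0 */
	return 0;
}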
diff --git a/drivers/gpu/drm/amd/include/kgd_pp_interface.h b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
index 789c4f288485..2b579ba9b685 100644
--- a/drivers/gpu/drm/amd/include/kgd_pp_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
@@ -92,6 +92,9 @@ enum pp_clock_type {
 	PP_SCLK,
 	PP_MCLK,
 	PP_PCIE,
+	PP_SOCCLK,
+	PP_FCLK,
+	PP_DCEFCLK,
 	OD_SCLK,
 	OD_MCLK,
 	OD_VDDC_CURVE,
@@ -281,6 +284,11 @@ struct amd_pm_funcs {
 	int (*set_hard_min_dcefclk_by_freq)(void *handle, uint32_t clock);
 	int (*set_hard_min_fclk_by_freq)(void *handle, uint32_t clock);
 	int (*set_min_deep_sleep_dcefclk)(void *handle, uint32_t clock);
+	int (*get_asic_baco_capability)(void *handle, bool *cap);
+	int (*get_asic_baco_state)(void *handle, int *state);
+	int (*set_asic_baco_state)(void *handle, int state);
+	int (*get_ppfeature_status)(void *handle, char *buf);
+	int (*set_ppfeature_status)(void *handle, uint64_t ppfeature_masks);
 };
 
 #endif
diff --git a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
index 9bc27f468d5b..3f73f7cd18b9 100644
--- a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
+++ b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
@@ -1404,6 +1404,97 @@ static int pp_set_active_display_count(void *handle, uint32_t count)
 	return ret;
 }
 
+static int pp_get_asic_baco_capability(void *handle, bool *cap)
+{
+	struct pp_hwmgr *hwmgr = handle;
+
+	if (!hwmgr)
+		return -EINVAL;
+
+	if (!hwmgr->pm_en || !hwmgr->hwmgr_func->get_asic_baco_capability)
+		return 0;
+
+	mutex_lock(&hwmgr->smu_lock);
+	hwmgr->hwmgr_func->get_asic_baco_capability(hwmgr, cap);
+	mutex_unlock(&hwmgr->smu_lock);
+
+	return 0;
+}
+
+static int pp_get_asic_baco_state(void *handle, int *state)
+{
+	struct pp_hwmgr *hwmgr = handle;
+
+	if (!hwmgr)
+		return -EINVAL;
+
+	if (!hwmgr->pm_en || !hwmgr->hwmgr_func->get_asic_baco_state)
+		return 0;
+
+	mutex_lock(&hwmgr->smu_lock);
+	hwmgr->hwmgr_func->get_asic_baco_state(hwmgr, (enum BACO_STATE *)state);
+	mutex_unlock(&hwmgr->smu_lock);
+
+	return 0;
+}
+
+static int pp_set_asic_baco_state(void *handle, int state)
+{
+	struct pp_hwmgr *hwmgr = handle;
+
+	if (!hwmgr)
+		return -EINVAL;
+
+	if (!hwmgr->pm_en || !hwmgr->hwmgr_func->set_asic_baco_state)
+		return 0;
+
+	mutex_lock(&hwmgr->smu_lock);
+	hwmgr->hwmgr_func->set_asic_baco_state(hwmgr, (enum BACO_STATE)state);
+	mutex_unlock(&hwmgr->smu_lock);
+
+	return 0;
+}
+
+static int pp_get_ppfeature_status(void *handle, char *buf)
+{
+	struct pp_hwmgr *hwmgr = handle;
+	int ret = 0;
+
+	if (!hwmgr || !hwmgr->pm_en || !buf)
+		return -EINVAL;
+
+	if (hwmgr->hwmgr_func->get_ppfeature_status == NULL) {
+		pr_info_ratelimited("%s was not implemented.\n", __func__);
+		return -EINVAL;
+	}
+
+	mutex_lock(&hwmgr->smu_lock);
+	ret = hwmgr->hwmgr_func->get_ppfeature_status(hwmgr, buf);
+	mutex_unlock(&hwmgr->smu_lock);
+
+	return ret;
+}
+
+static int pp_set_ppfeature_status(void *handle, uint64_t ppfeature_masks)
+{
+	struct pp_hwmgr *hwmgr = handle;
+	int ret = 0;
+
+	if (!hwmgr || !hwmgr->pm_en)
+		return -EINVAL;
+
+	if (hwmgr->hwmgr_func->set_ppfeature_status == NULL) {
+		pr_info_ratelimited("%s was not implemented.\n", __func__);
+		return -EINVAL;
+	}
+
+	mutex_lock(&hwmgr->smu_lock);
+	ret = hwmgr->hwmgr_func->set_ppfeature_status(hwmgr, ppfeature_masks);
+	mutex_unlock(&hwmgr->smu_lock);
+
+	return ret;
+}
+
 static const struct amd_pm_funcs pp_dpm_funcs = {
 	.load_firmware = pp_dpm_load_fw,
 	.wait_for_fw_loading_complete = pp_dpm_fw_loading_complete,
@@ -1454,4 +1545,9 @@ static const struct amd_pm_funcs pp_dpm_funcs = {
 	.set_min_deep_sleep_dcefclk = pp_set_min_deep_sleep_dcefclk,
 	.set_hard_min_dcefclk_by_freq = pp_set_hard_min_dcefclk_by_freq,
 	.set_hard_min_fclk_by_freq = pp_set_hard_min_fclk_by_freq,
+	.get_asic_baco_capability = pp_get_asic_baco_capability,
+	.get_asic_baco_state = pp_get_asic_baco_state,
+	.set_asic_baco_state = pp_set_asic_baco_state,
+	.get_ppfeature_status = pp_get_ppfeature_status,
+	.set_ppfeature_status = pp_set_ppfeature_status,
 };
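
All five pp_* entry points added above share one shape: validate the handle, return quietly (or -EINVAL for the ppfeature pair) when the backend hook is absent, and serialize the dispatch under smu_lock. A compilable userspace sketch of that pattern with mocked types; none of these names are the driver's own:

#include <stdio.h>
#include <pthread.h>

struct hwmgr_funcs {
	int (*get_state)(int *state);
};

struct hwmgr {
	int pm_en;
	pthread_mutex_t smu_lock;
	const struct hwmgr_funcs *funcs;
};

/* Shape shared by the new pp_* wrappers: validate, bail out when the
 * backend lacks the hook, serialize the real call under the SMU lock. */
static int pp_get_state(struct hwmgr *h, int *state)
{
	if (!h)
		return -1;
	if (!h->pm_en || !h->funcs->get_state)
		return 0;
	pthread_mutex_lock(&h->smu_lock);
	h->funcs->get_state(state);
	pthread_mutex_unlock(&h->smu_lock);
	return 0;
}

static int backend_get_state(int *state)
{
	*state = 1;
	return 0;
}

static const struct hwmgr_funcs funcs = { backend_get_state };

int main(void)
{
	struct hwmgr h = { 1, PTHREAD_MUTEX_INITIALIZER, &funcs };
	int state = 0;

	pp_get_state(&h, &state);
	printf("state=%d\n", state);
	return 0;
}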
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/Makefile b/drivers/gpu/drm/amd/powerplay/hwmgr/Makefile
index ade8973b6f4d..0b3c6d1d52e4 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/Makefile
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/Makefile
@@ -35,7 +35,7 @@ HARDWARE_MGR = hwmgr.o processpptables.o \
 		vega12_thermal.o \
 		pp_overdriver.o smu_helper.o \
 		vega20_processpptables.o vega20_hwmgr.o vega20_powertune.o \
-		vega20_thermal.o
+		vega20_thermal.o common_baco.o vega10_baco.o  vega20_baco.o
 
 AMD_PP_HWMGR = $(addprefix $(AMD_PP_PATH)/hwmgr/,$(HARDWARE_MGR))
 
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/common_baco.c b/drivers/gpu/drm/amd/powerplay/hwmgr/common_baco.c
new file mode 100644
index 000000000000..9c57c1f67749
--- /dev/null
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/common_baco.c
@@ -0,0 +1,101 @@
+/*
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include "common_baco.h"
+
+
+static bool baco_wait_register(struct pp_hwmgr *hwmgr, u32 reg, u32 mask, u32 value)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)(hwmgr->adev);
+	u32 timeout = 5000, data;
+
+	do {
+		msleep(1);
+		data = RREG32(reg);
+		timeout--;
+	} while (value != (data & mask) && (timeout != 0));
+
+	if (timeout == 0)
+		return false;
+
+	return true;
+}
+
+static bool baco_cmd_handler(struct pp_hwmgr *hwmgr, u32 command, u32 reg, u32 mask,
+				u32 shift, u32 value, u32 timeout)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)(hwmgr->adev);
+	u32 data;
+	bool ret = true;
+
+	switch (command) {
+	case CMD_WRITE:
+		WREG32(reg, value << shift);
+		break;
+	case CMD_READMODIFYWRITE:
+		data = RREG32(reg);
+		data = (data & (~mask)) | (value << shift);
+		WREG32(reg, data);
+		break;
+	case CMD_WAITFOR:
+		ret = baco_wait_register(hwmgr, reg, mask, value);
+		break;
+	case CMD_DELAY_MS:
+		if (timeout)
+			/* Delay in milliseconds */
+			msleep(timeout);
+		break;
+	case CMD_DELAY_US:
+		if (timeout)
+			/* Delay in microseconds */
+			udelay(timeout);
+		break;
+
+	default:
+		dev_warn(adev->dev, "Invalid BACO command.\n");
+		ret = false;
+	}
+
+	return ret;
+}
+
+bool soc15_baco_program_registers(struct pp_hwmgr *hwmgr,
+				 const struct soc15_baco_cmd_entry *entry,
+				 const u32 array_size)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)(hwmgr->adev);
+	u32 i, reg = 0;
+
+	for (i = 0; i < array_size; i++) {
+		if ((entry[i].cmd == CMD_WRITE) ||
+		    (entry[i].cmd == CMD_READMODIFYWRITE) ||
+		    (entry[i].cmd == CMD_WAITFOR))
+			reg = adev->reg_offset[entry[i].hwip][entry[i].inst][entry[i].seg]
+				+ entry[i].reg_offset;
+		if (!baco_cmd_handler(hwmgr, entry[i].cmd, reg, entry[i].mask,
+				     entry[i].shift, entry[i].val, entry[i].timeout))
+			return false;
+	}
+
+	return true;
+}
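
common_baco.c implements a small table-driven interpreter: soc15_baco_program_registers() resolves each entry's MMIO register from adev->reg_offset[hwip][inst][seg] plus reg_offset and hands it to baco_cmd_handler(), which executes write, read-modify-write, polled-wait, and delay commands. The sketch below mimics just the two register commands over a fake register file; WAITFOR and the delays are elided:

#include <stdint.h>
#include <stdio.h>

enum cmd { CMD_WRITE, CMD_READMODIFYWRITE };

struct entry {
	enum cmd cmd;
	uint32_t reg, mask, shift, val;
};

static uint32_t regs[16];	/* fake register file */

/* Minimal interpreter mirroring baco_cmd_handler()'s register commands. */
static int run_table(const struct entry *e, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++) {
		switch (e[i].cmd) {
		case CMD_WRITE:
			regs[e[i].reg] = e[i].val << e[i].shift;
			break;
		case CMD_READMODIFYWRITE:
			regs[e[i].reg] = (regs[e[i].reg] & ~e[i].mask) |
					 (e[i].val << e[i].shift);
			break;
		default:
			return 0;	/* unknown command */
		}
	}
	return 1;
}

int main(void)
{
	static const struct entry tbl[] = {
		{ CMD_WRITE,           2, 0x00, 0, 1 },
		{ CMD_READMODIFYWRITE, 2, 0x80, 7, 1 },
	};

	run_table(tbl, 2);
	printf("reg2 = 0x%x\n", regs[2]);	/* 0x81 */
	return 0;
}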
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2c_sw_engine_dce80.h b/drivers/gpu/drm/amd/powerplay/hwmgr/common_baco.h
index 26355c088746..95296c916f4e 100644
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2c_sw_engine_dce80.h
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/common_baco.h
@@ -1,5 +1,5 @@
 /*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
+ * Copyright 2018 Advanced Micro Devices, Inc.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -19,25 +19,32 @@
  * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  * OTHER DEALINGS IN THE SOFTWARE.
  *
- * Authors: AMD
- *
  */
+#ifndef __COMMON_BOCO_H__
+#define __COMMON_BOCO_H__
+#include "hwmgr.h"
 
-#ifndef __DAL_I2C_SW_ENGINE_DCE80_H__
-#define __DAL_I2C_SW_ENGINE_DCE80_H__
 
-struct i2c_sw_engine_dce80 {
-	struct i2c_sw_engine base;
-	uint32_t engine_id;
+enum baco_cmd_type {
+	CMD_WRITE = 0,
+	CMD_READMODIFYWRITE,
+	CMD_WAITFOR,
+	CMD_DELAY_MS,
+	CMD_DELAY_US,
 };
 
-struct i2c_sw_engine_dce80_create_arg {
-	uint32_t engine_id;
-	uint32_t default_speed;
-	struct dc_context *ctx;
+struct soc15_baco_cmd_entry {
+	enum baco_cmd_type cmd;
+	uint32_t hwip;
+	uint32_t inst;
+	uint32_t seg;
+	uint32_t reg_offset;
+	uint32_t mask;
+	uint32_t shift;
+	uint32_t timeout;
+	uint32_t val;
 };
-
-struct i2c_engine *dal_i2c_sw_engine_dce80_create(
-	const struct i2c_sw_engine_dce80_create_arg *arg);
-
+extern bool soc15_baco_program_registers(struct pp_hwmgr *hwmgr,
+					const struct soc15_baco_cmd_entry *entry,
+					const u32 array_size);
 #endif
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c b/drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c
index 1f92a9f4c9e3..c1c51c115e57 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c
@@ -154,15 +154,6 @@ int phm_powerdown_uvd(struct pp_hwmgr *hwmgr)
 	return 0;
 }
 
-int phm_enable_clock_power_gatings(struct pp_hwmgr *hwmgr)
-{
-	PHM_FUNC_CHECK(hwmgr);
-
-	if (NULL != hwmgr->hwmgr_func->enable_clock_power_gating)
-		return hwmgr->hwmgr_func->enable_clock_power_gating(hwmgr);
-
-	return 0;
-}
 
 int phm_disable_clock_power_gatings(struct pp_hwmgr *hwmgr)
 {
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
index 310b102a9292..6cd6497c6fc2 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
@@ -273,7 +273,7 @@ int hwmgr_hw_fini(struct pp_hwmgr *hwmgr)
 
 	phm_stop_thermal_controller(hwmgr);
 	psm_set_boot_states(hwmgr);
-	psm_adjust_power_state_dynamic(hwmgr, false, NULL);
+	psm_adjust_power_state_dynamic(hwmgr, true, NULL);
 	phm_disable_dynamic_state_management(hwmgr);
 	phm_disable_clock_power_gatings(hwmgr);
 
@@ -295,7 +295,7 @@ int hwmgr_suspend(struct pp_hwmgr *hwmgr)
 	ret = psm_set_boot_states(hwmgr);
 	if (ret)
 		return ret;
-	ret = psm_adjust_power_state_dynamic(hwmgr, false, NULL);
+	ret = psm_adjust_power_state_dynamic(hwmgr, true, NULL);
 	if (ret)
 		return ret;
 	ret = phm_power_down_asic(hwmgr);
@@ -325,7 +325,7 @@ int hwmgr_resume(struct pp_hwmgr *hwmgr)
 	if (ret)
 		return ret;
 
-	ret = psm_adjust_power_state_dynamic(hwmgr, false, NULL);
+	ret = psm_adjust_power_state_dynamic(hwmgr, true, NULL);
 
 	return ret;
 }
@@ -379,12 +379,12 @@ int hwmgr_handle_task(struct pp_hwmgr *hwmgr, enum amd_pp_task task_id,
 		ret = psm_set_user_performance_state(hwmgr, requested_ui_label, &requested_ps);
 		if (ret)
 			return ret;
-		ret = psm_adjust_power_state_dynamic(hwmgr, false, requested_ps);
+		ret = psm_adjust_power_state_dynamic(hwmgr, true, requested_ps);
 		break;
 	}
 	case AMD_PP_TASK_COMPLETE_INIT:
 	case AMD_PP_TASK_READJUST_POWER_STATE:
-		ret = psm_adjust_power_state_dynamic(hwmgr, false, NULL);
+		ret = psm_adjust_power_state_dynamic(hwmgr, true, NULL);
 		break;
 	default:
 		break;
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/pp_psm.c b/drivers/gpu/drm/amd/powerplay/hwmgr/pp_psm.c
index 56437866d120..ce177d7f04cb 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/pp_psm.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/pp_psm.c
@@ -256,16 +256,14 @@ static void power_state_management(struct pp_hwmgr *hwmgr,
 	}
 }
 
-int psm_adjust_power_state_dynamic(struct pp_hwmgr *hwmgr, bool skip,
+int psm_adjust_power_state_dynamic(struct pp_hwmgr *hwmgr, bool skip_display_settings,
 						struct pp_power_state *new_ps)
 {
 	uint32_t index;
 	long workload;
 
-	if (skip)
-		return 0;
-
-	phm_display_configuration_changed(hwmgr);
+	if (!skip_display_settings)
+		phm_display_configuration_changed(hwmgr);
 
 	if (hwmgr->ps)
 		power_state_management(hwmgr, new_ps);
@@ -276,9 +274,11 @@ int psm_adjust_power_state_dynamic(struct pp_hwmgr *hwmgr, bool skip,
 		 */
 		phm_apply_clock_adjust_rules(hwmgr);
 
-	phm_notify_smc_display_config_after_ps_adjustment(hwmgr);
+	if (!skip_display_settings)
+		phm_notify_smc_display_config_after_ps_adjustment(hwmgr);
 
-	if (!phm_force_dpm_levels(hwmgr, hwmgr->request_dpm_level))
+	if ((hwmgr->request_dpm_level != hwmgr->dpm_level) &&
+	    !phm_force_dpm_levels(hwmgr, hwmgr->request_dpm_level))
 		hwmgr->dpm_level = hwmgr->request_dpm_level;
 
 	if (hwmgr->dpm_level != AMD_DPM_FORCED_LEVEL_MANUAL) {
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/pp_psm.h b/drivers/gpu/drm/amd/powerplay/hwmgr/pp_psm.h
index fa1b6825036a..b62d55f1f289 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/pp_psm.h
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/pp_psm.h
@@ -34,7 +34,7 @@ int psm_set_user_performance_state(struct pp_hwmgr *hwmgr,
 					enum PP_StateUILabel label_id,
 					struct pp_power_state **state);
 int psm_adjust_power_state_dynamic(struct pp_hwmgr *hwmgr,
-				bool skip,
+				bool skip_display_settings,
 				struct pp_power_state *new_ps);
 
 #endif
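
The psm change is more than a rename: the old 'skip' flag made psm_adjust_power_state_dynamic() return before doing anything, whereas 'skip_display_settings' now gates only the two display-notification steps, and DPM levels are forced only when the requested level actually differs from the current one. A control-flow sketch with stubbed phases (names abbreviated, not the driver's functions):

#include <stdio.h>

static void display_configuration_changed(void) { puts("display cfg"); }
static void power_state_management(void)        { puts("power state"); }
static void notify_smc_display_config(void)     { puts("notify smc"); }
static void force_dpm_levels_if_changed(void)   { puts("force dpm"); }

/* Control-flow sketch of the reworked function: the old 'skip' flag
 * returned early and did nothing; 'skip_display_settings' now gates
 * only the display steps, so power-state and DPM work always runs. */
static void adjust_power_state_dynamic(int skip_display_settings)
{
	if (!skip_display_settings)
		display_configuration_changed();
	power_state_management();
	if (!skip_display_settings)
		notify_smc_display_config();
	force_dpm_levels_if_changed();
}

int main(void)
{
	adjust_power_state_dynamic(1);	/* hwmgr.c callers now pass true */
	return 0;
}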
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
index 5273de3c5b98..0ad8fe4a6277 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
@@ -139,12 +139,10 @@ static int smu10_construct_max_power_limits_table(struct pp_hwmgr *hwmgr,
 static int smu10_init_dynamic_state_adjustment_rule_settings(
 							struct pp_hwmgr *hwmgr)
 {
-	uint32_t table_size =
-		sizeof(struct phm_clock_voltage_dependency_table) +
-		(7 * sizeof(struct phm_clock_voltage_dependency_record));
+	struct phm_clock_voltage_dependency_table *table_clk_vlt;
 
-	struct phm_clock_voltage_dependency_table *table_clk_vlt =
-					kzalloc(table_size, GFP_KERNEL);
+	table_clk_vlt = kzalloc(struct_size(table_clk_vlt, entries, 7),
+				GFP_KERNEL);
 
 	if (NULL == table_clk_vlt) {
 		pr_err("Can not allocate memory!\n");
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index c8f5c00dd1e7..48187acac59e 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -3681,10 +3681,12 @@ static int smu7_request_link_speed_change_before_state_change(
 			data->force_pcie_gen = PP_PCIEGen2;
 			if (current_link_speed == PP_PCIEGen2)
 				break;
+			/* fall through */
 		case PP_PCIEGen2:
 			if (0 == amdgpu_acpi_pcie_performance_request(hwmgr->adev, PCIE_PERF_REQ_GEN2, false))
 				break;
 #endif
+			/* fall through */
 		default:
 			data->force_pcie_gen = smu7_get_current_pcie_speed(hwmgr);
 			break;
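
The two /* fall through */ comments document deliberate case fall-through in the PCIe link-speed ladder, which keeps -Wimplicit-fallthrough quiet: each case tries its own generation and drops into the next case on failure. A standalone illustration of the same ladder shape, with invented names:

#include <stdio.h>

/* Same ladder shape as the patched switch: try the requested link
 * generation, and on failure drop through to the next slower one. */
static const char *pick_gen(int want, int gen3_ok, int gen2_ok)
{
	switch (want) {
	case 3:
		if (gen3_ok)
			return "gen3";
		/* fall through */
	case 2:
		if (gen2_ok)
			return "gen2";
		/* fall through */
	default:
		return "gen1";
	}
}

int main(void)
{
	printf("%s\n", pick_gen(3, 0, 1));	/* falls through to gen2 */
	return 0;
}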
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_powertune.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_powertune.c
index d138ddae563d..58f5589aaf12 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_powertune.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_powertune.c
@@ -1211,7 +1211,7 @@ int smu7_power_control_set_level(struct pp_hwmgr *hwmgr)
 				hwmgr->platform_descriptor.TDPAdjustment :
 				(-1 * hwmgr->platform_descriptor.TDPAdjustment);
 
-		 if (hwmgr->chip_id > CHIP_TONGA)
+		if (hwmgr->chip_id > CHIP_TONGA)
 			target_tdp = ((100 + adjust_percent) * (int)(cac_table->usTDP * 256)) / 100;
 		else
 			target_tdp = ((100 + adjust_percent) * (int)(cac_table->usConfigurableTDP * 256)) / 100;
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu8_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu8_hwmgr.c
index 553a203ac47c..019d6a206492 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu8_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu8_hwmgr.c
@@ -272,12 +272,10 @@ static int smu8_init_dynamic_state_adjustment_rule_settings(
 			struct pp_hwmgr *hwmgr,
 			ATOM_CLK_VOLT_CAPABILITY *disp_voltage_table)
 {
-	uint32_t table_size =
-		sizeof(struct phm_clock_voltage_dependency_table) +
-		(7 * sizeof(struct phm_clock_voltage_dependency_record));
+	struct phm_clock_voltage_dependency_table *table_clk_vlt;
 
-	struct phm_clock_voltage_dependency_table *table_clk_vlt =
-					kzalloc(table_size, GFP_KERNEL);
+	table_clk_vlt = kzalloc(struct_size(table_clk_vlt, entries, 7),
+				GFP_KERNEL);
 
 	if (NULL == table_clk_vlt) {
 		pr_err("Can not allocate memory!\n");
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_baco.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_baco.c
new file mode 100644
index 000000000000..7337be5602e4
--- /dev/null
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_baco.c
@@ -0,0 +1,158 @@
+/*
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+#include "amdgpu.h"
+#include "soc15.h"
+#include "soc15_hw_ip.h"
+#include "vega10_ip_offset.h"
+#include "soc15_common.h"
+#include "vega10_inc.h"
+#include "vega10_ppsmc.h"
+#include "vega10_baco.h"
+
+
+
+static const struct soc15_baco_cmd_entry pre_baco_tbl[] =
+{
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(NBIF, 0, mmBIF_DOORBELL_CNTL), BIF_DOORBELL_CNTL__DOORBELL_MONITOR_EN_MASK, BIF_DOORBELL_CNTL__DOORBELL_MONITOR_EN__SHIFT, 0, 1},
+	{CMD_WRITE, SOC15_REG_ENTRY(NBIF, 0, mmBIF_FB_EN), 0, 0, 0, 0},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(NBIF, 0, mmBACO_CNTL), BACO_CNTL__BACO_DSTATE_BYPASS_MASK, BACO_CNTL__BACO_DSTATE_BYPASS__SHIFT, 0, 1},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(NBIF, 0, mmBACO_CNTL), BACO_CNTL__BACO_RST_INTR_MASK_MASK, BACO_CNTL__BACO_RST_INTR_MASK__SHIFT, 0, 1}
+};
+
+static const struct soc15_baco_cmd_entry enter_baco_tbl[] =
+{
+	{CMD_WAITFOR, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__SOC_DOMAIN_IDLE_MASK, THM_BACO_CNTL__SOC_DOMAIN_IDLE__SHIFT, 0xffffffff, 0x80000000},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(NBIF, 0, mmBACO_CNTL), BACO_CNTL__BACO_EN_MASK, BACO_CNTL__BACO_EN__SHIFT, 0, 1},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(NBIF, 0, mmBACO_CNTL), BACO_CNTL__BACO_BIF_LCLK_SWITCH_MASK, BACO_CNTL__BACO_BIF_LCLK_SWITCH__SHIFT, 0, 1},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(NBIF, 0, mmBACO_CNTL), BACO_CNTL__BACO_DUMMY_EN_MASK, BACO_CNTL__BACO_DUMMY_EN__SHIFT, 0, 1},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_SOC_VDCI_RESET_MASK, THM_BACO_CNTL__BACO_SOC_VDCI_RESET__SHIFT, 0, 1},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_SMNCLK_MUX_MASK, THM_BACO_CNTL__BACO_SMNCLK_MUX__SHIFT, 0, 1},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_ISO_EN_MASK, THM_BACO_CNTL__BACO_ISO_EN__SHIFT, 0, 1},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_AEB_ISO_EN_MASK, THM_BACO_CNTL__BACO_AEB_ISO_EN__SHIFT, 0, 1},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_ANA_ISO_EN_MASK, THM_BACO_CNTL__BACO_ANA_ISO_EN__SHIFT, 0, 1},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_SOC_REFCLK_OFF_MASK, THM_BACO_CNTL__BACO_SOC_REFCLK_OFF__SHIFT, 0, 1},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(NBIF, 0, mmBACO_CNTL), BACO_CNTL__BACO_POWER_OFF_MASK, BACO_CNTL__BACO_POWER_OFF__SHIFT, 0, 1},
+	{CMD_DELAY_MS, 0, 0, 0, 0, 0, 0, 5, 0},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_RESET_EN_MASK, THM_BACO_CNTL__BACO_RESET_EN__SHIFT, 0, 1},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_PWROKRAW_CNTL_MASK, THM_BACO_CNTL__BACO_PWROKRAW_CNTL__SHIFT, 0, 0},
+	{CMD_WAITFOR, SOC15_REG_ENTRY(NBIF, 0, mmBACO_CNTL), BACO_CNTL__BACO_MODE_MASK, BACO_CNTL__BACO_MODE__SHIFT, 0xffffffff, 0x100}
+};
+
+static const struct soc15_baco_cmd_entry exit_baco_tbl[] =
+{
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(NBIF, 0, mmBACO_CNTL), BACO_CNTL__BACO_POWER_OFF_MASK, BACO_CNTL__BACO_POWER_OFF__SHIFT, 0, 0},
+	{CMD_DELAY_MS, 0, 0, 0, 0, 0, 0, 10, 0},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_SOC_REFCLK_OFF_MASK, THM_BACO_CNTL__BACO_SOC_REFCLK_OFF__SHIFT, 0, 0},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_ANA_ISO_EN_MASK, THM_BACO_CNTL__BACO_ANA_ISO_EN__SHIFT, 0, 0},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_AEB_ISO_EN_MASK, THM_BACO_CNTL__BACO_AEB_ISO_EN__SHIFT, 0, 0},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_ISO_EN_MASK, THM_BACO_CNTL__BACO_ISO_EN__SHIFT, 0, 0},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_PWROKRAW_CNTL_MASK, THM_BACO_CNTL__BACO_PWROKRAW_CNTL__SHIFT, 0, 1},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_SMNCLK_MUX_MASK, THM_BACO_CNTL__BACO_SMNCLK_MUX__SHIFT, 0, 0},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_SOC_VDCI_RESET_MASK, THM_BACO_CNTL__BACO_SOC_VDCI_RESET__SHIFT, 0, 0},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_EXIT_MASK, THM_BACO_CNTL__BACO_EXIT__SHIFT, 0, 1},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_RESET_EN_MASK, THM_BACO_CNTL__BACO_RESET_EN__SHIFT, 0, 0},
+	{CMD_WAITFOR, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_EXIT_MASK, 0, 0xffffffff, 0},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(THM, 0, mmTHM_BACO_CNTL), THM_BACO_CNTL__BACO_SB_AXI_FENCE_MASK, THM_BACO_CNTL__BACO_SB_AXI_FENCE__SHIFT, 0, 0},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(NBIF, 0, mmBACO_CNTL), BACO_CNTL__BACO_DUMMY_EN_MASK, BACO_CNTL__BACO_DUMMY_EN__SHIFT, 0, 0},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(NBIF, 0, mmBACO_CNTL), BACO_CNTL__BACO_BIF_LCLK_SWITCH_MASK, BACO_CNTL__BACO_BIF_LCLK_SWITCH__SHIFT, 0, 0},
+	{CMD_READMODIFYWRITE, SOC15_REG_ENTRY(NBIF, 0, mmBACO_CNTL), BACO_CNTL__BACO_EN_MASK, BACO_CNTL__BACO_EN__SHIFT, 0, 0},
+	{CMD_WAITFOR, SOC15_REG_ENTRY(NBIF, 0, mmBACO_CNTL), BACO_CNTL__BACO_MODE_MASK, 0, 0xffffffff, 0}
+};
+
+static const struct soc15_baco_cmd_entry clean_baco_tbl[] =
+{
+	{CMD_WRITE, SOC15_REG_ENTRY(NBIF, 0, mmBIOS_SCRATCH_6), 0, 0, 0, 0},
+	{CMD_WRITE, SOC15_REG_ENTRY(NBIF, 0, mmBIOS_SCRATCH_7), 0, 0, 0, 0},
+};
+
+int vega10_baco_get_capability(struct pp_hwmgr *hwmgr, bool *cap)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)(hwmgr->adev);
+	uint32_t reg, data;
+
+	*cap = false;
+	if (!phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_BACO))
+		return 0;
+
+	WREG32(0x12074, 0xFFF0003B);
+	data = RREG32(0x12075);
+
+	if (data == 0x1) {
+		reg = RREG32_SOC15(NBIF, 0, mmRCC_BIF_STRAP0);
+
+		if (reg & RCC_BIF_STRAP0__STRAP_PX_CAPABLE_MASK)
+			*cap = true;
+	}
+
+	return 0;
+}
+
+int vega10_baco_get_state(struct pp_hwmgr *hwmgr, enum BACO_STATE *state)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)(hwmgr->adev);
+	uint32_t reg;
+
+	reg = RREG32_SOC15(NBIF, 0, mmBACO_CNTL);
+
+	if (reg & BACO_CNTL__BACO_MODE_MASK)
+		/* gfx has already entered BACO state */
+		*state = BACO_STATE_IN;
+	else
+		*state = BACO_STATE_OUT;
+	return 0;
+}
+
+int vega10_baco_set_state(struct pp_hwmgr *hwmgr, enum BACO_STATE state)
+{
+	enum BACO_STATE cur_state;
+
+	vega10_baco_get_state(hwmgr, &cur_state);
+
+	if (cur_state == state)
+		/* asic already in the target state */
+		return 0;
+
+	if (state == BACO_STATE_IN) {
+		if (soc15_baco_program_registers(hwmgr, pre_baco_tbl,
+					     ARRAY_SIZE(pre_baco_tbl))) {
+			if (smum_send_msg_to_smc(hwmgr, PPSMC_MSG_EnterBaco))
+				return -EINVAL;
+
+			if (soc15_baco_program_registers(hwmgr, enter_baco_tbl,
+						   ARRAY_SIZE(enter_baco_tbl)))
+				return 0;
+		}
+	} else if (state == BACO_STATE_OUT) {
+		/* HW requires at least 20ms between regulator off and on */
+		msleep(20);
+		/* Execute Hardware BACO exit sequence */
+		if (soc15_baco_program_registers(hwmgr, exit_baco_tbl,
+					     ARRAY_SIZE(exit_baco_tbl))) {
+			if (soc15_baco_program_registers(hwmgr, clean_baco_tbl,
+						     ARRAY_SIZE(clean_baco_tbl)))
+				return 0;
+		}
+	}
+
+	return -EINVAL;
+}
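
vega10_baco_set_state() reads the current state first and returns early when the ASIC is already where the caller wants it, so repeated requests are cheap no-ops; otherwise it programs the enter or exit table and reports -EINVAL if any sequence fails. A mocked sketch of just the idempotence shape (hardware and sequencing stubbed out):

#include <stdio.h>

enum baco_state { BACO_STATE_OUT = 0, BACO_STATE_IN };

static enum baco_state hw_state = BACO_STATE_OUT;	/* mocked hardware */

/* Mirrors the early return in vega10_baco_set_state(): query the current
 * state first, so asking for the state we are already in is a no-op.
 * The real enter/exit register sequencing is elided. */
static int set_state(enum baco_state target)
{
	if (hw_state == target)
		return 0;
	/* ... program enter_baco_tbl / exit_baco_tbl here ... */
	hw_state = target;
	return 0;
}

int main(void)
{
	set_state(BACO_STATE_IN);
	set_state(BACO_STATE_IN);	/* second call does nothing */
	printf("state=%d\n", hw_state);
	return 0;
}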
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2caux_dce80.h b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_baco.h
index 21908629e973..f7a3ffa744b3 100644
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce80/i2caux_dce80.h
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_baco.h
@@ -1,5 +1,5 @@
 /*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
+ * Copyright 2018 Advanced Micro Devices, Inc.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -19,20 +19,14 @@
  * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  * OTHER DEALINGS IN THE SOFTWARE.
  *
- * Authors: AMD
- *
  */
+#ifndef __VEGA10_BACO_H__
+#define __VEGA10_BACO_H__
+#include "hwmgr.h"
+#include "common_baco.h"
 
-#ifndef __DAL_I2C_AUX_DCE80_H__
-#define __DAL_I2C_AUX_DCE80_H__
-
-struct i2caux_dce80 {
-	struct i2caux base;
-	/* indicate the I2C HW circular buffer is in use */
-	bool i2c_hw_buffer_in_use;
-};
-
-struct i2caux *dal_i2caux_dce80_create(
-	struct dc_context *ctx);
+extern int vega10_baco_get_capability(struct pp_hwmgr *hwmgr, bool *cap);
+extern int vega10_baco_get_state(struct pp_hwmgr *hwmgr, enum BACO_STATE *state);
+extern int vega10_baco_set_state(struct pp_hwmgr *hwmgr, enum BACO_STATE state);
 
 #endif
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
index 91e3bbe6d61d..5479125ff4f6 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
@@ -48,6 +48,7 @@
 #include "ppinterrupt.h"
 #include "pp_overdriver.h"
 #include "pp_thermal.h"
+#include "vega10_baco.h"
 
 #include "smuio/smuio_9_0_offset.h"
 #include "smuio/smuio_9_0_sh_mask.h"
@@ -71,6 +72,21 @@ static const uint32_t channel_number[] = {1, 2, 0, 4, 0, 8, 0, 16, 2};
 #define DF_CS_AON0_DramBaseAddress0__IntLvAddrSel_MASK                                                        0x00000700L
 #define DF_CS_AON0_DramBaseAddress0__DramBaseAddr_MASK                                                        0xFFFFF000L
 
+typedef enum {
+	CLK_SMNCLK = 0,
+	CLK_SOCCLK,
+	CLK_MP0CLK,
+	CLK_MP1CLK,
+	CLK_LCLK,
+	CLK_DCEFCLK,
+	CLK_VCLK,
+	CLK_DCLK,
+	CLK_ECLK,
+	CLK_UCLK,
+	CLK_GFXCLK,
+	CLK_COUNT,
+} CLOCK_ID_e;
+
 static const ULONG PhwVega10_Magic = (ULONG)(PHM_VIslands_Magic);
 
 struct vega10_power_state *cast_phw_vega10_power_state(
@@ -3485,6 +3501,17 @@ static int vega10_upload_dpm_bootup_level(struct pp_hwmgr *hwmgr)
 		}
 	}
 
+	if (!data->registry_data.socclk_dpm_key_disabled) {
+		if (data->smc_state_table.soc_boot_level !=
+				data->dpm_table.soc_table.dpm_state.soft_min_level) {
+			smum_send_msg_to_smc_with_parameter(hwmgr,
+				PPSMC_MSG_SetSoftMinSocclkByIndex,
+				data->smc_state_table.soc_boot_level);
+			data->dpm_table.soc_table.dpm_state.soft_min_level =
+					data->smc_state_table.soc_boot_level;
+		}
+	}
+
 	return 0;
 }
 
@@ -3516,6 +3543,17 @@ static int vega10_upload_dpm_max_level(struct pp_hwmgr *hwmgr)
 		}
 	}
 
+	if (!data->registry_data.socclk_dpm_key_disabled) {
+		if (data->smc_state_table.soc_max_level !=
+			data->dpm_table.soc_table.dpm_state.soft_max_level) {
+			smum_send_msg_to_smc_with_parameter(hwmgr,
+				PPSMC_MSG_SetSoftMaxSocclkByIndex,
+				data->smc_state_table.soc_max_level);
+			data->dpm_table.soc_table.dpm_state.soft_max_level =
+					data->smc_state_table.soc_max_level;
+		}
+	}
+
 	return 0;
 }
 
@@ -3541,6 +3579,10 @@ static int vega10_generate_dpm_level_enable_mask(
 			vega10_find_lowest_dpm_level(&(data->dpm_table.mem_table));
 	data->smc_state_table.mem_max_level =
 			vega10_find_highest_dpm_level(&(data->dpm_table.mem_table));
+	data->smc_state_table.soc_boot_level =
+			vega10_find_lowest_dpm_level(&(data->dpm_table.soc_table));
+	data->smc_state_table.soc_max_level =
+			vega10_find_highest_dpm_level(&(data->dpm_table.soc_table));
 
 	PP_ASSERT_WITH_CODE(!vega10_upload_dpm_bootup_level(hwmgr),
 			"Attempt to upload DPM Bootup Levels Failed!",
@@ -3555,6 +3597,9 @@ static int vega10_generate_dpm_level_enable_mask(
 	for(i = data->smc_state_table.mem_boot_level; i < data->smc_state_table.mem_max_level; i++)
 		data->dpm_table.mem_table.dpm_levels[i].enabled = true;
 
+	for (i = data->smc_state_table.soc_boot_level; i < data->smc_state_table.soc_max_level; i++)
+		data->dpm_table.soc_table.dpm_levels[i].enabled = true;
+
 	return 0;
 }
 
@@ -4028,6 +4073,24 @@ static int vega10_force_clock_level(struct pp_hwmgr *hwmgr,
 
 		break;
 
+	case PP_SOCCLK:
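+		/* mask is a bitfield of requested DPM levels, e.g. mask 0x6
+		 * keeps levels 1..2; ffs()/fls() are 1-based, hence the -1
+		 */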
+		data->smc_state_table.soc_boot_level = mask ? (ffs(mask) - 1) : 0;
+		data->smc_state_table.soc_max_level = mask ? (fls(mask) - 1) : 0;
+
+		PP_ASSERT_WITH_CODE(!vega10_upload_dpm_bootup_level(hwmgr),
+			"Failed to upload boot level to lowest!",
+			return -EINVAL);
+
+		PP_ASSERT_WITH_CODE(!vega10_upload_dpm_max_level(hwmgr),
+			"Failed to upload dpm max level to highest!",
+			return -EINVAL);
+
+		break;
+
+	case PP_DCEFCLK:
+		pr_info("Setting DCEFCLK min/max dpm level is not supported!\n");
+		break;
+
 	case PP_PCIE:
 	default:
 		break;
@@ -4267,12 +4330,113 @@ static int vega10_set_watermarks_for_clocks_ranges(struct pp_hwmgr *hwmgr,
 	return result;
 }
 
+static int vega10_get_ppfeature_status(struct pp_hwmgr *hwmgr, char *buf)
+{
+	static const char *ppfeature_name[] = {
+				"DPM_PREFETCHER",
+				"GFXCLK_DPM",
+				"UCLK_DPM",
+				"SOCCLK_DPM",
+				"UVD_DPM",
+				"VCE_DPM",
+				"ULV",
+				"MP0CLK_DPM",
+				"LINK_DPM",
+				"DCEFCLK_DPM",
+				"AVFS",
+				"GFXCLK_DS",
+				"SOCCLK_DS",
+				"LCLK_DS",
+				"PPT",
+				"TDC",
+				"THERMAL",
+				"GFX_PER_CU_CG",
+				"RM",
+				"DCEFCLK_DS",
+				"ACDC",
+				"VR0HOT",
+				"VR1HOT",
+				"FW_CTF",
+				"LED_DISPLAY",
+				"FAN_CONTROL",
+				"FAST_PPT",
+				"DIDT",
+				"ACG",
+				"PCC_LIMIT"};
+	static const char *output_title[] = {
+				"FEATURES",
+				"BITMASK",
+				"ENABLEMENT"};
+	uint64_t features_enabled;
+	int i;
+	int ret = 0;
+	int size = 0;
+
+	ret = vega10_get_enabled_smc_features(hwmgr, &features_enabled);
+	PP_ASSERT_WITH_CODE(!ret,
+			"[EnableAllSmuFeatures] Failed to get enabled smc features!",
+			return ret);
+
+	size += sprintf(buf + size, "Current ppfeatures: 0x%016llx\n", features_enabled);
+	size += sprintf(buf + size, "%-19s %-22s %s\n",
+				output_title[0],
+				output_title[1],
+				output_title[2]);
+	for (i = 0; i < GNLD_FEATURES_MAX; i++) {
+		size += sprintf(buf + size, "%-19s 0x%016llx %6s\n",
+					ppfeature_name[i],
+					1ULL << i,
+					(features_enabled & (1ULL << i)) ? "Y" : "N");
+	}
+
+	return size;
+}
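For reference, the buffer assembled above renders as a fixed-width table when read back (e.g. through sysfs); an illustrative fragment with assumed feature bits:

	Current ppfeatures: 0x0000000000000007
	FEATURES            BITMASK                ENABLEMENT
	DPM_PREFETCHER      0x0000000000000001      Y
	GFXCLK_DPM          0x0000000000000002      Y
	UCLK_DPM            0x0000000000000004      Y
	SOCCLK_DPM          0x0000000000000008      N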
+
+static int vega10_set_ppfeature_status(struct pp_hwmgr *hwmgr, uint64_t new_ppfeature_masks)
+{
+	uint64_t features_enabled;
+	uint64_t features_to_enable;
+	uint64_t features_to_disable;
+	int ret = 0;
+
+	if (new_ppfeature_masks >= (1ULL << GNLD_FEATURES_MAX))
+		return -EINVAL;
+
+	ret = vega10_get_enabled_smc_features(hwmgr, &features_enabled);
+	if (ret)
+		return ret;
+
+	features_to_disable =
+		(features_enabled ^ new_ppfeature_masks) & features_enabled;
+	features_to_enable =
+		(features_enabled ^ new_ppfeature_masks) ^ features_to_disable;
+
+	pr_debug("features_to_disable 0x%llx\n", features_to_disable);
+	pr_debug("features_to_enable 0x%llx\n", features_to_enable);
+
+	if (features_to_disable) {
+		ret = vega10_enable_smc_features(hwmgr, false, features_to_disable);
+		if (ret)
+			return ret;
+	}
+
+	if (features_to_enable) {
+		ret = vega10_enable_smc_features(hwmgr, true, features_to_enable);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
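The two XOR expressions above split the requested mask into set differences without branching. A worked example with assumed masks: if features_enabled = 0xC (bits 2,3) and new_ppfeature_masks = 0xA (bits 1,3), then enabled ^ new = 0x6, so features_to_disable = 0x6 & 0xC = 0x4 (bit 2 is currently on but no longer requested) and features_to_enable = 0x6 ^ 0x4 = 0x2 (bit 1 is requested but currently off).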
+
 static int vega10_print_clock_levels(struct pp_hwmgr *hwmgr,
 		enum pp_clock_type type, char *buf)
 {
 	struct vega10_hwmgr *data = hwmgr->backend;
 	struct vega10_single_dpm_table *sclk_table = &(data->dpm_table.gfx_table);
 	struct vega10_single_dpm_table *mclk_table = &(data->dpm_table.mem_table);
+	struct vega10_single_dpm_table *soc_table = &(data->dpm_table.soc_table);
+	struct vega10_single_dpm_table *dcef_table = &(data->dpm_table.dcef_table);
 	struct vega10_pcie_table *pcie_table = &(data->dpm_table.pcie_table);
 	struct vega10_odn_clock_voltage_dependency_table *podn_vdd_dep = NULL;
 
@@ -4303,6 +4467,32 @@ static int vega10_print_clock_levels(struct pp_hwmgr *hwmgr,
 					i, mclk_table->dpm_levels[i].value / 100,
 					(i == now) ? "*" : "");
 		break;
+	case PP_SOCCLK:
+		if (data->registry_data.socclk_dpm_key_disabled)
+			break;
+
+		smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetCurrentSocclkIndex);
+		now = smum_get_argument(hwmgr);
+
+		for (i = 0; i < soc_table->count; i++)
+			size += sprintf(buf + size, "%d: %uMhz %s\n",
+					i, soc_table->dpm_levels[i].value / 100,
+					(i == now) ? "*" : "");
+		break;
+	case PP_DCEFCLK:
+		if (data->registry_data.dcefclk_dpm_key_disabled)
+			break;
+
+		smum_send_msg_to_smc_with_parameter(hwmgr,
+				PPSMC_MSG_GetClockFreqMHz, CLK_DCEFCLK);
+		now = smum_get_argument(hwmgr);
+
+		for (i = 0; i < dcef_table->count; i++)
+			size += sprintf(buf + size, "%d: %uMhz %s\n",
+					i, dcef_table->dpm_levels[i].value / 100,
+					(dcef_table->dpm_levels[i].value / 100 == now) ?
+					"*" : "");
+		break;
 	case PP_PCIE:
 		smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetCurrentLinkIndex);
 		now = smum_get_argument(hwmgr);
@@ -4980,6 +5170,12 @@ static const struct pp_hwmgr_func vega10_hwmgr_funcs = {
 	.set_power_limit = vega10_set_power_limit,
 	.odn_edit_dpm_table = vega10_odn_edit_dpm_table,
 	.get_performance_level = vega10_get_performance_level,
+	.get_asic_baco_capability = vega10_baco_get_capability,
+	.get_asic_baco_state = vega10_baco_get_state,
+	.set_asic_baco_state = vega10_baco_set_state,
+	.enable_mgpu_fan_boost = vega10_enable_mgpu_fan_boost,
+	.get_ppfeature_status = vega10_get_ppfeature_status,
+	.set_ppfeature_status = vega10_set_ppfeature_status,
 };
 
 int vega10_hwmgr_init(struct pp_hwmgr *hwmgr)
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.h b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.h
index 89870556de1b..f752b4ad0c8a 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.h
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.h
@@ -199,6 +199,7 @@ struct vega10_smc_state_table {
 	uint32_t        vce_boot_level;
 	uint32_t        gfx_max_level;
 	uint32_t        mem_max_level;
+	uint32_t        soc_max_level;
 	uint8_t         vr_hot_gpio;
 	uint8_t         ac_dc_gpio;
 	uint8_t         therm_out_gpio;
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_pptable.h b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_pptable.h
index b3e63003a789..c934e9612c1b 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_pptable.h
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_pptable.h
@@ -282,6 +282,30 @@ typedef struct _ATOM_Vega10_Fan_Table_V2 {
 	UCHAR   ucFanMaxRPM;
 } ATOM_Vega10_Fan_Table_V2;
 
+typedef struct _ATOM_Vega10_Fan_Table_V3 {
+	UCHAR   ucRevId;
+	USHORT  usFanOutputSensitivity;
+	USHORT  usFanAcousticLimitRpm;
+	USHORT  usThrottlingRPM;
+	USHORT  usTargetTemperature;
+	USHORT  usMinimumPWMLimit;
+	USHORT  usTargetGfxClk;
+	USHORT  usFanGainEdge;
+	USHORT  usFanGainHotspot;
+	USHORT  usFanGainLiquid;
+	USHORT  usFanGainVrVddc;
+	USHORT  usFanGainVrMvdd;
+	USHORT  usFanGainPlx;
+	USHORT  usFanGainHbm;
+	UCHAR   ucEnableZeroRPM;
+	USHORT  usFanStopTemperature;
+	USHORT  usFanStartTemperature;
+	UCHAR   ucFanParameters;
+	UCHAR   ucFanMinRPM;
+	UCHAR   ucFanMaxRPM;
+	USHORT  usMGpuThrottlingRPM;
+} ATOM_Vega10_Fan_Table_V3;
+
 typedef struct _ATOM_Vega10_Thermal_Controller {
 	UCHAR ucRevId;
 	UCHAR ucType;           /* one of ATOM_VEGA10_PP_THERMALCONTROLLER_*/
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.c
index 99d596dc0e89..b6767d74dc85 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.c
@@ -123,6 +123,7 @@ static int init_thermal_controller(
 	const Vega10_PPTable_Generic_SubTable_Header *header;
 	const ATOM_Vega10_Fan_Table *fan_table_v1;
 	const ATOM_Vega10_Fan_Table_V2 *fan_table_v2;
+	const ATOM_Vega10_Fan_Table_V3 *fan_table_v3;
 
 	thermal_controller = (ATOM_Vega10_Thermal_Controller *)
 			(((unsigned long)powerplay_table) +
@@ -207,7 +208,7 @@ static int init_thermal_controller(
 				le16_to_cpu(fan_table_v1->usFanStopTemperature);
 		hwmgr->thermal_controller.advanceFanControlParameters.usZeroRPMStartTemperature =
 				le16_to_cpu(fan_table_v1->usFanStartTemperature);
-	} else if (header->ucRevId > 10) {
+	} else if (header->ucRevId == 0xb) {
 		fan_table_v2 = (ATOM_Vega10_Fan_Table_V2 *)header;
 
 		hwmgr->thermal_controller.fanInfo.ucTachometerPulsesPerRevolution =
@@ -251,7 +252,54 @@ static int init_thermal_controller(
 				le16_to_cpu(fan_table_v2->usFanStopTemperature);
 		hwmgr->thermal_controller.advanceFanControlParameters.usZeroRPMStartTemperature =
 				le16_to_cpu(fan_table_v2->usFanStartTemperature);
+	} else if (header->ucRevId > 0xb) {
+		fan_table_v3 = (ATOM_Vega10_Fan_Table_V3 *)header;
+
+		hwmgr->thermal_controller.fanInfo.ucTachometerPulsesPerRevolution =
+				fan_table_v3->ucFanParameters & ATOM_VEGA10_PP_FANPARAMETERS_TACHOMETER_PULSES_PER_REVOLUTION_MASK;
+		hwmgr->thermal_controller.fanInfo.ulMinRPM = fan_table_v3->ucFanMinRPM * 100UL;
+		hwmgr->thermal_controller.fanInfo.ulMaxRPM = fan_table_v3->ucFanMaxRPM * 100UL;
+		phm_cap_set(hwmgr->platform_descriptor.platformCaps,
+				PHM_PlatformCaps_MicrocodeFanControl);
+		hwmgr->thermal_controller.advanceFanControlParameters.usFanOutputSensitivity =
+				le16_to_cpu(fan_table_v3->usFanOutputSensitivity);
+		hwmgr->thermal_controller.advanceFanControlParameters.usMaxFanRPM =
+				fan_table_v3->ucFanMaxRPM * 100UL;
+		hwmgr->thermal_controller.advanceFanControlParameters.usFanRPMMaxLimit =
+				le16_to_cpu(fan_table_v3->usThrottlingRPM);
+		hwmgr->thermal_controller.advanceFanControlParameters.ulMinFanSCLKAcousticLimit =
+				le16_to_cpu(fan_table_v3->usFanAcousticLimitRpm);
+		hwmgr->thermal_controller.advanceFanControlParameters.usTMax =
+				le16_to_cpu(fan_table_v3->usTargetTemperature);
+		hwmgr->thermal_controller.advanceFanControlParameters.usPWMMin =
+				le16_to_cpu(fan_table_v3->usMinimumPWMLimit);
+		hwmgr->thermal_controller.advanceFanControlParameters.ulTargetGfxClk =
+				le16_to_cpu(fan_table_v3->usTargetGfxClk);
+		hwmgr->thermal_controller.advanceFanControlParameters.usFanGainEdge =
+				le16_to_cpu(fan_table_v3->usFanGainEdge);
+		hwmgr->thermal_controller.advanceFanControlParameters.usFanGainHotspot =
+				le16_to_cpu(fan_table_v3->usFanGainHotspot);
+		hwmgr->thermal_controller.advanceFanControlParameters.usFanGainLiquid =
+				le16_to_cpu(fan_table_v3->usFanGainLiquid);
+		hwmgr->thermal_controller.advanceFanControlParameters.usFanGainVrVddc =
+				le16_to_cpu(fan_table_v3->usFanGainVrVddc);
+		hwmgr->thermal_controller.advanceFanControlParameters.usFanGainVrMvdd =
+				le16_to_cpu(fan_table_v3->usFanGainVrMvdd);
+		hwmgr->thermal_controller.advanceFanControlParameters.usFanGainPlx =
+				le16_to_cpu(fan_table_v3->usFanGainPlx);
+		hwmgr->thermal_controller.advanceFanControlParameters.usFanGainHbm =
+				le16_to_cpu(fan_table_v3->usFanGainHbm);
+
+		hwmgr->thermal_controller.advanceFanControlParameters.ucEnableZeroRPM =
+				fan_table_v3->ucEnableZeroRPM;
+		hwmgr->thermal_controller.advanceFanControlParameters.usZeroRPMStopTemperature =
+				le16_to_cpu(fan_table_v3->usFanStopTemperature);
+		hwmgr->thermal_controller.advanceFanControlParameters.usZeroRPMStartTemperature =
+				le16_to_cpu(fan_table_v3->usFanStartTemperature);
+		hwmgr->thermal_controller.advanceFanControlParameters.usMGpuThrottlingRPMLimit =
+				le16_to_cpu(fan_table_v3->usMGpuThrottlingRPM);
 	}
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_thermal.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_thermal.c
index 3f807d6c95ce..ba8763daa380 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_thermal.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_thermal.c
@@ -556,6 +556,43 @@ int vega10_thermal_setup_fan_table(struct pp_hwmgr *hwmgr)
 	return ret;
 }
 
+int vega10_enable_mgpu_fan_boost(struct pp_hwmgr *hwmgr)
+{
+	struct vega10_hwmgr *data = hwmgr->backend;
+	PPTable_t *table = &(data->smc_state_table.pp_table);
+	int ret;
+
+	if (!data->smu_features[GNLD_FAN_CONTROL].supported)
+		return 0;
+
+	if (!hwmgr->thermal_controller.advanceFanControlParameters.
+			usMGpuThrottlingRPMLimit)
+		return 0;
+
+	table->FanThrottlingRpm = hwmgr->thermal_controller.
+			advanceFanControlParameters.usMGpuThrottlingRPMLimit;
+
+	ret = smum_smc_table_manager(hwmgr,
+				(uint8_t *)(&(data->smc_state_table.pp_table)),
+				PPTABLE, false);
+	if (ret) {
+		pr_info("Failed to update fan control table in pptable!");
+		return ret;
+	}
+
+	ret = vega10_disable_fan_control_feature(hwmgr);
+	if (ret) {
+		pr_info("Attempt to disable SMC fan control feature failed!");
+		return ret;
+	}
+
+	ret = vega10_enable_fan_control_feature(hwmgr);
+	if (ret)
+		pr_info("Attempt to enable SMC fan control feature failed!");
+
+	return ret;
+}
+
 /**
 * Start the fan control on the SMC.
 * @param    hwmgr  the address of the powerplay hardware manager.
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_thermal.h b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_thermal.h
index 21e7c4dfa2ca..4a0ede7c1f07 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_thermal.h
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_thermal.h
@@ -73,6 +73,7 @@ extern int vega10_thermal_disable_alert(struct pp_hwmgr *hwmgr);
 extern int vega10_fan_ctrl_start_smc_fan_control(struct pp_hwmgr *hwmgr);
 extern int vega10_start_thermal_controller(struct pp_hwmgr *hwmgr,
 				struct PP_TemperatureRange *range);
+extern int vega10_enable_mgpu_fan_boost(struct pp_hwmgr *hwmgr);
 
 
 #endif
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c
index 0c8212902275..6c8e78611c03 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega12_hwmgr.c
@@ -1093,6 +1093,16 @@ static int vega12_upload_dpm_min_level(struct pp_hwmgr *hwmgr)
 					return ret);
 	}
 
+	if (data->smu_features[GNLD_DPM_DCEFCLK].enabled) {
+		min_freq = data->dpm_table.dcef_table.dpm_state.hard_min_level;
+
+		PP_ASSERT_WITH_CODE(!(ret = smum_send_msg_to_smc_with_parameter(
+					hwmgr, PPSMC_MSG_SetHardMinByFreq,
+					(PPCLK_DCEFCLK << 16) | (min_freq & 0xffff))),
+					"Failed to set hard min dcefclk!",
+					return ret);
+	}
+
 	return ret;
 
 }
@@ -1818,7 +1828,7 @@ static int vega12_force_clock_level(struct pp_hwmgr *hwmgr,
 		enum pp_clock_type type, uint32_t mask)
 {
 	struct vega12_hwmgr *data = (struct vega12_hwmgr *)(hwmgr->backend);
-	uint32_t soft_min_level, soft_max_level;
+	uint32_t soft_min_level, soft_max_level, hard_min_level;
 	int ret = 0;
 
 	switch (type) {
@@ -1863,6 +1873,56 @@ static int vega12_force_clock_level(struct pp_hwmgr *hwmgr,
 
 		break;
 
+	case PP_SOCCLK:
+		soft_min_level = mask ? (ffs(mask) - 1) : 0;
+		soft_max_level = mask ? (fls(mask) - 1) : 0;
+
+		if (soft_max_level >= data->dpm_table.soc_table.count) {
+			pr_err("Clock level specified %d is over max allowed %d\n",
+					soft_max_level,
+					data->dpm_table.soc_table.count - 1);
+			return -EINVAL;
+		}
+
+		data->dpm_table.soc_table.dpm_state.soft_min_level =
+			data->dpm_table.soc_table.dpm_levels[soft_min_level].value;
+		data->dpm_table.soc_table.dpm_state.soft_max_level =
+			data->dpm_table.soc_table.dpm_levels[soft_max_level].value;
+
+		ret = vega12_upload_dpm_min_level(hwmgr);
+		PP_ASSERT_WITH_CODE(!ret,
+			"Failed to upload boot level to lowest!",
+			return ret);
+
+		ret = vega12_upload_dpm_max_level(hwmgr);
+		PP_ASSERT_WITH_CODE(!ret,
+			"Failed to upload dpm max level to highest!",
+			return ret);
+
+		break;
+
+	case PP_DCEFCLK:
+		hard_min_level = mask ? (ffs(mask) - 1) : 0;
+
+		if (hard_min_level >= data->dpm_table.dcef_table.count) {
+			pr_err("Clock level specified %d is over max allowed %d\n",
+					hard_min_level,
+					data->dpm_table.dcef_table.count - 1);
+			return -EINVAL;
+		}
+
+		data->dpm_table.dcef_table.dpm_state.hard_min_level =
+			data->dpm_table.dcef_table.dpm_levels[hard_min_level].value;
+
+		ret = vega12_upload_dpm_min_level(hwmgr);
+		PP_ASSERT_WITH_CODE(!ret,
+			"Failed to upload boot level to lowest!",
+			return ret);
+
+		//TODO: Setting DCEFCLK max dpm level is not supported
+
+		break;
+
 	case PP_PCIE:
 		break;
 
@@ -1873,6 +1933,104 @@ static int vega12_force_clock_level(struct pp_hwmgr *hwmgr,
 	return 0;
 }
 
+static int vega12_get_ppfeature_status(struct pp_hwmgr *hwmgr, char *buf)
+{
+	static const char *ppfeature_name[] = {
+			"DPM_PREFETCHER",
+			"GFXCLK_DPM",
+			"UCLK_DPM",
+			"SOCCLK_DPM",
+			"UVD_DPM",
+			"VCE_DPM",
+			"ULV",
+			"MP0CLK_DPM",
+			"LINK_DPM",
+			"DCEFCLK_DPM",
+			"GFXCLK_DS",
+			"SOCCLK_DS",
+			"LCLK_DS",
+			"PPT",
+			"TDC",
+			"THERMAL",
+			"GFX_PER_CU_CG",
+			"RM",
+			"DCEFCLK_DS",
+			"ACDC",
+			"VR0HOT",
+			"VR1HOT",
+			"FW_CTF",
+			"LED_DISPLAY",
+			"FAN_CONTROL",
+			"DIDT",
+			"GFXOFF",
+			"CG",
+			"ACG"};
+	static const char *output_title[] = {
+			"FEATURES",
+			"BITMASK",
+			"ENABLEMENT"};
+	uint64_t features_enabled;
+	int i;
+	int ret = 0;
+	int size = 0;
+
+	ret = vega12_get_enabled_smc_features(hwmgr, &features_enabled);
+	PP_ASSERT_WITH_CODE(!ret,
+		"[EnableAllSmuFeatures] Failed to get enabled smc features!",
+		return ret);
+
+	size += sprintf(buf + size, "Current ppfeatures: 0x%016llx\n", features_enabled);
+	size += sprintf(buf + size, "%-19s %-22s %s\n",
+				output_title[0],
+				output_title[1],
+				output_title[2]);
+	for (i = 0; i < GNLD_FEATURES_MAX; i++) {
+		size += sprintf(buf + size, "%-19s 0x%016llx %6s\n",
+				ppfeature_name[i],
+				1ULL << i,
+				(features_enabled & (1ULL << i)) ? "Y" : "N");
+	}
+
+	return size;
+}
+
+static int vega12_set_ppfeature_status(struct pp_hwmgr *hwmgr, uint64_t new_ppfeature_masks)
+{
+	uint64_t features_enabled;
+	uint64_t features_to_enable;
+	uint64_t features_to_disable;
+	int ret = 0;
+
+	if (new_ppfeature_masks >= (1ULL << GNLD_FEATURES_MAX))
+		return -EINVAL;
+
+	ret = vega12_get_enabled_smc_features(hwmgr, &features_enabled);
+	if (ret)
+		return ret;
+
+	features_to_disable =
+		(features_enabled ^ new_ppfeature_masks) & features_enabled;
+	features_to_enable =
+		(features_enabled ^ new_ppfeature_masks) ^ features_to_disable;
+
+	pr_debug("features_to_disable 0x%llx\n", features_to_disable);
+	pr_debug("features_to_enable 0x%llx\n", features_to_enable);
+
+	if (features_to_disable) {
+		ret = vega12_enable_smc_features(hwmgr, false, features_to_disable);
+		if (ret)
+			return ret;
+	}
+
+	if (features_to_enable) {
+		ret = vega12_enable_smc_features(hwmgr, true, features_to_enable);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
 static int vega12_print_clock_levels(struct pp_hwmgr *hwmgr,
 		enum pp_clock_type type, char *buf)
 {
@@ -1912,6 +2070,42 @@ static int vega12_print_clock_levels(struct pp_hwmgr *hwmgr,
 				(clocks.data[i].clocks_in_khz / 1000 == now / 100) ? "*" : "");
 		break;
 
+	case PP_SOCCLK:
+		PP_ASSERT_WITH_CODE(
+				smum_send_msg_to_smc_with_parameter(hwmgr,
+					PPSMC_MSG_GetDpmClockFreq, (PPCLK_SOCCLK << 16)) == 0,
+				"Attempt to get Current SOCCLK Frequency Failed!",
+				return -EINVAL);
+		now = smum_get_argument(hwmgr);
+
+		PP_ASSERT_WITH_CODE(
+				vega12_get_socclocks(hwmgr, &clocks) == 0,
+				"Attempt to get soc clk levels Failed!",
+				return -EINVAL);
+		for (i = 0; i < clocks.num_levels; i++)
+			size += sprintf(buf + size, "%d: %uMhz %s\n",
+				i, clocks.data[i].clocks_in_khz / 1000,
+				(clocks.data[i].clocks_in_khz / 1000 == now) ? "*" : "");
+		break;
+
+	case PP_DCEFCLK:
+		PP_ASSERT_WITH_CODE(
+				smum_send_msg_to_smc_with_parameter(hwmgr,
+					PPSMC_MSG_GetDpmClockFreq, (PPCLK_DCEFCLK << 16)) == 0,
+				"Attempt to get Current DCEFCLK Frequency Failed!",
+				return -EINVAL);
+		now = smum_get_argument(hwmgr);
+
+		PP_ASSERT_WITH_CODE(
+				vega12_get_dcefclocks(hwmgr, &clocks) == 0,
+				"Attempt to get dcef clk levels Failed!",
+				return -EINVAL);
+		for (i = 0; i < clocks.num_levels; i++)
+			size += sprintf(buf + size, "%d: %uMhz %s\n",
+				i, clocks.data[i].clocks_in_khz / 1000,
+				(clocks.data[i].clocks_in_khz / 1000 == now) ? "*" : "");
+		break;
+
 	case PP_PCIE:
 		break;
 
@@ -2432,6 +2626,8 @@ static const struct pp_hwmgr_func vega12_hwmgr_funcs = {
 	.start_thermal_controller = vega12_start_thermal_controller,
 	.powergate_gfx = vega12_gfx_off_control,
 	.get_performance_level = vega12_get_performance_level,
+	.get_ppfeature_status = vega12_get_ppfeature_status,
+	.set_ppfeature_status = vega12_set_ppfeature_status,
 };
 
 int vega12_hwmgr_init(struct pp_hwmgr *hwmgr)
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_baco.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_baco.c
new file mode 100644
index 000000000000..5e8602a79b1c
--- /dev/null
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_baco.c
@@ -0,0 +1,103 @@
+/*
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+#include "amdgpu.h"
+#include "soc15.h"
+#include "soc15_hw_ip.h"
+#include "soc15_common.h"
+#include "vega20_inc.h"
+#include "vega20_ppsmc.h"
+#include "vega20_baco.h"
+
+static const struct soc15_baco_cmd_entry clean_baco_tbl[] =
+{
+	{CMD_WRITE, SOC15_REG_ENTRY(NBIF, 0, mmBIOS_SCRATCH_6), 0, 0, 0, 0},
+	{CMD_WRITE, SOC15_REG_ENTRY(NBIF, 0, mmBIOS_SCRATCH_7), 0, 0, 0, 0},
+};
+
+int vega20_baco_get_capability(struct pp_hwmgr *hwmgr, bool *cap)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)(hwmgr->adev);
+	uint32_t reg;
+
+	*cap = false;
+	if (!phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_BACO))
+		return 0;
+
+	if (((RREG32(0x17569) & 0x20000000) >> 29) == 0x1) {
+		reg = RREG32_SOC15(NBIF, 0, mmRCC_BIF_STRAP0);
+
+		if (reg & RCC_BIF_STRAP0__STRAP_PX_CAPABLE_MASK)
+			*cap = true;
+	}
+
+	return 0;
+}
+
+int vega20_baco_get_state(struct pp_hwmgr *hwmgr, enum BACO_STATE *state)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)(hwmgr->adev);
+	uint32_t reg;
+
+	reg = RREG32_SOC15(NBIF, 0, mmBACO_CNTL);
+
+	if (reg & BACO_CNTL__BACO_MODE_MASK)
+		/* gfx has already entered BACO state */
+		*state = BACO_STATE_IN;
+	else
+		*state = BACO_STATE_OUT;
+	return 0;
+}
+
+int vega20_baco_set_state(struct pp_hwmgr *hwmgr, enum BACO_STATE state)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)(hwmgr->adev);
+	enum BACO_STATE cur_state;
+	uint32_t data;
+
+	vega20_baco_get_state(hwmgr, &cur_state);
+
+	if (cur_state == state)
+		/* ASIC already in the target state */
+		return 0;
+
+	if (state == BACO_STATE_IN) {
+		data = RREG32_SOC15(THM, 0, mmTHM_BACO_CNTL);
+		data |= 0x80000000;
+		WREG32_SOC15(THM, 0, mmTHM_BACO_CNTL, data);
+
+		if (smum_send_msg_to_smc_with_parameter(hwmgr, PPSMC_MSG_EnterBaco, 0))
+			return -EINVAL;
+
+	} else if (state == BACO_STATE_OUT) {
+		if (smum_send_msg_to_smc(hwmgr, PPSMC_MSG_ExitBaco))
+			return -EINVAL;
+		if (!soc15_baco_program_registers(hwmgr, clean_baco_tbl,
+						     ARRAY_SIZE(clean_baco_tbl)))
+			return -EINVAL;
+	}
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce100/i2caux_dce100.h b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_baco.h
index 2b508d3e0ef4..51c7f8392925 100644
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce100/i2caux_dce100.h
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_baco.h
@@ -1,5 +1,5 @@
 /*
- * Copyright 2012-15 Advanced Micro Devices, Inc.
+ * Copyright 2018 Advanced Micro Devices, Inc.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -19,14 +19,14 @@
  * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  * OTHER DEALINGS IN THE SOFTWARE.
  *
- * Authors: AMD
- *
  */
+#ifndef __VEGA20_BACO_H__
+#define __VEGA20_BACO_H__
+#include "hwmgr.h"
+#include "common_baco.h"
 
-#ifndef __DAL_I2C_AUX_DCE100_H__
-#define __DAL_I2C_AUX_DCE100_H__
-
-struct i2caux *dal_i2caux_dce100_create(
-	struct dc_context *ctx);
+extern int vega20_baco_get_capability(struct pp_hwmgr *hwmgr, bool *cap);
+extern int vega20_baco_get_state(struct pp_hwmgr *hwmgr, enum BACO_STATE *state);
+extern int vega20_baco_set_state(struct pp_hwmgr *hwmgr, enum BACO_STATE state);
 
-#endif /* __DAL_I2C_AUX_DCE100_H__ */
+#endif
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
index 82935a3bd950..aad79affb081 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
@@ -47,6 +47,7 @@
 #include "pp_overdriver.h"
 #include "pp_thermal.h"
 #include "soc15_common.h"
+#include "vega20_baco.h"
 #include "smuio/smuio_9_0_offset.h"
 #include "smuio/smuio_9_0_sh_mask.h"
 #include "nbio/nbio_7_4_sh_mask.h"
@@ -770,6 +771,54 @@ static int vega20_init_smc_table(struct pp_hwmgr *hwmgr)
 	return 0;
 }
 
+/*
+ * Override PCIe link speed and link width for DPM Level 1. PPTable entries
+ * reflect the ASIC capabilities and not the system capabilities: e.g. a
+ * Vega20 board in a PCIe Gen3 system. In that case, when the SMU tries to
+ * switch to DPM1, it fails because the system doesn't support Gen4.
+ */
+static int vega20_override_pcie_parameters(struct pp_hwmgr *hwmgr)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)(hwmgr->adev);
+	uint32_t pcie_gen = 0, pcie_width = 0, smu_pcie_arg;
+	int ret;
+
+	if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN4)
+		pcie_gen = 3;
+	else if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN3)
+		pcie_gen = 2;
+	else if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN2)
+		pcie_gen = 1;
+	else if (adev->pm.pcie_gen_mask & CAIL_PCIE_LINK_SPEED_SUPPORT_GEN1)
+		pcie_gen = 0;
+
+	if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X16)
+		pcie_width = 6;
+	else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X12)
+		pcie_width = 5;
+	else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X8)
+		pcie_width = 4;
+	else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X4)
+		pcie_width = 3;
+	else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X2)
+		pcie_width = 2;
+	else if (adev->pm.pcie_mlw_mask & CAIL_PCIE_LINK_WIDTH_SUPPORT_X1)
+		pcie_width = 1;
+
+	/* Bit 31:16: LCLK DPM level. 0 is DPM0, and 1 is DPM1
+	 * Bit 15:8:  PCIE GEN, 0 to 3 corresponds to GEN1 to GEN4
+	 * Bit 7:0:   PCIE lane width, 1 to 7 corresponds to x1 to x32
+	 */
+	smu_pcie_arg = (1 << 16) | (pcie_gen << 8) | pcie_width;
+	ret = smum_send_msg_to_smc_with_parameter(hwmgr,
+			PPSMC_MSG_OverridePcieParameters, smu_pcie_arg);
+	PP_ASSERT_WITH_CODE(!ret,
+		"[OverridePcieParameters] Attempt to override pcie params failed!",
+		return ret);
+
+	return 0;
+}
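As a concrete encoding (the scenario is assumed; the bit layout is the one documented in the comment above): a card trained in a Gen3 x16 slot yields pcie_gen = 2 and pcie_width = 6, so

	smu_pcie_arg = (1 << 16) | (2 << 8) | 6 = 0x10206

i.e. LCLK DPM level 1, Gen3 link speed, x16 link width.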
+
 static int vega20_set_allowed_featuresmask(struct pp_hwmgr *hwmgr)
 {
 	struct vega20_hwmgr *data =
@@ -803,6 +852,11 @@ static int vega20_set_allowed_featuresmask(struct pp_hwmgr *hwmgr)
 	return 0;
 }
 
+static int vega20_run_btc(struct pp_hwmgr *hwmgr)
+{
+	return smum_send_msg_to_smc(hwmgr, PPSMC_MSG_RunBtc);
+}
+
 static int vega20_run_btc_afll(struct pp_hwmgr *hwmgr)
 {
 	return smum_send_msg_to_smc(hwmgr, PPSMC_MSG_RunAfllBtc);
@@ -1564,6 +1618,11 @@ static int vega20_enable_dpm_tasks(struct pp_hwmgr *hwmgr)
 			"[EnableDPMTasks] Failed to initialize SMC table!",
 			return result);
 
+	result = vega20_run_btc(hwmgr);
+	PP_ASSERT_WITH_CODE(!result,
+			"[EnableDPMTasks] Failed to run btc!",
+			return result);
+
 	result = vega20_run_btc_afll(hwmgr);
 	PP_ASSERT_WITH_CODE(!result,
 			"[EnableDPMTasks] Failed to run btc afll!",
@@ -1574,6 +1633,11 @@ static int vega20_enable_dpm_tasks(struct pp_hwmgr *hwmgr)
 			"[EnableDPMTasks] Failed to enable all smu features!",
 			return result);
 
+	result = vega20_override_pcie_parameters(hwmgr);
+	PP_ASSERT_WITH_CODE(!result,
+			"[EnableDPMTasks] Failed to override pcie parameters!",
+			return result);
+
 	result = vega20_notify_smc_display_change(hwmgr);
 	PP_ASSERT_WITH_CODE(!result,
 			"[EnableDPMTasks] Failed to notify smc display change!",
@@ -1735,6 +1799,28 @@ static int vega20_upload_dpm_min_level(struct pp_hwmgr *hwmgr, uint32_t feature_
 					return ret);
 	}
 
+	if (data->smu_features[GNLD_DPM_FCLK].enabled &&
+	   (feature_mask & FEATURE_DPM_FCLK_MASK)) {
+		min_freq = data->dpm_table.fclk_table.dpm_state.soft_min_level;
+
+		PP_ASSERT_WITH_CODE(!(ret = smum_send_msg_to_smc_with_parameter(
+					hwmgr, PPSMC_MSG_SetSoftMinByFreq,
+					(PPCLK_FCLK << 16) | (min_freq & 0xffff))),
+					"Failed to set soft min fclk!",
+					return ret);
+	}
+
+	if (data->smu_features[GNLD_DPM_DCEFCLK].enabled &&
+	   (feature_mask & FEATURE_DPM_DCEFCLK_MASK)) {
+		min_freq = data->dpm_table.dcef_table.dpm_state.hard_min_level;
+
+		PP_ASSERT_WITH_CODE(!(ret = smum_send_msg_to_smc_with_parameter(
+					hwmgr, PPSMC_MSG_SetHardMinByFreq,
+					(PPCLK_DCEFCLK << 16) | (min_freq & 0xffff))),
+					"Failed to set hard min dcefclk!",
+					return ret);
+	}
+
 	return ret;
 }
 
@@ -1807,6 +1893,17 @@ static int vega20_upload_dpm_max_level(struct pp_hwmgr *hwmgr, uint32_t feature_
 					return ret);
 	}
 
+	if (data->smu_features[GNLD_DPM_FCLK].enabled &&
+	   (feature_mask & FEATURE_DPM_FCLK_MASK)) {
+		max_freq = data->dpm_table.fclk_table.dpm_state.soft_max_level;
+
+		PP_ASSERT_WITH_CODE(!(ret = smum_send_msg_to_smc_with_parameter(
+					hwmgr, PPSMC_MSG_SetSoftMaxByFreq,
+					(PPCLK_FCLK << 16) | (max_freq & 0xffff))),
+					"Failed to set soft max fclk!",
+					return ret);
+	}
+
 	return ret;
 }
 
@@ -1914,16 +2011,36 @@ static uint32_t vega20_dpm_get_mclk(struct pp_hwmgr *hwmgr, bool low)
 	return (mem_clk * 100);
 }
 
+static int vega20_get_metrics_table(struct pp_hwmgr *hwmgr, SmuMetrics_t *metrics_table)
+{
+	struct vega20_hwmgr *data =
+			(struct vega20_hwmgr *)(hwmgr->backend);
+	int ret = 0;
+
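+	/* Serve the cached table if it was fetched within the last HZ/2
+	 * jiffies (~500ms); all metrics consumers share one SMU read.
+	 */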
+	if (!data->metrics_time || time_after(jiffies, data->metrics_time + HZ / 2)) {
+		ret = smum_smc_table_manager(hwmgr, (uint8_t *)metrics_table,
+				TABLE_SMU_METRICS, true);
+		if (ret) {
+			pr_info("Failed to export SMU metrics table!\n");
+			return ret;
+		}
+		memcpy(&data->metrics_table, metrics_table, sizeof(SmuMetrics_t));
+		data->metrics_time = jiffies;
+	} else {
+		memcpy(metrics_table, &data->metrics_table, sizeof(SmuMetrics_t));
+	}
+
+	return ret;
+}
+
 static int vega20_get_gpu_power(struct pp_hwmgr *hwmgr,
 		uint32_t *query)
 {
 	int ret = 0;
 	SmuMetrics_t metrics_table;
 
-	ret = smum_smc_table_manager(hwmgr, (uint8_t *)&metrics_table, TABLE_SMU_METRICS, true);
-	PP_ASSERT_WITH_CODE(!ret,
-			"Failed to export SMU METRICS table!",
-			return ret);
+	ret = vega20_get_metrics_table(hwmgr, &metrics_table);
+	if (ret)
+		return ret;
 
 	*query = metrics_table.CurrSocketPower << 8;
 
@@ -1954,10 +2071,9 @@ static int vega20_get_current_activity_percent(struct pp_hwmgr *hwmgr,
 	int ret = 0;
 	SmuMetrics_t metrics_table;
 
-	ret = smum_smc_table_manager(hwmgr, (uint8_t *)&metrics_table, TABLE_SMU_METRICS, true);
-	PP_ASSERT_WITH_CODE(!ret,
-			"Failed to export SMU METRICS table!",
-			return ret);
+	ret = vega20_get_metrics_table(hwmgr, &metrics_table);
+	if (ret)
+		return ret;
 
 	*activity_percent = metrics_table.AverageGfxActivity;
 
@@ -1969,16 +2085,18 @@ static int vega20_read_sensor(struct pp_hwmgr *hwmgr, int idx,
 {
 	struct vega20_hwmgr *data = (struct vega20_hwmgr *)(hwmgr->backend);
 	struct amdgpu_device *adev = hwmgr->adev;
+	SmuMetrics_t metrics_table;
 	uint32_t val_vid;
 	int ret = 0;
 
 	switch (idx) {
 	case AMDGPU_PP_SENSOR_GFX_SCLK:
-		ret = vega20_get_current_clk_freq(hwmgr,
-				PPCLK_GFXCLK,
-				(uint32_t *)value);
-		if (!ret)
-			*size = 4;
+		ret = vega20_get_metrics_table(hwmgr, &metrics_table);
+		if (ret)
+			return ret;
+
+		*((uint32_t *)value) = metrics_table.AverageGfxclkFrequency * 100;
+		*size = 4;
 		break;
 	case AMDGPU_PP_SENSOR_GFX_MCLK:
 		ret = vega20_get_current_clk_freq(hwmgr,
@@ -2136,6 +2254,12 @@ static int vega20_force_dpm_highest(struct pp_hwmgr *hwmgr)
 		data->dpm_table.mem_table.dpm_state.soft_max_level =
 		data->dpm_table.mem_table.dpm_levels[soft_level].value;
 
+	soft_level = vega20_find_highest_dpm_level(&(data->dpm_table.soc_table));
+
+	data->dpm_table.soc_table.dpm_state.soft_min_level =
+		data->dpm_table.soc_table.dpm_state.soft_max_level =
+		data->dpm_table.soc_table.dpm_levels[soft_level].value;
+
 	ret = vega20_upload_dpm_min_level(hwmgr, 0xFFFFFFFF);
 	PP_ASSERT_WITH_CODE(!ret,
 			"Failed to upload boot level to highest!",
@@ -2168,6 +2292,12 @@ static int vega20_force_dpm_lowest(struct pp_hwmgr *hwmgr)
 		data->dpm_table.mem_table.dpm_state.soft_max_level =
 		data->dpm_table.mem_table.dpm_levels[soft_level].value;
 
+	soft_level = vega20_find_lowest_dpm_level(&(data->dpm_table.soc_table));
+
+	data->dpm_table.soc_table.dpm_state.soft_min_level =
+		data->dpm_table.soc_table.dpm_state.soft_max_level =
+		data->dpm_table.soc_table.dpm_levels[soft_level].value;
+
 	ret = vega20_upload_dpm_min_level(hwmgr, 0xFFFFFFFF);
 	PP_ASSERT_WITH_CODE(!ret,
 			"Failed to upload boot level to highest!",
@@ -2184,8 +2314,32 @@ static int vega20_force_dpm_lowest(struct pp_hwmgr *hwmgr)
 
 static int vega20_unforce_dpm_levels(struct pp_hwmgr *hwmgr)
 {
+	struct vega20_hwmgr *data =
+			(struct vega20_hwmgr *)(hwmgr->backend);
+	uint32_t soft_min_level, soft_max_level;
 	int ret = 0;
 
+	soft_min_level = vega20_find_lowest_dpm_level(&(data->dpm_table.gfx_table));
+	soft_max_level = vega20_find_highest_dpm_level(&(data->dpm_table.gfx_table));
+	data->dpm_table.gfx_table.dpm_state.soft_min_level =
+		data->dpm_table.gfx_table.dpm_levels[soft_min_level].value;
+	data->dpm_table.gfx_table.dpm_state.soft_max_level =
+		data->dpm_table.gfx_table.dpm_levels[soft_max_level].value;
+
+	soft_min_level = vega20_find_lowest_dpm_level(&(data->dpm_table.mem_table));
+	soft_max_level = vega20_find_highest_dpm_level(&(data->dpm_table.mem_table));
+	data->dpm_table.mem_table.dpm_state.soft_min_level =
+		data->dpm_table.mem_table.dpm_levels[soft_min_level].value;
+	data->dpm_table.mem_table.dpm_state.soft_max_level =
+		data->dpm_table.mem_table.dpm_levels[soft_max_level].value;
+
+	soft_min_level = vega20_find_lowest_dpm_level(&(data->dpm_table.soc_table));
+	soft_max_level = vega20_find_highest_dpm_level(&(data->dpm_table.soc_table));
+	data->dpm_table.soc_table.dpm_state.soft_min_level =
+		data->dpm_table.soc_table.dpm_levels[soft_min_level].value;
+	data->dpm_table.soc_table.dpm_state.soft_max_level =
+		data->dpm_table.soc_table.dpm_levels[soft_max_level].value;
+
 	ret = vega20_upload_dpm_min_level(hwmgr, 0xFFFFFFFF);
 	PP_ASSERT_WITH_CODE(!ret,
 			"Failed to upload DPM Bootup Levels!",
@@ -2236,7 +2390,7 @@ static int vega20_force_clock_level(struct pp_hwmgr *hwmgr,
 		enum pp_clock_type type, uint32_t mask)
 {
 	struct vega20_hwmgr *data = (struct vega20_hwmgr *)(hwmgr->backend);
-	uint32_t soft_min_level, soft_max_level;
+	uint32_t soft_min_level, soft_max_level, hard_min_level;
 	int ret = 0;
 
 	switch (type) {
@@ -2295,6 +2449,84 @@ static int vega20_force_clock_level(struct pp_hwmgr *hwmgr,
 
 		break;
 
+	case PP_SOCCLK:
+		soft_min_level = mask ? (ffs(mask) - 1) : 0;
+		soft_max_level = mask ? (fls(mask) - 1) : 0;
+
+		if (soft_max_level >= data->dpm_table.soc_table.count) {
+			pr_err("Clock level specified %d is over max allowed %d\n",
+					soft_max_level,
+					data->dpm_table.soc_table.count - 1);
+			return -EINVAL;
+		}
+
+		data->dpm_table.soc_table.dpm_state.soft_min_level =
+			data->dpm_table.soc_table.dpm_levels[soft_min_level].value;
+		data->dpm_table.soc_table.dpm_state.soft_max_level =
+			data->dpm_table.soc_table.dpm_levels[soft_max_level].value;
+
+		ret = vega20_upload_dpm_min_level(hwmgr, FEATURE_DPM_SOCCLK_MASK);
+		PP_ASSERT_WITH_CODE(!ret,
+			"Failed to upload boot level to lowest!",
+			return ret);
+
+		ret = vega20_upload_dpm_max_level(hwmgr, FEATURE_DPM_SOCCLK_MASK);
+		PP_ASSERT_WITH_CODE(!ret,
+			"Failed to upload dpm max level to highest!",
+			return ret);
+
+		break;
+
+	case PP_FCLK:
+		soft_min_level = mask ? (ffs(mask) - 1) : 0;
+		soft_max_level = mask ? (fls(mask) - 1) : 0;
+
+		if (soft_max_level >= data->dpm_table.fclk_table.count) {
+			pr_err("Clock level specified %d is over max allowed %d\n",
+					soft_max_level,
+					data->dpm_table.fclk_table.count - 1);
+			return -EINVAL;
+		}
+
+		data->dpm_table.fclk_table.dpm_state.soft_min_level =
+			data->dpm_table.fclk_table.dpm_levels[soft_min_level].value;
+		data->dpm_table.fclk_table.dpm_state.soft_max_level =
+			data->dpm_table.fclk_table.dpm_levels[soft_max_level].value;
+
+		ret = vega20_upload_dpm_min_level(hwmgr, FEATURE_DPM_FCLK_MASK);
+		PP_ASSERT_WITH_CODE(!ret,
+			"Failed to upload boot level to lowest!",
+			return ret);
+
+		ret = vega20_upload_dpm_max_level(hwmgr, FEATURE_DPM_FCLK_MASK);
+		PP_ASSERT_WITH_CODE(!ret,
+			"Failed to upload dpm max level to highest!",
+			return ret);
+
+		break;
+
+	case PP_DCEFCLK:
+		hard_min_level = mask ? (ffs(mask) - 1) : 0;
+
+		if (hard_min_level >= data->dpm_table.dcef_table.count) {
+			pr_err("Clock level specified %d is over max allowed %d\n",
+					hard_min_level,
+					data->dpm_table.dcef_table.count - 1);
+			return -EINVAL;
+		}
+
+		data->dpm_table.dcef_table.dpm_state.hard_min_level =
+			data->dpm_table.dcef_table.dpm_levels[hard_min_level].value;
+
+		ret = vega20_upload_dpm_min_level(hwmgr, FEATURE_DPM_DCEFCLK_MASK);
+		PP_ASSERT_WITH_CODE(!ret,
+			"Failed to upload boot level to lowest!",
+			return ret);
+
+		//TODO: Setting DCEFCLK max dpm level is not supported
+
+		break;
+
 	case PP_PCIE:
 		soft_min_level = mask ? (ffs(mask) - 1) : 0;
 		soft_max_level = mask ? (fls(mask) - 1) : 0;
@@ -2345,6 +2577,7 @@ static int vega20_dpm_force_dpm_level(struct pp_hwmgr *hwmgr,
 			return ret;
 		vega20_force_clock_level(hwmgr, PP_SCLK, 1 << sclk_mask);
 		vega20_force_clock_level(hwmgr, PP_MCLK, 1 << mclk_mask);
+		vega20_force_clock_level(hwmgr, PP_SOCCLK, 1 << soc_mask);
 		break;
 
 	case AMD_DPM_FORCED_LEVEL_MANUAL:
@@ -2775,6 +3008,108 @@ static int vega20_odn_edit_dpm_table(struct pp_hwmgr *hwmgr,
 	return 0;
 }
 
+static int vega20_get_ppfeature_status(struct pp_hwmgr *hwmgr, char *buf)
+{
+	static const char *ppfeature_name[] = {
+				"DPM_PREFETCHER",
+				"GFXCLK_DPM",
+				"UCLK_DPM",
+				"SOCCLK_DPM",
+				"UVD_DPM",
+				"VCE_DPM",
+				"ULV",
+				"MP0CLK_DPM",
+				"LINK_DPM",
+				"DCEFCLK_DPM",
+				"GFXCLK_DS",
+				"SOCCLK_DS",
+				"LCLK_DS",
+				"PPT",
+				"TDC",
+				"THERMAL",
+				"GFX_PER_CU_CG",
+				"RM",
+				"DCEFCLK_DS",
+				"ACDC",
+				"VR0HOT",
+				"VR1HOT",
+				"FW_CTF",
+				"LED_DISPLAY",
+				"FAN_CONTROL",
+				"GFX_EDC",
+				"GFXOFF",
+				"CG",
+				"FCLK_DPM",
+				"FCLK_DS",
+				"MP1CLK_DS",
+				"MP0CLK_DS",
+				"XGMI"};
+	static const char *output_title[] = {
+				"FEATURES",
+				"BITMASK",
+				"ENABLEMENT"};
+	uint64_t features_enabled;
+	int i;
+	int ret = 0;
+	int size = 0;
+
+	ret = vega20_get_enabled_smc_features(hwmgr, &features_enabled);
+	PP_ASSERT_WITH_CODE(!ret,
+			"[EnableAllSmuFeatures] Failed to get enabled smc features!",
+			return ret);
+
+	size += sprintf(buf + size, "Current ppfeatures: 0x%016llx\n", features_enabled);
+	size += sprintf(buf + size, "%-19s %-22s %s\n",
+				output_title[0],
+				output_title[1],
+				output_title[2]);
+	for (i = 0; i < GNLD_FEATURES_MAX; i++) {
+		size += sprintf(buf + size, "%-19s 0x%016llx %6s\n",
+					ppfeature_name[i],
+					1ULL << i,
+					(features_enabled & (1ULL << i)) ? "Y" : "N");
+	}
+
+	return size;
+}
+
+static int vega20_set_ppfeature_status(struct pp_hwmgr *hwmgr, uint64_t new_ppfeature_masks)
+{
+	uint64_t features_enabled;
+	uint64_t features_to_enable;
+	uint64_t features_to_disable;
+	int ret = 0;
+
+	if (new_ppfeature_masks >= (1ULL << GNLD_FEATURES_MAX))
+		return -EINVAL;
+
+	ret = vega20_get_enabled_smc_features(hwmgr, &features_enabled);
+	if (ret)
+		return ret;
+
+	features_to_disable =
+		(features_enabled ^ new_ppfeature_masks) & features_enabled;
+	features_to_enable =
+		(features_enabled ^ new_ppfeature_masks) ^ features_to_disable;
+
+	pr_debug("features_to_disable 0x%llx\n", features_to_disable);
+	pr_debug("features_to_enable 0x%llx\n", features_to_enable);
+
+	if (features_to_disable) {
+		ret = vega20_enable_smc_features(hwmgr, false, features_to_disable);
+		if (ret)
+			return ret;
+	}
+
+	if (features_to_enable) {
+		ret = vega20_enable_smc_features(hwmgr, true, features_to_enable);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
 static int vega20_print_clock_levels(struct pp_hwmgr *hwmgr,
 		enum pp_clock_type type, char *buf)
 {
@@ -2789,6 +3124,8 @@ static int vega20_print_clock_levels(struct pp_hwmgr *hwmgr,
 	PPTable_t *pptable = (PPTable_t *)pptable_information->smc_pptable;
 	struct amdgpu_device *adev = hwmgr->adev;
 	struct pp_clock_levels_with_latency clocks;
+	struct vega20_single_dpm_table *fclk_dpm_table =
+			&(data->dpm_table.fclk_table);
 	int i, now, size = 0;
 	int ret = 0;
 	uint32_t gen_speed, lane_width;
@@ -2828,6 +3165,52 @@ static int vega20_print_clock_levels(struct pp_hwmgr *hwmgr,
 				(clocks.data[i].clocks_in_khz == now * 10) ? "*" : "");
 		break;
 
+	case PP_SOCCLK:
+		ret = vega20_get_current_clk_freq(hwmgr, PPCLK_SOCCLK, &now);
+		PP_ASSERT_WITH_CODE(!ret,
+				"Attempt to get current socclk freq Failed!",
+				return ret);
+
+		ret = vega20_get_socclocks(hwmgr, &clocks);
+		PP_ASSERT_WITH_CODE(!ret,
+				"Attempt to get soc clk levels Failed!",
+				return ret);
+
+		for (i = 0; i < clocks.num_levels; i++)
+			size += sprintf(buf + size, "%d: %uMhz %s\n",
+				i, clocks.data[i].clocks_in_khz / 1000,
+				(clocks.data[i].clocks_in_khz == now * 10) ? "*" : "");
+		break;
+
+	case PP_FCLK:
+		ret = vega20_get_current_clk_freq(hwmgr, PPCLK_FCLK, &now);
+		PP_ASSERT_WITH_CODE(!ret,
+				"Attempt to get current fclk freq Failed!",
+				return ret);
+
+		for (i = 0; i < fclk_dpm_table->count; i++)
+			size += sprintf(buf + size, "%d: %uMhz %s\n",
+				i, fclk_dpm_table->dpm_levels[i].value,
+				fclk_dpm_table->dpm_levels[i].value == (now / 100) ? "*" : "");
+		break;
+
+	case PP_DCEFCLK:
+		ret = vega20_get_current_clk_freq(hwmgr, PPCLK_DCEFCLK, &now);
+		PP_ASSERT_WITH_CODE(!ret,
+				"Attempt to get current dcefclk freq Failed!",
+				return ret);
+
+		ret = vega20_get_dcefclocks(hwmgr, &clocks);
+		PP_ASSERT_WITH_CODE(!ret,
+				"Attempt to get dcefclk levels Failed!",
+				return ret);
+
+		for (i = 0; i < clocks.num_levels; i++)
+			size += sprintf(buf + size, "%d: %uMhz %s\n",
+				i, clocks.data[i].clocks_in_khz / 1000,
+				(clocks.data[i].clocks_in_khz == now * 10) ? "*" : "");
+		break;
+
 	case PP_PCIE:
 		gen_speed = (RREG32_PCIE(smnPCIE_LC_SPEED_CNTL) &
 			     PSWUSP0_PCIE_LC_SPEED_CNTL__LC_CURRENT_DATA_RATE_MASK)
@@ -3073,7 +3456,7 @@ static int vega20_apply_clocks_adjust_rules(struct pp_hwmgr *hwmgr)
 	disable_mclk_switching = ((1 < hwmgr->display_config->num_display) &&
                            !hwmgr->display_config->multi_monitor_in_sync) ||
                             vblank_too_short;
-    latency = hwmgr->display_config->dce_tolerable_mclk_in_active_latency;
+	latency = hwmgr->display_config->dce_tolerable_mclk_in_active_latency;
 
 	/* gfxclk */
 	dpm_table = &(data->dpm_table.gfx_table);
@@ -3571,6 +3954,8 @@ static const struct pp_hwmgr_func vega20_hwmgr_funcs = {
 	.force_clock_level = vega20_force_clock_level,
 	.print_clock_levels = vega20_print_clock_levels,
 	.read_sensor = vega20_read_sensor,
+	.get_ppfeature_status = vega20_get_ppfeature_status,
+	.set_ppfeature_status = vega20_set_ppfeature_status,
 	/* powergate related */
 	.powergate_uvd = vega20_power_gate_uvd,
 	.powergate_vce = vega20_power_gate_vce,
@@ -3591,6 +3976,10 @@ static const struct pp_hwmgr_func vega20_hwmgr_funcs = {
 	/* smu memory related */
 	.notify_cac_buffer_info = vega20_notify_cac_buffer_info,
 	.enable_mgpu_fan_boost = vega20_enable_mgpu_fan_boost,
+	/* BACO related */
+	.get_asic_baco_capability = vega20_baco_get_capability,
+	.get_asic_baco_state = vega20_baco_get_state,
+	.set_asic_baco_state = vega20_baco_set_state,
 };
 
 int vega20_hwmgr_init(struct pp_hwmgr *hwmgr)
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.h b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.h
index 25faaa5c5b10..37f5f5e657da 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.h
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.h
@@ -520,6 +520,9 @@ struct vega20_hwmgr {
 	/* ---- Gfxoff ---- */
 	bool                           gfxoff_allowed;
 	uint32_t                       counter_gfxoff;
+
+	unsigned long                  metrics_time;
+	SmuMetrics_t                   metrics_table;
 };
 
 #define VEGA20_DPM2_NEAR_TDP_DEC                      10
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_inc.h b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_inc.h
index 6738bad53602..613cb1989b3d 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_inc.h
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_inc.h
@@ -31,5 +31,6 @@
 #include "asic_reg/mp/mp_9_0_sh_mask.h"
 
 #include "asic_reg/nbio/nbio_7_4_offset.h"
+#include "asic_reg/nbio/nbio_7_4_sh_mask.h"
 
 #endif
diff --git a/drivers/gpu/drm/amd/powerplay/inc/hardwaremanager.h b/drivers/gpu/drm/amd/powerplay/inc/hardwaremanager.h
index f4dab979a3a1..6e0be6027705 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/hardwaremanager.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/hardwaremanager.h
@@ -397,7 +397,6 @@ struct phm_odn_clock_levels {
 };
 
 extern int phm_disable_clock_power_gatings(struct pp_hwmgr *hwmgr);
-extern int phm_enable_clock_power_gatings(struct pp_hwmgr *hwmgr);
 extern int phm_powerdown_uvd(struct pp_hwmgr *hwmgr);
 extern int phm_setup_asic(struct pp_hwmgr *hwmgr);
 extern int phm_enable_dynamic_state_management(struct pp_hwmgr *hwmgr);
diff --git a/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h b/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
index 8cb831b6a016..bac3d85e3b82 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
@@ -47,6 +47,11 @@ enum DISPLAY_GAP {
 };
 typedef enum DISPLAY_GAP DISPLAY_GAP;
 
+enum BACO_STATE {
+	BACO_STATE_OUT = 0,
+	BACO_STATE_IN,
+};
+
 struct vi_dpm_level {
 	bool enabled;
 	uint32_t value;
@@ -251,7 +256,6 @@ struct pp_hwmgr_func {
 	uint32_t (*get_sclk)(struct pp_hwmgr *hwmgr, bool low);
 	int (*power_state_set)(struct pp_hwmgr *hwmgr,
 						const void *state);
-	int (*enable_clock_power_gating)(struct pp_hwmgr *hwmgr);
 	int (*notify_smc_display_config_after_ps_adjustment)(struct pp_hwmgr *hwmgr);
 	int (*pre_display_config_changed)(struct pp_hwmgr *hwmgr);
 	int (*display_config_changed)(struct pp_hwmgr *hwmgr);
@@ -334,6 +338,11 @@ struct pp_hwmgr_func {
 	int (*enable_mgpu_fan_boost)(struct pp_hwmgr *hwmgr);
 	int (*set_hard_min_dcefclk_by_freq)(struct pp_hwmgr *hwmgr, uint32_t clock);
 	int (*set_hard_min_fclk_by_freq)(struct pp_hwmgr *hwmgr, uint32_t clock);
+	int (*get_asic_baco_capability)(struct pp_hwmgr *hwmgr, bool *cap);
+	int (*get_asic_baco_state)(struct pp_hwmgr *hwmgr, enum BACO_STATE *state);
+	int (*set_asic_baco_state)(struct pp_hwmgr *hwmgr, enum BACO_STATE state);
+	int (*get_ppfeature_status)(struct pp_hwmgr *hwmgr, char *buf);
+	int (*set_ppfeature_status)(struct pp_hwmgr *hwmgr, uint64_t ppfeature_masks);
 };
 
 struct pp_table_func {
@@ -678,6 +687,7 @@ struct pp_advance_fan_control_parameters {
 	uint32_t  ulTargetGfxClk;
 	uint16_t  usZeroRPMStartTemperature;
 	uint16_t  usZeroRPMStopTemperature;
+	uint16_t  usMGpuThrottlingRPMLimit;
 };
 
 struct pp_thermal_controller_info {
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
index a6edd5df33b0..4240aeec9000 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
@@ -29,6 +29,10 @@
 #include <drm/amdgpu_drm.h>
 #include "smumgr.h"
 
+MODULE_FIRMWARE("amdgpu/bonaire_smc.bin");
+MODULE_FIRMWARE("amdgpu/bonaire_k_smc.bin");
+MODULE_FIRMWARE("amdgpu/hawaii_smc.bin");
+MODULE_FIRMWARE("amdgpu/hawaii_k_smc.bin");
 MODULE_FIRMWARE("amdgpu/topaz_smc.bin");
 MODULE_FIRMWARE("amdgpu/topaz_k_smc.bin");
 MODULE_FIRMWARE("amdgpu/tonga_smc.bin");
diff --git a/drivers/gpu/drm/arc/arcpgu_crtc.c b/drivers/gpu/drm/arc/arcpgu_crtc.c
index 62f51f70606d..73e508e00e30 100644
--- a/drivers/gpu/drm/arc/arcpgu_crtc.c
+++ b/drivers/gpu/drm/arc/arcpgu_crtc.c
@@ -15,10 +15,12 @@
  */
 
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_device.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
+#include <drm/drm_vblank.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/clk.h>
 #include <linux/platform_data/simplefb.h>
 
diff --git a/drivers/gpu/drm/arc/arcpgu_drv.c b/drivers/gpu/drm/arc/arcpgu_drv.c
index 206a76abf771..c9f78397d345 100644
--- a/drivers/gpu/drm/arc/arcpgu_drv.c
+++ b/drivers/gpu/drm/arc/arcpgu_drv.c
@@ -15,13 +15,19 @@
  */
 
 #include <linux/clk.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_atomic_helper.h>
+#include <drm/drm_debugfs.h>
+#include <drm/drm_device.h>
+#include <drm/drm_drv.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
-#include <drm/drm_atomic_helper.h>
+#include <drm/drm_probe_helper.h>
+#include <linux/dma-mapping.h>
+#include <linux/module.h>
 #include <linux/of_reserved_mem.h>
+#include <linux/platform_device.h>
 
 #include "arcpgu.h"
 #include "arcpgu_regs.h"
diff --git a/drivers/gpu/drm/arc/arcpgu_sim.c b/drivers/gpu/drm/arc/arcpgu_sim.c
index 68629e614990..5ea053cf805c 100644
--- a/drivers/gpu/drm/arc/arcpgu_sim.c
+++ b/drivers/gpu/drm/arc/arcpgu_sim.c
@@ -14,8 +14,9 @@
  *
  */
 
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_device.h>
+#include <drm/drm_probe_helper.h>
 
 #include "arcpgu.h"
 
@@ -51,7 +52,6 @@ arcpgu_drm_connector_helper_funcs = {
 };
 
 static const struct drm_connector_funcs arcpgu_drm_connector_funcs = {
-	.dpms = drm_helper_connector_dpms,
 	.reset = drm_atomic_helper_connector_reset,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = arcpgu_drm_connector_destroy,
diff --git a/drivers/gpu/drm/arm/Kconfig b/drivers/gpu/drm/arm/Kconfig
index 9a18e1bd57b4..a204103b3efb 100644
--- a/drivers/gpu/drm/arm/Kconfig
+++ b/drivers/gpu/drm/arm/Kconfig
@@ -1,13 +1,10 @@
-config DRM_ARM
-	bool
-	help
-	  Choose this option to select drivers for ARM's devices
+# SPDX-License-Identifier: GPL-2.0
+menu "ARM devices"
 
 config DRM_HDLCD
 	tristate "ARM HDLCD"
 	depends on DRM && OF && (ARM || ARM64)
 	depends on COMMON_CLK
-	select DRM_ARM
 	select DRM_KMS_HELPER
 	select DRM_KMS_CMA_HELPER
 	help
@@ -29,7 +26,6 @@ config DRM_MALI_DISPLAY
 	tristate "ARM Mali Display Processor"
 	depends on DRM && OF && (ARM || ARM64)
 	depends on COMMON_CLK
-	select DRM_ARM
 	select DRM_KMS_HELPER
 	select DRM_KMS_CMA_HELPER
 	select DRM_GEM_CMA_HELPER
@@ -40,3 +36,7 @@ config DRM_MALI_DISPLAY
 	  of the hardware.
 
 	  If compiled as a module it will be called mali-dp.
+
+source "drivers/gpu/drm/arm/display/Kconfig"
+
+endmenu
diff --git a/drivers/gpu/drm/arm/Makefile b/drivers/gpu/drm/arm/Makefile
index 3bf31d1a4722..120bef801fcf 100644
--- a/drivers/gpu/drm/arm/Makefile
+++ b/drivers/gpu/drm/arm/Makefile
@@ -3,3 +3,4 @@ obj-$(CONFIG_DRM_HDLCD)	+= hdlcd.o
 mali-dp-y := malidp_drv.o malidp_hw.o malidp_planes.o malidp_crtc.o
 mali-dp-y += malidp_mw.o
 obj-$(CONFIG_DRM_MALI_DISPLAY)	+= mali-dp.o
+obj-$(CONFIG_DRM_KOMEDA) += display/
diff --git a/drivers/gpu/drm/arm/display/Kbuild b/drivers/gpu/drm/arm/display/Kbuild
new file mode 100644
index 000000000000..382f1ca831e4
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/Kbuild
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_DRM_KOMEDA) += komeda/
diff --git a/drivers/gpu/drm/arm/display/Kconfig b/drivers/gpu/drm/arm/display/Kconfig
new file mode 100644
index 000000000000..cec0639e3aa1
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/Kconfig
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: GPL-2.0
+config DRM_KOMEDA
+	tristate "ARM Komeda display driver"
+	depends on DRM && OF
+	depends on COMMON_CLK
+	select DRM_KMS_HELPER
+	select DRM_KMS_CMA_HELPER
+	select DRM_GEM_CMA_HELPER
+	select VIDEOMODE_HELPERS
+	help
+	  Choose this option if you want to compile the ARM Komeda display
+	  Processor driver. It supports the D71 variants of the hardware.
+
+	  If compiled as a module it will be called komeda.
diff --git a/drivers/gpu/drm/arm/display/include/malidp_io.h b/drivers/gpu/drm/arm/display/include/malidp_io.h
new file mode 100644
index 000000000000..4fb3caf864ce
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/include/malidp_io.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+#ifndef _MALIDP_IO_H_
+#define _MALIDP_IO_H_
+
+#include <linux/io.h>
+
+static inline u32
+malidp_read32(u32 __iomem *base, u32 offset)
+{
+	return readl((base + (offset >> 2)));
+}
+
+static inline void
+malidp_write32(u32 __iomem *base, u32 offset, u32 v)
+{
+	writel(v, (base + (offset >> 2)));
+}
+
+static inline void
+malidp_write32_mask(u32 __iomem *base, u32 offset, u32 m, u32 v)
+{
+	u32 tmp = malidp_read32(base, offset);
+
+	tmp &= (~m);
+	malidp_write32(base, offset, v | tmp);
+}
+
+static inline void
+malidp_write_group(u32 __iomem *base, u32 offset, int num, const u32 *values)
+{
+	int i;
+
+	for (i = 0; i < num; i++)
+		malidp_write32(base, offset + i * 4, values[i]);
+}
+
+#endif /*_MALIDP_IO_H_*/
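
A minimal sketch of how these accessors compose. malidp_write32_mask() does a
read-modify-write: it clears the bits in `m`, then ORs in `v`. Note that
`offset` is a byte offset (the helpers shift it down to a u32 index). The
register offset and enable bit below are hypothetical, not real D71 registers:

	#define HYP_BLK_CONTROL	0x0D0	/* hypothetical register, byte offset */
	#define HYP_BLK_EN	BIT(0)	/* hypothetical enable bit */

	/* set only the enable bit, leaving the other control bits untouched */
	malidp_write32_mask(reg_base, HYP_BLK_CONTROL, HYP_BLK_EN, HYP_BLK_EN);
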
diff --git a/drivers/gpu/drm/arm/display/include/malidp_product.h b/drivers/gpu/drm/arm/display/include/malidp_product.h
new file mode 100644
index 000000000000..b35fc5db866b
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/include/malidp_product.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+#ifndef _MALIDP_PRODUCT_H_
+#define _MALIDP_PRODUCT_H_
+
+/* Product identification */
+#define MALIDP_CORE_ID(__product, __major, __minor, __status) \
+	((((__product) & 0xFFFF) << 16) | (((__major) & 0xF) << 12) | \
+	(((__minor) & 0xF) << 8) | ((__status) & 0xFF))
+
+#define MALIDP_CORE_ID_PRODUCT_ID(__core_id) ((__u32)(__core_id) >> 16)
+#define MALIDP_CORE_ID_MAJOR(__core_id)      (((__u32)(__core_id) >> 12) & 0xF)
+#define MALIDP_CORE_ID_MINOR(__core_id)      (((__u32)(__core_id) >> 8) & 0xF)
+#define MALIDP_CORE_ID_STATUS(__core_id)     (((__u32)(__core_id)) & 0xFF)
+
+/* Mali-display product IDs */
+#define MALIDP_D71_PRODUCT_ID   0x0071
+
+#endif /* _MALIDP_PRODUCT_H_ */
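
A worked example of the core-id packing (the 0x34 status byte is an arbitrary
illustrative value; its encoding is hardware-defined):

	u32 core_id = MALIDP_CORE_ID(MALIDP_D71_PRODUCT_ID, 1, 0, 0x34);
	/* core_id == 0x00711034 */

	MALIDP_CORE_ID_PRODUCT_ID(core_id);	/* 0x0071: Mali-D71 */
	MALIDP_CORE_ID_MAJOR(core_id);		/* 1: r1px */
	MALIDP_CORE_ID_MINOR(core_id);		/* 0: rxp0 */
	MALIDP_CORE_ID_STATUS(core_id);		/* 0x34 */
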
diff --git a/drivers/gpu/drm/arm/display/include/malidp_utils.h b/drivers/gpu/drm/arm/display/include/malidp_utils.h
new file mode 100644
index 000000000000..63cc47cefcf8
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/include/malidp_utils.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+#ifndef _MALIDP_UTILS_
+#define _MALIDP_UTILS_
+
+#define has_bit(nr, mask)	(BIT(nr) & (mask))
+#define has_bits(bits, mask)	(((bits) & (mask)) == (bits))
+
+#define dp_for_each_set_bit(bit, mask) \
+	for_each_set_bit((bit), ((unsigned long *)&(mask)), sizeof(mask) * 8)
+
+#endif /* _MALIDP_UTILS_ */
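
A short sketch of these helpers, using the component IDs defined later in this
patch by komeda_pipeline.h:

	u32 avail_comps = BIT(KOMEDA_COMPONENT_LAYER0) |
			  BIT(KOMEDA_COMPONENT_COMPIZ0);
	int id;

	/* iterates id = 0 (LAYER0), then id = 16 (COMPIZ0) */
	dp_for_each_set_bit(id, avail_comps) {
		/* look up and handle component `id` */
	}

	/* has_bits() checks that *all* requested bits are present */
	has_bits(BIT(KOMEDA_COMPONENT_LAYER0), avail_comps);	/* true */
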
diff --git a/drivers/gpu/drm/arm/display/komeda/Makefile b/drivers/gpu/drm/arm/display/komeda/Makefile
new file mode 100644
index 000000000000..1b875e5dc0f6
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/komeda/Makefile
@@ -0,0 +1,21 @@
+# SPDX-License-Identifier: GPL-2.0
+
+ccflags-y := \
+	-I$(src)/../include \
+	-I$(src)
+
+komeda-y := \
+	komeda_drv.o \
+	komeda_dev.o \
+	komeda_format_caps.o \
+	komeda_pipeline.o \
+	komeda_framebuffer.o \
+	komeda_kms.o \
+	komeda_crtc.o \
+	komeda_plane.o \
+	komeda_private_obj.o
+
+komeda-y += \
+	d71/d71_dev.o
+
+obj-$(CONFIG_DRM_KOMEDA) += komeda.o
diff --git a/drivers/gpu/drm/arm/display/komeda/d71/d71_dev.c b/drivers/gpu/drm/arm/display/komeda/d71/d71_dev.c
new file mode 100644
index 000000000000..edbf9daa1545
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/komeda/d71/d71_dev.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+#include "malidp_io.h"
+#include "komeda_dev.h"
+
+static int d71_enum_resources(struct komeda_dev *mdev)
+{
+	/* TODO add enum resources */
+	return -1;
+}
+
+#define __HW_ID(__group, __format) \
+	((((__group) & 0x7) << 3) | ((__format) & 0x7))
+
+#define RICH		KOMEDA_FMT_RICH_LAYER
+#define SIMPLE		KOMEDA_FMT_SIMPLE_LAYER
+#define RICH_SIMPLE	(KOMEDA_FMT_RICH_LAYER | KOMEDA_FMT_SIMPLE_LAYER)
+#define RICH_WB		(KOMEDA_FMT_RICH_LAYER | KOMEDA_FMT_WB_LAYER)
+#define RICH_SIMPLE_WB	(RICH_SIMPLE | KOMEDA_FMT_WB_LAYER)
+
+#define Rot_0		DRM_MODE_ROTATE_0
+#define Flip_H_V	(DRM_MODE_REFLECT_X | DRM_MODE_REFLECT_Y | Rot_0)
+#define Rot_ALL_H_V	(DRM_MODE_ROTATE_MASK | Flip_H_V)
+
+#define LYT_NM		BIT(AFBC_FORMAT_MOD_BLOCK_SIZE_16x16)
+#define LYT_WB		BIT(AFBC_FORMAT_MOD_BLOCK_SIZE_32x8)
+#define LYT_NM_WB	(LYT_NM | LYT_WB)
+
+#define AFB_TH		AFBC(_TILED | _SPARSE)
+#define AFB_TH_SC_YTR	AFBC(_TILED | _SC | _SPARSE | _YTR)
+#define AFB_TH_SC_YTR_BS AFBC(_TILED | _SC | _SPARSE | _YTR | _SPLIT)
+
+static struct komeda_format_caps d71_format_caps_table[] = {
+	/*   HW_ID    |        fourcc        | tile_sz |   layer_types |   rots    | afbc_layouts | afbc_features */
+	/* ABGR_2101010*/
+	{__HW_ID(0, 0),	DRM_FORMAT_ARGB2101010,	1,	RICH_SIMPLE_WB,	Flip_H_V,		0, 0},
+	{__HW_ID(0, 1),	DRM_FORMAT_ABGR2101010,	1,	RICH_SIMPLE_WB,	Flip_H_V,		0, 0},
+	{__HW_ID(0, 1),	DRM_FORMAT_ABGR2101010,	1,	RICH_SIMPLE,	Rot_ALL_H_V,	LYT_NM_WB, AFB_TH_SC_YTR_BS}, /* afbc */
+	{__HW_ID(0, 2),	DRM_FORMAT_RGBA1010102,	1,	RICH_SIMPLE_WB,	Flip_H_V,		0, 0},
+	{__HW_ID(0, 3),	DRM_FORMAT_BGRA1010102,	1,	RICH_SIMPLE_WB,	Flip_H_V,		0, 0},
+	/* ABGR_8888*/
+	{__HW_ID(1, 0),	DRM_FORMAT_ARGB8888,	1,	RICH_SIMPLE_WB,	Flip_H_V,		0, 0},
+	{__HW_ID(1, 1),	DRM_FORMAT_ABGR8888,	1,	RICH_SIMPLE_WB,	Flip_H_V,		0, 0},
+	{__HW_ID(1, 1),	DRM_FORMAT_ABGR8888,	1,	RICH_SIMPLE,	Rot_ALL_H_V,	LYT_NM_WB, AFB_TH_SC_YTR_BS}, /* afbc */
+	{__HW_ID(1, 2),	DRM_FORMAT_RGBA8888,	1,	RICH_SIMPLE_WB,	Flip_H_V,		0, 0},
+	{__HW_ID(1, 3),	DRM_FORMAT_BGRA8888,	1,	RICH_SIMPLE_WB,	Flip_H_V,		0, 0},
+	/* XBGB_8888 */
+	{__HW_ID(2, 0),	DRM_FORMAT_XRGB8888,	1,	RICH_SIMPLE_WB,	Flip_H_V,		0, 0},
+	{__HW_ID(2, 1),	DRM_FORMAT_XBGR8888,	1,	RICH_SIMPLE_WB,	Flip_H_V,		0, 0},
+	{__HW_ID(2, 2),	DRM_FORMAT_RGBX8888,	1,	RICH_SIMPLE_WB,	Flip_H_V,		0, 0},
+	{__HW_ID(2, 3),	DRM_FORMAT_BGRX8888,	1,	RICH_SIMPLE_WB,	Flip_H_V,		0, 0},
+	/* BGR_888 */ /* non-afbc RGB888 doesn't support rotation and flip */
+	{__HW_ID(3, 0),	DRM_FORMAT_RGB888,	1,	RICH_SIMPLE_WB,	Rot_0,			0, 0},
+	{__HW_ID(3, 1),	DRM_FORMAT_BGR888,	1,	RICH_SIMPLE_WB,	Rot_0,			0, 0},
+	{__HW_ID(3, 1),	DRM_FORMAT_BGR888,	1,	RICH_SIMPLE,	Rot_ALL_H_V,	LYT_NM_WB, AFB_TH_SC_YTR_BS}, /* afbc */
+	/* BGR 16bpp */
+	{__HW_ID(4, 0),	DRM_FORMAT_RGBA5551,	1,	RICH_SIMPLE,	Flip_H_V,		0, 0},
+	{__HW_ID(4, 1),	DRM_FORMAT_ABGR1555,	1,	RICH_SIMPLE,	Flip_H_V,		0, 0},
+	{__HW_ID(4, 1),	DRM_FORMAT_ABGR1555,	1,	RICH_SIMPLE,	Rot_ALL_H_V,	LYT_NM_WB, AFB_TH_SC_YTR}, /* afbc */
+	{__HW_ID(4, 2),	DRM_FORMAT_RGB565,	1,	RICH_SIMPLE,	Flip_H_V,		0, 0},
+	{__HW_ID(4, 3),	DRM_FORMAT_BGR565,	1,	RICH_SIMPLE,	Flip_H_V,		0, 0},
+	{__HW_ID(4, 3),	DRM_FORMAT_BGR565,	1,	RICH_SIMPLE,	Rot_ALL_H_V,	LYT_NM_WB, AFB_TH_SC_YTR}, /* afbc */
+	{__HW_ID(4, 4), DRM_FORMAT_R8,		1,	SIMPLE,		Rot_0,			0, 0},
+	/* YUV 444/422/420 8bit  */
+	{__HW_ID(5, 0),	0 /*XYUV8888*/,		1,	0,		0,			0, 0},
+	/* XYUV unsupported*/
+	{__HW_ID(5, 1),	DRM_FORMAT_YUYV,	1,	RICH,		Rot_ALL_H_V,	LYT_NM, AFB_TH}, /* afbc */
+	{__HW_ID(5, 2),	DRM_FORMAT_YUYV,	1,	RICH,		Flip_H_V,		0, 0},
+	{__HW_ID(5, 3),	DRM_FORMAT_UYVY,	1,	RICH,		Flip_H_V,		0, 0},
+	{__HW_ID(5, 4),	0 /*X0L0*/,		2,	0,		0,		0, 0}, /* Y0L0 unsupported */
+	{__HW_ID(5, 6),	DRM_FORMAT_NV12,	1,	RICH,		Flip_H_V,		0, 0},
+	{__HW_ID(5, 6),	0/*DRM_FORMAT_YUV420_8BIT*/,	1,	RICH,	Rot_ALL_H_V,	LYT_NM, AFB_TH}, /* afbc */
+	{__HW_ID(5, 7),	DRM_FORMAT_YUV420,	1,	RICH,		Flip_H_V,		0, 0},
+	/* YUV 10bit*/
+	{__HW_ID(6, 0),	0 /*XVYU2101010*/,	1,	0,		0,			0, 0}, /* VYV30 unsupported */
+	{__HW_ID(6, 6),	0/*DRM_FORMAT_X0L2*/,	2,	RICH,		Flip_H_V,		0, 0},
+	{__HW_ID(6, 7),	0/*DRM_FORMAT_P010*/,	1,	RICH,		Flip_H_V,		0, 0},
+	{__HW_ID(6, 7),	0/*DRM_FORMAT_YUV420_10BIT*/, 1,	RICH,	Rot_ALL_H_V,	LYT_NM, AFB_TH},
+};
+
+static void d71_init_fmt_tbl(struct komeda_dev *mdev)
+{
+	struct komeda_format_caps_table *table = &mdev->fmt_tbl;
+
+	table->format_caps = d71_format_caps_table;
+	table->n_formats = ARRAY_SIZE(d71_format_caps_table);
+}
+
+static struct komeda_dev_funcs d71_chip_funcs = {
+	.init_format_table = d71_init_fmt_tbl,
+	.enum_resources	= d71_enum_resources,
+	.cleanup	= NULL,
+};
+
+#define GLB_ARCH_ID		0x000
+#define GLB_CORE_ID		0x004
+#define GLB_CORE_INFO		0x008
+
+struct komeda_dev_funcs *
+d71_identify(u32 __iomem *reg_base, struct komeda_chip_info *chip)
+{
+	chip->arch_id	= malidp_read32(reg_base, GLB_ARCH_ID);
+	chip->core_id	= malidp_read32(reg_base, GLB_CORE_ID);
+	chip->core_info	= malidp_read32(reg_base, GLB_CORE_INFO);
+
+	return &d71_chip_funcs;
+}
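
Reading the caps table: every AFBC-capable fourcc gets a second row keyed by
the same hw id. As an illustration, a modifier built as below (macro names
from drm_fourcc.h) should match the ABGR8888 AFBC row above once the lookup in
komeda_format_caps.c is wired up:

	u64 mod = DRM_FORMAT_MOD_ARM_AFBC(AFBC_FORMAT_MOD_BLOCK_SIZE_16x16 |
					  AFBC_FORMAT_MOD_TILED |
					  AFBC_FORMAT_MOD_SC |
					  AFBC_FORMAT_MOD_SPARSE |
					  AFBC_FORMAT_MOD_YTR |
					  AFBC_FORMAT_MOD_SPLIT);

	komeda_get_format_caps(&mdev->fmt_tbl, DRM_FORMAT_ABGR8888, mod);
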
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_crtc.c b/drivers/gpu/drm/arm/display/komeda/komeda_crtc.c
new file mode 100644
index 000000000000..3ca5718aa0c2
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_crtc.c
@@ -0,0 +1,110 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+#include <linux/clk.h>
+#include <linux/pm_runtime.h>
+#include <linux/spinlock.h>
+
+#include <drm/drm_atomic.h>
+#include <drm/drm_atomic_helper.h>
+#include <drm/drm_crtc_helper.h>
+#include <drm/drm_plane_helper.h>
+#include <drm/drm_print.h>
+#include <drm/drm_vblank.h>
+
+#include "komeda_dev.h"
+#include "komeda_kms.h"
+
+struct drm_crtc_helper_funcs komeda_crtc_helper_funcs = {
+};
+
+static const struct drm_crtc_funcs komeda_crtc_funcs = {
+};
+
+int komeda_kms_setup_crtcs(struct komeda_kms_dev *kms,
+			   struct komeda_dev *mdev)
+{
+	struct komeda_crtc *crtc;
+	struct komeda_pipeline *master;
+	char str[16];
+	int i;
+
+	kms->n_crtcs = 0;
+
+	for (i = 0; i < mdev->n_pipelines; i++) {
+		crtc = &kms->crtcs[kms->n_crtcs];
+		master = mdev->pipelines[i];
+
+		crtc->master = master;
+		crtc->slave  = NULL;
+
+		if (crtc->slave)
+			sprintf(str, "pipe-%d", crtc->slave->id);
+		else
+			sprintf(str, "None");
+
+		DRM_INFO("crtc%d: master(pipe-%d) slave(%s) output: %s.\n",
+			 kms->n_crtcs, master->id, str,
+			 master->of_output_dev ?
+			 master->of_output_dev->full_name : "None");
+
+		kms->n_crtcs++;
+	}
+
+	return 0;
+}
+
+static struct drm_plane *
+get_crtc_primary(struct komeda_kms_dev *kms, struct komeda_crtc *crtc)
+{
+	struct komeda_plane *kplane;
+	struct drm_plane *plane;
+
+	drm_for_each_plane(plane, &kms->base) {
+		if (plane->type != DRM_PLANE_TYPE_PRIMARY)
+			continue;
+
+		kplane = to_kplane(plane);
+		/* only master can be primary */
+		if (kplane->layer->base.pipeline == crtc->master)
+			return plane;
+	}
+
+	return NULL;
+}
+
+static int komeda_crtc_add(struct komeda_kms_dev *kms,
+			   struct komeda_crtc *kcrtc)
+{
+	struct drm_crtc *crtc = &kcrtc->base;
+	int err;
+
+	err = drm_crtc_init_with_planes(&kms->base, crtc,
+					get_crtc_primary(kms, kcrtc), NULL,
+					&komeda_crtc_funcs, NULL);
+	if (err)
+		return err;
+
+	drm_crtc_helper_add(crtc, &komeda_crtc_helper_funcs);
+	drm_crtc_vblank_reset(crtc);
+
+	crtc->port = kcrtc->master->of_output_port;
+
+	return 0;
+}
+
+int komeda_kms_add_crtcs(struct komeda_kms_dev *kms, struct komeda_dev *mdev)
+{
+	int i, err;
+
+	for (i = 0; i < kms->n_crtcs; i++) {
+		err = komeda_crtc_add(kms, &kms->crtcs[i]);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_dev.c b/drivers/gpu/drm/arm/display/komeda/komeda_dev.c
new file mode 100644
index 000000000000..70e9bb7fa30c
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_dev.c
@@ -0,0 +1,190 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+#include <linux/io.h>
+#include <linux/of_device.h>
+#include <linux/of_graph.h>
+#include <linux/platform_device.h>
+
+#include <drm/drm_print.h>
+
+#include "komeda_dev.h"
+
+static int komeda_parse_pipe_dt(struct komeda_dev *mdev, struct device_node *np)
+{
+	struct komeda_pipeline *pipe;
+	struct clk *clk;
+	u32 pipe_id;
+	int ret = 0;
+
+	ret = of_property_read_u32(np, "reg", &pipe_id);
+	if (ret != 0 || pipe_id >= mdev->n_pipelines)
+		return -EINVAL;
+
+	pipe = mdev->pipelines[pipe_id];
+
+	clk = of_clk_get_by_name(np, "aclk");
+	if (IS_ERR(clk)) {
+		DRM_ERROR("get aclk for pipeline %d failed!\n", pipe_id);
+		return PTR_ERR(clk);
+	}
+	pipe->aclk = clk;
+
+	clk = of_clk_get_by_name(np, "pxclk");
+	if (IS_ERR(clk)) {
+		DRM_ERROR("get pxclk for pipeline %d failed!\n", pipe_id);
+		return PTR_ERR(clk);
+	}
+	pipe->pxlclk = clk;
+
+	/* enum ports */
+	pipe->of_output_dev =
+		of_graph_get_remote_node(np, KOMEDA_OF_PORT_OUTPUT, 0);
+	pipe->of_output_port =
+		of_graph_get_port_by_id(np, KOMEDA_OF_PORT_OUTPUT);
+
+	pipe->of_node = np;
+
+	return 0;
+}
+
+static int komeda_parse_dt(struct device *dev, struct komeda_dev *mdev)
+{
+	struct device_node *child, *np = dev->of_node;
+	struct clk *clk;
+	int ret;
+
+	clk = devm_clk_get(dev, "mclk");
+	if (IS_ERR(clk))
+		return PTR_ERR(clk);
+
+	mdev->mclk = clk;
+
+	for_each_available_child_of_node(np, child) {
+		if (of_node_cmp(child->name, "pipeline") == 0) {
+			ret = komeda_parse_pipe_dt(mdev, child);
+			if (ret) {
+				DRM_ERROR("parse pipeline dt error!\n");
+				of_node_put(child);
+				break;
+			}
+		}
+	}
+
+	return ret;
+}
+
+struct komeda_dev *komeda_dev_create(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	const struct komeda_product_data *product;
+	struct komeda_dev *mdev;
+	struct resource *io_res;
+	int err = 0;
+
+	product = of_device_get_match_data(dev);
+	if (!product)
+		return ERR_PTR(-ENODEV);
+
+	io_res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!io_res) {
+		DRM_ERROR("No registers defined.\n");
+		return ERR_PTR(-ENODEV);
+	}
+
+	mdev = devm_kzalloc(dev, sizeof(*mdev), GFP_KERNEL);
+	if (!mdev)
+		return ERR_PTR(-ENOMEM);
+
+	mdev->dev = dev;
+	mdev->reg_base = devm_ioremap_resource(dev, io_res);
+	if (IS_ERR(mdev->reg_base)) {
+		DRM_ERROR("Map register space failed.\n");
+		err = PTR_ERR(mdev->reg_base);
+		mdev->reg_base = NULL;
+		goto err_cleanup;
+	}
+
+	mdev->pclk = devm_clk_get(dev, "pclk");
+	if (IS_ERR(mdev->pclk)) {
+		DRM_ERROR("Get APB clk failed.\n");
+		err = PTR_ERR(mdev->pclk);
+		mdev->pclk = NULL;
+		goto err_cleanup;
+	}
+
+	/* Enable APB clock to access the registers */
+	clk_prepare_enable(mdev->pclk);
+
+	mdev->funcs = product->identify(mdev->reg_base, &mdev->chip);
+	if (!komeda_product_match(mdev, product->product_id)) {
+		DRM_ERROR("DT configured %x mismatch with real HW %x.\n",
+			  product->product_id,
+			  MALIDP_CORE_ID_PRODUCT_ID(mdev->chip.core_id));
+		err = -ENODEV;
+		goto err_cleanup;
+	}
+
+	DRM_INFO("Found ARM Mali-D%x version r%dp%d\n",
+		 MALIDP_CORE_ID_PRODUCT_ID(mdev->chip.core_id),
+		 MALIDP_CORE_ID_MAJOR(mdev->chip.core_id),
+		 MALIDP_CORE_ID_MINOR(mdev->chip.core_id));
+
+	mdev->funcs->init_format_table(mdev);
+
+	err = mdev->funcs->enum_resources(mdev);
+	if (err) {
+		DRM_ERROR("enumerate display resource failed.\n");
+		goto err_cleanup;
+	}
+
+	err = komeda_parse_dt(dev, mdev);
+	if (err) {
+		DRM_ERROR("parse device tree failed.\n");
+		goto err_cleanup;
+	}
+
+	return mdev;
+
+err_cleanup:
+	komeda_dev_destroy(mdev);
+	return ERR_PTR(err);
+}
+
+void komeda_dev_destroy(struct komeda_dev *mdev)
+{
+	struct device *dev = mdev->dev;
+	struct komeda_dev_funcs *funcs = mdev->funcs;
+	int i;
+
+	for (i = 0; i < mdev->n_pipelines; i++) {
+		komeda_pipeline_destroy(mdev, mdev->pipelines[i]);
+		mdev->pipelines[i] = NULL;
+	}
+
+	mdev->n_pipelines = 0;
+
+	if (funcs && funcs->cleanup)
+		funcs->cleanup(mdev);
+
+	if (mdev->reg_base) {
+		devm_iounmap(dev, mdev->reg_base);
+		mdev->reg_base = NULL;
+	}
+
+	if (mdev->mclk) {
+		devm_clk_put(dev, mdev->mclk);
+		mdev->mclk = NULL;
+	}
+
+	if (mdev->pclk) {
+		clk_disable_unprepare(mdev->pclk);
+		devm_clk_put(dev, mdev->pclk);
+		mdev->pclk = NULL;
+	}
+
+	devm_kfree(dev, mdev);
+}
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_dev.h b/drivers/gpu/drm/arm/display/komeda/komeda_dev.h
new file mode 100644
index 000000000000..0f77dead6a23
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_dev.h
@@ -0,0 +1,110 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+#ifndef _KOMEDA_DEV_H_
+#define _KOMEDA_DEV_H_
+
+#include <linux/device.h>
+#include <linux/clk.h>
+#include "komeda_pipeline.h"
+#include "malidp_product.h"
+#include "komeda_format_caps.h"
+
+/* malidp device id */
+enum {
+	MALI_D71 = 0,
+};
+
+/* pipeline DT ports */
+enum {
+	KOMEDA_OF_PORT_OUTPUT		= 0,
+	KOMEDA_OF_PORT_COPROC		= 1,
+};
+
+struct komeda_chip_info {
+	u32 arch_id;
+	u32 core_id;
+	u32 core_info;
+	u32 bus_width;
+};
+
+struct komeda_product_data {
+	u32 product_id;
+	struct komeda_dev_funcs *(*identify)(u32 __iomem *reg,
+					     struct komeda_chip_info *info);
+};
+
+struct komeda_dev;
+
+/**
+ * struct komeda_dev_funcs
+ *
+ * Supplied by the chip level and returned by the chip entry function xxx_identify().
+ */
+struct komeda_dev_funcs {
+	/**
+	 * @init_format_table:
+	 *
+	 * initialize &komeda_dev->format_table, this function should be called
+	 * before the &enum_resource
+	 */
+	void (*init_format_table)(struct komeda_dev *mdev);
+	/**
+	 * @enum_resources:
+	 *
+	 * for the CHIP to report or add pipeline and component resources to the CORE
+	 */
+	int (*enum_resources)(struct komeda_dev *mdev);
+	/** @cleanup: call to chip to cleanup komeda_dev->chip data */
+	void (*cleanup)(struct komeda_dev *mdev);
+};
+
+/**
+ * struct komeda_dev
+ *
+ * Pipelines and components are used to describe how to handle the pixel data.
+ * komeda_dev describes the whole view of the device and its control
+ * abilities.
+ */
+struct komeda_dev {
+	struct device *dev;
+	u32 __iomem   *reg_base;
+
+	struct komeda_chip_info chip;
+	/** @fmt_tbl: initialized by &komeda_dev_funcs->init_format_table */
+	struct komeda_format_caps_table fmt_tbl;
+	/** @pclk: APB clock for register access */
+	struct clk *pclk;
+	/** @mclk: HW main engine clock */
+	struct clk *mclk;
+
+	int n_pipelines;
+	struct komeda_pipeline *pipelines[KOMEDA_MAX_PIPELINES];
+
+	/** @funcs: chip funcs to access HW */
+	struct komeda_dev_funcs *funcs;
+	/**
+	 * @chip_data:
+	 *
+	 * chip data will be added by &komeda_dev_funcs.enum_resources() and
+	 * destroyed by &komeda_dev_funcs.cleanup()
+	 */
+	void *chip_data;
+};
+
+static inline bool
+komeda_product_match(struct komeda_dev *mdev, u32 target)
+{
+	return MALIDP_CORE_ID_PRODUCT_ID(mdev->chip.core_id) == target;
+}
+
+struct komeda_dev_funcs *
+d71_identify(u32 __iomem *reg, struct komeda_chip_info *chip);
+
+struct komeda_dev *komeda_dev_create(struct device *dev);
+void komeda_dev_destroy(struct komeda_dev *mdev);
+
+#endif /*_KOMEDA_DEV_H_*/
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_drv.c b/drivers/gpu/drm/arm/display/komeda/komeda_drv.c
new file mode 100644
index 000000000000..2bdd189b041d
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_drv.c
@@ -0,0 +1,144 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/platform_device.h>
+#include <linux/component.h>
+#include <drm/drm_of.h>
+#include "komeda_dev.h"
+#include "komeda_kms.h"
+
+struct komeda_drv {
+	struct komeda_dev *mdev;
+	struct komeda_kms_dev *kms;
+};
+
+static void komeda_unbind(struct device *dev)
+{
+	struct komeda_drv *mdrv = dev_get_drvdata(dev);
+
+	if (!mdrv)
+		return;
+
+	komeda_kms_detach(mdrv->kms);
+	komeda_dev_destroy(mdrv->mdev);
+
+	dev_set_drvdata(dev, NULL);
+	devm_kfree(dev, mdrv);
+}
+
+static int komeda_bind(struct device *dev)
+{
+	struct komeda_drv *mdrv;
+	int err;
+
+	mdrv = devm_kzalloc(dev, sizeof(*mdrv), GFP_KERNEL);
+	if (!mdrv)
+		return -ENOMEM;
+
+	mdrv->mdev = komeda_dev_create(dev);
+	if (IS_ERR(mdrv->mdev)) {
+		err = PTR_ERR(mdrv->mdev);
+		goto free_mdrv;
+	}
+
+	mdrv->kms = komeda_kms_attach(mdrv->mdev);
+	if (IS_ERR(mdrv->kms)) {
+		err = PTR_ERR(mdrv->kms);
+		goto destroy_mdev;
+	}
+
+	dev_set_drvdata(dev, mdrv);
+
+	return 0;
+
+destroy_mdev:
+	komeda_dev_destroy(mdrv->mdev);
+
+free_mdrv:
+	devm_kfree(dev, mdrv);
+	return err;
+}
+
+static const struct component_master_ops komeda_master_ops = {
+	.bind	= komeda_bind,
+	.unbind	= komeda_unbind,
+};
+
+static int compare_of(struct device *dev, void *data)
+{
+	return dev->of_node == data;
+}
+
+static void komeda_add_slave(struct device *master,
+			     struct component_match **match,
+			     struct device_node *np, int port)
+{
+	struct device_node *remote;
+
+	remote = of_graph_get_remote_node(np, port, 0);
+	if (remote) {
+		drm_of_component_match_add(master, match, compare_of, remote);
+		of_node_put(remote);
+	}
+}
+
+static int komeda_platform_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct component_match *match = NULL;
+	struct device_node *child;
+
+	if (!dev->of_node)
+		return -ENODEV;
+
+	for_each_available_child_of_node(dev->of_node, child) {
+		if (of_node_cmp(child->name, "pipeline") != 0)
+			continue;
+
+		/* add connector */
+		komeda_add_slave(dev, &match, child, KOMEDA_OF_PORT_OUTPUT);
+	}
+
+	return component_master_add_with_match(dev, &komeda_master_ops, match);
+}
+
+static int komeda_platform_remove(struct platform_device *pdev)
+{
+	component_master_del(&pdev->dev, &komeda_master_ops);
+	return 0;
+}
+
+static const struct komeda_product_data komeda_products[] = {
+	[MALI_D71] = {
+		.product_id = MALIDP_D71_PRODUCT_ID,
+		.identify = d71_identify,
+	},
+};
+
+const struct of_device_id komeda_of_match[] = {
+	{ .compatible = "arm,mali-d71", .data = &komeda_products[MALI_D71], },
+	{},
+};
+
+MODULE_DEVICE_TABLE(of, komeda_of_match);
+
+static struct platform_driver komeda_platform_driver = {
+	.probe	= komeda_platform_probe,
+	.remove	= komeda_platform_remove,
+	.driver	= {
+		.name = "komeda",
+		.of_match_table	= komeda_of_match,
+		.pm = NULL,
+	},
+};
+
+module_platform_driver(komeda_platform_driver);
+
+MODULE_AUTHOR("James.Qian.Wang <james.qian.wang@arm.com>");
+MODULE_DESCRIPTION("Komeda KMS driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_format_caps.c b/drivers/gpu/drm/arm/display/komeda/komeda_format_caps.c
new file mode 100644
index 000000000000..1e17bd6107a4
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_format_caps.c
@@ -0,0 +1,75 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+
+#include <linux/slab.h>
+#include "komeda_format_caps.h"
+#include "malidp_utils.h"
+
+const struct komeda_format_caps *
+komeda_get_format_caps(struct komeda_format_caps_table *table,
+		       u32 fourcc, u64 modifier)
+{
+	const struct komeda_format_caps *caps;
+	u64 afbc_features = modifier & ~(AFBC_FORMAT_MOD_BLOCK_SIZE_MASK);
+	u32 afbc_layout = modifier & AFBC_FORMAT_MOD_BLOCK_SIZE_MASK;
+	int id;
+
+	for (id = 0; id < table->n_formats; id++) {
+		caps = &table->format_caps[id];
+
+		if (fourcc != caps->fourcc)
+			continue;
+
+		if ((modifier == 0ULL) && (caps->supported_afbc_layouts == 0))
+			return caps;
+
+		if (has_bits(afbc_features, caps->supported_afbc_features) &&
+		    has_bit(afbc_layout, caps->supported_afbc_layouts))
+			return caps;
+	}
+
+	return NULL;
+}
+
+u32 *komeda_get_layer_fourcc_list(struct komeda_format_caps_table *table,
+				  u32 layer_type, u32 *n_fmts)
+{
+	const struct komeda_format_caps *cap;
+	u32 *fmts;
+	int i, j, n = 0;
+
+	fmts = kcalloc(table->n_formats, sizeof(u32), GFP_KERNEL);
+	if (!fmts)
+		return NULL;
+
+	for (i = 0; i < table->n_formats; i++) {
+		cap = &table->format_caps[i];
+		if (!(layer_type & cap->supported_layer_types) ||
+		    (cap->fourcc == 0))
+			continue;
+
+		/* one fourcc may have two caps items in the table (afbc/non-afbc),
+		 * so check the existing list to avoid adding a duplicate.
+		 */
+		for (j = n - 1; j >= 0; j--)
+			if (fmts[j] == cap->fourcc)
+				break;
+
+		if (j < 0)
+			fmts[n++] = cap->fourcc;
+	}
+
+	if (n_fmts)
+		*n_fmts = n;
+
+	return fmts;
+}
+
+void komeda_put_fourcc_list(u32 *fourcc_list)
+{
+	kfree(fourcc_list);
+}
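
Typical usage, as a hedged sketch: build the de-duplicated fourcc list for a
rich layer and hand it to the plane init code (the plane code is still stubbed
out in this patch, so the consumer shown here is illustrative):

	u32 n_formats, *formats;

	formats = komeda_get_layer_fourcc_list(&mdev->fmt_tbl,
					       KOMEDA_FMT_RICH_LAYER,
					       &n_formats);
	if (!formats)
		return -ENOMEM;

	/* ... pass formats/n_formats to drm_universal_plane_init() ... */

	komeda_put_fourcc_list(formats);
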
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_format_caps.h b/drivers/gpu/drm/arm/display/komeda/komeda_format_caps.h
new file mode 100644
index 000000000000..60f39e77b098
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_format_caps.h
@@ -0,0 +1,89 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+
+#ifndef _KOMEDA_FORMAT_CAPS_H_
+#define _KOMEDA_FORMAT_CAPS_H_
+
+#include <linux/types.h>
+#include <uapi/drm/drm_fourcc.h>
+#include <drm/drm_fourcc.h>
+
+#define AFBC(x)		DRM_FORMAT_MOD_ARM_AFBC(x)
+
+/* afbc layout */
+#define AFBC_16x16(x)	AFBC(AFBC_FORMAT_MOD_BLOCK_SIZE_16x16 | (x))
+#define AFBC_32x8(x)	AFBC(AFBC_FORMAT_MOD_BLOCK_SIZE_32x8 | (x))
+
+/* afbc features */
+#define _YTR		AFBC_FORMAT_MOD_YTR
+#define _SPLIT		AFBC_FORMAT_MOD_SPLIT
+#define _SPARSE		AFBC_FORMAT_MOD_SPARSE
+#define _CBR		AFBC_FORMAT_MOD_CBR
+#define _TILED		AFBC_FORMAT_MOD_TILED
+#define _SC		AFBC_FORMAT_MOD_SC
+
+/* layer_type */
+#define KOMEDA_FMT_RICH_LAYER		BIT(0)
+#define KOMEDA_FMT_SIMPLE_LAYER		BIT(1)
+#define KOMEDA_FMT_WB_LAYER		BIT(2)
+
+#define AFBC_TH_LAYOUT_ALIGNMENT	8
+#define AFBC_HEADER_SIZE		16
+#define AFBC_SUPERBLK_ALIGNMENT		128
+#define AFBC_SUPERBLK_PIXELS		256
+#define AFBC_BODY_START_ALIGNMENT	1024
+#define AFBC_TH_BODY_START_ALIGNMENT	4096
+
+/**
+ * struct komeda_format_caps
+ *
+ * komeda_format_caps describes ARM display specific features and limitations
+ * for a specific format; format_caps will be linked into &komeda_framebuffer
+ * like an extension of &drm_format_info.
+ *
+ * NOTE: one fourcc may have two different format_caps items, one for the
+ * bare fourcc and one for fourcc+modifier
+ *
+ * @hw_id: hw format id, hw specific value.
+ * @fourcc: drm fourcc format.
+ * @tile_size: format tiled size, used by ARM format X0L0/X0L2
+ * @supported_layer_types: indicate which layer supports this format
+ * @supported_rots: allowed rotations for this format
+ * @supported_afbc_layouts: supported afbc layouts
+ * @supported_afbc_features: supported afbc features
+ */
+struct komeda_format_caps {
+	u32 hw_id;
+	u32 fourcc;
+	u32 tile_size;
+	u32 supported_layer_types;
+	u32 supported_rots;
+	u32 supported_afbc_layouts;
+	u64 supported_afbc_features;
+};
+
+/**
+ * struct komeda_format_caps_table - format_caps manager
+ *
+ * @n_formats: the size of format_caps list.
+ * @format_caps: format_caps list.
+ */
+struct komeda_format_caps_table {
+	u32 n_formats;
+	const struct komeda_format_caps *format_caps;
+};
+
+const struct komeda_format_caps *
+komeda_get_format_caps(struct komeda_format_caps_table *table,
+		       u32 fourcc, u64 modifier);
+
+u32 *komeda_get_layer_fourcc_list(struct komeda_format_caps_table *table,
+				  u32 layer_type, u32 *n_fmts);
+
+void komeda_put_fourcc_list(u32 *fourcc_list);
+
+#endif
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_framebuffer.c b/drivers/gpu/drm/arm/display/komeda/komeda_framebuffer.c
new file mode 100644
index 000000000000..9cc9935024f7
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_framebuffer.c
@@ -0,0 +1,167 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+#include <drm/drm_device.h>
+#include <drm/drm_fb_cma_helper.h>
+#include <drm/drm_gem.h>
+#include <drm/drm_gem_cma_helper.h>
+#include <drm/drm_gem_framebuffer_helper.h>
+
+#include "komeda_framebuffer.h"
+#include "komeda_dev.h"
+
+static void komeda_fb_destroy(struct drm_framebuffer *fb)
+{
+	struct komeda_fb *kfb = to_kfb(fb);
+	u32 i;
+
+	for (i = 0; i < fb->format->num_planes; i++)
+		drm_gem_object_put_unlocked(fb->obj[i]);
+
+	drm_framebuffer_cleanup(fb);
+	kfree(kfb);
+}
+
+static int komeda_fb_create_handle(struct drm_framebuffer *fb,
+				   struct drm_file *file, u32 *handle)
+{
+	return drm_gem_handle_create(file, fb->obj[0], handle);
+}
+
+static const struct drm_framebuffer_funcs komeda_fb_funcs = {
+	.destroy	= komeda_fb_destroy,
+	.create_handle	= komeda_fb_create_handle,
+};
+
+static int
+komeda_fb_none_afbc_size_check(struct komeda_dev *mdev, struct komeda_fb *kfb,
+			       struct drm_file *file,
+			       const struct drm_mode_fb_cmd2 *mode_cmd)
+{
+	struct drm_framebuffer *fb = &kfb->base;
+	struct drm_gem_object *obj;
+	u32 min_size = 0;
+	u32 i;
+
+	for (i = 0; i < fb->format->num_planes; i++) {
+		obj = drm_gem_object_lookup(file, mode_cmd->handles[i]);
+		if (!obj) {
+			DRM_DEBUG_KMS("Failed to lookup GEM object\n");
+			fb->obj[i] = NULL;
+
+			return -ENOENT;
+		}
+
+		kfb->aligned_w = fb->width / (i ? fb->format->hsub : 1);
+		kfb->aligned_h = fb->height / (i ? fb->format->vsub : 1);
+
+		if (fb->pitches[i] % mdev->chip.bus_width) {
+			DRM_DEBUG_KMS("Pitch[%d]: 0x%x doesn't align to 0x%x\n",
+				      i, fb->pitches[i], mdev->chip.bus_width);
+			drm_gem_object_put_unlocked(obj);
+			fb->obj[i] = NULL;
+
+			return -EINVAL;
+		}
+
+		min_size = ((kfb->aligned_h / kfb->format_caps->tile_size - 1)
+			    * fb->pitches[i])
+			    + (kfb->aligned_w * fb->format->cpp[i]
+			       * kfb->format_caps->tile_size)
+			    + fb->offsets[i];
+
+		if (obj->size < min_size) {
+			DRM_DEBUG_KMS("Fail to check none afbc fb size.\n");
+			drm_gem_object_put_unlocked(obj);
+			fb->obj[i] = NULL;
+
+			return -EINVAL;
+		}
+
+		fb->obj[i] = obj;
+	}
+
+	if (fb->format->num_planes == 3) {
+		if (fb->pitches[1] != fb->pitches[2]) {
+			DRM_DEBUG_KMS("The pitch[1] and [2] are not same\n");
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
+struct drm_framebuffer *
+komeda_fb_create(struct drm_device *dev, struct drm_file *file,
+		 const struct drm_mode_fb_cmd2 *mode_cmd)
+{
+	struct komeda_dev *mdev = dev->dev_private;
+	struct komeda_fb *kfb;
+	int ret = 0, i;
+
+	kfb = kzalloc(sizeof(*kfb), GFP_KERNEL);
+	if (!kfb)
+		return ERR_PTR(-ENOMEM);
+
+	kfb->format_caps = komeda_get_format_caps(&mdev->fmt_tbl,
+						  mode_cmd->pixel_format,
+						  mode_cmd->modifier[0]);
+	if (!kfb->format_caps) {
+		DRM_DEBUG_KMS("FMT %x is not supported.\n",
+			      mode_cmd->pixel_format);
+		kfree(kfb);
+		return ERR_PTR(-EINVAL);
+	}
+
+	drm_helper_mode_fill_fb_struct(dev, &kfb->base, mode_cmd);
+
+	ret = komeda_fb_none_afbc_size_check(mdev, kfb, file, mode_cmd);
+	if (ret < 0)
+		goto err_cleanup;
+
+	ret = drm_framebuffer_init(dev, &kfb->base, &komeda_fb_funcs);
+	if (ret < 0) {
+		DRM_DEBUG_KMS("failed to initialize fb\n");
+
+		goto err_cleanup;
+	}
+
+	return &kfb->base;
+
+err_cleanup:
+	for (i = 0; i < kfb->base.format->num_planes; i++)
+		drm_gem_object_put_unlocked(kfb->base.obj[i]);
+
+	kfree(kfb);
+	return ERR_PTR(ret);
+}
+
+dma_addr_t
+komeda_fb_get_pixel_addr(struct komeda_fb *kfb, int x, int y, int plane)
+{
+	struct drm_framebuffer *fb = &kfb->base;
+	const struct drm_gem_cma_object *obj;
+	u32 plane_x, plane_y, cpp, pitch, offset;
+
+	if (plane >= fb->format->num_planes) {
+		DRM_DEBUG_KMS("Out of max plane num.\n");
+		return -EINVAL;
+	}
+
+	obj = drm_fb_cma_get_gem_obj(fb, plane);
+
+	offset = fb->offsets[plane];
+	if (!fb->modifier) {
+		plane_x = x / (plane ? fb->format->hsub : 1);
+		plane_y = y / (plane ? fb->format->vsub : 1);
+		cpp = fb->format->cpp[plane];
+		pitch = fb->pitches[plane];
+		offset += plane_x * cpp *  kfb->format_caps->tile_size +
+				(plane_y * pitch) / kfb->format_caps->tile_size;
+	}
+
+	return obj->paddr + offset;
+}
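
For a linear buffer (tile_size == 1) the math above reduces to the usual
offset + y * pitch + x * cpp. A worked example with hypothetical values:
XRGB8888 (cpp = 4), pitches[0] = 4096, offsets[0] = 0 and (x, y) = (10, 2)
yields paddr + 2 * 4096 + 10 * 4 = paddr + 8232.
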
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_framebuffer.h b/drivers/gpu/drm/arm/display/komeda/komeda_framebuffer.h
new file mode 100644
index 000000000000..0de2e4a2afd2
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_framebuffer.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+#ifndef _KOMEDA_FRAMEBUFFER_H_
+#define _KOMEDA_FRAMEBUFFER_H_
+
+#include <drm/drm_framebuffer.h>
+#include "komeda_format_caps.h"
+
+/** struct komeda_fb - extend drm_framebuffer with komeda attributes */
+struct komeda_fb {
+	/** @base: &drm_framebuffer */
+	struct drm_framebuffer base;
+	/** @format_caps: &komeda_format_caps */
+	const struct komeda_format_caps *format_caps;
+	/** @aligned_w: aligned frame buffer width */
+	u32 aligned_w;
+	/** @aligned_h: aligned frame buffer height */
+	u32 aligned_h;
+};
+
+#define to_kfb(dfb)	container_of(dfb, struct komeda_fb, base)
+
+struct drm_framebuffer *
+komeda_fb_create(struct drm_device *dev, struct drm_file *file,
+		 const struct drm_mode_fb_cmd2 *mode_cmd);
+dma_addr_t
+komeda_fb_get_pixel_addr(struct komeda_fb *kfb, int x, int y, int plane);
+bool komeda_fb_is_layer_supported(struct komeda_fb *kfb, u32 layer_type);
+
+#endif
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_kms.c b/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
new file mode 100644
index 000000000000..47a58ab20434
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_kms.c
@@ -0,0 +1,171 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+#include <linux/component.h>
+#include <linux/interrupt.h>
+
+#include <drm/drm_atomic.h>
+#include <drm/drm_atomic_helper.h>
+#include <drm/drm_drv.h>
+#include <drm/drm_fb_helper.h>
+#include <drm/drm_gem_cma_helper.h>
+#include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_vblank.h>
+
+#include "komeda_dev.h"
+#include "komeda_framebuffer.h"
+#include "komeda_kms.h"
+
+DEFINE_DRM_GEM_CMA_FOPS(komeda_cma_fops);
+
+static int komeda_gem_cma_dumb_create(struct drm_file *file,
+				      struct drm_device *dev,
+				      struct drm_mode_create_dumb *args)
+{
+	u32 alignment = 16; /* TODO get alignment from dev */
+
+	args->pitch = ALIGN(DIV_ROUND_UP(args->width * args->bpp, 8),
+			    alignment);
+
+	return drm_gem_cma_dumb_create_internal(file, dev, args);
+}
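
A quick sanity check of the pitch math above: width = 1366 at bpp = 32 gives
DIV_ROUND_UP(1366 * 32, 8) = 5464 bytes, which ALIGN() rounds up to 5472 for
the hard-coded 16-byte alignment.
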
+
+static struct drm_driver komeda_kms_driver = {
+	.driver_features = DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC |
+			   DRIVER_PRIME,
+	.lastclose			= drm_fb_helper_lastclose,
+	.gem_free_object_unlocked	= drm_gem_cma_free_object,
+	.gem_vm_ops			= &drm_gem_cma_vm_ops,
+	.dumb_create			= komeda_gem_cma_dumb_create,
+	.prime_handle_to_fd		= drm_gem_prime_handle_to_fd,
+	.prime_fd_to_handle		= drm_gem_prime_fd_to_handle,
+	.gem_prime_export		= drm_gem_prime_export,
+	.gem_prime_import		= drm_gem_prime_import,
+	.gem_prime_get_sg_table		= drm_gem_cma_prime_get_sg_table,
+	.gem_prime_import_sg_table	= drm_gem_cma_prime_import_sg_table,
+	.gem_prime_vmap			= drm_gem_cma_prime_vmap,
+	.gem_prime_vunmap		= drm_gem_cma_prime_vunmap,
+	.gem_prime_mmap			= drm_gem_cma_prime_mmap,
+	.fops = &komeda_cma_fops,
+	.name = "komeda",
+	.desc = "Arm Komeda Display Processor driver",
+	.date = "20181101",
+	.major = 0,
+	.minor = 1,
+};
+
+static void komeda_kms_commit_tail(struct drm_atomic_state *old_state)
+{
+	struct drm_device *dev = old_state->dev;
+
+	drm_atomic_helper_commit_modeset_disables(dev, old_state);
+
+	drm_atomic_helper_commit_planes(dev, old_state, 0);
+
+	drm_atomic_helper_commit_modeset_enables(dev, old_state);
+
+	drm_atomic_helper_wait_for_flip_done(dev, old_state);
+
+	drm_atomic_helper_commit_hw_done(old_state);
+
+	drm_atomic_helper_cleanup_planes(dev, old_state);
+}
+
+static const struct drm_mode_config_helper_funcs komeda_mode_config_helpers = {
+	.atomic_commit_tail = komeda_kms_commit_tail,
+};
+
+static const struct drm_mode_config_funcs komeda_mode_config_funcs = {
+	.fb_create		= komeda_fb_create,
+	.atomic_check		= drm_atomic_helper_check,
+	.atomic_commit		= drm_atomic_helper_commit,
+};
+
+static void komeda_kms_mode_config_init(struct komeda_kms_dev *kms,
+					struct komeda_dev *mdev)
+{
+	struct drm_mode_config *config = &kms->base.mode_config;
+
+	drm_mode_config_init(&kms->base);
+
+	komeda_kms_setup_crtcs(kms, mdev);
+
+	/* TODO: get these values from dev */
+	config->min_width	= 0;
+	config->min_height	= 0;
+	config->max_width	= 4096;
+	config->max_height	= 4096;
+	config->allow_fb_modifiers = false;
+
+	config->funcs = &komeda_mode_config_funcs;
+	config->helper_private = &komeda_mode_config_helpers;
+}
+
+struct komeda_kms_dev *komeda_kms_attach(struct komeda_dev *mdev)
+{
+	struct komeda_kms_dev *kms = kzalloc(sizeof(*kms), GFP_KERNEL);
+	struct drm_device *drm;
+	int err;
+
+	if (!kms)
+		return ERR_PTR(-ENOMEM);
+
+	drm = &kms->base;
+	err = drm_dev_init(drm, &komeda_kms_driver, mdev->dev);
+	if (err)
+		goto free_kms;
+
+	drm->dev_private = mdev;
+
+	komeda_kms_mode_config_init(kms, mdev);
+
+	err = komeda_kms_add_private_objs(kms, mdev);
+	if (err)
+		goto cleanup_mode_config;
+
+	err = komeda_kms_add_planes(kms, mdev);
+	if (err)
+		goto cleanup_mode_config;
+
+	err = drm_vblank_init(drm, kms->n_crtcs);
+	if (err)
+		goto cleanup_mode_config;
+
+	err = komeda_kms_add_crtcs(kms, mdev);
+	if (err)
+		goto cleanup_mode_config;
+
+	err = component_bind_all(mdev->dev, kms);
+	if (err)
+		goto cleanup_mode_config;
+
+	drm_mode_config_reset(drm);
+
+	err = drm_dev_register(drm, 0);
+	if (err)
+		goto cleanup_mode_config;
+
+	return kms;
+
+cleanup_mode_config:
+	drm_mode_config_cleanup(drm);
+free_kms:
+	kfree(kms);
+	return ERR_PTR(err);
+}
+
+void komeda_kms_detach(struct komeda_kms_dev *kms)
+{
+	struct drm_device *drm = &kms->base;
+	struct komeda_dev *mdev = drm->dev_private;
+
+	drm_dev_unregister(drm);
+	component_unbind_all(mdev->dev, drm);
+	komeda_kms_cleanup_private_objs(mdev);
+	drm_mode_config_cleanup(drm);
+	drm->dev_private = NULL;
+	drm_dev_put(drm);
+}
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_kms.h b/drivers/gpu/drm/arm/display/komeda/komeda_kms.h
new file mode 100644
index 000000000000..874e9c9f0749
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_kms.h
@@ -0,0 +1,114 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+#ifndef _KOMEDA_KMS_H_
+#define _KOMEDA_KMS_H_
+
+#include <drm/drm_atomic.h>
+#include <drm/drm_atomic_helper.h>
+#include <drm/drm_crtc_helper.h>
+#include <drm/drm_device.h>
+#include <drm/drm_writeback.h>
+
+/** struct komeda_plane - komeda instance of drm_plane */
+struct komeda_plane {
+	/** @base: &drm_plane */
+	struct drm_plane base;
+	/**
+	 * @layer:
+	 *
+	 * represents available layer input pipelines for this plane.
+	 *
+	 * NOTE:
+	 * the layer is not for a specific Layer, but indicate a group of
+	 * Layers with same capabilities.
+	 */
+	struct komeda_layer *layer;
+};
+
+/**
+ * struct komeda_plane_state
+ *
+ * The plane_state can be split into two data flow (left/right) and handled
+ * by two layers &komeda_plane.layer and &komeda_plane.layer.right
+ */
+struct komeda_plane_state {
+	/** @base: &drm_plane_state */
+	struct drm_plane_state base;
+
+	/* private properties */
+};
+
+/**
+ * struct komeda_wb_connector
+ */
+struct komeda_wb_connector {
+	/** @base: &drm_writeback_connector */
+	struct drm_writeback_connector base;
+
+	/** @wb_layer: represents associated writeback pipeline of komeda */
+	struct komeda_layer *wb_layer;
+};
+
+/**
+ * struct komeda_crtc
+ */
+struct komeda_crtc {
+	/** @base: &drm_crtc */
+	struct drm_crtc base;
+	/** @master: only master has display output */
+	struct komeda_pipeline *master;
+	/**
+	 * @slave: optional
+	 *
+	 * Doesn't have its own display output, the handled data flow will
+	 * merge into the master.
+	 */
+	struct komeda_pipeline *slave;
+};
+
+/** struct komeda_crtc_state */
+struct komeda_crtc_state {
+	/** @base: &drm_crtc_state */
+	struct drm_crtc_state base;
+
+	/* private properties */
+
+	/* computed state which are used by validate/check */
+	u32 affected_pipes;
+	u32 active_pipes;
+};
+
+/** struct komeda_kms_dev - for gathering KMS-related things */
+struct komeda_kms_dev {
+	/** @base: &drm_device */
+	struct drm_device base;
+
+	/** @n_crtcs: valid numbers of crtcs in &komeda_kms_dev.crtcs */
+	int n_crtcs;
+	/** @crtcs: crtcs list */
+	struct komeda_crtc crtcs[KOMEDA_MAX_PIPELINES];
+};
+
+#define to_kplane(p)	container_of(p, struct komeda_plane, base)
+#define to_kplane_st(p)	container_of(p, struct komeda_plane_state, base)
+#define to_kconn(p)	container_of(p, struct komeda_wb_connector, base)
+#define to_kcrtc(p)	container_of(p, struct komeda_crtc, base)
+#define to_kcrtc_st(p)	container_of(p, struct komeda_crtc_state, base)
+#define to_kdev(p)	container_of(p, struct komeda_kms_dev, base)
+
+int komeda_kms_setup_crtcs(struct komeda_kms_dev *kms, struct komeda_dev *mdev);
+
+int komeda_kms_add_crtcs(struct komeda_kms_dev *kms, struct komeda_dev *mdev);
+int komeda_kms_add_planes(struct komeda_kms_dev *kms, struct komeda_dev *mdev);
+int komeda_kms_add_private_objs(struct komeda_kms_dev *kms,
+				struct komeda_dev *mdev);
+void komeda_kms_cleanup_private_objs(struct komeda_dev *mdev);
+
+struct komeda_kms_dev *komeda_kms_attach(struct komeda_dev *mdev);
+void komeda_kms_detach(struct komeda_kms_dev *kms);
+
+#endif /*_KOMEDA_KMS_H_*/
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_pipeline.c b/drivers/gpu/drm/arm/display/komeda/komeda_pipeline.c
new file mode 100644
index 000000000000..f1908e9ef128
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_pipeline.c
@@ -0,0 +1,202 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+#include <drm/drm_print.h>
+
+#include "komeda_dev.h"
+#include "komeda_pipeline.h"
+
+/** komeda_pipeline_add - Add a pipeline to &komeda_dev */
+struct komeda_pipeline *
+komeda_pipeline_add(struct komeda_dev *mdev, size_t size,
+		    struct komeda_pipeline_funcs *funcs)
+{
+	struct komeda_pipeline *pipe;
+
+	if (mdev->n_pipelines + 1 > KOMEDA_MAX_PIPELINES) {
+		DRM_ERROR("Exceed max support %d pipelines.\n",
+			  KOMEDA_MAX_PIPELINES);
+		return NULL;
+	}
+
+	if (size < sizeof(*pipe)) {
+		DRM_ERROR("Request pipeline size too small.\n");
+		return NULL;
+	}
+
+	pipe = devm_kzalloc(mdev->dev, size, GFP_KERNEL);
+	if (!pipe)
+		return NULL;
+
+	pipe->mdev = mdev;
+	pipe->id   = mdev->n_pipelines;
+	pipe->funcs = funcs;
+
+	mdev->pipelines[mdev->n_pipelines] = pipe;
+	mdev->n_pipelines++;
+
+	return pipe;
+}
+
+void komeda_pipeline_destroy(struct komeda_dev *mdev,
+			     struct komeda_pipeline *pipe)
+{
+	struct komeda_component *c;
+	int i;
+
+	dp_for_each_set_bit(i, pipe->avail_comps) {
+		c = komeda_pipeline_get_component(pipe, i);
+		komeda_component_destroy(mdev, c);
+	}
+
+	clk_put(pipe->pxlclk);
+	clk_put(pipe->aclk);
+
+	of_node_put(pipe->of_output_dev);
+	of_node_put(pipe->of_output_port);
+	of_node_put(pipe->of_node);
+
+	devm_kfree(mdev->dev, pipe);
+}
+
+struct komeda_component **
+komeda_pipeline_get_component_pos(struct komeda_pipeline *pipe, int id)
+{
+	struct komeda_dev *mdev = pipe->mdev;
+	struct komeda_pipeline *temp = NULL;
+	struct komeda_component **pos = NULL;
+
+	switch (id) {
+	case KOMEDA_COMPONENT_LAYER0:
+	case KOMEDA_COMPONENT_LAYER1:
+	case KOMEDA_COMPONENT_LAYER2:
+	case KOMEDA_COMPONENT_LAYER3:
+		pos = to_cpos(pipe->layers[id - KOMEDA_COMPONENT_LAYER0]);
+		break;
+	case KOMEDA_COMPONENT_WB_LAYER:
+		pos = to_cpos(pipe->wb_layer);
+		break;
+	case KOMEDA_COMPONENT_COMPIZ0:
+	case KOMEDA_COMPONENT_COMPIZ1:
+		temp = mdev->pipelines[id - KOMEDA_COMPONENT_COMPIZ0];
+		if (!temp) {
+			DRM_ERROR("compiz-%d doesn't exist.\n", id);
+			return NULL;
+		}
+		pos = to_cpos(temp->compiz);
+		break;
+	case KOMEDA_COMPONENT_SCALER0:
+	case KOMEDA_COMPONENT_SCALER1:
+		pos = to_cpos(pipe->scalers[id - KOMEDA_COMPONENT_SCALER0]);
+		break;
+	case KOMEDA_COMPONENT_IPS0:
+	case KOMEDA_COMPONENT_IPS1:
+		temp = mdev->pipelines[id - KOMEDA_COMPONENT_IPS0];
+		if (!temp) {
+			DRM_ERROR("ips-%d doesn't exist.\n", id);
+			return NULL;
+		}
+		pos = to_cpos(temp->improc);
+		break;
+	case KOMEDA_COMPONENT_TIMING_CTRLR:
+		pos = to_cpos(pipe->ctrlr);
+		break;
+	default:
+		pos = NULL;
+		DRM_ERROR("Unknown pipeline resource ID: %d.\n", id);
+		break;
+	}
+
+	return pos;
+}
+
+struct komeda_component *
+komeda_pipeline_get_component(struct komeda_pipeline *pipe, int id)
+{
+	struct komeda_component **pos = NULL;
+	struct komeda_component *c = NULL;
+
+	pos = komeda_pipeline_get_component_pos(pipe, id);
+	if (pos)
+		c = *pos;
+
+	return c;
+}
+
+/** komeda_component_add - Add a component to &komeda_pipeline */
+struct komeda_component *
+komeda_component_add(struct komeda_pipeline *pipe,
+		     size_t comp_sz, u32 id, u32 hw_id,
+		     struct komeda_component_funcs *funcs,
+		     u8 max_active_inputs, u32 supported_inputs,
+		     u8 max_active_outputs, u32 __iomem *reg,
+		     const char *name_fmt, ...)
+{
+	struct komeda_component **pos;
+	struct komeda_component *c;
+	int idx, *num = NULL;
+
+	if (max_active_inputs > KOMEDA_COMPONENT_N_INPUTS) {
+		WARN(1, "please large KOMEDA_COMPONENT_N_INPUTS to %d.\n",
+		     max_active_inputs);
+		return NULL;
+	}
+
+	pos = komeda_pipeline_get_component_pos(pipe, id);
+	if (!pos || (*pos))
+		return NULL;
+
+	if (has_bit(id, KOMEDA_PIPELINE_LAYERS)) {
+		idx = id - KOMEDA_COMPONENT_LAYER0;
+		num = &pipe->n_layers;
+		if (idx != pipe->n_layers) {
+			DRM_ERROR("please add Layer by id sequence.\n");
+			return NULL;
+		}
+	} else if (has_bit(id,  KOMEDA_PIPELINE_SCALERS)) {
+		idx = id - KOMEDA_COMPONENT_SCALER0;
+		num = &pipe->n_scalers;
+		if (idx != pipe->n_scalers) {
+			DRM_ERROR("please add Scaler by id sequence.\n");
+			return NULL;
+		}
+	}
+
+	c = devm_kzalloc(pipe->mdev->dev, comp_sz, GFP_KERNEL);
+	if (!c)
+		return NULL;
+
+	c->id = id;
+	c->hw_id = hw_id;
+	c->reg = reg;
+	c->pipeline = pipe;
+	c->max_active_inputs = max_active_inputs;
+	c->max_active_outputs = max_active_outputs;
+	c->supported_inputs = supported_inputs;
+	c->funcs = funcs;
+
+	if (name_fmt) {
+		va_list args;
+
+		va_start(args, name_fmt);
+		vsnprintf(c->name, sizeof(c->name), name_fmt, args);
+		va_end(args);
+	}
+
+	if (num)
+		*num = *num + 1;
+
+	pipe->avail_comps |= BIT(c->id);
+	*pos = c;
+
+	return c;
+}
+
+void komeda_component_destroy(struct komeda_dev *mdev,
+			      struct komeda_component *c)
+{
+	devm_kfree(mdev->dev, c);
+}
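
How chip code is expected to call komeda_component_add(), as a hedged sketch:
the hw id, register offset and d71_layer_funcs below are placeholders, not
real D71 values:

	struct komeda_component *c;

	c = komeda_component_add(pipe, sizeof(struct komeda_layer),
				 KOMEDA_COMPONENT_LAYER0,
				 0x10,			/* placeholder hw id */
				 &d71_layer_funcs,	/* placeholder funcs */
				 0, 0,		/* layers take no pipeline inputs */
				 1,		/* one output, towards a compiz */
				 reg_base + (0x100 >> 2), /* placeholder offset */
				 "LPU%d_LAYER%d", pipe->id, 0);
	if (!c)
		return -ENOMEM;
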
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h b/drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h
new file mode 100644
index 000000000000..8c950bc8ae96
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h
@@ -0,0 +1,359 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+#ifndef _KOMEDA_PIPELINE_H_
+#define _KOMEDA_PIPELINE_H_
+
+#include <linux/types.h>
+#include <drm/drm_atomic.h>
+#include <drm/drm_atomic_helper.h>
+#include "malidp_utils.h"
+
+#define KOMEDA_MAX_PIPELINES		2
+#define KOMEDA_PIPELINE_MAX_LAYERS	4
+#define KOMEDA_PIPELINE_MAX_SCALERS	2
+#define KOMEDA_COMPONENT_N_INPUTS	5
+
+/* pipeline component IDs */
+enum {
+	KOMEDA_COMPONENT_LAYER0		= 0,
+	KOMEDA_COMPONENT_LAYER1		= 1,
+	KOMEDA_COMPONENT_LAYER2		= 2,
+	KOMEDA_COMPONENT_LAYER3		= 3,
+	KOMEDA_COMPONENT_WB_LAYER	= 7, /* write back layer */
+	KOMEDA_COMPONENT_SCALER0	= 8,
+	KOMEDA_COMPONENT_SCALER1	= 9,
+	KOMEDA_COMPONENT_SPLITTER	= 12,
+	KOMEDA_COMPONENT_MERGER		= 14,
+	KOMEDA_COMPONENT_COMPIZ0	= 16, /* compositor */
+	KOMEDA_COMPONENT_COMPIZ1	= 17,
+	KOMEDA_COMPONENT_IPS0		= 20, /* post image processor */
+	KOMEDA_COMPONENT_IPS1		= 21,
+	KOMEDA_COMPONENT_TIMING_CTRLR	= 22, /* timing controller */
+};
+
+#define KOMEDA_PIPELINE_LAYERS		(BIT(KOMEDA_COMPONENT_LAYER0) |\
+					 BIT(KOMEDA_COMPONENT_LAYER1) |\
+					 BIT(KOMEDA_COMPONENT_LAYER2) |\
+					 BIT(KOMEDA_COMPONENT_LAYER3))
+
+#define KOMEDA_PIPELINE_SCALERS		(BIT(KOMEDA_COMPONENT_SCALER0) |\
+					 BIT(KOMEDA_COMPONENT_SCALER1))
+
+#define KOMEDA_PIPELINE_COMPIZS		(BIT(KOMEDA_COMPONENT_COMPIZ0) |\
+					 BIT(KOMEDA_COMPONENT_COMPIZ1))
+
+#define KOMEDA_PIPELINE_IMPROCS		(BIT(KOMEDA_COMPONENT_IPS0) |\
+					 BIT(KOMEDA_COMPONENT_IPS1))
+struct komeda_component;
+struct komeda_component_state;
+
+/** komeda_component_funcs - component control functions */
+struct komeda_component_funcs {
+	/** @validate: Optional.
+	 * A component may have special requirements or limitations; this
+	 * function gives the HW the ability to do further HW-specific checks.
+	 */
+	int (*validate)(struct komeda_component *c,
+			struct komeda_component_state *state);
+	/** @update: update is an active update */
+	void (*update)(struct komeda_component *c,
+		       struct komeda_component_state *state);
+	/** @disable: disable component */
+	void (*disable)(struct komeda_component *c);
+	/** @dump_register: Optional, dump registers to seq_file */
+	void (*dump_register)(struct komeda_component *c, struct seq_file *seq);
+};
+
+/**
+ * struct komeda_component
+ *
+ * struct komeda_component describes the data flow capabilities and how to
+ * link a component into the display pipeline.
+ * All specialized components are subclasses of this structure.
+ */
+struct komeda_component {
+	/** @obj: treat component as private obj */
+	struct drm_private_obj obj;
+	/** @pipeline: the komeda pipeline this component belongs to */
+	struct komeda_pipeline *pipeline;
+	/** @name: component name */
+	char name[32];
+	/**
+	 * @reg:
+	 * component register base,
+	 * which is initialized by chip and used by chip only
+	 */
+	u32 __iomem *reg;
+	/** @id: component id */
+	u32 id;
+	/** @hw_id: component hw id,
+	 *  which is initialized by chip and used by chip only
+	 */
+	u32 hw_id;
+
+	/**
+	 * @max_active_inputs:
+	 * @max_active_outputs:
+	 *
+	 * maximum number of inputs/outputs that can be active at the same time.
+	 * Note:
+	 * this isn't the number of bits set in @supported_inputs or
+	 * @supported_outputs; it may be less, since a component may not
+	 * support enabling all @supported_inputs/outputs at the same time.
+	 */
+	u8 max_active_inputs;
+	u8 max_active_outputs;
+	/**
+	 * @supported_inputs:
+	 * @supported_outputs:
+	 *
+	 * bitmask of BIT(component->id) for the supported inputs/outputs
+	 * describes the possibilities of how a component is linked into a
+	 * pipeline.
+	 */
+	u32 supported_inputs;
+	u32 supported_outputs;
+
+	/**
+	 * @funcs: chip functions to access HW
+	 */
+	struct komeda_component_funcs *funcs;
+};
+
+/**
+ * struct komeda_component_output
+ *
+ * a component may have multiple outputs; if we want to know where the data
+ * comes from, the component alone is not enough, we also need to know
+ * its output port
+ */
+struct komeda_component_output {
+	/** @component: indicate which component the data comes from */
+	struct komeda_component *component;
+	/** @output_port:
+	 * the output port of the &komeda_component_output.component
+	 */
+	u8 output_port;
+};
+
+/**
+ * struct komeda_component_state
+ *
+ * component_state is the data flow configuration of the component, and it's
+ * the superclass of all specific component states like &komeda_layer_state
+ * and &komeda_scaler_state
+ */
+struct komeda_component_state {
+	/** @obj: tracking component_state by drm_atomic_state */
+	struct drm_private_state obj;
+	struct komeda_component *component;
+	/**
+	 * @binding_user:
+	 * currently bound user; the user can be a crtc/plane/wb_conn, and
+	 * which one is valid is decided by @component and @inputs
+	 *
+	 * -  Layer: its user always is plane.
+	 * -  compiz/improc/timing_ctrlr: the user is crtc.
+	 * -  wb_layer: wb_conn;
+	 * -  scaler: plane when input is layer, wb_conn if input is compiz.
+	 */
+	union {
+		struct drm_crtc *crtc;
+		struct drm_plane *plane;
+		struct drm_connector *wb_conn;
+		void *binding_user;
+	};
+	/**
+	 * @active_inputs:
+	 *
+	 * active_inputs is a bitmask of @inputs indexes
+	 *
+	 * -  active_inputs = changed_active_inputs + unchanged_active_inputs
+	 * -  affected_inputs = old->active_inputs + new->active_inputs;
+	 * -  disabling_inputs = affected_inputs ^ active_inputs;
+	 * -  changed_inputs = disabling_inputs + changed_active_inputs;
+	 *
+	 * NOTE:
+	 * changed_inputs doesn't include all active inputs but only
+	 * @changed_active_inputs, and this bitmask can be used at chip
+	 * level for dirty updates.
+	 */
+	u16 active_inputs;
+	u16 changed_active_inputs;
+	u16 affected_inputs;
+	/**
+	 * @inputs:
+	 *
+	 * a specific inputs[i] is only valid when BIT(i) is set in
+	 * @active_inputs; otherwise inputs[i] is undefined.
+	 */
+	struct komeda_component_output inputs[KOMEDA_COMPONENT_N_INPUTS];
+};
+
+static inline u16 component_disabling_inputs(struct komeda_component_state *st)
+{
+	return st->affected_inputs ^ st->active_inputs;
+}
+
+static inline u16 component_changed_inputs(struct komeda_component_state *st)
+{
+	return component_disabling_inputs(st) | st->changed_active_inputs;
+}
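
A worked example of the bitmask algebra documented above: with
old->active_inputs = 0b0011 and new->active_inputs = 0b0110 (input 0 disabled,
input 2 enabled, input 1 kept), affected_inputs = 0b0111, so
component_disabling_inputs() returns 0b0111 ^ 0b0110 = 0b0001; and if input 2
is recorded in changed_active_inputs, component_changed_inputs() returns
0b0001 | 0b0100 = 0b0101.
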
+
+#define to_comp(__c)	(((__c) == NULL) ? NULL : &((__c)->base))
+#define to_cpos(__c)	((struct komeda_component **)&(__c))
+
+/* these structures are going to be filled in by future patches */
+struct komeda_layer {
+	struct komeda_component base;
+	/* layer specific features and caps */
+	int layer_type; /* RICH, SIMPLE or WB */
+};
+
+struct komeda_layer_state {
+	struct komeda_component_state base;
+	/* layer specific configuration state */
+};
+
+struct komeda_compiz {
+	struct komeda_component base;
+	/* compiz specific features and caps */
+};
+
+struct komeda_compiz_state {
+	struct komeda_component_state base;
+	/* compiz specific configuration state */
+};
+
+struct komeda_scaler {
+	struct komeda_component base;
+	/* scaler features and caps */
+};
+
+struct komeda_scaler_state {
+	struct komeda_component_state base;
+};
+
+struct komeda_improc {
+	struct komeda_component base;
+};
+
+struct komeda_improc_state {
+	struct komeda_component_state base;
+};
+
+/* display timing controller */
+struct komeda_timing_ctrlr {
+	struct komeda_component base;
+};
+
+struct komeda_timing_ctrlr_state {
+	struct komeda_component_state base;
+};
+
+/** struct komeda_pipeline_funcs */
+struct komeda_pipeline_funcs {
+	/** @dump_register: Optional, dump registers to seq_file */
+	void (*dump_register)(struct komeda_pipeline *pipe,
+			      struct seq_file *sf);
+};
+
+/**
+ * struct komeda_pipeline
+ *
+ * Represents a complete display pipeline and holds all functional components.
+ */
+struct komeda_pipeline {
+	/** @obj: link pipeline as private obj of drm_atomic_state */
+	struct drm_private_obj obj;
+	/** @mdev: the parent komeda_dev */
+	struct komeda_dev *mdev;
+	/** @pxlclk: pixel clock */
+	struct clk *pxlclk;
+	/** @aclk: AXI clock */
+	struct clk *aclk;
+	/** @id: pipeline id */
+	int id;
+	/** @avail_comps: available components mask of pipeline */
+	u32 avail_comps;
+	/** @n_layers: the number of layers in @layers */
+	int n_layers;
+	/** @layers: the pipeline's input layers */
+	struct komeda_layer *layers[KOMEDA_PIPELINE_MAX_LAYERS];
+	/** @n_scalers: the number of scalers in @scalers */
+	int n_scalers;
+	/** @scalers: the pipeline's scalers */
+	struct komeda_scaler *scalers[KOMEDA_PIPELINE_MAX_SCALERS];
+	/** @compiz: the pipeline's compositor */
+	struct komeda_compiz *compiz;
+	/** @wb_layer: the pipeline's writeback layer */
+	struct komeda_layer  *wb_layer;
+	/** @improc: the pipeline's image processor */
+	struct komeda_improc *improc;
+	/** @ctrlr: the pipeline's display timing controller */
+	struct komeda_timing_ctrlr *ctrlr;
+	/** @funcs: private pipeline functions */
+	struct komeda_pipeline_funcs *funcs;
+
+	/** @of_node: pipeline dt node */
+	struct device_node *of_node;
+	/** @of_output_port: pipeline output port */
+	struct device_node *of_output_port;
+	/** @of_output_dev: output connector device node */
+	struct device_node *of_output_dev;
+};
+
+/**
+ * struct komeda_pipeline_state
+ *
+ * NOTE:
+ * Unlike the pipeline, pipeline_state doesn't gather any component_state
+ * into it, because all component states are managed by drm_atomic_state.
+ */
+struct komeda_pipeline_state {
+	/** @obj: tracking pipeline_state by drm_atomic_state */
+	struct drm_private_state obj;
+	/** @pipe: the parent komeda_pipeline */
+	struct komeda_pipeline *pipe;
+	/** @crtc: currently bound crtc */
+	struct drm_crtc *crtc;
+	/**
+	 * @active_comps:
+	 *
+	 * bitmask - BIT(component->id) of active components
+	 */
+	u32 active_comps;
+};
+
+#define to_layer(c)	container_of(c, struct komeda_layer, base)
+#define to_compiz(c)	container_of(c, struct komeda_compiz, base)
+#define to_scaler(c)	container_of(c, struct komeda_scaler, base)
+#define to_improc(c)	container_of(c, struct komeda_improc, base)
+#define to_ctrlr(c)	container_of(c, struct komeda_timing_ctrlr, base)
+
+#define to_layer_st(c)	container_of(c, struct komeda_layer_state, base)
+#define to_compiz_st(c)	container_of(c, struct komeda_compiz_state, base)
+#define to_scaler_st(c) container_of(c, struct komeda_scaler_state, base)
+#define to_improc_st(c)	container_of(c, struct komeda_improc_state, base)
+#define to_ctrlr_st(c)	container_of(c, struct komeda_timing_ctrlr_state, base)
+
+#define priv_to_comp_st(o) container_of(o, struct komeda_component_state, obj)
+#define priv_to_pipe_st(o)  container_of(o, struct komeda_pipeline_state, obj)
+
+/* pipeline APIs */
+struct komeda_pipeline *
+komeda_pipeline_add(struct komeda_dev *mdev, size_t size,
+		    struct komeda_pipeline_funcs *funcs);
+void komeda_pipeline_destroy(struct komeda_dev *mdev,
+			     struct komeda_pipeline *pipe);
+
+struct komeda_component *
+komeda_pipeline_get_component(struct komeda_pipeline *pipe, int id);
+
+/* component APIs */
+struct komeda_component *
+komeda_component_add(struct komeda_pipeline *pipe,
+		     size_t comp_sz, u32 id, u32 hw_id,
+		     struct komeda_component_funcs *funcs,
+		     u8 max_active_inputs, u32 supported_inputs,
+		     u8 max_active_outputs, u32 __iomem *reg,
+		     const char *name_fmt, ...);
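+
+/*
+ * A hypothetical usage sketch (layer_hw_id, layer_funcs and layer_reg are
+ * placeholders, and the input/output limits are illustrative only):
+ *
+ *   struct komeda_layer *layer;
+ *
+ *   layer = (struct komeda_layer *)
+ *           komeda_component_add(pipe, sizeof(*layer),
+ *                                KOMEDA_COMPONENT_LAYER0, layer_hw_id,
+ *                                &layer_funcs, 0, 0, 1, layer_reg,
+ *                                "LAYER%d", 0);
+ */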
+
+void komeda_component_destroy(struct komeda_dev *mdev,
+			      struct komeda_component *c);
+
+#endif /* _KOMEDA_PIPELINE_H_ */
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_plane.c b/drivers/gpu/drm/arm/display/komeda/komeda_plane.c
new file mode 100644
index 000000000000..0a4953a9a909
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_plane.c
@@ -0,0 +1,109 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+#include <drm/drm_atomic.h>
+#include <drm/drm_atomic_helper.h>
+#include <drm/drm_plane_helper.h>
+#include "komeda_dev.h"
+#include "komeda_kms.h"
+
+static const struct drm_plane_helper_funcs komeda_plane_helper_funcs = {
+};
+
+static void komeda_plane_destroy(struct drm_plane *plane)
+{
+	drm_plane_cleanup(plane);
+
+	kfree(to_kplane(plane));
+}
+
+static const struct drm_plane_funcs komeda_plane_funcs = {
+};
+
+/* on komeda, a pipeline can be shared between crtcs */
+static u32 get_possible_crtcs(struct komeda_kms_dev *kms,
+			      struct komeda_pipeline *pipe)
+{
+	struct komeda_crtc *crtc;
+	u32 possible_crtcs = 0;
+	int i;
+
+	for (i = 0; i < kms->n_crtcs; i++) {
+		crtc = &kms->crtcs[i];
+
+		if ((pipe == crtc->master) || (pipe == crtc->slave))
+			possible_crtcs |= BIT(i);
+	}
+
+	return possible_crtcs;
+}
+
+/* use Layer0 as primary */
+static u32 get_plane_type(struct komeda_kms_dev *kms,
+			  struct komeda_component *c)
+{
+	bool is_primary = (c->id == KOMEDA_COMPONENT_LAYER0);
+
+	return is_primary ? DRM_PLANE_TYPE_PRIMARY : DRM_PLANE_TYPE_OVERLAY;
+}
+
+static int komeda_plane_add(struct komeda_kms_dev *kms,
+			    struct komeda_layer *layer)
+{
+	struct komeda_dev *mdev = kms->base.dev_private;
+	struct komeda_component *c = &layer->base;
+	struct komeda_plane *kplane;
+	struct drm_plane *plane;
+	u32 *formats, n_formats = 0;
+	int err;
+
+	kplane = kzalloc(sizeof(*kplane), GFP_KERNEL);
+	if (!kplane)
+		return -ENOMEM;
+
+	plane = &kplane->base;
+	kplane->layer = layer;
+
+	formats = komeda_get_layer_fourcc_list(&mdev->fmt_tbl,
+					       layer->layer_type, &n_formats);
+
+	err = drm_universal_plane_init(&kms->base, plane,
+			get_possible_crtcs(kms, c->pipeline),
+			&komeda_plane_funcs,
+			formats, n_formats, NULL,
+			get_plane_type(kms, c),
+			"%s", c->name);
+
+	komeda_put_fourcc_list(formats);
+
+	if (err)
+		goto cleanup;
+
+	drm_plane_helper_add(plane, &komeda_plane_helper_funcs);
+
+	return 0;
+cleanup:
+	komeda_plane_destroy(plane);
+	return err;
+}
+
+int komeda_kms_add_planes(struct komeda_kms_dev *kms, struct komeda_dev *mdev)
+{
+	struct komeda_pipeline *pipe;
+	int i, j, err;
+
+	for (i = 0; i < mdev->n_pipelines; i++) {
+		pipe = mdev->pipelines[i];
+
+		for (j = 0; j < pipe->n_layers; j++) {
+			err = komeda_plane_add(kms, pipe->layers[j]);
+			if (err)
+				return err;
+		}
+	}
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_private_obj.c b/drivers/gpu/drm/arm/display/komeda/komeda_private_obj.c
new file mode 100644
index 000000000000..f1c9e3fefa86
--- /dev/null
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_private_obj.c
@@ -0,0 +1,88 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * (C) COPYRIGHT 2018 ARM Limited. All rights reserved.
+ * Author: James.Qian.Wang <james.qian.wang@arm.com>
+ *
+ */
+#include "komeda_dev.h"
+#include "komeda_kms.h"
+
+static struct drm_private_state *
+komeda_pipeline_atomic_duplicate_state(struct drm_private_obj *obj)
+{
+	struct komeda_pipeline_state *st;
+
+	st = kmemdup(obj->state, sizeof(*st), GFP_KERNEL);
+	if (!st)
+		return NULL;
+
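+	/*
+	 * Active components will be recollected while building the new
+	 * state, so start from a clean mask.
+	 */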
+	st->active_comps = 0;
+
+	__drm_atomic_helper_private_obj_duplicate_state(obj, &st->obj);
+
+	return &st->obj;
+}
+
+static void
+komeda_pipeline_atomic_destroy_state(struct drm_private_obj *obj,
+				     struct drm_private_state *state)
+{
+	kfree(priv_to_pipe_st(state));
+}
+
+static const struct drm_private_state_funcs komeda_pipeline_obj_funcs = {
+	.atomic_duplicate_state	= komeda_pipeline_atomic_duplicate_state,
+	.atomic_destroy_state	= komeda_pipeline_atomic_destroy_state,
+};
+
+static int komeda_pipeline_obj_add(struct komeda_kms_dev *kms,
+				   struct komeda_pipeline *pipe)
+{
+	struct komeda_pipeline_state *st;
+
+	st = kzalloc(sizeof(*st), GFP_KERNEL);
+	if (!st)
+		return -ENOMEM;
+
+	st->pipe = pipe;
+	drm_atomic_private_obj_init(&kms->base, &pipe->obj, &st->obj,
+				    &komeda_pipeline_obj_funcs);
+
+	return 0;
+}
+
+int komeda_kms_add_private_objs(struct komeda_kms_dev *kms,
+				struct komeda_dev *mdev)
+{
+	struct komeda_pipeline *pipe;
+	int i, err;
+
+	for (i = 0; i < mdev->n_pipelines; i++) {
+		pipe = mdev->pipelines[i];
+
+		err = komeda_pipeline_obj_add(kms, pipe);
+		if (err)
+			return err;
+
+		/* Add component */
+	}
+
+	return 0;
+}
+
+void komeda_kms_cleanup_private_objs(struct komeda_dev *mdev)
+{
+	struct komeda_pipeline *pipe;
+	struct komeda_component *c;
+	int i, id;
+
+	for (i = 0; i < mdev->n_pipelines; i++) {
+		pipe = mdev->pipelines[i];
+		dp_for_each_set_bit(id, pipe->avail_comps) {
+			c = komeda_pipeline_get_component(pipe, id);
+
+			drm_atomic_private_obj_fini(&c->obj);
+		}
+		drm_atomic_private_obj_fini(&pipe->obj);
+	}
+}
diff --git a/drivers/gpu/drm/arm/hdlcd_crtc.c b/drivers/gpu/drm/arm/hdlcd_crtc.c
index e4d67b70244d..0b2b62f8fa3c 100644
--- a/drivers/gpu/drm/arm/hdlcd_crtc.c
+++ b/drivers/gpu/drm/arm/hdlcd_crtc.c
@@ -13,12 +13,12 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
-#include <drm/drm_fb_helper.h>
 #include <drm/drm_fb_cma_helper.h>
+#include <drm/drm_fb_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_of.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/clk.h>
 #include <linux/of_graph.h>
 #include <linux/platform_data/simplefb.h>
diff --git a/drivers/gpu/drm/arm/hdlcd_drv.c b/drivers/gpu/drm/arm/hdlcd_drv.c
index dfad8d06d108..8fc0b884c428 100644
--- a/drivers/gpu/drm/arm/hdlcd_drv.c
+++ b/drivers/gpu/drm/arm/hdlcd_drv.c
@@ -22,13 +22,13 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
-#include <drm/drm_fb_helper.h>
 #include <drm/drm_fb_cma_helper.h>
+#include <drm/drm_fb_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_modeset_helper.h>
 #include <drm/drm_of.h>
+#include <drm/drm_probe_helper.h>
 
 #include "hdlcd_drv.h"
 #include "hdlcd_regs.h"
@@ -229,7 +229,7 @@ static int hdlcd_debugfs_init(struct drm_minor *minor)
 DEFINE_DRM_GEM_CMA_FOPS(fops);
 
 static struct drm_driver hdlcd_driver = {
-	.driver_features = DRIVER_HAVE_IRQ | DRIVER_GEM |
+	.driver_features = DRIVER_GEM |
 			   DRIVER_MODESET | DRIVER_PRIME |
 			   DRIVER_ATOMIC,
 	.irq_handler = hdlcd_irq,
diff --git a/drivers/gpu/drm/arm/malidp_crtc.c b/drivers/gpu/drm/arm/malidp_crtc.c
index e1b72782848c..56aad288666e 100644
--- a/drivers/gpu/drm/arm/malidp_crtc.c
+++ b/drivers/gpu/drm/arm/malidp_crtc.c
@@ -14,7 +14,7 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/clk.h>
 #include <linux/pm_runtime.h>
 #include <video/videomode.h>
diff --git a/drivers/gpu/drm/arm/malidp_drv.c b/drivers/gpu/drm/arm/malidp_drv.c
index 505f316a192e..ab50ad06e271 100644
--- a/drivers/gpu/drm/arm/malidp_drv.c
+++ b/drivers/gpu/drm/arm/malidp_drv.c
@@ -23,7 +23,7 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
diff --git a/drivers/gpu/drm/arm/malidp_mw.c b/drivers/gpu/drm/arm/malidp_mw.c
index 91472e5e0c8b..041a64dc7167 100644
--- a/drivers/gpu/drm/arm/malidp_mw.c
+++ b/drivers/gpu/drm/arm/malidp_mw.c
@@ -8,7 +8,7 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drmP.h>
diff --git a/drivers/gpu/drm/armada/armada_510.c b/drivers/gpu/drm/armada/armada_510.c
index 2f7c048c5361..0e91d27921bd 100644
--- a/drivers/gpu/drm/armada/armada_510.c
+++ b/drivers/gpu/drm/armada/armada_510.c
@@ -9,7 +9,7 @@
  */
 #include <linux/clk.h>
 #include <linux/io.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include "armada_crtc.h"
 #include "armada_drm.h"
 #include "armada_hw.h"
diff --git a/drivers/gpu/drm/armada/armada_crtc.c b/drivers/gpu/drm/armada/armada_crtc.c
index da9360688b55..ba4a3fab7745 100644
--- a/drivers/gpu/drm/armada/armada_crtc.c
+++ b/drivers/gpu/drm/armada/armada_crtc.c
@@ -12,7 +12,7 @@
 #include <linux/platform_device.h>
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_plane_helper.h>
 #include <drm/drm_atomic_helper.h>
 #include "armada_crtc.h"
@@ -270,13 +270,7 @@ static void armada_drm_crtc_mode_set_nofb(struct drm_crtc *crtc)
 	tm = adj->crtc_vtotal - adj->crtc_vsync_end;
 
 	DRM_DEBUG_KMS("[CRTC:%d:%s] mode " DRM_MODE_FMT "\n",
-		      crtc->base.id, crtc->name,
-		      adj->base.id, adj->name, adj->vrefresh, adj->clock,
-		      adj->crtc_hdisplay, adj->crtc_hsync_start,
-		      adj->crtc_hsync_end, adj->crtc_htotal,
-		      adj->crtc_vdisplay, adj->crtc_vsync_start,
-		      adj->crtc_vsync_end, adj->crtc_vtotal,
-		      adj->type, adj->flags);
+		      crtc->base.id, crtc->name, DRM_MODE_ARG(adj));
 	DRM_DEBUG_KMS("lm %d rm %d tm %d bm %d\n", lm, rm, tm, bm);
 
 	/* Now compute the divider for real */
diff --git a/drivers/gpu/drm/armada/armada_crtc.h b/drivers/gpu/drm/armada/armada_crtc.h
index 7ebd337b60af..08761ff01739 100644
--- a/drivers/gpu/drm/armada/armada_crtc.h
+++ b/drivers/gpu/drm/armada/armada_crtc.h
@@ -8,6 +8,8 @@
 #ifndef ARMADA_CRTC_H
 #define ARMADA_CRTC_H
 
+#include <drm/drm_crtc.h>
+
 struct armada_gem_object;
 
 struct armada_regs {
diff --git a/drivers/gpu/drm/armada/armada_drv.c b/drivers/gpu/drm/armada/armada_drv.c
index fa31589b4fc0..e660c5ca52ae 100644
--- a/drivers/gpu/drm/armada/armada_drv.c
+++ b/drivers/gpu/drm/armada/armada_drv.c
@@ -10,7 +10,7 @@
 #include <linux/module.h>
 #include <linux/of_graph.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_of.h>
 #include "armada_crtc.h"
diff --git a/drivers/gpu/drm/armada/armada_fb.c b/drivers/gpu/drm/armada/armada_fb.c
index 6bd638a54579..058ac7d9920f 100644
--- a/drivers/gpu/drm/armada/armada_fb.c
+++ b/drivers/gpu/drm/armada/armada_fb.c
@@ -5,7 +5,7 @@
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
  */
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_modeset_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 #include "armada_drm.h"
diff --git a/drivers/gpu/drm/ast/ast_drv.c b/drivers/gpu/drm/ast/ast_drv.c
index bf589c53b908..3871b39d4dea 100644
--- a/drivers/gpu/drm/ast/ast_drv.c
+++ b/drivers/gpu/drm/ast/ast_drv.c
@@ -30,6 +30,7 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "ast_drv.h"
 
diff --git a/drivers/gpu/drm/ast/ast_fb.c b/drivers/gpu/drm/ast/ast_fb.c
index de26df0c6044..2c9f8dd9733a 100644
--- a/drivers/gpu/drm/ast/ast_fb.c
+++ b/drivers/gpu/drm/ast/ast_fb.c
@@ -39,7 +39,9 @@
 #include <drm/drmP.h>
 #include <drm/drm_crtc.h>
 #include <drm/drm_fb_helper.h>
+#include <drm/drm_util.h>
 #include <drm/drm_crtc_helper.h>
+
 #include "ast_drv.h"
 
 static void ast_dirty_update(struct ast_fbdev *afbdev,
@@ -191,7 +193,6 @@ static int astfb_create(struct drm_fb_helper *helper,
 	int size, ret;
 	void *sysram;
 	struct drm_gem_object *gobj = NULL;
-	struct ast_bo *bo = NULL;
 	mode_cmd.width = sizes->surface_width;
 	mode_cmd.height = sizes->surface_height;
 	mode_cmd.pitches[0] = mode_cmd.width * ((sizes->surface_bpp + 7)/8);
@@ -206,7 +207,6 @@ static int astfb_create(struct drm_fb_helper *helper,
 		DRM_ERROR("failed to create fbcon backing object %d\n", ret);
 		return ret;
 	}
-	bo = gem_to_ast_bo(gobj);
 
 	sysram = vmalloc(size);
 	if (!sysram)
@@ -263,7 +263,7 @@ static void ast_fbdev_destroy(struct drm_device *dev,
 {
 	struct ast_framebuffer *afb = &afbdev->afb;
 
-	drm_crtc_force_disable_all(dev);
+	drm_helper_force_disable_all(dev);
 	drm_fb_helper_unregister_fbi(&afbdev->helper);
 
 	if (afb->obj) {
diff --git a/drivers/gpu/drm/ast/ast_main.c b/drivers/gpu/drm/ast/ast_main.c
index 373700c05a00..2854399856ba 100644
--- a/drivers/gpu/drm/ast/ast_main.c
+++ b/drivers/gpu/drm/ast/ast_main.c
@@ -639,13 +639,9 @@ int ast_dumb_create(struct drm_file *file,
 
 static void ast_bo_unref(struct ast_bo **bo)
 {
-	struct ttm_buffer_object *tbo;
-
 	if ((*bo) == NULL)
 		return;
-
-	tbo = &((*bo)->bo);
-	ttm_bo_unref(&tbo);
+	ttm_bo_put(&((*bo)->bo));
 	*bo = NULL;
 }
 
diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index 8bb355d5d43d..97fed0627d1c 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -32,6 +32,7 @@
 #include <drm/drm_crtc.h>
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include "ast_drv.h"
 
 #include "ast_tables.h"
diff --git a/drivers/gpu/drm/ati_pcigart.c b/drivers/gpu/drm/ati_pcigart.c
index 6c4d4b6eba80..2362f07fe1fc 100644
--- a/drivers/gpu/drm/ati_pcigart.c
+++ b/drivers/gpu/drm/ati_pcigart.c
@@ -103,7 +103,7 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *ga
 	unsigned long pages;
 	u32 *pci_gart = NULL, page_base, gart_idx;
 	dma_addr_t bus_address = 0;
-	int i, j, ret = 0;
+	int i, j, ret = -ENOMEM;
 	int max_ati_pages, max_real_pages;
 
 	if (!entry) {
@@ -117,7 +117,7 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *ga
 		if (pci_set_dma_mask(dev->pdev, gart_info->table_mask)) {
 			DRM_ERROR("fail to set dma mask to 0x%Lx\n",
 				  (unsigned long long)gart_info->table_mask);
-			ret = 1;
+			ret = -EFAULT;
 			goto done;
 		}
 
@@ -160,6 +160,7 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *ga
 			drm_ati_pcigart_cleanup(dev, gart_info);
 			address = NULL;
 			bus_address = 0;
+			ret = -ENOMEM;
 			goto done;
 		}
 		page_base = (u32) entry->busaddr[i];
@@ -188,7 +189,7 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *ga
 			page_base += ATI_PCIGART_PAGE_SIZE;
 		}
 	}
-	ret = 1;
+	ret = 0;
 
 #if defined(__i386__) || defined(__x86_64__)
 	wbinvd();
diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c
index 96f4082671fe..8070a558d7b1 100644
--- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c
+++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c
@@ -24,7 +24,7 @@
 #include <linux/pinctrl/consumer.h>
 
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drmP.h>
 
 #include <video/videomode.h>
diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c
index 034a91112098..0be13eceedba 100644
--- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c
+++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c
@@ -720,7 +720,7 @@ static void atmel_hlcdc_dc_irq_uninstall(struct drm_device *dev)
 DEFINE_DRM_GEM_CMA_FOPS(fops);
 
 static struct drm_driver atmel_hlcdc_dc_driver = {
-	.driver_features = DRIVER_HAVE_IRQ | DRIVER_GEM |
+	.driver_features = DRIVER_GEM |
 			   DRIVER_MODESET | DRIVER_PRIME |
 			   DRIVER_ATOMIC,
 	.irq_handler = atmel_hlcdc_dc_irq_handler,
diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h
index 4cc1e03f0aee..70bd540d644e 100644
--- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h
+++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h
@@ -31,7 +31,7 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
index 9330a076e15a..e836e2de35ce 100644
--- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
+++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
@@ -549,7 +549,8 @@ atmel_hlcdc_plane_prepare_disc_area(struct drm_crtc_state *c_state)
 
 		ovl_state = drm_plane_state_to_atmel_hlcdc_plane_state(ovl_s);
 
-		if (!ovl_s->fb ||
+		if (!ovl_s->visible ||
+		    !ovl_s->fb ||
 		    ovl_s->fb->format->has_alpha ||
 		    ovl_s->alpha != DRM_BLEND_ALPHA_OPAQUE)
 			continue;
@@ -601,15 +602,10 @@ static int atmel_hlcdc_plane_atomic_check(struct drm_plane *p,
 	struct drm_framebuffer *fb = state->base.fb;
 	const struct drm_display_mode *mode;
 	struct drm_crtc_state *crtc_state;
-	unsigned int patched_crtc_w;
-	unsigned int patched_crtc_h;
-	unsigned int patched_src_w;
-	unsigned int patched_src_h;
 	unsigned int tmp;
-	int x_offset = 0;
-	int y_offset = 0;
 	int hsub = 1;
 	int vsub = 1;
+	int ret;
 	int i;
 
 	if (!state->base.crtc || !fb)
@@ -618,14 +614,21 @@ static int atmel_hlcdc_plane_atomic_check(struct drm_plane *p,
 	crtc_state = drm_atomic_get_existing_crtc_state(s->state, s->crtc);
 	mode = &crtc_state->adjusted_mode;
 
-	state->src_x = s->src_x;
-	state->src_y = s->src_y;
-	state->src_h = s->src_h;
-	state->src_w = s->src_w;
-	state->crtc_x = s->crtc_x;
-	state->crtc_y = s->crtc_y;
-	state->crtc_h = s->crtc_h;
-	state->crtc_w = s->crtc_w;
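+	/*
+	 * Clip the plane against the CRTC: a min scale of (1 << 16) / 2048
+	 * allows up to 2048x upscaling, while INT_MAX leaves downscaling
+	 * unlimited.
+	 */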
+	ret = drm_atomic_helper_check_plane_state(s, crtc_state,
+						  (1 << 16) / 2048,
+						  INT_MAX, true, true);
+	if (ret || !s->visible)
+		return ret;
+
+	state->src_x = s->src.x1;
+	state->src_y = s->src.y1;
+	state->src_w = drm_rect_width(&s->src);
+	state->src_h = drm_rect_height(&s->src);
+	state->crtc_x = s->dst.x1;
+	state->crtc_y = s->dst.y1;
+	state->crtc_w = drm_rect_width(&s->dst);
+	state->crtc_h = drm_rect_height(&s->dst);
+
 	if ((state->src_x | state->src_y | state->src_w | state->src_h) &
 	    SUBPIXEL_MASK)
 		return -EINVAL;
@@ -639,45 +642,6 @@ static int atmel_hlcdc_plane_atomic_check(struct drm_plane *p,
 	if (state->nplanes > ATMEL_HLCDC_LAYER_MAX_PLANES)
 		return -EINVAL;
 
-	/*
-	 * Swap width and size in case of 90 or 270 degrees rotation
-	 */
-	if (drm_rotation_90_or_270(state->base.rotation)) {
-		tmp = state->crtc_w;
-		state->crtc_w = state->crtc_h;
-		state->crtc_h = tmp;
-		tmp = state->src_w;
-		state->src_w = state->src_h;
-		state->src_h = tmp;
-	}
-
-	if (state->crtc_x + state->crtc_w > mode->hdisplay)
-		patched_crtc_w = mode->hdisplay - state->crtc_x;
-	else
-		patched_crtc_w = state->crtc_w;
-
-	if (state->crtc_x < 0) {
-		patched_crtc_w += state->crtc_x;
-		x_offset = -state->crtc_x;
-		state->crtc_x = 0;
-	}
-
-	if (state->crtc_y + state->crtc_h > mode->vdisplay)
-		patched_crtc_h = mode->vdisplay - state->crtc_y;
-	else
-		patched_crtc_h = state->crtc_h;
-
-	if (state->crtc_y < 0) {
-		patched_crtc_h += state->crtc_y;
-		y_offset = -state->crtc_y;
-		state->crtc_y = 0;
-	}
-
-	patched_src_w = DIV_ROUND_CLOSEST(patched_crtc_w * state->src_w,
-					  state->crtc_w);
-	patched_src_h = DIV_ROUND_CLOSEST(patched_crtc_h * state->src_h,
-					  state->crtc_h);
-
 	hsub = drm_format_horz_chroma_subsampling(fb->format->format);
 	vsub = drm_format_vert_chroma_subsampling(fb->format->format);
 
@@ -692,41 +656,38 @@ static int atmel_hlcdc_plane_atomic_check(struct drm_plane *p,
 
 		switch (state->base.rotation & DRM_MODE_ROTATE_MASK) {
 		case DRM_MODE_ROTATE_90:
-			offset = ((y_offset + state->src_y + patched_src_w - 1) /
-				  ydiv) * fb->pitches[i];
-			offset += ((x_offset + state->src_x) / xdiv) *
-				  state->bpp[i];
-			state->xstride[i] = ((patched_src_w - 1) / ydiv) *
-					  fb->pitches[i];
-			state->pstride[i] = -fb->pitches[i] - state->bpp[i];
+			offset = (state->src_y / ydiv) *
+				 fb->pitches[i];
+			offset += ((state->src_x + state->src_w - 1) /
+				   xdiv) * state->bpp[i];
+			state->xstride[i] = -(((state->src_h - 1) / ydiv) *
+					    fb->pitches[i]) -
+					  (2 * state->bpp[i]);
+			state->pstride[i] = fb->pitches[i] - state->bpp[i];
 			break;
 		case DRM_MODE_ROTATE_180:
-			offset = ((y_offset + state->src_y + patched_src_h - 1) /
+			offset = ((state->src_y + state->src_h - 1) /
 				  ydiv) * fb->pitches[i];
-			offset += ((x_offset + state->src_x + patched_src_w - 1) /
+			offset += ((state->src_x + state->src_w - 1) /
 				   xdiv) * state->bpp[i];
-			state->xstride[i] = ((((patched_src_w - 1) / xdiv) - 1) *
+			state->xstride[i] = ((((state->src_w - 1) / xdiv) - 1) *
 					   state->bpp[i]) - fb->pitches[i];
 			state->pstride[i] = -2 * state->bpp[i];
 			break;
 		case DRM_MODE_ROTATE_270:
-			offset = ((y_offset + state->src_y) / ydiv) *
-				 fb->pitches[i];
-			offset += ((x_offset + state->src_x + patched_src_h - 1) /
-				   xdiv) * state->bpp[i];
-			state->xstride[i] = -(((patched_src_w - 1) / ydiv) *
-					    fb->pitches[i]) -
-					  (2 * state->bpp[i]);
-			state->pstride[i] = fb->pitches[i] - state->bpp[i];
+			offset = ((state->src_y + state->src_h - 1) /
+				  ydiv) * fb->pitches[i];
+			offset += (state->src_x / xdiv) * state->bpp[i];
+			state->xstride[i] = ((state->src_h - 1) / ydiv) *
+					  fb->pitches[i];
+			state->pstride[i] = -fb->pitches[i] - state->bpp[i];
 			break;
 		case DRM_MODE_ROTATE_0:
 		default:
-			offset = ((y_offset + state->src_y) / ydiv) *
-				 fb->pitches[i];
-			offset += ((x_offset + state->src_x) / xdiv) *
-				  state->bpp[i];
+			offset = (state->src_y / ydiv) * fb->pitches[i];
+			offset += (state->src_x / xdiv) * state->bpp[i];
 			state->xstride[i] = fb->pitches[i] -
-					  ((patched_src_w / xdiv) *
+					  ((state->src_w / xdiv) *
 					   state->bpp[i]);
 			state->pstride[i] = 0;
 			break;
@@ -735,35 +696,45 @@ static int atmel_hlcdc_plane_atomic_check(struct drm_plane *p,
 		state->offsets[i] = offset + fb->offsets[i];
 	}
 
-	state->src_w = patched_src_w;
-	state->src_h = patched_src_h;
-	state->crtc_w = patched_crtc_w;
-	state->crtc_h = patched_crtc_h;
+	/*
+	 * Swap width and size in case of 90 or 270 degrees rotation
+	 */
+	if (drm_rotation_90_or_270(state->base.rotation)) {
+		tmp = state->src_w;
+		state->src_w = state->src_h;
+		state->src_h = tmp;
+	}
 
 	if (!desc->layout.size &&
 	    (mode->hdisplay != state->crtc_w ||
 	     mode->vdisplay != state->crtc_h))
 		return -EINVAL;
 
-	if (desc->max_height && state->crtc_h > desc->max_height)
-		return -EINVAL;
-
-	if (desc->max_width && state->crtc_w > desc->max_width)
-		return -EINVAL;
-
 	if ((state->crtc_h != state->src_h || state->crtc_w != state->src_w) &&
 	    (!desc->layout.memsize ||
 	     state->base.fb->format->has_alpha))
 		return -EINVAL;
 
-	if (state->crtc_x < 0 || state->crtc_y < 0)
-		return -EINVAL;
+	return 0;
+}
 
-	if (state->crtc_w + state->crtc_x > mode->hdisplay ||
-	    state->crtc_h + state->crtc_y > mode->vdisplay)
-		return -EINVAL;
+static void atmel_hlcdc_plane_atomic_disable(struct drm_plane *p,
+					     struct drm_plane_state *old_state)
+{
+	struct atmel_hlcdc_plane *plane = drm_plane_to_atmel_hlcdc_plane(p);
 
-	return 0;
+	/* Disable interrupts */
+	atmel_hlcdc_layer_write_reg(&plane->layer, ATMEL_HLCDC_LAYER_IDR,
+				    0xffffffff);
+
+	/* Disable the layer */
+	atmel_hlcdc_layer_write_reg(&plane->layer, ATMEL_HLCDC_LAYER_CHDR,
+				    ATMEL_HLCDC_LAYER_RST |
+				    ATMEL_HLCDC_LAYER_A2Q |
+				    ATMEL_HLCDC_LAYER_UPDATE);
+
+	/* Clear all pending interrupts */
+	atmel_hlcdc_layer_read_reg(&plane->layer, ATMEL_HLCDC_LAYER_ISR);
 }
 
 static void atmel_hlcdc_plane_atomic_update(struct drm_plane *p,
@@ -777,6 +748,11 @@ static void atmel_hlcdc_plane_atomic_update(struct drm_plane *p,
 	if (!p->state->crtc || !p->state->fb)
 		return;
 
+	if (!state->base.visible) {
+		atmel_hlcdc_plane_atomic_disable(p, old_s);
+		return;
+	}
+
 	atmel_hlcdc_plane_update_pos_and_size(plane, state);
 	atmel_hlcdc_plane_update_general_settings(plane, state);
 	atmel_hlcdc_plane_update_format(plane, state);
@@ -798,25 +774,6 @@ static void atmel_hlcdc_plane_atomic_update(struct drm_plane *p,
 			 ATMEL_HLCDC_LAYER_A2Q : ATMEL_HLCDC_LAYER_EN));
 }
 
-static void atmel_hlcdc_plane_atomic_disable(struct drm_plane *p,
-					     struct drm_plane_state *old_state)
-{
-	struct atmel_hlcdc_plane *plane = drm_plane_to_atmel_hlcdc_plane(p);
-
-	/* Disable interrupts */
-	atmel_hlcdc_layer_write_reg(&plane->layer, ATMEL_HLCDC_LAYER_IDR,
-				    0xffffffff);
-
-	/* Disable the layer */
-	atmel_hlcdc_layer_write_reg(&plane->layer, ATMEL_HLCDC_LAYER_CHDR,
-				    ATMEL_HLCDC_LAYER_RST |
-				    ATMEL_HLCDC_LAYER_A2Q |
-				    ATMEL_HLCDC_LAYER_UPDATE);
-
-	/* Clear all pending interrupts */
-	atmel_hlcdc_layer_read_reg(&plane->layer, ATMEL_HLCDC_LAYER_ISR);
-}
-
 static int atmel_hlcdc_plane_init_properties(struct atmel_hlcdc_plane *plane)
 {
 	const struct atmel_hlcdc_layer_desc *desc = plane->layer.desc;
diff --git a/drivers/gpu/drm/bochs/Makefile b/drivers/gpu/drm/bochs/Makefile
index 98ef60a19e8f..e9e0f8f5eb5b 100644
--- a/drivers/gpu/drm/bochs/Makefile
+++ b/drivers/gpu/drm/bochs/Makefile
@@ -1,3 +1,3 @@
-bochs-drm-y := bochs_drv.o bochs_mm.o bochs_kms.o bochs_fbdev.o bochs_hw.o
+bochs-drm-y := bochs_drv.o bochs_mm.o bochs_kms.o bochs_hw.o
 
 obj-$(CONFIG_DRM_BOCHS)	+= bochs-drm.o
diff --git a/drivers/gpu/drm/bochs/bochs.h b/drivers/gpu/drm/bochs/bochs.h
index fb38c8b857b5..03711394f1ed 100644
--- a/drivers/gpu/drm/bochs/bochs.h
+++ b/drivers/gpu/drm/bochs/bochs.h
@@ -80,12 +80,6 @@ struct bochs_device {
 		struct ttm_bo_device bdev;
 		bool initialized;
 	} ttm;
-
-	/* fbdev */
-	struct {
-		struct drm_framebuffer *fb;
-		struct drm_fb_helper helper;
-	} fb;
 };
 
 struct bochs_bo {
@@ -121,8 +115,9 @@ int bochs_hw_init(struct drm_device *dev);
 void bochs_hw_fini(struct drm_device *dev);
 
 void bochs_hw_setmode(struct bochs_device *bochs,
-		      struct drm_display_mode *mode,
-		      const struct drm_format_info *format);
+		      struct drm_display_mode *mode);
+void bochs_hw_setformat(struct bochs_device *bochs,
+			const struct drm_format_info *format);
 void bochs_hw_setbase(struct bochs_device *bochs,
 		      int x, int y, u64 addr);
 int bochs_hw_load_edid(struct bochs_device *bochs);
@@ -141,15 +136,19 @@ int bochs_dumb_create(struct drm_file *file, struct drm_device *dev,
 int bochs_dumb_mmap_offset(struct drm_file *file, struct drm_device *dev,
 			   uint32_t handle, uint64_t *offset);
 
-int bochs_bo_pin(struct bochs_bo *bo, u32 pl_flag, u64 *gpu_addr);
+int bochs_bo_pin(struct bochs_bo *bo, u32 pl_flag);
 int bochs_bo_unpin(struct bochs_bo *bo);
 
+int bochs_gem_prime_pin(struct drm_gem_object *obj);
+void bochs_gem_prime_unpin(struct drm_gem_object *obj);
+void *bochs_gem_prime_vmap(struct drm_gem_object *obj);
+void bochs_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
+int bochs_gem_prime_mmap(struct drm_gem_object *obj,
+			 struct vm_area_struct *vma);
+
 /* bochs_kms.c */
 int bochs_kms_init(struct bochs_device *bochs);
 void bochs_kms_fini(struct bochs_device *bochs);
 
 /* bochs_fbdev.c */
-int bochs_fbdev_init(struct bochs_device *bochs);
-void bochs_fbdev_fini(struct bochs_device *bochs);
-
 extern const struct drm_mode_config_funcs bochs_mode_funcs;
diff --git a/drivers/gpu/drm/bochs/bochs_drv.c b/drivers/gpu/drm/bochs/bochs_drv.c
index aa35007262cd..6b6e037258c3 100644
--- a/drivers/gpu/drm/bochs/bochs_drv.c
+++ b/drivers/gpu/drm/bochs/bochs_drv.c
@@ -9,6 +9,7 @@
 #include <linux/module.h>
 #include <linux/slab.h>
 #include <drm/drm_fb_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "bochs.h"
 
@@ -16,10 +17,6 @@ static int bochs_modeset = -1;
 module_param_named(modeset, bochs_modeset, int, 0444);
 MODULE_PARM_DESC(modeset, "enable/disable kernel modesetting");
 
-static bool enable_fbdev = true;
-module_param_named(fbdev, enable_fbdev, bool, 0444);
-MODULE_PARM_DESC(fbdev, "register fbdev device");
-
 /* ---------------------------------------------------------------------- */
 /* drm interface                                                          */
 
@@ -27,7 +24,6 @@ static void bochs_unload(struct drm_device *dev)
 {
 	struct bochs_device *bochs = dev->dev_private;
 
-	bochs_fbdev_fini(bochs);
 	bochs_kms_fini(bochs);
 	bochs_mm_fini(bochs);
 	bochs_hw_fini(dev);
@@ -58,9 +54,6 @@ static int bochs_load(struct drm_device *dev)
 	if (ret)
 		goto err;
 
-	if (enable_fbdev)
-		bochs_fbdev_init(bochs);
-
 	return 0;
 
 err:
@@ -81,7 +74,8 @@ static const struct file_operations bochs_fops = {
 };
 
 static struct drm_driver bochs_driver = {
-	.driver_features	= DRIVER_GEM | DRIVER_MODESET,
+	.driver_features	= DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC |
+				  DRIVER_PRIME,
 	.fops			= &bochs_fops,
 	.name			= "bochs-drm",
 	.desc			= "bochs dispi vga interface (qemu stdvga)",
@@ -91,6 +85,14 @@ static struct drm_driver bochs_driver = {
 	.gem_free_object_unlocked = bochs_gem_free_object,
 	.dumb_create            = bochs_dumb_create,
 	.dumb_map_offset        = bochs_dumb_mmap_offset,
+
+	.gem_prime_export = drm_gem_prime_export,
+	.gem_prime_import = drm_gem_prime_import,
+	.gem_prime_pin = bochs_gem_prime_pin,
+	.gem_prime_unpin = bochs_gem_prime_unpin,
+	.gem_prime_vmap = bochs_gem_prime_vmap,
+	.gem_prime_vunmap = bochs_gem_prime_vunmap,
+	.gem_prime_mmap = bochs_gem_prime_mmap,
 };
 
 /* ---------------------------------------------------------------------- */
@@ -101,27 +103,16 @@ static int bochs_pm_suspend(struct device *dev)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct drm_device *drm_dev = pci_get_drvdata(pdev);
-	struct bochs_device *bochs = drm_dev->dev_private;
-
-	drm_kms_helper_poll_disable(drm_dev);
-
-	drm_fb_helper_set_suspend_unlocked(&bochs->fb.helper, 1);
 
-	return 0;
+	return drm_mode_config_helper_suspend(drm_dev);
 }
 
 static int bochs_pm_resume(struct device *dev)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct drm_device *drm_dev = pci_get_drvdata(pdev);
-	struct bochs_device *bochs = drm_dev->dev_private;
-
-	drm_helper_resume_force_mode(drm_dev);
 
-	drm_fb_helper_set_suspend_unlocked(&bochs->fb.helper, 0);
-
-	drm_kms_helper_poll_enable(drm_dev);
-	return 0;
+	return drm_mode_config_helper_resume(drm_dev);
 }
 #endif
 
@@ -169,6 +160,7 @@ static int bochs_pci_probe(struct pci_dev *pdev,
 	if (ret)
 		goto err_unload;
 
+	drm_fbdev_generic_setup(dev, 32);
 	return ret;
 
 err_unload:
diff --git a/drivers/gpu/drm/bochs/bochs_fbdev.c b/drivers/gpu/drm/bochs/bochs_fbdev.c
deleted file mode 100644
index dd3c7df267da..000000000000
--- a/drivers/gpu/drm/bochs/bochs_fbdev.c
+++ /dev/null
@@ -1,163 +0,0 @@
-/*
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- */
-
-#include "bochs.h"
-#include <drm/drm_gem_framebuffer_helper.h>
-
-/* ---------------------------------------------------------------------- */
-
-static int bochsfb_mmap(struct fb_info *info,
-			struct vm_area_struct *vma)
-{
-	struct drm_fb_helper *fb_helper = info->par;
-	struct bochs_bo *bo = gem_to_bochs_bo(fb_helper->fb->obj[0]);
-
-	return ttm_fbdev_mmap(vma, &bo->bo);
-}
-
-static struct fb_ops bochsfb_ops = {
-	.owner = THIS_MODULE,
-	DRM_FB_HELPER_DEFAULT_OPS,
-	.fb_fillrect = drm_fb_helper_cfb_fillrect,
-	.fb_copyarea = drm_fb_helper_cfb_copyarea,
-	.fb_imageblit = drm_fb_helper_cfb_imageblit,
-	.fb_mmap = bochsfb_mmap,
-};
-
-static int bochsfb_create_object(struct bochs_device *bochs,
-				 const struct drm_mode_fb_cmd2 *mode_cmd,
-				 struct drm_gem_object **gobj_p)
-{
-	struct drm_device *dev = bochs->dev;
-	struct drm_gem_object *gobj;
-	u32 size;
-	int ret = 0;
-
-	size = mode_cmd->pitches[0] * mode_cmd->height;
-	ret = bochs_gem_create(dev, size, true, &gobj);
-	if (ret)
-		return ret;
-
-	*gobj_p = gobj;
-	return ret;
-}
-
-static int bochsfb_create(struct drm_fb_helper *helper,
-			  struct drm_fb_helper_surface_size *sizes)
-{
-	struct bochs_device *bochs =
-		container_of(helper, struct bochs_device, fb.helper);
-	struct fb_info *info;
-	struct drm_framebuffer *fb;
-	struct drm_mode_fb_cmd2 mode_cmd;
-	struct drm_gem_object *gobj = NULL;
-	struct bochs_bo *bo = NULL;
-	int size, ret;
-
-	if (sizes->surface_bpp != 32)
-		return -EINVAL;
-
-	mode_cmd.width = sizes->surface_width;
-	mode_cmd.height = sizes->surface_height;
-	mode_cmd.pitches[0] = sizes->surface_width * 4;
-	mode_cmd.pixel_format = DRM_FORMAT_HOST_XRGB8888;
-	size = mode_cmd.pitches[0] * mode_cmd.height;
-
-	/* alloc, pin & map bo */
-	ret = bochsfb_create_object(bochs, &mode_cmd, &gobj);
-	if (ret) {
-		DRM_ERROR("failed to create fbcon backing object %d\n", ret);
-		return ret;
-	}
-
-	bo = gem_to_bochs_bo(gobj);
-
-	ret = ttm_bo_reserve(&bo->bo, true, false, NULL);
-	if (ret)
-		return ret;
-
-	ret = bochs_bo_pin(bo, TTM_PL_FLAG_VRAM, NULL);
-	if (ret) {
-		DRM_ERROR("failed to pin fbcon\n");
-		ttm_bo_unreserve(&bo->bo);
-		return ret;
-	}
-
-	ret = ttm_bo_kmap(&bo->bo, 0, bo->bo.num_pages,
-			  &bo->kmap);
-	if (ret) {
-		DRM_ERROR("failed to kmap fbcon\n");
-		ttm_bo_unreserve(&bo->bo);
-		return ret;
-	}
-
-	ttm_bo_unreserve(&bo->bo);
-
-	/* init fb device */
-	info = drm_fb_helper_alloc_fbi(helper);
-	if (IS_ERR(info)) {
-		DRM_ERROR("Failed to allocate fbi: %ld\n", PTR_ERR(info));
-		return PTR_ERR(info);
-	}
-
-	info->par = &bochs->fb.helper;
-
-	fb = drm_gem_fbdev_fb_create(bochs->dev, sizes, 0, gobj, NULL);
-	if (IS_ERR(fb)) {
-		DRM_ERROR("Failed to create framebuffer: %ld\n", PTR_ERR(fb));
-		return PTR_ERR(fb);
-	}
-
-	/* setup helper */
-	bochs->fb.helper.fb = fb;
-
-	strcpy(info->fix.id, "bochsdrmfb");
-
-	info->fbops = &bochsfb_ops;
-
-	drm_fb_helper_fill_fix(info, fb->pitches[0], fb->format->depth);
-	drm_fb_helper_fill_var(info, &bochs->fb.helper, sizes->fb_width,
-			       sizes->fb_height);
-
-	info->screen_base = bo->kmap.virtual;
-	info->screen_size = size;
-
-	drm_vma_offset_remove(&bo->bo.bdev->vma_manager, &bo->bo.vma_node);
-	info->fix.smem_start = 0;
-	info->fix.smem_len = size;
-	return 0;
-}
-
-static const struct drm_fb_helper_funcs bochs_fb_helper_funcs = {
-	.fb_probe = bochsfb_create,
-};
-
-static struct drm_framebuffer *
-bochs_gem_fb_create(struct drm_device *dev, struct drm_file *file,
-		    const struct drm_mode_fb_cmd2 *mode_cmd)
-{
-	if (mode_cmd->pixel_format != DRM_FORMAT_XRGB8888 &&
-	    mode_cmd->pixel_format != DRM_FORMAT_BGRX8888)
-		return ERR_PTR(-EINVAL);
-
-	return drm_gem_fb_create(dev, file, mode_cmd);
-}
-
-const struct drm_mode_config_funcs bochs_mode_funcs = {
-	.fb_create = bochs_gem_fb_create,
-};
-
-int bochs_fbdev_init(struct bochs_device *bochs)
-{
-	return drm_fb_helper_fbdev_setup(bochs->dev, &bochs->fb.helper,
-					 &bochs_fb_helper_funcs, 32, 1);
-}
-
-void bochs_fbdev_fini(struct bochs_device *bochs)
-{
-	drm_fb_helper_fbdev_teardown(bochs->dev);
-}
diff --git a/drivers/gpu/drm/bochs/bochs_hw.c b/drivers/gpu/drm/bochs/bochs_hw.c
index c90a0d492fd5..3e04b2f0ec08 100644
--- a/drivers/gpu/drm/bochs/bochs_hw.c
+++ b/drivers/gpu/drm/bochs/bochs_hw.c
@@ -86,9 +86,16 @@ static int bochs_get_edid_block(void *data, u8 *buf,
 
 int bochs_hw_load_edid(struct bochs_device *bochs)
 {
+	u8 header[8];
+
 	if (!bochs->mmio)
 		return -1;
 
+	/* check the header to detect whether EDID support is enabled in qemu */
+	bochs_get_edid_block(bochs, header, 0, ARRAY_SIZE(header));
+	if (drm_edid_header_is_valid(header) != 8)
+		return -1;
+
 	kfree(bochs->edid);
 	bochs->edid = drm_do_get_edid(&bochs->connector,
 				      bochs_get_edid_block, bochs);
@@ -197,8 +204,7 @@ void bochs_hw_fini(struct drm_device *dev)
 }
 
 void bochs_hw_setmode(struct bochs_device *bochs,
-		      struct drm_display_mode *mode,
-		      const struct drm_format_info *format)
+		      struct drm_display_mode *mode)
 {
 	bochs->xres = mode->hdisplay;
 	bochs->yres = mode->vdisplay;
@@ -206,12 +212,8 @@ void bochs_hw_setmode(struct bochs_device *bochs,
 	bochs->stride = mode->hdisplay * (bochs->bpp / 8);
 	bochs->yres_virtual = bochs->fb_size / bochs->stride;
 
-	DRM_DEBUG_DRIVER("%dx%d @ %d bpp, format %c%c%c%c, vy %d\n",
+	DRM_DEBUG_DRIVER("%dx%d @ %d bpp, vy %d\n",
 			 bochs->xres, bochs->yres, bochs->bpp,
-			 (format->format >>  0) & 0xff,
-			 (format->format >>  8) & 0xff,
-			 (format->format >> 16) & 0xff,
-			 (format->format >> 24) & 0xff,
 			 bochs->yres_virtual);
 
 	bochs_vga_writeb(bochs, 0x3c0, 0x20); /* unblank */
@@ -229,6 +231,16 @@ void bochs_hw_setmode(struct bochs_device *bochs,
 
 	bochs_dispi_write(bochs, VBE_DISPI_INDEX_ENABLE,
 			  VBE_DISPI_ENABLED | VBE_DISPI_LFB_ENABLED);
+}
+
+void bochs_hw_setformat(struct bochs_device *bochs,
+			const struct drm_format_info *format)
+{
+	DRM_DEBUG_DRIVER("format %c%c%c%c\n",
+			 (format->format >>  0) & 0xff,
+			 (format->format >>  8) & 0xff,
+			 (format->format >> 16) & 0xff,
+			 (format->format >> 24) & 0xff);
 
 	switch (format->format) {
 	case DRM_FORMAT_XRGB8888:
diff --git a/drivers/gpu/drm/bochs/bochs_kms.c b/drivers/gpu/drm/bochs/bochs_kms.c
index f87c284dd93d..9cd82e3631fb 100644
--- a/drivers/gpu/drm/bochs/bochs_kms.c
+++ b/drivers/gpu/drm/bochs/bochs_kms.c
@@ -6,7 +6,11 @@
  */
 
 #include "bochs.h"
+#include <drm/drm_atomic_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_atomic_uapi.h>
+#include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_probe_helper.h>
 
 static int defx = 1024;
 static int defy = 768;
@@ -18,115 +22,51 @@ MODULE_PARM_DESC(defy, "default y resolution");
 
 /* ---------------------------------------------------------------------- */
 
-static void bochs_crtc_dpms(struct drm_crtc *crtc, int mode)
-{
-	switch (mode) {
-	case DRM_MODE_DPMS_ON:
-	case DRM_MODE_DPMS_STANDBY:
-	case DRM_MODE_DPMS_SUSPEND:
-	case DRM_MODE_DPMS_OFF:
-	default:
-		return;
-	}
-}
-
-static int bochs_crtc_mode_set_base(struct drm_crtc *crtc, int x, int y,
-				    struct drm_framebuffer *old_fb)
-{
-	struct bochs_device *bochs =
-		container_of(crtc, struct bochs_device, crtc);
-	struct bochs_bo *bo;
-	u64 gpu_addr = 0;
-	int ret;
-
-	if (old_fb) {
-		bo = gem_to_bochs_bo(old_fb->obj[0]);
-		ret = ttm_bo_reserve(&bo->bo, true, false, NULL);
-		if (ret) {
-			DRM_ERROR("failed to reserve old_fb bo\n");
-		} else {
-			bochs_bo_unpin(bo);
-			ttm_bo_unreserve(&bo->bo);
-		}
-	}
-
-	if (WARN_ON(crtc->primary->fb == NULL))
-		return -EINVAL;
-
-	bo = gem_to_bochs_bo(crtc->primary->fb->obj[0]);
-	ret = ttm_bo_reserve(&bo->bo, true, false, NULL);
-	if (ret)
-		return ret;
-
-	ret = bochs_bo_pin(bo, TTM_PL_FLAG_VRAM, &gpu_addr);
-	if (ret) {
-		ttm_bo_unreserve(&bo->bo);
-		return ret;
-	}
-
-	ttm_bo_unreserve(&bo->bo);
-	bochs_hw_setbase(bochs, x, y, gpu_addr);
-	return 0;
-}
-
-static int bochs_crtc_mode_set(struct drm_crtc *crtc,
-			       struct drm_display_mode *mode,
-			       struct drm_display_mode *adjusted_mode,
-			       int x, int y, struct drm_framebuffer *old_fb)
+static void bochs_crtc_mode_set_nofb(struct drm_crtc *crtc)
 {
 	struct bochs_device *bochs =
 		container_of(crtc, struct bochs_device, crtc);
 
-	if (WARN_ON(crtc->primary->fb == NULL))
-		return -EINVAL;
-
-	bochs_hw_setmode(bochs, mode, crtc->primary->fb->format);
-	bochs_crtc_mode_set_base(crtc, x, y, old_fb);
-	return 0;
+	bochs_hw_setmode(bochs, &crtc->mode);
 }
 
-static void bochs_crtc_prepare(struct drm_crtc *crtc)
+static void bochs_crtc_atomic_enable(struct drm_crtc *crtc,
+				     struct drm_crtc_state *old_crtc_state)
 {
 }
 
-static void bochs_crtc_commit(struct drm_crtc *crtc)
+static void bochs_crtc_atomic_flush(struct drm_crtc *crtc,
+				    struct drm_crtc_state *old_crtc_state)
 {
-}
+	struct drm_device *dev = crtc->dev;
+	struct drm_pending_vblank_event *event;
 
-static int bochs_crtc_page_flip(struct drm_crtc *crtc,
-				struct drm_framebuffer *fb,
-				struct drm_pending_vblank_event *event,
-				uint32_t page_flip_flags,
-				struct drm_modeset_acquire_ctx *ctx)
-{
-	struct bochs_device *bochs =
-		container_of(crtc, struct bochs_device, crtc);
-	struct drm_framebuffer *old_fb = crtc->primary->fb;
-	unsigned long irqflags;
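+	/* bochs has no vblank irq, so complete any pending flip event here */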
+	if (crtc->state && crtc->state->event) {
+		unsigned long irqflags;
 
-	crtc->primary->fb = fb;
-	bochs_crtc_mode_set_base(crtc, 0, 0, old_fb);
-	if (event) {
-		spin_lock_irqsave(&bochs->dev->event_lock, irqflags);
+		spin_lock_irqsave(&dev->event_lock, irqflags);
+		event = crtc->state->event;
+		crtc->state->event = NULL;
 		drm_crtc_send_vblank_event(crtc, event);
-		spin_unlock_irqrestore(&bochs->dev->event_lock, irqflags);
+		spin_unlock_irqrestore(&dev->event_lock, irqflags);
 	}
-	return 0;
 }
 
+
 /* These provide the minimum set of functions required to handle a CRTC */
 static const struct drm_crtc_funcs bochs_crtc_funcs = {
-	.set_config = drm_crtc_helper_set_config,
+	.set_config = drm_atomic_helper_set_config,
 	.destroy = drm_crtc_cleanup,
-	.page_flip = bochs_crtc_page_flip,
+	.page_flip = drm_atomic_helper_page_flip,
+	.reset = drm_atomic_helper_crtc_reset,
+	.atomic_duplicate_state = drm_atomic_helper_crtc_duplicate_state,
+	.atomic_destroy_state = drm_atomic_helper_crtc_destroy_state,
 };
 
 static const struct drm_crtc_helper_funcs bochs_helper_funcs = {
-	.dpms = bochs_crtc_dpms,
-	.mode_set = bochs_crtc_mode_set,
-	.mode_set_base = bochs_crtc_mode_set_base,
-	.prepare = bochs_crtc_prepare,
-	.commit = bochs_crtc_commit,
+	.mode_set_nofb = bochs_crtc_mode_set_nofb,
+	.atomic_enable = bochs_crtc_atomic_enable,
+	.atomic_flush = bochs_crtc_atomic_flush,
 };
 
 static const uint32_t bochs_formats[] = {
@@ -134,6 +74,59 @@ static const uint32_t bochs_formats[] = {
 	DRM_FORMAT_BGRX8888,
 };
 
+static void bochs_plane_atomic_update(struct drm_plane *plane,
+				      struct drm_plane_state *old_state)
+{
+	struct bochs_device *bochs = plane->dev->dev_private;
+	struct bochs_bo *bo;
+
+	if (!plane->state->fb)
+		return;
+	bo = gem_to_bochs_bo(plane->state->fb->obj[0]);
+	bochs_hw_setbase(bochs,
+			 plane->state->crtc_x,
+			 plane->state->crtc_y,
+			 bo->bo.offset);
+	bochs_hw_setformat(bochs, plane->state->fb->format);
+}
+
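+/* keep the framebuffer pinned in VRAM while the display is scanning it out */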
+static int bochs_plane_prepare_fb(struct drm_plane *plane,
+				struct drm_plane_state *new_state)
+{
+	struct bochs_bo *bo;
+
+	if (!new_state->fb)
+		return 0;
+	bo = gem_to_bochs_bo(new_state->fb->obj[0]);
+	return bochs_bo_pin(bo, TTM_PL_FLAG_VRAM);
+}
+
+static void bochs_plane_cleanup_fb(struct drm_plane *plane,
+				   struct drm_plane_state *old_state)
+{
+	struct bochs_bo *bo;
+
+	if (!old_state->fb)
+		return;
+	bo = gem_to_bochs_bo(old_state->fb->obj[0]);
+	bochs_bo_unpin(bo);
+}
+
+static const struct drm_plane_helper_funcs bochs_plane_helper_funcs = {
+	.atomic_update = bochs_plane_atomic_update,
+	.prepare_fb = bochs_plane_prepare_fb,
+	.cleanup_fb = bochs_plane_cleanup_fb,
+};
+
+static const struct drm_plane_funcs bochs_plane_funcs = {
+	.update_plane   = drm_atomic_helper_update_plane,
+	.disable_plane  = drm_atomic_helper_disable_plane,
+	.destroy        = drm_primary_helper_destroy,
+	.reset          = drm_atomic_helper_plane_reset,
+	.atomic_duplicate_state = drm_atomic_helper_plane_duplicate_state,
+	.atomic_destroy_state = drm_atomic_helper_plane_destroy_state,
+};
+
 static struct drm_plane *bochs_primary_plane(struct drm_device *dev)
 {
 	struct drm_plane *primary;
@@ -146,16 +139,17 @@ static struct drm_plane *bochs_primary_plane(struct drm_device *dev)
 	}
 
 	ret = drm_universal_plane_init(dev, primary, 0,
-				       &drm_primary_helper_funcs,
+				       &bochs_plane_funcs,
 				       bochs_formats,
 				       ARRAY_SIZE(bochs_formats),
 				       NULL,
 				       DRM_PLANE_TYPE_PRIMARY, NULL);
 	if (ret) {
 		kfree(primary);
-		primary = NULL;
+		return NULL;
 	}
 
+	drm_plane_helper_add(primary, &bochs_plane_helper_funcs);
 	return primary;
 }
 
@@ -170,31 +164,6 @@ static void bochs_crtc_init(struct drm_device *dev)
 	drm_crtc_helper_add(crtc, &bochs_helper_funcs);
 }
 
-static void bochs_encoder_mode_set(struct drm_encoder *encoder,
-				   struct drm_display_mode *mode,
-				   struct drm_display_mode *adjusted_mode)
-{
-}
-
-static void bochs_encoder_dpms(struct drm_encoder *encoder, int state)
-{
-}
-
-static void bochs_encoder_prepare(struct drm_encoder *encoder)
-{
-}
-
-static void bochs_encoder_commit(struct drm_encoder *encoder)
-{
-}
-
-static const struct drm_encoder_helper_funcs bochs_encoder_helper_funcs = {
-	.dpms = bochs_encoder_dpms,
-	.mode_set = bochs_encoder_mode_set,
-	.prepare = bochs_encoder_prepare,
-	.commit = bochs_encoder_commit,
-};
-
 static const struct drm_encoder_funcs bochs_encoder_encoder_funcs = {
 	.destroy = drm_encoder_cleanup,
 };
@@ -207,7 +176,6 @@ static void bochs_encoder_init(struct drm_device *dev)
 	encoder->possible_crtcs = 0x1;
 	drm_encoder_init(dev, encoder, &bochs_encoder_encoder_funcs,
 			 DRM_MODE_ENCODER_DAC, NULL);
-	drm_encoder_helper_add(encoder, &bochs_encoder_helper_funcs);
 }
 
 
@@ -266,6 +234,9 @@ static const struct drm_connector_funcs bochs_connector_connector_funcs = {
 	.dpms = drm_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = drm_connector_cleanup,
+	.reset = drm_atomic_helper_connector_reset,
+	.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
+	.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
 };
 
 static void bochs_connector_init(struct drm_device *dev)
@@ -287,6 +258,22 @@ static void bochs_connector_init(struct drm_device *dev)
 	}
 }
 
+static struct drm_framebuffer *
+bochs_gem_fb_create(struct drm_device *dev, struct drm_file *file,
+		    const struct drm_mode_fb_cmd2 *mode_cmd)
+{
+	if (mode_cmd->pixel_format != DRM_FORMAT_XRGB8888 &&
+	    mode_cmd->pixel_format != DRM_FORMAT_BGRX8888)
+		return ERR_PTR(-EINVAL);
+
+	return drm_gem_fb_create(dev, file, mode_cmd);
+}
+
+const struct drm_mode_config_funcs bochs_mode_funcs = {
+	.fb_create = bochs_gem_fb_create,
+	.atomic_check = drm_atomic_helper_check,
+	.atomic_commit = drm_atomic_helper_commit,
+};
 
 int bochs_kms_init(struct bochs_device *bochs)
 {
@@ -309,6 +296,8 @@ int bochs_kms_init(struct bochs_device *bochs)
 	drm_connector_attach_encoder(&bochs->connector,
 					  &bochs->encoder);
 
+	drm_mode_config_reset(bochs->dev);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/bochs/bochs_mm.c b/drivers/gpu/drm/bochs/bochs_mm.c
index 0980411e41bf..49463348a07a 100644
--- a/drivers/gpu/drm/bochs/bochs_mm.c
+++ b/drivers/gpu/drm/bochs/bochs_mm.c
@@ -210,33 +210,28 @@ static void bochs_ttm_placement(struct bochs_bo *bo, int domain)
 	bo->placement.num_busy_placement = c;
 }
 
-static inline u64 bochs_bo_gpu_offset(struct bochs_bo *bo)
-{
-	return bo->bo.offset;
-}
-
-int bochs_bo_pin(struct bochs_bo *bo, u32 pl_flag, u64 *gpu_addr)
+int bochs_bo_pin(struct bochs_bo *bo, u32 pl_flag)
 {
 	struct ttm_operation_ctx ctx = { false, false };
 	int i, ret;
 
 	if (bo->pin_count) {
 		bo->pin_count++;
-		if (gpu_addr)
-			*gpu_addr = bochs_bo_gpu_offset(bo);
 		return 0;
 	}
 
 	bochs_ttm_placement(bo, pl_flag);
 	for (i = 0; i < bo->placement.num_placement; i++)
 		bo->placements[i].flags |= TTM_PL_FLAG_NO_EVICT;
+	ret = ttm_bo_reserve(&bo->bo, true, false, NULL);
+	if (ret)
+		return ret;
 	ret = ttm_bo_validate(&bo->bo, &bo->placement, &ctx);
+	ttm_bo_unreserve(&bo->bo);
 	if (ret)
 		return ret;
 
 	bo->pin_count = 1;
-	if (gpu_addr)
-		*gpu_addr = bochs_bo_gpu_offset(bo);
 	return 0;
 }
 
@@ -256,7 +251,11 @@ int bochs_bo_unpin(struct bochs_bo *bo)
 
 	for (i = 0; i < bo->placement.num_placement; i++)
 		bo->placements[i].flags &= ~TTM_PL_FLAG_NO_EVICT;
+	ret = ttm_bo_reserve(&bo->bo, true, false, NULL);
+	if (ret)
+		return ret;
 	ret = ttm_bo_validate(&bo->bo, &bo->placement, &ctx);
+	ttm_bo_unreserve(&bo->bo);
 	if (ret)
 		return ret;
 
@@ -396,3 +395,53 @@ int bochs_dumb_mmap_offset(struct drm_file *file, struct drm_device *dev,
 	drm_gem_object_put_unlocked(obj);
 	return 0;
 }
+
+/* ---------------------------------------------------------------------- */
+
+int bochs_gem_prime_pin(struct drm_gem_object *obj)
+{
+	struct bochs_bo *bo = gem_to_bochs_bo(obj);
+
+	return bochs_bo_pin(bo, TTM_PL_FLAG_VRAM);
+}
+
+void bochs_gem_prime_unpin(struct drm_gem_object *obj)
+{
+	struct bochs_bo *bo = gem_to_bochs_bo(obj);
+
+	bochs_bo_unpin(bo);
+}
+
+void *bochs_gem_prime_vmap(struct drm_gem_object *obj)
+{
+	struct bochs_bo *bo = gem_to_bochs_bo(obj);
+	bool is_iomem;
+	int ret;
+
+	ret = bochs_bo_pin(bo, TTM_PL_FLAG_VRAM);
+	if (ret)
+		return NULL;
+	ret = ttm_bo_kmap(&bo->bo, 0, bo->bo.num_pages, &bo->kmap);
+	if (ret) {
+		bochs_bo_unpin(bo);
+		return NULL;
+	}
+	return ttm_kmap_obj_virtual(&bo->kmap, &is_iomem);
+}
+
+void bochs_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr)
+{
+	struct bochs_bo *bo = gem_to_bochs_bo(obj);
+
+	ttm_bo_kunmap(&bo->kmap);
+	bochs_bo_unpin(bo);
+}
+
+int bochs_gem_prime_mmap(struct drm_gem_object *obj,
+			 struct vm_area_struct *vma)
+{
+	struct bochs_bo *bo = gem_to_bochs_bo(obj);
+
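+	/*
+	 * Point the GEM object's mmap offset at the TTM BO's, so that the
+	 * generic drm_gem_prime_mmap() maps the right node.
+	 */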
+	bo->gem.vma_node.vm_node.start = bo->bo.vma_node.vm_node.start;
+	return drm_gem_prime_mmap(obj, vma);
+}
diff --git a/drivers/gpu/drm/bridge/Kconfig b/drivers/gpu/drm/bridge/Kconfig
index 2fee47b0d50b..8840f396a7b6 100644
--- a/drivers/gpu/drm/bridge/Kconfig
+++ b/drivers/gpu/drm/bridge/Kconfig
@@ -30,6 +30,7 @@ config DRM_CDNS_DSI
 	select DRM_KMS_HELPER
 	select DRM_MIPI_DSI
 	select DRM_PANEL_BRIDGE
+	select GENERIC_PHY_MIPI_DPHY
 	depends on OF
 	help
 	  Support Cadence DPI to DSI bridge. This is an internal
diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511.h b/drivers/gpu/drm/bridge/adv7511/adv7511.h
index 73d8ccb97742..996a7e7dbfd6 100644
--- a/drivers/gpu/drm/bridge/adv7511/adv7511.h
+++ b/drivers/gpu/drm/bridge/adv7511/adv7511.h
@@ -14,8 +14,10 @@
 #include <linux/regmap.h>
 #include <linux/regulator/consumer.h>
 
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_bridge.h>
+#include <drm/drm_connector.h>
 #include <drm/drm_mipi_dsi.h>
+#include <drm/drm_modes.h>
 
 #define ADV7511_REG_CHIP_REVISION		0x00
 #define ADV7511_REG_N0				0x01
@@ -395,7 +397,7 @@ static inline int adv7511_cec_init(struct device *dev, struct adv7511 *adv7511)
 #ifdef CONFIG_DRM_I2C_ADV7533
 void adv7533_dsi_power_on(struct adv7511 *adv);
 void adv7533_dsi_power_off(struct adv7511 *adv);
-void adv7533_mode_set(struct adv7511 *adv, struct drm_display_mode *mode);
+void adv7533_mode_set(struct adv7511 *adv, const struct drm_display_mode *mode);
 int adv7533_patch_registers(struct adv7511 *adv);
 int adv7533_patch_cec_registers(struct adv7511 *adv);
 int adv7533_attach_dsi(struct adv7511 *adv);
@@ -411,7 +413,7 @@ static inline void adv7533_dsi_power_off(struct adv7511 *adv)
 }
 
 static inline void adv7533_mode_set(struct adv7511 *adv,
-				    struct drm_display_mode *mode)
+				    const struct drm_display_mode *mode)
 {
 }
 
diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
index 85c2d407a52e..ec2ca71e1323 100644
--- a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
+++ b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
@@ -17,6 +17,7 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_edid.h>
+#include <drm/drm_probe_helper.h>
 
 #include <media/cec.h>
 
@@ -676,8 +677,8 @@ static enum drm_mode_status adv7511_mode_valid(struct adv7511 *adv7511,
 }
 
 static void adv7511_mode_set(struct adv7511 *adv7511,
-			     struct drm_display_mode *mode,
-			     struct drm_display_mode *adj_mode)
+			     const struct drm_display_mode *mode,
+			     const struct drm_display_mode *adj_mode)
 {
 	unsigned int low_refresh_rate;
 	unsigned int hsync_polarity = 0;
@@ -839,8 +840,8 @@ static void adv7511_bridge_disable(struct drm_bridge *bridge)
 }
 
 static void adv7511_bridge_mode_set(struct drm_bridge *bridge,
-				    struct drm_display_mode *mode,
-				    struct drm_display_mode *adj_mode)
+				    const struct drm_display_mode *mode,
+				    const struct drm_display_mode *adj_mode)
 {
 	struct adv7511 *adv = bridge_to_adv7511(bridge);
 
diff --git a/drivers/gpu/drm/bridge/adv7511/adv7533.c b/drivers/gpu/drm/bridge/adv7511/adv7533.c
index 185b6d842166..5d5e7d9eded2 100644
--- a/drivers/gpu/drm/bridge/adv7511/adv7533.c
+++ b/drivers/gpu/drm/bridge/adv7511/adv7533.c
@@ -108,7 +108,7 @@ void adv7533_dsi_power_off(struct adv7511 *adv)
 	regmap_write(adv->regmap_cec, 0x27, 0x0b);
 }
 
-void adv7533_mode_set(struct adv7511 *adv, struct drm_display_mode *mode)
+void adv7533_mode_set(struct adv7511 *adv, const struct drm_display_mode *mode)
 {
 	struct mipi_dsi_device *dsi = adv->dsi;
 	int lanes, ret;
diff --git a/drivers/gpu/drm/bridge/analogix-anx78xx.c b/drivers/gpu/drm/bridge/analogix-anx78xx.c
index f8433c93f463..c09aaf93ae1b 100644
--- a/drivers/gpu/drm/bridge/analogix-anx78xx.c
+++ b/drivers/gpu/drm/bridge/analogix-anx78xx.c
@@ -31,9 +31,9 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_dp_helper.h>
 #include <drm/drm_edid.h>
+#include <drm/drm_probe_helper.h>
 
 #include "analogix-anx78xx.h"
 
@@ -1082,8 +1082,8 @@ static void anx78xx_bridge_disable(struct drm_bridge *bridge)
 }
 
 static void anx78xx_bridge_mode_set(struct drm_bridge *bridge,
-				    struct drm_display_mode *mode,
-				    struct drm_display_mode *adjusted_mode)
+				const struct drm_display_mode *mode,
+				const struct drm_display_mode *adjusted_mode)
 {
 	struct anx78xx *anx78xx = bridge_to_anx78xx(bridge);
 	struct hdmi_avi_infoframe frame;
@@ -1094,8 +1094,9 @@ static void anx78xx_bridge_mode_set(struct drm_bridge *bridge,
 
 	mutex_lock(&anx78xx->lock);
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, adjusted_mode,
-						       false);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame,
+						       &anx78xx->connector,
+						       adjusted_mode);
 	if (err) {
 		DRM_ERROR("Failed to setup AVI infoframe: %d\n", err);
 		goto unlock;
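
Note on the hunk above: it is one instance of a tree-wide conversion in this series. drm_hdmi_avi_infoframe_from_display_mode() now takes the connector in place of the old is_hdmi2 bool, so the helper can derive sink capabilities from connector state, and bridge mode_set callbacks take const mode pointers. A minimal sketch of the converted call pattern, assuming a hypothetical bridge driver ("foo", bridge_to_foo()) with an embedded drm_connector:

	static void foo_bridge_mode_set(struct drm_bridge *bridge,
					const struct drm_display_mode *mode,
					const struct drm_display_mode *adj)
	{
		struct foo *foo = bridge_to_foo(bridge);	/* hypothetical */
		struct hdmi_avi_infoframe frame;
		int err;

		/* The connector, not a bool, now selects HDMI 2.0 behaviour. */
		err = drm_hdmi_avi_infoframe_from_display_mode(&frame,
							       &foo->connector,
							       adj);
		if (err < 0)
			DRM_ERROR("Failed to setup AVI infoframe: %d\n", err);
	}
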
diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
index 753e96129ab7..225f5e5dd69b 100644
--- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
+++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
@@ -26,8 +26,8 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 
 #include <drm/bridge/analogix_dp.h>
 
@@ -1361,8 +1361,8 @@ static void analogix_dp_bridge_disable(struct drm_bridge *bridge)
 }
 
 static void analogix_dp_bridge_mode_set(struct drm_bridge *bridge,
-					struct drm_display_mode *orig_mode,
-					struct drm_display_mode *mode)
+				const struct drm_display_mode *orig_mode,
+				const struct drm_display_mode *mode)
 {
 	struct analogix_dp_device *dp = bridge->driver_private;
 	struct drm_display_info *display_info = &dp->connector.display_info;
diff --git a/drivers/gpu/drm/bridge/cdns-dsi.c b/drivers/gpu/drm/bridge/cdns-dsi.c
index ce9496d13986..6166dca6be81 100644
--- a/drivers/gpu/drm/bridge/cdns-dsi.c
+++ b/drivers/gpu/drm/bridge/cdns-dsi.c
@@ -7,12 +7,14 @@
 
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_bridge.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_drv.h>
 #include <drm/drm_mipi_dsi.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 #include <video/mipi_display.h>
 
 #include <linux/clk.h>
+#include <linux/interrupt.h>
 #include <linux/iopoll.h>
 #include <linux/module.h>
 #include <linux/of_address.h>
@@ -21,6 +23,9 @@
 #include <linux/pm_runtime.h>
 #include <linux/reset.h>
 
+#include <linux/phy/phy.h>
+#include <linux/phy/phy-mipi-dphy.h>
+
 #define IP_CONF				0x0
 #define SP_HS_FIFO_DEPTH(x)		(((x) & GENMASK(30, 26)) >> 26)
 #define SP_LP_FIFO_DEPTH(x)		(((x) & GENMASK(25, 21)) >> 21)
@@ -419,44 +424,11 @@
 #define DSI_NULL_FRAME_OVERHEAD		6
 #define DSI_EOT_PKT_SIZE		4
 
-#define REG_WAKEUP_TIME_NS		800
-#define DPHY_PLL_RATE_HZ		108000000
-
-/* DPHY registers */
-#define DPHY_PMA_CMN(reg)		(reg)
-#define DPHY_PMA_LCLK(reg)		(0x100 + (reg))
-#define DPHY_PMA_LDATA(lane, reg)	(0x200 + ((lane) * 0x100) + (reg))
-#define DPHY_PMA_RCLK(reg)		(0x600 + (reg))
-#define DPHY_PMA_RDATA(lane, reg)	(0x700 + ((lane) * 0x100) + (reg))
-#define DPHY_PCS(reg)			(0xb00 + (reg))
-
-#define DPHY_CMN_SSM			DPHY_PMA_CMN(0x20)
-#define DPHY_CMN_SSM_EN			BIT(0)
-#define DPHY_CMN_TX_MODE_EN		BIT(9)
-
-#define DPHY_CMN_PWM			DPHY_PMA_CMN(0x40)
-#define DPHY_CMN_PWM_DIV(x)		((x) << 20)
-#define DPHY_CMN_PWM_LOW(x)		((x) << 10)
-#define DPHY_CMN_PWM_HIGH(x)		(x)
-
-#define DPHY_CMN_FBDIV			DPHY_PMA_CMN(0x4c)
-#define DPHY_CMN_FBDIV_VAL(low, high)	(((high) << 11) | ((low) << 22))
-#define DPHY_CMN_FBDIV_FROM_REG		(BIT(10) | BIT(21))
-
-#define DPHY_CMN_OPIPDIV		DPHY_PMA_CMN(0x50)
-#define DPHY_CMN_IPDIV_FROM_REG		BIT(0)
-#define DPHY_CMN_IPDIV(x)		((x) << 1)
-#define DPHY_CMN_OPDIV_FROM_REG		BIT(6)
-#define DPHY_CMN_OPDIV(x)		((x) << 7)
-
-#define DPHY_PSM_CFG			DPHY_PCS(0x4)
-#define DPHY_PSM_CFG_FROM_REG		BIT(0)
-#define DPHY_PSM_CLK_DIV(x)		((x) << 1)
-
 struct cdns_dsi_output {
 	struct mipi_dsi_device *dev;
 	struct drm_panel *panel;
 	struct drm_bridge *bridge;
+	union phy_configure_opts phy_opts;
 };
 
 enum cdns_dsi_input_id {
@@ -465,14 +437,6 @@ enum cdns_dsi_input_id {
 	CDNS_DSC_INPUT,
 };
 
-struct cdns_dphy_cfg {
-	u8 pll_ipdiv;
-	u8 pll_opdiv;
-	u16 pll_fbdiv;
-	unsigned long lane_bps;
-	unsigned int nlanes;
-};
-
 struct cdns_dsi_cfg {
 	unsigned int hfp;
 	unsigned int hsa;
@@ -481,34 +445,6 @@ struct cdns_dsi_cfg {
 	unsigned int htotal;
 };
 
-struct cdns_dphy;
-
-enum cdns_dphy_clk_lane_cfg {
-	DPHY_CLK_CFG_LEFT_DRIVES_ALL = 0,
-	DPHY_CLK_CFG_LEFT_DRIVES_RIGHT = 1,
-	DPHY_CLK_CFG_LEFT_DRIVES_LEFT = 2,
-	DPHY_CLK_CFG_RIGHT_DRIVES_ALL = 3,
-};
-
-struct cdns_dphy_ops {
-	int (*probe)(struct cdns_dphy *dphy);
-	void (*remove)(struct cdns_dphy *dphy);
-	void (*set_psm_div)(struct cdns_dphy *dphy, u8 div);
-	void (*set_clk_lane_cfg)(struct cdns_dphy *dphy,
-				 enum cdns_dphy_clk_lane_cfg cfg);
-	void (*set_pll_cfg)(struct cdns_dphy *dphy,
-			    const struct cdns_dphy_cfg *cfg);
-	unsigned long (*get_wakeup_time_ns)(struct cdns_dphy *dphy);
-};
-
-struct cdns_dphy {
-	struct cdns_dphy_cfg cfg;
-	void __iomem *regs;
-	struct clk *psm_clk;
-	struct clk *pll_ref_clk;
-	const struct cdns_dphy_ops *ops;
-};
-
 struct cdns_dsi_input {
 	enum cdns_dsi_input_id id;
 	struct drm_bridge bridge;
@@ -526,7 +462,7 @@ struct cdns_dsi {
 	struct reset_control *dsi_p_rst;
 	struct clk *dsi_sys_clk;
 	bool link_initialized;
-	struct cdns_dphy *dphy;
+	struct phy *dphy;
 };
 
 static inline struct cdns_dsi *input_to_dsi(struct cdns_dsi_input *input)
@@ -545,173 +481,13 @@ bridge_to_cdns_dsi_input(struct drm_bridge *bridge)
 	return container_of(bridge, struct cdns_dsi_input, bridge);
 }
 
-static int cdns_dsi_get_dphy_pll_cfg(struct cdns_dphy *dphy,
-				     struct cdns_dphy_cfg *cfg,
-				     unsigned int dpi_htotal,
-				     unsigned int dpi_bpp,
-				     unsigned int dpi_hz,
-				     unsigned int dsi_htotal,
-				     unsigned int dsi_nlanes,
-				     unsigned int *dsi_hfp_ext)
-{
-	u64 dlane_bps, dlane_bps_max, fbdiv, fbdiv_max, adj_dsi_htotal;
-	unsigned long pll_ref_hz = clk_get_rate(dphy->pll_ref_clk);
-
-	memset(cfg, 0, sizeof(*cfg));
-
-	cfg->nlanes = dsi_nlanes;
-
-	if (pll_ref_hz < 9600000 || pll_ref_hz >= 150000000)
-		return -EINVAL;
-	else if (pll_ref_hz < 19200000)
-		cfg->pll_ipdiv = 1;
-	else if (pll_ref_hz < 38400000)
-		cfg->pll_ipdiv = 2;
-	else if (pll_ref_hz < 76800000)
-		cfg->pll_ipdiv = 4;
-	else
-		cfg->pll_ipdiv = 8;
-
-	/*
-	 * Make sure DSI htotal is aligned on a lane boundary when calculating
-	 * the expected data rate. This is done by extending HFP in case of
-	 * misalignment.
-	 */
-	adj_dsi_htotal = dsi_htotal;
-	if (dsi_htotal % dsi_nlanes)
-		adj_dsi_htotal += dsi_nlanes - (dsi_htotal % dsi_nlanes);
-
-	dlane_bps = (u64)dpi_hz * adj_dsi_htotal;
-
-	/* data rate in bytes/sec is not an integer, refuse the mode. */
-	if (do_div(dlane_bps, dsi_nlanes * dpi_htotal))
-		return -EINVAL;
-
-	/* data rate was in bytes/sec, convert to bits/sec. */
-	dlane_bps *= 8;
-
-	if (dlane_bps > 2500000000UL || dlane_bps < 160000000UL)
-		return -EINVAL;
-	else if (dlane_bps >= 1250000000)
-		cfg->pll_opdiv = 1;
-	else if (dlane_bps >= 630000000)
-		cfg->pll_opdiv = 2;
-	else if (dlane_bps >= 320000000)
-		cfg->pll_opdiv = 4;
-	else if (dlane_bps >= 160000000)
-		cfg->pll_opdiv = 8;
-
-	/*
-	 * Allow a deviation of 0.2% on the per-lane data rate to try to
-	 * recover a potential mismatch between DPI and PPI clks.
-	 */
-	dlane_bps_max = dlane_bps + DIV_ROUND_DOWN_ULL(dlane_bps, 500);
-	fbdiv_max = DIV_ROUND_DOWN_ULL(dlane_bps_max * 2 *
-				       cfg->pll_opdiv * cfg->pll_ipdiv,
-				       pll_ref_hz);
-	fbdiv = DIV_ROUND_UP_ULL(dlane_bps * 2 * cfg->pll_opdiv *
-				 cfg->pll_ipdiv,
-				 pll_ref_hz);
-
-	/*
-	 * Iterate over all acceptable fbdiv and try to find an adjusted DSI
-	 * htotal length providing an exact match.
-	 *
-	 * Note that we could do something even trickier by relying on the fact
-	 * that a new line is not necessarily aligned on a lane boundary, so,
-	 * by making adj_dsi_htotal non aligned on a dsi_lanes we can improve a
-	 * bit the precision. With this, the step would be
-	 *
-	 *	pll_ref_hz / (2 * opdiv * ipdiv * nlanes)
-	 *
-	 * instead of
-	 *
-	 *	pll_ref_hz / (2 * opdiv * ipdiv)
-	 *
-	 * The drawback of this approach is that we would need to make sure the
-	 * number or lines is a multiple of the realignment periodicity which is
-	 * a function of the number of lanes and the original misalignment. For
-	 * example, for NLANES = 4 and HTOTAL % NLANES = 3, it takes 4 lines
-	 * to realign on a lane:
-	 * LINE 0: expected number of bytes, starts emitting first byte of
-	 *	   LINE 1 on LANE 3
-	 * LINE 1: expected number of bytes, starts emitting first 2 bytes of
-	 *	   LINE 2 on LANES 2 and 3
-	 * LINE 2: expected number of bytes, starts emitting first 3 bytes of
-	 *	   of LINE 3 on LANES 1, 2 and 3
-	 * LINE 3: one byte less, now things are realigned on LANE 0 for LINE 4
-	 *
-	 * I figured this extra complexity was not worth the benefit, but if
-	 * someone really has unfixable mismatch, that would be something to
-	 * investigate.
-	 */
-	for (; fbdiv <= fbdiv_max; fbdiv++) {
-		u32 rem;
-
-		adj_dsi_htotal = (u64)fbdiv * pll_ref_hz * dsi_nlanes *
-				 dpi_htotal;
-
-		/*
-		 * Do the division in 2 steps to avoid an overflow on the
-		 * divider.
-		 */
-		rem = do_div(adj_dsi_htotal, dpi_hz);
-		if (rem)
-			continue;
-
-		rem = do_div(adj_dsi_htotal,
-			     cfg->pll_opdiv * cfg->pll_ipdiv * 2 * 8);
-		if (rem)
-			continue;
-
-		cfg->pll_fbdiv = fbdiv;
-		*dsi_hfp_ext = adj_dsi_htotal - dsi_htotal;
-		break;
-	}
-
-	/* No match, let's just reject the display mode. */
-	if (!cfg->pll_fbdiv)
-		return -EINVAL;
-
-	dlane_bps = DIV_ROUND_DOWN_ULL((u64)dpi_hz * adj_dsi_htotal * 8,
-				       dsi_nlanes * dpi_htotal);
-	cfg->lane_bps = dlane_bps;
-
-	return 0;
-}
-
-static int cdns_dphy_setup_psm(struct cdns_dphy *dphy)
-{
-	unsigned long psm_clk_hz = clk_get_rate(dphy->psm_clk);
-	unsigned long psm_div;
-
-	if (!psm_clk_hz || psm_clk_hz > 100000000)
-		return -EINVAL;
-
-	psm_div = DIV_ROUND_CLOSEST(psm_clk_hz, 1000000);
-	if (dphy->ops->set_psm_div)
-		dphy->ops->set_psm_div(dphy, psm_div);
-
-	return 0;
-}
-
-static void cdns_dphy_set_clk_lane_cfg(struct cdns_dphy *dphy,
-				       enum cdns_dphy_clk_lane_cfg cfg)
-{
-	if (dphy->ops->set_clk_lane_cfg)
-		dphy->ops->set_clk_lane_cfg(dphy, cfg);
-}
-
-static void cdns_dphy_set_pll_cfg(struct cdns_dphy *dphy,
-				  const struct cdns_dphy_cfg *cfg)
+static unsigned int mode_to_dpi_hfp(const struct drm_display_mode *mode,
+				    bool mode_valid_check)
 {
-	if (dphy->ops->set_pll_cfg)
-		dphy->ops->set_pll_cfg(dphy, cfg);
-}
+	if (mode_valid_check)
+		return mode->hsync_start - mode->hdisplay;
 
-static unsigned long cdns_dphy_get_wakeup_time_ns(struct cdns_dphy *dphy)
-{
-	return dphy->ops->get_wakeup_time_ns(dphy);
+	return mode->crtc_hsync_start - mode->crtc_hdisplay;
 }
 
 static unsigned int dpi_to_dsi_timing(unsigned int dpi_timing,
@@ -731,14 +507,12 @@ static unsigned int dpi_to_dsi_timing(unsigned int dpi_timing,
 static int cdns_dsi_mode2cfg(struct cdns_dsi *dsi,
 			     const struct drm_display_mode *mode,
 			     struct cdns_dsi_cfg *dsi_cfg,
-			     struct cdns_dphy_cfg *dphy_cfg,
 			     bool mode_valid_check)
 {
-	unsigned long dsi_htotal = 0, dsi_hss_hsa_hse_hbp = 0;
 	struct cdns_dsi_output *output = &dsi->output;
-	unsigned int dsi_hfp_ext = 0, dpi_hfp, tmp;
+	unsigned int tmp;
 	bool sync_pulse = false;
-	int bpp, nlanes, ret;
+	int bpp, nlanes;
 
 	memset(dsi_cfg, 0, sizeof(*dsi_cfg));
 
@@ -757,8 +531,6 @@ static int cdns_dsi_mode2cfg(struct cdns_dsi *dsi,
 		       mode->crtc_hsync_end : mode->crtc_hsync_start);
 
 	dsi_cfg->hbp = dpi_to_dsi_timing(tmp, bpp, DSI_HBP_FRAME_OVERHEAD);
-	dsi_htotal += dsi_cfg->hbp + DSI_HBP_FRAME_OVERHEAD;
-	dsi_hss_hsa_hse_hbp += dsi_cfg->hbp + DSI_HBP_FRAME_OVERHEAD;
 
 	if (sync_pulse) {
 		if (mode_valid_check)
@@ -768,49 +540,104 @@ static int cdns_dsi_mode2cfg(struct cdns_dsi *dsi,
 
 		dsi_cfg->hsa = dpi_to_dsi_timing(tmp, bpp,
 						 DSI_HSA_FRAME_OVERHEAD);
-		dsi_htotal += dsi_cfg->hsa + DSI_HSA_FRAME_OVERHEAD;
-		dsi_hss_hsa_hse_hbp += dsi_cfg->hsa + DSI_HSA_FRAME_OVERHEAD;
 	}
 
 	dsi_cfg->hact = dpi_to_dsi_timing(mode_valid_check ?
 					  mode->hdisplay : mode->crtc_hdisplay,
 					  bpp, 0);
-	dsi_htotal += dsi_cfg->hact;
+	dsi_cfg->hfp = dpi_to_dsi_timing(mode_to_dpi_hfp(mode, mode_valid_check),
+					 bpp, DSI_HFP_FRAME_OVERHEAD);
 
-	if (mode_valid_check)
-		dpi_hfp = mode->hsync_start - mode->hdisplay;
-	else
-		dpi_hfp = mode->crtc_hsync_start - mode->crtc_hdisplay;
+	return 0;
+}
+
+static int cdns_dsi_adjust_phy_config(struct cdns_dsi *dsi,
+			      struct cdns_dsi_cfg *dsi_cfg,
+			      struct phy_configure_opts_mipi_dphy *phy_cfg,
+			      const struct drm_display_mode *mode,
+			      bool mode_valid_check)
+{
+	struct cdns_dsi_output *output = &dsi->output;
+	unsigned long long dlane_bps;
+	unsigned long adj_dsi_htotal;
+	unsigned long dsi_htotal;
+	unsigned long dpi_htotal;
+	unsigned long dpi_hz;
+	unsigned int dsi_hfp_ext;
+	unsigned int lanes = output->dev->lanes;
+
+	dsi_htotal = dsi_cfg->hbp + DSI_HBP_FRAME_OVERHEAD;
+	if (output->dev->mode_flags & MIPI_DSI_MODE_VIDEO_SYNC_PULSE)
+		dsi_htotal += dsi_cfg->hsa + DSI_HSA_FRAME_OVERHEAD;
 
-	dsi_cfg->hfp = dpi_to_dsi_timing(dpi_hfp, bpp, DSI_HFP_FRAME_OVERHEAD);
+	dsi_htotal += dsi_cfg->hact;
 	dsi_htotal += dsi_cfg->hfp + DSI_HFP_FRAME_OVERHEAD;
 
-	if (mode_valid_check)
-		ret = cdns_dsi_get_dphy_pll_cfg(dsi->dphy, dphy_cfg,
-						mode->htotal, bpp,
-						mode->clock * 1000,
-						dsi_htotal, nlanes,
-						&dsi_hfp_ext);
-	else
-		ret = cdns_dsi_get_dphy_pll_cfg(dsi->dphy, dphy_cfg,
-						mode->crtc_htotal, bpp,
-						mode->crtc_clock * 1000,
-						dsi_htotal, nlanes,
-						&dsi_hfp_ext);
+	/*
+	 * Make sure DSI htotal is aligned on a lane boundary when calculating
+	 * the expected data rate. This is done by extending HFP in case of
+	 * misalignment.
+	 */
+	adj_dsi_htotal = dsi_htotal;
+	if (dsi_htotal % lanes)
+		adj_dsi_htotal += lanes - (dsi_htotal % lanes);
+
+	dpi_hz = (mode_valid_check ? mode->clock : mode->crtc_clock) * 1000;
+	dlane_bps = (unsigned long long)dpi_hz * adj_dsi_htotal;
+
+	/* data rate in bytes/sec is not an integer, refuse the mode. */
+	dpi_htotal = mode_valid_check ? mode->htotal : mode->crtc_htotal;
+	if (do_div(dlane_bps, lanes * dpi_htotal))
+		return -EINVAL;
+
+	/* data rate was in bytes/sec, convert to bits/sec. */
+	phy_cfg->hs_clk_rate = dlane_bps * 8;
 
+	dsi_hfp_ext = adj_dsi_htotal - dsi_htotal;
+	dsi_cfg->hfp += dsi_hfp_ext;
+	dsi_cfg->htotal = dsi_htotal + dsi_hfp_ext;
+
+	return 0;
+}
+
+static int cdns_dsi_check_conf(struct cdns_dsi *dsi,
+			       const struct drm_display_mode *mode,
+			       struct cdns_dsi_cfg *dsi_cfg,
+			       bool mode_valid_check)
+{
+	struct cdns_dsi_output *output = &dsi->output;
+	struct phy_configure_opts_mipi_dphy *phy_cfg = &output->phy_opts.mipi_dphy;
+	unsigned long dsi_hss_hsa_hse_hbp;
+	unsigned int nlanes = output->dev->lanes;
+	int ret;
+
+	ret = cdns_dsi_mode2cfg(dsi, mode, dsi_cfg, mode_valid_check);
 	if (ret)
 		return ret;
 
-	dsi_cfg->hfp += dsi_hfp_ext;
-	dsi_htotal += dsi_hfp_ext;
-	dsi_cfg->htotal = dsi_htotal;
+	phy_mipi_dphy_get_default_config(mode->crtc_clock * 1000,
+					 mipi_dsi_pixel_format_to_bpp(output->dev->format),
+					 nlanes, phy_cfg);
+
+	ret = cdns_dsi_adjust_phy_config(dsi, dsi_cfg, phy_cfg, mode, mode_valid_check);
+	if (ret)
+		return ret;
+
+	ret = phy_validate(dsi->dphy, PHY_MODE_MIPI_DPHY, 0, &output->phy_opts);
+	if (ret)
+		return ret;
+
+	dsi_hss_hsa_hse_hbp = dsi_cfg->hbp + DSI_HBP_FRAME_OVERHEAD;
+	if (output->dev->mode_flags & MIPI_DSI_MODE_VIDEO_SYNC_PULSE)
+		dsi_hss_hsa_hse_hbp += dsi_cfg->hsa + DSI_HSA_FRAME_OVERHEAD;
 
 	/*
 	 * Make sure DPI(HFP) > DSI(HSS+HSA+HSE+HBP) to guarantee that the FIFO
 	 * is empty before we start receiving a new line on the DPI
 	 * interface.
 	 */
-	if ((u64)dphy_cfg->lane_bps * dpi_hfp * nlanes <
+	if ((u64)phy_cfg->hs_clk_rate *
+	    mode_to_dpi_hfp(mode, mode_valid_check) * nlanes <
 	    (u64)dsi_hss_hsa_hse_hbp *
 	    (mode_valid_check ? mode->clock : mode->crtc_clock) * 1000)
 		return -EINVAL;
@@ -840,9 +667,8 @@ cdns_dsi_bridge_mode_valid(struct drm_bridge *bridge,
 	struct cdns_dsi_input *input = bridge_to_cdns_dsi_input(bridge);
 	struct cdns_dsi *dsi = input_to_dsi(input);
 	struct cdns_dsi_output *output = &dsi->output;
-	struct cdns_dphy_cfg dphy_cfg;
 	struct cdns_dsi_cfg dsi_cfg;
-	int bpp, nlanes, ret;
+	int bpp, ret;
 
 	/*
 	 * VFP_DSI should be less than VFP_DPI and VFP_DSI should be at
@@ -860,11 +686,9 @@ cdns_dsi_bridge_mode_valid(struct drm_bridge *bridge,
 	if ((mode->hdisplay * bpp) % 32)
 		return MODE_H_ILLEGAL;
 
-	nlanes = output->dev->lanes;
-
-	ret = cdns_dsi_mode2cfg(dsi, mode, &dsi_cfg, &dphy_cfg, true);
+	ret = cdns_dsi_check_conf(dsi, mode, &dsi_cfg, true);
 	if (ret)
-		return MODE_CLOCK_RANGE;
+		return MODE_BAD;
 
 	return MODE_OK;
 }
@@ -885,9 +709,9 @@ static void cdns_dsi_bridge_disable(struct drm_bridge *bridge)
 	pm_runtime_put(dsi->base.dev);
 }
 
-static void cdns_dsi_hs_init(struct cdns_dsi *dsi,
-			     const struct cdns_dphy_cfg *dphy_cfg)
+static void cdns_dsi_hs_init(struct cdns_dsi *dsi)
 {
+	struct cdns_dsi_output *output = &dsi->output;
 	u32 status;
 
 	/*
@@ -898,30 +722,10 @@ static void cdns_dsi_hs_init(struct cdns_dsi *dsi,
 	       DPHY_CMN_PDN | DPHY_PLL_PDN,
 	       dsi->regs + MCTL_DPHY_CFG0);
 
-	/*
-	 * Configure the internal PSM clk divider so that the DPHY has a
-	 * 1MHz clk (or something close).
-	 */
-	WARN_ON_ONCE(cdns_dphy_setup_psm(dsi->dphy));
-
-	/*
-	 * Configure attach clk lanes to data lanes: the DPHY has 2 clk lanes
-	 * and 8 data lanes, each clk lane can be attache different set of
-	 * data lanes. The 2 groups are named 'left' and 'right', so here we
-	 * just say that we want the 'left' clk lane to drive the 'left' data
-	 * lanes.
-	 */
-	cdns_dphy_set_clk_lane_cfg(dsi->dphy, DPHY_CLK_CFG_LEFT_DRIVES_LEFT);
-
-	/*
-	 * Configure the DPHY PLL that will be used to generate the TX byte
-	 * clk.
-	 */
-	cdns_dphy_set_pll_cfg(dsi->dphy, dphy_cfg);
-
-	/* Start TX state machine. */
-	writel(DPHY_CMN_SSM_EN | DPHY_CMN_TX_MODE_EN,
-	       dsi->dphy->regs + DPHY_CMN_SSM);
+	phy_init(dsi->dphy);
+	phy_set_mode(dsi->dphy, PHY_MODE_MIPI_DPHY);
+	phy_configure(dsi->dphy, &output->phy_opts);
+	phy_power_on(dsi->dphy);
 
 	/* Activate the PLL and wait until it's locked. */
 	writel(PLL_LOCKED, dsi->regs + MCTL_MAIN_STS_CLR);
@@ -931,7 +735,7 @@ static void cdns_dsi_hs_init(struct cdns_dsi *dsi,
 					status & PLL_LOCKED, 100, 100));
 	/* De-assert data and clock reset lines. */
 	writel(DPHY_CMN_PSO | DPHY_ALL_D_PDN | DPHY_C_PDN | DPHY_CMN_PDN |
-	       DPHY_D_RSTB(dphy_cfg->nlanes) | DPHY_C_RSTB,
+	       DPHY_D_RSTB(output->dev->lanes) | DPHY_C_RSTB,
 	       dsi->regs + MCTL_DPHY_CFG0);
 }
 
@@ -977,7 +781,7 @@ static void cdns_dsi_bridge_enable(struct drm_bridge *bridge)
 	struct cdns_dsi *dsi = input_to_dsi(input);
 	struct cdns_dsi_output *output = &dsi->output;
 	struct drm_display_mode *mode;
-	struct cdns_dphy_cfg dphy_cfg;
+	struct phy_configure_opts_mipi_dphy *phy_cfg = &output->phy_opts.mipi_dphy;
 	unsigned long tx_byte_period;
 	struct cdns_dsi_cfg dsi_cfg;
 	u32 tmp, reg_wakeup, div;
@@ -990,9 +794,9 @@ static void cdns_dsi_bridge_enable(struct drm_bridge *bridge)
 	bpp = mipi_dsi_pixel_format_to_bpp(output->dev->format);
 	nlanes = output->dev->lanes;
 
-	WARN_ON_ONCE(cdns_dsi_mode2cfg(dsi, mode, &dsi_cfg, &dphy_cfg, false));
+	WARN_ON_ONCE(cdns_dsi_check_conf(dsi, mode, &dsi_cfg, false));
 
-	cdns_dsi_hs_init(dsi, &dphy_cfg);
+	cdns_dsi_hs_init(dsi);
 	cdns_dsi_init_link(dsi);
 
 	writel(HBP_LEN(dsi_cfg.hbp) | HSA_LEN(dsi_cfg.hsa),
@@ -1028,9 +832,8 @@ static void cdns_dsi_bridge_enable(struct drm_bridge *bridge)
 		tmp -= DIV_ROUND_UP(DSI_EOT_PKT_SIZE, nlanes);
 
 	tx_byte_period = DIV_ROUND_DOWN_ULL((u64)NSEC_PER_SEC * 8,
-					    dphy_cfg.lane_bps);
-	reg_wakeup = cdns_dphy_get_wakeup_time_ns(dsi->dphy) /
-		     tx_byte_period;
+					    phy_cfg->hs_clk_rate);
+	reg_wakeup = (phy_cfg->hs_prepare + phy_cfg->hs_zero) / tx_byte_period;
 	writel(REG_WAKEUP_TIME(reg_wakeup) | REG_LINE_DURATION(tmp),
 	       dsi->regs + VID_DPHY_TIME);
 
@@ -1344,8 +1147,6 @@ static int __maybe_unused cdns_dsi_resume(struct device *dev)
 	reset_control_deassert(dsi->dsi_p_rst);
 	clk_prepare_enable(dsi->dsi_p_clk);
 	clk_prepare_enable(dsi->dsi_sys_clk);
-	clk_prepare_enable(dsi->dphy->psm_clk);
-	clk_prepare_enable(dsi->dphy->pll_ref_clk);
 
 	return 0;
 }
@@ -1354,8 +1155,6 @@ static int __maybe_unused cdns_dsi_suspend(struct device *dev)
 {
 	struct cdns_dsi *dsi = dev_get_drvdata(dev);
 
-	clk_disable_unprepare(dsi->dphy->pll_ref_clk);
-	clk_disable_unprepare(dsi->dphy->psm_clk);
 	clk_disable_unprepare(dsi->dsi_sys_clk);
 	clk_disable_unprepare(dsi->dsi_p_clk);
 	reset_control_assert(dsi->dsi_p_rst);
@@ -1366,121 +1165,6 @@ static int __maybe_unused cdns_dsi_suspend(struct device *dev)
 static UNIVERSAL_DEV_PM_OPS(cdns_dsi_pm_ops, cdns_dsi_suspend, cdns_dsi_resume,
 			    NULL);
 
-static unsigned long cdns_dphy_ref_get_wakeup_time_ns(struct cdns_dphy *dphy)
-{
-	/* Default wakeup time is 800 ns (in a simulated environment). */
-	return 800;
-}
-
-static void cdns_dphy_ref_set_pll_cfg(struct cdns_dphy *dphy,
-				      const struct cdns_dphy_cfg *cfg)
-{
-	u32 fbdiv_low, fbdiv_high;
-
-	fbdiv_low = (cfg->pll_fbdiv / 4) - 2;
-	fbdiv_high = cfg->pll_fbdiv - fbdiv_low - 2;
-
-	writel(DPHY_CMN_IPDIV_FROM_REG | DPHY_CMN_OPDIV_FROM_REG |
-	       DPHY_CMN_IPDIV(cfg->pll_ipdiv) |
-	       DPHY_CMN_OPDIV(cfg->pll_opdiv),
-	       dphy->regs + DPHY_CMN_OPIPDIV);
-	writel(DPHY_CMN_FBDIV_FROM_REG |
-	       DPHY_CMN_FBDIV_VAL(fbdiv_low, fbdiv_high),
-	       dphy->regs + DPHY_CMN_FBDIV);
-	writel(DPHY_CMN_PWM_HIGH(6) | DPHY_CMN_PWM_LOW(0x101) |
-	       DPHY_CMN_PWM_DIV(0x8),
-	       dphy->regs + DPHY_CMN_PWM);
-}
-
-static void cdns_dphy_ref_set_psm_div(struct cdns_dphy *dphy, u8 div)
-{
-	writel(DPHY_PSM_CFG_FROM_REG | DPHY_PSM_CLK_DIV(div),
-	       dphy->regs + DPHY_PSM_CFG);
-}
-
-/*
- * This is the reference implementation of DPHY hooks. Specific integration of
- * this IP may have to re-implement some of them depending on how they decided
- * to wire things in the SoC.
- */
-static const struct cdns_dphy_ops ref_dphy_ops = {
-	.get_wakeup_time_ns = cdns_dphy_ref_get_wakeup_time_ns,
-	.set_pll_cfg = cdns_dphy_ref_set_pll_cfg,
-	.set_psm_div = cdns_dphy_ref_set_psm_div,
-};
-
-static const struct of_device_id cdns_dphy_of_match[] = {
-	{ .compatible = "cdns,dphy", .data = &ref_dphy_ops },
-	{ /* sentinel */ },
-};
-
-static struct cdns_dphy *cdns_dphy_probe(struct platform_device *pdev)
-{
-	const struct of_device_id *match;
-	struct cdns_dphy *dphy;
-	struct of_phandle_args args;
-	struct resource res;
-	int ret;
-
-	ret = of_parse_phandle_with_args(pdev->dev.of_node, "phys",
-					 "#phy-cells", 0, &args);
-	if (ret)
-		return ERR_PTR(-ENOENT);
-
-	match = of_match_node(cdns_dphy_of_match, args.np);
-	if (!match || !match->data)
-		return ERR_PTR(-EINVAL);
-
-	dphy = devm_kzalloc(&pdev->dev, sizeof(*dphy), GFP_KERNEL);
-	if (!dphy)
-		return ERR_PTR(-ENOMEM);
-
-	dphy->ops = match->data;
-
-	ret = of_address_to_resource(args.np, 0, &res);
-	if (ret)
-		return ERR_PTR(ret);
-
-	dphy->regs = devm_ioremap_resource(&pdev->dev, &res);
-	if (IS_ERR(dphy->regs))
-		return ERR_CAST(dphy->regs);
-
-	dphy->psm_clk = of_clk_get_by_name(args.np, "psm");
-	if (IS_ERR(dphy->psm_clk))
-		return ERR_CAST(dphy->psm_clk);
-
-	dphy->pll_ref_clk = of_clk_get_by_name(args.np, "pll_ref");
-	if (IS_ERR(dphy->pll_ref_clk)) {
-		ret = PTR_ERR(dphy->pll_ref_clk);
-		goto err_put_psm_clk;
-	}
-
-	if (dphy->ops->probe) {
-		ret = dphy->ops->probe(dphy);
-		if (ret)
-			goto err_put_pll_ref_clk;
-	}
-
-	return dphy;
-
-err_put_pll_ref_clk:
-	clk_put(dphy->pll_ref_clk);
-
-err_put_psm_clk:
-	clk_put(dphy->psm_clk);
-
-	return ERR_PTR(ret);
-}
-
-static void cdns_dphy_remove(struct cdns_dphy *dphy)
-{
-	if (dphy->ops->remove)
-		dphy->ops->remove(dphy);
-
-	clk_put(dphy->pll_ref_clk);
-	clk_put(dphy->psm_clk);
-}
-
 static int cdns_dsi_drm_probe(struct platform_device *pdev)
 {
 	struct cdns_dsi *dsi;
@@ -1519,13 +1203,13 @@ static int cdns_dsi_drm_probe(struct platform_device *pdev)
 	if (irq < 0)
 		return irq;
 
-	dsi->dphy = cdns_dphy_probe(pdev);
+	dsi->dphy = devm_phy_get(&pdev->dev, "dphy");
 	if (IS_ERR(dsi->dphy))
 		return PTR_ERR(dsi->dphy);
 
 	ret = clk_prepare_enable(dsi->dsi_p_clk);
 	if (ret)
-		goto err_remove_dphy;
+		return ret;
 
 	val = readl(dsi->regs + ID_REG);
 	if (REV_VENDOR_ID(val) != 0xcad) {
@@ -1583,9 +1267,6 @@ err_disable_runtime_pm:
 err_disable_pclk:
 	clk_disable_unprepare(dsi->dsi_p_clk);
 
-err_remove_dphy:
-	cdns_dphy_remove(dsi->dphy);
-
 	return ret;
 }
 
@@ -1595,7 +1276,6 @@ static int cdns_dsi_drm_remove(struct platform_device *pdev)
 
 	mipi_dsi_host_unregister(&dsi->base);
 	pm_runtime_disable(&pdev->dev);
-	cdns_dphy_remove(dsi->dphy);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/bridge/dumb-vga-dac.c b/drivers/gpu/drm/bridge/dumb-vga-dac.c
index 9b706789a341..0805801f4e94 100644
--- a/drivers/gpu/drm/bridge/dumb-vga-dac.c
+++ b/drivers/gpu/drm/bridge/dumb-vga-dac.c
@@ -18,7 +18,7 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 struct dumb_vga {
 	struct drm_bridge	bridge;
diff --git a/drivers/gpu/drm/bridge/lvds-encoder.c b/drivers/gpu/drm/bridge/lvds-encoder.c
index f56c92f7af7c..ae8fc597eb38 100644
--- a/drivers/gpu/drm/bridge/lvds-encoder.c
+++ b/drivers/gpu/drm/bridge/lvds-encoder.c
@@ -11,11 +11,13 @@
 #include <drm/drm_bridge.h>
 #include <drm/drm_panel.h>
 
+#include <linux/gpio/consumer.h>
 #include <linux/of_graph.h>
 
 struct lvds_encoder {
 	struct drm_bridge bridge;
 	struct drm_bridge *panel_bridge;
+	struct gpio_desc *powerdown_gpio;
 };
 
 static int lvds_encoder_attach(struct drm_bridge *bridge)
@@ -28,54 +30,85 @@ static int lvds_encoder_attach(struct drm_bridge *bridge)
 				 bridge);
 }
 
+static void lvds_encoder_enable(struct drm_bridge *bridge)
+{
+	struct lvds_encoder *lvds_encoder = container_of(bridge,
+							 struct lvds_encoder,
+							 bridge);
+
+	if (lvds_encoder->powerdown_gpio)
+		gpiod_set_value_cansleep(lvds_encoder->powerdown_gpio, 0);
+}
+
+static void lvds_encoder_disable(struct drm_bridge *bridge)
+{
+	struct lvds_encoder *lvds_encoder = container_of(bridge,
+							 struct lvds_encoder,
+							 bridge);
+
+	if (lvds_encoder->powerdown_gpio)
+		gpiod_set_value_cansleep(lvds_encoder->powerdown_gpio, 1);
+}
+
 static struct drm_bridge_funcs funcs = {
 	.attach = lvds_encoder_attach,
+	.enable = lvds_encoder_enable,
+	.disable = lvds_encoder_disable,
 };
 
 static int lvds_encoder_probe(struct platform_device *pdev)
 {
+	struct device *dev = &pdev->dev;
 	struct device_node *port;
 	struct device_node *endpoint;
 	struct device_node *panel_node;
 	struct drm_panel *panel;
 	struct lvds_encoder *lvds_encoder;
 
-	lvds_encoder = devm_kzalloc(&pdev->dev, sizeof(*lvds_encoder),
-				    GFP_KERNEL);
+	lvds_encoder = devm_kzalloc(dev, sizeof(*lvds_encoder), GFP_KERNEL);
 	if (!lvds_encoder)
 		return -ENOMEM;
 
+	lvds_encoder->powerdown_gpio = devm_gpiod_get_optional(dev, "powerdown",
+							       GPIOD_OUT_HIGH);
+	if (IS_ERR(lvds_encoder->powerdown_gpio)) {
+		int err = PTR_ERR(lvds_encoder->powerdown_gpio);
+
+		if (err != -EPROBE_DEFER)
+			dev_err(dev, "powerdown GPIO failure: %d\n", err);
+		return err;
+	}
+
 	/* Locate the panel DT node. */
-	port = of_graph_get_port_by_id(pdev->dev.of_node, 1);
+	port = of_graph_get_port_by_id(dev->of_node, 1);
 	if (!port) {
-		dev_dbg(&pdev->dev, "port 1 not found\n");
+		dev_dbg(dev, "port 1 not found\n");
 		return -ENXIO;
 	}
 
 	endpoint = of_get_child_by_name(port, "endpoint");
 	of_node_put(port);
 	if (!endpoint) {
-		dev_dbg(&pdev->dev, "no endpoint for port 1\n");
+		dev_dbg(dev, "no endpoint for port 1\n");
 		return -ENXIO;
 	}
 
 	panel_node = of_graph_get_remote_port_parent(endpoint);
 	of_node_put(endpoint);
 	if (!panel_node) {
-		dev_dbg(&pdev->dev, "no remote endpoint for port 1\n");
+		dev_dbg(dev, "no remote endpoint for port 1\n");
 		return -ENXIO;
 	}
 
 	panel = of_drm_find_panel(panel_node);
 	of_node_put(panel_node);
 	if (IS_ERR(panel)) {
-		dev_dbg(&pdev->dev, "panel not found, deferring probe\n");
+		dev_dbg(dev, "panel not found, deferring probe\n");
 		return PTR_ERR(panel);
 	}
 
 	lvds_encoder->panel_bridge =
-		devm_drm_panel_bridge_add(&pdev->dev,
-					  panel, DRM_MODE_CONNECTOR_LVDS);
+		devm_drm_panel_bridge_add(dev, panel, DRM_MODE_CONNECTOR_LVDS);
 	if (IS_ERR(lvds_encoder->panel_bridge))
 		return PTR_ERR(lvds_encoder->panel_bridge);
 
@@ -83,7 +116,7 @@ static int lvds_encoder_probe(struct platform_device *pdev)
 	 * but we need a bridge attached to our of_node for our user
 	 * to look up.
 	 */
-	lvds_encoder->bridge.of_node = pdev->dev.of_node;
+	lvds_encoder->bridge.of_node = dev->of_node;
 	lvds_encoder->bridge.funcs = &funcs;
 	drm_bridge_add(&lvds_encoder->bridge);
 
diff --git a/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c b/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c
index 2136c97aeb8e..a01028ec4de6 100644
--- a/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c
+++ b/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c
@@ -36,8 +36,8 @@
 #include <linux/of.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drmP.h>
 
 #define EDID_EXT_BLOCK_CNT 0x7E
diff --git a/drivers/gpu/drm/bridge/nxp-ptn3460.c b/drivers/gpu/drm/bridge/nxp-ptn3460.c
index a3e817abace1..fb335afea4cf 100644
--- a/drivers/gpu/drm/bridge/nxp-ptn3460.c
+++ b/drivers/gpu/drm/bridge/nxp-ptn3460.c
@@ -22,10 +22,10 @@
 #include <linux/of_gpio.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drmP.h>
 
 #define PTN3460_EDID_ADDR			0x0
diff --git a/drivers/gpu/drm/bridge/panel.c b/drivers/gpu/drm/bridge/panel.c
index 7cbaba213ef6..38eeaf8ba959 100644
--- a/drivers/gpu/drm/bridge/panel.c
+++ b/drivers/gpu/drm/bridge/panel.c
@@ -12,9 +12,9 @@
 #include <drm/drm_panel.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_connector.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_encoder.h>
 #include <drm/drm_modeset_helper_vtables.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_panel.h>
 
 struct panel_bridge {
@@ -134,8 +134,8 @@ static const struct drm_bridge_funcs panel_bridge_bridge_funcs = {
 };
 
 /**
- * drm_panel_bridge_add - Creates a drm_bridge and drm_connector that
- * just calls the appropriate functions from drm_panel.
+ * drm_panel_bridge_add - Creates a &drm_bridge and &drm_connector that
+ * just calls the appropriate functions from &drm_panel.
  *
  * @panel: The drm_panel being wrapped.  Must be non-NULL.
  * @connector_type: The DRM_MODE_CONNECTOR_* for the connector to be
@@ -149,9 +149,12 @@ static const struct drm_bridge_funcs panel_bridge_bridge_funcs = {
  * passed to drm_bridge_attach().  The drm_panel_prepare() and related
  * functions can be dropped from the encoder driver (they're now
  * called by the KMS helpers before calling into the encoder), along
- * with connector creation.  When done with the bridge,
- * drm_bridge_detach() should be called as normal, then
+ * with connector creation.  When done with the bridge (after
+ * drm_mode_config_cleanup() if the bridge has already been attached), call
+ * drm_panel_bridge_remove() to free it.
+ *
+ * See devm_drm_panel_bridge_add() for an automatically managed version of this
+ * function.
  */
 struct drm_bridge *drm_panel_bridge_add(struct drm_panel *panel,
 					u32 connector_type)
@@ -210,6 +213,17 @@ static void devm_drm_panel_bridge_release(struct device *dev, void *res)
 	drm_panel_bridge_remove(*bridge);
 }
 
+/**
+ * devm_drm_panel_bridge_add - Creates a managed &drm_bridge and &drm_connector
+ * that just calls the appropriate functions from &drm_panel.
+ * @dev: device to tie the bridge lifetime to
+ * @panel: The drm_panel being wrapped.  Must be non-NULL.
+ * @connector_type: The DRM_MODE_CONNECTOR_* for the connector to be
+ * created.
+ *
+ * This is the managed version of drm_panel_bridge_add() which automatically
+ * calls drm_panel_bridge_remove() when @dev is unbound.
+ */
 struct drm_bridge *devm_drm_panel_bridge_add(struct device *dev,
 					     struct drm_panel *panel,
 					     u32 connector_type)
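
For reference, the managed variant documented above is used elsewhere in this series (e.g. lvds-encoder.c) roughly as follows; the panel lookup is driver-specific and elided here:

	struct drm_bridge *bridge;

	bridge = devm_drm_panel_bridge_add(dev, panel,
					   DRM_MODE_CONNECTOR_LVDS);
	if (IS_ERR(bridge))
		return PTR_ERR(bridge);
	/* No explicit drm_panel_bridge_remove(); it runs when dev unbinds. */
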
diff --git a/drivers/gpu/drm/bridge/parade-ps8622.c b/drivers/gpu/drm/bridge/parade-ps8622.c
index 7334d1b62b71..fda1395b7481 100644
--- a/drivers/gpu/drm/bridge/parade-ps8622.c
+++ b/drivers/gpu/drm/bridge/parade-ps8622.c
@@ -26,9 +26,9 @@
 #include <linux/regulator/consumer.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drmP.h>
 
 /* Brightness scale on the Parade chip */
diff --git a/drivers/gpu/drm/bridge/sii902x.c b/drivers/gpu/drm/bridge/sii902x.c
index bfa902013aa4..08e12fef1349 100644
--- a/drivers/gpu/drm/bridge/sii902x.c
+++ b/drivers/gpu/drm/bridge/sii902x.c
@@ -30,8 +30,8 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
+#include <drm/drm_probe_helper.h>
 
 #define SII902X_TPI_VIDEO_DATA			0x0
 
@@ -232,8 +232,8 @@ static void sii902x_bridge_enable(struct drm_bridge *bridge)
 }
 
 static void sii902x_bridge_mode_set(struct drm_bridge *bridge,
-				    struct drm_display_mode *mode,
-				    struct drm_display_mode *adj)
+				    const struct drm_display_mode *mode,
+				    const struct drm_display_mode *adj)
 {
 	struct sii902x *sii902x = bridge_to_sii902x(bridge);
 	struct regmap *regmap = sii902x->regmap;
@@ -258,7 +258,8 @@ static void sii902x_bridge_mode_set(struct drm_bridge *bridge,
 	if (ret)
 		return;
 
-	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame, adj, false);
+	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame,
+						       &sii902x->connector, adj);
 	if (ret < 0) {
 		DRM_ERROR("couldn't fill AVI infoframe\n");
 		return;
diff --git a/drivers/gpu/drm/bridge/sil-sii8620.c b/drivers/gpu/drm/bridge/sil-sii8620.c
index a6e8f4591e63..0cc293a6ac24 100644
--- a/drivers/gpu/drm/bridge/sil-sii8620.c
+++ b/drivers/gpu/drm/bridge/sil-sii8620.c
@@ -1104,8 +1104,7 @@ static void sii8620_set_infoframes(struct sii8620 *ctx,
 	int ret;
 
 	ret = drm_hdmi_avi_infoframe_from_display_mode(&frm.avi,
-						       mode,
-						       true);
+						       NULL, mode);
 	if (ctx->use_packed_pixel)
 		frm.avi.colorspace = HDMI_COLORSPACE_YUV422;
 
diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
index 8f9c8a6b46de..5cbb71a866d5 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
@@ -1,13 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * dw-hdmi-i2s-audio.c
  *
  * Copyright (c) 2017 Renesas Solutions Corp.
  * Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
  */
+
+#include <linux/dma-mapping.h>
+#include <linux/module.h>
+
 #include <drm/bridge/dw_hdmi.h>
 
 #include <sound/hdmi-codec.h>
diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
index 64c3cf027518..a63e5f0dae56 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
@@ -25,9 +25,10 @@
 #include <drm/drm_of.h>
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_encoder_slave.h>
+#include <drm/drm_scdc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/bridge/dw_hdmi.h>
 
 #include <uapi/linux/media-bus-format.h>
@@ -43,6 +44,11 @@
 
 #define HDMI_EDID_LEN		512
 
+/* DW-HDMI Controllers >= 0x200a are at least compliant with SCDC version 1 */
+#define SCDC_MIN_SOURCE_VERSION	0x1
+
+#define HDMI14_MAX_TMDSCLK	340000000
+
 enum hdmi_datamap {
 	RGB444_8B = 0x01,
 	RGB444_10B = 0x03,
@@ -93,6 +99,7 @@ struct hdmi_vmode {
 	unsigned int mpixelclock;
 	unsigned int mpixelrepetitioninput;
 	unsigned int mpixelrepetitionoutput;
+	unsigned int mtmdsclock;
 };
 
 struct hdmi_data_info {
@@ -537,7 +544,7 @@ static void hdmi_init_clk_regenerator(struct dw_hdmi *hdmi)
 static void hdmi_clk_regenerator_update_pixel_clock(struct dw_hdmi *hdmi)
 {
 	mutex_lock(&hdmi->audio_mutex);
-	hdmi_set_clk_regenerator(hdmi, hdmi->hdmi_data.video_mode.mpixelclock,
+	hdmi_set_clk_regenerator(hdmi, hdmi->hdmi_data.video_mode.mtmdsclock,
 				 hdmi->sample_rate);
 	mutex_unlock(&hdmi->audio_mutex);
 }
@@ -546,7 +553,7 @@ void dw_hdmi_set_sample_rate(struct dw_hdmi *hdmi, unsigned int rate)
 {
 	mutex_lock(&hdmi->audio_mutex);
 	hdmi->sample_rate = rate;
-	hdmi_set_clk_regenerator(hdmi, hdmi->hdmi_data.video_mode.mpixelclock,
+	hdmi_set_clk_regenerator(hdmi, hdmi->hdmi_data.video_mode.mtmdsclock,
 				 hdmi->sample_rate);
 	mutex_unlock(&hdmi->audio_mutex);
 }
@@ -647,6 +654,20 @@ static bool hdmi_bus_fmt_is_yuv422(unsigned int bus_format)
 	}
 }
 
+static bool hdmi_bus_fmt_is_yuv420(unsigned int bus_format)
+{
+	switch (bus_format) {
+	case MEDIA_BUS_FMT_UYYVYY8_0_5X24:
+	case MEDIA_BUS_FMT_UYYVYY10_0_5X30:
+	case MEDIA_BUS_FMT_UYYVYY12_0_5X36:
+	case MEDIA_BUS_FMT_UYYVYY16_0_5X48:
+		return true;
+
+	default:
+		return false;
+	}
+}
+
 static int hdmi_bus_fmt_color_depth(unsigned int bus_format)
 {
 	switch (bus_format) {
@@ -876,7 +897,8 @@ static void hdmi_video_packetize(struct dw_hdmi *hdmi)
 	u8 val, vp_conf;
 
 	if (hdmi_bus_fmt_is_rgb(hdmi->hdmi_data.enc_out_bus_format) ||
-	    hdmi_bus_fmt_is_yuv444(hdmi->hdmi_data.enc_out_bus_format)) {
+	    hdmi_bus_fmt_is_yuv444(hdmi->hdmi_data.enc_out_bus_format) ||
+	    hdmi_bus_fmt_is_yuv420(hdmi->hdmi_data.enc_out_bus_format)) {
 		switch (hdmi_bus_fmt_color_depth(
 					hdmi->hdmi_data.enc_out_bus_format)) {
 		case 8:
@@ -1015,6 +1037,33 @@ void dw_hdmi_phy_i2c_write(struct dw_hdmi *hdmi, unsigned short data,
 }
 EXPORT_SYMBOL_GPL(dw_hdmi_phy_i2c_write);
 
+/*
+ * HDMI2.0 Specifies the following procedure for High TMDS Bit Rates:
+ * - The Source shall suspend transmission of the TMDS clock and data
+ * - The Source shall write to the TMDS_Bit_Clock_Ratio bit to change it
+ * from a 0 to a 1 or from a 1 to a 0
+ * - The Source shall allow a minimum of 1 ms and a maximum of 100 ms from
+ * the time the TMDS_Bit_Clock_Ratio bit is written until resuming
+ * transmission of TMDS clock and data
+ *
+ * To respect the 100ms maximum delay, the dw_hdmi_set_high_tmds_clock_ratio()
+ * helper should be called right before enabling the TMDS Clock and Data in
+ * the PHY configuration callback.
+ */
+void dw_hdmi_set_high_tmds_clock_ratio(struct dw_hdmi *hdmi)
+{
+	unsigned long mtmdsclock = hdmi->hdmi_data.video_mode.mtmdsclock;
+
+	/* Control for TMDS Bit Period/TMDS Clock-Period Ratio */
+	if (hdmi->connector.display_info.hdmi.scdc.supported) {
+		if (mtmdsclock > HDMI14_MAX_TMDSCLK)
+			drm_scdc_set_high_tmds_clock_ratio(hdmi->ddc, 1);
+		else
+			drm_scdc_set_high_tmds_clock_ratio(hdmi->ddc, 0);
+	}
+}
+EXPORT_SYMBOL_GPL(dw_hdmi_set_high_tmds_clock_ratio);
+
 static void dw_hdmi_phy_enable_powerdown(struct dw_hdmi *hdmi, bool enable)
 {
 	hdmi_mask_writeb(hdmi, !enable, HDMI_PHY_CONF0,
@@ -1165,6 +1214,8 @@ static int hdmi_phy_configure_dwc_hdmi_3d_tx(struct dw_hdmi *hdmi,
 	const struct dw_hdmi_curr_ctrl *curr_ctrl = pdata->cur_ctr;
 	const struct dw_hdmi_phy_config *phy_config = pdata->phy_config;
 
+	/* TOFIX Will need 420 specific PHY configuration tables */
+
 	/* PLL/MPLL Cfg - always match on final entry */
 	for (; mpll_config->mpixelclock != ~0UL; mpll_config++)
 		if (mpixelclock <= mpll_config->mpixelclock)
@@ -1212,10 +1263,13 @@ static int hdmi_phy_configure(struct dw_hdmi *hdmi)
 	const struct dw_hdmi_phy_data *phy = hdmi->phy.data;
 	const struct dw_hdmi_plat_data *pdata = hdmi->plat_data;
 	unsigned long mpixelclock = hdmi->hdmi_data.video_mode.mpixelclock;
+	unsigned long mtmdsclock = hdmi->hdmi_data.video_mode.mtmdsclock;
 	int ret;
 
 	dw_hdmi_phy_power_off(hdmi);
 
+	dw_hdmi_set_high_tmds_clock_ratio(hdmi);
+
 	/* Leave low power consumption mode by asserting SVSRET. */
 	if (phy->has_svsret)
 		dw_hdmi_phy_enable_svsret(hdmi, 1);
@@ -1237,6 +1291,10 @@ static int hdmi_phy_configure(struct dw_hdmi *hdmi)
 		return ret;
 	}
 
+	/* Wait for resuming transmission of TMDS clock and data */
+	if (mtmdsclock > HDMI14_MAX_TMDSCLK)
+		msleep(100);
+
 	return dw_hdmi_phy_power_on(hdmi);
 }
 
@@ -1344,12 +1402,15 @@ static void hdmi_config_AVI(struct dw_hdmi *hdmi, struct drm_display_mode *mode)
 	u8 val;
 
 	/* Initialise info frame from DRM mode */
-	drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
+	drm_hdmi_avi_infoframe_from_display_mode(&frame,
+						 &hdmi->connector, mode);
 
 	if (hdmi_bus_fmt_is_yuv444(hdmi->hdmi_data.enc_out_bus_format))
 		frame.colorspace = HDMI_COLORSPACE_YUV444;
 	else if (hdmi_bus_fmt_is_yuv422(hdmi->hdmi_data.enc_out_bus_format))
 		frame.colorspace = HDMI_COLORSPACE_YUV422;
+	else if (hdmi_bus_fmt_is_yuv420(hdmi->hdmi_data.enc_out_bus_format))
+		frame.colorspace = HDMI_COLORSPACE_YUV420;
 	else
 		frame.colorspace = HDMI_COLORSPACE_RGB;
 
@@ -1503,17 +1564,23 @@ static void hdmi_config_vendor_specific_infoframe(struct dw_hdmi *hdmi,
 static void hdmi_av_composer(struct dw_hdmi *hdmi,
 			     const struct drm_display_mode *mode)
 {
-	u8 inv_val;
+	u8 inv_val, bytes;
+	struct drm_hdmi_info *hdmi_info = &hdmi->connector.display_info.hdmi;
 	struct hdmi_vmode *vmode = &hdmi->hdmi_data.video_mode;
 	int hblank, vblank, h_de_hs, v_de_vs, hsync_len, vsync_len;
-	unsigned int vdisplay;
+	unsigned int vdisplay, hdisplay;
 
-	vmode->mpixelclock = mode->clock * 1000;
+	vmode->mtmdsclock = vmode->mpixelclock = mode->clock * 1000;
 
 	dev_dbg(hdmi->dev, "final pixclk = %d\n", vmode->mpixelclock);
 
+	if (hdmi_bus_fmt_is_yuv420(hdmi->hdmi_data.enc_out_bus_format))
+		vmode->mtmdsclock /= 2;
+
 	/* Set up HDMI_FC_INVIDCONF */
-	inv_val = (hdmi->hdmi_data.hdcp_enable ?
+	inv_val = (hdmi->hdmi_data.hdcp_enable ||
+		   vmode->mtmdsclock > HDMI14_MAX_TMDSCLK ||
+		   hdmi_info->scdc.scrambling.low_rates ?
 		HDMI_FC_INVIDCONF_HDCP_KEEPOUT_ACTIVE :
 		HDMI_FC_INVIDCONF_HDCP_KEEPOUT_INACTIVE);
 
@@ -1546,6 +1613,22 @@ static void hdmi_av_composer(struct dw_hdmi *hdmi,
 
 	hdmi_writeb(hdmi, inv_val, HDMI_FC_INVIDCONF);
 
+	hdisplay = mode->hdisplay;
+	hblank = mode->htotal - mode->hdisplay;
+	h_de_hs = mode->hsync_start - mode->hdisplay;
+	hsync_len = mode->hsync_end - mode->hsync_start;
+
+	/*
+	 * When we're setting a YCbCr420 mode, we need
+	 * to adjust the horizontal timing to suit.
+	 */
+	if (hdmi_bus_fmt_is_yuv420(hdmi->hdmi_data.enc_out_bus_format)) {
+		hdisplay /= 2;
+		hblank /= 2;
+		h_de_hs /= 2;
+		hsync_len /= 2;
+	}
+
 	vdisplay = mode->vdisplay;
 	vblank = mode->vtotal - mode->vdisplay;
 	v_de_vs = mode->vsync_start - mode->vdisplay;
@@ -1562,16 +1645,54 @@ static void hdmi_av_composer(struct dw_hdmi *hdmi,
 		vsync_len /= 2;
 	}
 
+	/* Scrambling Control */
+	if (hdmi_info->scdc.supported) {
+		if (vmode->mtmdsclock > HDMI14_MAX_TMDSCLK ||
+		    hdmi_info->scdc.scrambling.low_rates) {
+			/*
+			 * HDMI2.0 Specifies the following procedure:
+			 * After the Source Device has determined that
+			 * SCDC_Present is set (=1), the Source Device should
+			 * write the accurate Version of the Source Device
+			 * to the Source Version field in the SCDCS.
+			 * Compliant Source Devices shall set the
+			 * Source Version = 1.
+			 */
+			drm_scdc_readb(&hdmi->i2c->adap, SCDC_SINK_VERSION,
+				       &bytes);
+			drm_scdc_writeb(&hdmi->i2c->adap, SCDC_SOURCE_VERSION,
+				min_t(u8, bytes, SCDC_MIN_SOURCE_VERSION));
+
+			/* Enable Scrambling in the Sink */
+			drm_scdc_set_scrambling(&hdmi->i2c->adap, 1);
+
+			/*
+			 * To activate the scrambler feature, you must ensure
+			 * that the quasi-static configuration bit
+			 * fc_invidconf.HDCP_keepout is set at configuration
+			 * time, before the required mc_swrstzreq.tmdsswrst_req
+			 * reset request is issued.
+			 */
+			hdmi_writeb(hdmi, (u8)~HDMI_MC_SWRSTZ_TMDSSWRST_REQ,
+				    HDMI_MC_SWRSTZ);
+			hdmi_writeb(hdmi, 1, HDMI_FC_SCRAMBLER_CTRL);
+		} else {
+			hdmi_writeb(hdmi, 0, HDMI_FC_SCRAMBLER_CTRL);
+			hdmi_writeb(hdmi, (u8)~HDMI_MC_SWRSTZ_TMDSSWRST_REQ,
+				    HDMI_MC_SWRSTZ);
+			drm_scdc_set_scrambling(&hdmi->i2c->adap, 0);
+		}
+	}
+
 	/* Set up horizontal active pixel width */
-	hdmi_writeb(hdmi, mode->hdisplay >> 8, HDMI_FC_INHACTV1);
-	hdmi_writeb(hdmi, mode->hdisplay, HDMI_FC_INHACTV0);
+	hdmi_writeb(hdmi, hdisplay >> 8, HDMI_FC_INHACTV1);
+	hdmi_writeb(hdmi, hdisplay, HDMI_FC_INHACTV0);
 
 	/* Set up vertical active lines */
 	hdmi_writeb(hdmi, vdisplay >> 8, HDMI_FC_INVACTV1);
 	hdmi_writeb(hdmi, vdisplay, HDMI_FC_INVACTV0);
 
 	/* Set up horizontal blanking pixel region width */
-	hblank = mode->htotal - mode->hdisplay;
 	hdmi_writeb(hdmi, hblank >> 8, HDMI_FC_INHBLANK1);
 	hdmi_writeb(hdmi, hblank, HDMI_FC_INHBLANK0);
 
@@ -1579,7 +1700,6 @@ static void hdmi_av_composer(struct dw_hdmi *hdmi,
 	hdmi_writeb(hdmi, vblank, HDMI_FC_INVBLANK);
 
 	/* Set up HSYNC active edge delay width (in pixel clks) */
-	h_de_hs = mode->hsync_start - mode->hdisplay;
 	hdmi_writeb(hdmi, h_de_hs >> 8, HDMI_FC_HSYNCINDELAY1);
 	hdmi_writeb(hdmi, h_de_hs, HDMI_FC_HSYNCINDELAY0);
 
@@ -1587,7 +1707,6 @@ static void hdmi_av_composer(struct dw_hdmi *hdmi,
 	hdmi_writeb(hdmi, v_de_vs, HDMI_FC_VSYNCINDELAY);
 
 	/* Set up HSYNC active pulse width (in pixel clks) */
-	hsync_len = mode->hsync_end - mode->hsync_start;
 	hdmi_writeb(hdmi, hsync_len >> 8, HDMI_FC_HSYNCINWIDTH1);
 	hdmi_writeb(hdmi, hsync_len, HDMI_FC_HSYNCINWIDTH0);
 
@@ -1998,8 +2117,8 @@ dw_hdmi_bridge_mode_valid(struct drm_bridge *bridge,
 }
 
 static void dw_hdmi_bridge_mode_set(struct drm_bridge *bridge,
-				    struct drm_display_mode *orig_mode,
-				    struct drm_display_mode *mode)
+				    const struct drm_display_mode *orig_mode,
+				    const struct drm_display_mode *mode)
 {
 	struct dw_hdmi *hdmi = bridge->driver_private;
 
diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.h b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.h
index 9d90eb9c46e5..3f3c616eba97 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.h
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.h
@@ -255,6 +255,7 @@
 #define HDMI_FC_MASK2                           0x10DA
 #define HDMI_FC_POL2                            0x10DB
 #define HDMI_FC_PRCONF                          0x10E0
+#define HDMI_FC_SCRAMBLER_CTRL                  0x10E1
 
 #define HDMI_FC_GMD_STAT                        0x1100
 #define HDMI_FC_GMD_EN                          0x1101
diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
index 2f4b145b73af..e915ae8c9a92 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
@@ -19,9 +19,9 @@
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_bridge.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_mipi_dsi.h>
 #include <drm/drm_of.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/bridge/dw_mipi_dsi.h>
 #include <video/mipi_display.h>
 
@@ -248,7 +248,7 @@ static inline bool dw_mipi_is_dual_mode(struct dw_mipi_dsi *dsi)
  * The controller should generate 2 frames before
  * preparing the peripheral.
  */
-static void dw_mipi_dsi_wait_for_two_frames(struct drm_display_mode *mode)
+static void dw_mipi_dsi_wait_for_two_frames(const struct drm_display_mode *mode)
 {
 	int refresh, two_frames;
 
@@ -564,7 +564,7 @@ static void dw_mipi_dsi_init(struct dw_mipi_dsi *dsi)
 }
 
 static void dw_mipi_dsi_dpi_config(struct dw_mipi_dsi *dsi,
-				   struct drm_display_mode *mode)
+				   const struct drm_display_mode *mode)
 {
 	u32 val = 0, color = 0;
 
@@ -607,7 +607,7 @@ static void dw_mipi_dsi_packet_handler_config(struct dw_mipi_dsi *dsi)
 }
 
 static void dw_mipi_dsi_video_packet_config(struct dw_mipi_dsi *dsi,
-					    struct drm_display_mode *mode)
+					    const struct drm_display_mode *mode)
 {
 	/*
 	 * TODO dw drv improvements
@@ -642,7 +642,7 @@ static void dw_mipi_dsi_command_mode_config(struct dw_mipi_dsi *dsi)
 
 /* Get lane byte clock cycles. */
 static u32 dw_mipi_dsi_get_hcomponent_lbcc(struct dw_mipi_dsi *dsi,
-					   struct drm_display_mode *mode,
+					   const struct drm_display_mode *mode,
 					   u32 hcomponent)
 {
 	u32 frac, lbcc;
@@ -658,7 +658,7 @@ static u32 dw_mipi_dsi_get_hcomponent_lbcc(struct dw_mipi_dsi *dsi,
 }
 
 static void dw_mipi_dsi_line_timer_config(struct dw_mipi_dsi *dsi,
-					  struct drm_display_mode *mode)
+					  const struct drm_display_mode *mode)
 {
 	u32 htotal, hsa, hbp, lbcc;
 
@@ -681,7 +681,7 @@ static void dw_mipi_dsi_line_timer_config(struct dw_mipi_dsi *dsi,
 }
 
 static void dw_mipi_dsi_vertical_timing_config(struct dw_mipi_dsi *dsi,
-					       struct drm_display_mode *mode)
+					const struct drm_display_mode *mode)
 {
 	u32 vactive, vsa, vfp, vbp;
 
@@ -818,7 +818,7 @@ static unsigned int dw_mipi_dsi_get_lanes(struct dw_mipi_dsi *dsi)
 }
 
 static void dw_mipi_dsi_mode_set(struct dw_mipi_dsi *dsi,
-				struct drm_display_mode *adjusted_mode)
+				 const struct drm_display_mode *adjusted_mode)
 {
 	const struct dw_mipi_dsi_phy_ops *phy_ops = dsi->plat_data->phy_ops;
 	void *priv_data = dsi->plat_data->priv_data;
@@ -861,8 +861,8 @@ static void dw_mipi_dsi_mode_set(struct dw_mipi_dsi *dsi,
 }
 
 static void dw_mipi_dsi_bridge_mode_set(struct drm_bridge *bridge,
-					struct drm_display_mode *mode,
-					struct drm_display_mode *adjusted_mode)
+					const struct drm_display_mode *mode,
+					const struct drm_display_mode *adjusted_mode)
 {
 	struct dw_mipi_dsi *dsi = bridge_to_dsi(bridge);
 
diff --git a/drivers/gpu/drm/bridge/tc358764.c b/drivers/gpu/drm/bridge/tc358764.c
index afd491018bfc..a20e454ddd64 100644
--- a/drivers/gpu/drm/bridge/tc358764.c
+++ b/drivers/gpu/drm/bridge/tc358764.c
@@ -9,11 +9,11 @@
 
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_mipi_dsi.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drmP.h>
 #include <linux/gpio/consumer.h>
 #include <linux/of_graph.h>
diff --git a/drivers/gpu/drm/bridge/tc358767.c b/drivers/gpu/drm/bridge/tc358767.c
index e6403b9549f1..888980d4bc74 100644
--- a/drivers/gpu/drm/bridge/tc358767.c
+++ b/drivers/gpu/drm/bridge/tc358767.c
@@ -34,11 +34,11 @@
 #include <linux/slab.h>
 
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_dp_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 
 /* Registers */
 
@@ -208,7 +208,7 @@ struct tc_data {
 	/* display edid */
 	struct edid		*edid;
 	/* current mode */
-	struct drm_display_mode	*mode;
+	const struct drm_display_mode	*mode;
 
 	u32			rev;
 	u8			assr;
@@ -657,7 +657,8 @@ err_dpcd_read:
 	return ret;
 }
 
-static int tc_set_video_mode(struct tc_data *tc, struct drm_display_mode *mode)
+static int tc_set_video_mode(struct tc_data *tc,
+			     const struct drm_display_mode *mode)
 {
 	int ret;
 	int vid_sync_dly;
@@ -1136,8 +1137,8 @@ static enum drm_mode_status tc_connector_mode_valid(struct drm_connector *connec
 }
 
 static void tc_bridge_mode_set(struct drm_bridge *bridge,
-			       struct drm_display_mode *mode,
-			       struct drm_display_mode *adj)
+			       const struct drm_display_mode *mode,
+			       const struct drm_display_mode *adj)
 {
 	struct tc_data *tc = bridge_to_tc(bridge);
 
diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index 10243965ee7c..f72ee137e5f1 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -6,11 +6,11 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_dp_helper.h>
 #include <drm/drm_mipi_dsi.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/clk.h>
 #include <linux/gpio/consumer.h>
 #include <linux/i2c.h>
diff --git a/drivers/gpu/drm/bridge/ti-tfp410.c b/drivers/gpu/drm/bridge/ti-tfp410.c
index c3e32138c6bb..7bfb4f338813 100644
--- a/drivers/gpu/drm/bridge/ti-tfp410.c
+++ b/drivers/gpu/drm/bridge/ti-tfp410.c
@@ -20,7 +20,7 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #define HOTPLUG_DEBOUNCE_MS		1100
 
diff --git a/drivers/gpu/drm/cirrus/cirrus_drv.c b/drivers/gpu/drm/cirrus/cirrus_drv.c
index db40b77c7f7c..8ec880f3a322 100644
--- a/drivers/gpu/drm/cirrus/cirrus_drv.c
+++ b/drivers/gpu/drm/cirrus/cirrus_drv.c
@@ -12,6 +12,7 @@
 #include <linux/console.h>
 #include <drm/drmP.h>
 #include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "cirrus_drv.h"
 
diff --git a/drivers/gpu/drm/cirrus/cirrus_fbdev.c b/drivers/gpu/drm/cirrus/cirrus_fbdev.c
index 4dd499c7d1ba..39df62acac69 100644
--- a/drivers/gpu/drm/cirrus/cirrus_fbdev.c
+++ b/drivers/gpu/drm/cirrus/cirrus_fbdev.c
@@ -10,6 +10,7 @@
  */
 #include <linux/module.h>
 #include <drm/drmP.h>
+#include <drm/drm_util.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_crtc_helper.h>
 
@@ -256,6 +257,8 @@ static int cirrus_fbdev_destroy(struct drm_device *dev,
 {
 	struct drm_framebuffer *gfb = gfbdev->gfb;
 
+	drm_helper_force_disable_all(dev);
+
 	drm_fb_helper_unregister_fbi(&gfbdev->helper);
 
 	vfree(gfbdev->sysram);
diff --git a/drivers/gpu/drm/cirrus/cirrus_mode.c b/drivers/gpu/drm/cirrus/cirrus_mode.c
index ed7dcf212a34..7f9bc32af685 100644
--- a/drivers/gpu/drm/cirrus/cirrus_mode.c
+++ b/drivers/gpu/drm/cirrus/cirrus_mode.c
@@ -17,6 +17,7 @@
 #include <drm/drmP.h>
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include <video/cirrus.h>
 
@@ -359,10 +360,70 @@ static const struct drm_crtc_helper_funcs cirrus_helper_funcs = {
 };
 
 /* CRTC setup */
+static const uint32_t cirrus_formats_16[] = {
+	DRM_FORMAT_RGB565,
+};
+
+static const uint32_t cirrus_formats_24[] = {
+	DRM_FORMAT_RGB888,
+	DRM_FORMAT_RGB565,
+};
+
+static const uint32_t cirrus_formats_32[] = {
+	DRM_FORMAT_XRGB8888,
+	DRM_FORMAT_ARGB8888,
+	DRM_FORMAT_RGB888,
+	DRM_FORMAT_RGB565,
+};
+
+static struct drm_plane *cirrus_primary_plane(struct drm_device *dev)
+{
+	const uint32_t *formats;
+	uint32_t nformats;
+	struct drm_plane *primary;
+	int ret;
+
+	switch (cirrus_bpp) {
+	case 16:
+		formats = cirrus_formats_16;
+		nformats = ARRAY_SIZE(cirrus_formats_16);
+		break;
+	case 24:
+		formats = cirrus_formats_24;
+		nformats = ARRAY_SIZE(cirrus_formats_24);
+		break;
+	case 32:
+		formats = cirrus_formats_32;
+		nformats = ARRAY_SIZE(cirrus_formats_32);
+		break;
+	default:
+		return NULL;
+	}
+
+	primary = kzalloc(sizeof(*primary), GFP_KERNEL);
+	if (primary == NULL) {
+		DRM_DEBUG_KMS("Failed to allocate primary plane\n");
+		return NULL;
+	}
+
+	ret = drm_universal_plane_init(dev, primary, 0,
+				       &drm_primary_helper_funcs,
+				       formats, nformats,
+				       NULL,
+				       DRM_PLANE_TYPE_PRIMARY, NULL);
+	if (ret) {
+		kfree(primary);
+		primary = NULL;
+	}
+
+	return primary;
+}
+
 static void cirrus_crtc_init(struct drm_device *dev)
 {
 	struct cirrus_device *cdev = dev->dev_private;
 	struct cirrus_crtc *cirrus_crtc;
+	struct drm_plane *primary;
 
 	cirrus_crtc = kzalloc(sizeof(struct cirrus_crtc) +
 			      (CIRRUSFB_CONN_LIMIT * sizeof(struct drm_connector *)),
@@ -371,7 +432,15 @@ static void cirrus_crtc_init(struct drm_device *dev)
 	if (cirrus_crtc == NULL)
 		return;
 
-	drm_crtc_init(dev, &cirrus_crtc->base, &cirrus_crtc_funcs);
+	primary = cirrus_primary_plane(dev);
+	if (primary == NULL) {
+		kfree(cirrus_crtc);
+		return;
+	}
+
+	drm_crtc_init_with_planes(dev, &cirrus_crtc->base,
+				  primary, NULL,
+				  &cirrus_crtc_funcs, NULL);
 
 	drm_mode_crtc_set_gamma_size(&cirrus_crtc->base, CIRRUS_LUT_SIZE);
 	cdev->mode_info.crtc = cirrus_crtc;
diff --git a/drivers/gpu/drm/drm_agpsupport.c b/drivers/gpu/drm/drm_agpsupport.c
index 737f02885c28..40fba1c04dfc 100644
--- a/drivers/gpu/drm/drm_agpsupport.c
+++ b/drivers/gpu/drm/drm_agpsupport.c
@@ -348,7 +348,7 @@ int drm_agp_bind_ioctl(struct drm_device *dev, void *data,
  * \return zero on success or a negative number on failure.
  *
  * Verifies the AGP device is present and has been acquired and looks up the
- * AGP memory entry. If the memory it's currently bound, unbind it via
+ * AGP memory entry. If the memory is currently bound, unbind it via
  * unbind_agp(). Frees it via free_agp() as well as the entry itself
  * and unlinks from the doubly linked list it's inserted in.
  */
diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index 48ec378fb27e..5eb40130fafb 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -698,6 +698,7 @@ static void drm_atomic_plane_print_state(struct drm_printer *p,
 
 /**
  * drm_atomic_private_obj_init - initialize private object
+ * @dev: DRM device this object will be attached to
  * @obj: private object
  * @state: initial private object state
  * @funcs: pointer to the struct of function pointers that identify the object
@@ -707,14 +708,18 @@ static void drm_atomic_plane_print_state(struct drm_printer *p,
  * driver private object that needs its own atomic state.
  */
 void
-drm_atomic_private_obj_init(struct drm_private_obj *obj,
+drm_atomic_private_obj_init(struct drm_device *dev,
+			    struct drm_private_obj *obj,
 			    struct drm_private_state *state,
 			    const struct drm_private_state_funcs *funcs)
 {
 	memset(obj, 0, sizeof(*obj));
 
+	drm_modeset_lock_init(&obj->lock);
+
 	obj->state = state;
 	obj->funcs = funcs;
+	list_add_tail(&obj->head, &dev->mode_config.privobj_list);
 }
 EXPORT_SYMBOL(drm_atomic_private_obj_init);
 
@@ -727,7 +732,9 @@ EXPORT_SYMBOL(drm_atomic_private_obj_init);
 void
 drm_atomic_private_obj_fini(struct drm_private_obj *obj)
 {
+	list_del(&obj->head);
 	obj->funcs->atomic_destroy_state(obj, obj->state);
+	drm_modeset_lock_fini(&obj->lock);
 }
 EXPORT_SYMBOL(drm_atomic_private_obj_fini);
 
@@ -737,8 +744,8 @@ EXPORT_SYMBOL(drm_atomic_private_obj_fini);
  * @obj: private object to get the state for
  *
  * This function returns the private object state for the given private object,
- * allocating the state if needed. It does not grab any locks as the caller is
- * expected to care of any required locking.
+ * allocating the state if needed. It will also grab the relevant private
+ * object lock to make sure that the state is consistent.
  *
  * RETURNS:
  *
@@ -748,7 +755,7 @@ struct drm_private_state *
 drm_atomic_get_private_obj_state(struct drm_atomic_state *state,
 				 struct drm_private_obj *obj)
 {
-	int index, num_objs, i;
+	int index, num_objs, i, ret;
 	size_t size;
 	struct __drm_private_objs_state *arr;
 	struct drm_private_state *obj_state;
@@ -757,6 +764,10 @@ drm_atomic_get_private_obj_state(struct drm_atomic_state *state,
 		if (obj == state->private_objs[i].ptr)
 			return state->private_objs[i].state;
 
+	ret = drm_modeset_lock(&obj->lock, state->acquire_ctx);
+	if (ret)
+		return ERR_PTR(ret);
+
 	num_objs = state->num_private_objs + 1;
 	size = sizeof(*state->private_objs) * num_objs;
 	arr = krealloc(state->private_objs, size, GFP_KERNEL);
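
For illustration, a minimal sketch of a driver-side call site updated for the new
drm_atomic_private_obj_init() signature; all foo_* names are placeholders, not
taken from any in-tree driver. Note that with this series,
drm_atomic_get_private_obj_state() acquires the per-object lock through the
state's acquire context, so the caller no longer provides its own locking.

#include <linux/slab.h>
#include <drm/drm_atomic.h>

struct foo_bus_state {
	struct drm_private_state base;
	int active_links;
};

static struct drm_private_state *
foo_bus_duplicate_state(struct drm_private_obj *obj)
{
	struct foo_bus_state *state;

	/* base is the first member, so duplicating from obj->state
	 * copies the whole containing structure. */
	state = kmemdup(obj->state, sizeof(*state), GFP_KERNEL);
	if (!state)
		return NULL;

	__drm_atomic_helper_private_obj_duplicate_state(obj, &state->base);
	return &state->base;
}

static void foo_bus_destroy_state(struct drm_private_obj *obj,
				  struct drm_private_state *state)
{
	kfree(container_of(state, struct foo_bus_state, base));
}

static const struct drm_private_state_funcs foo_bus_state_funcs = {
	.atomic_duplicate_state = foo_bus_duplicate_state,
	.atomic_destroy_state = foo_bus_destroy_state,
};

static int foo_bus_init(struct drm_device *dev, struct drm_private_obj *obj)
{
	struct foo_bus_state *state = kzalloc(sizeof(*state), GFP_KERNEL);

	if (!state)
		return -ENOMEM;

	/* The device is now passed first, so the object lands on
	 * dev->mode_config.privobj_list and gets its own modeset lock. */
	drm_atomic_private_obj_init(dev, obj, &state->base,
				    &foo_bus_state_funcs);
	return 0;
}
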
diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index f4290f6b0c38..540a77a2ade9 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -29,7 +29,6 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_uapi.h>
 #include <drm/drm_plane_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_writeback.h>
 #include <drm/drm_damage_helper.h>
@@ -331,10 +330,17 @@ update_connector_routing(struct drm_atomic_state *state,
 	 * Since the connector can be unregistered at any point during an
 	 * atomic check or commit, this is racy. But that's OK: all we care
 	 * about is ensuring that userspace can't do anything but shut off the
-	 * display on a connector that was destroyed after its been notified,
+	 * display on a connector that was destroyed after it's been notified,
 	 * not before.
+	 *
+	 * Additionally, we also want to ignore connector registration when
+	 * we're trying to restore an atomic state during system resume since
+	 * there's a chance the connector may have been destroyed during the
+	 * process, but it's better to ignore that than cause
+	 * drm_atomic_helper_resume() to fail.
 	 */
-	if (drm_connector_is_unregistered(connector) && crtc_state->active) {
+	if (!state->duplicated && drm_connector_is_unregistered(connector) &&
+	    crtc_state->active) {
 		DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] is not registered\n",
 				 connector->base.id, connector->name);
 		return -EINVAL;
@@ -686,7 +692,7 @@ drm_atomic_helper_check_modeset(struct drm_device *dev,
 
 	/*
 	 * After all the routing has been prepared we need to add in any
-	 * connector which is itself unchanged, but who's crtc changes it's
+	 * connector which is itself unchanged, but whose crtc changes its
 	 * configuration. This must be done before calling mode_fixup in case a
 	 * crtc only changed its mode but has the same set of connectors.
 	 */
@@ -1680,7 +1686,7 @@ EXPORT_SYMBOL(drm_atomic_helper_async_commit);
  * drm_atomic_helper_setup_commit() and related functions.
  *
  * Committing the actual hardware state is done through the
- * &drm_mode_config_helper_funcs.atomic_commit_tail callback, or it's default
+ * &drm_mode_config_helper_funcs.atomic_commit_tail callback, or its default
  * implementation drm_atomic_helper_commit_tail().
  *
  * RETURNS:
@@ -1903,7 +1909,7 @@ crtc_or_fake_commit(struct drm_atomic_state *state, struct drm_crtc *crtc)
  * functions. drm_atomic_helper_wait_for_dependencies() must be called before
  * actually committing the hardware state, and for nonblocking commits this call
  * must be placed in the async worker. See also drm_atomic_helper_swap_state()
- * and it's stall parameter, for when a driver's commit hooks look at the
+ * and its stall parameter, for when a driver's commit hooks look at the
  * &drm_crtc.state, &drm_plane.state or &drm_connector.state pointer directly.
  *
  * Completion of the hardware commit step must be signalled using
@@ -3190,6 +3196,7 @@ drm_atomic_helper_duplicate_state(struct drm_device *dev,
 		return ERR_PTR(-ENOMEM);
 
 	state->acquire_ctx = ctx;
+	state->duplicated = true;
 
 	drm_for_each_crtc(crtc, dev) {
 		struct drm_crtc_state *crtc_state;
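
The new state->duplicated flag exists for the suspend/resume path. A hedged
sketch of the pattern it enables, with foo_* names as placeholders:

#include <linux/device.h>
#include <linux/err.h>
#include <drm/drm_atomic_helper.h>

struct foo_device {
	struct drm_device *drm;
	struct drm_atomic_state *suspend_state;
};

static int foo_pm_suspend(struct device *dev)
{
	struct foo_device *foo = dev_get_drvdata(dev);

	/* drm_atomic_helper_suspend() duplicates the current state,
	 * which is now tagged with state->duplicated = true. */
	foo->suspend_state = drm_atomic_helper_suspend(foo->drm);
	return PTR_ERR_OR_ZERO(foo->suspend_state);
}

static int foo_pm_resume(struct device *dev)
{
	struct foo_device *foo = dev_get_drvdata(dev);

	/* Because the state is flagged as duplicated, committing it no
	 * longer fails if a connector (e.g. an MST port) was
	 * unregistered while the system was asleep. */
	return drm_atomic_helper_resume(foo->drm, foo->suspend_state);
}
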
diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
index 9a1f41adfc67..0aabd401d3ca 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -44,8 +44,8 @@
  * DOC: overview
  *
  * This file contains the marshalling and demarshalling glue for the atomic UAPI
- * in all it's form: The monster ATOMIC IOCTL itself, code for GET_PROPERTY and
- * SET_PROPERTY IOCTls. Plus interface functions for compatibility helpers and
+ * in all its forms: The monster ATOMIC IOCTL itself, code for GET_PROPERTY and
+ * SET_PROPERTY IOCTLs. Plus interface functions for compatibility helpers and
  * drivers which have special needs to construct their own atomic updates, e.g.
 * for load detect or similar.
  */
diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c
index ba7025041e46..138b2711d389 100644
--- a/drivers/gpu/drm/drm_bridge.c
+++ b/drivers/gpu/drm/drm_bridge.c
@@ -294,8 +294,8 @@ EXPORT_SYMBOL(drm_bridge_post_disable);
  * Note: the bridge passed should be the one closest to the encoder
  */
 void drm_bridge_mode_set(struct drm_bridge *bridge,
-			struct drm_display_mode *mode,
-			struct drm_display_mode *adjusted_mode)
+			 const struct drm_display_mode *mode,
+			 const struct drm_display_mode *adjusted_mode)
 {
 	if (!bridge)
 		return;
diff --git a/drivers/gpu/drm/drm_bufs.c b/drivers/gpu/drm/drm_bufs.c
index d7d10cabb9bb..e407adb033e7 100644
--- a/drivers/gpu/drm/drm_bufs.c
+++ b/drivers/gpu/drm/drm_bufs.c
@@ -377,6 +377,17 @@ int drm_legacy_addmap(struct drm_device *dev, resource_size_t offset,
 }
 EXPORT_SYMBOL(drm_legacy_addmap);
 
+struct drm_local_map *drm_legacy_findmap(struct drm_device *dev,
+					 unsigned int token)
+{
+	struct drm_map_list *_entry;
+	list_for_each_entry(_entry, &dev->maplist, head)
+		if (_entry->user_token == token)
+			return _entry->map;
+	return NULL;
+}
+EXPORT_SYMBOL(drm_legacy_findmap);
+
 /**
  * Ioctl to specify a range of memory that is available for mapping by a
  * non-root process.
@@ -483,7 +494,7 @@ int drm_legacy_getmap_ioctl(struct drm_device *dev, void *data,
  * isn't in use.
  *
  * Searches the map on drm_device::maplist, removes it from the list, see if
- * its being used, and free any associate resource (such as MTRR's) if it's not
+ * it's being used, and free any associated resource (such as MTRR's) if it's not
 * in use.
  *
  * \sa drm_legacy_addmap
@@ -610,7 +621,7 @@ int drm_legacy_rmmap_ioctl(struct drm_device *dev, void *data,
 		}
 	}
 
-	/* List has wrapped around to the head pointer, or its empty we didn't
+	/* List has wrapped around to the head pointer, or it's empty and we
+	 * didn't
 	 * find anything.
 	 */
 	if (list_empty(&dev->maplist) || !map) {
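
drm_legacy_findmap() is exported without kerneldoc; a minimal sketch of the
intended lookup, assuming the token comes from userspace the same way it does
for drm_legacy_getmap_ioctl() (foo_* is a placeholder):

#include <drm/drm_legacy.h>

static int foo_validate_token(struct drm_device *dev, unsigned int token)
{
	/* Walks dev->maplist and returns the map whose user_token
	 * matches, or NULL if the token is unknown. */
	struct drm_local_map *map = drm_legacy_findmap(dev, token);

	return map ? 0 : -EINVAL;
}
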
diff --git a/drivers/gpu/drm/drm_color_mgmt.c b/drivers/gpu/drm/drm_color_mgmt.c
index 07dcf47daafe..d5d34d0c79c7 100644
--- a/drivers/gpu/drm/drm_color_mgmt.c
+++ b/drivers/gpu/drm/drm_color_mgmt.c
@@ -462,3 +462,46 @@ int drm_plane_create_color_properties(struct drm_plane *plane,
 	return 0;
 }
 EXPORT_SYMBOL(drm_plane_create_color_properties);
+
+/**
+ * drm_color_lut_check - check validity of lookup table
+ * @lut: property blob containing LUT to check
+ * @tests: bitmask of tests to run
+ *
+ * Helper to check whether a userspace-provided lookup table is valid and
+ * satisfies hardware requirements.  Drivers pass a bitmask indicating which of
+ * the tests in &drm_color_lut_tests should be performed.
+ *
+ * Returns 0 on success, -EINVAL on failure.
+ */
+int drm_color_lut_check(const struct drm_property_blob *lut, u32 tests)
+{
+	const struct drm_color_lut *entry;
+	int i;
+
+	if (!lut || !tests)
+		return 0;
+
+	entry = lut->data;
+	for (i = 0; i < drm_color_lut_size(lut); i++) {
+		if (tests & DRM_COLOR_LUT_EQUAL_CHANNELS) {
+			if (entry[i].red != entry[i].blue ||
+			    entry[i].red != entry[i].green) {
+				DRM_DEBUG_KMS("All LUT entries must have equal r/g/b\n");
+				return -EINVAL;
+			}
+		}
+
+		if (i > 0 && tests & DRM_COLOR_LUT_NON_DECREASING) {
+			if (entry[i].red < entry[i - 1].red ||
+			    entry[i].green < entry[i - 1].green ||
+			    entry[i].blue < entry[i - 1].blue) {
+				DRM_DEBUG_KMS("LUT entries must never decrease.\n");
+				return -EINVAL;
+			}
+		}
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(drm_color_lut_check);
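
As a usage sketch, a driver whose hardware needs a monotonic, single-channel
ramp might wire the new check into its atomic_check; the constraints chosen
here are illustrative, and foo_* is a placeholder:

#include <drm/drm_color_mgmt.h>

static int foo_crtc_atomic_check(struct drm_crtc *crtc,
				 struct drm_crtc_state *state)
{
	/* NULL blobs pass trivially, so this is safe to call even when
	 * userspace hasn't set a LUT. */
	return drm_color_lut_check(state->gamma_lut,
				   DRM_COLOR_LUT_EQUAL_CHANNELS |
				   DRM_COLOR_LUT_NON_DECREASING);
}
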
diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index da8ae80c2750..dd40eff0911c 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -1066,7 +1066,7 @@ EXPORT_SYMBOL(drm_mode_create_dvi_i_properties);
  *
  * content type (HDMI specific):
  *	Indicates content type setting to be used in HDMI infoframes to indicate
- *	content type for the external device, so that it adjusts it's display
+ *	content type for the external device, so that it adjusts its display
  *	settings accordingly.
  *
  *	The value of this property can be one of the following:
@@ -1138,7 +1138,71 @@ void drm_hdmi_avi_infoframe_content_type(struct hdmi_avi_infoframe *frame,
 EXPORT_SYMBOL(drm_hdmi_avi_infoframe_content_type);
 
 /**
- * drm_create_tv_properties - create TV specific connector properties
+ * drm_connector_attach_tv_margin_properties - attach TV connector margin properties
+ * @connector: DRM connector
+ *
+ * Called by a driver when it needs to attach TV margin props to a connector.
+ * Typically used on SDTV and HDMI connectors.
+ */
+void drm_connector_attach_tv_margin_properties(struct drm_connector *connector)
+{
+	struct drm_device *dev = connector->dev;
+
+	drm_object_attach_property(&connector->base,
+				   dev->mode_config.tv_left_margin_property,
+				   0);
+	drm_object_attach_property(&connector->base,
+				   dev->mode_config.tv_right_margin_property,
+				   0);
+	drm_object_attach_property(&connector->base,
+				   dev->mode_config.tv_top_margin_property,
+				   0);
+	drm_object_attach_property(&connector->base,
+				   dev->mode_config.tv_bottom_margin_property,
+				   0);
+}
+EXPORT_SYMBOL(drm_connector_attach_tv_margin_properties);
+
+/**
+ * drm_mode_create_tv_margin_properties - create TV connector margin properties
+ * @dev: DRM device
+ *
+ * Called by a driver's HDMI connector initialization routine, this function
+ * creates the TV margin properties for a given device. No need to call this
+ * function for an SDTV connector; it's already called from
+ * drm_mode_create_tv_properties().
+ */
+int drm_mode_create_tv_margin_properties(struct drm_device *dev)
+{
+	if (dev->mode_config.tv_left_margin_property)
+		return 0;
+
+	dev->mode_config.tv_left_margin_property =
+		drm_property_create_range(dev, 0, "left margin", 0, 100);
+	if (!dev->mode_config.tv_left_margin_property)
+		return -ENOMEM;
+
+	dev->mode_config.tv_right_margin_property =
+		drm_property_create_range(dev, 0, "right margin", 0, 100);
+	if (!dev->mode_config.tv_right_margin_property)
+		return -ENOMEM;
+
+	dev->mode_config.tv_top_margin_property =
+		drm_property_create_range(dev, 0, "top margin", 0, 100);
+	if (!dev->mode_config.tv_top_margin_property)
+		return -ENOMEM;
+
+	dev->mode_config.tv_bottom_margin_property =
+		drm_property_create_range(dev, 0, "bottom margin", 0, 100);
+	if (!dev->mode_config.tv_bottom_margin_property)
+		return -ENOMEM;
+
+	return 0;
+}
+EXPORT_SYMBOL(drm_mode_create_tv_margin_properties);
+
+/**
+ * drm_mode_create_tv_properties - create TV specific connector properties
  * @dev: DRM device
  * @num_modes: number of different TV formats (modes) supported
  * @modes: array of pointers to strings containing name of each format
@@ -1183,24 +1247,7 @@ int drm_mode_create_tv_properties(struct drm_device *dev,
 	/*
 	 * Other, TV specific properties: margins & TV modes.
 	 */
-	dev->mode_config.tv_left_margin_property =
-		drm_property_create_range(dev, 0, "left margin", 0, 100);
-	if (!dev->mode_config.tv_left_margin_property)
-		goto nomem;
-
-	dev->mode_config.tv_right_margin_property =
-		drm_property_create_range(dev, 0, "right margin", 0, 100);
-	if (!dev->mode_config.tv_right_margin_property)
-		goto nomem;
-
-	dev->mode_config.tv_top_margin_property =
-		drm_property_create_range(dev, 0, "top margin", 0, 100);
-	if (!dev->mode_config.tv_top_margin_property)
-		goto nomem;
-
-	dev->mode_config.tv_bottom_margin_property =
-		drm_property_create_range(dev, 0, "bottom margin", 0, 100);
-	if (!dev->mode_config.tv_bottom_margin_property)
+	if (drm_mode_create_tv_margin_properties(dev))
 		goto nomem;
 
 	dev->mode_config.tv_mode_property =
@@ -1320,7 +1367,7 @@ EXPORT_SYMBOL(drm_mode_create_scaling_mode_property);
  *
  *	Absence of the property should indicate absence of support.
  *
- * "vrr_enabled":
+ * "VRR_ENABLED":
  *	Default &drm_crtc boolean property that notifies the driver that the
  *	content on the CRTC is suitable for variable refresh rate presentation.
  *	The driver will take this property as a hint to enable variable
@@ -2077,7 +2124,7 @@ EXPORT_SYMBOL(drm_mode_get_tile_group);
  * identifier for the tile group.
  *
  * RETURNS:
- * new tile group or error.
+ * new tile group or NULL.
  */
 struct drm_tile_group *drm_mode_create_tile_group(struct drm_device *dev,
 						  char topology[8])
@@ -2087,7 +2134,7 @@ struct drm_tile_group *drm_mode_create_tile_group(struct drm_device *dev,
 
 	tg = kzalloc(sizeof(*tg), GFP_KERNEL);
 	if (!tg)
-		return ERR_PTR(-ENOMEM);
+		return NULL;
 
 	kref_init(&tg->refcount);
 	memcpy(tg->group_data, topology, 8);
@@ -2099,7 +2146,7 @@ struct drm_tile_group *drm_mode_create_tile_group(struct drm_device *dev,
 		tg->id = ret;
 	} else {
 		kfree(tg);
-		tg = ERR_PTR(ret);
+		tg = NULL;
 	}
 
 	mutex_unlock(&dev->mode_config.idr_mutex);
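
Taken together, the split lets an HDMI driver pick up just the margin
properties without the rest of the SDTV set; a sketch with a placeholder
connector init:

static int foo_hdmi_connector_init(struct drm_device *dev,
				   struct drm_connector *connector)
{
	int ret;

	/* Idempotent: returns 0 immediately if the properties already
	 * exist (e.g. created earlier for an SDTV connector). */
	ret = drm_mode_create_tv_margin_properties(dev);
	if (ret)
		return ret;

	/* Attaches all four margin properties with a default of 0. */
	drm_connector_attach_tv_margin_properties(connector);
	return 0;
}
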
diff --git a/drivers/gpu/drm/drm_context.c b/drivers/gpu/drm/drm_context.c
index 506663c69b0a..6e8e1a9fcae3 100644
--- a/drivers/gpu/drm/drm_context.c
+++ b/drivers/gpu/drm/drm_context.c
@@ -361,23 +361,26 @@ int drm_legacy_addctx(struct drm_device *dev, void *data,
 {
 	struct drm_ctx_list *ctx_entry;
 	struct drm_ctx *ctx = data;
+	int tmp_handle;
 
 	if (!drm_core_check_feature(dev, DRIVER_KMS_LEGACY_CONTEXT) &&
 	    !drm_core_check_feature(dev, DRIVER_LEGACY))
 		return -EOPNOTSUPP;
 
-	ctx->handle = drm_legacy_ctxbitmap_next(dev);
-	if (ctx->handle == DRM_KERNEL_CONTEXT) {
+	tmp_handle = drm_legacy_ctxbitmap_next(dev);
+	if (tmp_handle == DRM_KERNEL_CONTEXT) {
 		/* Skip kernel's context and get a new one. */
-		ctx->handle = drm_legacy_ctxbitmap_next(dev);
+		tmp_handle = drm_legacy_ctxbitmap_next(dev);
 	}
-	DRM_DEBUG("%d\n", ctx->handle);
-	if (ctx->handle < 0) {
+	DRM_DEBUG("%d\n", tmp_handle);
+	if (tmp_handle < 0) {
 		DRM_DEBUG("Not enough free contexts.\n");
 		/* Should this return -EBUSY instead? */
-		return -ENOMEM;
+		return tmp_handle;
 	}
 
+	ctx->handle = tmp_handle;
+
 	ctx_entry = kmalloc(sizeof(*ctx_entry), GFP_KERNEL);
 	if (!ctx_entry) {
 		DRM_DEBUG("out of memory\n");
diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
index 1593dd6cdfb7..7dabbaf033a1 100644
--- a/drivers/gpu/drm/drm_crtc.c
+++ b/drivers/gpu/drm/drm_crtc.c
@@ -93,15 +93,6 @@ struct drm_crtc *drm_crtc_from_index(struct drm_device *dev, int idx)
 }
 EXPORT_SYMBOL(drm_crtc_from_index);
 
-/**
- * drm_crtc_force_disable - Forcibly turn off a CRTC
- * @crtc: CRTC to turn off
- *
- * Note: This should only be used by non-atomic legacy drivers.
- *
- * Returns:
- * Zero on success, error code on failure.
- */
 int drm_crtc_force_disable(struct drm_crtc *crtc)
 {
 	struct drm_mode_set set = {
@@ -112,38 +103,6 @@ int drm_crtc_force_disable(struct drm_crtc *crtc)
 
 	return drm_mode_set_config_internal(&set);
 }
-EXPORT_SYMBOL(drm_crtc_force_disable);
-
-/**
- * drm_crtc_force_disable_all - Forcibly turn off all enabled CRTCs
- * @dev: DRM device whose CRTCs to turn off
- *
- * Drivers may want to call this on unload to ensure that all displays are
- * unlit and the GPU is in a consistent, low power state. Takes modeset locks.
- *
- * Note: This should only be used by non-atomic legacy drivers. For an atomic
- * version look at drm_atomic_helper_shutdown().
- *
- * Returns:
- * Zero on success, error code on failure.
- */
-int drm_crtc_force_disable_all(struct drm_device *dev)
-{
-	struct drm_crtc *crtc;
-	int ret = 0;
-
-	drm_modeset_lock_all(dev);
-	drm_for_each_crtc(crtc, dev)
-		if (crtc->enabled) {
-			ret = drm_crtc_force_disable(crtc);
-			if (ret)
-				goto out;
-		}
-out:
-	drm_modeset_unlock_all(dev);
-	return ret;
-}
-EXPORT_SYMBOL(drm_crtc_force_disable_all);
 
 static unsigned int drm_num_crtcs(struct drm_device *dev)
 {
diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c
index a3c81850e755..747661f63fbb 100644
--- a/drivers/gpu/drm/drm_crtc_helper.c
+++ b/drivers/gpu/drm/drm_crtc_helper.c
@@ -93,6 +93,8 @@ bool drm_helper_encoder_in_use(struct drm_encoder *encoder)
 	struct drm_connector_list_iter conn_iter;
 	struct drm_device *dev = encoder->dev;
 
+	WARN_ON(drm_drv_uses_atomic_modeset(dev));
+
 	/*
 	 * We can expect this mutex to be locked if we are not panicking.
 	 * Locking is currently fubar in the panic handler.
@@ -131,6 +133,8 @@ bool drm_helper_crtc_in_use(struct drm_crtc *crtc)
 	struct drm_encoder *encoder;
 	struct drm_device *dev = crtc->dev;
 
+	WARN_ON(drm_drv_uses_atomic_modeset(dev));
+
 	/*
 	 * We can expect this mutex to be locked if we are not panicking.
 	 * Locking is currently fubar in the panic handler.
@@ -212,8 +216,7 @@ static void __drm_helper_disable_unused_functions(struct drm_device *dev)
  */
 void drm_helper_disable_unused_functions(struct drm_device *dev)
 {
-	if (drm_core_check_feature(dev, DRIVER_ATOMIC))
-		DRM_ERROR("Called for atomic driver, this is not what you want.\n");
+	WARN_ON(drm_drv_uses_atomic_modeset(dev));
 
 	drm_modeset_lock_all(dev);
 	__drm_helper_disable_unused_functions(dev);
@@ -281,6 +284,8 @@ bool drm_crtc_helper_set_mode(struct drm_crtc *crtc,
 	struct drm_encoder *encoder;
 	bool ret = true;
 
+	WARN_ON(drm_drv_uses_atomic_modeset(dev));
+
 	drm_warn_on_modeset_not_all_locked(dev);
 
 	saved_enabled = crtc->enabled;
@@ -386,9 +391,8 @@ bool drm_crtc_helper_set_mode(struct drm_crtc *crtc,
 		if (!encoder_funcs)
 			continue;
 
-		DRM_DEBUG_KMS("[ENCODER:%d:%s] set [MODE:%d:%s]\n",
-			encoder->base.id, encoder->name,
-			mode->base.id, mode->name);
+		DRM_DEBUG_KMS("[ENCODER:%d:%s] set [MODE:%s]\n",
+			encoder->base.id, encoder->name, mode->name);
 		if (encoder_funcs->mode_set)
 			encoder_funcs->mode_set(encoder, mode, adjusted_mode);
 
@@ -540,6 +544,9 @@ int drm_crtc_helper_set_config(struct drm_mode_set *set,
 
 	crtc_funcs = set->crtc->helper_private;
 
+	dev = set->crtc->dev;
+	WARN_ON(drm_drv_uses_atomic_modeset(dev));
+
 	if (!set->mode)
 		set->fb = NULL;
 
@@ -555,8 +562,6 @@ int drm_crtc_helper_set_config(struct drm_mode_set *set,
 		return 0;
 	}
 
-	dev = set->crtc->dev;
-
 	drm_warn_on_modeset_not_all_locked(dev);
 
 	/*
@@ -875,6 +880,8 @@ int drm_helper_connector_dpms(struct drm_connector *connector, int mode)
 	struct drm_crtc *crtc = encoder ? encoder->crtc : NULL;
 	int old_dpms, encoder_dpms = DRM_MODE_DPMS_OFF;
 
+	WARN_ON(drm_drv_uses_atomic_modeset(connector->dev));
+
 	if (mode == connector->dpms)
 		return 0;
 
@@ -946,6 +953,8 @@ void drm_helper_resume_force_mode(struct drm_device *dev)
 	int encoder_dpms;
 	bool ret;
 
+	WARN_ON(drm_drv_uses_atomic_modeset(dev));
+
 	drm_modeset_lock_all(dev);
 	drm_for_each_crtc(crtc, dev) {
 
@@ -984,3 +993,38 @@ void drm_helper_resume_force_mode(struct drm_device *dev)
 	drm_modeset_unlock_all(dev);
 }
 EXPORT_SYMBOL(drm_helper_resume_force_mode);
+
+/**
+ * drm_helper_force_disable_all - Forcibly turn off all enabled CRTCs
+ * @dev: DRM device whose CRTCs to turn off
+ *
+ * Drivers may want to call this on unload to ensure that all displays are
+ * unlit and the GPU is in a consistent, low power state. Takes modeset locks.
+ *
+ * Note: This should only be used by non-atomic legacy drivers. For an atomic
+ * version look at drm_atomic_helper_shutdown().
+ *
+ * Returns:
+ * Zero on success, error code on failure.
+ */
+int drm_helper_force_disable_all(struct drm_device *dev)
+{
+	struct drm_crtc *crtc;
+	int ret = 0;
+
+	drm_modeset_lock_all(dev);
+	drm_for_each_crtc(crtc, dev)
+		if (crtc->enabled) {
+			struct drm_mode_set set = {
+				.crtc = crtc,
+			};
+
+			ret = drm_mode_set_config_internal(&set);
+			if (ret)
+				goto out;
+		}
+out:
+	drm_modeset_unlock_all(dev);
+	return ret;
+}
+EXPORT_SYMBOL(drm_helper_force_disable_all);
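
A sketch of the intended call site in a non-atomic driver's unload path;
foo_fbdev_fini() is a stand-in for the driver's own teardown:

#include <drm/drm_crtc_helper.h>

static void foo_unload(struct drm_device *dev)
{
	/* Turn every enabled CRTC off before releasing the hardware;
	 * atomic drivers use drm_atomic_helper_shutdown() instead. */
	drm_helper_force_disable_all(dev);

	foo_fbdev_fini(dev);
}
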
diff --git a/drivers/gpu/drm/drm_crtc_internal.h b/drivers/gpu/drm/drm_crtc_internal.h
index 86893448f486..216f2a9ee3d4 100644
--- a/drivers/gpu/drm/drm_crtc_internal.h
+++ b/drivers/gpu/drm/drm_crtc_internal.h
@@ -50,6 +50,7 @@ int drm_crtc_check_viewport(const struct drm_crtc *crtc,
 			    const struct drm_framebuffer *fb);
 int drm_crtc_register_all(struct drm_device *dev);
 void drm_crtc_unregister_all(struct drm_device *dev);
+int drm_crtc_force_disable(struct drm_crtc *crtc);
 
 struct dma_fence *drm_crtc_create_fence(struct drm_crtc *crtc);
 
diff --git a/drivers/gpu/drm/drm_damage_helper.c b/drivers/gpu/drm/drm_damage_helper.c
index 31032407254d..ee67c96841fa 100644
--- a/drivers/gpu/drm/drm_damage_helper.c
+++ b/drivers/gpu/drm/drm_damage_helper.c
@@ -32,6 +32,7 @@
 
 #include <drm/drm_atomic.h>
 #include <drm/drm_damage_helper.h>
+#include <drm/drm_device.h>
 
 /**
  * DOC: overview
@@ -333,3 +334,44 @@ drm_atomic_helper_damage_iter_next(struct drm_atomic_helper_damage_iter *iter,
 	return ret;
 }
 EXPORT_SYMBOL(drm_atomic_helper_damage_iter_next);
+
+/**
+ * drm_atomic_helper_damage_merged - Merged plane damage
+ * @old_state: Old plane state for validation.
+ * @state: Plane state from which to iterate the damage clips.
+ * @rect: Returns the merged damage rectangle
+ *
+ * This function merges any valid plane damage clips into one rectangle and
+ * returns it in @rect.
+ *
+ * For details see: drm_atomic_helper_damage_iter_init() and
+ * drm_atomic_helper_damage_iter_next().
+ *
+ * Returns:
+ * True if there is valid plane damage, otherwise false.
+ */
+bool drm_atomic_helper_damage_merged(const struct drm_plane_state *old_state,
+				     struct drm_plane_state *state,
+				     struct drm_rect *rect)
+{
+	struct drm_atomic_helper_damage_iter iter;
+	struct drm_rect clip;
+	bool valid = false;
+
+	rect->x1 = INT_MAX;
+	rect->y1 = INT_MAX;
+	rect->x2 = 0;
+	rect->y2 = 0;
+
+	drm_atomic_helper_damage_iter_init(&iter, old_state, state);
+	drm_atomic_for_each_plane_damage(&iter, &clip) {
+		rect->x1 = min(rect->x1, clip.x1);
+		rect->y1 = min(rect->y1, clip.y1);
+		rect->x2 = max(rect->x2, clip.x2);
+		rect->y2 = max(rect->y2, clip.y2);
+		valid = true;
+	}
+
+	return valid;
+}
+EXPORT_SYMBOL(drm_atomic_helper_damage_merged);
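
For drivers that can only upload one rectangle per flush, the merged helper
collapses all clips into a single bound. A sketch, with foo_hw_flush()
standing in for the device-specific damage handling:

#include <drm/drm_damage_helper.h>

static void foo_plane_atomic_update(struct drm_plane *plane,
				    struct drm_plane_state *old_state)
{
	struct drm_plane_state *state = plane->state;
	struct drm_rect rect;

	/* rect bounds every valid damage clip; skip the flush entirely
	 * when there was no valid damage. */
	if (drm_atomic_helper_damage_merged(old_state, state, &rect))
		foo_hw_flush(plane, &rect);
}
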
diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
index 516e82d0ed50..54a6414c5d96 100644
--- a/drivers/gpu/drm/drm_dp_helper.c
+++ b/drivers/gpu/drm/drm_dp_helper.c
@@ -154,6 +154,7 @@ u8 drm_dp_link_rate_to_bw_code(int link_rate)
 	default:
 		WARN(1, "unknown DP link rate %d, using %x\n", link_rate,
 		     DP_LINK_BW_1_62);
+		/* fall through */
 	case 162000:
 		return DP_LINK_BW_1_62;
 	case 270000:
@@ -171,6 +172,7 @@ int drm_dp_bw_code_to_link_rate(u8 link_bw)
 	switch (link_bw) {
 	default:
 		WARN(1, "unknown DP link BW code %x, using 162000\n", link_bw);
+		/* fall through */
 	case DP_LINK_BW_1_62:
 		return 162000;
 	case DP_LINK_BW_2_7:
@@ -192,11 +194,11 @@ drm_dp_dump_access(const struct drm_dp_aux *aux,
 	const char *arrow = request == DP_AUX_NATIVE_READ ? "->" : "<-";
 
 	if (ret > 0)
-		drm_dbg(DRM_UT_DP, "%s: 0x%05x AUX %s (ret=%3d) %*ph\n",
-			aux->name, offset, arrow, ret, min(ret, 20), buffer);
+		DRM_DEBUG_DP("%s: 0x%05x AUX %s (ret=%3d) %*ph\n",
+			     aux->name, offset, arrow, ret, min(ret, 20), buffer);
 	else
-		drm_dbg(DRM_UT_DP, "%s: 0x%05x AUX %s (ret=%3d)\n",
-			aux->name, offset, arrow, ret);
+		DRM_DEBUG_DP("%s: 0x%05x AUX %s (ret=%3d)\n",
+			     aux->name, offset, arrow, ret);
 }
 
 /**
@@ -552,6 +554,7 @@ int drm_dp_downstream_max_bpc(const u8 dpcd[DP_RECEIVER_CAP_SIZE],
 		case DP_DS_16BPC:
 			return 16;
 		}
+		/* fall through */
 	default:
 		return 0;
 	}
@@ -884,7 +887,8 @@ static void drm_dp_i2c_msg_set_request(struct drm_dp_aux_msg *msg,
 {
 	msg->request = (i2c_msg->flags & I2C_M_RD) ?
 		DP_AUX_I2C_READ : DP_AUX_I2C_WRITE;
-	msg->request |= DP_AUX_I2C_MOT;
+	if (!(i2c_msg->flags & I2C_M_STOP))
+		msg->request |= DP_AUX_I2C_MOT;
 }
 
 /*
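
The I2C_M_STOP handling means an i2c client sitting behind the AUX channel can
now force a real STOP between messages instead of a repeated start. An
illustrative transfer against an aux->ddc adapter (which must advertise
I2C_FUNC_PROTOCOL_MANGLING for the flag to be accepted); foo_* is a placeholder:

#include <linux/i2c.h>

static int foo_read_reg(struct i2c_adapter *aux_adap, u16 addr,
			u8 reg, u8 *val)
{
	struct i2c_msg msgs[] = {
		{
			.addr = addr,
			.flags = I2C_M_STOP, /* clears the MOT bit */
			.len = 1,
			.buf = &reg,
		}, {
			.addr = addr,
			.flags = I2C_M_RD,
			.len = 1,
			.buf = val,
		},
	};

	return i2c_transfer(aux_adap, msgs, 2) == 2 ? 0 : -EIO;
}
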
@@ -1356,7 +1360,20 @@ int drm_dp_read_desc(struct drm_dp_aux *aux, struct drm_dp_desc *desc,
 EXPORT_SYMBOL(drm_dp_read_desc);
 
 /**
- * DRM DP Helpers for DSC
+ * drm_dp_dsc_sink_max_slice_count() - Get the max slice count
+ * supported by the DSC sink.
+ * @dsc_dpcd: DSC capabilities from DPCD
+ * @is_edp: true if it's eDP, false for DP
+ *
+ * Read the slice capabilities DPCD register from the DSC sink to get
+ * the maximum slice count supported. This is used to populate
+ * the DSC parameters in the &struct drm_dsc_config by the driver.
+ * The driver creates an infoframe from these parameters to populate
+ * &struct drm_dsc_pps_infoframe, which is sent to the sink as a DSC
+ * infoframe packed with the helper function drm_dsc_pps_infoframe_pack().
+ *
+ * Returns:
+ * Maximum slice count supported by the DSC sink, or 0 if invalid
  */
 u8 drm_dp_dsc_sink_max_slice_count(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE],
 				   bool is_edp)
@@ -1401,6 +1418,21 @@ u8 drm_dp_dsc_sink_max_slice_count(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE],
 }
 EXPORT_SYMBOL(drm_dp_dsc_sink_max_slice_count);
 
+/**
+ * drm_dp_dsc_sink_line_buf_depth() - Get the line buffer depth in bits
+ * @dsc_dpcd: DSC capabilities from DPCD
+ *
+ * Read the DSC DPCD register to parse the line buffer depth in bits, i.e. the
+ * number of bits of precision within the decoder line buffer supported by
+ * the DSC sink. This is used to populate the DSC parameters in the
+ * &struct drm_dsc_config by the driver.
+ * The driver creates an infoframe from these parameters to populate
+ * &struct drm_dsc_pps_infoframe, which is sent to the sink as a DSC
+ * infoframe packed with the helper function drm_dsc_pps_infoframe_pack().
+ *
+ * Returns:
+ * Line buffer depth supported by the DSC panel, or 0 if invalid
+ */
 u8 drm_dp_dsc_sink_line_buf_depth(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE])
 {
 	u8 line_buf_depth = dsc_dpcd[DP_DSC_LINE_BUF_BIT_DEPTH - DP_DSC_SUPPORT];
@@ -1430,6 +1462,23 @@ u8 drm_dp_dsc_sink_line_buf_depth(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE])
 }
 EXPORT_SYMBOL(drm_dp_dsc_sink_line_buf_depth);
 
+/**
+ * drm_dp_dsc_sink_supported_input_bpcs() - Get all the input bits per component
+ * values supported by the DSC sink.
+ * @dsc_dpcd: DSC capabilities from DPCD
+ * @dsc_bpc: An array to be filled by this helper with supported
+ *           input bpcs.
+ *
+ * Read the DSC DPCD from the sink device to parse the supported bits per
+ * component values. This is used to populate the DSC parameters
+ * in the &struct drm_dsc_config by the driver.
+ * The driver creates an infoframe from these parameters to populate
+ * &struct drm_dsc_pps_infoframe, which is sent to the sink as a DSC
+ * infoframe packed with the helper function drm_dsc_pps_infoframe_pack().
+ *
+ * Returns:
+ * Number of input BPC values parsed from the DPCD
+ */
 int drm_dp_dsc_sink_supported_input_bpcs(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE],
 					 u8 dsc_bpc[3])
 {
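
A sketch of how a source driver might consume these three helpers when filling
out a &struct drm_dsc_config; the field choices are illustrative and error
handling is elided:

#include <drm/drm_dp_helper.h>
#include <drm/drm_dsc.h>

static void foo_dsc_from_dpcd(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE],
			      struct drm_dsc_config *cfg)
{
	u8 bpcs[3];
	int num;

	cfg->slice_count = drm_dp_dsc_sink_max_slice_count(dsc_dpcd, false);
	cfg->line_buf_depth = drm_dp_dsc_sink_line_buf_depth(dsc_dpcd);

	/* Take the first reported bits-per-component value for
	 * simplicity; a real driver would match the reported values
	 * against its own capabilities. */
	num = drm_dp_dsc_sink_supported_input_bpcs(dsc_dpcd, bpcs);
	if (num > 0)
		cfg->bits_per_component = bpcs[0];
}
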
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index 529414556962..dc7ac0c60547 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -33,6 +33,7 @@
 #include <drm/drm_fixed.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_probe_helper.h>
 
 /**
  * DOC: dp mst helper
@@ -45,7 +46,7 @@ static bool dump_dp_payload_table(struct drm_dp_mst_topology_mgr *mgr,
 				  char *buf);
 static int test_calc_pbn_mode(void);
 
-static void drm_dp_put_port(struct drm_dp_mst_port *port);
+static void drm_dp_mst_topology_put_port(struct drm_dp_mst_port *port);
 
 static int drm_dp_dpcd_write_payload(struct drm_dp_mst_topology_mgr *mgr,
 				     int id,
@@ -66,6 +67,64 @@ static bool drm_dp_validate_guid(struct drm_dp_mst_topology_mgr *mgr,
 static int drm_dp_mst_register_i2c_bus(struct drm_dp_aux *aux);
 static void drm_dp_mst_unregister_i2c_bus(struct drm_dp_aux *aux);
 static void drm_dp_mst_kick_tx(struct drm_dp_mst_topology_mgr *mgr);
+
+#define DP_STR(x) [DP_ ## x] = #x
+
+static const char *drm_dp_mst_req_type_str(u8 req_type)
+{
+	static const char * const req_type_str[] = {
+		DP_STR(GET_MSG_TRANSACTION_VERSION),
+		DP_STR(LINK_ADDRESS),
+		DP_STR(CONNECTION_STATUS_NOTIFY),
+		DP_STR(ENUM_PATH_RESOURCES),
+		DP_STR(ALLOCATE_PAYLOAD),
+		DP_STR(QUERY_PAYLOAD),
+		DP_STR(RESOURCE_STATUS_NOTIFY),
+		DP_STR(CLEAR_PAYLOAD_ID_TABLE),
+		DP_STR(REMOTE_DPCD_READ),
+		DP_STR(REMOTE_DPCD_WRITE),
+		DP_STR(REMOTE_I2C_READ),
+		DP_STR(REMOTE_I2C_WRITE),
+		DP_STR(POWER_UP_PHY),
+		DP_STR(POWER_DOWN_PHY),
+		DP_STR(SINK_EVENT_NOTIFY),
+		DP_STR(QUERY_STREAM_ENC_STATUS),
+	};
+
+	if (req_type >= ARRAY_SIZE(req_type_str) ||
+	    !req_type_str[req_type])
+		return "unknown";
+
+	return req_type_str[req_type];
+}
+
+#undef DP_STR
+#define DP_STR(x) [DP_NAK_ ## x] = #x
+
+static const char *drm_dp_mst_nak_reason_str(u8 nak_reason)
+{
+	static const char * const nak_reason_str[] = {
+		DP_STR(WRITE_FAILURE),
+		DP_STR(INVALID_READ),
+		DP_STR(CRC_FAILURE),
+		DP_STR(BAD_PARAM),
+		DP_STR(DEFER),
+		DP_STR(LINK_FAILURE),
+		DP_STR(NO_RESOURCES),
+		DP_STR(DPCD_FAIL),
+		DP_STR(I2C_NAK),
+		DP_STR(ALLOCATE_FAIL),
+	};
+
+	if (nak_reason >= ARRAY_SIZE(nak_reason_str) ||
+	    !nak_reason_str[nak_reason])
+		return "unknown";
+
+	return nak_reason_str[nak_reason];
+}
+
+#undef DP_STR
+
 /* sideband msg handling */
 static u8 drm_dp_msg_header_crc4(const uint8_t *data, size_t num_nibbles)
 {
@@ -567,7 +626,7 @@ static bool drm_dp_sideband_parse_reply(struct drm_dp_sideband_msg_rx *raw,
 	msg->reply_type = (raw->msg[0] & 0x80) >> 7;
 	msg->req_type = (raw->msg[0] & 0x7f);
 
-	if (msg->reply_type) {
+	if (msg->reply_type == DP_SIDEBAND_REPLY_NAK) {
 		memcpy(msg->u.nak.guid, &raw->msg[1], 16);
 		msg->u.nak.reason = raw->msg[17];
 		msg->u.nak.nak_data = raw->msg[18];
@@ -593,7 +652,8 @@ static bool drm_dp_sideband_parse_reply(struct drm_dp_sideband_msg_rx *raw,
 	case DP_POWER_UP_PHY:
 		return drm_dp_sideband_parse_power_updown_phy_ack(raw, msg);
 	default:
-		DRM_ERROR("Got unknown reply 0x%02x\n", msg->req_type);
+		DRM_ERROR("Got unknown reply 0x%02x (%s)\n", msg->req_type,
+			  drm_dp_mst_req_type_str(msg->req_type));
 		return false;
 	}
 }
@@ -660,7 +720,8 @@ static bool drm_dp_sideband_parse_req(struct drm_dp_sideband_msg_rx *raw,
 	case DP_RESOURCE_STATUS_NOTIFY:
 		return drm_dp_sideband_parse_resource_status_notify(raw, msg);
 	default:
-		DRM_ERROR("Got unknown request 0x%02x\n", msg->req_type);
+		DRM_ERROR("Got unknown request 0x%02x (%s)\n", msg->req_type,
+			  drm_dp_mst_req_type_str(msg->req_type));
 		return false;
 	}
 }
@@ -849,46 +910,212 @@ static struct drm_dp_mst_branch *drm_dp_add_mst_branch_device(u8 lct, u8 *rad)
 	if (lct > 1)
 		memcpy(mstb->rad, rad, lct / 2);
 	INIT_LIST_HEAD(&mstb->ports);
-	kref_init(&mstb->kref);
+	kref_init(&mstb->topology_kref);
+	kref_init(&mstb->malloc_kref);
 	return mstb;
 }
 
-static void drm_dp_free_mst_port(struct kref *kref);
-
 static void drm_dp_free_mst_branch_device(struct kref *kref)
 {
-	struct drm_dp_mst_branch *mstb = container_of(kref, struct drm_dp_mst_branch, kref);
-	if (mstb->port_parent) {
-		if (list_empty(&mstb->port_parent->next))
-			kref_put(&mstb->port_parent->kref, drm_dp_free_mst_port);
-	}
+	struct drm_dp_mst_branch *mstb =
+		container_of(kref, struct drm_dp_mst_branch, malloc_kref);
+
+	if (mstb->port_parent)
+		drm_dp_mst_put_port_malloc(mstb->port_parent);
+
 	kfree(mstb);
 }
 
+/**
+ * DOC: Branch device and port refcounting
+ *
+ * Topology refcount overview
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * The refcounting schemes for &struct drm_dp_mst_branch and &struct
+ * drm_dp_mst_port are somewhat unusual. Both ports and branch devices have
+ * two different kinds of refcounts: topology refcounts, and malloc refcounts.
+ *
+ * Topology refcounts are not exposed to drivers, and are handled internally
+ * by the DP MST helpers. The helpers use them in order to prevent the
+ * in-memory topology state from being changed in the middle of critical
+ * operations like changing the internal state of payload allocations. This
+ * means each branch and port will be considered to be connected to the rest
+ * of the topology until its topology refcount reaches zero. Additionally,
+ * for ports this means that their associated &struct drm_connector will stay
+ * registered with userspace until the port's refcount reaches 0.
+ *
+ * Malloc refcount overview
+ * ~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * Malloc references are used to keep a &struct drm_dp_mst_port or &struct
+ * drm_dp_mst_branch allocated even after all of its topology references have
+ * been dropped, so that the driver or MST helpers can safely access each
+ * branch's last known state before it was disconnected from the topology.
+ * When the malloc refcount of a port or branch reaches 0, the memory
+ * allocation containing the &struct drm_dp_mst_branch or &struct
+ * drm_dp_mst_port respectively will be freed.
+ *
+ * For &struct drm_dp_mst_branch, malloc refcounts are not currently exposed
+ * to drivers. As of writing this documentation, there are no drivers that
+ * have a usecase for accessing &struct drm_dp_mst_branch outside of the MST
+ * helpers. Exposing this API to drivers in a race-free manner would take more
+ * tweaking of the refcounting scheme, however patches are welcome provided
+ * there is a legitimate driver usecase for this.
+ *
+ * Refcount relationships in a topology
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * Let's take a look at why the relationship between topology and malloc
+ * refcounts is designed the way it is.
+ *
+ * .. kernel-figure:: dp-mst/topology-figure-1.dot
+ *
+ *    An example of topology and malloc refs in a DP MST topology with two
+ *    active payloads. Topology refcount increments are indicated by solid
+ *    lines, and malloc refcount increments are indicated by dashed lines.
+ *    Each starts from the branch which incremented the refcount, and ends at
+ *    the branch to which the refcount belongs, i.e. the arrow points the
+ *    same way as the C pointers used to reference a structure.
+ *
+ * As you can see in the above figure, every branch increments the topology
+ * refcount of its children, and increments the malloc refcount of its
+ * parent. Additionally, every payload increments the malloc refcount of its
+ * assigned port by 1.
+ *
+ * So, what would happen if MSTB #3 from the above figure was unplugged from
+ * the system, but the driver hadn't yet removed payload #2 from port #3? The
+ * topology would start to look like the figure below.
+ *
+ * .. kernel-figure:: dp-mst/topology-figure-2.dot
+ *
+ *    Ports and branch devices which have been released from memory are
+ *    colored grey, and references which have been removed are colored red.
+ *
+ * Whenever a port or branch device's topology refcount reaches zero, it will
+ * decrement the topology refcounts of all its children, the malloc refcount
+ * of its parent, and finally its own malloc refcount. For MSTB #4 and port
+ * #4, this means they both have been disconnected from the topology and freed
+ * from memory. But, because payload #2 is still holding a reference to port
+ * #3, port #3 is removed from the topology but its &struct drm_dp_mst_port
+ * is still accessible from memory. This also means port #3 has not yet
+ * decremented the malloc refcount of MSTB #3, so its &struct
+ * drm_dp_mst_branch will also stay allocated in memory until port #3's
+ * malloc refcount reaches 0.
+ *
+ * This relationship is necessary because in order to release payload #2, we
+ * need to be able to figure out the last relative of port #3 that's still
+ * connected to the topology. In this case, we would travel up the topology as
+ * shown below.
+ *
+ * .. kernel-figure:: dp-mst/topology-figure-3.dot
+ *
+ * And finally, remove payload #2 by communicating with port #2 through
+ * sideband transactions.
+ */
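
As a concrete illustration of the malloc-refcount contract described above, a
driver-side connector would pin its port roughly like this; everything except
the drm_dp_mst_* and drm_connector_cleanup() calls is a placeholder:

struct foo_mst_connector {
	struct drm_connector base;
	struct drm_dp_mst_port *port;
};

#define to_foo_mst_connector(c) \
	container_of(c, struct foo_mst_connector, base)

static struct drm_connector *
foo_mst_add_connector(struct drm_dp_mst_topology_mgr *mgr,
		      struct drm_dp_mst_port *port, const char *path)
{
	struct foo_mst_connector *conn = foo_mst_connector_alloc(mgr, path);

	if (!conn)
		return NULL;

	/* Pin the port's memory allocation for the connector's
	 * lifetime, so conn->port stays valid even after the port
	 * drops out of the topology. */
	conn->port = port;
	drm_dp_mst_get_port_malloc(port);

	return &conn->base;
}

static void foo_mst_connector_destroy(struct drm_connector *connector)
{
	struct foo_mst_connector *conn = to_foo_mst_connector(connector);

	drm_connector_cleanup(connector);
	drm_dp_mst_put_port_malloc(conn->port);
	kfree(conn);
}
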
+
+/**
+ * drm_dp_mst_get_mstb_malloc() - Increment the malloc refcount of a branch
+ * device
+ * @mstb: The &struct drm_dp_mst_branch to increment the malloc refcount of
+ *
+ * Increments &drm_dp_mst_branch.malloc_kref. When
+ * &drm_dp_mst_branch.malloc_kref reaches 0, the memory allocation for @mstb
+ * will be released and @mstb may no longer be used.
+ *
+ * See also: drm_dp_mst_put_mstb_malloc()
+ */
+static void
+drm_dp_mst_get_mstb_malloc(struct drm_dp_mst_branch *mstb)
+{
+	kref_get(&mstb->malloc_kref);
+	DRM_DEBUG("mstb %p (%d)\n", mstb, kref_read(&mstb->malloc_kref));
+}
+
+/**
+ * drm_dp_mst_put_mstb_malloc() - Decrement the malloc refcount of a branch
+ * device
+ * @mstb: The &struct drm_dp_mst_branch to decrement the malloc refcount of
+ *
+ * Decrements &drm_dp_mst_branch.malloc_kref. When
+ * &drm_dp_mst_branch.malloc_kref reaches 0, the memory allocation for @mstb
+ * will be released and @mstb may no longer be used.
+ *
+ * See also: drm_dp_mst_get_mstb_malloc()
+ */
+static void
+drm_dp_mst_put_mstb_malloc(struct drm_dp_mst_branch *mstb)
+{
+	DRM_DEBUG("mstb %p (%d)\n", mstb, kref_read(&mstb->malloc_kref) - 1);
+	kref_put(&mstb->malloc_kref, drm_dp_free_mst_branch_device);
+}
+
+static void drm_dp_free_mst_port(struct kref *kref)
+{
+	struct drm_dp_mst_port *port =
+		container_of(kref, struct drm_dp_mst_port, malloc_kref);
+
+	drm_dp_mst_put_mstb_malloc(port->parent);
+	kfree(port);
+}
+
+/**
+ * drm_dp_mst_get_port_malloc() - Increment the malloc refcount of an MST port
+ * @port: The &struct drm_dp_mst_port to increment the malloc refcount of
+ *
+ * Increments &drm_dp_mst_port.malloc_kref. When &drm_dp_mst_port.malloc_kref
+ * reaches 0, the memory allocation for @port will be released and @port may
+ * no longer be used.
+ *
+ * Because @port could potentially be freed at any time by the DP MST helpers
+ * if &drm_dp_mst_port.malloc_kref reaches 0, including during a call to this
+ * function, drivers that wish to make use of &struct drm_dp_mst_port should
+ * ensure that they grab at least one main malloc reference to their MST ports
+ * in &drm_dp_mst_topology_cbs.add_connector. This callback is called before
+ * there is any chance for &drm_dp_mst_port.malloc_kref to reach 0.
+ *
+ * See also: drm_dp_mst_put_port_malloc()
+ */
+void
+drm_dp_mst_get_port_malloc(struct drm_dp_mst_port *port)
+{
+	kref_get(&port->malloc_kref);
+	DRM_DEBUG("port %p (%d)\n", port, kref_read(&port->malloc_kref));
+}
+EXPORT_SYMBOL(drm_dp_mst_get_port_malloc);
+
+/**
+ * drm_dp_mst_put_port_malloc() - Decrement the malloc refcount of an MST port
+ * @port: The &struct drm_dp_mst_port to decrement the malloc refcount of
+ *
+ * Decrements &drm_dp_mst_port.malloc_kref. When &drm_dp_mst_port.malloc_kref
+ * reaches 0, the memory allocation for @port will be released and @port may
+ * no longer be used.
+ *
+ * See also: drm_dp_mst_get_port_malloc()
+ */
+void
+drm_dp_mst_put_port_malloc(struct drm_dp_mst_port *port)
+{
+	DRM_DEBUG("port %p (%d)\n", port, kref_read(&port->malloc_kref) - 1);
+	kref_put(&port->malloc_kref, drm_dp_free_mst_port);
+}
+EXPORT_SYMBOL(drm_dp_mst_put_port_malloc);
+
 static void drm_dp_destroy_mst_branch_device(struct kref *kref)
 {
-	struct drm_dp_mst_branch *mstb = container_of(kref, struct drm_dp_mst_branch, kref);
+	struct drm_dp_mst_branch *mstb =
+		container_of(kref, struct drm_dp_mst_branch, topology_kref);
+	struct drm_dp_mst_topology_mgr *mgr = mstb->mgr;
 	struct drm_dp_mst_port *port, *tmp;
 	bool wake_tx = false;
 
-	/*
-	 * init kref again to be used by ports to remove mst branch when it is
-	 * not needed anymore
-	 */
-	kref_init(kref);
-
-	if (mstb->port_parent && list_empty(&mstb->port_parent->next))
-		kref_get(&mstb->port_parent->kref);
-
-	/*
-	 * destroy all ports - don't need lock
-	 * as there are no more references to the mst branch
-	 * device at this point.
-	 */
+	mutex_lock(&mgr->lock);
 	list_for_each_entry_safe(port, tmp, &mstb->ports, next) {
 		list_del(&port->next);
-		drm_dp_put_port(port);
+		drm_dp_mst_topology_put_port(port);
 	}
+	mutex_unlock(&mgr->lock);
 
 	/* drop any tx slots msg */
 	mutex_lock(&mstb->mgr->qlock);
@@ -907,14 +1134,83 @@ static void drm_dp_destroy_mst_branch_device(struct kref *kref)
 	if (wake_tx)
 		wake_up_all(&mstb->mgr->tx_waitq);
 
-	kref_put(kref, drm_dp_free_mst_branch_device);
+	drm_dp_mst_put_mstb_malloc(mstb);
 }
 
-static void drm_dp_put_mst_branch_device(struct drm_dp_mst_branch *mstb)
+/**
+ * drm_dp_mst_topology_try_get_mstb() - Increment the topology refcount of a
+ * branch device unless it's zero
+ * @mstb: &struct drm_dp_mst_branch to increment the topology refcount of
+ *
+ * Attempts to grab a topology reference to @mstb, if it hasn't yet been
+ * removed from the topology (e.g. &drm_dp_mst_branch.topology_kref has
+ * reached 0). Holding a topology reference implies that a malloc reference
+ * will be held to @mstb as long as the user holds the topology reference.
+ *
+ * Care should be taken to ensure that the user has at least one malloc
+ * reference to @mstb. If you already have a topology reference to @mstb, you
+ * should use drm_dp_mst_topology_get_mstb() instead.
+ *
+ * See also:
+ * drm_dp_mst_topology_get_mstb()
+ * drm_dp_mst_topology_put_mstb()
+ *
+ * Returns:
+ * * 1: A topology reference was grabbed successfully
+ * * 0: @mstb is no longer in the topology, no reference was grabbed
+ */
+static int __must_check
+drm_dp_mst_topology_try_get_mstb(struct drm_dp_mst_branch *mstb)
 {
-	kref_put(&mstb->kref, drm_dp_destroy_mst_branch_device);
+	int ret = kref_get_unless_zero(&mstb->topology_kref);
+
+	if (ret)
+		DRM_DEBUG("mstb %p (%d)\n", mstb,
+			  kref_read(&mstb->topology_kref));
+
+	return ret;
 }
 
+/**
+ * drm_dp_mst_topology_get_mstb() - Increment the topology refcount of a
+ * branch device
+ * @mstb: The &struct drm_dp_mst_branch to increment the topology refcount of
+ *
+ * Increments &drm_dp_mst_branch.topology_kref without checking whether or
+ * not it's already reached 0. This is only valid to use in scenarios where
+ * you are already guaranteed to have at least one active topology reference
+ * to @mstb. Otherwise, drm_dp_mst_topology_try_get_mstb() must be used.
+ *
+ * See also:
+ * drm_dp_mst_topology_try_get_mstb()
+ * drm_dp_mst_topology_put_mstb()
+ */
+static void drm_dp_mst_topology_get_mstb(struct drm_dp_mst_branch *mstb)
+{
+	WARN_ON(kref_read(&mstb->topology_kref) == 0);
+	kref_get(&mstb->topology_kref);
+	DRM_DEBUG("mstb %p (%d)\n", mstb, kref_read(&mstb->topology_kref));
+}
+
+/**
+ * drm_dp_mst_topology_put_mstb() - release a topology reference to a branch
+ * device
+ * @mstb: The &struct drm_dp_mst_branch to release the topology reference from
+ *
+ * Releases a topology reference from @mstb by decrementing
+ * &drm_dp_mst_branch.topology_kref.
+ *
+ * See also:
+ * drm_dp_mst_topology_try_get_mstb()
+ * drm_dp_mst_topology_get_mstb()
+ */
+static void
+drm_dp_mst_topology_put_mstb(struct drm_dp_mst_branch *mstb)
+{
+	DRM_DEBUG("mstb %p (%d)\n",
+		  mstb, kref_read(&mstb->topology_kref) - 1);
+	kref_put(&mstb->topology_kref, drm_dp_destroy_mst_branch_device);
+}
 
 static void drm_dp_port_teardown_pdt(struct drm_dp_mst_port *port, int old_pdt)
 {
@@ -929,19 +1225,18 @@ static void drm_dp_port_teardown_pdt(struct drm_dp_mst_port *port, int old_pdt)
 	case DP_PEER_DEVICE_MST_BRANCHING:
 		mstb = port->mstb;
 		port->mstb = NULL;
-		drm_dp_put_mst_branch_device(mstb);
+		drm_dp_mst_topology_put_mstb(mstb);
 		break;
 	}
 }
 
 static void drm_dp_destroy_port(struct kref *kref)
 {
-	struct drm_dp_mst_port *port = container_of(kref, struct drm_dp_mst_port, kref);
+	struct drm_dp_mst_port *port =
+		container_of(kref, struct drm_dp_mst_port, topology_kref);
 	struct drm_dp_mst_topology_mgr *mgr = port->mgr;
 
 	if (!port->input) {
-		port->vcpi.num_slots = 0;
-
 		kfree(port->cached_edid);
 
 		/*
@@ -955,7 +1250,6 @@ static void drm_dp_destroy_port(struct kref *kref)
 			 * from an EDID retrieval */
 
 			mutex_lock(&mgr->destroy_connector_lock);
-			kref_get(&port->parent->kref);
 			list_add(&port->next, &mgr->destroy_connector_list);
 			mutex_unlock(&mgr->destroy_connector_lock);
 			schedule_work(&mgr->destroy_connector_work);
@@ -966,25 +1260,95 @@ static void drm_dp_destroy_port(struct kref *kref)
 		drm_dp_port_teardown_pdt(port, port->pdt);
 		port->pdt = DP_PEER_DEVICE_NONE;
 	}
-	kfree(port);
+	drm_dp_mst_put_port_malloc(port);
 }
 
-static void drm_dp_put_port(struct drm_dp_mst_port *port)
+/**
+ * drm_dp_mst_topology_try_get_port() - Increment the topology refcount of a
+ * port unless it's zero
+ * @port: &struct drm_dp_mst_port to increment the topology refcount of
+ *
+ * Attempts to grab a topology reference to @port, if it hasn't yet been
+ * removed from the topology (e.g. &drm_dp_mst_port.topology_kref has reached
+ * 0). Holding a topology reference implies that a malloc reference will be
+ * held to @port as long as the user holds the topology reference.
+ *
+ * Care should be taken to ensure that the user has at least one malloc
+ * reference to @port. If you already have a topology reference to @port, you
+ * should use drm_dp_mst_topology_get_port() instead.
+ *
+ * See also:
+ * drm_dp_mst_topology_get_port()
+ * drm_dp_mst_topology_put_port()
+ *
+ * Returns:
+ * * 1: A topology reference was grabbed successfully
+ * * 0: @port is no longer in the topology, no reference was grabbed
+ */
+static int __must_check
+drm_dp_mst_topology_try_get_port(struct drm_dp_mst_port *port)
 {
-	kref_put(&port->kref, drm_dp_destroy_port);
+	int ret = kref_get_unless_zero(&port->topology_kref);
+
+	if (ret)
+		DRM_DEBUG("port %p (%d)\n", port,
+			  kref_read(&port->topology_kref));
+
+	return ret;
 }
 
-static struct drm_dp_mst_branch *drm_dp_mst_get_validated_mstb_ref_locked(struct drm_dp_mst_branch *mstb, struct drm_dp_mst_branch *to_find)
+/**
+ * drm_dp_mst_topology_get_port() - Increment the topology refcount of a port
+ * @port: The &struct drm_dp_mst_port to increment the topology refcount of
+ *
+ * Increments &drm_dp_mst_port.topology_kref without checking whether or
+ * not it's already reached 0. This is only valid to use in scenarios where
+ * you are already guaranteed to have at least one active topology reference
+ * to @port. Otherwise, drm_dp_mst_topology_try_get_port() must be used.
+ *
+ * See also:
+ * drm_dp_mst_topology_try_get_port()
+ * drm_dp_mst_topology_put_port()
+ */
+static void drm_dp_mst_topology_get_port(struct drm_dp_mst_port *port)
+{
+	WARN_ON(kref_read(&port->topology_kref) == 0);
+	kref_get(&port->topology_kref);
+	DRM_DEBUG("port %p (%d)\n", port, kref_read(&port->topology_kref));
+}
+
+/**
+ * drm_dp_mst_topology_put_port() - release a topology reference to a port
+ * @port: The &struct drm_dp_mst_port to release the topology reference from
+ *
+ * Releases a topology reference from @port by decrementing
+ * &drm_dp_mst_port.topology_kref.
+ *
+ * See also:
+ * drm_dp_mst_topology_try_get_port()
+ * drm_dp_mst_topology_get_port()
+ */
+static void drm_dp_mst_topology_put_port(struct drm_dp_mst_port *port)
+{
+	DRM_DEBUG("port %p (%d)\n",
+		  port, kref_read(&port->topology_kref) - 1);
+	kref_put(&port->topology_kref, drm_dp_destroy_port);
+}
+
+static struct drm_dp_mst_branch *
+drm_dp_mst_topology_get_mstb_validated_locked(struct drm_dp_mst_branch *mstb,
+					      struct drm_dp_mst_branch *to_find)
 {
 	struct drm_dp_mst_port *port;
 	struct drm_dp_mst_branch *rmstb;
-	if (to_find == mstb) {
-		kref_get(&mstb->kref);
+
+	if (to_find == mstb)
 		return mstb;
-	}
+
 	list_for_each_entry(port, &mstb->ports, next) {
 		if (port->mstb) {
-			rmstb = drm_dp_mst_get_validated_mstb_ref_locked(port->mstb, to_find);
+			rmstb = drm_dp_mst_topology_get_mstb_validated_locked(
+			    port->mstb, to_find);
 			if (rmstb)
 				return rmstb;
 		}
@@ -992,27 +1356,37 @@ static struct drm_dp_mst_branch *drm_dp_mst_get_validated_mstb_ref_locked(struct
 	return NULL;
 }
 
-static struct drm_dp_mst_branch *drm_dp_get_validated_mstb_ref(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_branch *mstb)
+static struct drm_dp_mst_branch *
+drm_dp_mst_topology_get_mstb_validated(struct drm_dp_mst_topology_mgr *mgr,
+				       struct drm_dp_mst_branch *mstb)
 {
 	struct drm_dp_mst_branch *rmstb = NULL;
+
 	mutex_lock(&mgr->lock);
-	if (mgr->mst_primary)
-		rmstb = drm_dp_mst_get_validated_mstb_ref_locked(mgr->mst_primary, mstb);
+	if (mgr->mst_primary) {
+		rmstb = drm_dp_mst_topology_get_mstb_validated_locked(
+		    mgr->mst_primary, mstb);
+
+		if (rmstb && !drm_dp_mst_topology_try_get_mstb(rmstb))
+			rmstb = NULL;
+	}
 	mutex_unlock(&mgr->lock);
 	return rmstb;
 }
 
-static struct drm_dp_mst_port *drm_dp_mst_get_port_ref_locked(struct drm_dp_mst_branch *mstb, struct drm_dp_mst_port *to_find)
+static struct drm_dp_mst_port *
+drm_dp_mst_topology_get_port_validated_locked(struct drm_dp_mst_branch *mstb,
+					      struct drm_dp_mst_port *to_find)
 {
 	struct drm_dp_mst_port *port, *mport;
 
 	list_for_each_entry(port, &mstb->ports, next) {
-		if (port == to_find) {
-			kref_get(&port->kref);
+		if (port == to_find)
 			return port;
-		}
+
 		if (port->mstb) {
-			mport = drm_dp_mst_get_port_ref_locked(port->mstb, to_find);
+			mport = drm_dp_mst_topology_get_port_validated_locked(
+			    port->mstb, to_find);
 			if (mport)
 				return mport;
 		}
@@ -1020,12 +1394,20 @@ static struct drm_dp_mst_port *drm_dp_mst_get_port_ref_locked(struct drm_dp_mst_
 	return NULL;
 }
 
-static struct drm_dp_mst_port *drm_dp_get_validated_port_ref(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_port *port)
+static struct drm_dp_mst_port *
+drm_dp_mst_topology_get_port_validated(struct drm_dp_mst_topology_mgr *mgr,
+				       struct drm_dp_mst_port *port)
 {
 	struct drm_dp_mst_port *rport = NULL;
+
 	mutex_lock(&mgr->lock);
-	if (mgr->mst_primary)
-		rport = drm_dp_mst_get_port_ref_locked(mgr->mst_primary, port);
+	if (mgr->mst_primary) {
+		rport = drm_dp_mst_topology_get_port_validated_locked(
+		    mgr->mst_primary, port);
+
+		if (rport && !drm_dp_mst_topology_try_get_port(rport))
+			rport = NULL;
+	}
 	mutex_unlock(&mgr->lock);
 	return rport;
 }
@@ -1033,11 +1415,12 @@ static struct drm_dp_mst_port *drm_dp_get_validated_port_ref(struct drm_dp_mst_t
 static struct drm_dp_mst_port *drm_dp_get_port(struct drm_dp_mst_branch *mstb, u8 port_num)
 {
 	struct drm_dp_mst_port *port;
+	int ret;
 
 	list_for_each_entry(port, &mstb->ports, next) {
 		if (port->port_num == port_num) {
-			kref_get(&port->kref);
-			return port;
+			ret = drm_dp_mst_topology_try_get_port(port);
+			return ret ? port : NULL;
 		}
 	}
 
@@ -1086,6 +1469,11 @@ static bool drm_dp_port_setup_pdt(struct drm_dp_mst_port *port)
 		if (port->mstb) {
 			port->mstb->mgr = port->mgr;
 			port->mstb->port_parent = port;
+			/*
+			 * Make sure this port's memory allocation stays
+			 * around until its child MSTB releases it
+			 */
+			drm_dp_mst_get_port_malloc(port);
 
 			send_link = true;
 		}
@@ -1146,17 +1534,26 @@ static void drm_dp_add_port(struct drm_dp_mst_branch *mstb,
 	bool created = false;
 	int old_pdt = 0;
 	int old_ddps = 0;
+
 	port = drm_dp_get_port(mstb, port_msg->port_number);
 	if (!port) {
 		port = kzalloc(sizeof(*port), GFP_KERNEL);
 		if (!port)
 			return;
-		kref_init(&port->kref);
+		kref_init(&port->topology_kref);
+		kref_init(&port->malloc_kref);
 		port->parent = mstb;
 		port->port_num = port_msg->port_number;
 		port->mgr = mstb->mgr;
 		port->aux.name = "DPMST";
 		port->aux.dev = dev->dev;
+
+		/*
+		 * Make sure the memory allocation for our parent branch stays
+		 * around until our own memory allocation is released
+		 */
+		drm_dp_mst_get_mstb_malloc(mstb);
+
 		created = true;
 	} else {
 		old_pdt = port->pdt;
@@ -1176,18 +1573,20 @@ static void drm_dp_add_port(struct drm_dp_mst_branch *mstb,
 	   for this list */
 	if (created) {
 		mutex_lock(&mstb->mgr->lock);
-		kref_get(&port->kref);
+		drm_dp_mst_topology_get_port(port);
 		list_add(&port->next, &mstb->ports);
 		mutex_unlock(&mstb->mgr->lock);
 	}
 
 	if (old_ddps != port->ddps) {
 		if (port->ddps) {
-			if (!port->input)
-				drm_dp_send_enum_path_resources(mstb->mgr, mstb, port);
+			if (!port->input) {
+				drm_dp_send_enum_path_resources(mstb->mgr,
+								mstb, port);
+			}
 		} else {
 			port->available_pbn = 0;
-			}
+		}
 	}
 
 	if (old_pdt != port->pdt && !port->input) {
@@ -1201,21 +1600,25 @@ static void drm_dp_add_port(struct drm_dp_mst_branch *mstb,
 	if (created && !port->input) {
 		char proppath[255];
 
-		build_mst_prop_path(mstb, port->port_num, proppath, sizeof(proppath));
-		port->connector = (*mstb->mgr->cbs->add_connector)(mstb->mgr, port, proppath);
+		build_mst_prop_path(mstb, port->port_num, proppath,
+				    sizeof(proppath));
+		port->connector = (*mstb->mgr->cbs->add_connector)(mstb->mgr,
+								   port,
+								   proppath);
 		if (!port->connector) {
 			/* remove it from the port list */
 			mutex_lock(&mstb->mgr->lock);
 			list_del(&port->next);
 			mutex_unlock(&mstb->mgr->lock);
 			/* drop port list reference */
-			drm_dp_put_port(port);
+			drm_dp_mst_topology_put_port(port);
 			goto out;
 		}
 		if ((port->pdt == DP_PEER_DEVICE_DP_LEGACY_CONV ||
 		     port->pdt == DP_PEER_DEVICE_SST_SINK) &&
 		    port->port_num >= DP_MST_LOGICAL_PORT_0) {
-			port->cached_edid = drm_get_edid(port->connector, &port->aux.ddc);
+			port->cached_edid = drm_get_edid(port->connector,
+							 &port->aux.ddc);
 			drm_connector_set_tile_property(port->connector);
 		}
 		(*mstb->mgr->cbs->register_connector)(port->connector);
@@ -1223,7 +1626,7 @@ static void drm_dp_add_port(struct drm_dp_mst_branch *mstb,
 
 out:
 	/* put reference to this port */
-	drm_dp_put_port(port);
+	drm_dp_mst_topology_put_port(port);
 }
 
 static void drm_dp_update_port(struct drm_dp_mst_branch *mstb,
@@ -1258,7 +1661,7 @@ static void drm_dp_update_port(struct drm_dp_mst_branch *mstb,
 			dowork = true;
 	}
 
-	drm_dp_put_port(port);
+	drm_dp_mst_topology_put_port(port);
 	if (dowork)
 		queue_work(system_long_wq, &mstb->mgr->work);
 
@@ -1269,7 +1672,7 @@ static struct drm_dp_mst_branch *drm_dp_get_mst_branch_device(struct drm_dp_mst_
 {
 	struct drm_dp_mst_branch *mstb;
 	struct drm_dp_mst_port *port;
-	int i;
+	int i, ret;
 	/* find the port by iterating down */
 
 	mutex_lock(&mgr->lock);
@@ -1294,7 +1697,9 @@ static struct drm_dp_mst_branch *drm_dp_get_mst_branch_device(struct drm_dp_mst_
 			}
 		}
 	}
-	kref_get(&mstb->kref);
+	ret = drm_dp_mst_topology_try_get_mstb(mstb);
+	if (!ret)
+		mstb = NULL;
 out:
 	mutex_unlock(&mgr->lock);
 	return mstb;
@@ -1324,19 +1729,22 @@ static struct drm_dp_mst_branch *get_mst_branch_device_by_guid_helper(
 	return NULL;
 }
 
-static struct drm_dp_mst_branch *drm_dp_get_mst_branch_device_by_guid(
-	struct drm_dp_mst_topology_mgr *mgr,
-	uint8_t *guid)
+static struct drm_dp_mst_branch *
+drm_dp_get_mst_branch_device_by_guid(struct drm_dp_mst_topology_mgr *mgr,
+				     uint8_t *guid)
 {
 	struct drm_dp_mst_branch *mstb;
+	int ret;
 
 	/* find the port by iterating down */
 	mutex_lock(&mgr->lock);
 
 	mstb = get_mst_branch_device_by_guid_helper(mgr->mst_primary, guid);
-
-	if (mstb)
-		kref_get(&mstb->kref);
+	if (mstb) {
+		ret = drm_dp_mst_topology_try_get_mstb(mstb);
+		if (!ret)
+			mstb = NULL;
+	}
 
 	mutex_unlock(&mgr->lock);
 	return mstb;
@@ -1361,10 +1769,11 @@ static void drm_dp_check_and_send_link_address(struct drm_dp_mst_topology_mgr *m
 			drm_dp_send_enum_path_resources(mgr, mstb, port);
 
 		if (port->mstb) {
-			mstb_child = drm_dp_get_validated_mstb_ref(mgr, port->mstb);
+			mstb_child = drm_dp_mst_topology_get_mstb_validated(
+			    mgr, port->mstb);
 			if (mstb_child) {
 				drm_dp_check_and_send_link_address(mgr, mstb_child);
-				drm_dp_put_mst_branch_device(mstb_child);
+				drm_dp_mst_topology_put_mstb(mstb_child);
 			}
 		}
 	}
@@ -1374,16 +1783,19 @@ static void drm_dp_mst_link_probe_work(struct work_struct *work)
 {
 	struct drm_dp_mst_topology_mgr *mgr = container_of(work, struct drm_dp_mst_topology_mgr, work);
 	struct drm_dp_mst_branch *mstb;
+	int ret;
 
 	mutex_lock(&mgr->lock);
 	mstb = mgr->mst_primary;
 	if (mstb) {
-		kref_get(&mstb->kref);
+		ret = drm_dp_mst_topology_try_get_mstb(mstb);
+		if (!ret)
+			mstb = NULL;
 	}
 	mutex_unlock(&mgr->lock);
 	if (mstb) {
 		drm_dp_check_and_send_link_address(mgr, mstb);
-		drm_dp_put_mst_branch_device(mstb);
+		drm_dp_mst_topology_put_mstb(mstb);
 	}
 }
 
@@ -1617,9 +2029,9 @@ static void drm_dp_send_link_address(struct drm_dp_mst_topology_mgr *mgr,
 	if (ret > 0) {
 		int i;
 
-		if (txmsg->reply.reply_type == 1)
+		if (txmsg->reply.reply_type == DP_SIDEBAND_REPLY_NAK) {
 			DRM_DEBUG_KMS("link address nak received\n");
-		else {
+		} else {
 			DRM_DEBUG_KMS("link address reply: %d\n", txmsg->reply.u.link_addr.nports);
 			for (i = 0; i < txmsg->reply.u.link_addr.nports; i++) {
 				DRM_DEBUG_KMS("port %d: input %d, pdt: %d, pn: %d, dpcd_rev: %02x, mcs: %d, ddps: %d, ldps %d, sdp %d/%d\n", i,
@@ -1639,7 +2051,7 @@ static void drm_dp_send_link_address(struct drm_dp_mst_topology_mgr *mgr,
 			for (i = 0; i < txmsg->reply.u.link_addr.nports; i++) {
 				drm_dp_add_port(mstb, mgr->dev, &txmsg->reply.u.link_addr.ports[i]);
 			}
-			(*mgr->cbs->hotplug)(mgr);
+			drm_kms_helper_hotplug_event(mgr->dev);
 		}
 	} else {
 		mstb->link_address_sent = false;
@@ -1668,9 +2080,9 @@ static int drm_dp_send_enum_path_resources(struct drm_dp_mst_topology_mgr *mgr,
 
 	ret = drm_dp_mst_wait_tx_reply(mstb, txmsg);
 	if (ret > 0) {
-		if (txmsg->reply.reply_type == 1)
+		if (txmsg->reply.reply_type == DP_SIDEBAND_REPLY_NAK) {
 			DRM_DEBUG_KMS("enum path resources nak received\n");
-		else {
+		} else {
 			if (port->port_num != txmsg->reply.u.path_resources.port_number)
 				DRM_ERROR("got incorrect port in response\n");
 			DRM_DEBUG_KMS("enum path resources %d: %d %d\n", txmsg->reply.u.path_resources.port_number, txmsg->reply.u.path_resources.full_payload_bw_number,
@@ -1694,22 +2106,40 @@ static struct drm_dp_mst_port *drm_dp_get_last_connected_port_to_mstb(struct drm
 	return drm_dp_get_last_connected_port_to_mstb(mstb->port_parent->parent);
 }
 
-static struct drm_dp_mst_branch *drm_dp_get_last_connected_port_and_mstb(struct drm_dp_mst_topology_mgr *mgr,
-									 struct drm_dp_mst_branch *mstb,
-									 int *port_num)
+/*
+ * Searches upwards in the topology starting from mstb to try to find the
+ * closest available parent of mstb that's still connected to the rest of the
+ * topology. This can be used in order to perform operations like releasing
+ * payloads, where the branch device which owned the payload may no longer be
+ * around and thus would require that the payload on the last living relative
+ * be freed instead.
+ */
+static struct drm_dp_mst_branch *
+drm_dp_get_last_connected_port_and_mstb(struct drm_dp_mst_topology_mgr *mgr,
+					struct drm_dp_mst_branch *mstb,
+					int *port_num)
 {
 	struct drm_dp_mst_branch *rmstb = NULL;
 	struct drm_dp_mst_port *found_port;
+
 	mutex_lock(&mgr->lock);
-	if (mgr->mst_primary) {
+	if (!mgr->mst_primary)
+		goto out;
+
+	do {
 		found_port = drm_dp_get_last_connected_port_to_mstb(mstb);
+		if (!found_port)
+			break;
 
-		if (found_port) {
+		if (drm_dp_mst_topology_try_get_mstb(found_port->parent)) {
 			rmstb = found_port->parent;
-			kref_get(&rmstb->kref);
 			*port_num = found_port->port_num;
+		} else {
+			/* Search again, starting from this parent */
+			mstb = found_port->parent;
 		}
-	}
+	} while (!rmstb);
+out:
 	mutex_unlock(&mgr->lock);
 	return rmstb;
 }
@@ -1725,19 +2155,15 @@ static int drm_dp_payload_send_msg(struct drm_dp_mst_topology_mgr *mgr,
 	u8 sinks[DRM_DP_MAX_SDP_STREAMS];
 	int i;
 
-	port = drm_dp_get_validated_port_ref(mgr, port);
-	if (!port)
-		return -EINVAL;
-
 	port_num = port->port_num;
-	mstb = drm_dp_get_validated_mstb_ref(mgr, port->parent);
+	mstb = drm_dp_mst_topology_get_mstb_validated(mgr, port->parent);
 	if (!mstb) {
-		mstb = drm_dp_get_last_connected_port_and_mstb(mgr, port->parent, &port_num);
+		mstb = drm_dp_get_last_connected_port_and_mstb(mgr,
+							       port->parent,
+							       &port_num);
 
-		if (!mstb) {
-			drm_dp_put_port(port);
+		if (!mstb)
 			return -EINVAL;
-		}
 	}
 
 	txmsg = kzalloc(sizeof(*txmsg), GFP_KERNEL);
@@ -1756,17 +2182,24 @@ static int drm_dp_payload_send_msg(struct drm_dp_mst_topology_mgr *mgr,
 
 	drm_dp_queue_down_tx(mgr, txmsg);
 
+	/*
+	 * FIXME: there is a small chance that between getting the last
+	 * connected mstb and sending the payload message, the last connected
+	 * mstb could also be removed from the topology. In the future, this
+	 * needs to be fixed by restarting the
+	 * drm_dp_get_last_connected_port_and_mstb() search in the event of a
+	 * timeout if the topology is still connected to the system.
+	 */
 	ret = drm_dp_mst_wait_tx_reply(mstb, txmsg);
 	if (ret > 0) {
-		if (txmsg->reply.reply_type == 1) {
+		if (txmsg->reply.reply_type == DP_SIDEBAND_REPLY_NAK)
 			ret = -EINVAL;
-		} else
+		else
 			ret = 0;
 	}
 	kfree(txmsg);
 fail_put:
-	drm_dp_put_mst_branch_device(mstb);
-	drm_dp_put_port(port);
+	drm_dp_mst_topology_put_mstb(mstb);
 	return ret;
 }
 
@@ -1776,13 +2209,13 @@ int drm_dp_send_power_updown_phy(struct drm_dp_mst_topology_mgr *mgr,
 	struct drm_dp_sideband_msg_tx *txmsg;
 	int len, ret;
 
-	port = drm_dp_get_validated_port_ref(mgr, port);
+	port = drm_dp_mst_topology_get_port_validated(mgr, port);
 	if (!port)
 		return -EINVAL;
 
 	txmsg = kzalloc(sizeof(*txmsg), GFP_KERNEL);
 	if (!txmsg) {
-		drm_dp_put_port(port);
+		drm_dp_mst_topology_put_port(port);
 		return -ENOMEM;
 	}
 
@@ -1792,13 +2225,13 @@ int drm_dp_send_power_updown_phy(struct drm_dp_mst_topology_mgr *mgr,
 
 	ret = drm_dp_mst_wait_tx_reply(port->parent, txmsg);
 	if (ret > 0) {
-		if (txmsg->reply.reply_type == 1)
+		if (txmsg->reply.reply_type == DP_SIDEBAND_REPLY_NAK)
 			ret = -EINVAL;
 		else
 			ret = 0;
 	}
 	kfree(txmsg);
-	drm_dp_put_port(port);
+	drm_dp_mst_topology_put_port(port);
 
 	return ret;
 }
@@ -1838,7 +2271,7 @@ static int drm_dp_destroy_payload_step1(struct drm_dp_mst_topology_mgr *mgr,
 					struct drm_dp_payload *payload)
 {
 	DRM_DEBUG_KMS("\n");
-	/* its okay for these to fail */
+	/* it's okay for these to fail */
 	if (port) {
 		drm_dp_payload_send_msg(mgr, port, id, 0);
 	}
@@ -1871,72 +2304,93 @@ static int drm_dp_destroy_payload_step2(struct drm_dp_mst_topology_mgr *mgr,
  */
 int drm_dp_update_payload_part1(struct drm_dp_mst_topology_mgr *mgr)
 {
-	int i, j;
-	int cur_slots = 1;
 	struct drm_dp_payload req_payload;
 	struct drm_dp_mst_port *port;
+	int i, j;
+	int cur_slots = 1;
 
 	mutex_lock(&mgr->payload_lock);
 	for (i = 0; i < mgr->max_payloads; i++) {
+		struct drm_dp_vcpi *vcpi = mgr->proposed_vcpis[i];
+		struct drm_dp_payload *payload = &mgr->payloads[i];
+		bool put_port = false;
+
 		/* solve the current payloads - compare to the hw ones
 		   - update the hw view */
 		req_payload.start_slot = cur_slots;
-		if (mgr->proposed_vcpis[i]) {
-			port = container_of(mgr->proposed_vcpis[i], struct drm_dp_mst_port, vcpi);
-			port = drm_dp_get_validated_port_ref(mgr, port);
-			if (!port) {
-				mutex_unlock(&mgr->payload_lock);
-				return -EINVAL;
+		if (vcpi) {
+			port = container_of(vcpi, struct drm_dp_mst_port,
+					    vcpi);
+
+			/* A validated port ref isn't needed if we're
+			 * just releasing VCPI
+			 */
+			if (vcpi->num_slots) {
+				port = drm_dp_mst_topology_get_port_validated(
+				    mgr, port);
+				if (!port) {
+					mutex_unlock(&mgr->payload_lock);
+					return -EINVAL;
+				}
+				put_port = true;
 			}
-			req_payload.num_slots = mgr->proposed_vcpis[i]->num_slots;
-			req_payload.vcpi = mgr->proposed_vcpis[i]->vcpi;
+
+			req_payload.num_slots = vcpi->num_slots;
+			req_payload.vcpi = vcpi->vcpi;
 		} else {
 			port = NULL;
 			req_payload.num_slots = 0;
 		}
 
-		if (mgr->payloads[i].start_slot != req_payload.start_slot) {
-			mgr->payloads[i].start_slot = req_payload.start_slot;
-		}
+		payload->start_slot = req_payload.start_slot;
 		/* work out what is required to happen with this payload */
-		if (mgr->payloads[i].num_slots != req_payload.num_slots) {
+		if (payload->num_slots != req_payload.num_slots) {
 
 			/* need to push an update for this payload */
 			if (req_payload.num_slots) {
-				drm_dp_create_payload_step1(mgr, mgr->proposed_vcpis[i]->vcpi, &req_payload);
-				mgr->payloads[i].num_slots = req_payload.num_slots;
-				mgr->payloads[i].vcpi = req_payload.vcpi;
-			} else if (mgr->payloads[i].num_slots) {
-				mgr->payloads[i].num_slots = 0;
-				drm_dp_destroy_payload_step1(mgr, port, mgr->payloads[i].vcpi, &mgr->payloads[i]);
-				req_payload.payload_state = mgr->payloads[i].payload_state;
-				mgr->payloads[i].start_slot = 0;
+				drm_dp_create_payload_step1(mgr, vcpi->vcpi,
+							    &req_payload);
+				payload->num_slots = req_payload.num_slots;
+				payload->vcpi = req_payload.vcpi;
+
+			} else if (payload->num_slots) {
+				payload->num_slots = 0;
+				drm_dp_destroy_payload_step1(mgr, port,
+							     payload->vcpi,
+							     payload);
+				req_payload.payload_state =
+					payload->payload_state;
+				payload->start_slot = 0;
 			}
-			mgr->payloads[i].payload_state = req_payload.payload_state;
+			payload->payload_state = req_payload.payload_state;
 		}
 		cur_slots += req_payload.num_slots;
 
-		if (port)
-			drm_dp_put_port(port);
+		if (put_port)
+			drm_dp_mst_topology_put_port(port);
 	}
 
 	for (i = 0; i < mgr->max_payloads; i++) {
-		if (mgr->payloads[i].payload_state == DP_PAYLOAD_DELETE_LOCAL) {
-			DRM_DEBUG_KMS("removing payload %d\n", i);
-			for (j = i; j < mgr->max_payloads - 1; j++) {
-				memcpy(&mgr->payloads[j], &mgr->payloads[j + 1], sizeof(struct drm_dp_payload));
-				mgr->proposed_vcpis[j] = mgr->proposed_vcpis[j + 1];
-				if (mgr->proposed_vcpis[j] && mgr->proposed_vcpis[j]->num_slots) {
-					set_bit(j + 1, &mgr->payload_mask);
-				} else {
-					clear_bit(j + 1, &mgr->payload_mask);
-				}
-			}
-			memset(&mgr->payloads[mgr->max_payloads - 1], 0, sizeof(struct drm_dp_payload));
-			mgr->proposed_vcpis[mgr->max_payloads - 1] = NULL;
-			clear_bit(mgr->max_payloads, &mgr->payload_mask);
+		if (mgr->payloads[i].payload_state != DP_PAYLOAD_DELETE_LOCAL)
+			continue;
 
+		DRM_DEBUG_KMS("removing payload %d\n", i);
+		for (j = i; j < mgr->max_payloads - 1; j++) {
+			mgr->payloads[j] = mgr->payloads[j + 1];
+			mgr->proposed_vcpis[j] = mgr->proposed_vcpis[j + 1];
+
+			if (mgr->proposed_vcpis[j] &&
+			    mgr->proposed_vcpis[j]->num_slots) {
+				set_bit(j + 1, &mgr->payload_mask);
+			} else {
+				clear_bit(j + 1, &mgr->payload_mask);
+			}
 		}
+
+		memset(&mgr->payloads[mgr->max_payloads - 1], 0,
+		       sizeof(struct drm_dp_payload));
+		mgr->proposed_vcpis[mgr->max_payloads - 1] = NULL;
+		clear_bit(mgr->max_payloads, &mgr->payload_mask);
 	}
 	mutex_unlock(&mgr->payload_lock);
 
@@ -2012,7 +2466,7 @@ static int drm_dp_send_dpcd_write(struct drm_dp_mst_topology_mgr *mgr,
 	struct drm_dp_sideband_msg_tx *txmsg;
 	struct drm_dp_mst_branch *mstb;
 
-	mstb = drm_dp_get_validated_mstb_ref(mgr, port->parent);
+	mstb = drm_dp_mst_topology_get_mstb_validated(mgr, port->parent);
 	if (!mstb)
 		return -EINVAL;
 
@@ -2029,14 +2483,14 @@ static int drm_dp_send_dpcd_write(struct drm_dp_mst_topology_mgr *mgr,
 
 	ret = drm_dp_mst_wait_tx_reply(mstb, txmsg);
 	if (ret > 0) {
-		if (txmsg->reply.reply_type == 1) {
+		if (txmsg->reply.reply_type == DP_SIDEBAND_REPLY_NAK)
 			ret = -EINVAL;
-		} else
+		else
 			ret = 0;
 	}
 	kfree(txmsg);
 fail_put:
-	drm_dp_put_mst_branch_device(mstb);
+	drm_dp_mst_topology_put_mstb(mstb);
 	return ret;
 }
 
@@ -2044,7 +2498,7 @@ static int drm_dp_encode_up_ack_reply(struct drm_dp_sideband_msg_tx *msg, u8 req
 {
 	struct drm_dp_sideband_msg_reply_body reply;
 
-	reply.reply_type = 0;
+	reply.reply_type = DP_SIDEBAND_REPLY_ACK;
 	reply.req_type = req_type;
 	drm_dp_encode_sideband_reply(&reply, msg);
 	return 0;
@@ -2146,7 +2600,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
 
 		/* give this the main reference */
 		mgr->mst_primary = mstb;
-		kref_get(&mgr->mst_primary->kref);
+		drm_dp_mst_topology_get_mstb(mgr->mst_primary);
 
 		ret = drm_dp_dpcd_writeb(mgr->aux, DP_MSTM_CTRL,
 							 DP_MST_EN | DP_UP_REQ_EN | DP_UPSTREAM_IS_SRC);
@@ -2180,7 +2634,7 @@ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool ms
 out_unlock:
 	mutex_unlock(&mgr->lock);
 	if (mstb)
-		drm_dp_put_mst_branch_device(mstb);
+		drm_dp_mst_topology_put_mstb(mstb);
 	return ret;
 
 }
@@ -2345,18 +2799,23 @@ static int drm_dp_mst_handle_down_rep(struct drm_dp_mst_topology_mgr *mgr)
 			       mgr->down_rep_recv.initial_hdr.lct,
 				      mgr->down_rep_recv.initial_hdr.rad[0],
 				      mgr->down_rep_recv.msg[0]);
-			drm_dp_put_mst_branch_device(mstb);
+			drm_dp_mst_topology_put_mstb(mstb);
 			memset(&mgr->down_rep_recv, 0, sizeof(struct drm_dp_sideband_msg_rx));
 			return 0;
 		}
 
 		drm_dp_sideband_parse_reply(&mgr->down_rep_recv, &txmsg->reply);
-		if (txmsg->reply.reply_type == 1) {
-			DRM_DEBUG_KMS("Got NAK reply: req 0x%02x, reason 0x%02x, nak data 0x%02x\n", txmsg->reply.req_type, txmsg->reply.u.nak.reason, txmsg->reply.u.nak.nak_data);
-		}
+
+		if (txmsg->reply.reply_type == DP_SIDEBAND_REPLY_NAK)
+			DRM_DEBUG_KMS("Got NAK reply: req 0x%02x (%s), reason 0x%02x (%s), nak data 0x%02x\n",
+				      txmsg->reply.req_type,
+				      drm_dp_mst_req_type_str(txmsg->reply.req_type),
+				      txmsg->reply.u.nak.reason,
+				      drm_dp_mst_nak_reason_str(txmsg->reply.u.nak.reason),
+				      txmsg->reply.u.nak.nak_data);
 
 		memset(&mgr->down_rep_recv, 0, sizeof(struct drm_dp_sideband_msg_rx));
-		drm_dp_put_mst_branch_device(mstb);
+		drm_dp_mst_topology_put_mstb(mstb);
 
 		mutex_lock(&mgr->qlock);
 		txmsg->state = DRM_DP_SIDEBAND_TX_RX;
@@ -2412,7 +2871,7 @@ static int drm_dp_mst_handle_up_req(struct drm_dp_mst_topology_mgr *mgr)
 			drm_dp_update_port(mstb, &msg.u.conn_stat);
 
 			DRM_DEBUG_KMS("Got CSN: pn: %d ldps:%d ddps: %d mcs: %d ip: %d pdt: %d\n", msg.u.conn_stat.port_number, msg.u.conn_stat.legacy_device_plug_status, msg.u.conn_stat.displayport_device_plug_status, msg.u.conn_stat.message_capability_status, msg.u.conn_stat.input_port, msg.u.conn_stat.peer_device_type);
-			(*mgr->cbs->hotplug)(mgr);
+			drm_kms_helper_hotplug_event(mgr->dev);
 
 		} else if (msg.req_type == DP_RESOURCE_STATUS_NOTIFY) {
 			drm_dp_send_up_ack_reply(mgr, mgr->mst_primary, msg.req_type, seqno, false);
@@ -2429,7 +2888,7 @@ static int drm_dp_mst_handle_up_req(struct drm_dp_mst_topology_mgr *mgr)
 		}
 
 		if (mstb)
-			drm_dp_put_mst_branch_device(mstb);
+			drm_dp_mst_topology_put_mstb(mstb);
 
 		memset(&mgr->up_req_recv, 0, sizeof(struct drm_dp_sideband_msg_rx));
 	}
@@ -2488,8 +2947,8 @@ enum drm_connector_status drm_dp_mst_detect_port(struct drm_connector *connector
 {
 	enum drm_connector_status status = connector_status_disconnected;
 
-	/* we need to search for the port in the mgr in case its gone */
-	port = drm_dp_get_validated_port_ref(mgr, port);
+	/* we need to search for the port in the mgr in case it's gone */
+	port = drm_dp_mst_topology_get_port_validated(mgr, port);
 	if (!port)
 		return connector_status_disconnected;
 
@@ -2514,7 +2973,7 @@ enum drm_connector_status drm_dp_mst_detect_port(struct drm_connector *connector
 		break;
 	}
 out:
-	drm_dp_put_port(port);
+	drm_dp_mst_topology_put_port(port);
 	return status;
 }
 EXPORT_SYMBOL(drm_dp_mst_detect_port);
@@ -2531,11 +2990,11 @@ bool drm_dp_mst_port_has_audio(struct drm_dp_mst_topology_mgr *mgr,
 {
 	bool ret = false;
 
-	port = drm_dp_get_validated_port_ref(mgr, port);
+	port = drm_dp_mst_topology_get_port_validated(mgr, port);
 	if (!port)
 		return ret;
 	ret = port->has_audio;
-	drm_dp_put_port(port);
+	drm_dp_mst_topology_put_port(port);
 	return ret;
 }
 EXPORT_SYMBOL(drm_dp_mst_port_has_audio);
@@ -2554,8 +3013,8 @@ struct edid *drm_dp_mst_get_edid(struct drm_connector *connector, struct drm_dp_
 {
 	struct edid *edid = NULL;
 
-	/* we need to search for the port in the mgr in case its gone */
-	port = drm_dp_get_validated_port_ref(mgr, port);
+	/* we need to search for the port in the mgr in case it's gone */
+	port = drm_dp_mst_topology_get_port_validated(mgr, port);
 	if (!port)
 		return NULL;
 
@@ -2566,7 +3025,7 @@ struct edid *drm_dp_mst_get_edid(struct drm_connector *connector, struct drm_dp_
 		drm_connector_set_tile_property(connector);
 	}
 	port->has_audio = drm_detect_monitor_audio(edid);
-	drm_dp_put_port(port);
+	drm_dp_mst_topology_put_port(port);
 	return edid;
 }
 EXPORT_SYMBOL(drm_dp_mst_get_edid);
@@ -2617,43 +3076,90 @@ static int drm_dp_init_vcpi(struct drm_dp_mst_topology_mgr *mgr,
 }
 
 /**
- * drm_dp_atomic_find_vcpi_slots() - Find and add vcpi slots to the state
+ * drm_dp_atomic_find_vcpi_slots() - Find and add VCPI slots to the state
  * @state: global atomic state
  * @mgr: MST topology manager for the port
  * @port: port to find vcpi slots for
  * @pbn: bandwidth required for the mode in PBN
  *
- * RETURNS:
- * Total slots in the atomic state assigned for this port or error
+ * Allocates VCPI slots to @port, replacing any previous VCPI allocations it
+ * may have had. Any atomic drivers which support MST must call this function
+ * in their &drm_encoder_helper_funcs.atomic_check() callback to change the
+ * current VCPI allocation for the new state, but only when
+ * &drm_crtc_state.mode_changed or &drm_crtc_state.connectors_changed is set
+ * to ensure compatibility with userspace applications that still use the
+ * legacy modesetting UAPI.
+ *
+ * Allocations set by this function are not checked against the bandwidth
+ * constraints of @mgr until the driver calls drm_dp_mst_atomic_check().
+ *
+ * Additionally, it is OK to call this function multiple times on the same
+ * @port as needed. It is not OK, however, to call this function and
+ * drm_dp_atomic_release_vcpi_slots() in the same atomic check phase.
+ *
+ * See also:
+ * drm_dp_atomic_release_vcpi_slots()
+ * drm_dp_mst_atomic_check()
+ *
+ * Returns:
+ * Total slots in the atomic state assigned for this port, or a negative error
+ * code on failure
  */
 int drm_dp_atomic_find_vcpi_slots(struct drm_atomic_state *state,
 				  struct drm_dp_mst_topology_mgr *mgr,
 				  struct drm_dp_mst_port *port, int pbn)
 {
 	struct drm_dp_mst_topology_state *topology_state;
-	int req_slots;
+	struct drm_dp_vcpi_allocation *pos, *vcpi = NULL;
+	int prev_slots, req_slots;
 
 	topology_state = drm_atomic_get_mst_topology_state(state, mgr);
 	if (IS_ERR(topology_state))
 		return PTR_ERR(topology_state);
 
-	port = drm_dp_get_validated_port_ref(mgr, port);
-	if (port == NULL)
-		return -EINVAL;
-	req_slots = DIV_ROUND_UP(pbn, mgr->pbn_div);
-	DRM_DEBUG_KMS("vcpi slots req=%d, avail=%d\n",
-			req_slots, topology_state->avail_slots);
+	/* Find the current allocation for this port, if any */
+	list_for_each_entry(pos, &topology_state->vcpis, next) {
+		if (pos->port == port) {
+			vcpi = pos;
+			prev_slots = vcpi->vcpi;
+
+			/*
+			 * This should never happen, unless the driver tries
+			 * releasing and allocating the same VCPI allocation,
+			 * which is an error
+			 */
+			if (WARN_ON(!prev_slots)) {
+				DRM_ERROR("cannot allocate and release VCPI on [MST PORT:%p] in the same state\n",
+					  port);
+				return -EINVAL;
+			}
 
-	if (req_slots > topology_state->avail_slots) {
-		drm_dp_put_port(port);
-		return -ENOSPC;
+			break;
+		}
 	}
+	if (!vcpi)
+		prev_slots = 0;
 
-	topology_state->avail_slots -= req_slots;
-	DRM_DEBUG_KMS("vcpi slots avail=%d", topology_state->avail_slots);
+	req_slots = DIV_ROUND_UP(pbn, mgr->pbn_div);
+
+	DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] [MST PORT:%p] VCPI %d -> %d\n",
+			 port->connector->base.id, port->connector->name,
+			 port, prev_slots, req_slots);
+
+	/* Add the new allocation to the state */
+	if (!vcpi) {
+		vcpi = kzalloc(sizeof(*vcpi), GFP_KERNEL);
+		if (!vcpi)
+			return -ENOMEM;
+
+		drm_dp_mst_get_port_malloc(port);
+		vcpi->port = port;
+		list_add(&vcpi->next, &topology_state->vcpis);
+	}
+	vcpi->vcpi = req_slots;
 
-	drm_dp_put_port(port);
-	return req_slots;
+	return req_slots;
 }
 EXPORT_SYMBOL(drm_dp_atomic_find_vcpi_slots);
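/*
 * Hedged usage sketch for the kerneldoc above, as it might run from a
 * driver's &drm_encoder_helper_funcs.atomic_check() hook. The mgr/port
 * parameters and the 24 bpp assumption are illustrative, not part of this
 * patch.
 */
static int example_mst_encoder_atomic_check(struct drm_crtc_state *crtc_state,
					    struct drm_dp_mst_topology_mgr *mgr,
					    struct drm_dp_mst_port *port)
{
	int pbn, slots;

	/* Only touch VCPI on modeset-style changes, per the legacy UAPI rule */
	if (!crtc_state->mode_changed && !crtc_state->connectors_changed)
		return 0;

	pbn = drm_dp_calc_pbn_mode(crtc_state->adjusted_mode.clock, 24);
	slots = drm_dp_atomic_find_vcpi_slots(crtc_state->state, mgr, port, pbn);
	if (slots < 0)
		return slots;

	return 0;
}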
 
@@ -2661,31 +3167,57 @@ EXPORT_SYMBOL(drm_dp_atomic_find_vcpi_slots);
  * drm_dp_atomic_release_vcpi_slots() - Release allocated vcpi slots
  * @state: global atomic state
  * @mgr: MST topology manager for the port
- * @slots: number of vcpi slots to release
+ * @port: The port to release the VCPI slots from
  *
- * RETURNS:
- * 0 if @slots were added back to &drm_dp_mst_topology_state->avail_slots or
- * negative error code
+ * Releases any VCPI slots that have been allocated to a port in the atomic
+ * state. Any atomic drivers which support MST must call this function in
+ * their &drm_connector_helper_funcs.atomic_check() callback when the
+ * connector will no longer have VCPI allocated (e.g. because its CRTC was
+ * removed) but did have VCPI allocated in the previous atomic state.
+ *
+ * It is OK to call this even if @port has been removed from the system.
+ * Additionally, it is OK to call this function multiple times on the same
+ * @port as needed. It is not OK, however, to call this function and
+ * drm_dp_atomic_find_vcpi_slots() on the same @port in a single atomic check
+ * phase.
+ *
+ * See also:
+ * drm_dp_atomic_find_vcpi_slots()
+ * drm_dp_mst_atomic_check()
+ *
+ * Returns:
+ * 0 if all slots for this port were added back to
+ * &drm_dp_mst_topology_state.avail_slots or negative error code
  */
 int drm_dp_atomic_release_vcpi_slots(struct drm_atomic_state *state,
 				     struct drm_dp_mst_topology_mgr *mgr,
-				     int slots)
+				     struct drm_dp_mst_port *port)
 {
 	struct drm_dp_mst_topology_state *topology_state;
+	struct drm_dp_vcpi_allocation *pos;
+	bool found = false;
 
 	topology_state = drm_atomic_get_mst_topology_state(state, mgr);
 	if (IS_ERR(topology_state))
 		return PTR_ERR(topology_state);
 
-	/* We cannot rely on port->vcpi.num_slots to update
-	 * topology_state->avail_slots as the port may not exist if the parent
-	 * branch device was unplugged. This should be fixed by tracking
-	 * per-port slot allocation in drm_dp_mst_topology_state instead of
-	 * depending on the caller to tell us how many slots to release.
-	 */
-	topology_state->avail_slots += slots;
-	DRM_DEBUG_KMS("vcpi slots released=%d, avail=%d\n",
-			slots, topology_state->avail_slots);
+	list_for_each_entry(pos, &topology_state->vcpis, next) {
+		if (pos->port == port) {
+			found = true;
+			break;
+		}
+	}
+	if (WARN_ON(!found)) {
+		DRM_ERROR("no VCPI for [MST PORT:%p] found in mst state %p\n",
+			  port, &topology_state->base);
+		return -EINVAL;
+	}
+
+	DRM_DEBUG_ATOMIC("[MST PORT:%p] VCPI %d -> 0\n", port, pos->vcpi);
+	if (pos->vcpi) {
+		drm_dp_mst_put_port_malloc(port);
+		pos->vcpi = 0;
+	}
 
 	return 0;
 }
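/*
 * Matching release-side sketch; a hypothetical helper a driver might call
 * from &drm_connector_helper_funcs.atomic_check() with its own mgr/port
 * pointers, not part of this patch.
 */
static int example_mst_connector_atomic_check(struct drm_connector *connector,
					      struct drm_connector_state *new_state,
					      struct drm_dp_mst_topology_mgr *mgr,
					      struct drm_dp_mst_port *port)
{
	struct drm_atomic_state *state = new_state->state;
	struct drm_connector_state *old_state =
		drm_atomic_get_old_connector_state(state, connector);

	/* The CRTC went away, so give the VCPI slots back */
	if (old_state->crtc && !new_state->crtc)
		return drm_dp_atomic_release_vcpi_slots(state, mgr, port);

	return 0;
}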
@@ -2703,7 +3235,7 @@ bool drm_dp_mst_allocate_vcpi(struct drm_dp_mst_topology_mgr *mgr,
 {
 	int ret;
 
-	port = drm_dp_get_validated_port_ref(mgr, port);
+	port = drm_dp_mst_topology_get_port_validated(mgr, port);
 	if (!port)
 		return false;
 
@@ -2711,9 +3243,10 @@ bool drm_dp_mst_allocate_vcpi(struct drm_dp_mst_topology_mgr *mgr,
 		return false;
 
 	if (port->vcpi.vcpi > 0) {
-		DRM_DEBUG_KMS("payload: vcpi %d already allocated for pbn %d - requested pbn %d\n", port->vcpi.vcpi, port->vcpi.pbn, pbn);
+		DRM_DEBUG_KMS("payload: vcpi %d already allocated for pbn %d - requested pbn %d\n",
+			      port->vcpi.vcpi, port->vcpi.pbn, pbn);
 		if (pbn == port->vcpi.pbn) {
-			drm_dp_put_port(port);
+			drm_dp_mst_topology_put_port(port);
 			return true;
 		}
 	}
@@ -2721,13 +3254,15 @@ bool drm_dp_mst_allocate_vcpi(struct drm_dp_mst_topology_mgr *mgr,
 	ret = drm_dp_init_vcpi(mgr, &port->vcpi, pbn, slots);
 	if (ret) {
 		DRM_DEBUG_KMS("failed to init vcpi slots=%d max=63 ret=%d\n",
-				DIV_ROUND_UP(pbn, mgr->pbn_div), ret);
+			      DIV_ROUND_UP(pbn, mgr->pbn_div), ret);
 		goto out;
 	}
 	DRM_DEBUG_KMS("initing vcpi for pbn=%d slots=%d\n",
-			pbn, port->vcpi.num_slots);
+		      pbn, port->vcpi.num_slots);
 
-	drm_dp_put_port(port);
+	/* Keep port allocated until its payload has been removed */
+	drm_dp_mst_get_port_malloc(port);
+	drm_dp_mst_topology_put_port(port);
 	return true;
 out:
 	return false;
@@ -2737,12 +3272,12 @@ EXPORT_SYMBOL(drm_dp_mst_allocate_vcpi);
 int drm_dp_mst_get_vcpi_slots(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_port *port)
 {
 	int slots = 0;
-	port = drm_dp_get_validated_port_ref(mgr, port);
+	port = drm_dp_mst_topology_get_port_validated(mgr, port);
 	if (!port)
 		return slots;
 
 	slots = port->vcpi.num_slots;
-	drm_dp_put_port(port);
+	drm_dp_mst_topology_put_port(port);
 	return slots;
 }
 EXPORT_SYMBOL(drm_dp_mst_get_vcpi_slots);
@@ -2756,23 +3291,27 @@ EXPORT_SYMBOL(drm_dp_mst_get_vcpi_slots);
  */
 void drm_dp_mst_reset_vcpi_slots(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_port *port)
 {
-	port = drm_dp_get_validated_port_ref(mgr, port);
-	if (!port)
-		return;
+	/*
+	 * A port with a VCPI will remain allocated until its VCPI is
+	 * released; no validated ref is needed
+	 */
+
 	port->vcpi.num_slots = 0;
-	drm_dp_put_port(port);
 }
 EXPORT_SYMBOL(drm_dp_mst_reset_vcpi_slots);
 
 /**
  * drm_dp_mst_deallocate_vcpi() - deallocate a VCPI
  * @mgr: manager for this port
- * @port: unverified port to deallocate vcpi for
+ * @port: port to deallocate vcpi for
+ *
+ * This can be called unconditionally, regardless of whether
+ * drm_dp_mst_allocate_vcpi() succeeded or not.
  */
-void drm_dp_mst_deallocate_vcpi(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_port *port)
+void drm_dp_mst_deallocate_vcpi(struct drm_dp_mst_topology_mgr *mgr,
+				struct drm_dp_mst_port *port)
 {
-	port = drm_dp_get_validated_port_ref(mgr, port);
-	if (!port)
+	if (!port->vcpi.vcpi)
 		return;
 
 	drm_dp_mst_put_payload_id(mgr, port->vcpi.vcpi);
@@ -2780,7 +3319,7 @@ void drm_dp_mst_deallocate_vcpi(struct drm_dp_mst_topology_mgr *mgr, struct drm_
 	port->vcpi.pbn = 0;
 	port->vcpi.aligned_pbn = 0;
 	port->vcpi.vcpi = 0;
-	drm_dp_put_port(port);
+	drm_dp_mst_put_port_malloc(port);
 }
 EXPORT_SYMBOL(drm_dp_mst_deallocate_vcpi);
 
@@ -3064,13 +3603,6 @@ static void drm_dp_tx_work(struct work_struct *work)
 	mutex_unlock(&mgr->qlock);
 }
 
-static void drm_dp_free_mst_port(struct kref *kref)
-{
-	struct drm_dp_mst_port *port = container_of(kref, struct drm_dp_mst_port, kref);
-	kref_put(&port->parent->kref, drm_dp_free_mst_branch_device);
-	kfree(port);
-}
-
 static void drm_dp_destroy_connector_work(struct work_struct *work)
 {
 	struct drm_dp_mst_topology_mgr *mgr = container_of(work, struct drm_dp_mst_topology_mgr, destroy_connector_work);
@@ -3091,7 +3623,6 @@ static void drm_dp_destroy_connector_work(struct work_struct *work)
 		list_del(&port->next);
 		mutex_unlock(&mgr->destroy_connector_lock);
 
-		kref_init(&port->kref);
 		INIT_LIST_HEAD(&port->next);
 
 		mgr->cbs->destroy_connector(mgr, port->connector);
@@ -3099,31 +3630,51 @@ static void drm_dp_destroy_connector_work(struct work_struct *work)
 		drm_dp_port_teardown_pdt(port, port->pdt);
 		port->pdt = DP_PEER_DEVICE_NONE;
 
-		if (!port->input && port->vcpi.vcpi > 0) {
-			drm_dp_mst_reset_vcpi_slots(mgr, port);
-			drm_dp_update_payload_part1(mgr);
-			drm_dp_mst_put_payload_id(mgr, port->vcpi.vcpi);
-		}
-
-		kref_put(&port->kref, drm_dp_free_mst_port);
+		drm_dp_mst_put_port_malloc(port);
 		send_hotplug = true;
 	}
 	if (send_hotplug)
-		(*mgr->cbs->hotplug)(mgr);
+		drm_kms_helper_hotplug_event(mgr->dev);
 }
 
 static struct drm_private_state *
 drm_dp_mst_duplicate_state(struct drm_private_obj *obj)
 {
-	struct drm_dp_mst_topology_state *state;
+	struct drm_dp_mst_topology_state *state, *old_state =
+		to_dp_mst_topology_state(obj->state);
+	struct drm_dp_vcpi_allocation *pos, *vcpi;
 
-	state = kmemdup(obj->state, sizeof(*state), GFP_KERNEL);
+	state = kmemdup(old_state, sizeof(*state), GFP_KERNEL);
 	if (!state)
 		return NULL;
 
 	__drm_atomic_helper_private_obj_duplicate_state(obj, &state->base);
 
+	INIT_LIST_HEAD(&state->vcpis);
+
+	list_for_each_entry(pos, &old_state->vcpis, next) {
+		/* Prune leftover freed VCPI allocations */
+		if (!pos->vcpi)
+			continue;
+
+		vcpi = kmemdup(pos, sizeof(*vcpi), GFP_KERNEL);
+		if (!vcpi)
+			goto fail;
+
+		drm_dp_mst_get_port_malloc(vcpi->port);
+		list_add(&vcpi->next, &state->vcpis);
+	}
+
 	return &state->base;
+
+fail:
+	list_for_each_entry_safe(pos, vcpi, &state->vcpis, next) {
+		drm_dp_mst_put_port_malloc(pos->port);
+		kfree(pos);
+	}
+	kfree(state);
+
+	return NULL;
 }
 
 static void drm_dp_mst_destroy_state(struct drm_private_obj *obj,
@@ -3131,14 +3682,99 @@ static void drm_dp_mst_destroy_state(struct drm_private_obj *obj,
 {
 	struct drm_dp_mst_topology_state *mst_state =
 		to_dp_mst_topology_state(state);
+	struct drm_dp_vcpi_allocation *pos, *tmp;
+
+	list_for_each_entry_safe(pos, tmp, &mst_state->vcpis, next) {
+		/* We only keep references to ports with non-zero VCPIs */
+		if (pos->vcpi)
+			drm_dp_mst_put_port_malloc(pos->port);
+		kfree(pos);
+	}
 
 	kfree(mst_state);
 }
 
-static const struct drm_private_state_funcs mst_state_funcs = {
+static inline int
+drm_dp_mst_atomic_check_topology_state(struct drm_dp_mst_topology_mgr *mgr,
+				       struct drm_dp_mst_topology_state *mst_state)
+{
+	struct drm_dp_vcpi_allocation *vcpi;
+	int avail_slots = 63, payload_count = 0;
+
+	list_for_each_entry(vcpi, &mst_state->vcpis, next) {
+		/* Releasing VCPI is always OK, even if the port is gone */
+		if (!vcpi->vcpi) {
+			DRM_DEBUG_ATOMIC("[MST PORT:%p] releases all VCPI slots\n",
+					 vcpi->port);
+			continue;
+		}
+
+		DRM_DEBUG_ATOMIC("[MST PORT:%p] requires %d vcpi slots\n",
+				 vcpi->port, vcpi->vcpi);
+
+		avail_slots -= vcpi->vcpi;
+		if (avail_slots < 0) {
+			DRM_DEBUG_ATOMIC("[MST PORT:%p] not enough VCPI slots in mst state %p (avail=%d)\n",
+					 vcpi->port, mst_state,
+					 avail_slots + vcpi->vcpi);
+			return -ENOSPC;
+		}
+
+		if (++payload_count > mgr->max_payloads) {
+			DRM_DEBUG_ATOMIC("[MST MGR:%p] state %p has too many payloads (max=%d)\n",
+					 mgr, mst_state, mgr->max_payloads);
+			return -EINVAL;
+		}
+	}
+	DRM_DEBUG_ATOMIC("[MST MGR:%p] mst state %p VCPI avail=%d used=%d\n",
+			 mgr, mst_state, avail_slots,
+			 63 - avail_slots);
+
+	return 0;
+}
+
+/**
+ * drm_dp_mst_atomic_check - Check that the new state of an MST topology in an
+ * atomic update is valid
+ * @state: Pointer to the new &struct drm_dp_mst_topology_state
+ *
+ * Checks the given topology state for an atomic update to ensure that it's
+ * valid. This includes checking whether there's enough bandwidth to support
+ * the new VCPI allocations in the atomic update.
+ *
+ * Any atomic drivers supporting DP MST must make sure to call this after
+ * checking the rest of their state in their
+ * &drm_mode_config_funcs.atomic_check() callback.
+ *
+ * See also:
+ * drm_dp_atomic_find_vcpi_slots()
+ * drm_dp_atomic_release_vcpi_slots()
+ *
+ * Returns:
+ *
+ * 0 if the new state is valid, negative error code otherwise.
+ */
+int drm_dp_mst_atomic_check(struct drm_atomic_state *state)
+{
+	struct drm_dp_mst_topology_mgr *mgr;
+	struct drm_dp_mst_topology_state *mst_state;
+	int i, ret = 0;
+
+	for_each_new_mst_mgr_in_state(state, mgr, mst_state, i) {
+		ret = drm_dp_mst_atomic_check_topology_state(mgr, mst_state);
+		if (ret)
+			break;
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL(drm_dp_mst_atomic_check);
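/*
 * Sketch of the driver-wide hookup described above; example_atomic_check()
 * is a hypothetical &drm_mode_config_funcs.atomic_check() implementation
 * that runs the core checks first and the MST bandwidth check last.
 */
static int example_atomic_check(struct drm_device *dev,
				struct drm_atomic_state *state)
{
	int ret;

	ret = drm_atomic_helper_check(dev, state);
	if (ret)
		return ret;

	return drm_dp_mst_atomic_check(state);
}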
+
+const struct drm_private_state_funcs drm_dp_mst_topology_state_funcs = {
 	.atomic_duplicate_state = drm_dp_mst_duplicate_state,
 	.atomic_destroy_state = drm_dp_mst_destroy_state,
 };
+EXPORT_SYMBOL(drm_dp_mst_topology_state_funcs);
 
 /**
  * drm_atomic_get_mst_topology_state: get MST topology state
@@ -3216,13 +3852,11 @@ int drm_dp_mst_topology_mgr_init(struct drm_dp_mst_topology_mgr *mgr,
 		return -ENOMEM;
 
 	mst_state->mgr = mgr;
+	INIT_LIST_HEAD(&mst_state->vcpis);
 
-	/* max. time slots - one slot for MTP header */
-	mst_state->avail_slots = 63;
-
-	drm_atomic_private_obj_init(&mgr->base,
+	drm_atomic_private_obj_init(dev, &mgr->base,
 				    &mst_state->base,
-				    &mst_state_funcs);
+				    &drm_dp_mst_topology_state_funcs);
 
 	return 0;
 }
@@ -3234,6 +3868,7 @@ EXPORT_SYMBOL(drm_dp_mst_topology_mgr_init);
  */
 void drm_dp_mst_topology_mgr_destroy(struct drm_dp_mst_topology_mgr *mgr)
 {
+	drm_dp_mst_topology_mgr_set_mst(mgr, false);
 	flush_work(&mgr->work);
 	flush_work(&mgr->destroy_connector_work);
 	mutex_lock(&mgr->payload_lock);
@@ -3249,6 +3884,23 @@ void drm_dp_mst_topology_mgr_destroy(struct drm_dp_mst_topology_mgr *mgr)
 }
 EXPORT_SYMBOL(drm_dp_mst_topology_mgr_destroy);
 
+static bool remote_i2c_read_ok(const struct i2c_msg msgs[], int num)
+{
+	int i;
+
+	if (num - 1 > DP_REMOTE_I2C_READ_MAX_TRANSACTIONS)
+		return false;
+
+	for (i = 0; i < num - 1; i++) {
+		if (msgs[i].flags & I2C_M_RD ||
+		    msgs[i].len > 0xff)
+			return false;
+	}
+
+	return msgs[num - 1].flags & I2C_M_RD &&
+		msgs[num - 1].len <= 0xff;
+}
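/*
 * Shape of a transfer that passes remote_i2c_read_ok(): up to
 * DP_REMOTE_I2C_READ_MAX_TRANSACTIONS write messages followed by exactly
 * one read, each no longer than 0xff bytes. This EDID-style read is only
 * an illustration, not part of this patch.
 */
static int example_remote_edid_read(struct i2c_adapter *adapter, u8 *buf)
{
	u8 offset = 0;
	struct i2c_msg msgs[] = {
		{ .addr = DDC_ADDR, .flags = 0, .len = 1, .buf = &offset },
		{ .addr = DDC_ADDR, .flags = I2C_M_RD,
		  .len = EDID_LENGTH, .buf = buf },
	};

	return i2c_transfer(adapter, msgs, ARRAY_SIZE(msgs));
}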
+
 /* I2C device */
 static int drm_dp_mst_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs,
 			       int num)
@@ -3258,21 +3910,15 @@ static int drm_dp_mst_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs
 	struct drm_dp_mst_branch *mstb;
 	struct drm_dp_mst_topology_mgr *mgr = port->mgr;
 	unsigned int i;
-	bool reading = false;
 	struct drm_dp_sideband_msg_req_body msg;
 	struct drm_dp_sideband_msg_tx *txmsg = NULL;
 	int ret;
 
-	mstb = drm_dp_get_validated_mstb_ref(mgr, port->parent);
+	mstb = drm_dp_mst_topology_get_mstb_validated(mgr, port->parent);
 	if (!mstb)
 		return -EREMOTEIO;
 
-	/* construct i2c msg */
-	/* see if last msg is a read */
-	if (msgs[num - 1].flags & I2C_M_RD)
-		reading = true;
-
-	if (!reading || (num - 1 > DP_REMOTE_I2C_READ_MAX_TRANSACTIONS)) {
+	if (!remote_i2c_read_ok(msgs, num)) {
 		DRM_DEBUG_KMS("Unsupported I2C transaction for MST device\n");
 		ret = -EIO;
 		goto out;
@@ -3286,6 +3932,7 @@ static int drm_dp_mst_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs
 		msg.u.i2c_read.transactions[i].i2c_dev_id = msgs[i].addr;
 		msg.u.i2c_read.transactions[i].num_bytes = msgs[i].len;
 		msg.u.i2c_read.transactions[i].bytes = msgs[i].buf;
+		msg.u.i2c_read.transactions[i].no_stop_bit = !(msgs[i].flags & I2C_M_STOP);
 	}
 	msg.u.i2c_read.read_i2c_device_id = msgs[num - 1].addr;
 	msg.u.i2c_read.num_bytes_read = msgs[num - 1].len;
@@ -3304,7 +3951,7 @@ static int drm_dp_mst_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs
 	ret = drm_dp_mst_wait_tx_reply(mstb, txmsg);
 	if (ret > 0) {
 
-		if (txmsg->reply.reply_type == 1) { /* got a NAK back */
+		if (txmsg->reply.reply_type == DP_SIDEBAND_REPLY_NAK) {
 			ret = -EREMOTEIO;
 			goto out;
 		}
@@ -3317,7 +3964,7 @@ static int drm_dp_mst_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs
 	}
 out:
 	kfree(txmsg);
-	drm_dp_put_mst_branch_device(mstb);
+	drm_dp_mst_topology_put_mstb(mstb);
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 12e5e2be7890..381581b01d48 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -41,7 +41,6 @@
 #include "drm_crtc_internal.h"
 #include "drm_legacy.h"
 #include "drm_internal.h"
-#include "drm_crtc_internal.h"
 
 /*
  * drm_debug: Enable debug output.
@@ -265,14 +264,13 @@ void drm_minor_release(struct drm_minor *minor)
  * DOC: driver instance overview
  *
  * A device instance for a drm driver is represented by &struct drm_device. This
- * is allocated with drm_dev_alloc(), usually from bus-specific ->probe()
+ * is initialized with drm_dev_init(), usually from bus-specific ->probe()
  * callbacks implemented by the driver. The driver then needs to initialize all
  * the various subsystems for the drm device like memory management, vblank
 * handling, modesetting support and initial output configuration plus obviously
- * initialize all the corresponding hardware bits. An important part of this is
- * also calling drm_dev_set_unique() to set the userspace-visible unique name of
- * this device instance. Finally when everything is up and running and ready for
- * userspace the device instance can be published using drm_dev_register().
+ * initialize all the corresponding hardware bits. Finally when everything is up
+ * and running and ready for userspace the device instance can be published
+ * using drm_dev_register().
  *
 * There is also deprecated support for initializing device instances using
  * bus-specific helpers and the &drm_driver.load callback. But due to
@@ -288,9 +286,6 @@ void drm_minor_release(struct drm_minor *minor)
 * Note that the lifetime rules for a &drm_device instance still have a lot of
  * historical baggage. Hence use the reference counting provided by
  * drm_dev_get() and drm_dev_put() only carefully.
- *
- * It is recommended that drivers embed &struct drm_device into their own device
- * structure, which is supported through drm_dev_init().
  */
 
 /**
@@ -476,6 +471,9 @@ static void drm_fs_inode_free(struct inode *inode)
  * The initial ref-count of the object is 1. Use drm_dev_get() and
  * drm_dev_put() to take and drop further ref-counts.
  *
+ * It is recommended that drivers embed &struct drm_device into their own device
+ * structure.
+ *
  * Drivers that do not want to allocate their own device struct
  * embedding &struct drm_device can call drm_dev_alloc() instead. For drivers
  * that do embed &struct drm_device it must be placed first in the overall
@@ -766,7 +764,7 @@ static void remove_compat_control_link(struct drm_device *dev)
  * @flags: Flags passed to the driver's .load() function
  *
  * Register the DRM device @dev with the system, advertise device to user-space
- * and start normal device operation. @dev must be allocated via drm_dev_alloc()
+ * and start normal device operation. @dev must be initialized via drm_dev_init()
  * previously.
  *
  * Never call this twice on any device!
@@ -878,9 +876,9 @@ EXPORT_SYMBOL(drm_dev_unregister);
  * @dev: device of which to set the unique name
  * @name: unique name
  *
- * Sets the unique name of a DRM device using the specified string. Drivers
- * can use this at driver probe time if the unique name of the devices they
- * drive is static.
+ * Sets the unique name of a DRM device using the specified string. This is
+ * already done by drm_dev_init(); drivers should only override the default
+ * unique name for backwards-compatibility reasons.
  *
  * Return: 0 on success or a negative error code on failure.
  */
diff --git a/drivers/gpu/drm/drm_dsc.c b/drivers/gpu/drm/drm_dsc.c
index bc2b23adb072..bce99f95c1a3 100644
--- a/drivers/gpu/drm/drm_dsc.c
+++ b/drivers/gpu/drm/drm_dsc.c
@@ -17,6 +17,12 @@
 /**
  * DOC: dsc helpers
  *
+ * The VESA specification for DP 1.4 adds a new feature called Display Stream
+ * Compression (DSC), used to compress pixel data before sending it over a
+ * DP/eDP/MIPI DSI interface. Enabling DSC lets existing display interfaces
+ * support high resolutions at higher frame rates using the maximum available
+ * link capacity of these interfaces.
+ *
  * These functions contain some common logic and helpers to deal with VESA
  * Display Stream Compression standard required for DSC on Display Port/eDP or
  * MIPI display interfaces.
@@ -26,6 +32,13 @@
  * drm_dsc_dp_pps_header_init() - Initializes the PPS Header
  * for DisplayPort as per the DP 1.4 spec.
  * @pps_sdp: Secondary data packet for DSC Picture Parameter Set
+ *           as defined in &struct drm_dsc_pps_infoframe
+ *
+ * The DP 1.4 spec defines the secondary data packet used to send the
+ * picture parameter infoframes from the source to the sink.
+ * This function populates the PPS header defined in
+ * &struct drm_dsc_pps_infoframe as per the header bytes defined
+ * in &struct dp_sdp_header.
  */
 void drm_dsc_dp_pps_header_init(struct drm_dsc_pps_infoframe *pps_sdp)
 {
@@ -38,15 +51,20 @@ EXPORT_SYMBOL(drm_dsc_dp_pps_header_init);
 
 /**
  * drm_dsc_pps_infoframe_pack() - Populates the DSC PPS infoframe
- * using the DSC configuration parameters in the order expected
- * by the DSC Display Sink device. For the DSC, the sink device
- * expects the PPS payload in the big endian format for the fields
- * that span more than 1 byte.
  *
  * @pps_sdp:
- * Secondary data packet for DSC Picture Parameter Set
+ * Secondary data packet for DSC Picture Parameter Set. This is defined
+ * by &struct drm_dsc_pps_infoframe
  * @dsc_cfg:
- * DSC Configuration data filled by driver
+ * DSC Configuration data filled by driver as defined by
+ * &struct drm_dsc_config
+ *
+ * The DSC source device sends a secondary data packet filled with all the
+ * picture parameter set (PPS) information required by the sink to decode
+ * the compressed frame. The driver populates the DSC PPS infoframe using the
+ * DSC configuration parameters in the order expected by the DSC display sink
+ * device. The sink device expects the PPS payload in big-endian format for
+ * the fields that span more than 1 byte.
  */
 void drm_dsc_pps_infoframe_pack(struct drm_dsc_pps_infoframe *pps_sdp,
 				const struct drm_dsc_config *dsc_cfg)
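/*
 * Hedged sketch of the intended call sequence; example_send_pps() is
 * hypothetical and assumes the driver has already filled out dsc_cfg.
 */
static void example_send_pps(struct drm_dsc_pps_infoframe *pps_sdp,
			     const struct drm_dsc_config *dsc_cfg)
{
	drm_dsc_dp_pps_header_init(pps_sdp);
	drm_dsc_pps_infoframe_pack(pps_sdp, dsc_cfg);
	/* ... driver-specific code then transmits pps_sdp as an SDP ... */
}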
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index b506e3622b08..990b1909f9d7 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -3641,6 +3641,20 @@ static bool cea_db_is_hdmi_forum_vsdb(const u8 *db)
 	return oui == HDMI_FORUM_IEEE_OUI;
 }
 
+static bool cea_db_is_vcdb(const u8 *db)
+{
+	if (cea_db_tag(db) != USE_EXTENDED_TAG)
+		return false;
+
+	if (cea_db_payload_len(db) != 2)
+		return false;
+
+	if (cea_db_extended_tag(db) != EXT_VIDEO_CAPABILITY_BLOCK)
+		return false;
+
+	return true;
+}
+
 static bool cea_db_is_y420cmdb(const u8 *db)
 {
 	if (cea_db_tag(db) != USE_EXTENDED_TAG)
@@ -4223,41 +4237,6 @@ end:
 }
 EXPORT_SYMBOL(drm_detect_monitor_audio);
 
-/**
- * drm_rgb_quant_range_selectable - is RGB quantization range selectable?
- * @edid: EDID block to scan
- *
- * Check whether the monitor reports the RGB quantization range selection
- * as supported. The AVI infoframe can then be used to inform the monitor
- * which quantization range (full or limited) is used.
- *
- * Return: True if the RGB quantization range is selectable, false otherwise.
- */
-bool drm_rgb_quant_range_selectable(struct edid *edid)
-{
-	u8 *edid_ext;
-	int i, start, end;
-
-	edid_ext = drm_find_cea_extension(edid);
-	if (!edid_ext)
-		return false;
-
-	if (cea_db_offsets(edid_ext, &start, &end))
-		return false;
-
-	for_each_cea_db(edid_ext, i, start, end) {
-		if (cea_db_tag(&edid_ext[i]) == USE_EXTENDED_TAG &&
-		    cea_db_payload_len(&edid_ext[i]) == 2 &&
-		    cea_db_extended_tag(&edid_ext[i]) ==
-			EXT_VIDEO_CAPABILITY_BLOCK) {
-			DRM_DEBUG_KMS("CEA VCDB 0x%02x\n", edid_ext[i + 2]);
-			return edid_ext[i + 2] & EDID_CEA_VCDB_QS;
-		}
-	}
-
-	return false;
-}
-EXPORT_SYMBOL(drm_rgb_quant_range_selectable);
 
 /**
  * drm_default_rgb_quant_range - default RGB quantization range
@@ -4278,6 +4257,16 @@ drm_default_rgb_quant_range(const struct drm_display_mode *mode)
 }
 EXPORT_SYMBOL(drm_default_rgb_quant_range);
 
+static void drm_parse_vcdb(struct drm_connector *connector, const u8 *db)
+{
+	struct drm_display_info *info = &connector->display_info;
+
+	DRM_DEBUG_KMS("CEA VCDB 0x%02x\n", db[2]);
+
+	if (db[2] & EDID_CEA_VCDB_QS)
+		info->rgb_quant_range_selectable = true;
+}
+
 static void drm_parse_ycbcr420_deep_color_info(struct drm_connector *connector,
 					       const u8 *db)
 {
@@ -4452,6 +4441,8 @@ static void drm_parse_cea_ext(struct drm_connector *connector,
 			drm_parse_hdmi_forum_vsdb(connector, db);
 		if (cea_db_is_y420cmdb(db))
 			drm_parse_y420cmdb_bitmap(connector, db);
+		if (cea_db_is_vcdb(db))
+			drm_parse_vcdb(connector, db);
 	}
 }
 
@@ -4472,6 +4463,7 @@ drm_reset_display_info(struct drm_connector *connector)
 	info->max_tmds_clock = 0;
 	info->dvi_dual = false;
 	info->has_hdmi_infoframe = false;
+	info->rgb_quant_range_selectable = false;
 	memset(&info->hdmi, 0, sizeof(info->hdmi));
 
 	info->non_desktop = 0;
@@ -4830,19 +4822,32 @@ void drm_set_preferred_mode(struct drm_connector *connector,
 }
 EXPORT_SYMBOL(drm_set_preferred_mode);
 
+static bool is_hdmi2_sink(struct drm_connector *connector)
+{
+	/*
+	 * FIXME: sil-sii8620 doesn't have a connector around when
+	 * we need one, so we have to be prepared for a NULL connector.
+	 */
+	if (!connector)
+		return true;
+
+	return connector->display_info.hdmi.scdc.supported ||
+		connector->display_info.color_formats & DRM_COLOR_FORMAT_YCRCB420;
+}
+
 /**
  * drm_hdmi_avi_infoframe_from_display_mode() - fill an HDMI AVI infoframe with
  *                                              data from a DRM display mode
  * @frame: HDMI AVI infoframe
+ * @connector: the connector
  * @mode: DRM display mode
- * @is_hdmi2_sink: Sink is HDMI 2.0 compliant
  *
  * Return: 0 on success or a negative error code on failure.
  */
 int
 drm_hdmi_avi_infoframe_from_display_mode(struct hdmi_avi_infoframe *frame,
-					 const struct drm_display_mode *mode,
-					 bool is_hdmi2_sink)
+					 struct drm_connector *connector,
+					 const struct drm_display_mode *mode)
 {
 	enum hdmi_picture_aspect picture_aspect;
 	int err;
@@ -4864,7 +4869,7 @@ drm_hdmi_avi_infoframe_from_display_mode(struct hdmi_avi_infoframe *frame,
 	 * HDMI 2.0 VIC range: 1 <= VIC <= 107 (CEA-861-F). So we
 	 * have to make sure we don't break HDMI 1.4 sinks.
 	 */
-	if (!is_hdmi2_sink && frame->video_code > 64)
+	if (!is_hdmi2_sink(connector) && frame->video_code > 64)
 		frame->video_code = 0;
 
 	/*
@@ -4923,22 +4928,18 @@ EXPORT_SYMBOL(drm_hdmi_avi_infoframe_from_display_mode);
  * drm_hdmi_avi_infoframe_quant_range() - fill the HDMI AVI infoframe
  *                                        quantization range information
  * @frame: HDMI AVI infoframe
+ * @connector: the connector
  * @mode: DRM display mode
  * @rgb_quant_range: RGB quantization range (Q)
- * @rgb_quant_range_selectable: Sink support selectable RGB quantization range (QS)
- * @is_hdmi2_sink: HDMI 2.0 sink, which has different default recommendations
- *
- * Note that @is_hdmi2_sink can be derived by looking at the
- * &drm_scdc.supported flag stored in &drm_hdmi_info.scdc,
- * &drm_display_info.hdmi, which can be found in &drm_connector.display_info.
  */
 void
 drm_hdmi_avi_infoframe_quant_range(struct hdmi_avi_infoframe *frame,
+				   struct drm_connector *connector,
 				   const struct drm_display_mode *mode,
-				   enum hdmi_quantization_range rgb_quant_range,
-				   bool rgb_quant_range_selectable,
-				   bool is_hdmi2_sink)
+				   enum hdmi_quantization_range rgb_quant_range)
 {
+	const struct drm_display_info *info = &connector->display_info;
+
 	/*
 	 * CEA-861:
 	 * "A Source shall not send a non-zero Q value that does not correspond
@@ -4949,7 +4950,7 @@ drm_hdmi_avi_infoframe_quant_range(struct hdmi_avi_infoframe *frame,
 	 * HDMI 2.0 recommends sending non-zero Q when it does match the
 	 * default RGB quantization range for the mode, even when QS=0.
 	 */
-	if (rgb_quant_range_selectable ||
+	if (info->rgb_quant_range_selectable ||
 	    rgb_quant_range == drm_default_rgb_quant_range(mode))
 		frame->quantization_range = rgb_quant_range;
 	else
@@ -4968,7 +4969,7 @@ drm_hdmi_avi_infoframe_quant_range(struct hdmi_avi_infoframe *frame,
 	 * we limit non-zero YQ to HDMI 2.0 sinks only as HDMI 2.0 is based
 	 * on CEA-861-F.
 	 */
-	if (!is_hdmi2_sink ||
+	if (!is_hdmi2_sink(connector) ||
 	    rgb_quant_range == HDMI_QUANTIZATION_RANGE_LIMITED)
 		frame->ycc_quantization_range =
 			HDMI_YCC_QUANTIZATION_RANGE_LIMITED;
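/*
 * Sketch of the updated calling convention; with the connector passed in,
 * drivers no longer compute is_hdmi2_sink or rgb_quant_range_selectable
 * themselves. example_fill_avi_infoframe() and the full-range choice are
 * illustrative only.
 */
static void example_fill_avi_infoframe(struct drm_connector *connector,
				       const struct drm_display_mode *mode)
{
	struct hdmi_avi_infoframe frame;

	if (drm_hdmi_avi_infoframe_from_display_mode(&frame, connector, mode))
		return;

	drm_hdmi_avi_infoframe_quant_range(&frame, connector, mode,
					   HDMI_QUANTIZATION_RANGE_FULL);
	/* ... driver-specific code packs and sends the infoframe ... */
}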
diff --git a/drivers/gpu/drm/drm_fb_cma_helper.c b/drivers/gpu/drm/drm_fb_cma_helper.c
index 5b516615881a..5f8074ffe7d9 100644
--- a/drivers/gpu/drm/drm_fb_cma_helper.c
+++ b/drivers/gpu/drm/drm_fb_cma_helper.c
@@ -17,20 +17,13 @@
  * GNU General Public License for more details.
  */
 
-#include <drm/drmP.h>
-#include <drm/drm_client.h>
-#include <drm/drm_fb_helper.h>
+#include <drm/drm_fourcc.h>
 #include <drm/drm_framebuffer.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
-#include <drm/drm_fb_cma_helper.h>
-#include <drm/drm_print.h>
+#include <drm/drm_plane.h>
 #include <linux/module.h>
 
-struct drm_fbdev_cma {
-	struct drm_fb_helper	fb_helper;
-};
-
 /**
  * DOC: framebuffer cma helper functions
  *
@@ -39,16 +32,8 @@ struct drm_fbdev_cma {
  *
  * drm_gem_fb_create() is used in the &drm_mode_config_funcs.fb_create
  * callback function to create a cma backed framebuffer.
- *
- * An fbdev framebuffer backed by cma is also available by calling
- * drm_fb_cma_fbdev_init(). drm_fb_cma_fbdev_fini() tears it down.
  */
 
-static inline struct drm_fbdev_cma *to_fbdev_cma(struct drm_fb_helper *helper)
-{
-	return container_of(helper, struct drm_fbdev_cma, fb_helper);
-}
-
 /**
  * drm_fb_cma_get_gem_obj() - Get CMA GEM object for framebuffer
  * @fb: The framebuffer
@@ -119,121 +104,3 @@ dma_addr_t drm_fb_cma_get_gem_addr(struct drm_framebuffer *fb,
 	return paddr;
 }
 EXPORT_SYMBOL_GPL(drm_fb_cma_get_gem_addr);
-
-/**
- * drm_fb_cma_fbdev_init() - Allocate and initialize fbdev emulation
- * @dev: DRM device
- * @preferred_bpp: Preferred bits per pixel for the device.
- *                 @dev->mode_config.preferred_depth is used if this is zero.
- * @max_conn_count: Maximum number of connectors.
- *                  @dev->mode_config.num_connector is used if this is zero.
- *
- * Returns:
- * Zero on success or negative error code on failure.
- */
-int drm_fb_cma_fbdev_init(struct drm_device *dev, unsigned int preferred_bpp,
-			  unsigned int max_conn_count)
-{
-	struct drm_fbdev_cma *fbdev_cma;
-
-	/* dev->fb_helper will indirectly point to fbdev_cma after this call */
-	fbdev_cma = drm_fbdev_cma_init(dev, preferred_bpp, max_conn_count);
-	return PTR_ERR_OR_ZERO(fbdev_cma);
-}
-EXPORT_SYMBOL_GPL(drm_fb_cma_fbdev_init);
-
-/**
- * drm_fb_cma_fbdev_fini() - Teardown fbdev emulation
- * @dev: DRM device
- */
-void drm_fb_cma_fbdev_fini(struct drm_device *dev)
-{
-	if (dev->fb_helper)
-		drm_fbdev_cma_fini(to_fbdev_cma(dev->fb_helper));
-}
-EXPORT_SYMBOL_GPL(drm_fb_cma_fbdev_fini);
-
-static const struct drm_fb_helper_funcs drm_fb_cma_helper_funcs = {
-	.fb_probe = drm_fb_helper_generic_probe,
-};
-
-/**
- * drm_fbdev_cma_init() - Allocate and initializes a drm_fbdev_cma struct
- * @dev: DRM device
- * @preferred_bpp: Preferred bits per pixel for the device
- * @max_conn_count: Maximum number of connectors
- *
- * Returns a newly allocated drm_fbdev_cma struct or a ERR_PTR.
- */
-struct drm_fbdev_cma *drm_fbdev_cma_init(struct drm_device *dev,
-	unsigned int preferred_bpp, unsigned int max_conn_count)
-{
-	struct drm_fbdev_cma *fbdev_cma;
-	struct drm_fb_helper *fb_helper;
-	int ret;
-
-	fbdev_cma = kzalloc(sizeof(*fbdev_cma), GFP_KERNEL);
-	if (!fbdev_cma)
-		return ERR_PTR(-ENOMEM);
-
-	fb_helper = &fbdev_cma->fb_helper;
-
-	ret = drm_client_init(dev, &fb_helper->client, "fbdev", NULL);
-	if (ret)
-		goto err_free;
-
-	ret = drm_fb_helper_fbdev_setup(dev, fb_helper, &drm_fb_cma_helper_funcs,
-					preferred_bpp, max_conn_count);
-	if (ret)
-		goto err_client_put;
-
-	drm_client_add(&fb_helper->client);
-
-	return fbdev_cma;
-
-err_client_put:
-	drm_client_release(&fb_helper->client);
-err_free:
-	kfree(fbdev_cma);
-
-	return ERR_PTR(ret);
-}
-EXPORT_SYMBOL_GPL(drm_fbdev_cma_init);
-
-/**
- * drm_fbdev_cma_fini() - Free drm_fbdev_cma struct
- * @fbdev_cma: The drm_fbdev_cma struct
- */
-void drm_fbdev_cma_fini(struct drm_fbdev_cma *fbdev_cma)
-{
-	drm_fb_helper_unregister_fbi(&fbdev_cma->fb_helper);
-	/* All resources have now been freed by drm_fbdev_fb_destroy() */
-}
-EXPORT_SYMBOL_GPL(drm_fbdev_cma_fini);
-
-/**
- * drm_fbdev_cma_restore_mode() - Restores initial framebuffer mode
- * @fbdev_cma: The drm_fbdev_cma struct, may be NULL
- *
- * This function is usually called from the &drm_driver.lastclose callback.
- */
-void drm_fbdev_cma_restore_mode(struct drm_fbdev_cma *fbdev_cma)
-{
-	if (fbdev_cma)
-		drm_fb_helper_restore_fbdev_mode_unlocked(&fbdev_cma->fb_helper);
-}
-EXPORT_SYMBOL_GPL(drm_fbdev_cma_restore_mode);
-
-/**
- * drm_fbdev_cma_hotplug_event() - Poll for hotpulug events
- * @fbdev_cma: The drm_fbdev_cma struct, may be NULL
- *
- * This function is usually called from the &drm_mode_config.output_poll_changed
- * callback.
- */
-void drm_fbdev_cma_hotplug_event(struct drm_fbdev_cma *fbdev_cma)
-{
-	if (fbdev_cma)
-		drm_fb_helper_hotplug_event(&fbdev_cma->fb_helper);
-}
-EXPORT_SYMBOL_GPL(drm_fbdev_cma_hotplug_event);
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index d73703a695e8..0e9349ff2d16 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -1874,6 +1874,7 @@ static int drm_fb_helper_single_fb_probe(struct drm_fb_helper *fb_helper,
 	int i;
 	struct drm_fb_helper_surface_size sizes;
 	int gamma_size = 0;
+	int best_depth = 0;
 
 	memset(&sizes, 0, sizeof(struct drm_fb_helper_surface_size));
 	sizes.surface_depth = 24;
@@ -1881,7 +1882,10 @@ static int drm_fb_helper_single_fb_probe(struct drm_fb_helper *fb_helper,
 	sizes.fb_width = (u32)-1;
 	sizes.fb_height = (u32)-1;
 
-	/* if driver picks 8 or 16 by default use that for both depth/bpp */
+	/*
+	 * If driver picks 8 or 16 by default use that for both depth/bpp
+	 * to begin with
+	 */
 	if (preferred_bpp != sizes.surface_bpp)
 		sizes.surface_depth = sizes.surface_bpp = preferred_bpp;
 
@@ -1916,6 +1920,55 @@ static int drm_fb_helper_single_fb_probe(struct drm_fb_helper *fb_helper,
 		}
 	}
 
+	/*
+	 * If we run into a situation where, for example, the primary plane
+	 * supports RGBA5551 (16 bpp, depth 15) but not RGB565 (16 bpp, depth
+	 * 16) we need to scale down the depth of the sizes we request.
+	 */
+	for (i = 0; i < fb_helper->crtc_count; i++) {
+		struct drm_mode_set *mode_set = &fb_helper->crtc_info[i].mode_set;
+		struct drm_crtc *crtc = mode_set->crtc;
+		struct drm_plane *plane = crtc->primary;
+		int j;
+
+		DRM_DEBUG("test CRTC %d primary plane\n", i);
+
+		for (j = 0; j < plane->format_count; j++) {
+			const struct drm_format_info *fmt;
+
+			fmt = drm_format_info(plane->format_types[j]);
+
+			/*
+			 * Do not consider YUV or other complicated formats
+			 * for framebuffers. This means only legacy formats
+			 * are supported (fmt->depth is a legacy field) but
+			 * the framebuffer emulation can only deal with such
+			 * formats, specifically RGB/BGA formats.
+			 */
+			if (fmt->depth == 0)
+				continue;
+
+			/* We found a perfect fit, great */
+			if (fmt->depth == sizes.surface_depth) {
+				best_depth = fmt->depth;
+				break;
+			}
+
+			/* Skip depths above what we're looking for */
+			if (fmt->depth > sizes.surface_depth)
+				continue;
+
+			/* Best depth found so far */
+			if (fmt->depth > best_depth)
+				best_depth = fmt->depth;
+		}
+	}
+	if (sizes.surface_depth != best_depth) {
+		DRM_INFO("requested bpp %d, scaled depth down to %d\n",
+			 sizes.surface_bpp, best_depth);
+		sizes.surface_depth = best_depth;
+	}
+
 	crtc_count = 0;
 	for (i = 0; i < fb_helper->crtc_count; i++) {
 		struct drm_display_mode *desired_mode;
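To make the selection rule above concrete, here is a standalone userspace sketch of the same best-depth walk (the plane's format table and the requested depth are hypothetical, chosen to show the scale-down case):

	/* Sketch of the best-depth selection above; hypothetical values. */
	#include <stdio.h>

	struct fmt { const char *name; int depth; };

	int main(void)
	{
		/* Plane supports only XRGB1555 (depth 15) and XRGB8888 (depth 24). */
		struct fmt fmts[] = { { "XRGB1555", 15 }, { "XRGB8888", 24 } };
		int requested_depth = 16, best_depth = 0;
		unsigned int i;

		for (i = 0; i < sizeof(fmts) / sizeof(fmts[0]); i++) {
			if (fmts[i].depth == requested_depth) {	/* perfect fit */
				best_depth = fmts[i].depth;
				break;
			}
			if (fmts[i].depth > requested_depth)	/* too deep, skip */
				continue;
			if (fmts[i].depth > best_depth)		/* best so far */
				best_depth = fmts[i].depth;
		}
		/* Prints 15: depth 16 is unsupported, so the request scales down. */
		printf("requested depth %d, scaled down to %d\n",
		       requested_depth, best_depth);
		return 0;
	}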
@@ -2455,7 +2508,7 @@ static int drm_pick_crtcs(struct drm_fb_helper *fb_helper,
 /*
  * This function checks if rotation is necessary because of panel orientation
  * and if it is, if it is supported.
- * If rotation is necessary and supported, its gets set in fb_crtc.rotation.
+ * If rotation is necessary and supported, it gets set in fb_crtc.rotation.
  * If rotation is necessary but not supported, a DRM_MODE_ROTATE_* flag gets
  * or-ed into fb_helper->sw_rotations. In drm_setup_crtcs_fb() we check if only
  * one bit is set and then we set fb_info.fbcon_rotate_hint to make fbcon do
@@ -2891,7 +2944,7 @@ int drm_fb_helper_fbdev_setup(struct drm_device *dev,
 	return 0;
 
 err_drm_fb_helper_fini:
-	drm_fb_helper_fini(fb_helper);
+	drm_fb_helper_fbdev_teardown(dev);
 
 	return ret;
 }
@@ -2986,18 +3039,16 @@ static int drm_fbdev_fb_release(struct fb_info *info, int user)
 	return 0;
 }
 
-/*
- * fb_ops.fb_destroy is called by the last put_fb_info() call at the end of
- * unregister_framebuffer() or fb_release().
- */
-static void drm_fbdev_fb_destroy(struct fb_info *info)
+static void drm_fbdev_cleanup(struct drm_fb_helper *fb_helper)
 {
-	struct drm_fb_helper *fb_helper = info->par;
 	struct fb_info *fbi = fb_helper->fbdev;
 	struct fb_ops *fbops = NULL;
 	void *shadow = NULL;
 
-	if (fbi->fbdefio) {
+	if (!fb_helper->dev)
+		return;
+
+	if (fbi && fbi->fbdefio) {
 		fb_deferred_io_cleanup(fbi);
 		shadow = fbi->screen_buffer;
 		fbops = fbi->fbops;
@@ -3011,15 +3062,22 @@ static void drm_fbdev_fb_destroy(struct fb_info *info)
 	}
 
 	drm_client_framebuffer_delete(fb_helper->buffer);
-	/*
-	 * FIXME:
-	 * Remove conditional when all CMA drivers have been moved over to using
-	 * drm_fbdev_generic_setup().
-	 */
-	if (fb_helper->client.funcs) {
-		drm_client_release(&fb_helper->client);
-		kfree(fb_helper);
-	}
+}
+
+static void drm_fbdev_release(struct drm_fb_helper *fb_helper)
+{
+	drm_fbdev_cleanup(fb_helper);
+	drm_client_release(&fb_helper->client);
+	kfree(fb_helper);
+}
+
+/*
+ * fb_ops.fb_destroy is called by the last put_fb_info() call at the end of
+ * unregister_framebuffer() or fb_release().
+ */
+static void drm_fbdev_fb_destroy(struct fb_info *info)
+{
+	drm_fbdev_release(info->par);
 }
 
 static int drm_fbdev_fb_mmap(struct fb_info *info, struct vm_area_struct *vma)
@@ -3072,7 +3130,6 @@ int drm_fb_helper_generic_probe(struct drm_fb_helper *fb_helper,
 	struct drm_framebuffer *fb;
 	struct fb_info *fbi;
 	u32 format;
-	int ret;
 
 	DRM_DEBUG_KMS("surface width(%d), height(%d) and bpp(%d)\n",
 		      sizes->surface_width, sizes->surface_height,
@@ -3089,10 +3146,8 @@ int drm_fb_helper_generic_probe(struct drm_fb_helper *fb_helper,
 	fb = buffer->fb;
 
 	fbi = drm_fb_helper_alloc_fbi(fb_helper);
-	if (IS_ERR(fbi)) {
-		ret = PTR_ERR(fbi);
-		goto err_free_buffer;
-	}
+	if (IS_ERR(fbi))
+		return PTR_ERR(fbi);
 
 	fbi->par = fb_helper;
 	fbi->fbops = &drm_fbdev_fb_ops;
@@ -3123,8 +3178,7 @@ int drm_fb_helper_generic_probe(struct drm_fb_helper *fb_helper,
 		if (!fbops || !shadow) {
 			kfree(fbops);
 			vfree(shadow);
-			ret = -ENOMEM;
-			goto err_fb_info_destroy;
+			return -ENOMEM;
 		}
 
 		*fbops = *fbi->fbops;
@@ -3136,13 +3190,6 @@ int drm_fb_helper_generic_probe(struct drm_fb_helper *fb_helper,
 	}
 
 	return 0;
-
-err_fb_info_destroy:
-	drm_fb_helper_fini(fb_helper);
-err_free_buffer:
-	drm_client_framebuffer_delete(buffer);
-
-	return ret;
 }
 EXPORT_SYMBOL(drm_fb_helper_generic_probe);
 
@@ -3154,25 +3201,16 @@ static void drm_fbdev_client_unregister(struct drm_client_dev *client)
 {
 	struct drm_fb_helper *fb_helper = drm_fb_helper_from_client(client);
 
-	if (fb_helper->fbdev) {
-		drm_fb_helper_unregister_fbi(fb_helper);
+	if (fb_helper->fbdev)
 		/* drm_fbdev_fb_destroy() takes care of cleanup */
-		return;
-	}
-
-	/* Did drm_fb_helper_fbdev_setup() run? */
-	if (fb_helper->dev)
-		drm_fb_helper_fini(fb_helper);
-
-	drm_client_release(client);
-	kfree(fb_helper);
+		drm_fb_helper_unregister_fbi(fb_helper);
+	else
+		drm_fbdev_release(fb_helper);
 }
 
 static int drm_fbdev_client_restore(struct drm_client_dev *client)
 {
-	struct drm_fb_helper *fb_helper = drm_fb_helper_from_client(client);
-
-	drm_fb_helper_restore_fbdev_mode_unlocked(fb_helper);
+	drm_fb_helper_lastclose(client->dev);
 
 	return 0;
 }
@@ -3183,7 +3221,7 @@ static int drm_fbdev_client_hotplug(struct drm_client_dev *client)
 	struct drm_device *dev = client->dev;
 	int ret;
 
-	/* If drm_fb_helper_fbdev_setup() failed, we only try once */
+	/* Setup is not retried if it has failed */
 	if (!fb_helper->dev && fb_helper->funcs)
 		return 0;
 
@@ -3195,15 +3233,34 @@ static int drm_fbdev_client_hotplug(struct drm_client_dev *client)
 		return 0;
 	}
 
-	ret = drm_fb_helper_fbdev_setup(dev, fb_helper, &drm_fb_helper_generic_funcs,
-					fb_helper->preferred_bpp, 0);
-	if (ret) {
-		fb_helper->dev = NULL;
-		fb_helper->fbdev = NULL;
-		return ret;
-	}
+	drm_fb_helper_prepare(dev, fb_helper, &drm_fb_helper_generic_funcs);
+
+	ret = drm_fb_helper_init(dev, fb_helper, dev->mode_config.num_connector);
+	if (ret)
+		goto err;
+
+	ret = drm_fb_helper_single_add_all_connectors(fb_helper);
+	if (ret)
+		goto err_cleanup;
+
+	if (!drm_drv_uses_atomic_modeset(dev))
+		drm_helper_disable_unused_functions(dev);
+
+	ret = drm_fb_helper_initial_config(fb_helper, fb_helper->preferred_bpp);
+	if (ret)
+		goto err_cleanup;
 
 	return 0;
+
+err_cleanup:
+	drm_fbdev_cleanup(fb_helper);
+err:
+	fb_helper->dev = NULL;
+	fb_helper->fbdev = NULL;
+
+	DRM_DEV_ERROR(dev->dev, "fbdev: Failed to setup generic emulation (ret=%d)\n", ret);
+
+	return ret;
 }
 
 static const struct drm_client_funcs drm_fbdev_client_funcs = {
@@ -3262,6 +3319,10 @@ int drm_fbdev_generic_setup(struct drm_device *dev, unsigned int preferred_bpp)
 
 	drm_client_add(&fb_helper->client);
 
+	if (!preferred_bpp)
+		preferred_bpp = dev->mode_config.preferred_depth;
+	if (!preferred_bpp)
+		preferred_bpp = 32;
 	fb_helper->preferred_bpp = preferred_bpp;
 
 	ret = drm_fbdev_client_hotplug(&fb_helper->client);
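With the fallback above, callers can simply pass 0 and let the helper pick a sensible bpp. A minimal sketch of typical driver usage, with driver names hypothetical and DRM setup error handling elided:

	/* Hypothetical driver bind tail using the generic fbdev emulation. */
	static int foo_drm_bind(struct device *dev)
	{
		struct drm_device *drm = foo_create_drm_device(dev); /* hypothetical */
		int ret;

		ret = drm_dev_register(drm, 0);
		if (ret)
			return ret;

		/*
		 * 0 now means: use mode_config.preferred_depth if set,
		 * otherwise fall back to 32 bpp.
		 */
		drm_fbdev_generic_setup(drm, 0);

		return 0;
	}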
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index 46f48f245eb5..83a5bbca6e7e 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -262,6 +262,18 @@ void drm_file_free(struct drm_file *file)
 	kfree(file);
 }
 
+static void drm_close_helper(struct file *filp)
+{
+	struct drm_file *file_priv = filp->private_data;
+	struct drm_device *dev = file_priv->minor->dev;
+
+	mutex_lock(&dev->filelist_mutex);
+	list_del(&file_priv->lhead);
+	mutex_unlock(&dev->filelist_mutex);
+
+	drm_file_free(file_priv);
+}
+
 static int drm_setup(struct drm_device * dev)
 {
 	int ret;
@@ -318,8 +330,10 @@ int drm_open(struct inode *inode, struct file *filp)
 		goto err_undo;
 	if (need_setup) {
 		retcode = drm_setup(dev);
-		if (retcode)
+		if (retcode) {
+			drm_close_helper(filp);
 			goto err_undo;
+		}
 	}
 	return 0;
 
@@ -473,11 +487,7 @@ int drm_release(struct inode *inode, struct file *filp)
 
 	DRM_DEBUG("open_count = %d\n", dev->open_count);
 
-	mutex_lock(&dev->filelist_mutex);
-	list_del(&file_priv->lhead);
-	mutex_unlock(&dev->filelist_mutex);
-
-	drm_file_free(file_priv);
+	drm_close_helper(filp);
 
 	if (!--dev->open_count) {
 		drm_lastclose(dev);
@@ -701,7 +711,7 @@ int drm_event_reserve_init(struct drm_device *dev,
 EXPORT_SYMBOL(drm_event_reserve_init);
 
 /**
- * drm_event_cancel_free - free a DRM event and release it's space
+ * drm_event_cancel_free - free a DRM event and release its space
  * @dev: DRM device
  * @p: tracking structure for the pending event
  *
diff --git a/drivers/gpu/drm/drm_flip_work.c b/drivers/gpu/drm/drm_flip_work.c
index 12dea16f22a8..3da3bf5af405 100644
--- a/drivers/gpu/drm/drm_flip_work.c
+++ b/drivers/gpu/drm/drm_flip_work.c
@@ -22,6 +22,7 @@
  */
 
 #include <drm/drmP.h>
+#include <drm/drm_util.h>
 #include <drm/drm_flip_work.h>
 
 /**
diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
index d90ee03a84c6..ba7e19d4336c 100644
--- a/drivers/gpu/drm/drm_fourcc.c
+++ b/drivers/gpu/drm/drm_fourcc.c
@@ -238,6 +238,15 @@ const struct drm_format_info *__drm_format_info(u32 format)
 		{ .format = DRM_FORMAT_X0L2,		.depth = 0,  .num_planes = 1,
 		  .char_per_block = { 8, 0, 0 }, .block_w = { 2, 0, 0 }, .block_h = { 2, 0, 0 },
 		  .hsub = 2, .vsub = 2, .is_yuv = true },
+		{ .format = DRM_FORMAT_P010,		.depth = 0,  .num_planes = 2,
+		  .char_per_block = { 2, 4, 0 }, .block_w = { 1, 0, 0 }, .block_h = { 1, 0, 0 },
+		  .hsub = 2, .vsub = 2, .is_yuv = true },
+		{ .format = DRM_FORMAT_P012,		.depth = 0,  .num_planes = 2,
+		  .char_per_block = { 2, 4, 0 }, .block_w = { 1, 0, 0 }, .block_h = { 1, 0, 0 },
+		  .hsub = 2, .vsub = 2, .is_yuv = true },
+		{ .format = DRM_FORMAT_P016,		.depth = 0,  .num_planes = 2,
+		  .char_per_block = { 2, 4, 0 }, .block_w = { 1, 0, 0 }, .block_h = { 1, 0, 0 },
+		  .hsub = 2, .vsub = 2, .is_yuv = true },
 	};
 
 	unsigned int i;
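The three new entries encode the same layout: two bytes per luma sample and a four-byte CbCr pair per 2x2 block of pixels (hsub = vsub = 2). A small standalone sketch of what that implies for buffer sizing (hypothetical dimensions, no hardware alignment applied):

	#include <stdio.h>

	int main(void)
	{
		unsigned int width = 1920, height = 1080;

		/* Plane 0 (Y): 2 bytes per sample, payload in the high bits. */
		unsigned int pitch0 = width * 2;
		unsigned int size0 = pitch0 * height;

		/* Plane 1 (CbCr): 4 bytes per pair, subsampled 2x2. */
		unsigned int pitch1 = (width / 2) * 4;
		unsigned int size1 = pitch1 * (height / 2);

		printf("P010 %ux%u: Y plane %u bytes, CbCr plane %u bytes\n",
		       width, height, size0, size1);
		return 0;
	}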
diff --git a/drivers/gpu/drm/drm_framebuffer.c b/drivers/gpu/drm/drm_framebuffer.c
index fcaea8f50513..d8d75e25f6fb 100644
--- a/drivers/gpu/drm/drm_framebuffer.c
+++ b/drivers/gpu/drm/drm_framebuffer.c
@@ -27,6 +27,7 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_uapi.h>
 #include <drm/drm_print.h>
+#include <drm/drm_util.h>
 
 #include "drm_internal.h"
 #include "drm_crtc_internal.h"
@@ -772,7 +773,7 @@ EXPORT_SYMBOL(drm_framebuffer_lookup);
  * @fb: fb to unregister
  *
  * Drivers need to call this when cleaning up driver-private framebuffers, e.g.
- * those used for fbdev. Note that the caller must hold a reference of it's own,
+ * those used for fbdev. Note that the caller must hold a reference of its own,
  * i.e. the object may not be destroyed through this call (since it'll lead to a
  * locking inversion).
  *
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 8b55ece97967..d0b9f6a9953f 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -37,6 +37,7 @@
 #include <linux/shmem_fs.h>
 #include <linux/dma-buf.h>
 #include <linux/mem_encrypt.h>
+#include <linux/pagevec.h>
 #include <drm/drmP.h>
 #include <drm/drm_vma_manager.h>
 #include <drm/drm_gem.h>
@@ -526,6 +527,17 @@ int drm_gem_create_mmap_offset(struct drm_gem_object *obj)
 }
 EXPORT_SYMBOL(drm_gem_create_mmap_offset);
 
+/*
+ * Move pages to appropriate lru and release the pagevec, decrementing the
+ * ref count of those pages.
+ */
+static void drm_gem_check_release_pagevec(struct pagevec *pvec)
+{
+	check_move_unevictable_pages(pvec);
+	__pagevec_release(pvec);
+	cond_resched();
+}
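The helper is consumed through a batching pattern rather than page-at-a-time put_page() calls. Isolated from the hunks below, the pattern looks like this (a sketch; pages/npages are assumed to come from the caller):

	static void release_pages_batched(struct page **pages, int npages)
	{
		struct pagevec pvec;
		int i;

		pagevec_init(&pvec);
		for (i = 0; i < npages; i++) {
			/* pagevec_add() returns the slots left; flush when full. */
			if (!pagevec_add(&pvec, pages[i]))
				drm_gem_check_release_pagevec(&pvec);
		}
		/* Flush the final, partially filled batch. */
		if (pagevec_count(&pvec))
			drm_gem_check_release_pagevec(&pvec);
	}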
+
 /**
  * drm_gem_get_pages - helper to allocate backing pages for a GEM object
  * from shmem
@@ -551,6 +563,7 @@ struct page **drm_gem_get_pages(struct drm_gem_object *obj)
 {
 	struct address_space *mapping;
 	struct page *p, **pages;
+	struct pagevec pvec;
 	int i, npages;
 
 	/* This is the shared memory object that backs the GEM resource */
@@ -568,6 +581,8 @@ struct page **drm_gem_get_pages(struct drm_gem_object *obj)
 	if (pages == NULL)
 		return ERR_PTR(-ENOMEM);
 
+	mapping_set_unevictable(mapping);
+
 	for (i = 0; i < npages; i++) {
 		p = shmem_read_mapping_page(mapping, i);
 		if (IS_ERR(p))
@@ -586,8 +601,14 @@ struct page **drm_gem_get_pages(struct drm_gem_object *obj)
 	return pages;
 
 fail:
-	while (i--)
-		put_page(pages[i]);
+	mapping_clear_unevictable(mapping);
+	pagevec_init(&pvec);
+	while (i--) {
+		if (!pagevec_add(&pvec, pages[i]))
+			drm_gem_check_release_pagevec(&pvec);
+	}
+	if (pagevec_count(&pvec))
+		drm_gem_check_release_pagevec(&pvec);
 
 	kvfree(pages);
 	return ERR_CAST(p);
@@ -605,6 +626,11 @@ void drm_gem_put_pages(struct drm_gem_object *obj, struct page **pages,
 		bool dirty, bool accessed)
 {
 	int i, npages;
+	struct address_space *mapping;
+	struct pagevec pvec;
+
+	mapping = file_inode(obj->filp)->i_mapping;
+	mapping_clear_unevictable(mapping);
 
 	/* We already BUG_ON() for non-page-aligned sizes in
 	 * drm_gem_object_init(), so we should never hit this unless
@@ -614,6 +640,7 @@ void drm_gem_put_pages(struct drm_gem_object *obj, struct page **pages,
 
 	npages = obj->size >> PAGE_SHIFT;
 
+	pagevec_init(&pvec);
 	for (i = 0; i < npages; i++) {
 		if (dirty)
 			set_page_dirty(pages[i]);
@@ -622,15 +649,18 @@ void drm_gem_put_pages(struct drm_gem_object *obj, struct page **pages,
 			mark_page_accessed(pages[i]);
 
 		/* Undo the reference we took when populating the table */
-		put_page(pages[i]);
+		if (!pagevec_add(&pvec, pages[i]))
+			drm_gem_check_release_pagevec(&pvec);
 	}
+	if (pagevec_count(&pvec))
+		drm_gem_check_release_pagevec(&pvec);
 
 	kvfree(pages);
 }
 EXPORT_SYMBOL(drm_gem_put_pages);
 
 /**
- * drm_gem_object_lookup - look up a GEM object from it's handle
+ * drm_gem_object_lookup - look up a GEM object from its handle
  * @filp: DRM file private date
  * @handle: userspace handle
  *
diff --git a/drivers/gpu/drm/drm_gem_framebuffer_helper.c b/drivers/gpu/drm/drm_gem_framebuffer_helper.c
index acb466d25afc..65edb1ccb185 100644
--- a/drivers/gpu/drm/drm_gem_framebuffer_helper.c
+++ b/drivers/gpu/drm/drm_gem_framebuffer_helper.c
@@ -17,6 +17,7 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_uapi.h>
+#include <drm/drm_damage_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_fourcc.h>
 #include <drm/drm_framebuffer.h>
@@ -136,10 +137,9 @@ EXPORT_SYMBOL(drm_gem_fb_create_handle);
  * @mode_cmd: Metadata from the userspace framebuffer creation request
  * @funcs: vtable to be used for the new framebuffer object
  *
- * This can be used to set &drm_framebuffer_funcs for drivers that need the
- * &drm_framebuffer_funcs.dirty callback. Use drm_gem_fb_create() if you don't
- * need to change &drm_framebuffer_funcs.
- * The function does buffer size validation.
+ * This function can be used to set &drm_framebuffer_funcs for drivers that need
+ * custom framebuffer callbacks. Use drm_gem_fb_create() if you don't need to
+ * change &drm_framebuffer_funcs. The function does buffer size validation.
  *
  * Returns:
  * Pointer to a &drm_framebuffer on success or an error pointer on failure.
@@ -215,8 +215,8 @@ static const struct drm_framebuffer_funcs drm_gem_fb_funcs = {
  *
  * If your hardware has special alignment or pitch requirements these should be
  * checked before calling this function. The function does buffer size
- * validation. Use drm_gem_fb_create_with_funcs() if you need to set
- * &drm_framebuffer_funcs.dirty.
+ * validation. Use drm_gem_fb_create_with_dirty() if you need framebuffer
+ * flushing.
  *
  * Drivers can use this as their &drm_mode_config_funcs.fb_create callback.
  * The ADDFB2 IOCTL calls into this callback.
@@ -233,6 +233,44 @@ drm_gem_fb_create(struct drm_device *dev, struct drm_file *file,
 }
 EXPORT_SYMBOL_GPL(drm_gem_fb_create);
 
+static const struct drm_framebuffer_funcs drm_gem_fb_funcs_dirtyfb = {
+	.destroy	= drm_gem_fb_destroy,
+	.create_handle	= drm_gem_fb_create_handle,
+	.dirty		= drm_atomic_helper_dirtyfb,
+};
+
+/**
+ * drm_gem_fb_create_with_dirty() - Helper function for the
+ *                       &drm_mode_config_funcs.fb_create callback
+ * @dev: DRM device
+ * @file: DRM file that holds the GEM handle(s) backing the framebuffer
+ * @mode_cmd: Metadata from the userspace framebuffer creation request
+ *
+ * This function creates a new framebuffer object described by
+ * &drm_mode_fb_cmd2. This description includes handles for the buffer(s)
+ * backing the framebuffer. drm_atomic_helper_dirtyfb() is used for the dirty
+ * callback, which provides framebuffer flushing through the atomic machinery.
+ * Use drm_gem_fb_create() if you don't need the dirty callback.
+ * The function does buffer size validation.
+ *
+ * Drivers should also call drm_plane_enable_fb_damage_clips() on all planes
+ * so that userspace can use damage clips with the ATOMIC IOCTL as well.
+ *
+ * Drivers can use this as their &drm_mode_config_funcs.fb_create callback.
+ * The ADDFB2 IOCTL calls into this callback.
+ *
+ * Returns:
+ * Pointer to a &drm_framebuffer on success or an error pointer on failure.
+ */
+struct drm_framebuffer *
+drm_gem_fb_create_with_dirty(struct drm_device *dev, struct drm_file *file,
+			     const struct drm_mode_fb_cmd2 *mode_cmd)
+{
+	return drm_gem_fb_create_with_funcs(dev, file, mode_cmd,
+					    &drm_gem_fb_funcs_dirtyfb);
+}
+EXPORT_SYMBOL_GPL(drm_gem_fb_create_with_dirty);
+
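A hypothetical driver wiring for the new helper, following the kerneldoc above: use it as the fb_create callback and enable damage clips on each plane at init time.

	static const struct drm_mode_config_funcs foo_mode_config_funcs = {
		.fb_create	= drm_gem_fb_create_with_dirty,
		.atomic_check	= drm_atomic_helper_check,
		.atomic_commit	= drm_atomic_helper_commit,
	};

	static int foo_plane_init(struct drm_device *dev, struct drm_plane *plane)
	{
		/* ... drm_universal_plane_init() and helper setup elided ... */

		/* Let userspace pass damage clips via the ATOMIC ioctl too. */
		drm_plane_enable_fb_damage_clips(plane);

		return 0;
	}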
 /**
  * drm_gem_fb_prepare_fb() - Prepare a GEM backed framebuffer
  * @plane: Plane
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index d9caf205e0b3..251d67e04c2d 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -26,6 +26,8 @@
 #define DRM_IF_MAJOR 1
 #define DRM_IF_MINOR 4
 
+#define DRM_IF_VERSION(maj, min) ((maj) << 16 | (min))
+
 struct drm_prime_file_private;
 struct dma_buf;
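The new macro simply packs major/minor into one integer so interface versions compare as plain numbers; for instance (req_major/req_minor being hypothetical locals):

	if (DRM_IF_VERSION(req_major, req_minor) >= DRM_IF_VERSION(1, 4)) {
		/* the requested interface is 1.4 or newer */
	}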
 
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index 7e6746b2d704..687943df58e1 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -508,6 +508,13 @@ int drm_version(struct drm_device *dev, void *data,
 	return err;
 }
 
+static inline bool
+drm_render_driver_and_ioctl(const struct drm_device *dev, u32 flags)
+{
+	return drm_core_check_feature(dev, DRIVER_RENDER) &&
+		(flags & DRM_RENDER_ALLOW);
+}
+
 /**
  * drm_ioctl_permit - Check ioctl permissions against caller
  *
@@ -522,14 +529,19 @@ int drm_version(struct drm_device *dev, void *data,
  */
 int drm_ioctl_permit(u32 flags, struct drm_file *file_priv)
 {
+	const struct drm_device *dev = file_priv->minor->dev;
+
 	/* ROOT_ONLY is only for CAP_SYS_ADMIN */
 	if (unlikely((flags & DRM_ROOT_ONLY) && !capable(CAP_SYS_ADMIN)))
 		return -EACCES;
 
-	/* AUTH is only for authenticated or render client */
-	if (unlikely((flags & DRM_AUTH) && !drm_is_render_client(file_priv) &&
-		     !file_priv->authenticated))
-		return -EACCES;
+	/* AUTH is only for master ... */
+	if (unlikely((flags & DRM_AUTH) && drm_is_primary_client(file_priv))) {
+		/* authenticated ones, or render capable on DRM_RENDER_ALLOW. */
+		if (!file_priv->authenticated &&
+		    !drm_render_driver_and_ioctl(dev, flags))
+			return -EACCES;
+	}
 
 	/* MASTER is only for master or control clients */
 	if (unlikely((flags & DRM_MASTER) &&
@@ -570,7 +582,7 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
 	DRM_IOCTL_DEF(DRM_IOCTL_SET_UNIQUE, drm_invalid_op, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
 	DRM_IOCTL_DEF(DRM_IOCTL_BLOCK, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
 	DRM_IOCTL_DEF(DRM_IOCTL_UNBLOCK, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
-	DRM_IOCTL_DEF(DRM_IOCTL_AUTH_MAGIC, drm_authmagic, DRM_AUTH|DRM_UNLOCKED|DRM_MASTER),
+	DRM_IOCTL_DEF(DRM_IOCTL_AUTH_MAGIC, drm_authmagic, DRM_UNLOCKED|DRM_MASTER),
 
 	DRM_IOCTL_DEF(DRM_IOCTL_ADD_MAP, drm_legacy_addmap_ioctl, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
 	DRM_IOCTL_DEF(DRM_IOCTL_RM_MAP, drm_legacy_rmmap_ioctl, DRM_AUTH),
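The practical effect of the relaxed check: on a driver that sets DRIVER_RENDER, an ioctl flagged DRM_AUTH | DRM_RENDER_ALLOW no longer returns -EACCES for an unauthenticated client on the primary node. A hypothetical driver table entry that benefits (names are illustrative only):

	static const struct drm_ioctl_desc foo_ioctls[] = {
		/* Usable on render nodes, and now also unauthenticated on
		 * the primary node when the driver advertises DRIVER_RENDER. */
		DRM_IOCTL_DEF_DRV(FOO_GEM_SUBMIT, foo_gem_submit_ioctl,
				  DRM_AUTH | DRM_RENDER_ALLOW),
	};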
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 45a07652fa00..9bd8908d5fd8 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -103,9 +103,6 @@ int drm_irq_install(struct drm_device *dev, int irq)
 	int ret;
 	unsigned long sh_flags = 0;
 
-	if (!drm_core_check_feature(dev, DRIVER_HAVE_IRQ))
-		return -EOPNOTSUPP;
-
 	if (irq == 0)
 		return -EINVAL;
 
@@ -123,8 +120,8 @@ int drm_irq_install(struct drm_device *dev, int irq)
 	if (dev->driver->irq_preinstall)
 		dev->driver->irq_preinstall(dev);
 
-	/* Install handler */
-	if (drm_core_check_feature(dev, DRIVER_IRQ_SHARED))
+	/* PCI devices require shared interrupts. */
+	if (dev->pdev)
 		sh_flags = IRQF_SHARED;
 
 	ret = request_irq(irq, dev->driver->irq_handler,
@@ -174,9 +171,6 @@ int drm_irq_uninstall(struct drm_device *dev)
 	bool irq_enabled;
 	int i;
 
-	if (!drm_core_check_feature(dev, DRIVER_HAVE_IRQ))
-		return -EOPNOTSUPP;
-
 	irq_enabled = dev->irq_enabled;
 	dev->irq_enabled = false;
 
diff --git a/drivers/gpu/drm/drm_lease.c b/drivers/gpu/drm/drm_lease.c
index 5df1256618cc..603b0bd9c5ce 100644
--- a/drivers/gpu/drm/drm_lease.c
+++ b/drivers/gpu/drm/drm_lease.c
@@ -218,7 +218,7 @@ static struct drm_master *drm_lease_create(struct drm_master *lessor, struct idr
 
 	idr_for_each_entry(leases, entry, object) {
 		error = 0;
-		if (!idr_find(&dev->mode_config.crtc_idr, object))
+		if (!idr_find(&dev->mode_config.object_idr, object))
 			error = -ENOENT;
 		else if (!_drm_lease_held_master(lessor, object))
 			error = -EACCES;
@@ -439,7 +439,7 @@ static int fill_object_idr(struct drm_device *dev,
 		/*
 		 * We're using an IDR to hold the set of leased
 		 * objects, but we don't need to point at the object's
-		 * data structure from the lease as the main crtc_idr
+		 * data structure from the lease as the main object_idr
 		 * will be used to actually find that. Instead, all we
 		 * really want is a 'leased/not-leased' result, for
 		 * which any non-NULL pointer will work fine.
@@ -688,7 +688,7 @@ int drm_mode_get_lease_ioctl(struct drm_device *dev,
 
 	if (lessee->lessor == NULL)
 		/* owner can use all objects */
-		object_idr = &lessee->dev->mode_config.crtc_idr;
+		object_idr = &lessee->dev->mode_config.object_idr;
 	else
 		/* lessee can only use allowed object */
 		object_idr = &lessee->leases;
diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 3cc5fbd78ee2..2b4f373736c7 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -816,7 +816,7 @@ EXPORT_SYMBOL(drm_mm_scan_add_block);
  * When the scan list is empty, the selected memory nodes can be freed. An
  * immediately following drm_mm_insert_node_in_range_generic() or one of the
  * simpler versions of that function with !DRM_MM_SEARCH_BEST will then return
- * the just freed block (because its at the top of the free_stack list).
+ * the just freed block (because it's at the top of the free_stack list).
  *
  * Returns:
  * True if this block should be evicted, false otherwise. Will always
diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
index 703bfce975bb..4a1c2023ccf0 100644
--- a/drivers/gpu/drm/drm_mode_config.c
+++ b/drivers/gpu/drm/drm_mode_config.c
@@ -393,7 +393,8 @@ void drm_mode_config_init(struct drm_device *dev)
 	INIT_LIST_HEAD(&dev->mode_config.property_list);
 	INIT_LIST_HEAD(&dev->mode_config.property_blob_list);
 	INIT_LIST_HEAD(&dev->mode_config.plane_list);
-	idr_init(&dev->mode_config.crtc_idr);
+	INIT_LIST_HEAD(&dev->mode_config.privobj_list);
+	idr_init(&dev->mode_config.object_idr);
 	idr_init(&dev->mode_config.tile_idr);
 	ida_init(&dev->mode_config.connector_ida);
 	spin_lock_init(&dev->mode_config.connector_list_lock);
@@ -496,7 +497,7 @@ void drm_mode_config_cleanup(struct drm_device *dev)
 
 	ida_destroy(&dev->mode_config.connector_ida);
 	idr_destroy(&dev->mode_config.tile_idr);
-	idr_destroy(&dev->mode_config.crtc_idr);
+	idr_destroy(&dev->mode_config.object_idr);
 	drm_modeset_lock_fini(&dev->mode_config.connection_mutex);
 }
 EXPORT_SYMBOL(drm_mode_config_cleanup);
diff --git a/drivers/gpu/drm/drm_mode_object.c b/drivers/gpu/drm/drm_mode_object.c
index 004191d01772..a9005c1c2384 100644
--- a/drivers/gpu/drm/drm_mode_object.c
+++ b/drivers/gpu/drm/drm_mode_object.c
@@ -38,7 +38,7 @@ int __drm_mode_object_add(struct drm_device *dev, struct drm_mode_object *obj,
 	int ret;
 
 	mutex_lock(&dev->mode_config.idr_mutex);
-	ret = idr_alloc(&dev->mode_config.crtc_idr, register_obj ? obj : NULL,
+	ret = idr_alloc(&dev->mode_config.object_idr, register_obj ? obj : NULL,
 			1, 0, GFP_KERNEL);
 	if (ret >= 0) {
 		/*
@@ -79,7 +79,7 @@ void drm_mode_object_register(struct drm_device *dev,
 			      struct drm_mode_object *obj)
 {
 	mutex_lock(&dev->mode_config.idr_mutex);
-	idr_replace(&dev->mode_config.crtc_idr, obj, obj->id);
+	idr_replace(&dev->mode_config.object_idr, obj, obj->id);
 	mutex_unlock(&dev->mode_config.idr_mutex);
 }
 
@@ -99,7 +99,7 @@ void drm_mode_object_unregister(struct drm_device *dev,
 {
 	mutex_lock(&dev->mode_config.idr_mutex);
 	if (object->id) {
-		idr_remove(&dev->mode_config.crtc_idr, object->id);
+		idr_remove(&dev->mode_config.object_idr, object->id);
 		object->id = 0;
 	}
 	mutex_unlock(&dev->mode_config.idr_mutex);
@@ -131,7 +131,7 @@ struct drm_mode_object *__drm_mode_object_find(struct drm_device *dev,
 	struct drm_mode_object *obj = NULL;
 
 	mutex_lock(&dev->mode_config.idr_mutex);
-	obj = idr_find(&dev->mode_config.crtc_idr, id);
+	obj = idr_find(&dev->mode_config.object_idr, id);
 	if (obj && type != DRM_MODE_OBJECT_ANY && obj->type != type)
 		obj = NULL;
 	if (obj && obj->id != id)
@@ -465,6 +465,7 @@ static int set_property_atomic(struct drm_mode_object *obj,
 
 	drm_modeset_acquire_init(&ctx, 0);
 	state->acquire_ctx = &ctx;
+
 retry:
 	if (prop == state->dev->mode_config.dpms_property) {
 		if (obj->type != DRM_MODE_OBJECT_CONNECTOR) {
diff --git a/drivers/gpu/drm/drm_modes.c b/drivers/gpu/drm/drm_modes.c
index f91e02c87fd8..869ac6f4671e 100644
--- a/drivers/gpu/drm/drm_modes.c
+++ b/drivers/gpu/drm/drm_modes.c
@@ -71,11 +71,6 @@ struct drm_display_mode *drm_mode_create(struct drm_device *dev)
 	if (!nmode)
 		return NULL;
 
-	if (drm_mode_object_add(dev, &nmode->base, DRM_MODE_OBJECT_MODE)) {
-		kfree(nmode);
-		return NULL;
-	}
-
 	return nmode;
 }
 EXPORT_SYMBOL(drm_mode_create);
@@ -92,8 +87,6 @@ void drm_mode_destroy(struct drm_device *dev, struct drm_display_mode *mode)
 	if (!mode)
 		return;
 
-	drm_mode_object_unregister(dev, &mode->base);
-
 	kfree(mode);
 }
 EXPORT_SYMBOL(drm_mode_destroy);
@@ -911,11 +904,9 @@ EXPORT_SYMBOL(drm_mode_set_crtcinfo);
  */
 void drm_mode_copy(struct drm_display_mode *dst, const struct drm_display_mode *src)
 {
-	int id = dst->base.id;
 	struct list_head head = dst->head;
 
 	*dst = *src;
-	dst->base.id = id;
 	dst->head = head;
 }
 EXPORT_SYMBOL(drm_mode_copy);
@@ -1281,7 +1272,7 @@ const char *drm_get_mode_status_name(enum drm_mode_status status)
  * @verbose: be verbose about it
  *
  * This helper function can be used to prune a display mode list after
- * validation has been completed. All modes who's status is not MODE_OK will be
+ * validation has been completed. All modes whose status is not MODE_OK will be
  * removed from the list, and if @verbose the status code and mode name is also
  * printed to dmesg.
  */
diff --git a/drivers/gpu/drm/drm_modeset_helper.c b/drivers/gpu/drm/drm_modeset_helper.c
index 9150fa385bba..da483125e063 100644
--- a/drivers/gpu/drm/drm_modeset_helper.c
+++ b/drivers/gpu/drm/drm_modeset_helper.c
@@ -21,10 +21,12 @@
  */
 
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_helper.h>
+#include <drm/drm_fourcc.h>
 #include <drm/drm_modeset_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_print.h>
+#include <drm/drm_probe_helper.h>
 
 /**
  * DOC: aux kms helpers
diff --git a/drivers/gpu/drm/drm_modeset_lock.c b/drivers/gpu/drm/drm_modeset_lock.c
index 51f534db9107..81dd11901ffd 100644
--- a/drivers/gpu/drm/drm_modeset_lock.c
+++ b/drivers/gpu/drm/drm_modeset_lock.c
@@ -22,6 +22,7 @@
  */
 
 #include <drm/drmP.h>
+#include <drm/drm_atomic.h>
 #include <drm/drm_crtc.h>
 #include <drm/drm_modeset_lock.h>
 
@@ -394,6 +395,7 @@ EXPORT_SYMBOL(drm_modeset_unlock);
 int drm_modeset_lock_all_ctx(struct drm_device *dev,
 			     struct drm_modeset_acquire_ctx *ctx)
 {
+	struct drm_private_obj *privobj;
 	struct drm_crtc *crtc;
 	struct drm_plane *plane;
 	int ret;
@@ -414,6 +416,12 @@ int drm_modeset_lock_all_ctx(struct drm_device *dev,
 			return ret;
 	}
 
+	drm_for_each_privobj(privobj, dev) {
+		ret = drm_modeset_lock(&privobj->lock, ctx);
+		if (ret)
+			return ret;
+	}
+
 	return 0;
 }
 EXPORT_SYMBOL(drm_modeset_lock_all_ctx);
diff --git a/drivers/gpu/drm/drm_of.c b/drivers/gpu/drm/drm_of.c
index 2763a5ec845b..f2f71d71494a 100644
--- a/drivers/gpu/drm/drm_of.c
+++ b/drivers/gpu/drm/drm_of.c
@@ -217,9 +217,11 @@ int drm_of_encoder_active_endpoint(struct device_node *node,
 }
 EXPORT_SYMBOL_GPL(drm_of_encoder_active_endpoint);
 
-/*
+/**
  * drm_of_find_panel_or_bridge - return connected panel or bridge device
  * @np: device tree node containing encoder output ports
+ * @port: port in the device tree node
+ * @endpoint: endpoint in the device tree node
  * @panel: pointer to hold returned drm_panel
  * @bridge: pointer to hold returned drm_bridge
  *
diff --git a/drivers/gpu/drm/drm_panel.c b/drivers/gpu/drm/drm_panel.c
index c33f95e08e1b..dbd5b873e8f2 100644
--- a/drivers/gpu/drm/drm_panel.c
+++ b/drivers/gpu/drm/drm_panel.c
@@ -36,6 +36,9 @@ static LIST_HEAD(panel_list);
  * The DRM panel helpers allow drivers to register panel objects with a
  * central registry and provide functions to retrieve those panels in display
  * drivers.
+ *
+ * For easy integration into drivers using the &drm_bridge infrastructure,
+ * please take a look at drm_panel_bridge_add() and devm_drm_panel_bridge_add().
  */
 
 /**
diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
index 5f650d8fc66b..4cfb56893b7f 100644
--- a/drivers/gpu/drm/drm_plane.c
+++ b/drivers/gpu/drm/drm_plane.c
@@ -220,6 +220,9 @@ int drm_universal_plane_init(struct drm_device *dev, struct drm_plane *plane,
 			format_modifier_count++;
 	}
 
+	if (format_modifier_count)
+		config->allow_fb_modifiers = true;
+
 	plane->modifier_count = format_modifier_count;
 	plane->modifiers = kmalloc_array(format_modifier_count,
 					 sizeof(format_modifiers[0]),
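With this change a driver only has to pass a modifier list; allow_fb_modifiers is then implied. A hypothetical plane init showing the shape of such a list (terminated by DRM_FORMAT_MOD_INVALID; foo_plane_funcs and foo_formats are assumed to exist in the driver):

	static const u64 foo_plane_modifiers[] = {
		DRM_FORMAT_MOD_LINEAR,
		DRM_FORMAT_MOD_INVALID,	/* terminator */
	};

	static int foo_primary_init(struct drm_device *dev, struct drm_plane *plane,
				    u32 possible_crtcs)
	{
		return drm_universal_plane_init(dev, plane, possible_crtcs,
						&foo_plane_funcs, foo_formats,
						ARRAY_SIZE(foo_formats),
						foo_plane_modifiers,
						DRM_PLANE_TYPE_PRIMARY, NULL);
	}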
diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c
index a1bb157bfdfa..6fd08e04b323 100644
--- a/drivers/gpu/drm/drm_probe_helper.c
+++ b/drivers/gpu/drm/drm_probe_helper.c
@@ -36,10 +36,10 @@
 #include <drm/drm_client.h>
 #include <drm/drm_crtc.h>
 #include <drm/drm_fourcc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_modeset_helper_vtables.h>
+#include <drm/drm_probe_helper.h>
 
 #include "drm_crtc_helper_internal.h"
 
diff --git a/drivers/gpu/drm/drm_property.c b/drivers/gpu/drm/drm_property.c
index 79c77c3cad86..f8ec8f9c3e7a 100644
--- a/drivers/gpu/drm/drm_property.c
+++ b/drivers/gpu/drm/drm_property.c
@@ -866,7 +866,7 @@ err:
  * value doesn't become invalid part way through the property update due to
  * race).  The value returned by reference via 'obj' should be passed back
  * to drm_property_change_valid_put() after the property is set (and the
- * object to which the property is attached has a chance to take it's own
+ * object to which the property is attached has a chance to take its own
  * reference).
  */
 bool drm_property_change_valid_get(struct drm_property *property,
diff --git a/drivers/gpu/drm/drm_rect.c b/drivers/gpu/drm/drm_rect.c
index 8c057829b804..66c41b12719c 100644
--- a/drivers/gpu/drm/drm_rect.c
+++ b/drivers/gpu/drm/drm_rect.c
@@ -208,114 +208,6 @@ int drm_rect_calc_vscale(const struct drm_rect *src,
 EXPORT_SYMBOL(drm_rect_calc_vscale);
 
 /**
- * drm_calc_hscale_relaxed - calculate the horizontal scaling factor
- * @src: source window rectangle
- * @dst: destination window rectangle
- * @min_hscale: minimum allowed horizontal scaling factor
- * @max_hscale: maximum allowed horizontal scaling factor
- *
- * Calculate the horizontal scaling factor as
- * (@src width) / (@dst width).
- *
- * If the calculated scaling factor is below @min_vscale,
- * decrease the height of rectangle @dst to compensate.
- *
- * If the calculated scaling factor is above @max_vscale,
- * decrease the height of rectangle @src to compensate.
- *
- * If the scale is below 1 << 16, round down. If the scale is above
- * 1 << 16, round up. This will calculate the scale with the most
- * pessimistic limit calculation.
- *
- * RETURNS:
- * The horizontal scaling factor.
- */
-int drm_rect_calc_hscale_relaxed(struct drm_rect *src,
-				 struct drm_rect *dst,
-				 int min_hscale, int max_hscale)
-{
-	int src_w = drm_rect_width(src);
-	int dst_w = drm_rect_width(dst);
-	int hscale = drm_calc_scale(src_w, dst_w);
-
-	if (hscale < 0 || dst_w == 0)
-		return hscale;
-
-	if (hscale < min_hscale) {
-		int max_dst_w = src_w / min_hscale;
-
-		drm_rect_adjust_size(dst, max_dst_w - dst_w, 0);
-
-		return min_hscale;
-	}
-
-	if (hscale > max_hscale) {
-		int max_src_w = dst_w * max_hscale;
-
-		drm_rect_adjust_size(src, max_src_w - src_w, 0);
-
-		return max_hscale;
-	}
-
-	return hscale;
-}
-EXPORT_SYMBOL(drm_rect_calc_hscale_relaxed);
-
-/**
- * drm_rect_calc_vscale_relaxed - calculate the vertical scaling factor
- * @src: source window rectangle
- * @dst: destination window rectangle
- * @min_vscale: minimum allowed vertical scaling factor
- * @max_vscale: maximum allowed vertical scaling factor
- *
- * Calculate the vertical scaling factor as
- * (@src height) / (@dst height).
- *
- * If the calculated scaling factor is below @min_vscale,
- * decrease the height of rectangle @dst to compensate.
- *
- * If the calculated scaling factor is above @max_vscale,
- * decrease the height of rectangle @src to compensate.
- *
- * If the scale is below 1 << 16, round down. If the scale is above
- * 1 << 16, round up. This will calculate the scale with the most
- * pessimistic limit calculation.
- *
- * RETURNS:
- * The vertical scaling factor.
- */
-int drm_rect_calc_vscale_relaxed(struct drm_rect *src,
-				 struct drm_rect *dst,
-				 int min_vscale, int max_vscale)
-{
-	int src_h = drm_rect_height(src);
-	int dst_h = drm_rect_height(dst);
-	int vscale = drm_calc_scale(src_h, dst_h);
-
-	if (vscale < 0 || dst_h == 0)
-		return vscale;
-
-	if (vscale < min_vscale) {
-		int max_dst_h = src_h / min_vscale;
-
-		drm_rect_adjust_size(dst, 0, max_dst_h - dst_h);
-
-		return min_vscale;
-	}
-
-	if (vscale > max_vscale) {
-		int max_src_h = dst_h * max_vscale;
-
-		drm_rect_adjust_size(src, 0, max_src_h - src_h);
-
-		return max_vscale;
-	}
-
-	return vscale;
-}
-EXPORT_SYMBOL(drm_rect_calc_vscale_relaxed);
-
-/**
  * drm_rect_debug_print - print the rectangle information
  * @prefix: prefix string
  * @r: rectangle to print
diff --git a/drivers/gpu/drm/drm_simple_kms_helper.c b/drivers/gpu/drm/drm_simple_kms_helper.c
index 917812448d1b..a32f14cd7398 100644
--- a/drivers/gpu/drm/drm_simple_kms_helper.c
+++ b/drivers/gpu/drm/drm_simple_kms_helper.c
@@ -10,8 +10,8 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_simple_kms_helper.h>
 #include <linux/slab.h>
 
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index db30a0e89db8..e19525af0cce 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,16 @@
 #include "drm_internal.h"
 #include <drm/drm_syncobj.h>
 
+struct syncobj_wait_entry {
+	struct list_head node;
+	struct task_struct *task;
+	struct dma_fence *fence;
+	struct dma_fence_cb fence_cb;
+};
+
+static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj,
+				      struct syncobj_wait_entry *wait);
+
 /**
  * drm_syncobj_find - lookup and reference a sync object.
  * @file_private: drm file private pointer
@@ -82,58 +92,33 @@ struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
 }
 EXPORT_SYMBOL(drm_syncobj_find);
 
-static void drm_syncobj_add_callback_locked(struct drm_syncobj *syncobj,
-					    struct drm_syncobj_cb *cb,
-					    drm_syncobj_func_t func)
+static void drm_syncobj_fence_add_wait(struct drm_syncobj *syncobj,
+				       struct syncobj_wait_entry *wait)
 {
-	cb->func = func;
-	list_add_tail(&cb->node, &syncobj->cb_list);
-}
-
-static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
-						 struct dma_fence **fence,
-						 struct drm_syncobj_cb *cb,
-						 drm_syncobj_func_t func)
-{
-	int ret;
-
-	*fence = drm_syncobj_fence_get(syncobj);
-	if (*fence)
-		return 1;
+	if (wait->fence)
+		return;
 
 	spin_lock(&syncobj->lock);
 	/* We've already tried once to get a fence and failed.  Now that we
 	 * have the lock, try one more time just to be sure we don't add a
 	 * callback when a fence has already been set.
 	 */
-	if (syncobj->fence) {
-		*fence = dma_fence_get(rcu_dereference_protected(syncobj->fence,
-								 lockdep_is_held(&syncobj->lock)));
-		ret = 1;
-	} else {
-		*fence = NULL;
-		drm_syncobj_add_callback_locked(syncobj, cb, func);
-		ret = 0;
-	}
+	if (syncobj->fence)
+		wait->fence = dma_fence_get(
+			rcu_dereference_protected(syncobj->fence, 1));
+	else
+		list_add_tail(&wait->node, &syncobj->cb_list);
 	spin_unlock(&syncobj->lock);
-
-	return ret;
 }
 
-void drm_syncobj_add_callback(struct drm_syncobj *syncobj,
-			      struct drm_syncobj_cb *cb,
-			      drm_syncobj_func_t func)
+static void drm_syncobj_remove_wait(struct drm_syncobj *syncobj,
+				    struct syncobj_wait_entry *wait)
 {
-	spin_lock(&syncobj->lock);
-	drm_syncobj_add_callback_locked(syncobj, cb, func);
-	spin_unlock(&syncobj->lock);
-}
+	if (!wait->node.next)
+		return;
 
-void drm_syncobj_remove_callback(struct drm_syncobj *syncobj,
-				 struct drm_syncobj_cb *cb)
-{
 	spin_lock(&syncobj->lock);
-	list_del_init(&cb->node);
+	list_del_init(&wait->node);
 	spin_unlock(&syncobj->lock);
 }
 
@@ -148,7 +133,7 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 			       struct dma_fence *fence)
 {
 	struct dma_fence *old_fence;
-	struct drm_syncobj_cb *cur, *tmp;
+	struct syncobj_wait_entry *cur, *tmp;
 
 	if (fence)
 		dma_fence_get(fence);
@@ -162,7 +147,7 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 	if (fence != old_fence) {
 		list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, node) {
 			list_del_init(&cur->node);
-			cur->func(syncobj, cur);
+			syncobj_wait_syncobj_func(syncobj, cur);
 		}
 	}
 
@@ -608,13 +593,6 @@ drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data,
 					&args->handle);
 }
 
-struct syncobj_wait_entry {
-	struct task_struct *task;
-	struct dma_fence *fence;
-	struct dma_fence_cb fence_cb;
-	struct drm_syncobj_cb syncobj_cb;
-};
-
 static void syncobj_wait_fence_func(struct dma_fence *fence,
 				    struct dma_fence_cb *cb)
 {
@@ -625,11 +603,8 @@ static void syncobj_wait_fence_func(struct dma_fence *fence,
 }
 
 static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj,
-				      struct drm_syncobj_cb *cb)
+				      struct syncobj_wait_entry *wait)
 {
-	struct syncobj_wait_entry *wait =
-		container_of(cb, struct syncobj_wait_entry, syncobj_cb);
-
 	/* This happens inside the syncobj lock */
 	wait->fence = dma_fence_get(rcu_dereference_protected(syncobj->fence,
 							      lockdep_is_held(&syncobj->lock)));
@@ -688,12 +663,8 @@ static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
 	 */
 
 	if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
-		for (i = 0; i < count; ++i) {
-			drm_syncobj_fence_get_or_add_callback(syncobjs[i],
-							      &entries[i].fence,
-							      &entries[i].syncobj_cb,
-							      syncobj_wait_syncobj_func);
-		}
+		for (i = 0; i < count; ++i)
+			drm_syncobj_fence_add_wait(syncobjs[i], &entries[i]);
 	}
 
 	do {
@@ -742,9 +713,7 @@ done_waiting:
 
 cleanup_entries:
 	for (i = 0; i < count; ++i) {
-		if (entries[i].syncobj_cb.func)
-			drm_syncobj_remove_callback(syncobjs[i],
-						    &entries[i].syncobj_cb);
+		drm_syncobj_remove_wait(syncobjs[i], &entries[i]);
 		if (entries[i].fence_cb.func)
 			dma_fence_remove_callback(entries[i].fence,
 						  &entries[i].fence_cb);
diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index 98e091175921..a1b65d26d761 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -48,7 +48,7 @@
  * Drivers must initialize the vertical blanking handling core with a call to
  * drm_vblank_init(). Minimally, a driver needs to implement
  * &drm_crtc_funcs.enable_vblank and &drm_crtc_funcs.disable_vblank plus call
- * drm_crtc_handle_vblank() in it's vblank interrupt handler for working vblank
+ * drm_crtc_handle_vblank() in its vblank interrupt handler for working vblank
  * support.
  *
  * Vertical blanking interrupts can be enabled by the DRM core or by drivers
@@ -105,13 +105,20 @@ static void store_vblank(struct drm_device *dev, unsigned int pipe,
 	write_sequnlock(&vblank->seqlock);
 }
 
+static u32 drm_max_vblank_count(struct drm_device *dev, unsigned int pipe)
+{
+	struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
+
+	return vblank->max_vblank_count ?: dev->max_vblank_count;
+}
+
 /*
  * "No hw counter" fallback implementation of .get_vblank_counter() hook,
  * if there is no usable hardware frame counter available.
  */
 static u32 drm_vblank_no_hw_counter(struct drm_device *dev, unsigned int pipe)
 {
-	WARN_ON_ONCE(dev->max_vblank_count != 0);
+	WARN_ON_ONCE(drm_max_vblank_count(dev, pipe) != 0);
 	return 0;
 }
 
@@ -198,6 +205,7 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
 	ktime_t t_vblank;
 	int count = DRM_TIMESTAMP_MAXRETRIES;
 	int framedur_ns = vblank->framedur_ns;
+	u32 max_vblank_count = drm_max_vblank_count(dev, pipe);
 
 	/*
 	 * Interrupts were disabled prior to this call, so deal with counter
@@ -216,9 +224,9 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
 		rc = drm_get_last_vbltimestamp(dev, pipe, &t_vblank, in_vblank_irq);
 	} while (cur_vblank != __get_vblank_counter(dev, pipe) && --count > 0);
 
-	if (dev->max_vblank_count != 0) {
+	if (max_vblank_count) {
 		/* trust the hw counter when it's around */
-		diff = (cur_vblank - vblank->last) & dev->max_vblank_count;
+		diff = (cur_vblank - vblank->last) & max_vblank_count;
 	} else if (rc && framedur_ns) {
 		u64 diff_ns = ktime_to_ns(ktime_sub(t_vblank, vblank->time));
 
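The masking in the hardware-counter branch relies on max_vblank_count being of the form 2^n - 1, which makes the unsigned subtraction wrap-safe. A standalone sketch with hypothetical values for a 24-bit counter:

	#include <stdio.h>

	int main(void)
	{
		unsigned int max_vblank_count = 0xffffff;	/* 24-bit counter */
		unsigned int last = 0xfffffe;			/* before wrap */
		unsigned int cur = 0x000002;			/* after wrap */

		/* The unsigned underflow is masked away; prints diff = 4. */
		printf("diff = %u\n", (cur - last) & max_vblank_count);
		return 0;
	}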
@@ -1205,6 +1213,37 @@ void drm_crtc_vblank_reset(struct drm_crtc *crtc)
 EXPORT_SYMBOL(drm_crtc_vblank_reset);
 
 /**
+ * drm_crtc_set_max_vblank_count - configure the hw max vblank counter value
+ * @crtc: CRTC in question
+ * @max_vblank_count: max hardware vblank counter value
+ *
+ * Update the maximum hardware vblank counter value for @crtc
+ * at runtime. Useful for hardware where the operation of the
+ * hardware vblank counter depends on the currently active
+ * display configuration.
+ *
+ * For example, if the hardware vblank counter does not work
+ * when a specific connector is active, the maximum can be set
+ * to zero. When that connector is not active, the maximum can
+ * again be set to the appropriate non-zero value.
+ *
+ * If used, must be called before drm_vblank_on().
+ */
+void drm_crtc_set_max_vblank_count(struct drm_crtc *crtc,
+				   u32 max_vblank_count)
+{
+	struct drm_device *dev = crtc->dev;
+	unsigned int pipe = drm_crtc_index(crtc);
+	struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
+
+	WARN_ON(dev->max_vblank_count);
+	WARN_ON(!READ_ONCE(vblank->inmodeset));
+
+	vblank->max_vblank_count = max_vblank_count;
+}
+EXPORT_SYMBOL(drm_crtc_set_max_vblank_count);
+
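Per the kerneldoc, the call must land while vblanks are off, i.e. before drm_crtc_vblank_on(). A hypothetical enable path showing the intended call site (the counter-capability check is made up for illustration):

	static void foo_crtc_atomic_enable(struct drm_crtc *crtc,
					   struct drm_crtc_state *old_state)
	{
		/* hypothetical: this output type breaks the HW frame counter */
		if (foo_output_breaks_frame_counter(crtc))
			drm_crtc_set_max_vblank_count(crtc, 0);
		else
			drm_crtc_set_max_vblank_count(crtc, 0xffffff);

		/* must come after the max count is configured */
		drm_crtc_vblank_on(crtc);
	}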
+/**
  * drm_crtc_vblank_on - enable vblank events on a CRTC
  * @crtc: CRTC in question
  *
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.h b/drivers/gpu/drm/etnaviv/etnaviv_drv.h
index 4bf698de5996..a6a7ded37ef1 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_drv.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.h
@@ -21,7 +21,6 @@
 #include <linux/mm_types.h>
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_gem.h>
 #include <drm/etnaviv_drm.h>
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index 1fa74226db91..5c48915f492d 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -449,7 +449,7 @@ static void etnaviv_gem_describe_fence(struct dma_fence *fence,
 	const char *type, struct seq_file *m)
 {
 	if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
-		seq_printf(m, "\t%9s: %s %s seq %u\n",
+		seq_printf(m, "\t%9s: %s %s seq %llu\n",
 			   type,
 			   fence->ops->get_driver_name(fence),
 			   fence->ops->get_timeline_name(fence),
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 49a6763693f1..67ae26602024 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -109,16 +109,19 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
 	}
 
 	/* block scheduler */
-	kthread_park(gpu->sched.thread);
-	drm_sched_hw_job_reset(&gpu->sched, sched_job);
+	drm_sched_stop(&gpu->sched);
+
+	if (sched_job)
+		drm_sched_increase_karma(sched_job);
 
 	/* get the GPU back into the init state */
 	etnaviv_core_dump(gpu);
 	etnaviv_gpu_recover_hang(gpu);
 
+	drm_sched_resubmit_jobs(&gpu->sched);
+
 	/* restart scheduler after GPU is usable again */
-	drm_sched_job_recovery(&gpu->sched);
-	kthread_unpark(gpu->sched.thread);
+	drm_sched_start(&gpu->sched, true);
 }
 
 static void etnaviv_sched_free_job(struct drm_sched_job *sched_job)
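The etnaviv conversion above follows the reworked scheduler API's canonical timeout-recovery order. In outline, with the driver specifics elided and names hypothetical:

	static void foo_sched_timedout_job(struct drm_sched_job *sched_job)
	{
		struct foo_gpu *gpu = foo_job_to_gpu(sched_job); /* hypothetical */

		drm_sched_stop(&gpu->sched);		/* park the scheduler */

		if (sched_job)
			drm_sched_increase_karma(sched_job); /* blame the job */

		foo_gpu_reset(gpu);			/* hypothetical HW reset */

		drm_sched_resubmit_jobs(&gpu->sched);	/* replay pending jobs */
		drm_sched_start(&gpu->sched, true);	/* unpark, restart timeouts */
	}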
diff --git a/drivers/gpu/drm/exynos/exynos_dp.c b/drivers/gpu/drm/exynos/exynos_dp.c
index c8449ae4f4fe..471242a5e580 100644
--- a/drivers/gpu/drm/exynos/exynos_dp.c
+++ b/drivers/gpu/drm/exynos/exynos_dp.c
@@ -22,10 +22,11 @@
 #include <video/videomode.h>
 
 #include <drm/drmP.h>
+#include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 
 #include <drm/bridge/analogix_dp.h>
 #include <drm/exynos_drm.h>
diff --git a/drivers/gpu/drm/exynos/exynos_drm_crtc.c b/drivers/gpu/drm/exynos/exynos_drm_crtc.c
index 2696289ecc78..96ee83a798c4 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_crtc.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_crtc.c
@@ -13,10 +13,10 @@
  */
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_encoder.h>
+#include <drm/drm_probe_helper.h>
 
 #include "exynos_drm_crtc.h"
 #include "exynos_drm_drv.h"
diff --git a/drivers/gpu/drm/exynos/exynos_drm_dpi.c b/drivers/gpu/drm/exynos/exynos_drm_dpi.c
index 2f0babb67c51..ae425c9a3f7b 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dpi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dpi.c
@@ -11,9 +11,9 @@
 */
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
-#include <drm/drm_panel.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 
 #include <linux/of_graph.h>
 #include <linux/regulator/consumer.h>
diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
index 2c75e789b2a7..e1ef9dc9ebf3 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
@@ -15,8 +15,8 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include <linux/component.h>
 
diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
index d81e62ae286a..a4253dd55f86 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
@@ -13,11 +13,11 @@
 #include <asm/unaligned.h>
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_atomic_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_mipi_dsi.h>
 #include <drm/drm_panel.h>
-#include <drm/drm_atomic_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include <linux/clk.h>
 #include <linux/gpio/consumer.h>
diff --git a/drivers/gpu/drm/exynos/exynos_drm_fb.c b/drivers/gpu/drm/exynos/exynos_drm_fb.c
index 31eb538a44ae..1f11ab0f8e9d 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fb.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fb.c
@@ -13,12 +13,12 @@
  */
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
-#include <drm/drm_fb_helper.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_crtc.h>
+#include <drm/drm_fb_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <uapi/drm/exynos_drm.h>
 
 #include "exynos_drm_drv.h"
diff --git a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
index ce9604ca8041..c30dd88cdb25 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
@@ -15,7 +15,7 @@
 #include <drm/drmP.h>
 #include <drm/drm_crtc.h>
 #include <drm/drm_fb_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/exynos_drm.h>
 
 #include <linux/console.h>
@@ -88,7 +88,6 @@ static int exynos_drm_fbdev_update(struct drm_fb_helper *helper,
 	}
 
 	fbi->par = helper;
-	fbi->flags = FBINFO_FLAG_DEFAULT;
 	fbi->fbops = &exynos_drm_fb_ops;
 
 	drm_fb_helper_fill_fix(fbi, fb->pitches[0], fb->format->depth);
diff --git a/drivers/gpu/drm/exynos/exynos_drm_mic.c b/drivers/gpu/drm/exynos/exynos_drm_mic.c
index 2fd299a58297..dd02e8a323ef 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_mic.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_mic.c
@@ -246,8 +246,8 @@ already_disabled:
 }
 
 static void mic_mode_set(struct drm_bridge *bridge,
-			struct drm_display_mode *mode,
-			struct drm_display_mode *adjusted_mode)
+			 const struct drm_display_mode *mode,
+			 const struct drm_display_mode *adjusted_mode)
 {
 	struct exynos_mic *mic = bridge->driver_private;
 
diff --git a/drivers/gpu/drm/exynos/exynos_drm_rotator.c b/drivers/gpu/drm/exynos/exynos_drm_rotator.c
index 8d67b2a54be3..05abfed6f7f8 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_rotator.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_rotator.c
@@ -356,6 +356,11 @@ static int rotator_runtime_resume(struct device *dev)
 }
 #endif
 
+static const struct drm_exynos_ipp_limit rotator_s5pv210_rbg888_limits[] = {
+	{ IPP_SIZE_LIMIT(BUFFER, .h = { 8, SZ_16K }, .v = { 8, SZ_16K }) },
+	{ IPP_SIZE_LIMIT(AREA, .h.align = 2, .v.align = 2) },
+};
+
 static const struct drm_exynos_ipp_limit rotator_4210_rbg888_limits[] = {
 	{ IPP_SIZE_LIMIT(BUFFER, .h = { 8, SZ_16K }, .v = { 8, SZ_16K }) },
 	{ IPP_SIZE_LIMIT(AREA, .h.align = 4, .v.align = 4) },
@@ -371,6 +376,11 @@ static const struct drm_exynos_ipp_limit rotator_5250_rbg888_limits[] = {
 	{ IPP_SIZE_LIMIT(AREA, .h.align = 2, .v.align = 2) },
 };
 
+static const struct drm_exynos_ipp_limit rotator_s5pv210_yuv_limits[] = {
+	{ IPP_SIZE_LIMIT(BUFFER, .h = { 32, SZ_64K }, .v = { 32, SZ_64K }) },
+	{ IPP_SIZE_LIMIT(AREA, .h.align = 8, .v.align = 8) },
+};
+
 static const struct drm_exynos_ipp_limit rotator_4210_yuv_limits[] = {
 	{ IPP_SIZE_LIMIT(BUFFER, .h = { 32, SZ_64K }, .v = { 32, SZ_64K }) },
 	{ IPP_SIZE_LIMIT(AREA, .h.align = 8, .v.align = 8) },
@@ -381,6 +391,11 @@ static const struct drm_exynos_ipp_limit rotator_4412_yuv_limits[] = {
 	{ IPP_SIZE_LIMIT(AREA, .h.align = 8, .v.align = 8) },
 };
 
+static const struct exynos_drm_ipp_formats rotator_s5pv210_formats[] = {
+	{ IPP_SRCDST_FORMAT(XRGB8888, rotator_s5pv210_rbg888_limits) },
+	{ IPP_SRCDST_FORMAT(NV12, rotator_s5pv210_yuv_limits) },
+};
+
 static const struct exynos_drm_ipp_formats rotator_4210_formats[] = {
 	{ IPP_SRCDST_FORMAT(XRGB8888, rotator_4210_rbg888_limits) },
 	{ IPP_SRCDST_FORMAT(NV12, rotator_4210_yuv_limits) },
@@ -396,6 +411,11 @@ static const struct exynos_drm_ipp_formats rotator_5250_formats[] = {
 	{ IPP_SRCDST_FORMAT(NV12, rotator_4412_yuv_limits) },
 };
 
+static const struct rot_variant rotator_s5pv210_data = {
+	.formats = rotator_s5pv210_formats,
+	.num_formats = ARRAY_SIZE(rotator_s5pv210_formats),
+};
+
 static const struct rot_variant rotator_4210_data = {
 	.formats = rotator_4210_formats,
 	.num_formats = ARRAY_SIZE(rotator_4210_formats),
@@ -413,6 +433,9 @@ static const struct rot_variant rotator_5250_data = {
 
 static const struct of_device_id exynos_rotator_match[] = {
 	{
+		.compatible = "samsung,s5pv210-rotator",
+		.data = &rotator_s5pv210_data,
+	}, {
 		.compatible = "samsung,exynos4210-rotator",
 		.data = &rotator_4210_data,
 	}, {
diff --git a/drivers/gpu/drm/exynos/exynos_drm_scaler.c b/drivers/gpu/drm/exynos/exynos_drm_scaler.c
index 71270efa64f3..ed1dd1aec902 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_scaler.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_scaler.c
@@ -1,7 +1,7 @@
 /*
  * Copyright (C) 2017 Samsung Electronics Co.Ltd
  * Author:
- *	Andrzej Pietrasiewicz <andrzej.p@samsung.com>
+ *	Andrzej Pietrasiewicz <andrzejtp2010@gmail.com>
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
diff --git a/drivers/gpu/drm/exynos/exynos_drm_vidi.c b/drivers/gpu/drm/exynos/exynos_drm_vidi.c
index 19697c1362d8..29f4c1932aed 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_vidi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_vidi.c
@@ -19,9 +19,9 @@
 
 #include <drm/exynos_drm.h>
 
-#include <drm/drm_edid.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_edid.h>
+#include <drm/drm_probe_helper.h>
 
 #include "exynos_drm_drv.h"
 #include "exynos_drm_crtc.h"
diff --git a/drivers/gpu/drm/exynos/exynos_hdmi.c b/drivers/gpu/drm/exynos/exynos_hdmi.c
index 2092a650df7d..8e2c02fc66e8 100644
--- a/drivers/gpu/drm/exynos/exynos_hdmi.c
+++ b/drivers/gpu/drm/exynos/exynos_hdmi.c
@@ -15,9 +15,9 @@
  */
 
 #include <drm/drmP.h>
-#include <drm/drm_edid.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_edid.h>
+#include <drm/drm_probe_helper.h>
 
 #include "regs-hdmi.h"
 
@@ -819,7 +819,8 @@ static void hdmi_reg_infoframes(struct hdmi_context *hdata)
 		return;
 	}
 
-	ret = drm_hdmi_avi_infoframe_from_display_mode(&frm.avi, m, false);
+	ret = drm_hdmi_avi_infoframe_from_display_mode(&frm.avi,
+						       &hdata->connector, m);
 	if (!ret)
 		ret = hdmi_avi_infoframe_pack(&frm.avi, buf, sizeof(buf));
 	if (ret > 0) {
diff --git a/drivers/gpu/drm/exynos/regs-scaler.h b/drivers/gpu/drm/exynos/regs-scaler.h
index fc7ccad75e74..512a2baced11 100644
--- a/drivers/gpu/drm/exynos/regs-scaler.h
+++ b/drivers/gpu/drm/exynos/regs-scaler.h
@@ -2,7 +2,7 @@
  *
  * Copyright (c) 2017 Samsung Electronics Co., Ltd.
  *		http://www.samsung.com/
- * Author: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
+ * Author: Andrzej Pietrasiewicz <andrzejtp2010@gmail.com>
  *
  * Register definition file for Samsung scaler driver
  *
diff --git a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_crtc.c b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_crtc.c
index 18afc94e4dff..bf256971063d 100644
--- a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_crtc.c
+++ b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_crtc.c
@@ -16,7 +16,7 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <video/videomode.h>
 
 #include "fsl_dcu_drm_crtc.h"
diff --git a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_drv.c b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_drv.c
index ceddc3e29258..dfc73aade325 100644
--- a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_drv.c
+++ b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_drv.c
@@ -24,11 +24,11 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_modeset_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "fsl_dcu_drm_crtc.h"
 #include "fsl_dcu_drm_drv.h"
@@ -137,7 +137,7 @@ static irqreturn_t fsl_dcu_drm_irq(int irq, void *arg)
 DEFINE_DRM_GEM_CMA_FOPS(fsl_dcu_drm_fops);
 
 static struct drm_driver fsl_dcu_drm_driver = {
-	.driver_features	= DRIVER_HAVE_IRQ | DRIVER_GEM | DRIVER_MODESET
+	.driver_features	= DRIVER_GEM | DRIVER_MODESET
 				| DRIVER_PRIME | DRIVER_ATOMIC,
 	.load			= fsl_dcu_load,
 	.unload			= fsl_dcu_unload,
diff --git a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_kms.c b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_kms.c
index ddc68e476a4d..e447f7d0c304 100644
--- a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_kms.c
+++ b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_kms.c
@@ -11,9 +11,9 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "fsl_dcu_drm_crtc.h"
 #include "fsl_dcu_drm_drv.h"
diff --git a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c
index 9554b245746e..2a9e8a82c06a 100644
--- a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c
+++ b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c
@@ -14,10 +14,10 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "fsl_dcu_drm_drv.h"
 #include "fsl_dcu_drm_plane.h"
diff --git a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_rgb.c b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_rgb.c
index 2298ed2a9e1c..0a3a62b08240 100644
--- a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_rgb.c
+++ b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_rgb.c
@@ -14,9 +14,9 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 
 #include "fsl_dcu_drm_drv.h"
 #include "fsl_tcon.h"
diff --git a/drivers/gpu/drm/gma500/framebuffer.c b/drivers/gpu/drm/gma500/framebuffer.c
index adefae58b5fc..c934b3df1f81 100644
--- a/drivers/gpu/drm/gma500/framebuffer.c
+++ b/drivers/gpu/drm/gma500/framebuffer.c
@@ -405,7 +405,6 @@ static int psbfb_create(struct psb_fbdev *fbdev,
 	drm_fb_helper_fill_fix(info, fb->pitches[0], fb->format->depth);
 	strcpy(info->fix.id, "psbdrmfb");
 
-	info->flags = FBINFO_DEFAULT;
 	if (dev_priv->ops->accel_2d && pitch_lines > 8)	/* 2D engine */
 		info->fbops = &psbfb_ops;
 	else if (gtt_roll) {	/* GTT rolling seems best */
diff --git a/drivers/gpu/drm/gma500/psb_drv.c b/drivers/gpu/drm/gma500/psb_drv.c
index ac32ab5aa002..eefaf4daff2b 100644
--- a/drivers/gpu/drm/gma500/psb_drv.c
+++ b/drivers/gpu/drm/gma500/psb_drv.c
@@ -468,8 +468,7 @@ static const struct file_operations psb_gem_fops = {
 };
 
 static struct drm_driver driver = {
-	.driver_features = DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED | \
-			   DRIVER_MODESET | DRIVER_GEM,
+	.driver_features = DRIVER_MODESET | DRIVER_GEM,
 	.load = psb_driver_load,
 	.unload = psb_driver_unload,
 	.lastclose = drm_fb_helper_lastclose,
diff --git a/drivers/gpu/drm/gma500/psb_intel_drv.h b/drivers/gpu/drm/gma500/psb_intel_drv.h
index e05e5399af2d..8280a923b916 100644
--- a/drivers/gpu/drm/gma500/psb_intel_drv.h
+++ b/drivers/gpu/drm/gma500/psb_intel_drv.h
@@ -24,6 +24,7 @@
 #include <drm/drm_crtc.h>
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_encoder.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/gpio.h>
 #include "gma_display.h"
 
diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
index a956545774a3..9316b724e7a2 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
@@ -18,8 +18,8 @@
 
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "hibmc_drm_drv.h"
 #include "hibmc_drm_regs.h"
diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
index 68c0c297b3a5..8ed94fcd42a7 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
@@ -20,7 +20,7 @@
 #include <linux/module.h>
 
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "hibmc_drm_drv.h"
 #include "hibmc_drm_regs.h"
@@ -56,8 +56,7 @@ static irqreturn_t hibmc_drm_interrupt(int irq, void *arg)
 }
 
 static struct drm_driver hibmc_driver = {
-	.driver_features	= DRIVER_GEM | DRIVER_MODESET |
-				  DRIVER_ATOMIC | DRIVER_HAVE_IRQ,
+	.driver_features	= DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC,
 	.fops			= &hibmc_fops,
 	.name			= "hibmc",
 	.date			= "20160828",
diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
index edcca1761500..de9d7cc97e44 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
@@ -17,8 +17,8 @@
  */
 
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "hibmc_drm_drv.h"
 
diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_vdac.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_vdac.c
index 744956cea749..d2cf7317930a 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_vdac.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_vdac.c
@@ -17,7 +17,7 @@
  */
 
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "hibmc_drm_drv.h"
 #include "hibmc_drm_regs.h"
diff --git a/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c b/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
index b4c7af3ab6ae..3d6c45097f51 100644
--- a/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
+++ b/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
@@ -17,12 +17,17 @@
 
 #include <linux/clk.h>
 #include <linux/component.h>
+#include <linux/delay.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
 
-#include <drm/drm_of.h>
-#include <drm/drm_crtc_helper.h>
-#include <drm/drm_mipi_dsi.h>
-#include <drm/drm_encoder_slave.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_device.h>
+#include <drm/drm_encoder_slave.h>
+#include <drm/drm_mipi_dsi.h>
+#include <drm/drm_of.h>
+#include <drm/drm_print.h>
+#include <drm/drm_probe_helper.h>
 
 #include "dw_dsi_reg.h"
 
diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c
index bb774202a5a1..73611a92d96c 100644
--- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c
+++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c
@@ -23,13 +23,13 @@
 #include <linux/reset.h>
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_plane_helper.h>
-#include <drm/drm_gem_cma_helper.h>
+#include <drm/drm_crtc.h>
 #include <drm/drm_fb_cma_helper.h>
+#include <drm/drm_gem_cma_helper.h>
+#include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "kirin_drm_drv.h"
 #include "kirin_ade_reg.h"
diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c
index e6a62d5a00a3..7cb7c042b93f 100644
--- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c
+++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c
@@ -20,12 +20,13 @@
 #include <linux/of_graph.h>
 
 #include <drm/drmP.h>
-#include <drm/drm_gem_cma_helper.h>
+#include <drm/drm_atomic_helper.h>
 #include <drm/drm_fb_cma_helper.h>
+#include <drm/drm_fb_helper.h>
+#include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
-#include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_of.h>
+#include <drm/drm_probe_helper.h>
 
 #include "kirin_drm_drv.h"
 
@@ -33,32 +34,15 @@ static struct kirin_dc_ops *dc_ops;
 
 static int kirin_drm_kms_cleanup(struct drm_device *dev)
 {
-	struct kirin_drm_private *priv = dev->dev_private;
-
-	if (priv->fbdev) {
-		drm_fbdev_cma_fini(priv->fbdev);
-		priv->fbdev = NULL;
-	}
-
 	drm_kms_helper_poll_fini(dev);
 	dc_ops->cleanup(to_platform_device(dev->dev));
 	drm_mode_config_cleanup(dev);
-	devm_kfree(dev->dev, priv);
-	dev->dev_private = NULL;
 
 	return 0;
 }
 
-static void kirin_fbdev_output_poll_changed(struct drm_device *dev)
-{
-	struct kirin_drm_private *priv = dev->dev_private;
-
-	drm_fbdev_cma_hotplug_event(priv->fbdev);
-}
-
 static const struct drm_mode_config_funcs kirin_drm_mode_config_funcs = {
 	.fb_create = drm_gem_fb_create,
-	.output_poll_changed = kirin_fbdev_output_poll_changed,
 	.atomic_check = drm_atomic_helper_check,
 	.atomic_commit = drm_atomic_helper_commit,
 };
@@ -76,14 +60,8 @@ static void kirin_drm_mode_config_init(struct drm_device *dev)
 
 static int kirin_drm_kms_init(struct drm_device *dev)
 {
-	struct kirin_drm_private *priv;
 	int ret;
 
-	priv = devm_kzalloc(dev->dev, sizeof(*priv), GFP_KERNEL);
-	if (!priv)
-		return -ENOMEM;
-
-	dev->dev_private = priv;
 	dev_set_drvdata(dev->dev, dev);
 
 	/* dev->mode_config initialization */
@@ -117,26 +95,14 @@ static int kirin_drm_kms_init(struct drm_device *dev)
 	/* init kms poll for handling hpd */
 	drm_kms_helper_poll_init(dev);
 
-	priv->fbdev = drm_fbdev_cma_init(dev, 32,
-					 dev->mode_config.num_connector);
-
-	if (IS_ERR(priv->fbdev)) {
-		DRM_ERROR("failed to initialize fbdev.\n");
-		ret = PTR_ERR(priv->fbdev);
-		goto err_cleanup_poll;
-	}
 	return 0;
 
-err_cleanup_poll:
-	drm_kms_helper_poll_fini(dev);
 err_unbind_all:
 	component_unbind_all(dev->dev, dev);
 err_dc_cleanup:
 	dc_ops->cleanup(to_platform_device(dev->dev));
 err_mode_config_cleanup:
 	drm_mode_config_cleanup(dev);
-	devm_kfree(dev->dev, priv);
-	dev->dev_private = NULL;
 
 	return ret;
 }
@@ -199,6 +165,8 @@ static int kirin_drm_bind(struct device *dev)
 	if (ret)
 		goto err_kms_cleanup;
 
+	drm_fbdev_generic_setup(drm_dev, 32);
+
 	return 0;
 
 err_kms_cleanup:
diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.h b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.h
index 56cb62df065c..ad027d1cc826 100644
--- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.h
+++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.h
@@ -19,10 +19,6 @@ struct kirin_dc_ops {
 	void (*cleanup)(struct platform_device *pdev);
 };
 
-struct kirin_drm_private {
-	struct drm_fbdev_cma *fbdev;
-};
-
 extern const struct kirin_dc_ops ade_dc_ops;
 
 #endif /* __KIRIN_DRM_DRV_H__ */
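
The kirin conversion above drops the driver-private CMA fbdev (and with it
the output_poll_changed hook and struct kirin_drm_private) in favor of the
generic fbdev emulation. A minimal sketch of the resulting pattern, assuming
a driver that has just registered its DRM device and prefers 32 bpp:

	ret = drm_dev_register(drm_dev, 0);
	if (ret)
		return ret;

	/* generic fbdev emulation replaces the drm_fbdev_cma_init()/_fini()
	 * and hotplug plumbing; teardown is handled by the helper itself
	 */
	drm_fbdev_generic_setup(drm_dev, 32);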
diff --git a/drivers/gpu/drm/i2c/ch7006_drv.c b/drivers/gpu/drm/i2c/ch7006_drv.c
index 544a8a2d3562..b91e48d2190d 100644
--- a/drivers/gpu/drm/i2c/ch7006_drv.c
+++ b/drivers/gpu/drm/i2c/ch7006_drv.c
@@ -359,10 +359,10 @@ static int ch7006_encoder_set_property(struct drm_encoder *encoder,
 	if (modes_changed) {
 		drm_helper_probe_single_connector_modes(connector, 0, 0);
 
-		/* Disable the crtc to ensure a full modeset is
-		 * performed whenever it's turned on again. */
 		if (crtc)
-			drm_crtc_force_disable(crtc);
+			drm_crtc_helper_set_mode(crtc, &crtc->mode,
+						 crtc->x, crtc->y,
+						 crtc->primary->fb);
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/i2c/ch7006_priv.h b/drivers/gpu/drm/i2c/ch7006_priv.h
index dc6414af5d79..b6e091935977 100644
--- a/drivers/gpu/drm/i2c/ch7006_priv.h
+++ b/drivers/gpu/drm/i2c/ch7006_priv.h
@@ -30,6 +30,7 @@
 #include <drm/drmP.h>
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_encoder_slave.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/i2c/ch7006.h>
 
 typedef int64_t fixed;
diff --git a/drivers/gpu/drm/i2c/sil164_drv.c b/drivers/gpu/drm/i2c/sil164_drv.c
index c52d7a3af786..878ba8d06ce2 100644
--- a/drivers/gpu/drm/i2c/sil164_drv.c
+++ b/drivers/gpu/drm/i2c/sil164_drv.c
@@ -27,8 +27,8 @@
 #include <linux/module.h>
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_encoder_slave.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/i2c/sil164.h>
 
 struct sil164_priv {
diff --git a/drivers/gpu/drm/i2c/tda998x_drv.c b/drivers/gpu/drm/i2c/tda998x_drv.c
index a7c39f39793f..7f34601bb515 100644
--- a/drivers/gpu/drm/i2c/tda998x_drv.c
+++ b/drivers/gpu/drm/i2c/tda998x_drv.c
@@ -26,9 +26,9 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_of.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/i2c/tda998x.h>
 
 #include <media/cec-notifier.h>
@@ -845,11 +845,12 @@ static int tda998x_write_aif(struct tda998x_priv *priv,
 }
 
 static void
-tda998x_write_avi(struct tda998x_priv *priv, struct drm_display_mode *mode)
+tda998x_write_avi(struct tda998x_priv *priv, const struct drm_display_mode *mode)
 {
 	union hdmi_infoframe frame;
 
-	drm_hdmi_avi_infoframe_from_display_mode(&frame.avi, mode, false);
+	drm_hdmi_avi_infoframe_from_display_mode(&frame.avi,
+						 &priv->connector, mode);
 	frame.avi.quantization_range = HDMI_QUANTIZATION_RANGE_FULL;
 
 	tda998x_write_if(priv, DIP_IF_FLAGS_IF2, REG_IF2_HB0, &frame);
@@ -1122,7 +1123,6 @@ static void tda998x_connector_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs tda998x_connector_funcs = {
-	.dpms = drm_helper_connector_dpms,
 	.reset = drm_atomic_helper_connector_reset,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.detect = tda998x_connector_detect,
@@ -1339,8 +1339,8 @@ static void tda998x_bridge_disable(struct drm_bridge *bridge)
 }
 
 static void tda998x_bridge_mode_set(struct drm_bridge *bridge,
-				    struct drm_display_mode *mode,
-				    struct drm_display_mode *adjusted_mode)
+				    const struct drm_display_mode *mode,
+				    const struct drm_display_mode *adjusted_mode)
 {
 	struct tda998x_priv *priv = bridge_to_tda998x_priv(bridge);
 	unsigned long tmds_clock;
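
The exynos_hdmi and tda998x hunks above reflect the same core API change:
drm_hdmi_avi_infoframe_from_display_mode() now takes the connector instead
of an is_hdmi2_sink flag, so the helper can derive sink capabilities itself.
A sketch of the updated calling convention, assuming "connector" and "mode"
are whatever the encoder holds at mode-set time:

	union hdmi_infoframe frame;
	int ret;

	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi,
						       connector, mode);
	if (ret < 0)
		return ret;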
diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
index 9e36ffb5eb7c..ad4d71161dda 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -21,11 +21,11 @@ config DRM_I915_DEBUG
         select DEBUG_FS
         select PREEMPT_COUNT
         select I2C_CHARDEV
+        select STACKDEPOT
         select DRM_DP_AUX_CHARDEV
         select X86_MSR # used by igt/pm_rpm
         select DRM_VGEM # used by igt/prime_vgem (dmabuf interop checks)
         select DRM_DEBUG_MM if DRM=y
-        select STACKDEPOT if DRM=y # for DRM_DEBUG_MM
 	select DRM_DEBUG_SELFTEST
 	select SW_SYNC # signaling validation framework (igt/syncobj*)
 	select DRM_I915_SW_FENCE_DEBUG_OBJECTS
@@ -173,6 +173,7 @@ config DRM_I915_DEBUG_RUNTIME_PM
 	bool "Enable extra state checking for runtime PM"
 	depends on DRM_I915
 	default n
+	select STACKDEPOT
 	help
 	  Choose this option to turn on extra state checking for the
 	  runtime PM functionality. This may introduce overhead during
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 19b5fe5016bf..1787e1299b1b 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -22,6 +22,7 @@ subdir-ccflags-y += $(call cc-disable-warning, unused-but-set-variable)
 subdir-ccflags-y += $(call cc-disable-warning, sign-compare)
 subdir-ccflags-y += $(call cc-disable-warning, sometimes-uninitialized)
 subdir-ccflags-y += $(call cc-disable-warning, initializer-overrides)
+subdir-ccflags-y += $(call cc-disable-warning, uninitialized)
 subdir-ccflags-$(CONFIG_DRM_I915_WERROR) += -Werror
 
 # Fine grained warnings disable
@@ -40,9 +41,10 @@ i915-y := i915_drv.o \
 	  i915_mm.o \
 	  i915_params.o \
 	  i915_pci.o \
-          i915_suspend.o \
-	  i915_syncmap.o \
+	  i915_reset.o \
+	  i915_suspend.o \
 	  i915_sw_fence.o \
+	  i915_syncmap.o \
 	  i915_sysfs.o \
 	  intel_csr.o \
 	  intel_device_info.o \
@@ -55,7 +57,9 @@ i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o intel_pipe_crc.o
 i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
 
 # GEM code
-i915-y += i915_cmd_parser.o \
+i915-y += \
+	  i915_active.o \
+	  i915_cmd_parser.o \
 	  i915_gem_batch_pool.o \
 	  i915_gem_clflush.o \
 	  i915_gem_context.o \
@@ -166,6 +170,7 @@ i915-$(CONFIG_DRM_I915_SELFTEST) += \
 	selftests/i915_random.o \
 	selftests/i915_selftest.o \
 	selftests/igt_flush_test.o \
+	selftests/igt_live_test.o \
 	selftests/igt_reset.o \
 	selftests/igt_spinner.o
 
@@ -198,3 +203,4 @@ endif
 i915-y += intel_lpe_audio.o
 
 obj-$(CONFIG_DRM_I915) += i915.o
+obj-$(CONFIG_DRM_I915_GVT_KVMGT) += gvt/kvmgt.o
diff --git a/drivers/gpu/drm/i915/dvo.h b/drivers/gpu/drm/i915/dvo.h
index 5e6a3013da49..16e0345b711f 100644
--- a/drivers/gpu/drm/i915/dvo.h
+++ b/drivers/gpu/drm/i915/dvo.h
@@ -24,7 +24,6 @@
 #define _INTEL_DVO_H
 
 #include <linux/i2c.h>
-#include <drm/drmP.h>
 #include <drm/drm_crtc.h>
 #include "intel_drv.h"
 
diff --git a/drivers/gpu/drm/i915/gvt/Makefile b/drivers/gpu/drm/i915/gvt/Makefile
index b016dc753db9..271fb46d4dd0 100644
--- a/drivers/gpu/drm/i915/gvt/Makefile
+++ b/drivers/gpu/drm/i915/gvt/Makefile
@@ -7,4 +7,3 @@ GVT_SOURCE := gvt.o aperture_gm.o handlers.o vgpu.o trace_points.o firmware.o \
 
 ccflags-y				+= -I$(src) -I$(src)/$(GVT_DIR)
 i915-y					+= $(addprefix $(GVT_DIR)/, $(GVT_SOURCE))
-obj-$(CONFIG_DRM_I915_GVT_KVMGT)	+= $(GVT_DIR)/kvmgt.o
diff --git a/drivers/gpu/drm/i915/gvt/aperture_gm.c b/drivers/gpu/drm/i915/gvt/aperture_gm.c
index 359d37d5c958..1fa2f65c3cd1 100644
--- a/drivers/gpu/drm/i915/gvt/aperture_gm.c
+++ b/drivers/gpu/drm/i915/gvt/aperture_gm.c
@@ -180,7 +180,7 @@ static void free_vgpu_fence(struct intel_vgpu *vgpu)
 	}
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put_unchecked(dev_priv);
 }
 
 static int alloc_vgpu_fence(struct intel_vgpu *vgpu)
@@ -206,7 +206,7 @@ static int alloc_vgpu_fence(struct intel_vgpu *vgpu)
 	_clear_vgpu_fence(vgpu);
 
 	mutex_unlock(&dev_priv->drm.struct_mutex);
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put_unchecked(dev_priv);
 	return 0;
 out_free_fence:
 	gvt_vgpu_err("Failed to alloc fences\n");
@@ -219,7 +219,7 @@ out_free_fence:
 		vgpu->fence.regs[i] = NULL;
 	}
 	mutex_unlock(&dev_priv->drm.struct_mutex);
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put_unchecked(dev_priv);
 	return -ENOSPC;
 }
 
@@ -317,7 +317,7 @@ void intel_vgpu_reset_resource(struct intel_vgpu *vgpu)
 
 	intel_runtime_pm_get(dev_priv);
 	_clear_vgpu_fence(vgpu);
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put_unchecked(dev_priv);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/gvt/cmd_parser.c b/drivers/gpu/drm/i915/gvt/cmd_parser.c
index 77ae634eb11c..35b4ec3f7618 100644
--- a/drivers/gpu/drm/i915/gvt/cmd_parser.c
+++ b/drivers/gpu/drm/i915/gvt/cmd_parser.c
@@ -55,10 +55,10 @@ struct sub_op_bits {
 	int low;
 };
 struct decode_info {
-	char *name;
+	const char *name;
 	int op_len;
 	int nr_sub_op;
-	struct sub_op_bits *sub_op;
+	const struct sub_op_bits *sub_op;
 };
 
 #define   MAX_CMD_BUDGET			0x7fffffff
@@ -375,7 +375,7 @@ typedef int (*parser_cmd_handler)(struct parser_exec_state *s);
 #define ADDR_FIX_5(x1, x2, x3, x4, x5)  (ADDR_FIX_1(x1) | ADDR_FIX_4(x2, x3, x4, x5))
 
 struct cmd_info {
-	char *name;
+	const char *name;
 	u32 opcode;
 
 #define F_LEN_MASK	(1U<<0)
@@ -399,10 +399,10 @@ struct cmd_info {
 #define R_VECS	(1 << VECS)
 #define R_ALL (R_RCS | R_VCS | R_BCS | R_VECS)
 	/* rings that support this cmd: BLT/RCS/VCS/VECS */
-	uint16_t rings;
+	u16 rings;
 
 	/* devices that support this cmd: SNB/IVB/HSW/... */
-	uint16_t devices;
+	u16 devices;
 
 	/* which DWords are address that need fix up.
 	 * bit 0 means a 32-bit non address operand in command
@@ -412,20 +412,20 @@ struct cmd_info {
 	 * No matter the address length, each address only takes
 	 * one bit in the bitmap.
 	 */
-	uint16_t addr_bitmap;
+	u16 addr_bitmap;
 
 	/* flag == F_LEN_CONST : command length
 	 * flag == F_LEN_VAR : length bias bits
 	 * Note: length is in DWord
 	 */
-	uint8_t	len;
+	u8 len;
 
 	parser_cmd_handler handler;
 };
 
 struct cmd_entry {
 	struct hlist_node hlist;
-	struct cmd_info *info;
+	const struct cmd_info *info;
 };
 
 enum {
@@ -474,7 +474,7 @@ struct parser_exec_state {
 	int saved_buf_addr_type;
 	bool is_ctx_wa;
 
-	struct cmd_info *info;
+	const struct cmd_info *info;
 
 	struct intel_vgpu_workload *workload;
 };
@@ -485,12 +485,12 @@ struct parser_exec_state {
 static unsigned long bypass_scan_mask = 0;
 
 /* ring ALL, type = 0 */
-static struct sub_op_bits sub_op_mi[] = {
+static const struct sub_op_bits sub_op_mi[] = {
 	{31, 29},
 	{28, 23},
 };
 
-static struct decode_info decode_info_mi = {
+static const struct decode_info decode_info_mi = {
 	"MI",
 	OP_LEN_MI,
 	ARRAY_SIZE(sub_op_mi),
@@ -498,12 +498,12 @@ static struct decode_info decode_info_mi = {
 };
 
 /* ring RCS, command type 2 */
-static struct sub_op_bits sub_op_2d[] = {
+static const struct sub_op_bits sub_op_2d[] = {
 	{31, 29},
 	{28, 22},
 };
 
-static struct decode_info decode_info_2d = {
+static const struct decode_info decode_info_2d = {
 	"2D",
 	OP_LEN_2D,
 	ARRAY_SIZE(sub_op_2d),
@@ -511,14 +511,14 @@ static struct decode_info decode_info_2d = {
 };
 
 /* ring RCS, command type 3 */
-static struct sub_op_bits sub_op_3d_media[] = {
+static const struct sub_op_bits sub_op_3d_media[] = {
 	{31, 29},
 	{28, 27},
 	{26, 24},
 	{23, 16},
 };
 
-static struct decode_info decode_info_3d_media = {
+static const struct decode_info decode_info_3d_media = {
 	"3D_Media",
 	OP_LEN_3D_MEDIA,
 	ARRAY_SIZE(sub_op_3d_media),
@@ -526,7 +526,7 @@ static struct decode_info decode_info_3d_media = {
 };
 
 /* ring VCS, command type 3 */
-static struct sub_op_bits sub_op_mfx_vc[] = {
+static const struct sub_op_bits sub_op_mfx_vc[] = {
 	{31, 29},
 	{28, 27},
 	{26, 24},
@@ -534,7 +534,7 @@ static struct sub_op_bits sub_op_mfx_vc[] = {
 	{20, 16},
 };
 
-static struct decode_info decode_info_mfx_vc = {
+static const struct decode_info decode_info_mfx_vc = {
 	"MFX_VC",
 	OP_LEN_MFX_VC,
 	ARRAY_SIZE(sub_op_mfx_vc),
@@ -542,7 +542,7 @@ static struct decode_info decode_info_mfx_vc = {
 };
 
 /* ring VECS, command type 3 */
-static struct sub_op_bits sub_op_vebox[] = {
+static const struct sub_op_bits sub_op_vebox[] = {
 	{31, 29},
 	{28, 27},
 	{26, 24},
@@ -550,14 +550,14 @@ static struct sub_op_bits sub_op_vebox[] = {
 	{20, 16},
 };
 
-static struct decode_info decode_info_vebox = {
+static const struct decode_info decode_info_vebox = {
 	"VEBOX",
 	OP_LEN_VEBOX,
 	ARRAY_SIZE(sub_op_vebox),
 	sub_op_vebox,
 };
 
-static struct decode_info *ring_decode_info[I915_NUM_ENGINES][8] = {
+static const struct decode_info *ring_decode_info[I915_NUM_ENGINES][8] = {
 	[RCS] = {
 		&decode_info_mi,
 		NULL,
@@ -616,7 +616,7 @@ static struct decode_info *ring_decode_info[I915_NUM_ENGINES][8] = {
 
 static inline u32 get_opcode(u32 cmd, int ring_id)
 {
-	struct decode_info *d_info;
+	const struct decode_info *d_info;
 
 	d_info = ring_decode_info[ring_id][CMD_TYPE(cmd)];
 	if (d_info == NULL)
@@ -625,7 +625,7 @@ static inline u32 get_opcode(u32 cmd, int ring_id)
 	return cmd >> (32 - d_info->op_len);
 }
 
-static inline struct cmd_info *find_cmd_entry(struct intel_gvt *gvt,
+static inline const struct cmd_info *find_cmd_entry(struct intel_gvt *gvt,
 		unsigned int opcode, int ring_id)
 {
 	struct cmd_entry *e;
@@ -638,7 +638,7 @@ static inline struct cmd_info *find_cmd_entry(struct intel_gvt *gvt,
 	return NULL;
 }
 
-static inline struct cmd_info *get_cmd_info(struct intel_gvt *gvt,
+static inline const struct cmd_info *get_cmd_info(struct intel_gvt *gvt,
 		u32 cmd, int ring_id)
 {
 	u32 opcode;
@@ -657,7 +657,7 @@ static inline u32 sub_op_val(u32 cmd, u32 hi, u32 low)
 
 static inline void print_opcode(u32 cmd, int ring_id)
 {
-	struct decode_info *d_info;
+	const struct decode_info *d_info;
 	int i;
 
 	d_info = ring_decode_info[ring_id][CMD_TYPE(cmd)];
@@ -776,7 +776,7 @@ static inline int ip_gma_advance(struct parser_exec_state *s,
 	return 0;
 }
 
-static inline int get_cmd_length(struct cmd_info *info, u32 cmd)
+static inline int get_cmd_length(const struct cmd_info *info, u32 cmd)
 {
 	if ((info->flag & F_LEN_MASK) == F_LEN_CONST)
 		return info->len;
@@ -901,7 +901,8 @@ static int cmd_reg_handler(struct parser_exec_state *s,
 	 * It's good enough to support initializing mmio by lri command in
 	 * vgpu inhibit context on KBL.
 	 */
-	if (IS_KABYLAKE(s->vgpu->gvt->dev_priv) &&
+	if ((IS_KABYLAKE(s->vgpu->gvt->dev_priv)
+		|| IS_COFFEELAKE(s->vgpu->gvt->dev_priv)) &&
 			intel_gvt_mmio_is_in_ctx(gvt, offset) &&
 			!strncmp(cmd, "lri", 3)) {
 		intel_gvt_hypervisor_read_gpa(s->vgpu,
@@ -1280,9 +1281,7 @@ static int gen8_check_mi_display_flip(struct parser_exec_state *s,
 	if (!info->async_flip)
 		return 0;
 
-	if (IS_SKYLAKE(dev_priv)
-		|| IS_KABYLAKE(dev_priv)
-		|| IS_BROXTON(dev_priv)) {
+	if (INTEL_GEN(dev_priv) >= 9) {
 		stride = vgpu_vreg_t(s->vgpu, info->stride_reg) & GENMASK(9, 0);
 		tile = (vgpu_vreg_t(s->vgpu, info->ctrl_reg) &
 				GENMASK(12, 10)) >> 10;
@@ -1310,9 +1309,7 @@ static int gen8_update_plane_mmio_from_mi_display_flip(
 
 	set_mask_bits(&vgpu_vreg_t(vgpu, info->surf_reg), GENMASK(31, 12),
 		      info->surf_val << 12);
-	if (IS_SKYLAKE(dev_priv)
-		|| IS_KABYLAKE(dev_priv)
-		|| IS_BROXTON(dev_priv)) {
+	if (INTEL_GEN(dev_priv) >= 9) {
 		set_mask_bits(&vgpu_vreg_t(vgpu, info->stride_reg), GENMASK(9, 0),
 			      info->stride_val);
 		set_mask_bits(&vgpu_vreg_t(vgpu, info->ctrl_reg), GENMASK(12, 10),
@@ -1336,9 +1333,7 @@ static int decode_mi_display_flip(struct parser_exec_state *s,
 
 	if (IS_BROADWELL(dev_priv))
 		return gen8_decode_mi_display_flip(s, info);
-	if (IS_SKYLAKE(dev_priv)
-		|| IS_KABYLAKE(dev_priv)
-		|| IS_BROXTON(dev_priv))
+	if (INTEL_GEN(dev_priv) >= 9)
 		return skl_decode_mi_display_flip(s, info);
 
 	return -ENODEV;
@@ -1643,8 +1638,8 @@ static int batch_buffer_needs_scan(struct parser_exec_state *s)
 static int find_bb_size(struct parser_exec_state *s, unsigned long *bb_size)
 {
 	unsigned long gma = 0;
-	struct cmd_info *info;
-	uint32_t cmd_len = 0;
+	const struct cmd_info *info;
+	u32 cmd_len = 0;
 	bool bb_end = false;
 	struct intel_vgpu *vgpu = s->vgpu;
 	u32 cmd;
@@ -1842,7 +1837,7 @@ static int cmd_handler_mi_batch_buffer_start(struct parser_exec_state *s)
 
 static int mi_noop_index;
 
-static struct cmd_info cmd_info[] = {
+static const struct cmd_info cmd_info[] = {
 	{"MI_NOOP", OP_MI_NOOP, F_LEN_CONST, R_ALL, D_ALL, 0, 1, NULL},
 
 	{"MI_SET_PREDICATE", OP_MI_SET_PREDICATE, F_LEN_CONST, R_ALL, D_ALL,
@@ -2521,7 +2516,7 @@ static void add_cmd_entry(struct intel_gvt *gvt, struct cmd_entry *e)
 static int cmd_parser_exec(struct parser_exec_state *s)
 {
 	struct intel_vgpu *vgpu = s->vgpu;
-	struct cmd_info *info;
+	const struct cmd_info *info;
 	u32 cmd;
 	int ret = 0;
 
@@ -2683,7 +2678,7 @@ static int scan_wa_ctx(struct intel_shadow_wa_ctx *wa_ctx)
 					I915_GTT_PAGE_SIZE)))
 		return -EINVAL;
 
-	ring_tail = wa_ctx->indirect_ctx.size + 3 * sizeof(uint32_t);
+	ring_tail = wa_ctx->indirect_ctx.size + 3 * sizeof(u32);
 	ring_size = round_up(wa_ctx->indirect_ctx.size + CACHELINE_BYTES,
 			PAGE_SIZE);
 	gma_head = wa_ctx->indirect_ctx.guest_gma;
@@ -2850,7 +2845,7 @@ put_obj:
 
 static int combine_wa_ctx(struct intel_shadow_wa_ctx *wa_ctx)
 {
-	uint32_t per_ctx_start[CACHELINE_DWORDS] = {0};
+	u32 per_ctx_start[CACHELINE_DWORDS] = {0};
 	unsigned char *bb_start_sva;
 
 	if (!wa_ctx->per_ctx.valid)
@@ -2895,10 +2890,10 @@ int intel_gvt_scan_and_shadow_wa_ctx(struct intel_shadow_wa_ctx *wa_ctx)
 	return 0;
 }
 
-static struct cmd_info *find_cmd_entry_any_ring(struct intel_gvt *gvt,
+static const struct cmd_info *find_cmd_entry_any_ring(struct intel_gvt *gvt,
 		unsigned int opcode, unsigned long rings)
 {
-	struct cmd_info *info = NULL;
+	const struct cmd_info *info = NULL;
 	unsigned int ring;
 
 	for_each_set_bit(ring, &rings, I915_NUM_ENGINES) {
@@ -2913,7 +2908,7 @@ static int init_cmd_table(struct intel_gvt *gvt)
 {
 	int i;
 	struct cmd_entry *e;
-	struct cmd_info	*info;
+	const struct cmd_info *info;
 	unsigned int gen_type;
 
 	gen_type = intel_gvt_get_device_type(gvt);
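
The cmd_parser hunks also start a pattern repeated through the rest of this
series: open-coded platform lists are replaced by a generation floor, so new
gen9 parts (Coffee Lake here) are covered without touching every check. A
sketch of the idiom, where do_gen9_plus_setup() is a hypothetical stand-in
for whatever the per-platform path does:

	/* was: IS_SKYLAKE() || IS_KABYLAKE() || IS_BROXTON() */
	if (INTEL_GEN(dev_priv) >= 9)
		do_gen9_plus_setup(dev_priv);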
diff --git a/drivers/gpu/drm/i915/gvt/display.c b/drivers/gpu/drm/i915/gvt/display.c
index df1e14145747..035479e273be 100644
--- a/drivers/gpu/drm/i915/gvt/display.c
+++ b/drivers/gpu/drm/i915/gvt/display.c
@@ -198,7 +198,8 @@ static void emulate_monitor_status_change(struct intel_vgpu *vgpu)
 			SDE_PORTC_HOTPLUG_CPT |
 			SDE_PORTD_HOTPLUG_CPT);
 
-	if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv)) {
+	if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv) ||
+	    IS_COFFEELAKE(dev_priv)) {
 		vgpu_vreg_t(vgpu, SDEISR) &= ~(SDE_PORTA_HOTPLUG_SPT |
 				SDE_PORTE_HOTPLUG_SPT);
 		vgpu_vreg_t(vgpu, SKL_FUSE_STATUS) |=
@@ -273,7 +274,8 @@ static void emulate_monitor_status_change(struct intel_vgpu *vgpu)
 		vgpu_vreg_t(vgpu, SFUSE_STRAP) |= SFUSE_STRAP_DDID_DETECTED;
 	}
 
-	if ((IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv)) &&
+	if ((IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv) ||
+	     IS_COFFEELAKE(dev_priv)) &&
 			intel_vgpu_has_monitor_on_port(vgpu, PORT_E)) {
 		vgpu_vreg_t(vgpu, SDEISR) |= SDE_PORTE_HOTPLUG_SPT;
 	}
@@ -340,6 +342,7 @@ static int setup_virtual_dp_monitor(struct intel_vgpu *vgpu, int port_num,
 	port->dpcd->data_valid = true;
 	port->dpcd->data[DPCD_SINK_COUNT] = 0x1;
 	port->type = type;
+	port->id = resolution;
 
 	emulate_monitor_status_change(vgpu);
 
@@ -443,6 +446,36 @@ void intel_gvt_emulate_vblank(struct intel_gvt *gvt)
 }
 
 /**
+ * intel_vgpu_emulate_hotplug - trigger hotplug event for vGPU
+ * @vgpu: a vGPU
+ * @connected: link state
+ *
+ * This function is used to trigger a hotplug interrupt for a vGPU
+ *
+ */
+void intel_vgpu_emulate_hotplug(struct intel_vgpu *vgpu, bool connected)
+{
+	struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
+
+	/* TODO: add support for more platforms */
+	if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv)) {
+		if (connected) {
+			vgpu_vreg_t(vgpu, SFUSE_STRAP) |=
+				SFUSE_STRAP_DDID_DETECTED;
+			vgpu_vreg_t(vgpu, SDEISR) |= SDE_PORTD_HOTPLUG_CPT;
+		} else {
+			vgpu_vreg_t(vgpu, SFUSE_STRAP) &=
+				~SFUSE_STRAP_DDID_DETECTED;
+			vgpu_vreg_t(vgpu, SDEISR) &= ~SDE_PORTD_HOTPLUG_CPT;
+		}
+		vgpu_vreg_t(vgpu, SDEIIR) |= SDE_PORTD_HOTPLUG_CPT;
+		vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) |=
+				PORTD_HOTPLUG_STATUS_MASK;
+		intel_vgpu_trigger_virtual_event(vgpu, DP_D_HOTPLUG);
+	}
+}
+
+/**
  * intel_vgpu_clean_display - clean vGPU virtual display emulation
  * @vgpu: a vGPU
  *
@@ -453,7 +486,8 @@ void intel_vgpu_clean_display(struct intel_vgpu *vgpu)
 {
 	struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
 
-	if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv))
+	if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv) ||
+	    IS_COFFEELAKE(dev_priv))
 		clean_virtual_dp_monitor(vgpu, PORT_D);
 	else
 		clean_virtual_dp_monitor(vgpu, PORT_B);
@@ -476,7 +510,8 @@ int intel_vgpu_init_display(struct intel_vgpu *vgpu, u64 resolution)
 
 	intel_vgpu_init_i2c_edid(vgpu);
 
-	if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv))
+	if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv) ||
+	    IS_COFFEELAKE(dev_priv))
 		return setup_virtual_dp_monitor(vgpu, PORT_D, GVT_DP_D,
 						resolution);
 	else
diff --git a/drivers/gpu/drm/i915/gvt/display.h b/drivers/gpu/drm/i915/gvt/display.h
index ea7c1c525b8c..a87f33e6a23c 100644
--- a/drivers/gpu/drm/i915/gvt/display.h
+++ b/drivers/gpu/drm/i915/gvt/display.h
@@ -146,18 +146,19 @@ enum intel_vgpu_port_type {
 	GVT_PORT_MAX
 };
 
+enum intel_vgpu_edid {
+	GVT_EDID_1024_768,
+	GVT_EDID_1920_1200,
+	GVT_EDID_NUM,
+};
+
 struct intel_vgpu_port {
 	/* per display EDID information */
 	struct intel_vgpu_edid_data *edid;
 	/* per display DPCD information */
 	struct intel_vgpu_dpcd_data *dpcd;
 	int type;
-};
-
-enum intel_vgpu_edid {
-	GVT_EDID_1024_768,
-	GVT_EDID_1920_1200,
-	GVT_EDID_NUM,
+	enum intel_vgpu_edid id;
 };
 
 static inline char *vgpu_edid_str(enum intel_vgpu_edid id)
@@ -172,6 +173,30 @@ static inline char *vgpu_edid_str(enum intel_vgpu_edid id)
 	}
 }
 
+static inline unsigned int vgpu_edid_xres(enum intel_vgpu_edid id)
+{
+	switch (id) {
+	case GVT_EDID_1024_768:
+		return 1024;
+	case GVT_EDID_1920_1200:
+		return 1920;
+	default:
+		return 0;
+	}
+}
+
+static inline unsigned int vgpu_edid_yres(enum intel_vgpu_edid id)
+{
+	switch (id) {
+	case GVT_EDID_1024_768:
+		return 768;
+	case GVT_EDID_1920_1200:
+		return 1200;
+	default:
+		return 0;
+	}
+}
+
 void intel_gvt_emulate_vblank(struct intel_gvt *gvt);
 void intel_gvt_check_vblank_emulation(struct intel_gvt *gvt);
 
diff --git a/drivers/gpu/drm/i915/gvt/dmabuf.c b/drivers/gpu/drm/i915/gvt/dmabuf.c
index 51ed99a37803..3e7e2b80c857 100644
--- a/drivers/gpu/drm/i915/gvt/dmabuf.c
+++ b/drivers/gpu/drm/i915/gvt/dmabuf.c
@@ -29,7 +29,6 @@
  */
 
 #include <linux/dma-buf.h>
-#include <drm/drmP.h>
 #include <linux/vfio.h>
 
 #include "i915_drv.h"
@@ -164,9 +163,7 @@ static struct drm_i915_gem_object *vgpu_create_gem(struct drm_device *dev,
 
 	obj->read_domains = I915_GEM_DOMAIN_GTT;
 	obj->write_domain = 0;
-	if (IS_SKYLAKE(dev_priv)
-		|| IS_KABYLAKE(dev_priv)
-		|| IS_BROXTON(dev_priv)) {
+	if (INTEL_GEN(dev_priv) >= 9) {
 		unsigned int tiling_mode = 0;
 		unsigned int stride = 0;
 
diff --git a/drivers/gpu/drm/i915/gvt/edid.c b/drivers/gpu/drm/i915/gvt/edid.c
index 5d4bb35bb889..1fe6124918f1 100644
--- a/drivers/gpu/drm/i915/gvt/edid.c
+++ b/drivers/gpu/drm/i915/gvt/edid.c
@@ -77,16 +77,32 @@ static unsigned char edid_get_byte(struct intel_vgpu *vgpu)
 	return chr;
 }
 
+static inline int cnp_get_port_from_gmbus0(u32 gmbus0)
+{
+	int port_select = gmbus0 & _GMBUS_PIN_SEL_MASK;
+	int port = -EINVAL;
+
+	if (port_select == GMBUS_PIN_1_BXT)
+		port = PORT_B;
+	else if (port_select == GMBUS_PIN_2_BXT)
+		port = PORT_C;
+	else if (port_select == GMBUS_PIN_3_BXT)
+		port = PORT_D;
+	else if (port_select == GMBUS_PIN_4_CNP)
+		port = PORT_E;
+	return port;
+}
+
 static inline int bxt_get_port_from_gmbus0(u32 gmbus0)
 {
 	int port_select = gmbus0 & _GMBUS_PIN_SEL_MASK;
 	int port = -EINVAL;
 
-	if (port_select == 1)
+	if (port_select == GMBUS_PIN_1_BXT)
 		port = PORT_B;
-	else if (port_select == 2)
+	else if (port_select == GMBUS_PIN_2_BXT)
 		port = PORT_C;
-	else if (port_select == 3)
+	else if (port_select == GMBUS_PIN_3_BXT)
 		port = PORT_D;
 	return port;
 }
@@ -96,13 +112,13 @@ static inline int get_port_from_gmbus0(u32 gmbus0)
 	int port_select = gmbus0 & _GMBUS_PIN_SEL_MASK;
 	int port = -EINVAL;
 
-	if (port_select == 2)
+	if (port_select == GMBUS_PIN_VGADDC)
 		port = PORT_E;
-	else if (port_select == 4)
+	else if (port_select == GMBUS_PIN_DPC)
 		port = PORT_C;
-	else if (port_select == 5)
+	else if (port_select == GMBUS_PIN_DPB)
 		port = PORT_B;
-	else if (port_select == 6)
+	else if (port_select == GMBUS_PIN_DPD)
 		port = PORT_D;
 	return port;
 }
@@ -133,6 +149,8 @@ static int gmbus0_mmio_write(struct intel_vgpu *vgpu,
 
 	if (IS_BROXTON(dev_priv))
 		port = bxt_get_port_from_gmbus0(pin_select);
+	else if (IS_COFFEELAKE(dev_priv))
+		port = cnp_get_port_from_gmbus0(pin_select);
 	else
 		port = get_port_from_gmbus0(pin_select);
 	if (WARN_ON(port < 0))
diff --git a/drivers/gpu/drm/i915/gvt/fb_decoder.c b/drivers/gpu/drm/i915/gvt/fb_decoder.c
index 85e6736f0a32..65e847392aea 100644
--- a/drivers/gpu/drm/i915/gvt/fb_decoder.c
+++ b/drivers/gpu/drm/i915/gvt/fb_decoder.c
@@ -151,9 +151,7 @@ static u32 intel_vgpu_get_stride(struct intel_vgpu *vgpu, int pipe,
 	u32 stride_reg = vgpu_vreg_t(vgpu, DSPSTRIDE(pipe)) & stride_mask;
 	u32 stride = stride_reg;
 
-	if (IS_SKYLAKE(dev_priv)
-		|| IS_KABYLAKE(dev_priv)
-		|| IS_BROXTON(dev_priv)) {
+	if (INTEL_GEN(dev_priv) >= 9) {
 		switch (tiled) {
 		case PLANE_CTL_TILED_LINEAR:
 			stride = stride_reg * 64;
@@ -217,9 +215,7 @@ int intel_vgpu_decode_primary_plane(struct intel_vgpu *vgpu,
 	if (!plane->enabled)
 		return -ENODEV;
 
-	if (IS_SKYLAKE(dev_priv)
-		|| IS_KABYLAKE(dev_priv)
-		|| IS_BROXTON(dev_priv)) {
+	if (INTEL_GEN(dev_priv) >= 9) {
 		plane->tiled = val & PLANE_CTL_TILED_MASK;
 		fmt = skl_format_to_drm(
 			val & PLANE_CTL_FORMAT_MASK,
@@ -260,9 +256,7 @@ int intel_vgpu_decode_primary_plane(struct intel_vgpu *vgpu,
 	}
 
 	plane->stride = intel_vgpu_get_stride(vgpu, pipe, plane->tiled,
-		(IS_SKYLAKE(dev_priv)
-		|| IS_KABYLAKE(dev_priv)
-		|| IS_BROXTON(dev_priv)) ?
+		(INTEL_GEN(dev_priv) >= 9) ?
 			(_PRI_PLANE_STRIDE_MASK >> 6) :
 				_PRI_PLANE_STRIDE_MASK, plane->bpp);
 
diff --git a/drivers/gpu/drm/i915/gvt/gvt.c b/drivers/gpu/drm/i915/gvt/gvt.c
index 733a2a0d0c30..43f4242062dd 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.c
+++ b/drivers/gpu/drm/i915/gvt/gvt.c
@@ -185,54 +185,9 @@ static const struct intel_gvt_ops intel_gvt_ops = {
 	.vgpu_query_plane = intel_vgpu_query_plane,
 	.vgpu_get_dmabuf = intel_vgpu_get_dmabuf,
 	.write_protect_handler = intel_vgpu_page_track_handler,
+	.emulate_hotplug = intel_vgpu_emulate_hotplug,
 };
 
-/**
- * intel_gvt_init_host - Load MPT modules and detect if we're running in host
- *
- * This function is called at the driver loading stage. If failed to find a
- * loadable MPT module or detect currently we're running in a VM, then GVT-g
- * will be disabled
- *
- * Returns:
- * Zero on success, negative error code if failed.
- *
- */
-int intel_gvt_init_host(void)
-{
-	if (intel_gvt_host.initialized)
-		return 0;
-
-	/* Xen DOM U */
-	if (xen_domain() && !xen_initial_domain())
-		return -ENODEV;
-
-	/* Try to load MPT modules for hypervisors */
-	if (xen_initial_domain()) {
-		/* In Xen dom0 */
-		intel_gvt_host.mpt = try_then_request_module(
-				symbol_get(xengt_mpt), "xengt");
-		intel_gvt_host.hypervisor_type = INTEL_GVT_HYPERVISOR_XEN;
-	} else {
-#if IS_ENABLED(CONFIG_DRM_I915_GVT_KVMGT)
-		/* not in Xen. Try KVMGT */
-		intel_gvt_host.mpt = try_then_request_module(
-				symbol_get(kvmgt_mpt), "kvmgt");
-		intel_gvt_host.hypervisor_type = INTEL_GVT_HYPERVISOR_KVM;
-#endif
-	}
-
-	/* Fail to load MPT modules - bail out */
-	if (!intel_gvt_host.mpt)
-		return -EINVAL;
-
-	gvt_dbg_core("Running with hypervisor %s in host mode\n",
-			supported_hypervisors[intel_gvt_host.hypervisor_type]);
-
-	intel_gvt_host.initialized = true;
-	return 0;
-}
-
 static void init_device_info(struct intel_gvt *gvt)
 {
 	struct intel_gvt_device_info *info = &gvt->device_info;
@@ -316,7 +271,6 @@ void intel_gvt_clean_device(struct drm_i915_private *dev_priv)
 		return;
 
 	intel_gvt_destroy_idle_vgpu(gvt->idle_vgpu);
-	intel_gvt_hypervisor_host_exit(&dev_priv->drm.pdev->dev, gvt);
 	intel_gvt_cleanup_vgpu_type_groups(gvt);
 	intel_gvt_clean_vgpu_types(gvt);
 
@@ -352,13 +306,6 @@ int intel_gvt_init_device(struct drm_i915_private *dev_priv)
 	struct intel_vgpu *vgpu;
 	int ret;
 
-	/*
-	 * Cannot initialize GVT device without intel_gvt_host gets
-	 * initialized first.
-	 */
-	if (WARN_ON(!intel_gvt_host.initialized))
-		return -EINVAL;
-
 	if (WARN_ON(dev_priv->gvt))
 		return -EEXIST;
 
@@ -420,13 +367,6 @@ int intel_gvt_init_device(struct drm_i915_private *dev_priv)
 		goto out_clean_types;
 	}
 
-	ret = intel_gvt_hypervisor_host_init(&dev_priv->drm.pdev->dev, gvt,
-				&intel_gvt_ops);
-	if (ret) {
-		gvt_err("failed to register gvt-g host device: %d\n", ret);
-		goto out_clean_types;
-	}
-
 	vgpu = intel_gvt_create_idle_vgpu(gvt);
 	if (IS_ERR(vgpu)) {
 		ret = PTR_ERR(vgpu);
@@ -441,6 +381,8 @@ int intel_gvt_init_device(struct drm_i915_private *dev_priv)
 
 	gvt_dbg_core("gvt device initialization is done\n");
 	dev_priv->gvt = gvt;
+	intel_gvt_host.dev = &dev_priv->drm.pdev->dev;
+	intel_gvt_host.initialized = true;
 	return 0;
 
 out_clean_types:
@@ -467,6 +409,45 @@ out_clean_idr:
 	return ret;
 }
 
-#if IS_ENABLED(CONFIG_DRM_I915_GVT_KVMGT)
-MODULE_SOFTDEP("pre: kvmgt");
-#endif
+int
+intel_gvt_register_hypervisor(struct intel_gvt_mpt *m)
+{
+	int ret;
+	void *gvt;
+
+	if (!intel_gvt_host.initialized)
+		return -ENODEV;
+
+	if (m->type != INTEL_GVT_HYPERVISOR_KVM &&
+	    m->type != INTEL_GVT_HYPERVISOR_XEN)
+		return -EINVAL;
+
+	/* Get a reference for device model module */
+	if (!try_module_get(THIS_MODULE))
+		return -ENODEV;
+
+	intel_gvt_host.mpt = m;
+	intel_gvt_host.hypervisor_type = m->type;
+	gvt = (void *)kdev_to_i915(intel_gvt_host.dev)->gvt;
+
+	ret = intel_gvt_hypervisor_host_init(intel_gvt_host.dev, gvt,
+					     &intel_gvt_ops);
+	if (ret < 0) {
+		gvt_err("Failed to init %s hypervisor module\n",
+			supported_hypervisors[intel_gvt_host.hypervisor_type]);
+		module_put(THIS_MODULE);
+		return -ENODEV;
+	}
+	gvt_dbg_core("Running with hypervisor %s in host mode\n",
+		     supported_hypervisors[intel_gvt_host.hypervisor_type]);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(intel_gvt_register_hypervisor);
+
+void
+intel_gvt_unregister_hypervisor(void)
+{
+	intel_gvt_hypervisor_host_exit(intel_gvt_host.dev);
+	module_put(THIS_MODULE);
+}
+EXPORT_SYMBOL_GPL(intel_gvt_unregister_hypervisor);
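
With intel_gvt_init_host() gone, the module relationship is inverted: the
device model exports register/unregister entry points and an MPT backend
attaches itself from its own module init. A hedged sketch of a minimal
backend ("null_mpt" and its no-op callbacks are illustrative only, includes
omitted; kvmgt below is the real user):

	static int null_host_init(struct device *dev, void *gvt, const void *ops)
	{
		return 0;	/* a real backend wires up its hypervisor here */
	}

	static void null_host_exit(struct device *dev)
	{
	}

	static struct intel_gvt_mpt null_mpt = {
		.type		= INTEL_GVT_HYPERVISOR_KVM,
		.host_init	= null_host_init,
		.host_exit	= null_host_exit,
	};

	static int __init null_mpt_init(void)
	{
		return intel_gvt_register_hypervisor(&null_mpt);
	}
	module_init(null_mpt_init);

	static void __exit null_mpt_exit(void)
	{
		intel_gvt_unregister_hypervisor();
	}
	module_exit(null_mpt_exit);

	MODULE_LICENSE("GPL");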
diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
index b4ab1dad0143..8bce09de4b82 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.h
+++ b/drivers/gpu/drm/i915/gvt/gvt.h
@@ -52,12 +52,8 @@
 
 #define GVT_MAX_VGPU 8
 
-enum {
-	INTEL_GVT_HYPERVISOR_XEN = 0,
-	INTEL_GVT_HYPERVISOR_KVM,
-};
-
 struct intel_gvt_host {
+	struct device *dev;
 	bool initialized;
 	int hypervisor_type;
 	struct intel_gvt_mpt *mpt;
@@ -540,6 +536,8 @@ int intel_vgpu_emulate_cfg_read(struct intel_vgpu *vgpu, unsigned int offset,
 int intel_vgpu_emulate_cfg_write(struct intel_vgpu *vgpu, unsigned int offset,
 		void *p_data, unsigned int bytes);
 
+void intel_vgpu_emulate_hotplug(struct intel_vgpu *vgpu, bool connected);
+
 static inline u64 intel_vgpu_get_bar_gpa(struct intel_vgpu *vgpu, int bar)
 {
 	/* We are 64bit bar. */
@@ -581,6 +579,7 @@ struct intel_gvt_ops {
 	int (*vgpu_get_dmabuf)(struct intel_vgpu *vgpu, unsigned int);
 	int (*write_protect_handler)(struct intel_vgpu *, u64, void *,
 				     unsigned int);
+	void (*emulate_hotplug)(struct intel_vgpu *vgpu, bool connected);
 };
 
 
@@ -597,7 +596,7 @@ static inline void mmio_hw_access_pre(struct drm_i915_private *dev_priv)
 
 static inline void mmio_hw_access_post(struct drm_i915_private *dev_priv)
 {
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put_unchecked(dev_priv);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index e9f343b124b0..bc64b810e0d5 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -57,6 +57,8 @@ unsigned long intel_gvt_get_device_type(struct intel_gvt *gvt)
 		return D_KBL;
 	else if (IS_BROXTON(gvt->dev_priv))
 		return D_BXT;
+	else if (IS_COFFEELAKE(gvt->dev_priv))
+		return D_CFL;
 
 	return 0;
 }
@@ -276,14 +278,12 @@ static int mul_force_wake_write(struct intel_vgpu *vgpu,
 		unsigned int offset, void *p_data, unsigned int bytes)
 {
 	u32 old, new;
-	uint32_t ack_reg_offset;
+	u32 ack_reg_offset;
 
 	old = vgpu_vreg(vgpu, offset);
 	new = CALC_MODE_MASK_REG(old, *(u32 *)p_data);
 
-	if (IS_SKYLAKE(vgpu->gvt->dev_priv)
-		|| IS_KABYLAKE(vgpu->gvt->dev_priv)
-		|| IS_BROXTON(vgpu->gvt->dev_priv)) {
+	if (INTEL_GEN(vgpu->gvt->dev_priv) >= 9) {
 		switch (offset) {
 		case FORCEWAKE_RENDER_GEN9_REG:
 			ack_reg_offset = FORCEWAKE_ACK_RENDER_GEN9_REG;
@@ -833,7 +833,7 @@ static int dp_aux_ch_ctl_trans_done(struct intel_vgpu *vgpu, u32 value,
 }
 
 static void dp_aux_ch_ctl_link_training(struct intel_vgpu_dpcd_data *dpcd,
-		uint8_t t)
+		u8 t)
 {
 	if ((t & DPCD_TRAINING_PATTERN_SET_MASK) == DPCD_TRAINING_PATTERN_1) {
 		/* training pattern 1 for CR */
@@ -889,9 +889,7 @@ static int dp_aux_ch_ctl_mmio_write(struct intel_vgpu *vgpu,
 	write_vreg(vgpu, offset, p_data, bytes);
 	data = vgpu_vreg(vgpu, offset);
 
-	if ((IS_SKYLAKE(vgpu->gvt->dev_priv)
-		|| IS_KABYLAKE(vgpu->gvt->dev_priv)
-		|| IS_BROXTON(vgpu->gvt->dev_priv))
+	if ((INTEL_GEN(vgpu->gvt->dev_priv) >= 9)
 		&& offset != _REG_SKL_DP_AUX_CH_CTL(port_index)) {
 		/* SKL DPB/C/D aux ctl register changed */
 		return 0;
@@ -919,7 +917,7 @@ static int dp_aux_ch_ctl_mmio_write(struct intel_vgpu *vgpu,
 
 	if (op == GVT_AUX_NATIVE_WRITE) {
 		int t;
-		uint8_t buf[16];
+		u8 buf[16];
 
 		if ((addr + len + 1) >= DPCD_SIZE) {
 			/*
@@ -1407,7 +1405,8 @@ static int mailbox_write(struct intel_vgpu *vgpu, unsigned int offset,
 	switch (cmd) {
 	case GEN9_PCODE_READ_MEM_LATENCY:
 		if (IS_SKYLAKE(vgpu->gvt->dev_priv)
-			 || IS_KABYLAKE(vgpu->gvt->dev_priv)) {
+			 || IS_KABYLAKE(vgpu->gvt->dev_priv)
+			 || IS_COFFEELAKE(vgpu->gvt->dev_priv)) {
 			/**
 			 * "Read memory latency" command on gen9.
 			 * Below memory latency values are read
@@ -1431,7 +1430,8 @@ static int mailbox_write(struct intel_vgpu *vgpu, unsigned int offset,
 		break;
 	case SKL_PCODE_CDCLK_CONTROL:
 		if (IS_SKYLAKE(vgpu->gvt->dev_priv)
-			 || IS_KABYLAKE(vgpu->gvt->dev_priv))
+			 || IS_KABYLAKE(vgpu->gvt->dev_priv)
+			 || IS_COFFEELAKE(vgpu->gvt->dev_priv))
 			*data0 = SKL_CDCLK_READY_FOR_CHANGE;
 		break;
 	case GEN6_PCODE_READ_RC6VIDS:
@@ -3042,8 +3042,8 @@ static int init_skl_mmio_info(struct intel_gvt *gvt)
 	MMIO_DFH(GEN9_WM_CHICKEN3, D_SKL_PLUS, F_MODE_MASK | F_CMD_ACCESS,
 		 NULL, NULL);
 
-	MMIO_D(_MMIO(0x4ab8), D_KBL);
-	MMIO_D(_MMIO(0x2248), D_KBL | D_SKL);
+	MMIO_D(_MMIO(0x4ab8), D_KBL | D_CFL);
+	MMIO_D(_MMIO(0x2248), D_SKL_PLUS);
 
 	return 0;
 }
@@ -3303,7 +3303,8 @@ int intel_gvt_setup_mmio_info(struct intel_gvt *gvt)
 		if (ret)
 			goto err;
 	} else if (IS_SKYLAKE(dev_priv)
-		|| IS_KABYLAKE(dev_priv)) {
+		|| IS_KABYLAKE(dev_priv)
+		|| IS_COFFEELAKE(dev_priv)) {
 		ret = init_broadwell_mmio_info(gvt);
 		if (ret)
 			goto err;
diff --git a/drivers/gpu/drm/i915/gvt/hypercall.h b/drivers/gpu/drm/i915/gvt/hypercall.h
index e1675a00df12..4862fb12778e 100644
--- a/drivers/gpu/drm/i915/gvt/hypercall.h
+++ b/drivers/gpu/drm/i915/gvt/hypercall.h
@@ -33,13 +33,19 @@
 #ifndef _GVT_HYPERCALL_H_
 #define _GVT_HYPERCALL_H_
 
+enum hypervisor_type {
+	INTEL_GVT_HYPERVISOR_XEN = 0,
+	INTEL_GVT_HYPERVISOR_KVM,
+};
+
 /*
  * Specific GVT-g MPT modules function collections. Currently GVT-g supports
  * both Xen and KVM by providing dedicated hypervisor-related MPT modules.
  */
 struct intel_gvt_mpt {
+	enum hypervisor_type type;
 	int (*host_init)(struct device *dev, void *gvt, const void *ops);
-	void (*host_exit)(struct device *dev, void *gvt);
+	void (*host_exit)(struct device *dev);
 	int (*attach_vgpu)(void *vgpu, unsigned long *handle);
 	void (*detach_vgpu)(void *vgpu);
 	int (*inject_msi)(unsigned long handle, u32 addr, u16 data);
@@ -61,12 +67,12 @@ struct intel_gvt_mpt {
 	int (*set_trap_area)(unsigned long handle, u64 start, u64 end,
 			     bool map);
 	int (*set_opregion)(void *vgpu);
+	int (*set_edid)(void *vgpu, int port_num);
 	int (*get_vfio_device)(void *vgpu);
 	void (*put_vfio_device)(void *vgpu);
 	bool (*is_valid_gfn)(unsigned long handle, unsigned long gfn);
 };
 
 extern struct intel_gvt_mpt xengt_mpt;
-extern struct intel_gvt_mpt kvmgt_mpt;
 
 #endif /* _GVT_HYPERCALL_H_ */
diff --git a/drivers/gpu/drm/i915/gvt/interrupt.c b/drivers/gpu/drm/i915/gvt/interrupt.c
index 6b9d1354ff29..67125c5eec6e 100644
--- a/drivers/gpu/drm/i915/gvt/interrupt.c
+++ b/drivers/gpu/drm/i915/gvt/interrupt.c
@@ -581,9 +581,7 @@ static void gen8_init_irq(
 
 		SET_BIT_INFO(irq, 4, PRIMARY_C_FLIP_DONE, INTEL_GVT_IRQ_INFO_DE_PIPE_C);
 		SET_BIT_INFO(irq, 5, SPRITE_C_FLIP_DONE, INTEL_GVT_IRQ_INFO_DE_PIPE_C);
-	} else if (IS_SKYLAKE(gvt->dev_priv)
-			|| IS_KABYLAKE(gvt->dev_priv)
-			|| IS_BROXTON(gvt->dev_priv)) {
+	} else if (INTEL_GEN(gvt->dev_priv) >= 9) {
 		SET_BIT_INFO(irq, 25, AUX_CHANNEL_B, INTEL_GVT_IRQ_INFO_DE_PORT);
 		SET_BIT_INFO(irq, 26, AUX_CHANNEL_C, INTEL_GVT_IRQ_INFO_DE_PORT);
 		SET_BIT_INFO(irq, 27, AUX_CHANNEL_D, INTEL_GVT_IRQ_INFO_DE_PORT);
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index dd3dfd00f4e6..d5fcc447d22f 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -57,6 +57,8 @@ static const struct intel_gvt_ops *intel_gvt_ops;
 #define VFIO_PCI_INDEX_TO_OFFSET(index) ((u64)(index) << VFIO_PCI_OFFSET_SHIFT)
 #define VFIO_PCI_OFFSET_MASK    (((u64)(1) << VFIO_PCI_OFFSET_SHIFT) - 1)
 
+#define EDID_BLOB_OFFSET (PAGE_SIZE/2)
+
 #define OPREGION_SIGNATURE "IntelGraphicsMem"
 
 struct vfio_region;
@@ -76,6 +78,11 @@ struct vfio_region {
 	void				*data;
 };
 
+struct vfio_edid_region {
+	struct vfio_region_gfx_edid vfio_edid_regs;
+	void *edid_blob;
+};
+
 struct kvmgt_pgfn {
 	gfn_t gfn;
 	struct hlist_node hnode;
@@ -427,6 +434,111 @@ static const struct intel_vgpu_regops intel_vgpu_regops_opregion = {
 	.release = intel_vgpu_reg_release_opregion,
 };
 
+static int handle_edid_regs(struct intel_vgpu *vgpu,
+			struct vfio_edid_region *region, char *buf,
+			size_t count, u16 offset, bool is_write)
+{
+	struct vfio_region_gfx_edid *regs = &region->vfio_edid_regs;
+	unsigned int data;
+
+	if (offset + count > sizeof(*regs))
+		return -EINVAL;
+
+	if (count != 4)
+		return -EINVAL;
+
+	if (is_write) {
+		data = *((unsigned int *)buf);
+		switch (offset) {
+		case offsetof(struct vfio_region_gfx_edid, link_state):
+			if (data == VFIO_DEVICE_GFX_LINK_STATE_UP) {
+				if (!drm_edid_block_valid(
+					(u8 *)region->edid_blob,
+					0,
+					true,
+					NULL)) {
+					gvt_vgpu_err("invalid EDID blob\n");
+					return -EINVAL;
+				}
+				intel_gvt_ops->emulate_hotplug(vgpu, true);
+			} else if (data == VFIO_DEVICE_GFX_LINK_STATE_DOWN) {
+				intel_gvt_ops->emulate_hotplug(vgpu, false);
+			} else {
+				gvt_vgpu_err("invalid EDID link state %d\n",
+					data);
+				return -EINVAL;
+			}
+			regs->link_state = data;
+			break;
+		case offsetof(struct vfio_region_gfx_edid, edid_size):
+			if (data > regs->edid_max_size) {
+				gvt_vgpu_err("EDID size is bigger than %d!\n",
+					regs->edid_max_size);
+				return -EINVAL;
+			}
+			regs->edid_size = data;
+			break;
+		default:
+			/* read-only regs */
+			gvt_vgpu_err("write to read-only EDID region at offset %d\n",
+				offset);
+			return -EPERM;
+		}
+	} else {
+		memcpy(buf, (char *)regs + offset, count);
+	}
+
+	return count;
+}
+
+static int handle_edid_blob(struct vfio_edid_region *region, char *buf,
+			size_t count, u16 offset, bool is_write)
+{
+	if (offset + count > region->vfio_edid_regs.edid_size)
+		return -EINVAL;
+
+	if (is_write)
+		memcpy(region->edid_blob + offset, buf, count);
+	else
+		memcpy(buf, region->edid_blob + offset, count);
+
+	return count;
+}
+
+static size_t intel_vgpu_reg_rw_edid(struct intel_vgpu *vgpu, char *buf,
+		size_t count, loff_t *ppos, bool iswrite)
+{
+	int ret;
+	unsigned int i = VFIO_PCI_OFFSET_TO_INDEX(*ppos) -
+			VFIO_PCI_NUM_REGIONS;
+	struct vfio_edid_region *region =
+		(struct vfio_edid_region *)vgpu->vdev.region[i].data;
+	loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
+
+	if (pos < region->vfio_edid_regs.edid_offset) {
+		ret = handle_edid_regs(vgpu, region, buf, count, pos, iswrite);
+	} else {
+		pos -= EDID_BLOB_OFFSET;
+		ret = handle_edid_blob(region, buf, count, pos, iswrite);
+	}
+
+	if (ret < 0)
+		gvt_vgpu_err("failed to access EDID region\n");
+
+	return ret;
+}
+
+static void intel_vgpu_reg_release_edid(struct intel_vgpu *vgpu,
+					struct vfio_region *region)
+{
+	kfree(region->data);
+}
+
+static const struct intel_vgpu_regops intel_vgpu_regops_edid = {
+	.rw = intel_vgpu_reg_rw_edid,
+	.release = intel_vgpu_reg_release_edid,
+};
+
 static int intel_vgpu_register_reg(struct intel_vgpu *vgpu,
 		unsigned int type, unsigned int subtype,
 		const struct intel_vgpu_regops *ops,
@@ -493,6 +605,36 @@ static int kvmgt_set_opregion(void *p_vgpu)
 	return ret;
 }
 
+static int kvmgt_set_edid(void *p_vgpu, int port_num)
+{
+	struct intel_vgpu *vgpu = (struct intel_vgpu *)p_vgpu;
+	struct intel_vgpu_port *port = intel_vgpu_port(vgpu, port_num);
+	struct vfio_edid_region *base;
+	int ret;
+
+	base = kzalloc(sizeof(*base), GFP_KERNEL);
+	if (!base)
+		return -ENOMEM;
+
+	/* TODO: Add multi-port and EDID extension block support */
+	base->vfio_edid_regs.edid_offset = EDID_BLOB_OFFSET;
+	base->vfio_edid_regs.edid_max_size = EDID_SIZE;
+	base->vfio_edid_regs.edid_size = EDID_SIZE;
+	base->vfio_edid_regs.max_xres = vgpu_edid_xres(port->id);
+	base->vfio_edid_regs.max_yres = vgpu_edid_yres(port->id);
+	base->edid_blob = port->edid->edid_block;
+
+	ret = intel_vgpu_register_reg(vgpu,
+			VFIO_REGION_TYPE_GFX,
+			VFIO_REGION_SUBTYPE_GFX_EDID,
+			&intel_vgpu_regops_edid, EDID_SIZE,
+			VFIO_REGION_INFO_FLAG_READ |
+			VFIO_REGION_INFO_FLAG_WRITE |
+			VFIO_REGION_INFO_FLAG_CAPS, base);
+
+	return ret;
+}
+
 static void kvmgt_put_vfio_device(void *vgpu)
 {
 	if (WARN_ON(!((struct intel_vgpu *)vgpu)->vdev.vfio_device))
@@ -627,6 +769,12 @@ static int intel_vgpu_open(struct mdev_device *mdev)
 		goto undo_iommu;
 	}
 
+	/* Take a module reference, as the mdev core does not take
+	 * one on behalf of the vendor driver.
+	 */
+	if (!try_module_get(THIS_MODULE))
+		goto undo_group;
+
 	ret = kvmgt_guest_init(mdev);
 	if (ret)
 		goto undo_group;
@@ -679,6 +827,9 @@ static void __intel_vgpu_release(struct intel_vgpu *vgpu)
 					&vgpu->vdev.group_notifier);
 	WARN(ret, "vfio_unregister_notifier for group failed: %d\n", ret);
 
+	/* drop the module reference taken at open */
+	module_put(THIS_MODULE);
+
 	info = (struct kvmgt_guest_info *)vgpu->handle;
 	kvmgt_guest_exit(info);
 
@@ -703,7 +854,7 @@ static void intel_vgpu_release_work(struct work_struct *work)
 	__intel_vgpu_release(vgpu);
 }
 
-static uint64_t intel_vgpu_get_bar_addr(struct intel_vgpu *vgpu, int bar)
+static u64 intel_vgpu_get_bar_addr(struct intel_vgpu *vgpu, int bar)
 {
 	u32 start_lo, start_hi;
 	u32 mem_type;
@@ -730,10 +881,10 @@ static uint64_t intel_vgpu_get_bar_addr(struct intel_vgpu *vgpu, int bar)
 	return ((u64)start_hi << 32) | start_lo;
 }
 
-static int intel_vgpu_bar_rw(struct intel_vgpu *vgpu, int bar, uint64_t off,
+static int intel_vgpu_bar_rw(struct intel_vgpu *vgpu, int bar, u64 off,
 			     void *buf, unsigned int count, bool is_write)
 {
-	uint64_t bar_start = intel_vgpu_get_bar_addr(vgpu, bar);
+	u64 bar_start = intel_vgpu_get_bar_addr(vgpu, bar);
 	int ret;
 
 	if (is_write)
@@ -745,13 +896,13 @@ static int intel_vgpu_bar_rw(struct intel_vgpu *vgpu, int bar, uint64_t off,
 	return ret;
 }
 
-static inline bool intel_vgpu_in_aperture(struct intel_vgpu *vgpu, uint64_t off)
+static inline bool intel_vgpu_in_aperture(struct intel_vgpu *vgpu, u64 off)
 {
 	return off >= vgpu_aperture_offset(vgpu) &&
 	       off < vgpu_aperture_offset(vgpu) + vgpu_aperture_sz(vgpu);
 }
 
-static int intel_vgpu_aperture_rw(struct intel_vgpu *vgpu, uint64_t off,
+static int intel_vgpu_aperture_rw(struct intel_vgpu *vgpu, u64 off,
 		void *buf, unsigned long count, bool is_write)
 {
 	void *aperture_va;
@@ -783,7 +934,7 @@ static ssize_t intel_vgpu_rw(struct mdev_device *mdev, char *buf,
 {
 	struct intel_vgpu *vgpu = mdev_get_drvdata(mdev);
 	unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos);
-	uint64_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
+	u64 pos = *ppos & VFIO_PCI_OFFSET_MASK;
 	int ret = -EINVAL;
 
 
@@ -1039,7 +1190,7 @@ static int intel_vgpu_get_irq_count(struct intel_vgpu *vgpu, int type)
 
 static int intel_vgpu_set_intx_mask(struct intel_vgpu *vgpu,
 			unsigned int index, unsigned int start,
-			unsigned int count, uint32_t flags,
+			unsigned int count, u32 flags,
 			void *data)
 {
 	return 0;
@@ -1047,21 +1198,21 @@ static int intel_vgpu_set_intx_mask(struct intel_vgpu *vgpu,
 
 static int intel_vgpu_set_intx_unmask(struct intel_vgpu *vgpu,
 			unsigned int index, unsigned int start,
-			unsigned int count, uint32_t flags, void *data)
+			unsigned int count, u32 flags, void *data)
 {
 	return 0;
 }
 
 static int intel_vgpu_set_intx_trigger(struct intel_vgpu *vgpu,
 		unsigned int index, unsigned int start, unsigned int count,
-		uint32_t flags, void *data)
+		u32 flags, void *data)
 {
 	return 0;
 }
 
 static int intel_vgpu_set_msi_trigger(struct intel_vgpu *vgpu,
 		unsigned int index, unsigned int start, unsigned int count,
-		uint32_t flags, void *data)
+		u32 flags, void *data)
 {
 	struct eventfd_ctx *trigger;
 
@@ -1080,12 +1231,12 @@ static int intel_vgpu_set_msi_trigger(struct intel_vgpu *vgpu,
 	return 0;
 }
 
-static int intel_vgpu_set_irqs(struct intel_vgpu *vgpu, uint32_t flags,
+static int intel_vgpu_set_irqs(struct intel_vgpu *vgpu, u32 flags,
 		unsigned int index, unsigned int start, unsigned int count,
 		void *data)
 {
 	int (*func)(struct intel_vgpu *vgpu, unsigned int index,
-			unsigned int start, unsigned int count, uint32_t flags,
+			unsigned int start, unsigned int count, u32 flags,
 			void *data) = NULL;
 
 	switch (index) {
@@ -1477,7 +1628,7 @@ static int kvmgt_host_init(struct device *dev, void *gvt, const void *ops)
 	return mdev_register_device(dev, &intel_vgpu_ops);
 }
 
-static void kvmgt_host_exit(struct device *dev, void *gvt)
+static void kvmgt_host_exit(struct device *dev)
 {
 	mdev_unregister_device(dev);
 }
@@ -1871,7 +2022,8 @@ static bool kvmgt_is_valid_gfn(unsigned long handle, unsigned long gfn)
 	return ret;
 }
 
-struct intel_gvt_mpt kvmgt_mpt = {
+static struct intel_gvt_mpt kvmgt_mpt = {
+	.type = INTEL_GVT_HYPERVISOR_KVM,
 	.host_init = kvmgt_host_init,
 	.host_exit = kvmgt_host_exit,
 	.attach_vgpu = kvmgt_attach_vgpu,
@@ -1886,19 +2038,22 @@ struct intel_gvt_mpt kvmgt_mpt = {
 	.dma_map_guest_page = kvmgt_dma_map_guest_page,
 	.dma_unmap_guest_page = kvmgt_dma_unmap_guest_page,
 	.set_opregion = kvmgt_set_opregion,
+	.set_edid = kvmgt_set_edid,
 	.get_vfio_device = kvmgt_get_vfio_device,
 	.put_vfio_device = kvmgt_put_vfio_device,
 	.is_valid_gfn = kvmgt_is_valid_gfn,
 };
-EXPORT_SYMBOL_GPL(kvmgt_mpt);
 
 static int __init kvmgt_init(void)
 {
+	if (intel_gvt_register_hypervisor(&kvmgt_mpt) < 0)
+		return -ENODEV;
 	return 0;
 }
 
 static void __exit kvmgt_exit(void)
 {
+	intel_gvt_unregister_hypervisor();
 }
 
 module_init(kvmgt_init);
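
The module hooks above complete the new registration scheme: instead of exporting kvmgt_mpt for the GVT core to link against, the backend now pushes its MPT table (tagged with a .type) into the core at load time. A minimal sketch of that flow, with illustrative names for anything not shown in the hunk:

	/* Sketch only: a hypervisor backend under the new registration API. */
	static struct intel_gvt_mpt example_mpt = {
		.type      = INTEL_GVT_HYPERVISOR_KVM,
		.host_init = kvmgt_host_init,
		.host_exit = kvmgt_host_exit,
		/* ... remaining hooks as in kvmgt_mpt above ... */
	};

	static int __init example_init(void)
	{
		/* The core takes ownership here; no EXPORT_SYMBOL_GPL needed. */
		if (intel_gvt_register_hypervisor(&example_mpt) < 0)
			return -ENODEV;
		return 0;
	}

	static void __exit example_exit(void)
	{
		intel_gvt_unregister_hypervisor();
	}
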
diff --git a/drivers/gpu/drm/i915/gvt/mmio.c b/drivers/gpu/drm/i915/gvt/mmio.c
index 43f65848ecd6..ed4df2f6d60b 100644
--- a/drivers/gpu/drm/i915/gvt/mmio.c
+++ b/drivers/gpu/drm/i915/gvt/mmio.c
@@ -57,7 +57,7 @@ int intel_vgpu_gpa_to_mmio_offset(struct intel_vgpu *vgpu, u64 gpa)
 	(reg >= gvt->device_info.gtt_start_offset \
 	 && reg < gvt->device_info.gtt_start_offset + gvt_ggtt_sz(gvt))
 
-static void failsafe_emulate_mmio_rw(struct intel_vgpu *vgpu, uint64_t pa,
+static void failsafe_emulate_mmio_rw(struct intel_vgpu *vgpu, u64 pa,
 		void *p_data, unsigned int bytes, bool read)
 {
 	struct intel_gvt *gvt = NULL;
@@ -99,7 +99,7 @@ static void failsafe_emulate_mmio_rw(struct intel_vgpu *vgpu, uint64_t pa,
  * Returns:
  * Zero on success, negative error code if failed
  */
-int intel_vgpu_emulate_mmio_read(struct intel_vgpu *vgpu, uint64_t pa,
+int intel_vgpu_emulate_mmio_read(struct intel_vgpu *vgpu, u64 pa,
 		void *p_data, unsigned int bytes)
 {
 	struct intel_gvt *gvt = vgpu->gvt;
@@ -171,7 +171,7 @@ out:
  * Returns:
  * Zero on success, negative error code if failed
  */
-int intel_vgpu_emulate_mmio_write(struct intel_vgpu *vgpu, uint64_t pa,
+int intel_vgpu_emulate_mmio_write(struct intel_vgpu *vgpu, u64 pa,
 		void *p_data, unsigned int bytes)
 {
 	struct intel_gvt *gvt = vgpu->gvt;
diff --git a/drivers/gpu/drm/i915/gvt/mmio.h b/drivers/gpu/drm/i915/gvt/mmio.h
index 1ffc69eba30e..5874f1cb4306 100644
--- a/drivers/gpu/drm/i915/gvt/mmio.h
+++ b/drivers/gpu/drm/i915/gvt/mmio.h
@@ -43,15 +43,16 @@ struct intel_vgpu;
 #define D_SKL	(1 << 1)
 #define D_KBL	(1 << 2)
 #define D_BXT	(1 << 3)
+#define D_CFL	(1 << 4)
 
-#define D_GEN9PLUS	(D_SKL | D_KBL | D_BXT)
-#define D_GEN8PLUS	(D_BDW | D_SKL | D_KBL | D_BXT)
+#define D_GEN9PLUS	(D_SKL | D_KBL | D_BXT | D_CFL)
+#define D_GEN8PLUS	(D_BDW | D_SKL | D_KBL | D_BXT | D_CFL)
 
-#define D_SKL_PLUS	(D_SKL | D_KBL | D_BXT)
-#define D_BDW_PLUS	(D_BDW | D_SKL | D_KBL | D_BXT)
+#define D_SKL_PLUS	(D_SKL | D_KBL | D_BXT | D_CFL)
+#define D_BDW_PLUS	(D_BDW | D_SKL | D_KBL | D_BXT | D_CFL)
 
 #define D_PRE_SKL	(D_BDW)
-#define D_ALL		(D_BDW | D_SKL | D_KBL | D_BXT)
+#define D_ALL		(D_BDW | D_SKL | D_KBL | D_BXT | D_CFL)
 
 typedef int (*gvt_mmio_func)(struct intel_vgpu *, unsigned int, void *,
 			     unsigned int);
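
Adding D_CFL here is all Coffeelake needs to inherit the existing MMIO handlers: each handler entry carries one of these device masks, and a handler applies when the mask contains the bit for the running platform. A hedged illustration of that test (the helper name is an assumption, not the driver's):

	/* Illustrative: does a handler's device mask cover this platform? */
	static bool handler_applies(unsigned long device_mask,
				    unsigned long platform_bit)
	{
		return (device_mask & platform_bit) != 0;
	}

	/* After this hunk, handler_applies(D_SKL_PLUS, D_CFL) is true. */
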
diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c
index d6e02c15ef97..7d84cfb9051a 100644
--- a/drivers/gpu/drm/i915/gvt/mmio_context.c
+++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
@@ -353,8 +353,7 @@ static void handle_tlb_pending_event(struct intel_vgpu *vgpu, int ring_id)
 	 */
 	fw = intel_uncore_forcewake_for_reg(dev_priv, reg,
 					    FW_REG_READ | FW_REG_WRITE);
-	if (ring_id == RCS && (IS_SKYLAKE(dev_priv) ||
-			IS_KABYLAKE(dev_priv) || IS_BROXTON(dev_priv)))
+	if (ring_id == RCS && (INTEL_GEN(dev_priv) >= 9))
 		fw |= FORCEWAKE_RENDER;
 
 	intel_uncore_forcewake_get(dev_priv, fw);
@@ -391,7 +390,8 @@ static void switch_mocs(struct intel_vgpu *pre, struct intel_vgpu *next,
 	if (WARN_ON(ring_id >= ARRAY_SIZE(regs)))
 		return;
 
-	if ((IS_KABYLAKE(dev_priv) || IS_BROXTON(dev_priv)) && ring_id == RCS)
+	if ((IS_KABYLAKE(dev_priv)  || IS_BROXTON(dev_priv)
+		|| IS_COFFEELAKE(dev_priv)) && ring_id == RCS)
 		return;
 
 	if (!pre && !gen9_render_mocs.initialized)
@@ -457,9 +457,7 @@ static void switch_mmio(struct intel_vgpu *pre,
 	u32 old_v, new_v;
 
 	dev_priv = pre ? pre->gvt->dev_priv : next->gvt->dev_priv;
-	if (IS_SKYLAKE(dev_priv)
-		|| IS_KABYLAKE(dev_priv)
-		|| IS_BROXTON(dev_priv))
+	if (INTEL_GEN(dev_priv) >= 9)
 		switch_mocs(pre, next, ring_id);
 
 	for (mmio = dev_priv->gvt->engine_mmio_list.mmio;
@@ -471,8 +469,8 @@ static void switch_mmio(struct intel_vgpu *pre,
 		 * state image on kabylake, it's initialized by lri command and
 		 * save or restore with context together.
 		 */
-		if ((IS_KABYLAKE(dev_priv) || IS_BROXTON(dev_priv))
-			&& mmio->in_context)
+		if ((IS_KABYLAKE(dev_priv) || IS_BROXTON(dev_priv)
+			|| IS_COFFEELAKE(dev_priv)) && mmio->in_context)
 			continue;
 
 		// save
@@ -565,9 +563,7 @@ void intel_gvt_init_engine_mmio_context(struct intel_gvt *gvt)
 {
 	struct engine_mmio *mmio;
 
-	if (IS_SKYLAKE(gvt->dev_priv) ||
-		IS_KABYLAKE(gvt->dev_priv) ||
-		IS_BROXTON(gvt->dev_priv))
+	if (INTEL_GEN(gvt->dev_priv) >= 9)
 		gvt->engine_mmio_list.mmio = gen9_engine_mmio_list;
 	else
 		gvt->engine_mmio_list.mmio = gen8_engine_mmio_list;
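
The hunks above trade open-coded platform lists for a generation test: Skylake, Kabylake, Broxton and Coffeelake are all gen9, so INTEL_GEN(dev_priv) >= 9 subsumes the old list and quietly covers future gen9+ parts. The transformation in spirit (use_gen9_path() stands in for the guarded code):

	/* was: every gen9 platform enumerated by hand */
	if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv) || IS_BROXTON(dev_priv))
		use_gen9_path();

	/* now: one check, which also picks up Coffeelake */
	if (INTEL_GEN(dev_priv) >= 9)
		use_gen9_path();
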
diff --git a/drivers/gpu/drm/i915/gvt/mpt.h b/drivers/gpu/drm/i915/gvt/mpt.h
index 3ed34123d8d1..0f9440128123 100644
--- a/drivers/gpu/drm/i915/gvt/mpt.h
+++ b/drivers/gpu/drm/i915/gvt/mpt.h
@@ -50,11 +50,10 @@
  * Zero on success, negative error code if failed
  */
 static inline int intel_gvt_hypervisor_host_init(struct device *dev,
-			void *gvt, const void *ops)
+						 void *gvt, const void *ops)
 {
-	/* optional to provide */
 	if (!intel_gvt_host.mpt->host_init)
-		return 0;
+		return -ENODEV;
 
 	return intel_gvt_host.mpt->host_init(dev, gvt, ops);
 }
@@ -62,14 +61,13 @@ static inline int intel_gvt_hypervisor_host_init(struct device *dev,
 /**
  * intel_gvt_hypervisor_host_exit - exit GVT-g host side
  */
-static inline void intel_gvt_hypervisor_host_exit(struct device *dev,
-			void *gvt)
+static inline void intel_gvt_hypervisor_host_exit(struct device *dev)
 {
 	/* optional to provide */
 	if (!intel_gvt_host.mpt->host_exit)
 		return;
 
-	intel_gvt_host.mpt->host_exit(dev, gvt);
+	intel_gvt_host.mpt->host_exit(dev);
 }
 
 /**
@@ -316,6 +314,23 @@ static inline int intel_gvt_hypervisor_set_opregion(struct intel_vgpu *vgpu)
 }
 
 /**
+ * intel_gvt_hypervisor_set_edid - Set EDID region for guest
+ * @vgpu: a vGPU
+ * @port_num: display port number
+ *
+ * Returns:
+ * Zero on success, negative error code if failed.
+ */
+static inline int intel_gvt_hypervisor_set_edid(struct intel_vgpu *vgpu,
+						int port_num)
+{
+	if (!intel_gvt_host.mpt->set_edid)
+		return 0;
+
+	return intel_gvt_host.mpt->set_edid(vgpu, port_num);
+}
+
+/**
  * intel_gvt_hypervisor_get_vfio_device - increase vfio device ref count
  * @vgpu: a vGPU
  *
@@ -362,4 +377,7 @@ static inline bool intel_gvt_hypervisor_is_valid_gfn(
 	return intel_gvt_host.mpt->is_valid_gfn(vgpu->handle, gfn);
 }
 
+int intel_gvt_register_hypervisor(struct intel_gvt_mpt *);
+void intel_gvt_unregister_hypervisor(void);
+
 #endif /* _GVT_MPT_H_ */
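
The wrappers in this header encode a convention: mandatory hooks now fail hard when a backend omits them (host_init returns -ENODEV), while optional ones such as set_edid degrade to a no-op success. A sketch of adding another optional hook under the same pattern (the hook name is hypothetical):

	/* Hypothetical optional hook, mirroring the set_edid wrapper. */
	static inline int intel_gvt_hypervisor_example_hook(struct intel_vgpu *vgpu)
	{
		if (!intel_gvt_host.mpt->example_hook) /* optional to provide */
			return 0;

		return intel_gvt_host.mpt->example_hook(vgpu);
	}
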
diff --git a/drivers/gpu/drm/i915/gvt/sched_policy.c b/drivers/gpu/drm/i915/gvt/sched_policy.c
index c32e7d5e8629..1c763a27a412 100644
--- a/drivers/gpu/drm/i915/gvt/sched_policy.c
+++ b/drivers/gpu/drm/i915/gvt/sched_policy.c
@@ -94,7 +94,7 @@ static void gvt_balance_timeslice(struct gvt_sched_data *sched_data)
 {
 	struct vgpu_sched_data *vgpu_data;
 	struct list_head *pos;
-	static uint64_t stage_check;
+	static u64 stage_check;
 	int stage = stage_check++ % GVT_TS_BALANCE_STAGE_NUM;
 
 	/* The timeslice accumulation reset at stage 0, which is
@@ -474,6 +474,6 @@ void intel_vgpu_stop_schedule(struct intel_vgpu *vgpu)
 		}
 	}
 	spin_unlock_bh(&scheduler->mmio_context_lock);
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put_unchecked(dev_priv);
 	mutex_unlock(&vgpu->gvt->sched_lock);
 }
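
intel_runtime_pm_put_unchecked() is the escape hatch of the new wakeref-tracked runtime PM API: it releases a reference without presenting the cookie, for call sites like this one where the matching get happened elsewhere and no wakeref was threaded through. Where both ends of the pairing are visible, the checked form is the intended idiom:

	intel_wakeref_t wakeref;

	wakeref = intel_runtime_pm_get(dev_priv);	/* returns a tracking cookie */
	/* ... hardware access ... */
	intel_runtime_pm_put(dev_priv, wakeref);	/* cookie validates the pairing */
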
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index 55bb7885e228..1bb8f936fdaa 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -299,7 +299,8 @@ static int copy_workload_to_ring_buffer(struct intel_vgpu_workload *workload)
 	void *shadow_ring_buffer_va;
 	u32 *cs;
 
-	if ((IS_KABYLAKE(req->i915) || IS_BROXTON(req->i915))
+	if ((IS_KABYLAKE(req->i915) || IS_BROXTON(req->i915)
+		|| IS_COFFEELAKE(req->i915))
 		&& is_inhibit_context(req->hw_context))
 		intel_vgpu_restore_inhibit_context(vgpu, req);
 
@@ -957,9 +958,7 @@ static int workload_thread(void *priv)
 	struct intel_vgpu_workload *workload = NULL;
 	struct intel_vgpu *vgpu = NULL;
 	int ret;
-	bool need_force_wake = IS_SKYLAKE(gvt->dev_priv)
-			|| IS_KABYLAKE(gvt->dev_priv)
-			|| IS_BROXTON(gvt->dev_priv);
+	bool need_force_wake = (INTEL_GEN(gvt->dev_priv) >= 9);
 	DEFINE_WAIT_FUNC(wait, woken_wake_function);
 
 	kfree(p);
@@ -1015,7 +1014,7 @@ complete:
 			intel_uncore_forcewake_put(gvt->dev_priv,
 					FORCEWAKE_ALL);
 
-		intel_runtime_pm_put(gvt->dev_priv);
+		intel_runtime_pm_put_unchecked(gvt->dev_priv);
 		if (ret && (vgpu_is_vm_unhealthy(ret)))
 			enter_failsafe_mode(vgpu, GVT_FAILSAFE_GUEST_ERR);
 	}
@@ -1472,7 +1471,7 @@ intel_vgpu_create_workload(struct intel_vgpu *vgpu, int ring_id,
 		mutex_lock(&dev_priv->drm.struct_mutex);
 		ret = intel_gvt_scan_and_shadow_workload(workload);
 		mutex_unlock(&dev_priv->drm.struct_mutex);
-		intel_runtime_pm_put(dev_priv);
+		intel_runtime_pm_put_unchecked(dev_priv);
 	}
 
 	if (ret && (vgpu_is_vm_unhealthy(ret))) {
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.h b/drivers/gpu/drm/i915/gvt/scheduler.h
index 2065cba59aab..0635b2c4bed7 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.h
+++ b/drivers/gpu/drm/i915/gvt/scheduler.h
@@ -61,7 +61,7 @@ struct shadow_indirect_ctx {
 	unsigned long guest_gma;
 	unsigned long shadow_gma;
 	void *shadow_va;
-	uint32_t size;
+	u32 size;
 };
 
 #define PER_CTX_ADDR_MASK 0xfffff000
diff --git a/drivers/gpu/drm/i915/gvt/trace.h b/drivers/gpu/drm/i915/gvt/trace.h
index 1fd64202d74e..6d787750d279 100644
--- a/drivers/gpu/drm/i915/gvt/trace.h
+++ b/drivers/gpu/drm/i915/gvt/trace.h
@@ -228,7 +228,7 @@ TRACE_EVENT(oos_sync,
 TRACE_EVENT(gvt_command,
 	TP_PROTO(u8 vgpu_id, u8 ring_id, u32 ip_gma, u32 *cmd_va,
 		u32 cmd_len,  u32 buf_type, u32 buf_addr_type,
-		void *workload, char *cmd_name),
+		void *workload, const char *cmd_name),
 
 	TP_ARGS(vgpu_id, ring_id, ip_gma, cmd_va, cmd_len, buf_type,
 		buf_addr_type, workload, cmd_name),
diff --git a/drivers/gpu/drm/i915/gvt/vgpu.c b/drivers/gpu/drm/i915/gvt/vgpu.c
index c628be05fbfe..720e2b10adaa 100644
--- a/drivers/gpu/drm/i915/gvt/vgpu.c
+++ b/drivers/gpu/drm/i915/gvt/vgpu.c
@@ -148,10 +148,10 @@ int intel_gvt_init_vgpu_types(struct intel_gvt *gvt)
 		gvt->types[i].avail_instance = min(low_avail / vgpu_types[i].low_mm,
 						   high_avail / vgpu_types[i].high_mm);
 
-		if (IS_GEN8(gvt->dev_priv))
+		if (IS_GEN(gvt->dev_priv, 8))
 			sprintf(gvt->types[i].name, "GVTg_V4_%s",
 						vgpu_types[i].name);
-		else if (IS_GEN9(gvt->dev_priv))
+		else if (IS_GEN(gvt->dev_priv, 9))
 			sprintf(gvt->types[i].name, "GVTg_V5_%s",
 						vgpu_types[i].name);
 
@@ -428,6 +428,12 @@ static struct intel_vgpu *__intel_gvt_create_vgpu(struct intel_gvt *gvt,
 	if (ret)
 		goto out_clean_sched_policy;
 
+	/* TODO: add support for more platforms */
+	if (IS_SKYLAKE(gvt->dev_priv) || IS_KABYLAKE(gvt->dev_priv))
+		ret = intel_gvt_hypervisor_set_edid(vgpu, PORT_D);
+	if (ret)
+		goto out_clean_sched_policy;
+
 	return vgpu;
 
 out_clean_sched_policy:
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
new file mode 100644
index 000000000000..215b6ff8aa73
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -0,0 +1,286 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "i915_active.h"
+
+#define BKL(ref) (&(ref)->i915->drm.struct_mutex)
+
+/*
+ * Active refs memory management
+ *
+ * To be more economical with memory, we reap all the i915_active trees as
+ * they idle (when we know the active requests are inactive) and allocate the
+ * nodes from a local slab cache to hopefully reduce the fragmentation.
+ */
+static struct i915_global_active {
+	struct kmem_cache *slab_cache;
+} global;
+
+struct active_node {
+	struct i915_active_request base;
+	struct i915_active *ref;
+	struct rb_node node;
+	u64 timeline;
+};
+
+static void
+__active_park(struct i915_active *ref)
+{
+	struct active_node *it, *n;
+
+	rbtree_postorder_for_each_entry_safe(it, n, &ref->tree, node) {
+		GEM_BUG_ON(i915_active_request_isset(&it->base));
+		kmem_cache_free(global.slab_cache, it);
+	}
+	ref->tree = RB_ROOT;
+}
+
+static void
+__active_retire(struct i915_active *ref)
+{
+	GEM_BUG_ON(!ref->count);
+	if (--ref->count)
+		return;
+
+	/* return the unused nodes to our slabcache */
+	__active_park(ref);
+
+	ref->retire(ref);
+}
+
+static void
+node_retire(struct i915_active_request *base, struct i915_request *rq)
+{
+	__active_retire(container_of(base, struct active_node, base)->ref);
+}
+
+static void
+last_retire(struct i915_active_request *base, struct i915_request *rq)
+{
+	__active_retire(container_of(base, struct i915_active, last));
+}
+
+static struct i915_active_request *
+active_instance(struct i915_active *ref, u64 idx)
+{
+	struct active_node *node;
+	struct rb_node **p, *parent;
+	struct i915_request *old;
+
+	/*
+	 * We track the most recently used timeline to skip an rbtree search
+	 * for the common case; under typical loads we never need the rbtree
+	 * at all. We can reuse the last slot if it is empty, that is,
+	 * after the previous activity has been retired, or if it matches the
+	 * current timeline.
+	 *
+	 * Note that we allow the timeline to be active simultaneously in
+	 * the rbtree and the last cache. We do this to avoid having
+	 * to search and replace the rbtree element for a new timeline, with
+	 * the cost being that we must be aware that the ref may be retired
+	 * twice for the same timeline (as the older rbtree element will be
+	 * retired before the new request added to last).
+	 */
+	old = i915_active_request_raw(&ref->last, BKL(ref));
+	if (!old || old->fence.context == idx)
+		goto out;
+
+	/* Move the currently active fence into the rbtree */
+	idx = old->fence.context;
+
+	parent = NULL;
+	p = &ref->tree.rb_node;
+	while (*p) {
+		parent = *p;
+
+		node = rb_entry(parent, struct active_node, node);
+		if (node->timeline == idx)
+			goto replace;
+
+		if (node->timeline < idx)
+			p = &parent->rb_right;
+		else
+			p = &parent->rb_left;
+	}
+
+	node = kmem_cache_alloc(global.slab_cache, GFP_KERNEL);
+
+	/* kmalloc may retire the ref->last (thanks shrinker)! */
+	if (unlikely(!i915_active_request_raw(&ref->last, BKL(ref)))) {
+		kmem_cache_free(global.slab_cache, node);
+		goto out;
+	}
+
+	if (unlikely(!node))
+		return ERR_PTR(-ENOMEM);
+
+	i915_active_request_init(&node->base, NULL, node_retire);
+	node->ref = ref;
+	node->timeline = idx;
+
+	rb_link_node(&node->node, parent, p);
+	rb_insert_color(&node->node, &ref->tree);
+
+replace:
+	/*
+	 * Overwrite the previous active slot in the rbtree with last,
+	 * leaving last zeroed. If the previous slot is still active,
+	 * we must be careful as we now only expect to receive one retire
+	 * callback, not two, and so must undo the active counting for the
+	 * overwritten slot.
+	 */
+	if (i915_active_request_isset(&node->base)) {
+		/* Retire ourselves from the old rq->active_list */
+		__list_del_entry(&node->base.link);
+		ref->count--;
+		GEM_BUG_ON(!ref->count);
+	}
+	GEM_BUG_ON(list_empty(&ref->last.link));
+	list_replace_init(&ref->last.link, &node->base.link);
+	node->base.request = fetch_and_zero(&ref->last.request);
+
+out:
+	return &ref->last;
+}
+
+void i915_active_init(struct drm_i915_private *i915,
+		      struct i915_active *ref,
+		      void (*retire)(struct i915_active *ref))
+{
+	ref->i915 = i915;
+	ref->retire = retire;
+	ref->tree = RB_ROOT;
+	i915_active_request_init(&ref->last, NULL, last_retire);
+	ref->count = 0;
+}
+
+int i915_active_ref(struct i915_active *ref,
+		    u64 timeline,
+		    struct i915_request *rq)
+{
+	struct i915_active_request *active;
+
+	active = active_instance(ref, timeline);
+	if (IS_ERR(active))
+		return PTR_ERR(active);
+
+	if (!i915_active_request_isset(active))
+		ref->count++;
+	__i915_active_request_set(active, rq);
+
+	GEM_BUG_ON(!ref->count);
+	return 0;
+}
+
+bool i915_active_acquire(struct i915_active *ref)
+{
+	lockdep_assert_held(BKL(ref));
+	return !ref->count++;
+}
+
+void i915_active_release(struct i915_active *ref)
+{
+	lockdep_assert_held(BKL(ref));
+	__active_retire(ref);
+}
+
+int i915_active_wait(struct i915_active *ref)
+{
+	struct active_node *it, *n;
+	int ret = 0;
+
+	if (i915_active_acquire(ref))
+		goto out_release;
+
+	ret = i915_active_request_retire(&ref->last, BKL(ref));
+	if (ret)
+		goto out_release;
+
+	rbtree_postorder_for_each_entry_safe(it, n, &ref->tree, node) {
+		ret = i915_active_request_retire(&it->base, BKL(ref));
+		if (ret)
+			break;
+	}
+
+out_release:
+	i915_active_release(ref);
+	return ret;
+}
+
+int i915_request_await_active_request(struct i915_request *rq,
+				      struct i915_active_request *active)
+{
+	struct i915_request *barrier =
+		i915_active_request_raw(active, &rq->i915->drm.struct_mutex);
+
+	return barrier ? i915_request_await_dma_fence(rq, &barrier->fence) : 0;
+}
+
+int i915_request_await_active(struct i915_request *rq, struct i915_active *ref)
+{
+	struct active_node *it, *n;
+	int ret;
+
+	ret = i915_request_await_active_request(rq, &ref->last);
+	if (ret)
+		return ret;
+
+	rbtree_postorder_for_each_entry_safe(it, n, &ref->tree, node) {
+		ret = i915_request_await_active_request(rq, &it->base);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+void i915_active_fini(struct i915_active *ref)
+{
+	GEM_BUG_ON(i915_active_request_isset(&ref->last));
+	GEM_BUG_ON(!RB_EMPTY_ROOT(&ref->tree));
+	GEM_BUG_ON(ref->count);
+}
+#endif
+
+int i915_active_request_set(struct i915_active_request *active,
+			    struct i915_request *rq)
+{
+	int err;
+
+	/* Must maintain ordering wrt previous active requests */
+	err = i915_request_await_active_request(rq, active);
+	if (err)
+		return err;
+
+	__i915_active_request_set(active, rq);
+	return 0;
+}
+
+void i915_active_retire_noop(struct i915_active_request *active,
+			     struct i915_request *request)
+{
+	/* Space left intentionally blank */
+}
+
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+#include "selftests/i915_active.c"
+#endif
+
+int __init i915_global_active_init(void)
+{
+	global.slab_cache = KMEM_CACHE(active_node, SLAB_HWCACHE_ALIGN);
+	if (!global.slab_cache)
+		return -ENOMEM;
+
+	return 0;
+}
+
+void __exit i915_global_active_exit(void)
+{
+	kmem_cache_destroy(global.slab_cache);
+}
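
Putting the new file together: a resource embeds a struct i915_active, marks itself busy against each request that uses it, and gets exactly one retire() callback once every tracked timeline has idled. A hedged usage sketch (the embedding struct and retire body are illustrative, not taken from the tree):

	struct example_resource {
		struct i915_active active;
		/* ... payload kept alive while the GPU uses it ... */
	};

	static void example_retire(struct i915_active *ref)
	{
		struct example_resource *res =
			container_of(ref, struct example_resource, active);

		kfree(res); /* e.g. delayed free of the now-idle resource */
	}

	/* One-time setup, under struct_mutex: */
	i915_active_init(i915, &res->active, example_retire);

	/* Per submission: track rq on its timeline (may allocate a node). */
	err = i915_active_ref(&res->active, rq->fence.context, rq);

	/* When needed: wait for all tracked timelines to retire. */
	err = i915_active_wait(&res->active);
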
diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h
new file mode 100644
index 000000000000..12b5c1d287d1
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_active.h
@@ -0,0 +1,425 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef _I915_ACTIVE_H_
+#define _I915_ACTIVE_H_
+
+#include <linux/lockdep.h>
+
+#include "i915_active_types.h"
+#include "i915_request.h"
+
+/*
+ * We treat requests as fences. These are not to be confused with our
+ * "fence registers", but are pipeline synchronisation objects a la GL_ARB_sync.
+ * We use the fences to synchronize access from the CPU with activity on the
+ * GPU, for example, we should not rewrite an object's PTE whilst the GPU
+ * is reading them. We also track fences at a higher level to provide
+ * implicit synchronisation around GEM objects, e.g. set-domain will wait
+ * for outstanding GPU rendering before marking the object ready for CPU
+ * access, or a pageflip will wait until the GPU is complete before showing
+ * the frame on the scanout.
+ *
+ * In order to use a fence, the object must track the fence it needs to
+ * serialise with. For example, GEM objects want to track both read and
+ * write access so that we can perform concurrent read operations between
+ * the CPU and GPU engines, as well as waiting for all rendering to
+ * complete, or waiting for the last GPU user of a "fence register". The
+ * object then embeds a #i915_active_request to track the most recent (in
+ * retirement order) request relevant for the desired mode of access.
+ * The #i915_active_request is updated with i915_active_request_set() to
+ * track the most recent fence request, typically this is done as part of
+ * i915_vma_move_to_active().
+ *
+ * When the #i915_active_request completes (is retired), it will
+ * signal its completion to the owner through a callback as well as mark
+ * itself as idle (i915_active_request.request == NULL). The owner
+ * can then perform any action, such as delayed freeing of an active
+ * resource including itself.
+ */
+
+void i915_active_retire_noop(struct i915_active_request *active,
+			     struct i915_request *request);
+
+/**
+ * i915_active_request_init - prepares the activity tracker for use
+ * @active - the active tracker
+ * @rq - initial request to track, can be NULL
+ * @retire - a callback invoked when the tracker is retired (becomes idle),
+ *         can be NULL
+ *
+ * i915_active_request_init() prepares the embedded @active struct for use as
+ * an activity tracker, that is for tracking the last known active request
+ * associated with it. When the last request becomes idle, when it is retired
+ * after completion, the optional callback @func is invoked.
+ */
+static inline void
+i915_active_request_init(struct i915_active_request *active,
+			 struct i915_request *rq,
+			 i915_active_retire_fn retire)
+{
+	RCU_INIT_POINTER(active->request, rq);
+	INIT_LIST_HEAD(&active->link);
+	active->retire = retire ?: i915_active_retire_noop;
+}
+
+#define INIT_ACTIVE_REQUEST(name) i915_active_request_init((name), NULL, NULL)
+
+/**
+ * __i915_active_request_set - updates the tracker to watch the current request
+ * @active - the active tracker
+ * @request - the request to watch
+ *
+ * __i915_active_request_set() watches the given @request for completion. Whilst
+ * that @request is busy, the @active reports busy. When that @request is
+ * retired, the @active tracker is updated to report idle.
+ */
+static inline void
+__i915_active_request_set(struct i915_active_request *active,
+			  struct i915_request *request)
+{
+	list_move(&active->link, &request->active_list);
+	rcu_assign_pointer(active->request, request);
+}
+
+int __must_check
+i915_active_request_set(struct i915_active_request *active,
+			struct i915_request *rq);
+
+/**
+ * i915_active_request_set_retire_fn - updates the retirement callback
+ * @active - the active tracker
+ * @fn - the routine called when the request is retired
+ * @mutex - struct_mutex used to guard retirements
+ *
+ * i915_active_request_set_retire_fn() updates the function pointer that
+ * is called when the final request associated with the @active tracker
+ * is retired.
+ */
+static inline void
+i915_active_request_set_retire_fn(struct i915_active_request *active,
+				  i915_active_retire_fn fn,
+				  struct mutex *mutex)
+{
+	lockdep_assert_held(mutex);
+	active->retire = fn ?: i915_active_retire_noop;
+}
+
+static inline struct i915_request *
+__i915_active_request_peek(const struct i915_active_request *active)
+{
+	/*
+	 * Inside the error capture (running with the driver in an unknown
+	 * state), we want to bend the rules slightly (a lot).
+	 *
+	 * Work is in progress to make it safer; in the meantime this keeps
+	 * the known issue from spamming the logs.
+	 */
+	return rcu_dereference_protected(active->request, 1);
+}
+
+/**
+ * i915_active_request_raw - return the active request
+ * @active - the active tracker
+ * @mutex - struct_mutex guarding the tracker
+ *
+ * i915_active_request_raw() returns the current request being tracked, or NULL.
+ * It does not obtain a reference on the request for the caller, so the caller
+ * must hold struct_mutex.
+ */
+static inline struct i915_request *
+i915_active_request_raw(const struct i915_active_request *active,
+			struct mutex *mutex)
+{
+	return rcu_dereference_protected(active->request,
+					 lockdep_is_held(mutex));
+}
+
+/**
+ * i915_active_request_peek - report the active request being monitored
+ * @active - the active tracker
+ * @mutex - struct_mutex guarding the tracker
+ *
+ * i915_active_request_peek() returns the current request being tracked if
+ * still active, or NULL. It does not obtain a reference on the request
+ * for the caller, so the caller must hold struct_mutex.
+ */
+static inline struct i915_request *
+i915_active_request_peek(const struct i915_active_request *active,
+			 struct mutex *mutex)
+{
+	struct i915_request *request;
+
+	request = i915_active_request_raw(active, mutex);
+	if (!request || i915_request_completed(request))
+		return NULL;
+
+	return request;
+}
+
+/**
+ * i915_active_request_get - return a reference to the active request
+ * @active - the active tracker
+ * @mutex - struct_mutex guarding the tracker
+ *
+ * i915_active_request_get() returns a reference to the active request, or NULL
+ * if the active tracker is idle. The caller must hold struct_mutex.
+ */
+static inline struct i915_request *
+i915_active_request_get(const struct i915_active_request *active,
+			struct mutex *mutex)
+{
+	return i915_request_get(i915_active_request_peek(active, mutex));
+}
+
+/**
+ * __i915_active_request_get_rcu - return a reference to the active request
+ * @active - the active tracker
+ *
+ * __i915_active_request_get_rcu() returns a reference to the active request,
+ * or NULL if the active tracker is idle. The caller must hold the RCU read
+ * lock, but the returned pointer is safe to use outside of RCU.
+ */
+static inline struct i915_request *
+__i915_active_request_get_rcu(const struct i915_active_request *active)
+{
+	/*
+	 * Performing a lockless retrieval of the active request is super
+	 * tricky. SLAB_TYPESAFE_BY_RCU merely guarantees that the backing
+	 * slab of request objects will not be freed whilst we hold the
+	 * RCU read lock. It does not guarantee that the request itself
+	 * will not be freed and then *reused*. Viz,
+	 *
+	 * Thread A			Thread B
+	 *
+	 * rq = active.request
+	 *				retire(rq) -> free(rq);
+	 *				(rq is now first on the slab freelist)
+	 *				active.request = NULL
+	 *
+	 *				rq = new submission on a new object
+	 * ref(rq)
+	 *
+	 * To prevent the request from being reused whilst the caller
+	 * uses it, we take a reference like normal. Whilst acquiring
+	 * the reference we check that it is not in a destroyed state
+	 * (refcnt == 0). That prevents the request being reallocated
+	 * whilst the caller holds on to it. To check that the request
+	 * was not reallocated as we acquired the reference we have to
+	 * check that our request remains the active request across
+	 * the lookup, in the same manner as a seqlock. The visibility
+	 * of the pointer versus the reference counting is controlled
+	 * by using RCU barriers (rcu_dereference and rcu_assign_pointer).
+	 *
+	 * In the middle of all that, we inspect whether the request is
+	 * complete. Retiring is lazy so the request may be completed long
+	 * before the active tracker is updated. Querying whether the
+	 * request is complete is far cheaper (as it involves no locked
+	 * instructions setting cachelines to exclusive) than acquiring
+	 * the reference, so we do it first. The RCU read lock ensures the
+	 * pointer dereference is valid, but does not ensure that the
+	 * seqno nor HWS is the right one! However, if the request was
+	 * reallocated, that means the active tracker's request was complete.
+	 * If the new request is also complete, then both are and we can
+	 * just report the active tracker is idle. If the new request is
+	 * incomplete, then we acquire a reference on it and check that
+	 * it remained the active request.
+	 *
+	 * It is then imperative that we do not zero the request on
+	 * reallocation, so that we can chase the dangling pointers!
+	 * See i915_request_alloc().
+	 */
+	do {
+		struct i915_request *request;
+
+		request = rcu_dereference(active->request);
+		if (!request || i915_request_completed(request))
+			return NULL;
+
+		/*
+		 * An especially silly compiler could decide to recompute the
+		 * result of i915_request_completed, more specifically
+		 * re-emit the load for request->fence.seqno. A race would catch
+		 * a later seqno value, which could flip the result from true to
+		 * false. Which means part of the instructions below might not
+		 * be executed, while later on instructions are executed. Due to
+		 * barriers within the refcounting the inconsistency can't reach
+		 * past the call to i915_request_get_rcu, but not executing
+		 * that while still executing i915_request_put() creates
+		 * havoc enough.  Prevent this with a compiler barrier.
+		 */
+		barrier();
+
+		request = i915_request_get_rcu(request);
+
+		/*
+		 * What stops the following rcu_access_pointer() from occurring
+		 * before the above i915_request_get_rcu()? If we were
+		 * to read the value before pausing to get the reference to
+		 * the request, we may not notice a change in the active
+		 * tracker.
+		 *
+		 * The rcu_access_pointer() is a mere compiler barrier, which
+		 * means both the CPU and compiler are free to perform the
+		 * memory read without constraint. The compiler only has to
+		 * ensure that any operations after the rcu_access_pointer()
+		 * occur afterwards in program order. This means the read may
+		 * be performed earlier by an out-of-order CPU, or adventurous
+		 * compiler.
+		 *
+		 * The atomic operation at the heart of
+		 * i915_request_get_rcu(), see dma_fence_get_rcu(), is
+		 * atomic_inc_not_zero() which is only a full memory barrier
+		 * when successful. That is, if i915_request_get_rcu()
+		 * returns the request (and so with the reference count
+		 * incremented) then the following read for rcu_access_pointer()
+		 * must occur after the atomic operation and so confirm
+		 * that this request is the one currently being tracked.
+		 *
+		 * The corresponding write barrier is part of
+		 * rcu_assign_pointer().
+		 */
+		if (!request || request == rcu_access_pointer(active->request))
+			return rcu_pointer_handoff(request);
+
+		i915_request_put(request);
+	} while (1);
+}
+
+/**
+ * i915_active_request_get_unlocked - return a reference to the active request
+ * @active - the active tracker
+ *
+ * i915_active_request_get_unlocked() returns a reference to the active request,
+ * or NULL if the active tracker is idle. The reference is obtained under RCU,
+ * so no locking is required by the caller.
+ *
+ * The reference should be freed with i915_request_put().
+ */
+static inline struct i915_request *
+i915_active_request_get_unlocked(const struct i915_active_request *active)
+{
+	struct i915_request *request;
+
+	rcu_read_lock();
+	request = __i915_active_request_get_rcu(active);
+	rcu_read_unlock();
+
+	return request;
+}
+
+/**
+ * i915_active_request_isset - report whether the active tracker is assigned
+ * @active - the active tracker
+ *
+ * i915_active_request_isset() returns true if the active tracker is currently
+ * assigned to a request. Due to the lazy retiring, that request may be idle
+ * and this may report stale information.
+ */
+static inline bool
+i915_active_request_isset(const struct i915_active_request *active)
+{
+	return rcu_access_pointer(active->request);
+}
+
+/**
+ * i915_active_request_retire - waits until the request is retired
+ * @active - the active request on which to wait
+ * @mutex - struct_mutex guarding the tracker
+ *
+ * i915_active_request_retire() waits until the request is completed,
+ * and then ensures that at least the retirement handler for this
+ * @active tracker is called before returning. If the @active
+ * tracker is idle, the function returns immediately.
+ */
+static inline int __must_check
+i915_active_request_retire(struct i915_active_request *active,
+			   struct mutex *mutex)
+{
+	struct i915_request *request;
+	long ret;
+
+	request = i915_active_request_raw(active, mutex);
+	if (!request)
+		return 0;
+
+	ret = i915_request_wait(request,
+				I915_WAIT_INTERRUPTIBLE | I915_WAIT_LOCKED,
+				MAX_SCHEDULE_TIMEOUT);
+	if (ret < 0)
+		return ret;
+
+	list_del_init(&active->link);
+	RCU_INIT_POINTER(active->request, NULL);
+
+	active->retire(active, request);
+
+	return 0;
+}
+
+/*
+ * GPU activity tracking
+ *
+ * Each set of commands submitted to the GPU comprises a single request that
+ * signals a fence upon completion. struct i915_request combines the
+ * command submission, scheduling and fence signaling roles. If we want to see
+ * if a particular task is complete, we need to grab the fence (struct
+ * i915_request) for that task and check or wait for it to be signaled. More
+ * often though we want to track the status of a bunch of tasks, for example
+ * to wait for the GPU to finish accessing some memory across a variety of
+ * different command pipelines from different clients. We could choose to
+ * track every single request associated with the task, but knowing that
+ * each request belongs to an ordered timeline (later requests within a
+ * timeline must wait for earlier requests), we need only track the
+ * latest request in each timeline to determine the overall status of the
+ * task.
+ *
+ * struct i915_active provides this tracking across timelines. It builds a
+ * composite shared-fence, and is updated as new work is submitted to the task,
+ * forming a snapshot of the current status. It should be embedded into the
+ * different resources that need to track their associated GPU activity to
+ * provide a callback when that GPU activity has ceased, or otherwise to
+ * provide a serialisation point either for request submission or for CPU
+ * synchronisation.
+ */
+
+void i915_active_init(struct drm_i915_private *i915,
+		      struct i915_active *ref,
+		      void (*retire)(struct i915_active *ref));
+
+int i915_active_ref(struct i915_active *ref,
+		    u64 timeline,
+		    struct i915_request *rq);
+
+int i915_active_wait(struct i915_active *ref);
+
+int i915_request_await_active(struct i915_request *rq,
+			      struct i915_active *ref);
+int i915_request_await_active_request(struct i915_request *rq,
+				      struct i915_active_request *active);
+
+bool i915_active_acquire(struct i915_active *ref);
+
+static inline void i915_active_cancel(struct i915_active *ref)
+{
+	GEM_BUG_ON(ref->count != 1);
+	ref->count = 0;
+}
+
+void i915_active_release(struct i915_active *ref);
+
+static inline bool
+i915_active_is_idle(const struct i915_active *ref)
+{
+	return !ref->count;
+}
+
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+void i915_active_fini(struct i915_active *ref);
+#else
+static inline void i915_active_fini(struct i915_active *ref) { }
+#endif
+
+int i915_global_active_init(void);
+void i915_global_active_exit(void);
+
+#endif /* _I915_ACTIVE_H_ */
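
The long comment in __i915_active_request_get_rcu() describes a seqlock-style recipe: read the pointer, check for completion, take a reference, then re-read to confirm the tracker still points at the same request. Compressed into a sketch:

	struct i915_request *rq;

	rcu_read_lock();
	do {
		rq = rcu_dereference(active->request);
		if (!rq || i915_request_completed(rq)) {
			rq = NULL;			/* tracker is idle */
			break;
		}

		rq = i915_request_get_rcu(rq);		/* NULL if refcnt hit zero */
		if (!rq || rq == rcu_access_pointer(active->request))
			break;				/* reference is stable */

		i915_request_put(rq);			/* lost the race: retry */
	} while (1);
	rcu_read_unlock();
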
diff --git a/drivers/gpu/drm/i915/i915_active_types.h b/drivers/gpu/drm/i915/i915_active_types.h
new file mode 100644
index 000000000000..b679253b53a5
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_active_types.h
@@ -0,0 +1,36 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef _I915_ACTIVE_TYPES_H_
+#define _I915_ACTIVE_TYPES_H_
+
+#include <linux/rbtree.h>
+#include <linux/rcupdate.h>
+
+struct drm_i915_private;
+struct i915_active_request;
+struct i915_request;
+
+typedef void (*i915_active_retire_fn)(struct i915_active_request *,
+				      struct i915_request *);
+
+struct i915_active_request {
+	struct i915_request __rcu *request;
+	struct list_head link;
+	i915_active_retire_fn retire;
+};
+
+struct i915_active {
+	struct drm_i915_private *i915;
+
+	struct rb_root tree;
+	struct i915_active_request last;
+	unsigned int count;
+
+	void (*retire)(struct i915_active *ref);
+};
+
+#endif /* _I915_ACTIVE_TYPES_H_ */
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 95478db9998b..33e8eed64423 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -865,7 +865,7 @@ void intel_engine_init_cmd_parser(struct intel_engine_cs *engine)
 	int cmd_table_count;
 	int ret;
 
-	if (!IS_GEN7(engine->i915))
+	if (!IS_GEN(engine->i915, 7))
 		return;
 
 	switch (engine->id) {
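
IS_GEN7() and its siblings give way to IS_GEN(i915, n) throughout this pull: the generation becomes an argument rather than part of the macro name, and IS_GEN_RANGE(i915, lo, hi) covers spans (see the swizzle hunk further down, where IS_GEN3 || IS_GEN4 becomes IS_GEN_RANGE(dev_priv, 3, 4)). In spirit (the helper name here is illustrative):

	/* was */
	if (IS_GEN7(engine->i915))
		init_gen7_cmd_parser();

	/* now */
	if (IS_GEN(engine->i915, 7))
		init_gen7_cmd_parser();
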
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 40a61ef9aac1..0bd890c04fe4 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -26,12 +26,15 @@
  *
  */
 
-#include <linux/debugfs.h>
 #include <linux/sort.h>
 #include <linux/sched/mm.h>
+#include <drm/drm_debugfs.h>
+#include <drm/drm_fourcc.h>
 #include "intel_drv.h"
 #include "intel_guc_submission.h"
 
+#include "i915_reset.h"
+
 static inline struct drm_i915_private *node_to_i915(struct drm_info_node *node)
 {
 	return to_i915(node->minor->dev);
@@ -48,7 +51,7 @@ static int i915_capabilities(struct seq_file *m, void *data)
 	seq_printf(m, "pch: %d\n", INTEL_PCH_TYPE(dev_priv));
 
 	intel_device_info_dump_flags(info, &p);
-	intel_device_info_dump_runtime(info, &p);
+	intel_device_info_dump_runtime(RUNTIME_INFO(dev_priv), &p);
 	intel_driver_caps_print(&dev_priv->caps, &p);
 
 	kernel_param_lock(THIS_MODULE);
@@ -157,14 +160,14 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		   obj->mm.madv == I915_MADV_DONTNEED ? " purgeable" : "");
 	if (obj->base.name)
 		seq_printf(m, " (name: %d)", obj->base.name);
-	list_for_each_entry(vma, &obj->vma_list, obj_link) {
+	list_for_each_entry(vma, &obj->vma.list, obj_link) {
 		if (i915_vma_is_pinned(vma))
 			pin_count++;
 	}
 	seq_printf(m, " (pinned x %d)", pin_count);
 	if (obj->pin_global)
 		seq_printf(m, " (global)");
-	list_for_each_entry(vma, &obj->vma_list, obj_link) {
+	list_for_each_entry(vma, &obj->vma.list, obj_link) {
 		if (!drm_mm_node_allocated(&vma->node))
 			continue;
 
@@ -204,7 +207,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		if (vma->fence)
 			seq_printf(m, " , fence: %d%s",
 				   vma->fence->id,
-				   i915_gem_active_isset(&vma->last_fence) ? "*" : "");
+				   i915_active_request_isset(&vma->last_fence) ? "*" : "");
 		seq_puts(m, ")");
 	}
 	if (obj->stolen)
@@ -297,11 +300,12 @@ out:
 }
 
 struct file_stats {
-	struct drm_i915_file_private *file_priv;
+	struct i915_address_space *vm;
 	unsigned long count;
 	u64 total, unbound;
 	u64 global, shared;
 	u64 active, inactive;
+	u64 closed;
 };
 
 static int per_file_stats(int id, void *ptr, void *data)
@@ -319,16 +323,14 @@ static int per_file_stats(int id, void *ptr, void *data)
 	if (obj->base.name || obj->base.dma_buf)
 		stats->shared += obj->base.size;
 
-	list_for_each_entry(vma, &obj->vma_list, obj_link) {
+	list_for_each_entry(vma, &obj->vma.list, obj_link) {
 		if (!drm_mm_node_allocated(&vma->node))
 			continue;
 
 		if (i915_vma_is_ggtt(vma)) {
 			stats->global += vma->node.size;
 		} else {
-			struct i915_hw_ppgtt *ppgtt = i915_vm_to_ppgtt(vma->vm);
-
-			if (ppgtt->vm.file != stats->file_priv)
+			if (vma->vm != stats->vm)
 				continue;
 		}
 
@@ -336,6 +338,9 @@ static int per_file_stats(int id, void *ptr, void *data)
 			stats->active += vma->node.size;
 		else
 			stats->inactive += vma->node.size;
+
+		if (i915_vma_is_closed(vma))
+			stats->closed += vma->node.size;
 	}
 
 	return 0;
@@ -343,7 +348,7 @@ static int per_file_stats(int id, void *ptr, void *data)
 
 #define print_file_stats(m, name, stats) do { \
 	if (stats.count) \
-		seq_printf(m, "%s: %lu objects, %llu bytes (%llu active, %llu inactive, %llu global, %llu shared, %llu unbound)\n", \
+		seq_printf(m, "%s: %lu objects, %llu bytes (%llu active, %llu inactive, %llu global, %llu shared, %llu unbound, %llu closed)\n", \
 			   name, \
 			   stats.count, \
 			   stats.total, \
@@ -351,20 +356,19 @@ static int per_file_stats(int id, void *ptr, void *data)
 			   stats.inactive, \
 			   stats.global, \
 			   stats.shared, \
-			   stats.unbound); \
+			   stats.unbound, \
+			   stats.closed); \
 } while (0)
 
 static void print_batch_pool_stats(struct seq_file *m,
 				   struct drm_i915_private *dev_priv)
 {
 	struct drm_i915_gem_object *obj;
-	struct file_stats stats;
 	struct intel_engine_cs *engine;
+	struct file_stats stats = {};
 	enum intel_engine_id id;
 	int j;
 
-	memset(&stats, 0, sizeof(stats));
-
 	for_each_engine(engine, dev_priv, id) {
 		for (j = 0; j < ARRAY_SIZE(engine->batch_pool.cache_list); j++) {
 			list_for_each_entry(obj,
@@ -377,44 +381,47 @@ static void print_batch_pool_stats(struct seq_file *m,
 	print_file_stats(m, "[k]batch pool", stats);
 }
 
-static int per_file_ctx_stats(int idx, void *ptr, void *data)
+static void print_context_stats(struct seq_file *m,
+				struct drm_i915_private *i915)
 {
-	struct i915_gem_context *ctx = ptr;
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
+	struct file_stats kstats = {};
+	struct i915_gem_context *ctx;
 
-	for_each_engine(engine, ctx->i915, id) {
-		struct intel_context *ce = to_intel_context(ctx, engine);
+	list_for_each_entry(ctx, &i915->contexts.list, link) {
+		struct intel_engine_cs *engine;
+		enum intel_engine_id id;
 
-		if (ce->state)
-			per_file_stats(0, ce->state->obj, data);
-		if (ce->ring)
-			per_file_stats(0, ce->ring->vma->obj, data);
-	}
+		for_each_engine(engine, i915, id) {
+			struct intel_context *ce = to_intel_context(ctx, engine);
 
-	return 0;
-}
+			if (ce->state)
+				per_file_stats(0, ce->state->obj, &kstats);
+			if (ce->ring)
+				per_file_stats(0, ce->ring->vma->obj, &kstats);
+		}
 
-static void print_context_stats(struct seq_file *m,
-				struct drm_i915_private *dev_priv)
-{
-	struct drm_device *dev = &dev_priv->drm;
-	struct file_stats stats;
-	struct drm_file *file;
+		if (!IS_ERR_OR_NULL(ctx->file_priv)) {
+			struct file_stats stats = { .vm = &ctx->ppgtt->vm, };
+			struct drm_file *file = ctx->file_priv->file;
+			struct task_struct *task;
+			char name[80];
 
-	memset(&stats, 0, sizeof(stats));
+			spin_lock(&file->table_lock);
+			idr_for_each(&file->object_idr, per_file_stats, &stats);
+			spin_unlock(&file->table_lock);
 
-	mutex_lock(&dev->struct_mutex);
-	if (dev_priv->kernel_context)
-		per_file_ctx_stats(0, dev_priv->kernel_context, &stats);
+			rcu_read_lock();
+			task = pid_task(ctx->pid ?: file->pid, PIDTYPE_PID);
+			snprintf(name, sizeof(name), "%s/%d",
+				 task ? task->comm : "<unknown>",
+				 ctx->user_handle);
+			rcu_read_unlock();
 
-	list_for_each_entry(file, &dev->filelist, lhead) {
-		struct drm_i915_file_private *fpriv = file->driver_priv;
-		idr_for_each(&fpriv->context_idr, per_file_ctx_stats, &stats);
+			print_file_stats(m, name, stats);
+		}
 	}
-	mutex_unlock(&dev->struct_mutex);
 
-	print_file_stats(m, "[k]contexts", stats);
+	print_file_stats(m, "[k]contexts", kstats);
 }
 
 static int i915_gem_object_info(struct seq_file *m, void *data)
@@ -426,14 +433,9 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 	u64 size, mapped_size, purgeable_size, dpy_size, huge_size;
 	struct drm_i915_gem_object *obj;
 	unsigned int page_sizes = 0;
-	struct drm_file *file;
 	char buf[80];
 	int ret;
 
-	ret = mutex_lock_interruptible(&dev->struct_mutex);
-	if (ret)
-		return ret;
-
 	seq_printf(m, "%u objects, %llu bytes\n",
 		   dev_priv->mm.object_count,
 		   dev_priv->mm.object_memory);
@@ -514,43 +516,14 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 					buf, sizeof(buf)));
 
 	seq_putc(m, '\n');
-	print_batch_pool_stats(m, dev_priv);
-	mutex_unlock(&dev->struct_mutex);
-
-	mutex_lock(&dev->filelist_mutex);
-	print_context_stats(m, dev_priv);
-	list_for_each_entry_reverse(file, &dev->filelist, lhead) {
-		struct file_stats stats;
-		struct drm_i915_file_private *file_priv = file->driver_priv;
-		struct i915_request *request;
-		struct task_struct *task;
-
-		mutex_lock(&dev->struct_mutex);
 
-		memset(&stats, 0, sizeof(stats));
-		stats.file_priv = file->driver_priv;
-		spin_lock(&file->table_lock);
-		idr_for_each(&file->object_idr, per_file_stats, &stats);
-		spin_unlock(&file->table_lock);
-		/*
-		 * Although we have a valid reference on file->pid, that does
-		 * not guarantee that the task_struct who called get_pid() is
-		 * still alive (e.g. get_pid(current) => fork() => exit()).
-		 * Therefore, we need to protect this ->comm access using RCU.
-		 */
-		request = list_first_entry_or_null(&file_priv->mm.request_list,
-						   struct i915_request,
-						   client_link);
-		rcu_read_lock();
-		task = pid_task(request && request->gem_context->pid ?
-				request->gem_context->pid : file->pid,
-				PIDTYPE_PID);
-		print_file_stats(m, task ? task->comm : "<unknown>", stats);
-		rcu_read_unlock();
+	ret = mutex_lock_interruptible(&dev->struct_mutex);
+	if (ret)
+		return ret;
 
-		mutex_unlock(&dev->struct_mutex);
-	}
-	mutex_unlock(&dev->filelist_mutex);
+	print_batch_pool_stats(m, dev_priv);
+	print_context_stats(m, dev_priv);
+	mutex_unlock(&dev->struct_mutex);
 
 	return 0;
 }
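
The object-info rework hinges on re-keying per_file_stats(): a vma is attributed to a context when its address space matches stats->vm, which lets print_context_stats() walk i915->contexts.list directly instead of iterating the filelist and taking struct_mutex per file. The matching rule as a standalone sketch (the helper name is an assumption):

	/* Illustrative: should this vma count toward the vm being summed? */
	static bool vma_counts_for(const struct i915_vma *vma,
				   const struct i915_address_space *vm)
	{
		/* GGTT bindings are accounted globally; ppGTT must match. */
		return i915_vma_is_ggtt(vma) || vma->vm == vm;
	}
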
@@ -656,10 +629,12 @@ static void gen8_display_interrupt_info(struct seq_file *m)
 
 	for_each_pipe(dev_priv, pipe) {
 		enum intel_display_power_domain power_domain;
+		intel_wakeref_t wakeref;
 
 		power_domain = POWER_DOMAIN_PIPE(pipe);
-		if (!intel_display_power_get_if_enabled(dev_priv,
-							power_domain)) {
+		wakeref = intel_display_power_get_if_enabled(dev_priv,
+							     power_domain);
+		if (!wakeref) {
 			seq_printf(m, "Pipe %c power disabled\n",
 				   pipe_name(pipe));
 			continue;
@@ -674,7 +649,7 @@ static void gen8_display_interrupt_info(struct seq_file *m)
 			   pipe_name(pipe),
 			   I915_READ(GEN8_DE_PIPE_IER(pipe)));
 
-		intel_display_power_put(dev_priv, power_domain);
+		intel_display_power_put(dev_priv, power_domain, wakeref);
 	}
 
 	seq_printf(m, "Display Engine port interrupt mask:\t%08x\n",
@@ -704,11 +679,14 @@ static int i915_interrupt_info(struct seq_file *m, void *data)
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
 	int i, pipe;
 
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	if (IS_CHERRYVIEW(dev_priv)) {
+		intel_wakeref_t pref;
+
 		seq_printf(m, "Master Interrupt Control:\t%08x\n",
 			   I915_READ(GEN8_MASTER_IRQ));
 
@@ -724,8 +702,9 @@ static int i915_interrupt_info(struct seq_file *m, void *data)
 			enum intel_display_power_domain power_domain;
 
 			power_domain = POWER_DOMAIN_PIPE(pipe);
-			if (!intel_display_power_get_if_enabled(dev_priv,
-								power_domain)) {
+			pref = intel_display_power_get_if_enabled(dev_priv,
+								  power_domain);
+			if (!pref) {
 				seq_printf(m, "Pipe %c power disabled\n",
 					   pipe_name(pipe));
 				continue;
@@ -735,17 +714,17 @@ static int i915_interrupt_info(struct seq_file *m, void *data)
 				   pipe_name(pipe),
 				   I915_READ(PIPESTAT(pipe)));
 
-			intel_display_power_put(dev_priv, power_domain);
+			intel_display_power_put(dev_priv, power_domain, pref);
 		}
 
-		intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
+		pref = intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
 		seq_printf(m, "Port hotplug:\t%08x\n",
 			   I915_READ(PORT_HOTPLUG_EN));
 		seq_printf(m, "DPFLIPSTAT:\t%08x\n",
 			   I915_READ(VLV_DPFLIPSTAT));
 		seq_printf(m, "DPINVGTT:\t%08x\n",
 			   I915_READ(DPINVGTT));
-		intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
+		intel_display_power_put(dev_priv, POWER_DOMAIN_INIT, pref);
 
 		for (i = 0; i < 4; i++) {
 			seq_printf(m, "GT Interrupt IMR %d:\t%08x\n",
@@ -808,10 +787,12 @@ static int i915_interrupt_info(struct seq_file *m, void *data)
 			   I915_READ(VLV_IMR));
 		for_each_pipe(dev_priv, pipe) {
 			enum intel_display_power_domain power_domain;
+			intel_wakeref_t pref;
 
 			power_domain = POWER_DOMAIN_PIPE(pipe);
-			if (!intel_display_power_get_if_enabled(dev_priv,
-								power_domain)) {
+			pref = intel_display_power_get_if_enabled(dev_priv,
+								  power_domain);
+			if (!pref) {
 				seq_printf(m, "Pipe %c power disabled\n",
 					   pipe_name(pipe));
 				continue;
@@ -820,7 +801,7 @@ static int i915_interrupt_info(struct seq_file *m, void *data)
 			seq_printf(m, "Pipe %c stat:\t%08x\n",
 				   pipe_name(pipe),
 				   I915_READ(PIPESTAT(pipe)));
-			intel_display_power_put(dev_priv, power_domain);
+			intel_display_power_put(dev_priv, power_domain, pref);
 		}
 
 		seq_printf(m, "Master IER:\t%08x\n",
@@ -907,7 +888,7 @@ static int i915_interrupt_info(struct seq_file *m, void *data)
 		}
 	}
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 
 	return 0;
 }
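
This function shows both halves of the new wakeref plumbing: intel_runtime_pm_get() and intel_display_power_get*() now return an intel_wakeref_t cookie, and the matching put takes it back, so an unbalanced reference can be traced to its acquirer. For simple scoped sections, later hunks (i915_hangcheck_info, i915_drpc_info, i915_emon_status) use the block form instead:

	intel_wakeref_t wakeref;

	with_intel_runtime_pm(dev_priv, wakeref) {
		/* device is guaranteed awake inside the block */
		val = I915_READ(SOME_REG);	/* SOME_REG is illustrative */
	}
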
@@ -980,10 +961,11 @@ static int i915_gpu_info_open(struct inode *inode, struct file *file)
 {
 	struct drm_i915_private *i915 = inode->i_private;
 	struct i915_gpu_state *gpu;
+	intel_wakeref_t wakeref;
 
-	intel_runtime_pm_get(i915);
-	gpu = i915_capture_gpu_state(i915);
-	intel_runtime_pm_put(i915);
+	gpu = NULL;
+	with_intel_runtime_pm(i915, wakeref)
+		gpu = i915_capture_gpu_state(i915);
 	if (IS_ERR(gpu))
 		return PTR_ERR(gpu);
 
@@ -1038,39 +1020,16 @@ static const struct file_operations i915_error_state_fops = {
 };
 #endif
 
-static int
-i915_next_seqno_set(void *data, u64 val)
-{
-	struct drm_i915_private *dev_priv = data;
-	struct drm_device *dev = &dev_priv->drm;
-	int ret;
-
-	ret = mutex_lock_interruptible(&dev->struct_mutex);
-	if (ret)
-		return ret;
-
-	intel_runtime_pm_get(dev_priv);
-	ret = i915_gem_set_global_seqno(dev, val);
-	intel_runtime_pm_put(dev_priv);
-
-	mutex_unlock(&dev->struct_mutex);
-
-	return ret;
-}
-
-DEFINE_SIMPLE_ATTRIBUTE(i915_next_seqno_fops,
-			NULL, i915_next_seqno_set,
-			"0x%llx\n");
-
 static int i915_frequency_info(struct seq_file *m, void *unused)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	struct intel_rps *rps = &dev_priv->gt_pm.rps;
+	intel_wakeref_t wakeref;
 	int ret = 0;
 
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
-	if (IS_GEN5(dev_priv)) {
+	if (IS_GEN(dev_priv, 5)) {
 		u16 rgvswctl = I915_READ16(MEMSWCTL);
 		u16 rgvstat = I915_READ16(MEMSTAT_ILK);
 
@@ -1280,7 +1239,7 @@ static int i915_frequency_info(struct seq_file *m, void *unused)
 	seq_printf(m, "Max CD clock frequency: %d kHz\n", dev_priv->max_cdclk_freq);
 	seq_printf(m, "Max pixel clock frequency: %d kHz\n", dev_priv->max_dotclk_freq);
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 	return ret;
 }
 
@@ -1319,14 +1278,13 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 	u64 acthd[I915_NUM_ENGINES];
 	u32 seqno[I915_NUM_ENGINES];
 	struct intel_instdone instdone;
+	intel_wakeref_t wakeref;
 	enum intel_engine_id id;
 
 	if (test_bit(I915_WEDGED, &dev_priv->gpu_error.flags))
 		seq_puts(m, "Wedged\n");
 	if (test_bit(I915_RESET_BACKOFF, &dev_priv->gpu_error.flags))
 		seq_puts(m, "Reset in progress: struct_mutex backoff\n");
-	if (test_bit(I915_RESET_HANDOFF, &dev_priv->gpu_error.flags))
-		seq_puts(m, "Reset in progress: reset handoff to waiter\n");
 	if (waitqueue_active(&dev_priv->gpu_error.wait_queue))
 		seq_puts(m, "Waiter holding struct mutex\n");
 	if (waitqueue_active(&dev_priv->gpu_error.reset_queue))
@@ -1337,17 +1295,15 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 		return 0;
 	}
 
-	intel_runtime_pm_get(dev_priv);
+	with_intel_runtime_pm(dev_priv, wakeref) {
+		for_each_engine(engine, dev_priv, id) {
+			acthd[id] = intel_engine_get_active_head(engine);
+			seqno[id] = intel_engine_get_seqno(engine);
+		}
 
-	for_each_engine(engine, dev_priv, id) {
-		acthd[id] = intel_engine_get_active_head(engine);
-		seqno[id] = intel_engine_get_seqno(engine);
+		intel_engine_get_instdone(dev_priv->engine[RCS], &instdone);
 	}
 
-	intel_engine_get_instdone(dev_priv->engine[RCS], &instdone);
-
-	intel_runtime_pm_put(dev_priv);
-
 	if (timer_pending(&dev_priv->gpu_error.hangcheck_work.timer))
 		seq_printf(m, "Hangcheck active, timer fires in %dms\n",
 			   jiffies_to_msecs(dev_priv->gpu_error.hangcheck_work.timer.expires -
@@ -1360,37 +1316,16 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 	seq_printf(m, "GT active? %s\n", yesno(dev_priv->gt.awake));
 
 	for_each_engine(engine, dev_priv, id) {
-		struct intel_breadcrumbs *b = &engine->breadcrumbs;
-		struct rb_node *rb;
-
 		seq_printf(m, "%s:\n", engine->name);
-		seq_printf(m, "\tseqno = %x [current %x, last %x]\n",
+		seq_printf(m, "\tseqno = %x [current %x, last %x], %dms ago\n",
 			   engine->hangcheck.seqno, seqno[id],
-			   intel_engine_last_submit(engine));
-		seq_printf(m, "\twaiters? %s, fake irq active? %s, stalled? %s, wedged? %s\n",
-			   yesno(intel_engine_has_waiter(engine)),
-			   yesno(test_bit(engine->id,
-					  &dev_priv->gpu_error.missed_irq_rings)),
-			   yesno(engine->hangcheck.stalled),
-			   yesno(engine->hangcheck.wedged));
-
-		spin_lock_irq(&b->rb_lock);
-		for (rb = rb_first(&b->waiters); rb; rb = rb_next(rb)) {
-			struct intel_wait *w = rb_entry(rb, typeof(*w), node);
-
-			seq_printf(m, "\t%s [%d] waiting for %x\n",
-				   w->tsk->comm, w->tsk->pid, w->seqno);
-		}
-		spin_unlock_irq(&b->rb_lock);
+			   intel_engine_last_submit(engine),
+			   jiffies_to_msecs(jiffies -
+					    engine->hangcheck.action_timestamp));
 
 		seq_printf(m, "\tACTHD = 0x%08llx [current 0x%08llx]\n",
 			   (long long)engine->hangcheck.acthd,
 			   (long long)acthd[id]);
-		seq_printf(m, "\taction = %s(%d) %d ms ago\n",
-			   hangcheck_action_to_str(engine->hangcheck.action),
-			   engine->hangcheck.action,
-			   jiffies_to_msecs(jiffies -
-					    engine->hangcheck.action_timestamp));
 
 		if (engine->id == RCS) {
 			seq_puts(m, "\tinstdone read =\n");
@@ -1622,18 +1557,17 @@ static int gen6_drpc_info(struct seq_file *m)
 static int i915_drpc_info(struct seq_file *m, void *unused)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
-	int err;
-
-	intel_runtime_pm_get(dev_priv);
-
-	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
-		err = vlv_drpc_info(m);
-	else if (INTEL_GEN(dev_priv) >= 6)
-		err = gen6_drpc_info(m);
-	else
-		err = ironlake_drpc_info(m);
-
-	intel_runtime_pm_put(dev_priv);
+	intel_wakeref_t wakeref;
+	int err = -ENODEV;
+
+	with_intel_runtime_pm(dev_priv, wakeref) {
+		if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
+			err = vlv_drpc_info(m);
+		else if (INTEL_GEN(dev_priv) >= 6)
+			err = gen6_drpc_info(m);
+		else
+			err = ironlake_drpc_info(m);
+	}
 
 	return err;
 }
@@ -1655,11 +1589,12 @@ static int i915_fbc_status(struct seq_file *m, void *unused)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	struct intel_fbc *fbc = &dev_priv->fbc;
+	intel_wakeref_t wakeref;
 
 	if (!HAS_FBC(dev_priv))
 		return -ENODEV;
 
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 	mutex_lock(&fbc->lock);
 
 	if (intel_fbc_is_active(dev_priv))
@@ -1686,7 +1621,7 @@ static int i915_fbc_status(struct seq_file *m, void *unused)
 	}
 
 	mutex_unlock(&fbc->lock);
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 
 	return 0;
 }
@@ -1731,11 +1666,12 @@ DEFINE_SIMPLE_ATTRIBUTE(i915_fbc_false_color_fops,
 static int i915_ips_status(struct seq_file *m, void *unused)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
+	intel_wakeref_t wakeref;
 
 	if (!HAS_IPS(dev_priv))
 		return -ENODEV;
 
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	seq_printf(m, "Enabled by kernel parameter: %s\n",
 		   yesno(i915_modparams.enable_ips));
@@ -1749,7 +1685,7 @@ static int i915_ips_status(struct seq_file *m, void *unused)
 			seq_puts(m, "Currently: disabled\n");
 	}
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 
 	return 0;
 }
@@ -1757,10 +1693,10 @@ static int i915_ips_status(struct seq_file *m, void *unused)
 static int i915_sr_status(struct seq_file *m, void *unused)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
+	intel_wakeref_t wakeref;
 	bool sr_enabled = false;
 
-	intel_runtime_pm_get(dev_priv);
-	intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
+	wakeref = intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
 
 	if (INTEL_GEN(dev_priv) >= 9)
 		/* no global SR status; inspect per-plane WM */;
@@ -1776,8 +1712,7 @@ static int i915_sr_status(struct seq_file *m, void *unused)
 	else if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
 		sr_enabled = I915_READ(FW_BLC_SELF_VLV) & FW_CSPWRDWNEN;
 
-	intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
-	intel_runtime_pm_put(dev_priv);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_INIT, wakeref);
 
 	seq_printf(m, "self-refresh: %s\n", enableddisabled(sr_enabled));
 
@@ -1786,31 +1721,24 @@ static int i915_sr_status(struct seq_file *m, void *unused)
 
 static int i915_emon_status(struct seq_file *m, void *unused)
 {
-	struct drm_i915_private *dev_priv = node_to_i915(m->private);
-	struct drm_device *dev = &dev_priv->drm;
-	unsigned long temp, chipset, gfx;
-	int ret;
+	struct drm_i915_private *i915 = node_to_i915(m->private);
+	intel_wakeref_t wakeref;
 
-	if (!IS_GEN5(dev_priv))
+	if (!IS_GEN(i915, 5))
 		return -ENODEV;
 
-	intel_runtime_pm_get(dev_priv);
+	with_intel_runtime_pm(i915, wakeref) {
+		unsigned long temp, chipset, gfx;
 
-	ret = mutex_lock_interruptible(&dev->struct_mutex);
-	if (ret)
-		return ret;
-
-	temp = i915_mch_val(dev_priv);
-	chipset = i915_chipset_val(dev_priv);
-	gfx = i915_gfx_val(dev_priv);
-	mutex_unlock(&dev->struct_mutex);
+		temp = i915_mch_val(i915);
+		chipset = i915_chipset_val(i915);
+		gfx = i915_gfx_val(i915);
 
-	seq_printf(m, "GMCH temp: %ld\n", temp);
-	seq_printf(m, "Chipset power: %ld\n", chipset);
-	seq_printf(m, "GFX power: %ld\n", gfx);
-	seq_printf(m, "Total power: %ld\n", chipset + gfx);
-
-	intel_runtime_pm_put(dev_priv);
+		seq_printf(m, "GMCH temp: %ld\n", temp);
+		seq_printf(m, "Chipset power: %ld\n", chipset);
+		seq_printf(m, "GFX power: %ld\n", gfx);
+		seq_printf(m, "Total power: %ld\n", chipset + gfx);
+	}
 
 	return 0;
 }
@@ -1820,13 +1748,14 @@ static int i915_ring_freq_table(struct seq_file *m, void *unused)
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	struct intel_rps *rps = &dev_priv->gt_pm.rps;
 	unsigned int max_gpu_freq, min_gpu_freq;
+	intel_wakeref_t wakeref;
 	int gpu_freq, ia_freq;
 	int ret;
 
 	if (!HAS_LLC(dev_priv))
 		return -ENODEV;
 
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	ret = mutex_lock_interruptible(&dev_priv->pcu_lock);
 	if (ret)
@@ -1859,7 +1788,7 @@ static int i915_ring_freq_table(struct seq_file *m, void *unused)
 	mutex_unlock(&dev_priv->pcu_lock);
 
 out:
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 	return ret;
 }
 
@@ -2032,15 +1961,16 @@ static const char *swizzle_string(unsigned swizzle)
 static int i915_swizzle_info(struct seq_file *m, void *data)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
+	intel_wakeref_t wakeref;
 
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	seq_printf(m, "bit6 swizzle for X-tiling = %s\n",
 		   swizzle_string(dev_priv->mm.bit_6_swizzle_x));
 	seq_printf(m, "bit6 swizzle for Y-tiling = %s\n",
 		   swizzle_string(dev_priv->mm.bit_6_swizzle_y));
 
-	if (IS_GEN3(dev_priv) || IS_GEN4(dev_priv)) {
+	if (IS_GEN_RANGE(dev_priv, 3, 4)) {
 		seq_printf(m, "DDC = 0x%08x\n",
 			   I915_READ(DCC));
 		seq_printf(m, "DDC2 = 0x%08x\n",
@@ -2071,141 +2001,11 @@ static int i915_swizzle_info(struct seq_file *m, void *data)
 	if (dev_priv->quirks & QUIRK_PIN_SWIZZLED_PAGES)
 		seq_puts(m, "L-shaped memory detected\n");
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 
 	return 0;
 }
 
-static int per_file_ctx(int id, void *ptr, void *data)
-{
-	struct i915_gem_context *ctx = ptr;
-	struct seq_file *m = data;
-	struct i915_hw_ppgtt *ppgtt = ctx->ppgtt;
-
-	if (!ppgtt) {
-		seq_printf(m, "  no ppgtt for context %d\n",
-			   ctx->user_handle);
-		return 0;
-	}
-
-	if (i915_gem_context_is_default(ctx))
-		seq_puts(m, "  default context:\n");
-	else
-		seq_printf(m, "  context %d:\n", ctx->user_handle);
-	ppgtt->debug_dump(ppgtt, m);
-
-	return 0;
-}
-
-static void gen8_ppgtt_info(struct seq_file *m,
-			    struct drm_i915_private *dev_priv)
-{
-	struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	int i;
-
-	if (!ppgtt)
-		return;
-
-	for_each_engine(engine, dev_priv, id) {
-		seq_printf(m, "%s\n", engine->name);
-		for (i = 0; i < 4; i++) {
-			u64 pdp = I915_READ(GEN8_RING_PDP_UDW(engine, i));
-			pdp <<= 32;
-			pdp |= I915_READ(GEN8_RING_PDP_LDW(engine, i));
-			seq_printf(m, "\tPDP%d 0x%016llx\n", i, pdp);
-		}
-	}
-}
-
-static void gen6_ppgtt_info(struct seq_file *m,
-			    struct drm_i915_private *dev_priv)
-{
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-
-	if (IS_GEN6(dev_priv))
-		seq_printf(m, "GFX_MODE: 0x%08x\n", I915_READ(GFX_MODE));
-
-	for_each_engine(engine, dev_priv, id) {
-		seq_printf(m, "%s\n", engine->name);
-		if (IS_GEN7(dev_priv))
-			seq_printf(m, "GFX_MODE: 0x%08x\n",
-				   I915_READ(RING_MODE_GEN7(engine)));
-		seq_printf(m, "PP_DIR_BASE: 0x%08x\n",
-			   I915_READ(RING_PP_DIR_BASE(engine)));
-		seq_printf(m, "PP_DIR_BASE_READ: 0x%08x\n",
-			   I915_READ(RING_PP_DIR_BASE_READ(engine)));
-		seq_printf(m, "PP_DIR_DCLV: 0x%08x\n",
-			   I915_READ(RING_PP_DIR_DCLV(engine)));
-	}
-	if (dev_priv->mm.aliasing_ppgtt) {
-		struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
-
-		seq_puts(m, "aliasing PPGTT:\n");
-		seq_printf(m, "pd gtt offset: 0x%08x\n", ppgtt->pd.base.ggtt_offset);
-
-		ppgtt->debug_dump(ppgtt, m);
-	}
-
-	seq_printf(m, "ECOCHK: 0x%08x\n", I915_READ(GAM_ECOCHK));
-}
-
-static int i915_ppgtt_info(struct seq_file *m, void *data)
-{
-	struct drm_i915_private *dev_priv = node_to_i915(m->private);
-	struct drm_device *dev = &dev_priv->drm;
-	struct drm_file *file;
-	int ret;
-
-	mutex_lock(&dev->filelist_mutex);
-	ret = mutex_lock_interruptible(&dev->struct_mutex);
-	if (ret)
-		goto out_unlock;
-
-	intel_runtime_pm_get(dev_priv);
-
-	if (INTEL_GEN(dev_priv) >= 8)
-		gen8_ppgtt_info(m, dev_priv);
-	else if (INTEL_GEN(dev_priv) >= 6)
-		gen6_ppgtt_info(m, dev_priv);
-
-	list_for_each_entry_reverse(file, &dev->filelist, lhead) {
-		struct drm_i915_file_private *file_priv = file->driver_priv;
-		struct task_struct *task;
-
-		task = get_pid_task(file->pid, PIDTYPE_PID);
-		if (!task) {
-			ret = -ESRCH;
-			goto out_rpm;
-		}
-		seq_printf(m, "\nproc: %s\n", task->comm);
-		put_task_struct(task);
-		idr_for_each(&file_priv->context_idr, per_file_ctx,
-			     (void *)(unsigned long)m);
-	}
-
-out_rpm:
-	intel_runtime_pm_put(dev_priv);
-	mutex_unlock(&dev->struct_mutex);
-out_unlock:
-	mutex_unlock(&dev->filelist_mutex);
-	return ret;
-}
-
-static int count_irq_waiters(struct drm_i915_private *i915)
-{
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	int count = 0;
-
-	for_each_engine(engine, i915, id)
-		count += intel_engine_has_waiter(engine);
-
-	return count;
-}
-
 static const char *rps_power_to_str(unsigned int power)
 {
 	static const char * const strings[] = {
@@ -2226,9 +2026,10 @@ static int i915_rps_boost_info(struct seq_file *m, void *data)
 	struct drm_device *dev = &dev_priv->drm;
 	struct intel_rps *rps = &dev_priv->gt_pm.rps;
 	u32 act_freq = rps->cur_freq;
+	intel_wakeref_t wakeref;
 	struct drm_file *file;
 
-	if (intel_runtime_pm_get_if_in_use(dev_priv)) {
+	with_intel_runtime_pm_if_in_use(dev_priv, wakeref) {
 		if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
 			mutex_lock(&dev_priv->pcu_lock);
 			act_freq = vlv_punit_read(dev_priv,
@@ -2239,13 +2040,11 @@ static int i915_rps_boost_info(struct seq_file *m, void *data)
 			act_freq = intel_get_cagf(dev_priv,
 						  I915_READ(GEN6_RPSTAT1));
 		}
-		intel_runtime_pm_put(dev_priv);
 	}
 
 	seq_printf(m, "RPS enabled? %d\n", rps->enabled);
 	seq_printf(m, "GPU busy? %s [%d requests]\n",
 		   yesno(dev_priv->gt.awake), dev_priv->gt.active_requests);
-	seq_printf(m, "CPU waiting? %d\n", count_irq_waiters(dev_priv));
 	seq_printf(m, "Boosts outstanding? %d\n",
 		   atomic_read(&rps->num_waiters));
 	seq_printf(m, "Interactive? %d\n", READ_ONCE(rps->power.interactive));
@@ -2322,6 +2121,7 @@ static int i915_llc(struct seq_file *m, void *data)
 static int i915_huc_load_status_info(struct seq_file *m, void *data)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
+	intel_wakeref_t wakeref;
 	struct drm_printer p;
 
 	if (!HAS_HUC(dev_priv))
@@ -2330,9 +2130,8 @@ static int i915_huc_load_status_info(struct seq_file *m, void *data)
 	p = drm_seq_file_printer(m);
 	intel_uc_fw_dump(&dev_priv->huc.fw, &p);
 
-	intel_runtime_pm_get(dev_priv);
-	seq_printf(m, "\nHuC status 0x%08x:\n", I915_READ(HUC_STATUS2));
-	intel_runtime_pm_put(dev_priv);
+	with_intel_runtime_pm(dev_priv, wakeref)
+		seq_printf(m, "\nHuC status 0x%08x:\n", I915_READ(HUC_STATUS2));
 
 	return 0;
 }
@@ -2340,8 +2139,8 @@ static int i915_huc_load_status_info(struct seq_file *m, void *data)
 static int i915_guc_load_status_info(struct seq_file *m, void *data)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
+	intel_wakeref_t wakeref;
 	struct drm_printer p;
-	u32 tmp, i;
 
 	if (!HAS_GUC(dev_priv))
 		return -ENODEV;
@@ -2349,22 +2148,23 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data)
 	p = drm_seq_file_printer(m);
 	intel_uc_fw_dump(&dev_priv->guc.fw, &p);
 
-	intel_runtime_pm_get(dev_priv);
-
-	tmp = I915_READ(GUC_STATUS);
-
-	seq_printf(m, "\nGuC status 0x%08x:\n", tmp);
-	seq_printf(m, "\tBootrom status = 0x%x\n",
-		(tmp & GS_BOOTROM_MASK) >> GS_BOOTROM_SHIFT);
-	seq_printf(m, "\tuKernel status = 0x%x\n",
-		(tmp & GS_UKERNEL_MASK) >> GS_UKERNEL_SHIFT);
-	seq_printf(m, "\tMIA Core status = 0x%x\n",
-		(tmp & GS_MIA_MASK) >> GS_MIA_SHIFT);
-	seq_puts(m, "\nScratch registers:\n");
-	for (i = 0; i < 16; i++)
-		seq_printf(m, "\t%2d: \t0x%x\n", i, I915_READ(SOFT_SCRATCH(i)));
-
-	intel_runtime_pm_put(dev_priv);
+	with_intel_runtime_pm(dev_priv, wakeref) {
+		u32 tmp = I915_READ(GUC_STATUS);
+		u32 i;
+
+		seq_printf(m, "\nGuC status 0x%08x:\n", tmp);
+		seq_printf(m, "\tBootrom status = 0x%x\n",
+			   (tmp & GS_BOOTROM_MASK) >> GS_BOOTROM_SHIFT);
+		seq_printf(m, "\tuKernel status = 0x%x\n",
+			   (tmp & GS_UKERNEL_MASK) >> GS_UKERNEL_SHIFT);
+		seq_printf(m, "\tMIA Core status = 0x%x\n",
+			   (tmp & GS_MIA_MASK) >> GS_MIA_SHIFT);
+		seq_puts(m, "\nScratch registers:\n");
+		for (i = 0; i < 16; i++) {
+			seq_printf(m, "\t%2d: \t0x%x\n",
+				   i, I915_READ(SOFT_SCRATCH(i)));
+		}
+	}
 
 	return 0;
 }
@@ -2416,7 +2216,7 @@ static void i915_guc_client_info(struct seq_file *m,
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	uint64_t tot = 0;
+	u64 tot = 0;
 
 	seq_printf(m, "\tPriority %d, GuC stage index: %u, PD offset 0x%x\n",
 		client->priority, client->stage_id, client->proc_desc_offset);
@@ -2671,7 +2471,8 @@ DEFINE_SHOW_ATTRIBUTE(i915_psr_sink_status);
 static void
 psr_source_status(struct drm_i915_private *dev_priv, struct seq_file *m)
 {
-	u32 val, psr_status;
+	u32 val, status_val;
+	const char *status = "unknown";
 
 	if (dev_priv->psr.psr2_enabled) {
 		static const char * const live_status[] = {
@@ -2687,14 +2488,11 @@ psr_source_status(struct drm_i915_private *dev_priv, struct seq_file *m)
 			"BUF_ON",
 			"TG_ON"
 		};
-		psr_status = I915_READ(EDP_PSR2_STATUS);
-		val = (psr_status & EDP_PSR2_STATUS_STATE_MASK) >>
-			EDP_PSR2_STATUS_STATE_SHIFT;
-		if (val < ARRAY_SIZE(live_status)) {
-			seq_printf(m, "Source PSR status: 0x%x [%s]\n",
-				   psr_status, live_status[val]);
-			return;
-		}
+		val = I915_READ(EDP_PSR2_STATUS);
+		status_val = (val & EDP_PSR2_STATUS_STATE_MASK) >>
+			      EDP_PSR2_STATUS_STATE_SHIFT;
+		if (status_val < ARRAY_SIZE(live_status))
+			status = live_status[status_val];
 	} else {
 		static const char * const live_status[] = {
 			"IDLE",
@@ -2706,74 +2504,102 @@ psr_source_status(struct drm_i915_private *dev_priv, struct seq_file *m)
 			"SRDOFFACK",
 			"SRDENT_ON",
 		};
-		psr_status = I915_READ(EDP_PSR_STATUS);
-		val = (psr_status & EDP_PSR_STATUS_STATE_MASK) >>
-			EDP_PSR_STATUS_STATE_SHIFT;
-		if (val < ARRAY_SIZE(live_status)) {
-			seq_printf(m, "Source PSR status: 0x%x [%s]\n",
-				   psr_status, live_status[val]);
-			return;
-		}
+		val = I915_READ(EDP_PSR_STATUS);
+		status_val = (val & EDP_PSR_STATUS_STATE_MASK) >>
+			      EDP_PSR_STATUS_STATE_SHIFT;
+		if (status_val < ARRAY_SIZE(live_status))
+			status = live_status[status_val];
 	}
 
-	seq_printf(m, "Source PSR status: 0x%x [%s]\n", psr_status, "unknown");
+	seq_printf(m, "Source PSR status: %s [0x%08x]\n", status, val);
 }
 
 static int i915_edp_psr_status(struct seq_file *m, void *data)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
-	u32 psrperf = 0;
-	bool enabled = false;
-	bool sink_support;
+	struct i915_psr *psr = &dev_priv->psr;
+	intel_wakeref_t wakeref;
+	const char *status;
+	bool enabled;
+	u32 val;
 
 	if (!HAS_PSR(dev_priv))
 		return -ENODEV;
 
-	sink_support = dev_priv->psr.sink_support;
-	seq_printf(m, "Sink_Support: %s\n", yesno(sink_support));
-	if (!sink_support)
-		return 0;
+	seq_printf(m, "Sink support: %s", yesno(psr->sink_support));
+	if (psr->dp)
+		seq_printf(m, " [0x%02x]", psr->dp->psr_dpcd[0]);
+	seq_puts(m, "\n");
 
-	intel_runtime_pm_get(dev_priv);
+	if (!psr->sink_support)
+		return 0;
 
-	mutex_lock(&dev_priv->psr.lock);
-	seq_printf(m, "PSR mode: %s\n",
-		   dev_priv->psr.psr2_enabled ? "PSR2" : "PSR1");
-	seq_printf(m, "Enabled: %s\n", yesno(dev_priv->psr.enabled));
-	seq_printf(m, "Busy frontbuffer bits: 0x%03x\n",
-		   dev_priv->psr.busy_frontbuffer_bits);
+	wakeref = intel_runtime_pm_get(dev_priv);
+	mutex_lock(&psr->lock);
 
-	if (dev_priv->psr.psr2_enabled)
-		enabled = I915_READ(EDP_PSR2_CTL) & EDP_PSR2_ENABLE;
+	if (psr->enabled)
+		status = psr->psr2_enabled ? "PSR2 enabled" : "PSR1 enabled";
 	else
-		enabled = I915_READ(EDP_PSR_CTL) & EDP_PSR_ENABLE;
+		status = "disabled";
+	seq_printf(m, "PSR mode: %s\n", status);
 
-	seq_printf(m, "Main link in standby mode: %s\n",
-		   yesno(dev_priv->psr.link_standby));
+	if (!psr->enabled)
+		goto unlock;
 
-	seq_printf(m, "HW Enabled & Active bit: %s\n", yesno(enabled));
+	if (psr->psr2_enabled) {
+		val = I915_READ(EDP_PSR2_CTL);
+		enabled = val & EDP_PSR2_ENABLE;
+	} else {
+		val = I915_READ(EDP_PSR_CTL);
+		enabled = val & EDP_PSR_ENABLE;
+	}
+	seq_printf(m, "Source PSR ctl: %s [0x%08x]\n",
+		   enableddisabled(enabled), val);
+	psr_source_status(dev_priv, m);
+	seq_printf(m, "Busy frontbuffer bits: 0x%08x\n",
+		   psr->busy_frontbuffer_bits);
 
 	/*
 	 * SKL+ Perf counter is reset to 0 every time DC state is entered
 	 */
 	if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv)) {
-		psrperf = I915_READ(EDP_PSR_PERF_CNT) &
-			EDP_PSR_PERF_CNT_MASK;
+		val = I915_READ(EDP_PSR_PERF_CNT) & EDP_PSR_PERF_CNT_MASK;
+		seq_printf(m, "Performance counter: %u\n", val);
+	}
 
-		seq_printf(m, "Performance_Counter: %u\n", psrperf);
+	if (psr->debug & I915_PSR_DEBUG_IRQ) {
+		seq_printf(m, "Last attempted entry at: %lld\n",
+			   psr->last_entry_attempt);
+		seq_printf(m, "Last exit at: %lld\n", psr->last_exit);
 	}
 
-	psr_source_status(dev_priv, m);
-	mutex_unlock(&dev_priv->psr.lock);
+	if (psr->psr2_enabled) {
+		u32 su_frames_val[3];
+		int frame;
 
-	if (READ_ONCE(dev_priv->psr.debug) & I915_PSR_DEBUG_IRQ) {
-		seq_printf(m, "Last attempted entry at: %lld\n",
-			   dev_priv->psr.last_entry_attempt);
-		seq_printf(m, "Last exit at: %lld\n",
-			   dev_priv->psr.last_exit);
+		/*
+		 * Read all 3 registers beforehand to minimize the chance of
+		 * crossing a frame boundary between register reads
+		 */
+		for (frame = 0; frame < PSR2_SU_STATUS_FRAMES; frame += 3)
+			su_frames_val[frame / 3] = I915_READ(PSR2_SU_STATUS(frame));
+
+		seq_puts(m, "Frame:\tPSR2 SU blocks:\n");
+
+		for (frame = 0; frame < PSR2_SU_STATUS_FRAMES; frame++) {
+			u32 su_blocks;
+
+			su_blocks = su_frames_val[frame / 3] &
+				    PSR2_SU_STATUS_MASK(frame);
+			su_blocks = su_blocks >> PSR2_SU_STATUS_SHIFT(frame);
+			seq_printf(m, "%d\t%d\n", frame, su_blocks);
+		}
 	}
 
-	intel_runtime_pm_put(dev_priv);
+unlock:
+	mutex_unlock(&psr->lock);
+	intel_runtime_pm_put(dev_priv, wakeref);
+
 	return 0;
 }
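
The new PSR2 selective-update dump packs three frames of SU block counts into each PSR2_SU_STATUS register, which is why the loop above batches all the register reads before printing. The frame-to-register arithmetic, pulled out into an illustrative helper (not part of the patch):

	/* Frame f lives in su_frames_val[f / 3]; mask/shift are per-frame macros. */
	static u32 psr2_su_blocks(const u32 su_frames_val[3], int frame)
	{
		u32 v = su_frames_val[frame / 3] & PSR2_SU_STATUS_MASK(frame);

		return v >> PSR2_SU_STATUS_SHIFT(frame);
	}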
 
@@ -2782,6 +2608,7 @@ i915_edp_psr_debug_set(void *data, u64 val)
 {
 	struct drm_i915_private *dev_priv = data;
 	struct drm_modeset_acquire_ctx ctx;
+	intel_wakeref_t wakeref;
 	int ret;
 
 	if (!CAN_PSR(dev_priv))
@@ -2789,7 +2616,7 @@ i915_edp_psr_debug_set(void *data, u64 val)
 
 	DRM_DEBUG_KMS("Setting PSR debug to %llx\n", val);
 
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	drm_modeset_acquire_init(&ctx, DRM_MODESET_ACQUIRE_INTERRUPTIBLE);
 
@@ -2804,7 +2631,7 @@ retry:
 	drm_modeset_drop_locks(&ctx);
 	drm_modeset_acquire_fini(&ctx);
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 
 	return ret;
 }
@@ -2829,24 +2656,20 @@ static int i915_energy_uJ(struct seq_file *m, void *data)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	unsigned long long power;
+	intel_wakeref_t wakeref;
 	u32 units;
 
 	if (INTEL_GEN(dev_priv) < 6)
 		return -ENODEV;
 
-	intel_runtime_pm_get(dev_priv);
-
-	if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &power)) {
-		intel_runtime_pm_put(dev_priv);
+	if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &power))
 		return -ENODEV;
-	}
 
 	units = (power & 0x1f00) >> 8;
-	power = I915_READ(MCH_SECP_NRG_STTS);
-	power = (1000000 * power) >> units; /* convert to uJ */
-
-	intel_runtime_pm_put(dev_priv);
+	with_intel_runtime_pm(dev_priv, wakeref)
+		power = I915_READ(MCH_SECP_NRG_STTS);
 
+	power = (1000000 * power) >> units; /* convert to uJ */
 	seq_printf(m, "%llu", power);
 
 	return 0;
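
For the conversion above: MSR_RAPL_POWER_UNIT bits 12:8 (the 0x1f00 mask) hold an exponent n such that one energy-counter tick is 2^-n joules, so microjoules are (1000000 * raw) >> n. A worked instance, assuming the common n = 14 (one tick is roughly 61 uJ):

	/* units = 14: (1000000 * raw) >> 14 == raw * 1000000 / 16384, about raw * 61 */
	static u64 rapl_raw_to_uJ(u64 raw, u32 units)
	{
		return (1000000 * raw) >> units;
	}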
@@ -2860,6 +2683,9 @@ static int i915_runtime_pm_status(struct seq_file *m, void *unused)
 	if (!HAS_RUNTIME_PM(dev_priv))
 		seq_puts(m, "Runtime power management not supported\n");
 
+	seq_printf(m, "Runtime power status: %s\n",
+		   enableddisabled(!dev_priv->power_domains.wakeref));
+
 	seq_printf(m, "GPU idle: %s (epoch %u)\n",
 		   yesno(!dev_priv->gt.awake), dev_priv->gt.epoch);
 	seq_printf(m, "IRQs disabled: %s\n",
@@ -2874,6 +2700,12 @@ static int i915_runtime_pm_status(struct seq_file *m, void *unused)
 		   pci_power_name(pdev->current_state),
 		   pdev->current_state);
 
+	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_RUNTIME_PM)) {
+		struct drm_printer p = drm_seq_file_printer(m);
+
+		print_intel_runtime_pm_wakeref(dev_priv, &p);
+	}
+
 	return 0;
 }
 
@@ -2908,6 +2740,7 @@ static int i915_power_domain_info(struct seq_file *m, void *unused)
 static int i915_dmc_info(struct seq_file *m, void *unused)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
+	intel_wakeref_t wakeref;
 	struct intel_csr *csr;
 
 	if (!HAS_CSR(dev_priv))
@@ -2915,7 +2748,7 @@ static int i915_dmc_info(struct seq_file *m, void *unused)
 
 	csr = &dev_priv->csr;
 
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	seq_printf(m, "fw loaded: %s\n", yesno(csr->dmc_payload != NULL));
 	seq_printf(m, "path: %s\n", csr->fw_path);
@@ -2941,7 +2774,7 @@ out:
 	seq_printf(m, "ssp base: 0x%08x\n", I915_READ(CSR_SSP_BASE));
 	seq_printf(m, "htp: 0x%08x\n", I915_READ(CSR_HTP_SKL));
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 
 	return 0;
 }
@@ -2954,14 +2787,7 @@ static void intel_seq_print_mode(struct seq_file *m, int tabs,
 	for (i = 0; i < tabs; i++)
 		seq_putc(m, '\t');
 
-	seq_printf(m, "id %d:\"%s\" freq %d clock %d hdisp %d hss %d hse %d htot %d vdisp %d vss %d vse %d vtot %d type 0x%x flags 0x%x\n",
-		   mode->base.id, mode->name,
-		   mode->vrefresh, mode->clock,
-		   mode->hdisplay, mode->hsync_start,
-		   mode->hsync_end, mode->htotal,
-		   mode->vdisplay, mode->vsync_start,
-		   mode->vsync_end, mode->vtotal,
-		   mode->type, mode->flags);
+	seq_printf(m, DRM_MODE_FMT "\n", DRM_MODE_ARG(mode));
 }
 
 static void intel_encoder_info(struct seq_file *m,
@@ -3133,14 +2959,13 @@ static const char *plane_type(enum drm_plane_type type)
 	return "unknown";
 }
 
-static const char *plane_rotation(unsigned int rotation)
+static void plane_rotation(char *buf, size_t bufsize, unsigned int rotation)
 {
-	static char buf[48];
 	/*
 	 * According to the docs only one DRM_MODE_ROTATE_ value is allowed,
 	 * but print them all so that misused values are visible
 	 */
-	snprintf(buf, sizeof(buf),
+	snprintf(buf, bufsize,
 		 "%s%s%s%s%s%s(0x%08x)",
 		 (rotation & DRM_MODE_ROTATE_0) ? "0 " : "",
 		 (rotation & DRM_MODE_ROTATE_90) ? "90 " : "",
@@ -3149,8 +2974,6 @@ static const char *plane_rotation(unsigned int rotation)
 		 (rotation & DRM_MODE_REFLECT_X) ? "FLIPX " : "",
 		 (rotation & DRM_MODE_REFLECT_Y) ? "FLIPY " : "",
 		 rotation);
-
-	return buf;
 }
 
 static void intel_plane_info(struct seq_file *m, struct intel_crtc *intel_crtc)
@@ -3163,6 +2986,7 @@ static void intel_plane_info(struct seq_file *m, struct intel_crtc *intel_crtc)
 		struct drm_plane_state *state;
 		struct drm_plane *plane = &intel_plane->base;
 		struct drm_format_name_buf format_name;
+		char rot_str[48];
 
 		if (!plane->state) {
 			seq_puts(m, "plane->state is NULL!\n");
@@ -3178,6 +3002,8 @@ static void intel_plane_info(struct seq_file *m, struct intel_crtc *intel_crtc)
 			sprintf(format_name.str, "N/A");
 		}
 
+		plane_rotation(rot_str, sizeof(rot_str), state->rotation);
+
 		seq_printf(m, "\t--Plane id %d: type=%s, crtc_pos=%4dx%4d, crtc_size=%4dx%4d, src_pos=%d.%04ux%d.%04u, src_size=%d.%04ux%d.%04u, format=%s, rotation=%s\n",
 			   plane->base.id,
 			   plane_type(intel_plane->base.type),
@@ -3192,7 +3018,7 @@ static void intel_plane_info(struct seq_file *m, struct intel_crtc *intel_crtc)
 			   (state->src_h >> 16),
 			   ((state->src_h & 0xffff) * 15625) >> 10,
 			   format_name.str,
-			   plane_rotation(state->rotation));
+			   rot_str);
 	}
 }
 
@@ -3231,8 +3057,10 @@ static int i915_display_info(struct seq_file *m, void *unused)
 	struct intel_crtc *crtc;
 	struct drm_connector *connector;
 	struct drm_connector_list_iter conn_iter;
+	intel_wakeref_t wakeref;
+
+	wakeref = intel_runtime_pm_get(dev_priv);
 
-	intel_runtime_pm_get(dev_priv);
 	seq_printf(m, "CRTC info\n");
 	seq_printf(m, "---------\n");
 	for_each_intel_crtc(dev, crtc) {
@@ -3280,7 +3108,7 @@ static int i915_display_info(struct seq_file *m, void *unused)
 	drm_connector_list_iter_end(&conn_iter);
 	mutex_unlock(&dev->mode_config.mutex);
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 
 	return 0;
 }
@@ -3289,23 +3117,24 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	struct intel_engine_cs *engine;
+	intel_wakeref_t wakeref;
 	enum intel_engine_id id;
 	struct drm_printer p;
 
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	seq_printf(m, "GT awake? %s (epoch %u)\n",
 		   yesno(dev_priv->gt.awake), dev_priv->gt.epoch);
 	seq_printf(m, "Global active requests: %d\n",
 		   dev_priv->gt.active_requests);
 	seq_printf(m, "CS timestamp frequency: %u kHz\n",
-		   dev_priv->info.cs_timestamp_frequency_khz);
+		   RUNTIME_INFO(dev_priv)->cs_timestamp_frequency_khz);
 
 	p = drm_seq_file_printer(m);
 	for_each_engine(engine, dev_priv, id)
 		intel_engine_dump(engine, &p, "%s\n", engine->name);
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 
 	return 0;
 }
@@ -3315,7 +3144,7 @@ static int i915_rcs_topology(struct seq_file *m, void *unused)
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	struct drm_printer p = drm_seq_file_printer(m);
 
-	intel_device_info_dump_topology(&INTEL_INFO(dev_priv)->sseu, &p);
+	intel_device_info_dump_topology(&RUNTIME_INFO(dev_priv)->sseu, &p);
 
 	return 0;
 }
@@ -3418,20 +3247,21 @@ static ssize_t i915_ipc_status_write(struct file *file, const char __user *ubuf,
 {
 	struct seq_file *m = file->private_data;
 	struct drm_i915_private *dev_priv = m->private;
-	int ret;
+	intel_wakeref_t wakeref;
 	bool enable;
+	int ret;
 
 	ret = kstrtobool_from_user(ubuf, len, &enable);
 	if (ret < 0)
 		return ret;
 
-	intel_runtime_pm_get(dev_priv);
-	if (!dev_priv->ipc_enabled && enable)
-		DRM_INFO("Enabling IPC: WM will be proper only after next commit\n");
-	dev_priv->wm.distrust_bios_wm = true;
-	dev_priv->ipc_enabled = enable;
-	intel_enable_ipc(dev_priv);
-	intel_runtime_pm_put(dev_priv);
+	with_intel_runtime_pm(dev_priv, wakeref) {
+		if (!dev_priv->ipc_enabled && enable)
+			DRM_INFO("Enabling IPC: WM will be proper only after next commit\n");
+		dev_priv->wm.distrust_bios_wm = true;
+		dev_priv->ipc_enabled = enable;
+		intel_enable_ipc(dev_priv);
+	}
 
 	return len;
 }
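
The IPC toggle above is a compact template for a boolean debugfs write hook: parse with kstrtobool_from_user(), apply the change inside a wakeref scope, and report the whole buffer as consumed. Stripped to its shape (name and action are placeholders):

	static ssize_t example_bool_write(struct file *file, const char __user *ubuf,
					  size_t len, loff_t *offp)
	{
		bool enable;
		int ret;

		ret = kstrtobool_from_user(ubuf, len, &enable);
		if (ret < 0)
			return ret;

		/* ... apply 'enable' under whatever locking the state needs ... */

		return len;
	}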
@@ -3799,7 +3629,7 @@ static int i915_displayport_test_type_show(struct seq_file *m, void *data)
 }
 DEFINE_SHOW_ATTRIBUTE(i915_displayport_test_type);
 
-static void wm_latency_show(struct seq_file *m, const uint16_t wm[8])
+static void wm_latency_show(struct seq_file *m, const u16 wm[8])
 {
 	struct drm_i915_private *dev_priv = m->private;
 	struct drm_device *dev = &dev_priv->drm;
@@ -3842,7 +3672,7 @@ static void wm_latency_show(struct seq_file *m, const uint16_t wm[8])
 static int pri_wm_latency_show(struct seq_file *m, void *data)
 {
 	struct drm_i915_private *dev_priv = m->private;
-	const uint16_t *latencies;
+	const u16 *latencies;
 
 	if (INTEL_GEN(dev_priv) >= 9)
 		latencies = dev_priv->wm.skl_latency;
@@ -3857,7 +3687,7 @@ static int pri_wm_latency_show(struct seq_file *m, void *data)
 static int spr_wm_latency_show(struct seq_file *m, void *data)
 {
 	struct drm_i915_private *dev_priv = m->private;
-	const uint16_t *latencies;
+	const u16 *latencies;
 
 	if (INTEL_GEN(dev_priv) >= 9)
 		latencies = dev_priv->wm.skl_latency;
@@ -3872,7 +3702,7 @@ static int spr_wm_latency_show(struct seq_file *m, void *data)
 static int cur_wm_latency_show(struct seq_file *m, void *data)
 {
 	struct drm_i915_private *dev_priv = m->private;
-	const uint16_t *latencies;
+	const u16 *latencies;
 
 	if (INTEL_GEN(dev_priv) >= 9)
 		latencies = dev_priv->wm.skl_latency;
@@ -3898,7 +3728,7 @@ static int spr_wm_latency_open(struct inode *inode, struct file *file)
 {
 	struct drm_i915_private *dev_priv = inode->i_private;
 
-	if (HAS_GMCH_DISPLAY(dev_priv))
+	if (HAS_GMCH(dev_priv))
 		return -ENODEV;
 
 	return single_open(file, spr_wm_latency_show, dev_priv);
@@ -3908,19 +3738,19 @@ static int cur_wm_latency_open(struct inode *inode, struct file *file)
 {
 	struct drm_i915_private *dev_priv = inode->i_private;
 
-	if (HAS_GMCH_DISPLAY(dev_priv))
+	if (HAS_GMCH(dev_priv))
 		return -ENODEV;
 
 	return single_open(file, cur_wm_latency_show, dev_priv);
 }
 
 static ssize_t wm_latency_write(struct file *file, const char __user *ubuf,
-				size_t len, loff_t *offp, uint16_t wm[8])
+				size_t len, loff_t *offp, u16 wm[8])
 {
 	struct seq_file *m = file->private_data;
 	struct drm_i915_private *dev_priv = m->private;
 	struct drm_device *dev = &dev_priv->drm;
-	uint16_t new[8] = { 0 };
+	u16 new[8] = { 0 };
 	int num_levels;
 	int level;
 	int ret;
@@ -3965,7 +3795,7 @@ static ssize_t pri_wm_latency_write(struct file *file, const char __user *ubuf,
 {
 	struct seq_file *m = file->private_data;
 	struct drm_i915_private *dev_priv = m->private;
-	uint16_t *latencies;
+	u16 *latencies;
 
 	if (INTEL_GEN(dev_priv) >= 9)
 		latencies = dev_priv->wm.skl_latency;
@@ -3980,7 +3810,7 @@ static ssize_t spr_wm_latency_write(struct file *file, const char __user *ubuf,
 {
 	struct seq_file *m = file->private_data;
 	struct drm_i915_private *dev_priv = m->private;
-	uint16_t *latencies;
+	u16 *latencies;
 
 	if (INTEL_GEN(dev_priv) >= 9)
 		latencies = dev_priv->wm.skl_latency;
@@ -3995,7 +3825,7 @@ static ssize_t cur_wm_latency_write(struct file *file, const char __user *ubuf,
 {
 	struct seq_file *m = file->private_data;
 	struct drm_i915_private *dev_priv = m->private;
-	uint16_t *latencies;
+	u16 *latencies;
 
 	if (INTEL_GEN(dev_priv) >= 9)
 		latencies = dev_priv->wm.skl_latency;
@@ -4046,8 +3876,6 @@ static int
 i915_wedged_set(void *data, u64 val)
 {
 	struct drm_i915_private *i915 = data;
-	struct intel_engine_cs *engine;
-	unsigned int tmp;
 
 	/*
 	 * There is no safeguard against this debugfs entry colliding
@@ -4060,18 +3888,8 @@ i915_wedged_set(void *data, u64 val)
 	if (i915_reset_backoff(&i915->gpu_error))
 		return -EAGAIN;
 
-	for_each_engine_masked(engine, i915, val, tmp) {
-		engine->hangcheck.seqno = intel_engine_get_seqno(engine);
-		engine->hangcheck.stalled = true;
-	}
-
 	i915_handle_error(i915, val, I915_ERROR_CAPTURE,
 			  "Manually set wedged engine mask = %llx", val);
-
-	wait_on_bit(&i915->gpu_error.flags,
-		    I915_RESET_HANDOFF,
-		    TASK_UNINTERRUPTIBLE);
-
 	return 0;
 }
 
@@ -4079,94 +3897,6 @@ DEFINE_SIMPLE_ATTRIBUTE(i915_wedged_fops,
 			i915_wedged_get, i915_wedged_set,
 			"%llu\n");
 
-static int
-fault_irq_set(struct drm_i915_private *i915,
-	      unsigned long *irq,
-	      unsigned long val)
-{
-	int err;
-
-	err = mutex_lock_interruptible(&i915->drm.struct_mutex);
-	if (err)
-		return err;
-
-	err = i915_gem_wait_for_idle(i915,
-				     I915_WAIT_LOCKED |
-				     I915_WAIT_INTERRUPTIBLE,
-				     MAX_SCHEDULE_TIMEOUT);
-	if (err)
-		goto err_unlock;
-
-	*irq = val;
-	mutex_unlock(&i915->drm.struct_mutex);
-
-	/* Flush idle worker to disarm irq */
-	drain_delayed_work(&i915->gt.idle_work);
-
-	return 0;
-
-err_unlock:
-	mutex_unlock(&i915->drm.struct_mutex);
-	return err;
-}
-
-static int
-i915_ring_missed_irq_get(void *data, u64 *val)
-{
-	struct drm_i915_private *dev_priv = data;
-
-	*val = dev_priv->gpu_error.missed_irq_rings;
-	return 0;
-}
-
-static int
-i915_ring_missed_irq_set(void *data, u64 val)
-{
-	struct drm_i915_private *i915 = data;
-
-	return fault_irq_set(i915, &i915->gpu_error.missed_irq_rings, val);
-}
-
-DEFINE_SIMPLE_ATTRIBUTE(i915_ring_missed_irq_fops,
-			i915_ring_missed_irq_get, i915_ring_missed_irq_set,
-			"0x%08llx\n");
-
-static int
-i915_ring_test_irq_get(void *data, u64 *val)
-{
-	struct drm_i915_private *dev_priv = data;
-
-	*val = dev_priv->gpu_error.test_irq_rings;
-
-	return 0;
-}
-
-static int
-i915_ring_test_irq_set(void *data, u64 val)
-{
-	struct drm_i915_private *i915 = data;
-
-	/* GuC keeps the user interrupt permanently enabled for submission */
-	if (USES_GUC_SUBMISSION(i915))
-		return -ENODEV;
-
-	/*
-	 * From icl, we can no longer individually mask interrupt generation
-	 * from each engine.
-	 */
-	if (INTEL_GEN(i915) >= 11)
-		return -ENODEV;
-
-	val &= INTEL_INFO(i915)->ring_mask;
-	DRM_DEBUG_DRIVER("Masking interrupts on rings 0x%08llx\n", val);
-
-	return fault_irq_set(i915, &i915->gpu_error.test_irq_rings, val);
-}
-
-DEFINE_SIMPLE_ATTRIBUTE(i915_ring_test_irq_fops,
-			i915_ring_test_irq_get, i915_ring_test_irq_set,
-			"0x%08llx\n");
-
 #define DROP_UNBOUND	BIT(0)
 #define DROP_BOUND	BIT(1)
 #define DROP_RETIRE	BIT(2)
@@ -4197,13 +3927,15 @@ static int
 i915_drop_caches_set(void *data, u64 val)
 {
 	struct drm_i915_private *i915 = data;
+	intel_wakeref_t wakeref;
 	int ret = 0;
 
 	DRM_DEBUG("Dropping caches: 0x%08llx [0x%08llx]\n",
 		  val, val & DROP_ALL);
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
-	if (val & DROP_RESET_ACTIVE && !intel_engines_are_idle(i915))
+	if (val & DROP_RESET_ACTIVE &&
+	    wait_for(intel_engines_are_idle(i915), I915_IDLE_ENGINES_TIMEOUT))
 		i915_gem_set_wedged(i915);
 
 	/* No need to check and wait for gpu resets, only libdrm auto-restarts
@@ -4219,22 +3951,14 @@ i915_drop_caches_set(void *data, u64 val)
 						     I915_WAIT_LOCKED,
 						     MAX_SCHEDULE_TIMEOUT);
 
-		if (ret == 0 && val & DROP_RESET_SEQNO)
-			ret = i915_gem_set_global_seqno(&i915->drm, 1);
-
 		if (val & DROP_RETIRE)
 			i915_retire_requests(i915);
 
 		mutex_unlock(&i915->drm.struct_mutex);
 	}
 
-	if (val & DROP_RESET_ACTIVE &&
-	    i915_terminally_wedged(&i915->gpu_error)) {
+	if (val & DROP_RESET_ACTIVE && i915_terminally_wedged(&i915->gpu_error))
 		i915_handle_error(i915, ALL_ENGINES, 0, NULL);
-		wait_on_bit(&i915->gpu_error.flags,
-			    I915_RESET_HANDOFF,
-			    TASK_UNINTERRUPTIBLE);
-	}
 
 	fs_reclaim_acquire(GFP_KERNEL);
 	if (val & DROP_BOUND)
@@ -4259,7 +3983,7 @@ i915_drop_caches_set(void *data, u64 val)
 		i915_gem_drain_freed_objects(i915);
 
 out:
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 
 	return ret;
 }
@@ -4272,16 +3996,14 @@ static int
 i915_cache_sharing_get(void *data, u64 *val)
 {
 	struct drm_i915_private *dev_priv = data;
-	u32 snpcr;
+	intel_wakeref_t wakeref;
+	u32 snpcr = 0;
 
-	if (!(IS_GEN6(dev_priv) || IS_GEN7(dev_priv)))
+	if (!(IS_GEN_RANGE(dev_priv, 6, 7)))
 		return -ENODEV;
 
-	intel_runtime_pm_get(dev_priv);
-
-	snpcr = I915_READ(GEN6_MBCUNIT_SNPCR);
-
-	intel_runtime_pm_put(dev_priv);
+	with_intel_runtime_pm(dev_priv, wakeref)
+		snpcr = I915_READ(GEN6_MBCUNIT_SNPCR);
 
 	*val = (snpcr & GEN6_MBC_SNPCR_MASK) >> GEN6_MBC_SNPCR_SHIFT;
 
@@ -4292,24 +4014,25 @@ static int
 i915_cache_sharing_set(void *data, u64 val)
 {
 	struct drm_i915_private *dev_priv = data;
-	u32 snpcr;
+	intel_wakeref_t wakeref;
 
-	if (!(IS_GEN6(dev_priv) || IS_GEN7(dev_priv)))
+	if (!(IS_GEN_RANGE(dev_priv, 6, 7)))
 		return -ENODEV;
 
 	if (val > 3)
 		return -EINVAL;
 
-	intel_runtime_pm_get(dev_priv);
 	DRM_DEBUG_DRIVER("Manually setting uncore sharing to %llu\n", val);
+	with_intel_runtime_pm(dev_priv, wakeref) {
+		u32 snpcr;
+
+		/* Update the cache sharing policy here as well */
+		snpcr = I915_READ(GEN6_MBCUNIT_SNPCR);
+		snpcr &= ~GEN6_MBC_SNPCR_MASK;
+		snpcr |= val << GEN6_MBC_SNPCR_SHIFT;
+		I915_WRITE(GEN6_MBCUNIT_SNPCR, snpcr);
+	}
 
-	/* Update the cache sharing policy here as well */
-	snpcr = I915_READ(GEN6_MBCUNIT_SNPCR);
-	snpcr &= ~GEN6_MBC_SNPCR_MASK;
-	snpcr |= (val << GEN6_MBC_SNPCR_SHIFT);
-	I915_WRITE(GEN6_MBCUNIT_SNPCR, snpcr);
-
-	intel_runtime_pm_put(dev_priv);
 	return 0;
 }
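
i915_cache_sharing_set() now performs the SNPCR update as a read-modify-write entirely inside the wakeref scope. The masked-field update is the usual idiom; a generic illustrative form (not in the patch):

	/* Replace the bits selected by 'mask' with 'val' shifted into place. */
	static void rmw_masked_field(struct drm_i915_private *i915, i915_reg_t reg,
				     u32 mask, u32 shift, u32 val)
	{
		u32 tmp = I915_READ(reg);

		tmp &= ~mask;
		tmp |= val << shift;
		I915_WRITE(reg, tmp);
	}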
 
@@ -4354,7 +4077,7 @@ static void gen10_sseu_device_status(struct drm_i915_private *dev_priv,
 				     struct sseu_dev_info *sseu)
 {
 #define SS_MAX 6
-	const struct intel_device_info *info = INTEL_INFO(dev_priv);
+	const struct intel_runtime_info *info = RUNTIME_INFO(dev_priv);
 	u32 s_reg[SS_MAX], eu_reg[2 * SS_MAX], eu_mask[2];
 	int s, ss;
 
@@ -4410,7 +4133,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 				    struct sseu_dev_info *sseu)
 {
 #define SS_MAX 3
-	const struct intel_device_info *info = INTEL_INFO(dev_priv);
+	const struct intel_runtime_info *info = RUNTIME_INFO(dev_priv);
 	u32 s_reg[SS_MAX], eu_reg[2 * SS_MAX], eu_mask[2];
 	int s, ss;
 
@@ -4438,7 +4161,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 
 		if (IS_GEN9_BC(dev_priv))
 			sseu->subslice_mask[s] =
-				INTEL_INFO(dev_priv)->sseu.subslice_mask[s];
+				RUNTIME_INFO(dev_priv)->sseu.subslice_mask[s];
 
 		for (ss = 0; ss < info->sseu.max_subslices; ss++) {
 			unsigned int eu_cnt;
@@ -4472,10 +4195,10 @@ static void broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
 
 	if (sseu->slice_mask) {
 		sseu->eu_per_subslice =
-				INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
+			RUNTIME_INFO(dev_priv)->sseu.eu_per_subslice;
 		for (s = 0; s < fls(sseu->slice_mask); s++) {
 			sseu->subslice_mask[s] =
-				INTEL_INFO(dev_priv)->sseu.subslice_mask[s];
+				RUNTIME_INFO(dev_priv)->sseu.subslice_mask[s];
 		}
 		sseu->eu_total = sseu->eu_per_subslice *
 				 sseu_subslice_total(sseu);
@@ -4483,7 +4206,7 @@ static void broadwell_sseu_device_status(struct drm_i915_private *dev_priv,
 		/* subtract fused off EU(s) from enabled slice(s) */
 		for (s = 0; s < fls(sseu->slice_mask); s++) {
 			u8 subslice_7eu =
-				INTEL_INFO(dev_priv)->sseu.subslice_7eu[s];
+				RUNTIME_INFO(dev_priv)->sseu.subslice_7eu[s];
 
 			sseu->eu_total -= hweight8(subslice_7eu);
 		}
@@ -4531,34 +4254,32 @@ static int i915_sseu_status(struct seq_file *m, void *unused)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
 	struct sseu_dev_info sseu;
+	intel_wakeref_t wakeref;
 
 	if (INTEL_GEN(dev_priv) < 8)
 		return -ENODEV;
 
 	seq_puts(m, "SSEU Device Info\n");
-	i915_print_sseu_info(m, true, &INTEL_INFO(dev_priv)->sseu);
+	i915_print_sseu_info(m, true, &RUNTIME_INFO(dev_priv)->sseu);
 
 	seq_puts(m, "SSEU Device Status\n");
 	memset(&sseu, 0, sizeof(sseu));
-	sseu.max_slices = INTEL_INFO(dev_priv)->sseu.max_slices;
-	sseu.max_subslices = INTEL_INFO(dev_priv)->sseu.max_subslices;
+	sseu.max_slices = RUNTIME_INFO(dev_priv)->sseu.max_slices;
+	sseu.max_subslices = RUNTIME_INFO(dev_priv)->sseu.max_subslices;
 	sseu.max_eus_per_subslice =
-		INTEL_INFO(dev_priv)->sseu.max_eus_per_subslice;
-
-	intel_runtime_pm_get(dev_priv);
-
-	if (IS_CHERRYVIEW(dev_priv)) {
-		cherryview_sseu_device_status(dev_priv, &sseu);
-	} else if (IS_BROADWELL(dev_priv)) {
-		broadwell_sseu_device_status(dev_priv, &sseu);
-	} else if (IS_GEN9(dev_priv)) {
-		gen9_sseu_device_status(dev_priv, &sseu);
-	} else if (INTEL_GEN(dev_priv) >= 10) {
-		gen10_sseu_device_status(dev_priv, &sseu);
+		RUNTIME_INFO(dev_priv)->sseu.max_eus_per_subslice;
+
+	with_intel_runtime_pm(dev_priv, wakeref) {
+		if (IS_CHERRYVIEW(dev_priv))
+			cherryview_sseu_device_status(dev_priv, &sseu);
+		else if (IS_BROADWELL(dev_priv))
+			broadwell_sseu_device_status(dev_priv, &sseu);
+		else if (IS_GEN(dev_priv, 9))
+			gen9_sseu_device_status(dev_priv, &sseu);
+		else if (INTEL_GEN(dev_priv) >= 10)
+			gen10_sseu_device_status(dev_priv, &sseu);
 	}
 
-	intel_runtime_pm_put(dev_priv);
-
 	i915_print_sseu_info(m, false, &sseu);
 
 	return 0;
@@ -4571,7 +4292,7 @@ static int i915_forcewake_open(struct inode *inode, struct file *file)
 	if (INTEL_GEN(i915) < 6)
 		return 0;
 
-	intel_runtime_pm_get(i915);
+	file->private_data = (void *)(uintptr_t)intel_runtime_pm_get(i915);
 	intel_uncore_forcewake_user_get(i915);
 
 	return 0;
@@ -4585,7 +4306,8 @@ static int i915_forcewake_release(struct inode *inode, struct file *file)
 		return 0;
 
 	intel_uncore_forcewake_user_put(i915);
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915,
+			     (intel_wakeref_t)(uintptr_t)file->private_data);
 
 	return 0;
 }
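
The forcewake file takes its wakeref in open() and must release the very same cookie in release(), so the patch round-trips it through file->private_data with uintptr_t casts; this works because intel_wakeref_t is an integer handle small enough to fit in a pointer. The bare pattern, with hypothetical open/release names:

	static int example_open(struct inode *inode, struct file *file)
	{
		struct drm_i915_private *i915 = inode->i_private;

		/* Stash the integer cookie in the pointer-sized private_data slot */
		file->private_data = (void *)(uintptr_t)intel_runtime_pm_get(i915);
		return 0;
	}

	static int example_release(struct inode *inode, struct file *file)
	{
		struct drm_i915_private *i915 = inode->i_private;

		intel_runtime_pm_put(i915,
				     (intel_wakeref_t)(uintptr_t)file->private_data);
		return 0;
	}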
@@ -4912,7 +4634,6 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_context_status", i915_context_status, 0},
 	{"i915_forcewake_domains", i915_forcewake_domains, 0},
 	{"i915_swizzle_info", i915_swizzle_info, 0},
-	{"i915_ppgtt_info", i915_ppgtt_info, 0},
 	{"i915_llc", i915_llc, 0},
 	{"i915_edp_psr_status", i915_edp_psr_status, 0},
 	{"i915_energy_uJ", i915_energy_uJ, 0},
@@ -4939,15 +4660,12 @@ static const struct i915_debugfs_files {
 } i915_debugfs_files[] = {
 	{"i915_wedged", &i915_wedged_fops},
 	{"i915_cache_sharing", &i915_cache_sharing_fops},
-	{"i915_ring_missed_irq", &i915_ring_missed_irq_fops},
-	{"i915_ring_test_irq", &i915_ring_test_irq_fops},
 	{"i915_gem_drop_caches", &i915_drop_caches_fops},
 #if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
 	{"i915_error_state", &i915_error_state_fops},
 	{"i915_gpu_info", &i915_gpu_info_fops},
 #endif
 	{"i915_fifo_underrun_reset", &i915_fifo_underrun_reset_ops},
-	{"i915_next_seqno", &i915_next_seqno_fops},
 	{"i915_pri_wm_latency", &i915_pri_wm_latency_fops},
 	{"i915_spr_wm_latency", &i915_spr_wm_latency_fops},
 	{"i915_cur_wm_latency", &i915_cur_wm_latency_fops},
@@ -5020,7 +4738,7 @@ static int i915_dpcd_show(struct seq_file *m, void *data)
 	struct drm_connector *connector = m->private;
 	struct intel_dp *intel_dp =
 		enc_to_intel_dp(&intel_attached_encoder(connector)->base);
-	uint8_t buf[16];
+	u8 buf[16];
 	ssize_t err;
 	int i;
 
@@ -5094,6 +4812,105 @@ static int i915_hdcp_sink_capability_show(struct seq_file *m, void *data)
 }
 DEFINE_SHOW_ATTRIBUTE(i915_hdcp_sink_capability);
 
+static int i915_dsc_fec_support_show(struct seq_file *m, void *data)
+{
+	struct drm_connector *connector = m->private;
+	struct drm_device *dev = connector->dev;
+	struct drm_crtc *crtc;
+	struct intel_dp *intel_dp;
+	struct drm_modeset_acquire_ctx ctx;
+	struct intel_crtc_state *crtc_state = NULL;
+	int ret = 0;
+	bool try_again = false;
+
+	drm_modeset_acquire_init(&ctx, DRM_MODESET_ACQUIRE_INTERRUPTIBLE);
+
+	do {
+		try_again = false;
+		ret = drm_modeset_lock(&dev->mode_config.connection_mutex,
+				       &ctx);
+		if (ret) {
+			ret = -EINTR;
+			break;
+		}
+		crtc = connector->state->crtc;
+		if (connector->status != connector_status_connected || !crtc) {
+			ret = -ENODEV;
+			break;
+		}
+		ret = drm_modeset_lock(&crtc->mutex, &ctx);
+		if (ret == -EDEADLK) {
+			ret = drm_modeset_backoff(&ctx);
+			if (!ret) {
+				try_again = true;
+				continue;
+			}
+			break;
+		} else if (ret) {
+			break;
+		}
+		intel_dp = enc_to_intel_dp(&intel_attached_encoder(connector)->base);
+		crtc_state = to_intel_crtc_state(crtc->state);
+		seq_printf(m, "DSC_Enabled: %s\n",
+			   yesno(crtc_state->dsc_params.compression_enable));
+		seq_printf(m, "DSC_Sink_Support: %s\n",
+			   yesno(drm_dp_sink_supports_dsc(intel_dp->dsc_dpcd)));
+		if (!intel_dp_is_edp(intel_dp))
+			seq_printf(m, "FEC_Sink_Support: %s\n",
+				   yesno(drm_dp_sink_supports_fec(intel_dp->fec_capable)));
+	} while (try_again);
+
+	drm_modeset_drop_locks(&ctx);
+	drm_modeset_acquire_fini(&ctx);
+
+	return ret;
+}
+
+static ssize_t i915_dsc_fec_support_write(struct file *file,
+					  const char __user *ubuf,
+					  size_t len, loff_t *offp)
+{
+	bool dsc_enable = false;
+	int ret;
+	struct drm_connector *connector =
+		((struct seq_file *)file->private_data)->private;
+	struct intel_encoder *encoder = intel_attached_encoder(connector);
+	struct intel_dp *intel_dp = enc_to_intel_dp(&encoder->base);
+
+	if (len == 0)
+		return 0;
+
+	DRM_DEBUG_DRIVER("Copied %zu bytes from user to force DSC\n",
+			 len);
+
+	ret = kstrtobool_from_user(ubuf, len, &dsc_enable);
+	if (ret < 0)
+		return ret;
+
+	DRM_DEBUG_DRIVER("Got %s for DSC Enable\n",
+			 (dsc_enable) ? "true" : "false");
+	intel_dp->force_dsc_en = dsc_enable;
+
+	*offp += len;
+	return len;
+}
+
+static int i915_dsc_fec_support_open(struct inode *inode,
+				     struct file *file)
+{
+	return single_open(file, i915_dsc_fec_support_show,
+			   inode->i_private);
+}
+
+static const struct file_operations i915_dsc_fec_support_fops = {
+	.owner = THIS_MODULE,
+	.open = i915_dsc_fec_support_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = single_release,
+	.write = i915_dsc_fec_support_write
+};
+
 /**
  * i915_debugfs_connector_add - add i915 specific connector debugfs files
  * @connector: pointer to a registered drm_connector
@@ -5106,6 +4923,7 @@ DEFINE_SHOW_ATTRIBUTE(i915_hdcp_sink_capability);
 int i915_debugfs_connector_add(struct drm_connector *connector)
 {
 	struct dentry *root = connector->debugfs_entry;
+	struct drm_i915_private *dev_priv = to_i915(connector->dev);
 
 	/* The connector must have been registered beforehand. */
 	if (!root)
@@ -5130,5 +4948,11 @@ int i915_debugfs_connector_add(struct drm_connector *connector)
 				    connector, &i915_hdcp_sink_capability_fops);
 	}
 
+	if (INTEL_GEN(dev_priv) >= 10 &&
+	    (connector->connector_type == DRM_MODE_CONNECTOR_DisplayPort ||
+	     connector->connector_type == DRM_MODE_CONNECTOR_eDP))
+		debugfs_create_file("i915_dsc_fec_support", S_IRUGO, root,
+				    connector, &i915_dsc_fec_support_fops);
+
 	return 0;
 }
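
The new per-connector node is straightforward to drive: reading it reports DSC_Enabled, DSC_Sink_Support and, for non-eDP DP, FEC_Sink_Support, all taken under the proper modeset locks; writing any kstrtobool-accepted value sets intel_dp->force_dsc_en. The show() path has to hold both connection_mutex and the CRTC lock, hence the drm_modeset backoff loop: on -EDEADLK, release the contended locks with drm_modeset_backoff() and retry the whole sequence. That retry core, reduced to a sketch (helper name is made up):

	static int lock_crtc_with_backoff(struct drm_crtc *crtc,
					  struct drm_modeset_acquire_ctx *ctx)
	{
		int ret;

		while ((ret = drm_modeset_lock(&crtc->mutex, ctx)) == -EDEADLK) {
			ret = drm_modeset_backoff(ctx);
			if (ret)
				break; /* interrupted while backing off */
		}

		return ret;
	}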
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index b310a897a4ad..6630212f2faf 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -41,14 +41,16 @@
 #include <linux/vt.h>
 #include <acpi/video.h>
 
-#include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_ioctl.h>
+#include <drm/drm_irq.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/i915_drm.h>
 
 #include "i915_drv.h"
 #include "i915_trace.h"
 #include "i915_pmu.h"
+#include "i915_reset.h"
 #include "i915_query.h"
 #include "i915_vgpu.h"
 #include "intel_drv.h"
@@ -132,15 +134,15 @@ intel_pch_type(const struct drm_i915_private *dev_priv, unsigned short id)
 	switch (id) {
 	case INTEL_PCH_IBX_DEVICE_ID_TYPE:
 		DRM_DEBUG_KMS("Found Ibex Peak PCH\n");
-		WARN_ON(!IS_GEN5(dev_priv));
+		WARN_ON(!IS_GEN(dev_priv, 5));
 		return PCH_IBX;
 	case INTEL_PCH_CPT_DEVICE_ID_TYPE:
 		DRM_DEBUG_KMS("Found CougarPoint PCH\n");
-		WARN_ON(!IS_GEN6(dev_priv) && !IS_IVYBRIDGE(dev_priv));
+		WARN_ON(!IS_GEN(dev_priv, 6) && !IS_IVYBRIDGE(dev_priv));
 		return PCH_CPT;
 	case INTEL_PCH_PPT_DEVICE_ID_TYPE:
 		DRM_DEBUG_KMS("Found PantherPoint PCH\n");
-		WARN_ON(!IS_GEN6(dev_priv) && !IS_IVYBRIDGE(dev_priv));
+		WARN_ON(!IS_GEN(dev_priv, 6) && !IS_IVYBRIDGE(dev_priv));
 		/* PantherPoint is CPT compatible */
 		return PCH_CPT;
 	case INTEL_PCH_LPT_DEVICE_ID_TYPE:
@@ -217,9 +219,9 @@ intel_virt_detect_pch(const struct drm_i915_private *dev_priv)
 	 * make an educated guess as to which PCH is really there.
 	 */
 
-	if (IS_GEN5(dev_priv))
+	if (IS_GEN(dev_priv, 5))
 		id = INTEL_PCH_IBX_DEVICE_ID_TYPE;
-	else if (IS_GEN6(dev_priv) || IS_IVYBRIDGE(dev_priv))
+	else if (IS_GEN(dev_priv, 6) || IS_IVYBRIDGE(dev_priv))
 		id = INTEL_PCH_CPT_DEVICE_ID_TYPE;
 	else if (IS_HSW_ULT(dev_priv) || IS_BDW_ULT(dev_priv))
 		id = INTEL_PCH_LPT_LP_DEVICE_ID_TYPE;
@@ -349,7 +351,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 		value = min_t(int, INTEL_PPGTT(dev_priv), I915_GEM_PPGTT_FULL);
 		break;
 	case I915_PARAM_HAS_SEMAPHORES:
-		value = HAS_LEGACY_SEMAPHORES(dev_priv);
+		value = 0;
 		break;
 	case I915_PARAM_HAS_SECURE_BATCHES:
 		value = capable(CAP_SYS_ADMIN);
@@ -358,12 +360,12 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 		value = i915_cmd_parser_get_version(dev_priv);
 		break;
 	case I915_PARAM_SUBSLICE_TOTAL:
-		value = sseu_subslice_total(&INTEL_INFO(dev_priv)->sseu);
+		value = sseu_subslice_total(&RUNTIME_INFO(dev_priv)->sseu);
 		if (!value)
 			return -ENODEV;
 		break;
 	case I915_PARAM_EU_TOTAL:
-		value = INTEL_INFO(dev_priv)->sseu.eu_total;
+		value = RUNTIME_INFO(dev_priv)->sseu.eu_total;
 		if (!value)
 			return -ENODEV;
 		break;
@@ -380,7 +382,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 		value = HAS_POOLED_EU(dev_priv);
 		break;
 	case I915_PARAM_MIN_EU_IN_POOL:
-		value = INTEL_INFO(dev_priv)->sseu.min_eu_in_pool;
+		value = RUNTIME_INFO(dev_priv)->sseu.min_eu_in_pool;
 		break;
 	case I915_PARAM_HUC_STATUS:
 		value = intel_huc_check_status(&dev_priv->huc);
@@ -430,17 +432,17 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 		value = intel_engines_has_context_isolation(dev_priv);
 		break;
 	case I915_PARAM_SLICE_MASK:
-		value = INTEL_INFO(dev_priv)->sseu.slice_mask;
+		value = RUNTIME_INFO(dev_priv)->sseu.slice_mask;
 		if (!value)
 			return -ENODEV;
 		break;
 	case I915_PARAM_SUBSLICE_MASK:
-		value = INTEL_INFO(dev_priv)->sseu.subslice_mask[0];
+		value = RUNTIME_INFO(dev_priv)->sseu.subslice_mask[0];
 		if (!value)
 			return -ENODEV;
 		break;
 	case I915_PARAM_CS_TIMESTAMP_FREQUENCY:
-		value = 1000 * INTEL_INFO(dev_priv)->cs_timestamp_frequency_khz;
+		value = 1000 * RUNTIME_INFO(dev_priv)->cs_timestamp_frequency_khz;
 		break;
 	case I915_PARAM_MMAP_GTT_COHERENT:
 		value = INTEL_INFO(dev_priv)->has_coherent_ggtt;
@@ -906,6 +908,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv)
 	mutex_init(&dev_priv->pps_mutex);
 
 	i915_memcpy_init_early(dev_priv);
+	intel_runtime_pm_init_early(dev_priv);
 
 	ret = i915_workqueues_init(dev_priv);
 	if (ret < 0)
@@ -966,7 +969,7 @@ static int i915_mmio_setup(struct drm_i915_private *dev_priv)
 	int mmio_bar;
 	int mmio_size;
 
-	mmio_bar = IS_GEN2(dev_priv) ? 1 : 0;
+	mmio_bar = IS_GEN(dev_priv, 2) ? 1 : 0;
 	/*
 	 * Before gen4, the registers and the GTT are behind different BARs.
 	 * However, from gen4 onwards, the registers and the GTT are shared
@@ -1341,7 +1344,7 @@ intel_get_dram_info(struct drm_i915_private *dev_priv)
 	/* Need to calculate bandwidth only for Gen9 */
 	if (IS_BROXTON(dev_priv))
 		ret = bxt_get_dram_info(dev_priv);
-	else if (IS_GEN9(dev_priv))
+	else if (IS_GEN(dev_priv, 9))
 		ret = skl_get_dram_info(dev_priv);
 	else
 		ret = skl_dram_get_channels_info(dev_priv);
@@ -1374,7 +1377,7 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
 	if (i915_inject_load_failure())
 		return -ENODEV;
 
-	intel_device_info_runtime_init(mkwrite_device_info(dev_priv));
+	intel_device_info_runtime_init(dev_priv);
 
 	if (HAS_PPGTT(dev_priv)) {
 		if (intel_vgpu_active(dev_priv) &&
@@ -1436,7 +1439,7 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
 	pci_set_master(pdev);
 
 	/* overlay on gen2 is broken and can't address above 1G */
-	if (IS_GEN2(dev_priv)) {
+	if (IS_GEN(dev_priv, 2)) {
 		ret = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(30));
 		if (ret) {
 			DRM_ERROR("failed to set DMA mask\n");
@@ -1574,7 +1577,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
 		acpi_video_register();
 	}
 
-	if (IS_GEN5(dev_priv))
+	if (IS_GEN(dev_priv, 5))
 		intel_gpu_ips_init(dev_priv);
 
 	intel_audio_init(dev_priv);
@@ -1636,8 +1639,14 @@ static void i915_welcome_messages(struct drm_i915_private *dev_priv)
 	if (drm_debug & DRM_UT_DRIVER) {
 		struct drm_printer p = drm_debug_printer("i915 device info:");
 
-		intel_device_info_dump(&dev_priv->info, &p);
-		intel_device_info_dump_runtime(&dev_priv->info, &p);
+		drm_printf(&p, "pciid=0x%04x rev=0x%02x platform=%s gen=%i\n",
+			   INTEL_DEVID(dev_priv),
+			   INTEL_REVID(dev_priv),
+			   intel_platform_name(INTEL_INFO(dev_priv)->platform),
+			   INTEL_GEN(dev_priv));
+
+		intel_device_info_dump_flags(INTEL_INFO(dev_priv), &p);
+		intel_device_info_dump_runtime(RUNTIME_INFO(dev_priv), &p);
 	}
 
 	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG))
@@ -1674,7 +1683,7 @@ i915_driver_create(struct pci_dev *pdev, const struct pci_device_id *ent)
 	/* Setup the write-once "constant" device info */
 	device_info = mkwrite_device_info(i915);
 	memcpy(device_info, match_info, sizeof(*device_info));
-	device_info->device_id = pdev->device;
+	RUNTIME_INFO(i915)->device_id = pdev->device;
 
 	BUILD_BUG_ON(INTEL_MAX_PLATFORMS >
 		     BITS_PER_TYPE(device_info->platform_mask));
@@ -1774,6 +1783,9 @@ void i915_driver_unload(struct drm_device *dev)
 
 	i915_driver_unregister(dev_priv);
 
+	/* Flush any external code that may still be under the RCU lock */
+	synchronize_rcu();
+
 	if (i915_gem_suspend(dev_priv))
 		DRM_ERROR("failed to idle hardware; continuing to unload!\n");
 
@@ -1802,8 +1814,7 @@ void i915_driver_unload(struct drm_device *dev)
 	i915_driver_cleanup_mmio(dev_priv);
 
 	enable_rpm_wakeref_asserts(dev_priv);
-
-	WARN_ON(atomic_read(&dev_priv->runtime_pm.wakeref_count));
+	intel_runtime_pm_cleanup(dev_priv);
 }
 
 static void i915_driver_release(struct drm_device *dev)
@@ -2005,6 +2016,8 @@ static int i915_drm_suspend_late(struct drm_device *dev, bool hibernation)
 
 out:
 	enable_rpm_wakeref_asserts(dev_priv);
+	if (!dev_priv->uncore.user_forcewake.count)
+		intel_runtime_pm_cleanup(dev_priv);
 
 	return ret;
 }
@@ -2174,7 +2187,7 @@ static int i915_drm_resume_early(struct drm_device *dev)
 
 	intel_power_domains_resume(dev_priv);
 
-	intel_engines_sanitize(dev_priv);
+	intel_engines_sanitize(dev_priv, true);
 
 	enable_rpm_wakeref_asserts(dev_priv);
 
@@ -2195,210 +2208,6 @@ static int i915_resume_switcheroo(struct drm_device *dev)
 	return i915_drm_resume(dev);
 }
 
-/**
- * i915_reset - reset chip after a hang
- * @i915: #drm_i915_private to reset
- * @stalled_mask: mask of the stalled engines with the guilty requests
- * @reason: user error message for why we are resetting
- *
- * Reset the chip.  Useful if a hang is detected. Marks the device as wedged
- * on failure.
- *
- * Caller must hold the struct_mutex.
- *
- * Procedure is fairly simple:
- *   - reset the chip using the reset reg
- *   - re-init context state
- *   - re-init hardware status page
- *   - re-init ring buffer
- *   - re-init interrupt state
- *   - re-init display
- */
-void i915_reset(struct drm_i915_private *i915,
-		unsigned int stalled_mask,
-		const char *reason)
-{
-	struct i915_gpu_error *error = &i915->gpu_error;
-	int ret;
-	int i;
-
-	GEM_TRACE("flags=%lx\n", error->flags);
-
-	might_sleep();
-	lockdep_assert_held(&i915->drm.struct_mutex);
-	GEM_BUG_ON(!test_bit(I915_RESET_BACKOFF, &error->flags));
-
-	if (!test_bit(I915_RESET_HANDOFF, &error->flags))
-		return;
-
-	/* Clear any previous failed attempts at recovery. Time to try again. */
-	if (!i915_gem_unset_wedged(i915))
-		goto wakeup;
-
-	if (reason)
-		dev_notice(i915->drm.dev, "Resetting chip for %s\n", reason);
-	error->reset_count++;
-
-	ret = i915_gem_reset_prepare(i915);
-	if (ret) {
-		dev_err(i915->drm.dev, "GPU recovery failed\n");
-		goto taint;
-	}
-
-	if (!intel_has_gpu_reset(i915)) {
-		if (i915_modparams.reset)
-			dev_err(i915->drm.dev, "GPU reset not supported\n");
-		else
-			DRM_DEBUG_DRIVER("GPU reset disabled\n");
-		goto error;
-	}
-
-	for (i = 0; i < 3; i++) {
-		ret = intel_gpu_reset(i915, ALL_ENGINES);
-		if (ret == 0)
-			break;
-
-		msleep(100);
-	}
-	if (ret) {
-		dev_err(i915->drm.dev, "Failed to reset chip\n");
-		goto taint;
-	}
-
-	/* Ok, now get things going again... */
-
-	/*
-	 * Everything depends on having the GTT running, so we need to start
-	 * there.
-	 */
-	ret = i915_ggtt_enable_hw(i915);
-	if (ret) {
-		DRM_ERROR("Failed to re-enable GGTT following reset (%d)\n",
-			  ret);
-		goto error;
-	}
-
-	i915_gem_reset(i915, stalled_mask);
-	intel_overlay_reset(i915);
-
-	/*
-	 * Next we need to restore the context, but we don't use those
-	 * yet either...
-	 *
-	 * Ring buffer needs to be re-initialized in the KMS case, or if X
-	 * was running at the time of the reset (i.e. we weren't VT
-	 * switched away).
-	 */
-	ret = i915_gem_init_hw(i915);
-	if (ret) {
-		DRM_ERROR("Failed to initialise HW following reset (%d)\n",
-			  ret);
-		goto error;
-	}
-
-	i915_queue_hangcheck(i915);
-
-finish:
-	i915_gem_reset_finish(i915);
-wakeup:
-	clear_bit(I915_RESET_HANDOFF, &error->flags);
-	wake_up_bit(&error->flags, I915_RESET_HANDOFF);
-	return;
-
-taint:
-	/*
-	 * History tells us that if we cannot reset the GPU now, we
-	 * never will. This then impacts everything that is run
-	 * subsequently. On failing the reset, we mark the driver
-	 * as wedged, preventing further execution on the GPU.
-	 * We also want to go one step further and add a taint to the
-	 * kernel so that any subsequent faults can be traced back to
-	 * this failure. This is important for CI, where if the
-	 * GPU/driver fails we would like to reboot and restart testing
-	 * rather than continue on into oblivion. For everyone else,
-	 * the system should still plod along, but they have been warned!
-	 */
-	add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
-error:
-	i915_gem_set_wedged(i915);
-	i915_retire_requests(i915);
-	goto finish;
-}
-
-static inline int intel_gt_reset_engine(struct drm_i915_private *dev_priv,
-					struct intel_engine_cs *engine)
-{
-	return intel_gpu_reset(dev_priv, intel_engine_flag(engine));
-}
-
-/**
- * i915_reset_engine - reset GPU engine to recover from a hang
- * @engine: engine to reset
- * @msg: reason for GPU reset; or NULL for no dev_notice()
- *
- * Reset a specific GPU engine. Useful if a hang is detected.
- * Returns zero on successful reset or otherwise an error code.
- *
- * Procedure is:
- *  - identifies the request that caused the hang and it is dropped
- *  - reset engine (which will force the engine to idle)
- *  - re-init/configure engine
- */
-int i915_reset_engine(struct intel_engine_cs *engine, const char *msg)
-{
-	struct i915_gpu_error *error = &engine->i915->gpu_error;
-	struct i915_request *active_request;
-	int ret;
-
-	GEM_TRACE("%s flags=%lx\n", engine->name, error->flags);
-	GEM_BUG_ON(!test_bit(I915_RESET_ENGINE + engine->id, &error->flags));
-
-	active_request = i915_gem_reset_prepare_engine(engine);
-	if (IS_ERR_OR_NULL(active_request)) {
-		/* Either the previous reset failed, or we pardon the reset. */
-		ret = PTR_ERR(active_request);
-		goto out;
-	}
-
-	if (msg)
-		dev_notice(engine->i915->drm.dev,
-			   "Resetting %s for %s\n", engine->name, msg);
-	error->reset_engine_count[engine->id]++;
-
-	if (!engine->i915->guc.execbuf_client)
-		ret = intel_gt_reset_engine(engine->i915, engine);
-	else
-		ret = intel_guc_reset_engine(&engine->i915->guc, engine);
-	if (ret) {
-		/* If we fail here, we expect to fallback to a global reset */
-		DRM_DEBUG_DRIVER("%sFailed to reset %s, ret=%d\n",
-				 engine->i915->guc.execbuf_client ? "GuC " : "",
-				 engine->name, ret);
-		goto out;
-	}
-
-	/*
-	 * The request that caused the hang is stuck on elsp, we know the
-	 * active request and can drop it, adjust head to skip the offending
-	 * request to resume executing remaining requests in the queue.
-	 */
-	i915_gem_reset_engine(engine, active_request, true);
-
-	/*
-	 * The engine and its registers (and workarounds in case of render)
-	 * have been reset to their default values. Follow the init_ring
-	 * process to program RING_MODE, HWSP and re-enable submission.
-	 */
-	ret = engine->init_hw(engine);
-	if (ret)
-		goto out;
-
-out:
-	intel_engine_cancel_stop_cs(engine);
-	i915_gem_reset_finish_engine(engine);
-	return ret;
-}
-
 static int i915_pm_prepare(struct device *kdev)
 {
 	struct pci_dev *pdev = to_pci_dev(kdev);
@@ -2736,6 +2545,10 @@ static void vlv_restore_gunit_s0ix_state(struct drm_i915_private *dev_priv)
 static int vlv_wait_for_pw_status(struct drm_i915_private *dev_priv,
 				  u32 mask, u32 val)
 {
+	i915_reg_t reg = VLV_GTLC_PW_STATUS;
+	u32 reg_value;
+	int ret;
+
 	/* The HW does not like us polling for PW_STATUS frequently, so
 	 * use the sleeping loop rather than risk the busy spin within
 	 * intel_wait_for_register().
@@ -2743,8 +2556,12 @@ static int vlv_wait_for_pw_status(struct drm_i915_private *dev_priv,
 	 * Transitioning between RC6 states should be at most 2ms (see
 	 * valleyview_enable_rps) so use a 3ms timeout.
 	 */
-	return wait_for((I915_READ_NOTRACE(VLV_GTLC_PW_STATUS) & mask) == val,
-			3);
+	ret = wait_for(((reg_value = I915_READ_NOTRACE(reg)) & mask) == val, 3);
+
+	/* just trace the final value */
+	trace_i915_reg_rw(false, reg, reg_value, sizeof(reg_value), true);
+
+	return ret;
 }
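
The reworked poll above assigns each read to reg_value inside the wait_for() condition, so the final observation is still in hand for the single trace_i915_reg_rw() event after the loop. A minimal standalone sketch of that assignment-in-condition idiom, in plain userspace C with read_status() as a stand-in for the driver's MMIO accessor:

#include <stdint.h>
#include <stdio.h>

static uint32_t fake_status;

static uint32_t read_status(void)
{
	/* Stand-in for I915_READ_NOTRACE(); the real code reads MMIO. */
	return ++fake_status;
}

static int wait_for_status(uint32_t mask, uint32_t val, int tries)
{
	uint32_t reg_value = 0;
	int ret = -1;	/* stand-in for -ETIMEDOUT */

	while (tries--) {
		/* The read is assigned within the condition, so the last
		 * sampled value survives the loop. */
		if (((reg_value = read_status()) & mask) == val) {
			ret = 0;
			break;
		}
	}

	/* Trace only the final value, as the hunk above does. */
	printf("final status 0x%08x, ret=%d\n", reg_value, ret);
	return ret;
}

int main(void)
{
	return wait_for_status(0xff, 0x10, 32);
}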
 
 int vlv_force_gfx_clock(struct drm_i915_private *dev_priv, bool force_on)
@@ -2959,7 +2776,7 @@ static int intel_runtime_suspend(struct device *kdev)
 	}
 
 	enable_rpm_wakeref_asserts(dev_priv);
-	WARN_ON_ONCE(atomic_read(&dev_priv->runtime_pm.wakeref_count));
+	intel_runtime_pm_cleanup(dev_priv);
 
 	if (intel_uncore_arm_unclaimed_mmio_detection(dev_priv))
 		DRM_ERROR("Unclaimed access detected prior to suspending\n");
@@ -3203,7 +3020,7 @@ static struct drm_driver driver = {
 	 * deal with them for Intel hardware.
 	 */
 	.driver_features =
-	    DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED | DRIVER_GEM | DRIVER_PRIME |
+	    DRIVER_GEM | DRIVER_PRIME |
 	    DRIVER_RENDER | DRIVER_MODESET | DRIVER_ATOMIC | DRIVER_SYNCOBJ,
 	.release = i915_driver_release,
 	.open = i915_driver_open,
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b1c31967194b..9adc7bb9e69c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -45,8 +45,8 @@
 #include <linux/pm_qos.h>
 #include <linux/reservation.h>
 #include <linux/shmem_fs.h>
+#include <linux/stackdepot.h>
 
-#include <drm/drmP.h>
 #include <drm/intel-gtt.h>
 #include <drm/drm_legacy.h> /* for struct drm_dma_handle */
 #include <drm/drm_gem.h>
@@ -54,6 +54,7 @@
 #include <drm/drm_cache.h>
 #include <drm/drm_util.h>
 #include <drm/drm_dsc.h>
+#include <drm/drm_connector.h>
 
 #include "i915_fixed.h"
 #include "i915_params.h"
@@ -90,8 +91,8 @@
 
 #define DRIVER_NAME		"i915"
 #define DRIVER_DESC		"Intel Graphics"
-#define DRIVER_DATE		"20181204"
-#define DRIVER_TIMESTAMP	1543944377
+#define DRIVER_DATE		"20190207"
+#define DRIVER_TIMESTAMP	1549572331
 
 /* Use I915_STATE_WARN(x) and I915_STATE_WARN_ON() (rather than WARN() and
  * WARN_ON()) for hw state sanity checks to check for unexpected conditions
@@ -130,6 +131,8 @@ bool i915_error_injected(void);
 	__i915_printk(i915, i915_error_injected() ? KERN_DEBUG : KERN_ERR, \
 		      fmt, ##__VA_ARGS__)
 
+typedef depot_stack_handle_t intel_wakeref_t;
+
 enum hpd_pin {
 	HPD_NONE = 0,
 	HPD_TV = HPD_NONE,     /* TV is known to be unreliable */
@@ -281,16 +284,14 @@ struct drm_i915_display_funcs {
 	int (*get_fifo_size)(struct drm_i915_private *dev_priv,
 			     enum i9xx_plane_id i9xx_plane);
 	int (*compute_pipe_wm)(struct intel_crtc_state *cstate);
-	int (*compute_intermediate_wm)(struct drm_device *dev,
-				       struct intel_crtc *intel_crtc,
-				       struct intel_crtc_state *newstate);
+	int (*compute_intermediate_wm)(struct intel_crtc_state *newstate);
 	void (*initial_watermarks)(struct intel_atomic_state *state,
 				   struct intel_crtc_state *cstate);
 	void (*atomic_update_watermarks)(struct intel_atomic_state *state,
 					 struct intel_crtc_state *cstate);
 	void (*optimize_watermarks)(struct intel_atomic_state *state,
 				    struct intel_crtc_state *cstate);
-	int (*compute_global_watermarks)(struct drm_atomic_state *state);
+	int (*compute_global_watermarks)(struct intel_atomic_state *state);
 	void (*update_wm)(struct intel_crtc *crtc);
 	int (*modeset_calc_cdclk)(struct drm_atomic_state *state);
 	/* Returns the active state of the crtc, and if the crtc is active,
@@ -322,8 +323,20 @@ struct drm_i915_display_funcs {
 	/* display clock increase/decrease */
 	/* pll clock increase/decrease */
 
-	void (*load_csc_matrix)(struct drm_crtc_state *crtc_state);
-	void (*load_luts)(struct drm_crtc_state *crtc_state);
+	/*
+	 * Program double buffered color management registers during
+	 * vblank evasion. The registers should then latch during the
+	 * next vblank start, alongside any other double buffered registers
+	 * involved with the same commit.
+	 */
+	void (*color_commit)(const struct intel_crtc_state *crtc_state);
+	/*
+	 * Load LUTs (and other single buffered color management
+	 * registers). Will (hopefully) be called during the vblank
+	 * following the latching of any double buffered registers
+	 * involved with the same commit.
+	 */
+	void (*load_luts)(const struct intel_crtc_state *crtc_state);
 };
 
 #define CSR_VERSION(major, minor)	((major) << 16 | (minor))
@@ -333,16 +346,17 @@ struct drm_i915_display_funcs {
 struct intel_csr {
 	struct work_struct work;
 	const char *fw_path;
-	uint32_t required_version;
-	uint32_t max_fw_size; /* bytes */
-	uint32_t *dmc_payload;
-	uint32_t dmc_fw_size; /* dwords */
-	uint32_t version;
-	uint32_t mmio_count;
+	u32 required_version;
+	u32 max_fw_size; /* bytes */
+	u32 *dmc_payload;
+	u32 dmc_fw_size; /* dwords */
+	u32 version;
+	u32 mmio_count;
 	i915_reg_t mmioaddr[8];
-	uint32_t mmiodata[8];
-	uint32_t dc_state;
-	uint32_t allowed_dc_mask;
+	u32 mmiodata[8];
+	u32 dc_state;
+	u32 allowed_dc_mask;
+	intel_wakeref_t wakeref;
 };
 
 enum i915_cache_level {
@@ -398,7 +412,7 @@ struct intel_fbc {
 
 		struct {
 			unsigned int mode_flags;
-			uint32_t hsw_bdw_pixel_rate;
+			u32 hsw_bdw_pixel_rate;
 		} crtc;
 
 		struct {
@@ -417,7 +431,7 @@ struct intel_fbc {
 
 			int y;
 
-			uint16_t pixel_blend_mode;
+			u16 pixel_blend_mode;
 		} plane;
 
 		struct {
@@ -509,6 +523,7 @@ struct i915_psr {
 	ktime_t last_exit;
 	bool sink_not_reliable;
 	bool irq_aux_error;
+	u16 su_x_granularity;
 };
 
 enum intel_pch {
@@ -556,7 +571,7 @@ struct i915_suspend_saved_registers {
 	u32 saveSWF0[16];
 	u32 saveSWF1[16];
 	u32 saveSWF3[3];
-	uint64_t saveFENCE[I915_MAX_NUM_FENCES];
+	u64 saveFENCE[I915_MAX_NUM_FENCES];
 	u32 savePCH_PORT_HOTPLUG;
 	u16 saveGCDGMBUS;
 };
@@ -819,6 +834,8 @@ struct i915_power_domains {
 	bool display_core_suspended;
 	int power_well_count;
 
+	intel_wakeref_t wakeref;
+
 	struct mutex lock;
 	int domain_use_count[POWER_DOMAIN_NUM];
 	struct i915_power_well *power_wells;
@@ -901,9 +918,9 @@ struct i915_gem_mm {
 	atomic_t bsd_engine_dispatch_index;
 
 	/** Bit 6 swizzling required for X tiling */
-	uint32_t bit_6_swizzle_x;
+	u32 bit_6_swizzle_x;
 	/** Bit 6 swizzling required for Y tiling */
-	uint32_t bit_6_swizzle_y;
+	u32 bit_6_swizzle_y;
 
 	/* accounting, useful for userland debugging */
 	spinlock_t object_stat_lock;
@@ -930,18 +947,20 @@ struct ddi_vbt_port_info {
 	 * populate this field.
 	 */
 #define HDMI_LEVEL_SHIFT_UNKNOWN	0xff
-	uint8_t hdmi_level_shift;
+	u8 hdmi_level_shift;
 
-	uint8_t supports_dvi:1;
-	uint8_t supports_hdmi:1;
-	uint8_t supports_dp:1;
-	uint8_t supports_edp:1;
+	u8 supports_dvi:1;
+	u8 supports_hdmi:1;
+	u8 supports_dp:1;
+	u8 supports_edp:1;
+	u8 supports_typec_usb:1;
+	u8 supports_tbt:1;
 
-	uint8_t alternate_aux_channel;
-	uint8_t alternate_ddc_pin;
+	u8 alternate_aux_channel;
+	u8 alternate_ddc_pin;
 
-	uint8_t dp_boost_level;
-	uint8_t hdmi_boost_level;
+	u8 dp_boost_level;
+	u8 hdmi_boost_level;
 	int dp_max_link_rate;		/* 0 for not limited by VBT */
 };
 
@@ -1032,41 +1051,41 @@ enum intel_ddb_partitioning {
 
 struct intel_wm_level {
 	bool enable;
-	uint32_t pri_val;
-	uint32_t spr_val;
-	uint32_t cur_val;
-	uint32_t fbc_val;
+	u32 pri_val;
+	u32 spr_val;
+	u32 cur_val;
+	u32 fbc_val;
 };
 
 struct ilk_wm_values {
-	uint32_t wm_pipe[3];
-	uint32_t wm_lp[3];
-	uint32_t wm_lp_spr[3];
-	uint32_t wm_linetime[3];
+	u32 wm_pipe[3];
+	u32 wm_lp[3];
+	u32 wm_lp_spr[3];
+	u32 wm_linetime[3];
 	bool enable_fbc_wm;
 	enum intel_ddb_partitioning partitioning;
 };
 
 struct g4x_pipe_wm {
-	uint16_t plane[I915_MAX_PLANES];
-	uint16_t fbc;
+	u16 plane[I915_MAX_PLANES];
+	u16 fbc;
 };
 
 struct g4x_sr_wm {
-	uint16_t plane;
-	uint16_t cursor;
-	uint16_t fbc;
+	u16 plane;
+	u16 cursor;
+	u16 fbc;
 };
 
 struct vlv_wm_ddl_values {
-	uint8_t plane[I915_MAX_PLANES];
+	u8 plane[I915_MAX_PLANES];
 };
 
 struct vlv_wm_values {
 	struct g4x_pipe_wm pipe[3];
 	struct g4x_sr_wm sr;
 	struct vlv_wm_ddl_values ddl[3];
-	uint8_t level;
+	u8 level;
 	bool cxsr;
 };
 
@@ -1080,10 +1099,10 @@ struct g4x_wm_values {
 };
 
 struct skl_ddb_entry {
-	uint16_t start, end;	/* in number of blocks, 'end' is exclusive */
+	u16 start, end;	/* in number of blocks, 'end' is exclusive */
 };
 
-static inline uint16_t skl_ddb_entry_size(const struct skl_ddb_entry *entry)
+static inline u16 skl_ddb_entry_size(const struct skl_ddb_entry *entry)
 {
 	return entry->end - entry->start;
 }
@@ -1107,8 +1126,9 @@ struct skl_ddb_values {
 };
 
 struct skl_wm_level {
-	uint16_t plane_res_b;
-	uint8_t plane_res_l;
+	u16 min_ddb_alloc;
+	u16 plane_res_b;
+	u8 plane_res_l;
 	bool plane_en;
 };
 
@@ -1117,15 +1137,15 @@ struct skl_wm_params {
 	bool x_tiled, y_tiled;
 	bool rc_surface;
 	bool is_planar;
-	uint32_t width;
-	uint8_t cpp;
-	uint32_t plane_pixel_rate;
-	uint32_t y_min_scanlines;
-	uint32_t plane_bytes_per_line;
+	u32 width;
+	u8 cpp;
+	u32 plane_pixel_rate;
+	u32 y_min_scanlines;
+	u32 plane_bytes_per_line;
 	uint_fixed_16_16_t plane_blocks_per_line;
 	uint_fixed_16_16_t y_tile_minimum;
-	uint32_t linetime_us;
-	uint32_t dbuf_block_size;
+	u32 linetime_us;
+	u32 dbuf_block_size;
 };
 
 /*
@@ -1155,6 +1175,25 @@ struct i915_runtime_pm {
 	atomic_t wakeref_count;
 	bool suspended;
 	bool irqs_enabled;
+
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_RUNTIME_PM)
+	/*
+	 * To aid detection of wakeref leaks and general misuse, we
+	 * track all wakeref holders. With manual markup (i.e. returning
+	 * a cookie to each rpm_get caller which they then supply to their
+	 * paired rpm_put) we can remove the corresponding pairs and keep
+	 * the array trimmed to active wakerefs.
+	 */
+	struct intel_runtime_pm_debug {
+		spinlock_t lock;
+
+		depot_stack_handle_t last_acquire;
+		depot_stack_handle_t last_release;
+
+		depot_stack_handle_t *owners;
+		unsigned long count;
+	} debug;
+#endif
 };
 
 enum intel_pipe_crc_source {
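
The debug block added to struct i915_runtime_pm above supports a cookie discipline: each intel_runtime_pm_get() returns an intel_wakeref_t that the caller must pass to the paired put, so the tracker can delete the matching entry and keep the owners array trimmed to live holders. A hedged, standalone illustration of that bookkeeping in plain C, with integer cookies standing in for depot stack handles:

#include <assert.h>
#include <stdlib.h>

struct wakeref_tracker {
	unsigned long *owners;
	unsigned long count;
};

static unsigned long tracker_get(struct wakeref_tracker *t, unsigned long stack)
{
	t->owners = realloc(t->owners, (t->count + 1) * sizeof(*t->owners));
	assert(t->owners);
	t->owners[t->count++] = stack;
	return stack;	/* the cookie handed back to the caller */
}

static void tracker_put(struct wakeref_tracker *t, unsigned long cookie)
{
	for (unsigned long i = 0; i < t->count; i++) {
		if (t->owners[i] == cookie) {
			t->owners[i] = t->owners[--t->count];
			return;
		}
	}
	assert(!"unbalanced put: cookie not found");
}

int main(void)
{
	struct wakeref_tracker t = { 0 };
	unsigned long wf = tracker_get(&t, 0x1234);

	tracker_put(&t, wf);
	assert(t.count == 0);	/* leftover entries would be leaked wakerefs */
	free(t.owners);
	return 0;
}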
@@ -1311,6 +1350,12 @@ struct i915_perf_stream {
 	struct list_head link;
 
 	/**
+	 * @wakeref: As we keep the device awake while the perf stream is
+	 * active, we track our runtime pm reference for later release.
+	 */
+	intel_wakeref_t wakeref;
+
+	/**
 	 * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*`
 	 * properties given when opening a stream, representing the contents
 	 * of a single sample as read() by userspace.
@@ -1430,7 +1475,8 @@ struct drm_i915_private {
 	struct kmem_cache *dependencies;
 	struct kmem_cache *priorities;
 
-	const struct intel_device_info info;
+	const struct intel_device_info __info; /* Use INTEL_INFO() to access. */
+	struct intel_runtime_info __runtime; /* Use RUNTIME_INFO() to access. */
 	struct intel_driver_caps caps;
 
 	/**
@@ -1482,14 +1528,14 @@ struct drm_i915_private {
 	 * Base address of where the gmbus and gpio blocks are located (either
 	 * on PCH or on SoC for platforms without PCH).
 	 */
-	uint32_t gpio_mmio_base;
+	u32 gpio_mmio_base;
 
 	/* MMIO base address for MIPI regs */
-	uint32_t mipi_mmio_base;
+	u32 mipi_mmio_base;
 
-	uint32_t psr_mmio_base;
+	u32 psr_mmio_base;
 
-	uint32_t pps_mmio_base;
+	u32 pps_mmio_base;
 
 	wait_queue_head_t gmbus_wait_queue;
 
@@ -1744,17 +1790,17 @@ struct drm_i915_private {
 		 * in 0.5us units for WM1+.
 		 */
 		/* primary */
-		uint16_t pri_latency[5];
+		u16 pri_latency[5];
 		/* sprite */
-		uint16_t spr_latency[5];
+		u16 spr_latency[5];
 		/* cursor */
-		uint16_t cur_latency[5];
+		u16 cur_latency[5];
 		/*
 		 * Raw watermark memory latency values
 		 * for SKL for all 8 levels
 		 * in 1us units.
 		 */
-		uint16_t skl_latency[8];
+		u16 skl_latency[8];
 
 		/* current hardware state */
 		union {
@@ -1764,7 +1810,7 @@ struct drm_i915_private {
 			struct g4x_wm_values g4x;
 		};
 
-		uint8_t max_level;
+		u8 max_level;
 
 		/*
 		 * Should be held around atomic WM register writing; also
@@ -1942,12 +1988,18 @@ struct drm_i915_private {
 		void (*resume)(struct drm_i915_private *);
 		void (*cleanup_engine)(struct intel_engine_cs *engine);
 
-		struct list_head timelines;
+		struct i915_gt_timelines {
+			struct mutex mutex; /* protects list, tainted by GPU */
+			struct list_head active_list;
+
+			/* Pack multiple timelines' seqnos into the same page */
+			spinlock_t hwsp_lock;
+			struct list_head hwsp_free_list;
+		} timelines;
 
 		struct list_head active_rings;
 		struct list_head closed_vma;
 		u32 active_requests;
-		u32 request_serial;
 
 		/**
 		 * Is the GPU currently considered idle, or busy executing
@@ -1956,7 +2008,7 @@ struct drm_i915_private {
 		 * In order to reduce the effect on performance, there
 		 * is a slight delay before we do so.
 		 */
-		bool awake;
+		intel_wakeref_t awake;
 
 		/**
 		 * The number of times we have woken up.
@@ -2191,17 +2243,12 @@ static inline unsigned int i915_sg_segment_size(void)
 	return size;
 }
 
-static inline const struct intel_device_info *
-intel_info(const struct drm_i915_private *dev_priv)
-{
-	return &dev_priv->info;
-}
-
-#define INTEL_INFO(dev_priv)	intel_info((dev_priv))
+#define INTEL_INFO(dev_priv)	(&(dev_priv)->__info)
+#define RUNTIME_INFO(dev_priv)	(&(dev_priv)->__runtime)
 #define DRIVER_CAPS(dev_priv)	(&(dev_priv)->caps)
 
-#define INTEL_GEN(dev_priv)	((dev_priv)->info.gen)
-#define INTEL_DEVID(dev_priv)	((dev_priv)->info.device_id)
+#define INTEL_GEN(dev_priv)	(INTEL_INFO(dev_priv)->gen)
+#define INTEL_DEVID(dev_priv)	(RUNTIME_INFO(dev_priv)->device_id)
 
 #define REVID_FOREVER		0xff
 #define INTEL_REVID(dev_priv)	((dev_priv)->drm.pdev->revision)
@@ -2212,8 +2259,12 @@ intel_info(const struct drm_i915_private *dev_priv)
 	GENMASK((e) - 1, (s) - 1))
 
 /* Returns true if Gen is in inclusive range [Start, End] */
-#define IS_GEN(dev_priv, s, e) \
-	(!!((dev_priv)->info.gen_mask & INTEL_GEN_MASK((s), (e))))
+#define IS_GEN_RANGE(dev_priv, s, e) \
+	(!!(INTEL_INFO(dev_priv)->gen_mask & INTEL_GEN_MASK((s), (e))))
+
+#define IS_GEN(dev_priv, n) \
+	(BUILD_BUG_ON_ZERO(!__builtin_constant_p(n)) + \
+	 INTEL_INFO(dev_priv)->gen == (n))
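
The IS_GEN() replacement above only accepts a compile-time constant generation: BUILD_BUG_ON_ZERO(!__builtin_constant_p(n)) evaluates to zero when n is constant and refuses to compile otherwise. A standalone sketch of the same guard; the negative-width-bitfield trick mirrors the kernel's macro in <linux/build_bug.h> (GCC extension):

#include <stdio.h>

#define BUILD_BUG_ON_ZERO(e) ((int)sizeof(struct { int:(-!!(e)); }))

#define MY_IS_GEN(gen, n) \
	(BUILD_BUG_ON_ZERO(!__builtin_constant_p(n)) + ((gen) == (n)))

int main(void)
{
	int gen = 9;

	printf("%d\n", MY_IS_GEN(gen, 9));	/* 1: 9 is a constant */
	/*
	 * MY_IS_GEN(gen, gen) would normally fail to compile, though an
	 * optimizer that constant-folds gen may still accept it.
	 */
	return 0;
}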
 
 /*
  * Return true if revision is in range [since,until] inclusive.
@@ -2223,7 +2274,7 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define IS_REVID(p, since, until) \
 	(INTEL_REVID(p) >= (since) && INTEL_REVID(p) <= (until))
 
-#define IS_PLATFORM(dev_priv, p) ((dev_priv)->info.platform_mask & BIT(p))
+#define IS_PLATFORM(dev_priv, p) (INTEL_INFO(dev_priv)->platform_mask & BIT(p))
 
 #define IS_I830(dev_priv)	IS_PLATFORM(dev_priv, INTEL_I830)
 #define IS_I845G(dev_priv)	IS_PLATFORM(dev_priv, INTEL_I845G)
@@ -2245,7 +2296,7 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define IS_IRONLAKE_M(dev_priv)	(INTEL_DEVID(dev_priv) == 0x0046)
 #define IS_IVYBRIDGE(dev_priv)	IS_PLATFORM(dev_priv, INTEL_IVYBRIDGE)
 #define IS_IVB_GT1(dev_priv)	(IS_IVYBRIDGE(dev_priv) && \
-				 (dev_priv)->info.gt == 1)
+				 INTEL_INFO(dev_priv)->gt == 1)
 #define IS_VALLEYVIEW(dev_priv)	IS_PLATFORM(dev_priv, INTEL_VALLEYVIEW)
 #define IS_CHERRYVIEW(dev_priv)	IS_PLATFORM(dev_priv, INTEL_CHERRYVIEW)
 #define IS_HASWELL(dev_priv)	IS_PLATFORM(dev_priv, INTEL_HASWELL)
@@ -2257,7 +2308,7 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define IS_COFFEELAKE(dev_priv)	IS_PLATFORM(dev_priv, INTEL_COFFEELAKE)
 #define IS_CANNONLAKE(dev_priv)	IS_PLATFORM(dev_priv, INTEL_CANNONLAKE)
 #define IS_ICELAKE(dev_priv)	IS_PLATFORM(dev_priv, INTEL_ICELAKE)
-#define IS_MOBILE(dev_priv)	((dev_priv)->info.is_mobile)
+#define IS_MOBILE(dev_priv)	(INTEL_INFO(dev_priv)->is_mobile)
 #define IS_HSW_EARLY_SDV(dev_priv) (IS_HASWELL(dev_priv) && \
 				    (INTEL_DEVID(dev_priv) & 0xFF00) == 0x0C00)
 #define IS_BDW_ULT(dev_priv)	(IS_BROADWELL(dev_priv) && \
@@ -2268,11 +2319,13 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define IS_BDW_ULX(dev_priv)	(IS_BROADWELL(dev_priv) && \
 				 (INTEL_DEVID(dev_priv) & 0xf) == 0xe)
 #define IS_BDW_GT3(dev_priv)	(IS_BROADWELL(dev_priv) && \
-				 (dev_priv)->info.gt == 3)
+				 INTEL_INFO(dev_priv)->gt == 3)
 #define IS_HSW_ULT(dev_priv)	(IS_HASWELL(dev_priv) && \
 				 (INTEL_DEVID(dev_priv) & 0xFF00) == 0x0A00)
 #define IS_HSW_GT3(dev_priv)	(IS_HASWELL(dev_priv) && \
-				 (dev_priv)->info.gt == 3)
+				 INTEL_INFO(dev_priv)->gt == 3)
+#define IS_HSW_GT1(dev_priv)	(IS_HASWELL(dev_priv) && \
+				 INTEL_INFO(dev_priv)->gt == 1)
 /* ULX machines are also considered ULT. */
 #define IS_HSW_ULX(dev_priv)	(INTEL_DEVID(dev_priv) == 0x0A0E || \
 				 INTEL_DEVID(dev_priv) == 0x0A1E)
@@ -2295,23 +2348,25 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define IS_AML_ULX(dev_priv)	(INTEL_DEVID(dev_priv) == 0x591C || \
 				 INTEL_DEVID(dev_priv) == 0x87C0)
 #define IS_SKL_GT2(dev_priv)	(IS_SKYLAKE(dev_priv) && \
-				 (dev_priv)->info.gt == 2)
+				 INTEL_INFO(dev_priv)->gt == 2)
 #define IS_SKL_GT3(dev_priv)	(IS_SKYLAKE(dev_priv) && \
-				 (dev_priv)->info.gt == 3)
+				 INTEL_INFO(dev_priv)->gt == 3)
 #define IS_SKL_GT4(dev_priv)	(IS_SKYLAKE(dev_priv) && \
-				 (dev_priv)->info.gt == 4)
+				 INTEL_INFO(dev_priv)->gt == 4)
 #define IS_KBL_GT2(dev_priv)	(IS_KABYLAKE(dev_priv) && \
-				 (dev_priv)->info.gt == 2)
+				 INTEL_INFO(dev_priv)->gt == 2)
 #define IS_KBL_GT3(dev_priv)	(IS_KABYLAKE(dev_priv) && \
-				 (dev_priv)->info.gt == 3)
+				 INTEL_INFO(dev_priv)->gt == 3)
 #define IS_CFL_ULT(dev_priv)	(IS_COFFEELAKE(dev_priv) && \
 				 (INTEL_DEVID(dev_priv) & 0x00F0) == 0x00A0)
 #define IS_CFL_GT2(dev_priv)	(IS_COFFEELAKE(dev_priv) && \
-				 (dev_priv)->info.gt == 2)
+				 INTEL_INFO(dev_priv)->gt == 2)
 #define IS_CFL_GT3(dev_priv)	(IS_COFFEELAKE(dev_priv) && \
-				 (dev_priv)->info.gt == 3)
+				 INTEL_INFO(dev_priv)->gt == 3)
 #define IS_CNL_WITH_PORT_F(dev_priv)   (IS_CANNONLAKE(dev_priv) && \
 					(INTEL_DEVID(dev_priv) & 0x0004) == 0x0004)
+#define IS_ICL_WITH_PORT_F(dev_priv)   (IS_ICELAKE(dev_priv) && \
+					INTEL_DEVID(dev_priv) != 0x8A51)
 
 #define IS_ALPHA_SUPPORT(intel_info) ((intel_info)->is_alpha_support)
 
@@ -2366,26 +2421,9 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define IS_ICL_REVID(p, since, until) \
 	(IS_ICELAKE(p) && IS_REVID(p, since, until))
 
-/*
- * The genX designation typically refers to the render engine, so render
- * capability related checks should use IS_GEN, while display and other checks
- * have their own (e.g. HAS_PCH_SPLIT for ILK+ display, IS_foo for particular
- * chips, etc.).
- */
-#define IS_GEN2(dev_priv)	(!!((dev_priv)->info.gen_mask & BIT(1)))
-#define IS_GEN3(dev_priv)	(!!((dev_priv)->info.gen_mask & BIT(2)))
-#define IS_GEN4(dev_priv)	(!!((dev_priv)->info.gen_mask & BIT(3)))
-#define IS_GEN5(dev_priv)	(!!((dev_priv)->info.gen_mask & BIT(4)))
-#define IS_GEN6(dev_priv)	(!!((dev_priv)->info.gen_mask & BIT(5)))
-#define IS_GEN7(dev_priv)	(!!((dev_priv)->info.gen_mask & BIT(6)))
-#define IS_GEN8(dev_priv)	(!!((dev_priv)->info.gen_mask & BIT(7)))
-#define IS_GEN9(dev_priv)	(!!((dev_priv)->info.gen_mask & BIT(8)))
-#define IS_GEN10(dev_priv)	(!!((dev_priv)->info.gen_mask & BIT(9)))
-#define IS_GEN11(dev_priv)	(!!((dev_priv)->info.gen_mask & BIT(10)))
-
 #define IS_LP(dev_priv)	(INTEL_INFO(dev_priv)->is_lp)
-#define IS_GEN9_LP(dev_priv)	(IS_GEN9(dev_priv) && IS_LP(dev_priv))
-#define IS_GEN9_BC(dev_priv)	(IS_GEN9(dev_priv) && !IS_LP(dev_priv))
+#define IS_GEN9_LP(dev_priv)	(IS_GEN(dev_priv, 9) && IS_LP(dev_priv))
+#define IS_GEN9_BC(dev_priv)	(IS_GEN(dev_priv, 9) && !IS_LP(dev_priv))
 
 #define ENGINE_MASK(id)	BIT(id)
 #define RENDER_RING	ENGINE_MASK(RCS)
@@ -2399,29 +2437,27 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define ALL_ENGINES	(~0)
 
 #define HAS_ENGINE(dev_priv, id) \
-	(!!((dev_priv)->info.ring_mask & ENGINE_MASK(id)))
+	(!!(INTEL_INFO(dev_priv)->ring_mask & ENGINE_MASK(id)))
 
 #define HAS_BSD(dev_priv)	HAS_ENGINE(dev_priv, VCS)
 #define HAS_BSD2(dev_priv)	HAS_ENGINE(dev_priv, VCS2)
 #define HAS_BLT(dev_priv)	HAS_ENGINE(dev_priv, BCS)
 #define HAS_VEBOX(dev_priv)	HAS_ENGINE(dev_priv, VECS)
 
-#define HAS_LEGACY_SEMAPHORES(dev_priv) IS_GEN7(dev_priv)
-
-#define HAS_LLC(dev_priv)	((dev_priv)->info.has_llc)
-#define HAS_SNOOP(dev_priv)	((dev_priv)->info.has_snoop)
+#define HAS_LLC(dev_priv)	(INTEL_INFO(dev_priv)->has_llc)
+#define HAS_SNOOP(dev_priv)	(INTEL_INFO(dev_priv)->has_snoop)
 #define HAS_EDRAM(dev_priv)	(!!((dev_priv)->edram_cap & EDRAM_ENABLED))
 #define HAS_WT(dev_priv)	((IS_HASWELL(dev_priv) || \
 				 IS_BROADWELL(dev_priv)) && HAS_EDRAM(dev_priv))
 
-#define HWS_NEEDS_PHYSICAL(dev_priv)	((dev_priv)->info.hws_needs_physical)
+#define HWS_NEEDS_PHYSICAL(dev_priv)	(INTEL_INFO(dev_priv)->hws_needs_physical)
 
 #define HAS_LOGICAL_RING_CONTEXTS(dev_priv) \
-		((dev_priv)->info.has_logical_ring_contexts)
+		(INTEL_INFO(dev_priv)->has_logical_ring_contexts)
 #define HAS_LOGICAL_RING_ELSQ(dev_priv) \
-		((dev_priv)->info.has_logical_ring_elsq)
+		(INTEL_INFO(dev_priv)->has_logical_ring_elsq)
 #define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \
-		((dev_priv)->info.has_logical_ring_preemption)
+		(INTEL_INFO(dev_priv)->has_logical_ring_preemption)
 
 #define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv)
 
@@ -2435,12 +2471,12 @@ intel_info(const struct drm_i915_private *dev_priv)
 
 #define HAS_PAGE_SIZES(dev_priv, sizes) ({ \
 	GEM_BUG_ON((sizes) == 0); \
-	((sizes) & ~(dev_priv)->info.page_sizes) == 0; \
+	((sizes) & ~INTEL_INFO(dev_priv)->page_sizes) == 0; \
 })
 
-#define HAS_OVERLAY(dev_priv)		 ((dev_priv)->info.display.has_overlay)
+#define HAS_OVERLAY(dev_priv)		 (INTEL_INFO(dev_priv)->display.has_overlay)
 #define OVERLAY_NEEDS_PHYSICAL(dev_priv) \
-		((dev_priv)->info.display.overlay_needs_physical)
+		(INTEL_INFO(dev_priv)->display.overlay_needs_physical)
 
 /* Early gen2 have a totally busted CS tlb and require pinned batches. */
 #define HAS_BROKEN_CS_TLB(dev_priv)	(IS_I830(dev_priv) || IS_I845G(dev_priv))
@@ -2458,42 +2494,42 @@ intel_info(const struct drm_i915_private *dev_priv)
 /* With the 945 and later, Y tiling got adjusted so that it was 32 128-byte
  * rows, which changed the alignment requirements and fence programming.
  */
-#define HAS_128_BYTE_Y_TILING(dev_priv) (!IS_GEN2(dev_priv) && \
+#define HAS_128_BYTE_Y_TILING(dev_priv) (!IS_GEN(dev_priv, 2) && \
 					 !(IS_I915G(dev_priv) || \
 					 IS_I915GM(dev_priv)))
-#define SUPPORTS_TV(dev_priv)		((dev_priv)->info.display.supports_tv)
-#define I915_HAS_HOTPLUG(dev_priv)	((dev_priv)->info.display.has_hotplug)
+#define SUPPORTS_TV(dev_priv)		(INTEL_INFO(dev_priv)->display.supports_tv)
+#define I915_HAS_HOTPLUG(dev_priv)	(INTEL_INFO(dev_priv)->display.has_hotplug)
 
 #define HAS_FW_BLC(dev_priv) 	(INTEL_GEN(dev_priv) > 2)
-#define HAS_FBC(dev_priv)	((dev_priv)->info.display.has_fbc)
-#define HAS_CUR_FBC(dev_priv)	(!HAS_GMCH_DISPLAY(dev_priv) && INTEL_GEN(dev_priv) >= 7)
+#define HAS_FBC(dev_priv)	(INTEL_INFO(dev_priv)->display.has_fbc)
+#define HAS_CUR_FBC(dev_priv)	(!HAS_GMCH(dev_priv) && INTEL_GEN(dev_priv) >= 7)
 
 #define HAS_IPS(dev_priv)	(IS_HSW_ULT(dev_priv) || IS_BROADWELL(dev_priv))
 
-#define HAS_DP_MST(dev_priv)	((dev_priv)->info.display.has_dp_mst)
+#define HAS_DP_MST(dev_priv)	(INTEL_INFO(dev_priv)->display.has_dp_mst)
 
-#define HAS_DDI(dev_priv)		 ((dev_priv)->info.display.has_ddi)
-#define HAS_FPGA_DBG_UNCLAIMED(dev_priv) ((dev_priv)->info.has_fpga_dbg)
-#define HAS_PSR(dev_priv)		 ((dev_priv)->info.display.has_psr)
+#define HAS_DDI(dev_priv)		 (INTEL_INFO(dev_priv)->display.has_ddi)
+#define HAS_FPGA_DBG_UNCLAIMED(dev_priv) (INTEL_INFO(dev_priv)->has_fpga_dbg)
+#define HAS_PSR(dev_priv)		 (INTEL_INFO(dev_priv)->display.has_psr)
 
-#define HAS_RC6(dev_priv)		 ((dev_priv)->info.has_rc6)
-#define HAS_RC6p(dev_priv)		 ((dev_priv)->info.has_rc6p)
+#define HAS_RC6(dev_priv)		 (INTEL_INFO(dev_priv)->has_rc6)
+#define HAS_RC6p(dev_priv)		 (INTEL_INFO(dev_priv)->has_rc6p)
 #define HAS_RC6pp(dev_priv)		 (false) /* HW was never validated */
 
-#define HAS_CSR(dev_priv)	((dev_priv)->info.display.has_csr)
+#define HAS_CSR(dev_priv)	(INTEL_INFO(dev_priv)->display.has_csr)
 
-#define HAS_RUNTIME_PM(dev_priv) ((dev_priv)->info.has_runtime_pm)
-#define HAS_64BIT_RELOC(dev_priv) ((dev_priv)->info.has_64bit_reloc)
+#define HAS_RUNTIME_PM(dev_priv) (INTEL_INFO(dev_priv)->has_runtime_pm)
+#define HAS_64BIT_RELOC(dev_priv) (INTEL_INFO(dev_priv)->has_64bit_reloc)
 
-#define HAS_IPC(dev_priv)		 ((dev_priv)->info.display.has_ipc)
+#define HAS_IPC(dev_priv)		 (INTEL_INFO(dev_priv)->display.has_ipc)
 
 /*
  * For now, anything with a GuC requires uCode loading, and then supports
  * command submission once loaded. But these are logically independent
  * properties, so we have separate macros to test them.
  */
-#define HAS_GUC(dev_priv)	((dev_priv)->info.has_guc)
-#define HAS_GUC_CT(dev_priv)	((dev_priv)->info.has_guc_ct)
+#define HAS_GUC(dev_priv)	(INTEL_INFO(dev_priv)->has_guc)
+#define HAS_GUC_CT(dev_priv)	(INTEL_INFO(dev_priv)->has_guc_ct)
 #define HAS_GUC_UCODE(dev_priv)	(HAS_GUC(dev_priv))
 #define HAS_GUC_SCHED(dev_priv)	(HAS_GUC(dev_priv))
 
@@ -2502,11 +2538,11 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define HAS_HUC_UCODE(dev_priv)	(HAS_GUC(dev_priv))
 
 /* Having a GuC is not the same as using a GuC */
-#define USES_GUC(dev_priv)		intel_uc_is_using_guc()
-#define USES_GUC_SUBMISSION(dev_priv)	intel_uc_is_using_guc_submission()
-#define USES_HUC(dev_priv)		intel_uc_is_using_huc()
+#define USES_GUC(dev_priv)		intel_uc_is_using_guc(dev_priv)
+#define USES_GUC_SUBMISSION(dev_priv)	intel_uc_is_using_guc_submission(dev_priv)
+#define USES_HUC(dev_priv)		intel_uc_is_using_huc(dev_priv)
 
-#define HAS_POOLED_EU(dev_priv)	((dev_priv)->info.has_pooled_eu)
+#define HAS_POOLED_EU(dev_priv)	(INTEL_INFO(dev_priv)->has_pooled_eu)
 
 #define INTEL_PCH_DEVICE_ID_MASK		0xff80
 #define INTEL_PCH_IBX_DEVICE_ID_TYPE		0x3b00
@@ -2546,12 +2582,12 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define HAS_PCH_NOP(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_NOP)
 #define HAS_PCH_SPLIT(dev_priv) (INTEL_PCH_TYPE(dev_priv) != PCH_NONE)
 
-#define HAS_GMCH_DISPLAY(dev_priv) ((dev_priv)->info.display.has_gmch_display)
+#define HAS_GMCH(dev_priv) (INTEL_INFO(dev_priv)->display.has_gmch)
 
 #define HAS_LSPCON(dev_priv) (INTEL_GEN(dev_priv) >= 9)
 
 /* DPF == dynamic parity feature */
-#define HAS_L3_DPF(dev_priv) ((dev_priv)->info.has_l3_dpf)
+#define HAS_L3_DPF(dev_priv) (INTEL_INFO(dev_priv)->has_l3_dpf)
 #define NUM_L3_SLICES(dev_priv) (IS_HSW_GT3(dev_priv) ? \
 				 2 : HAS_L3_DPF(dev_priv))
 
@@ -2601,19 +2637,7 @@ extern const struct dev_pm_ops i915_pm_ops;
 extern int i915_driver_load(struct pci_dev *pdev,
 			    const struct pci_device_id *ent);
 extern void i915_driver_unload(struct drm_device *dev);
-extern int intel_gpu_reset(struct drm_i915_private *dev_priv, u32 engine_mask);
-extern bool intel_has_gpu_reset(struct drm_i915_private *dev_priv);
-
-extern void i915_reset(struct drm_i915_private *i915,
-		       unsigned int stalled_mask,
-		       const char *reason);
-extern int i915_reset_engine(struct intel_engine_cs *engine,
-			     const char *reason);
-
-extern bool intel_has_reset_engine(struct drm_i915_private *dev_priv);
-extern int intel_reset_guc(struct drm_i915_private *dev_priv);
-extern int intel_guc_reset_engine(struct intel_guc *guc,
-				  struct intel_engine_cs *engine);
+
 extern void intel_engine_init_hangcheck(struct intel_engine_cs *engine);
 extern void intel_hangcheck_init(struct drm_i915_private *dev_priv);
 extern unsigned long i915_chipset_val(struct drm_i915_private *dev_priv);
@@ -2656,20 +2680,11 @@ static inline void i915_queue_hangcheck(struct drm_i915_private *dev_priv)
 			   &dev_priv->gpu_error.hangcheck_work, delay);
 }
 
-__printf(4, 5)
-void i915_handle_error(struct drm_i915_private *dev_priv,
-		       u32 engine_mask,
-		       unsigned long flags,
-		       const char *fmt, ...);
-#define I915_ERROR_CAPTURE BIT(0)
-
 extern void intel_irq_init(struct drm_i915_private *dev_priv);
 extern void intel_irq_fini(struct drm_i915_private *dev_priv);
 int intel_irq_install(struct drm_i915_private *dev_priv);
 void intel_irq_uninstall(struct drm_i915_private *dev_priv);
 
-void i915_clear_error_registers(struct drm_i915_private *dev_priv);
-
 static inline bool intel_gvt_active(struct drm_i915_private *dev_priv)
 {
 	return dev_priv->gvt;
@@ -2693,45 +2708,45 @@ i915_disable_pipestat(struct drm_i915_private *dev_priv, enum pipe pipe,
 void valleyview_enable_display_irqs(struct drm_i915_private *dev_priv);
 void valleyview_disable_display_irqs(struct drm_i915_private *dev_priv);
 void i915_hotplug_interrupt_update(struct drm_i915_private *dev_priv,
-				   uint32_t mask,
-				   uint32_t bits);
+				   u32 mask,
+				   u32 bits);
 void ilk_update_display_irq(struct drm_i915_private *dev_priv,
-			    uint32_t interrupt_mask,
-			    uint32_t enabled_irq_mask);
+			    u32 interrupt_mask,
+			    u32 enabled_irq_mask);
 static inline void
-ilk_enable_display_irq(struct drm_i915_private *dev_priv, uint32_t bits)
+ilk_enable_display_irq(struct drm_i915_private *dev_priv, u32 bits)
 {
 	ilk_update_display_irq(dev_priv, bits, bits);
 }
 static inline void
-ilk_disable_display_irq(struct drm_i915_private *dev_priv, uint32_t bits)
+ilk_disable_display_irq(struct drm_i915_private *dev_priv, u32 bits)
 {
 	ilk_update_display_irq(dev_priv, bits, 0);
 }
 void bdw_update_pipe_irq(struct drm_i915_private *dev_priv,
 			 enum pipe pipe,
-			 uint32_t interrupt_mask,
-			 uint32_t enabled_irq_mask);
+			 u32 interrupt_mask,
+			 u32 enabled_irq_mask);
 static inline void bdw_enable_pipe_irq(struct drm_i915_private *dev_priv,
-				       enum pipe pipe, uint32_t bits)
+				       enum pipe pipe, u32 bits)
 {
 	bdw_update_pipe_irq(dev_priv, pipe, bits, bits);
 }
 static inline void bdw_disable_pipe_irq(struct drm_i915_private *dev_priv,
-					enum pipe pipe, uint32_t bits)
+					enum pipe pipe, u32 bits)
 {
 	bdw_update_pipe_irq(dev_priv, pipe, bits, 0);
 }
 void ibx_display_interrupt_update(struct drm_i915_private *dev_priv,
-				  uint32_t interrupt_mask,
-				  uint32_t enabled_irq_mask);
+				  u32 interrupt_mask,
+				  u32 enabled_irq_mask);
 static inline void
-ibx_enable_display_interrupt(struct drm_i915_private *dev_priv, uint32_t bits)
+ibx_enable_display_interrupt(struct drm_i915_private *dev_priv, u32 bits)
 {
 	ibx_display_interrupt_update(dev_priv, bits, bits);
 }
 static inline void
-ibx_disable_display_interrupt(struct drm_i915_private *dev_priv, uint32_t bits)
+ibx_disable_display_interrupt(struct drm_i915_private *dev_priv, u32 bits)
 {
 	ibx_display_interrupt_update(dev_priv, bits, 0);
 }
@@ -2916,13 +2931,13 @@ i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 	__i915_gem_object_unpin_pages(obj);
 }
 
-enum i915_mm_subclass { /* lockdep subclass for obj->mm.lock */
+enum i915_mm_subclass { /* lockdep subclass for obj->mm.lock/struct_mutex */
 	I915_MM_NORMAL = 0,
-	I915_MM_SHRINKER
+	I915_MM_SHRINKER /* called "recursively" from direct-reclaim-esque code */
 };
 
-void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
-				 enum i915_mm_subclass subclass);
+int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
+				enum i915_mm_subclass subclass);
 void __i915_gem_object_invalidate(struct drm_i915_gem_object *obj);
 
 enum i915_map_type {
@@ -2991,7 +3006,7 @@ int i915_gem_dumb_create(struct drm_file *file_priv,
 			 struct drm_device *dev,
 			 struct drm_mode_create_dumb *args);
 int i915_gem_mmap_gtt(struct drm_file *file_priv, struct drm_device *dev,
-		      uint32_t handle, uint64_t *offset);
+		      u32 handle, u64 *offset);
 int i915_gem_mmap_gtt_version(void);
 
 void i915_gem_track_fb(struct drm_i915_gem_object *old,
@@ -3008,11 +3023,6 @@ static inline bool i915_reset_backoff(struct i915_gpu_error *error)
 	return unlikely(test_bit(I915_RESET_BACKOFF, &error->flags));
 }
 
-static inline bool i915_reset_handoff(struct i915_gpu_error *error)
-{
-	return unlikely(test_bit(I915_RESET_HANDOFF, &error->flags));
-}
-
 static inline bool i915_terminally_wedged(struct i915_gpu_error *error)
 {
 	return unlikely(test_bit(I915_WEDGED, &error->flags));
@@ -3034,18 +3044,8 @@ static inline u32 i915_reset_engine_count(struct i915_gpu_error *error,
 	return READ_ONCE(error->reset_engine_count[engine->id]);
 }
 
-struct i915_request *
-i915_gem_reset_prepare_engine(struct intel_engine_cs *engine);
-int i915_gem_reset_prepare(struct drm_i915_private *dev_priv);
-void i915_gem_reset(struct drm_i915_private *dev_priv,
-		    unsigned int stalled_mask);
-void i915_gem_reset_finish_engine(struct intel_engine_cs *engine);
-void i915_gem_reset_finish(struct drm_i915_private *dev_priv);
 void i915_gem_set_wedged(struct drm_i915_private *dev_priv);
 bool i915_gem_unset_wedged(struct drm_i915_private *dev_priv);
-void i915_gem_reset_engine(struct intel_engine_cs *engine,
-			   struct i915_request *request,
-			   bool stalled);
 
 void i915_gem_init_mmio(struct drm_i915_private *i915);
 int __must_check i915_gem_init(struct drm_i915_private *dev_priv);
@@ -3142,7 +3142,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
 				  struct drm_file *file);
 void i915_oa_init_reg_state(struct intel_engine_cs *engine,
 			    struct i915_gem_context *ctx,
-			    uint32_t *reg_state);
+			    u32 *reg_state);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
@@ -3204,7 +3204,8 @@ unsigned long i915_gem_shrink(struct drm_i915_private *i915,
 unsigned long i915_gem_shrink_all(struct drm_i915_private *i915);
 void i915_gem_shrinker_register(struct drm_i915_private *i915);
 void i915_gem_shrinker_unregister(struct drm_i915_private *i915);
-void i915_gem_shrinker_taints_mutex(struct mutex *mutex);
+void i915_gem_shrinker_taints_mutex(struct drm_i915_private *i915,
+				    struct mutex *mutex);
 
 /* i915_gem_tiling.c */
 static inline bool i915_gem_object_needs_bit17_swizzle(struct drm_i915_gem_object *obj)
@@ -3313,7 +3314,21 @@ static inline void intel_unregister_dsm_handler(void) { return; }
 static inline struct intel_device_info *
 mkwrite_device_info(struct drm_i915_private *dev_priv)
 {
-	return (struct intel_device_info *)&dev_priv->info;
+	return (struct intel_device_info *)INTEL_INFO(dev_priv);
+}
+
+static inline struct intel_sseu
+intel_device_default_sseu(struct drm_i915_private *i915)
+{
+	const struct sseu_dev_info *sseu = &RUNTIME_INFO(i915)->sseu;
+	struct intel_sseu value = {
+		.slice_mask = sseu->slice_mask,
+		.subslice_mask = sseu->subslice_mask[0],
+		.min_eus_per_subslice = sseu->max_eus_per_subslice,
+		.max_eus_per_subslice = sseu->max_eus_per_subslice,
+	};
+
+	return value;
 }
 
 /* modesetting */
@@ -3393,10 +3408,10 @@ bool bxt_ddi_phy_is_enabled(struct drm_i915_private *dev_priv,
 			    enum dpio_phy phy);
 bool bxt_ddi_phy_verify_state(struct drm_i915_private *dev_priv,
 			      enum dpio_phy phy);
-uint8_t bxt_ddi_phy_calc_lane_lat_optim_mask(uint8_t lane_count);
+u8 bxt_ddi_phy_calc_lane_lat_optim_mask(u8 lane_count);
 void bxt_ddi_phy_set_lane_optim_mask(struct intel_encoder *encoder,
-				     uint8_t lane_lat_optim_mask);
-uint8_t bxt_ddi_phy_get_lane_lat_optim_mask(struct intel_encoder *encoder);
+				     u8 lane_lat_optim_mask);
+u8 bxt_ddi_phy_get_lane_lat_optim_mask(struct intel_encoder *encoder);
 
 void chv_set_phy_signal_level(struct intel_encoder *encoder,
 			      u32 deemph_reg_value, u32 margin_reg_value,
@@ -3599,90 +3614,6 @@ wait_remaining_ms_from_jiffies(unsigned long timestamp_jiffies, int to_wait_ms)
 	}
 }
 
-static inline bool
-__i915_request_irq_complete(const struct i915_request *rq)
-{
-	struct intel_engine_cs *engine = rq->engine;
-	u32 seqno;
-
-	/* Note that the engine may have wrapped around the seqno, and
-	 * so our request->global_seqno will be ahead of the hardware,
-	 * even though it completed the request before wrapping. We catch
-	 * this by kicking all the waiters before resetting the seqno
-	 * in hardware, and also signal the fence.
-	 */
-	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags))
-		return true;
-
-	/* The request was dequeued before we were awoken. We check after
-	 * inspecting the hw to confirm that this was the same request
-	 * that generated the HWS update. The memory barriers within
-	 * the request execution are sufficient to ensure that a check
-	 * after reading the value from hw matches this request.
-	 */
-	seqno = i915_request_global_seqno(rq);
-	if (!seqno)
-		return false;
-
-	/* Before we do the heavier coherent read of the seqno,
-	 * check the value (hopefully) in the CPU cacheline.
-	 */
-	if (__i915_request_completed(rq, seqno))
-		return true;
-
-	/* Ensure our read of the seqno is coherent so that we
-	 * do not "miss an interrupt" (i.e. if this is the last
-	 * request and the seqno write from the GPU is not visible
-	 * by the time the interrupt fires, we will see that the
-	 * request is incomplete and go back to sleep awaiting
-	 * another interrupt that will never come.)
-	 *
-	 * Strictly, we only need to do this once after an interrupt,
-	 * but it is easier and safer to do it every time the waiter
-	 * is woken.
-	 */
-	if (engine->irq_seqno_barrier &&
-	    test_and_clear_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted)) {
-		struct intel_breadcrumbs *b = &engine->breadcrumbs;
-
-		/* The ordering of irq_posted versus applying the barrier
-		 * is crucial. The clearing of the current irq_posted must
-		 * be visible before we perform the barrier operation,
-		 * such that if a subsequent interrupt arrives, irq_posted
-		 * is reasserted and our task rewoken (which causes us to
-		 * do another __i915_request_irq_complete() immediately
-		 * and reapply the barrier). Conversely, if the clear
-		 * occurs after the barrier, then an interrupt that arrived
-		 * whilst we waited on the barrier would not trigger a
-		 * barrier on the next pass, and the read may not see the
-		 * seqno update.
-		 */
-		engine->irq_seqno_barrier(engine);
-
-		/* If we consume the irq, but we are no longer the bottom-half,
-		 * the real bottom-half may not have serialised their own
-		 * seqno check with the irq-barrier (i.e. may have inspected
-		 * the seqno before we believe it coherent since they see
-		 * irq_posted == false but we are still running).
-		 */
-		spin_lock_irq(&b->irq_lock);
-		if (b->irq_wait && b->irq_wait->tsk != current)
-			/* Note that if the bottom-half is changed as we
-			 * are sending the wake-up, the new bottom-half will
-			 * be woken by whomever made the change. We only have
-			 * to worry about when we steal the irq-posted for
-			 * ourself.
-			 */
-			wake_up_process(b->irq_wait->tsk);
-		spin_unlock_irq(&b->irq_lock);
-
-		if (__i915_request_completed(rq, seqno))
-			return true;
-	}
-
-	return false;
-}
-
 void i915_memcpy_init_early(struct drm_i915_private *dev_priv);
 bool i915_memcpy_from_wc(void *dst, const void *src, unsigned long len);
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c882ea94172c..6728ea5c71d4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -25,18 +25,9 @@
  *
  */
 
-#include <drm/drmP.h>
 #include <drm/drm_vma_manager.h>
+#include <drm/drm_pci.h>
 #include <drm/i915_drm.h>
-#include "i915_drv.h"
-#include "i915_gem_clflush.h"
-#include "i915_vgpu.h"
-#include "i915_trace.h"
-#include "intel_drv.h"
-#include "intel_frontbuffer.h"
-#include "intel_mocs.h"
-#include "intel_workarounds.h"
-#include "i915_gemfs.h"
 #include <linux/dma-fence-array.h>
 #include <linux/kthread.h>
 #include <linux/reservation.h>
@@ -46,6 +37,19 @@
 #include <linux/swap.h>
 #include <linux/pci.h>
 #include <linux/dma-buf.h>
+#include <linux/mman.h>
+
+#include "i915_drv.h"
+#include "i915_gem_clflush.h"
+#include "i915_gemfs.h"
+#include "i915_reset.h"
+#include "i915_trace.h"
+#include "i915_vgpu.h"
+
+#include "intel_drv.h"
+#include "intel_frontbuffer.h"
+#include "intel_mocs.h"
+#include "intel_workarounds.h"
 
 static void i915_gem_flush_free_objects(struct drm_i915_private *i915);
 
@@ -139,6 +143,8 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 
 static u32 __i915_gem_park(struct drm_i915_private *i915)
 {
+	intel_wakeref_t wakeref;
+
 	GEM_TRACE("\n");
 
 	lockdep_assert_held(&i915->drm.struct_mutex);
@@ -169,14 +175,13 @@ static u32 __i915_gem_park(struct drm_i915_private *i915)
 	i915_pmu_gt_parked(i915);
 	i915_vma_parked(i915);
 
-	i915->gt.awake = false;
+	wakeref = fetch_and_zero(&i915->gt.awake);
+	GEM_BUG_ON(!wakeref);
 
 	if (INTEL_GEN(i915) >= 6)
 		gen6_rps_idle(i915);
 
-	intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ);
-
-	intel_runtime_pm_put(i915);
+	intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ, wakeref);
 
 	return i915->gt.epoch;
 }
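
For context, fetch_and_zero() used above is the i915 helper that reads a variable, zeroes it, and returns the old value (serialised here by struct_mutex rather than by a hardware atomic). A minimal standalone rendition of that statement expression, matching the shape of the helper in i915_utils.h at the time:

#include <assert.h>

#define fetch_and_zero(ptr) ({			\
	__typeof__(*(ptr)) __T = *(ptr);	\
	*(ptr) = (__typeof__(*(ptr)))0;		\
	__T;					\
})

int main(void)
{
	unsigned long awake = 0xdeadbeef;
	unsigned long wakeref = fetch_and_zero(&awake);

	assert(wakeref == 0xdeadbeef && awake == 0);
	return 0;
}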
@@ -201,12 +206,11 @@ void i915_gem_unpark(struct drm_i915_private *i915)
 
 	lockdep_assert_held(&i915->drm.struct_mutex);
 	GEM_BUG_ON(!i915->gt.active_requests);
+	assert_rpm_wakelock_held(i915);
 
 	if (i915->gt.awake)
 		return;
 
-	intel_runtime_pm_get_noresume(i915);
-
 	/*
 	 * It seems that the DMC likes to transition between the DC states a lot
 	 * when there are no connected displays (no active power domains) during
@@ -218,9 +222,9 @@ void i915_gem_unpark(struct drm_i915_private *i915)
 	 * Work around it by grabbing a GT IRQ power domain whilst there is any
 	 * GT activity, preventing any DC state transitions.
 	 */
-	intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
+	i915->gt.awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
+	GEM_BUG_ON(!i915->gt.awake);
 
-	i915->gt.awake = true;
 	if (unlikely(++i915->gt.epoch == 0)) /* keep 0 as invalid */
 		i915->gt.epoch = 1;
 
@@ -243,21 +247,19 @@ int
 i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *file)
 {
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct i915_ggtt *ggtt = &dev_priv->ggtt;
+	struct i915_ggtt *ggtt = &to_i915(dev)->ggtt;
 	struct drm_i915_gem_get_aperture *args = data;
 	struct i915_vma *vma;
 	u64 pinned;
 
+	mutex_lock(&ggtt->vm.mutex);
+
 	pinned = ggtt->vm.reserved;
-	mutex_lock(&dev->struct_mutex);
-	list_for_each_entry(vma, &ggtt->vm.active_list, vm_link)
+	list_for_each_entry(vma, &ggtt->vm.bound_list, vm_link)
 		if (i915_vma_is_pinned(vma))
 			pinned += vma->node.size;
-	list_for_each_entry(vma, &ggtt->vm.inactive_list, vm_link)
-		if (i915_vma_is_pinned(vma))
-			pinned += vma->node.size;
-	mutex_unlock(&dev->struct_mutex);
+
+	mutex_unlock(&ggtt->vm.mutex);
 
 	args->aper_size = ggtt->vm.total;
 	args->aper_available_size = args->aper_size - pinned;
@@ -437,15 +439,19 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	if (ret)
 		return ret;
 
-	while ((vma = list_first_entry_or_null(&obj->vma_list,
-					       struct i915_vma,
-					       obj_link))) {
+	spin_lock(&obj->vma.lock);
+	while (!ret && (vma = list_first_entry_or_null(&obj->vma.list,
+						       struct i915_vma,
+						       obj_link))) {
 		list_move_tail(&vma->obj_link, &still_in_list);
+		spin_unlock(&obj->vma.lock);
+
 		ret = i915_vma_unbind(vma);
-		if (ret)
-			break;
+
+		spin_lock(&obj->vma.lock);
 	}
-	list_splice(&still_in_list, &obj->vma_list);
+	list_splice(&still_in_list, &obj->vma.list);
+	spin_unlock(&obj->vma.lock);
 
 	return ret;
 }
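
The rewritten unbind loop above detaches each vma under obj->vma.lock, drops the lock around i915_vma_unbind() (which can sleep), then reacquires it and splices the processed entries back. A hedged userspace sketch of that locking shape, with a pthread mutex and a toy list standing in for the driver structures:

#include <pthread.h>
#include <stdio.h>

struct node { struct node *next; int id; };

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static struct node *list;

static int do_unbind(struct node *n)
{
	/* May sleep in the real driver, hence the dropped lock. */
	printf("unbinding %d outside the lock\n", n->id);
	return 0;
}

static int unbind_all(void)
{
	struct node *done = NULL, *n;
	int ret = 0;

	pthread_mutex_lock(&lock);
	while (!ret && (n = list) != NULL) {
		list = n->next;			/* detach under the lock */
		pthread_mutex_unlock(&lock);

		ret = do_unbind(n);		/* sleepable work, unlocked */

		pthread_mutex_lock(&lock);
		n->next = done;			/* park processed entries */
		done = n;
	}
	while ((n = done) != NULL) {		/* splice back, as above */
		done = n->next;
		n->next = list;
		list = n;
	}
	pthread_mutex_unlock(&lock);
	return ret;
}

int main(void)
{
	static struct node b = { NULL, 2 }, a = { &b, 1 };

	list = &a;
	return unbind_all();
}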
@@ -655,11 +661,6 @@ i915_gem_object_wait(struct drm_i915_gem_object *obj,
 		     struct intel_rps_client *rps_client)
 {
 	might_sleep();
-#if IS_ENABLED(CONFIG_LOCKDEP)
-	GEM_BUG_ON(debug_locks &&
-		   !!lockdep_is_held(&obj->base.dev->struct_mutex) !=
-		   !!(flags & I915_WAIT_LOCKED));
-#endif
 	GEM_BUG_ON(timeout < 0);
 
 	timeout = i915_gem_object_wait_reservation(obj->resv,
@@ -711,8 +712,8 @@ void i915_gem_object_free(struct drm_i915_gem_object *obj)
 static int
 i915_gem_create(struct drm_file *file,
 		struct drm_i915_private *dev_priv,
-		uint64_t size,
-		uint32_t *handle_p)
+		u64 size,
+		u32 *handle_p)
 {
 	struct drm_i915_gem_object *obj;
 	int ret;
@@ -783,6 +784,8 @@ fb_write_origin(struct drm_i915_gem_object *obj, unsigned int domain)
 
 void i915_gem_flush_ggtt_writes(struct drm_i915_private *dev_priv)
 {
+	intel_wakeref_t wakeref;
+
 	/*
 	 * No actual flushing is required for the GTT write domain for reads
 	 * from the GTT domain. Writes to it "immediately" go to main memory
@@ -809,13 +812,13 @@ void i915_gem_flush_ggtt_writes(struct drm_i915_private *dev_priv)
 
 	i915_gem_chipset_flush(dev_priv);
 
-	intel_runtime_pm_get(dev_priv);
-	spin_lock_irq(&dev_priv->uncore.lock);
+	with_intel_runtime_pm(dev_priv, wakeref) {
+		spin_lock_irq(&dev_priv->uncore.lock);
 
-	POSTING_READ_FW(RING_HEAD(RENDER_RING_BASE));
+		POSTING_READ_FW(RING_HEAD(RENDER_RING_BASE));
 
-	spin_unlock_irq(&dev_priv->uncore.lock);
-	intel_runtime_pm_put(dev_priv);
+		spin_unlock_irq(&dev_priv->uncore.lock);
+	}
 }
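
with_intel_runtime_pm() used above scopes the wakeref to the braced block: acquire on entry, release on exit, with the cookie threaded through automatically. Presumably it is built as a single-iteration for loop; a standalone sketch of that shape with stand-in names, not the driver's definition:

#include <stdio.h>

typedef unsigned long wakeref_t;

static wakeref_t rpm_get(void *dev)
{
	(void)dev;
	puts("get");
	return 1;	/* the cookie; zero would mean failure */
}

static void rpm_put(void *dev, wakeref_t wf)
{
	(void)dev;
	(void)wf;
	puts("put");
}

#define with_runtime_pm(dev, wf) \
	for ((wf) = rpm_get(dev); (wf); rpm_put((dev), (wf)), (wf) = 0)

int main(void)
{
	wakeref_t wakeref;

	with_runtime_pm(NULL, wakeref)
		puts("body runs with the device held awake");
	return 0;
}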
 
 static void
@@ -859,58 +862,6 @@ flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains)
 	obj->write_domain = 0;
 }
 
-static inline int
-__copy_to_user_swizzled(char __user *cpu_vaddr,
-			const char *gpu_vaddr, int gpu_offset,
-			int length)
-{
-	int ret, cpu_offset = 0;
-
-	while (length > 0) {
-		int cacheline_end = ALIGN(gpu_offset + 1, 64);
-		int this_length = min(cacheline_end - gpu_offset, length);
-		int swizzled_gpu_offset = gpu_offset ^ 64;
-
-		ret = __copy_to_user(cpu_vaddr + cpu_offset,
-				     gpu_vaddr + swizzled_gpu_offset,
-				     this_length);
-		if (ret)
-			return ret + length;
-
-		cpu_offset += this_length;
-		gpu_offset += this_length;
-		length -= this_length;
-	}
-
-	return 0;
-}
-
-static inline int
-__copy_from_user_swizzled(char *gpu_vaddr, int gpu_offset,
-			  const char __user *cpu_vaddr,
-			  int length)
-{
-	int ret, cpu_offset = 0;
-
-	while (length > 0) {
-		int cacheline_end = ALIGN(gpu_offset + 1, 64);
-		int this_length = min(cacheline_end - gpu_offset, length);
-		int swizzled_gpu_offset = gpu_offset ^ 64;
-
-		ret = __copy_from_user(gpu_vaddr + swizzled_gpu_offset,
-				       cpu_vaddr + cpu_offset,
-				       this_length);
-		if (ret)
-			return ret + length;
-
-		cpu_offset += this_length;
-		gpu_offset += this_length;
-		length -= this_length;
-	}
-
-	return 0;
-}
-
 /*
  * Pins the specified object's pages and synchronizes the object with
  * GPU accesses. Sets needs_clflush to non-zero if the caller should
@@ -1030,72 +981,23 @@ err_unpin:
 	return ret;
 }
 
-static void
-shmem_clflush_swizzled_range(char *addr, unsigned long length,
-			     bool swizzled)
-{
-	if (unlikely(swizzled)) {
-		unsigned long start = (unsigned long) addr;
-		unsigned long end = (unsigned long) addr + length;
-
-		/* For swizzling simply ensure that we always flush both
-		 * channels. Lame, but simple and it works. Swizzled
-		 * pwrite/pread is far from a hotpath - current userspace
-		 * doesn't use it at all. */
-		start = round_down(start, 128);
-		end = round_up(end, 128);
-
-		drm_clflush_virt_range((void *)start, end - start);
-	} else {
-		drm_clflush_virt_range(addr, length);
-	}
-
-}
-
-/* Only difference to the fast-path function is that this can handle bit17
- * and uses non-atomic copy and kmap functions. */
 static int
-shmem_pread_slow(struct page *page, int offset, int length,
-		 char __user *user_data,
-		 bool page_do_bit17_swizzling, bool needs_clflush)
+shmem_pread(struct page *page, int offset, int len, char __user *user_data,
+	    bool needs_clflush)
 {
 	char *vaddr;
 	int ret;
 
 	vaddr = kmap(page);
-	if (needs_clflush)
-		shmem_clflush_swizzled_range(vaddr + offset, length,
-					     page_do_bit17_swizzling);
-
-	if (page_do_bit17_swizzling)
-		ret = __copy_to_user_swizzled(user_data, vaddr, offset, length);
-	else
-		ret = __copy_to_user(user_data, vaddr + offset, length);
-	kunmap(page);
-
-	return ret ? - EFAULT : 0;
-}
 
-static int
-shmem_pread(struct page *page, int offset, int length, char __user *user_data,
-	    bool page_do_bit17_swizzling, bool needs_clflush)
-{
-	int ret;
+	if (needs_clflush)
+		drm_clflush_virt_range(vaddr + offset, len);
 
-	ret = -ENODEV;
-	if (!page_do_bit17_swizzling) {
-		char *vaddr = kmap_atomic(page);
+	ret = __copy_to_user(user_data, vaddr + offset, len);
 
-		if (needs_clflush)
-			drm_clflush_virt_range(vaddr + offset, length);
-		ret = __copy_to_user_inatomic(user_data, vaddr + offset, length);
-		kunmap_atomic(vaddr);
-	}
-	if (ret == 0)
-		return 0;
+	kunmap(page);
 
-	return shmem_pread_slow(page, offset, length, user_data,
-				page_do_bit17_swizzling, needs_clflush);
+	return ret ? -EFAULT : 0;
 }
 
 static int
@@ -1104,15 +1006,10 @@ i915_gem_shmem_pread(struct drm_i915_gem_object *obj,
 {
 	char __user *user_data;
 	u64 remain;
-	unsigned int obj_do_bit17_swizzling;
 	unsigned int needs_clflush;
 	unsigned int idx, offset;
 	int ret;
 
-	obj_do_bit17_swizzling = 0;
-	if (i915_gem_object_needs_bit17_swizzle(obj))
-		obj_do_bit17_swizzling = BIT(17);
-
 	ret = mutex_lock_interruptible(&obj->base.dev->struct_mutex);
 	if (ret)
 		return ret;
@@ -1130,7 +1027,6 @@ i915_gem_shmem_pread(struct drm_i915_gem_object *obj,
 		unsigned int length = min_t(u64, remain, PAGE_SIZE - offset);
 
 		ret = shmem_pread(page, offset, length, user_data,
-				  page_to_phys(page) & obj_do_bit17_swizzling,
 				  needs_clflush);
 		if (ret)
 			break;
@@ -1174,6 +1070,7 @@ i915_gem_gtt_pread(struct drm_i915_gem_object *obj,
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct i915_ggtt *ggtt = &i915->ggtt;
+	intel_wakeref_t wakeref;
 	struct drm_mm_node node;
 	struct i915_vma *vma;
 	void __user *user_data;
@@ -1184,7 +1081,7 @@ i915_gem_gtt_pread(struct drm_i915_gem_object *obj,
 	if (ret)
 		return ret;
 
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 	vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
 				       PIN_MAPPABLE |
 				       PIN_NONFAULT |
@@ -1257,7 +1154,7 @@ out_unpin:
 		i915_vma_unpin(vma);
 	}
 out_unlock:
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
 
 	return ret;
@@ -1358,6 +1255,7 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct i915_ggtt *ggtt = &i915->ggtt;
+	intel_wakeref_t wakeref;
 	struct drm_mm_node node;
 	struct i915_vma *vma;
 	u64 remain, offset;
@@ -1376,13 +1274,14 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
 		 * This easily dwarfs any performance advantage from
 		 * using the cache bypass of indirect GGTT access.
 		 */
-		if (!intel_runtime_pm_get_if_in_use(i915)) {
+		wakeref = intel_runtime_pm_get_if_in_use(i915);
+		if (!wakeref) {
 			ret = -EFAULT;
 			goto out_unlock;
 		}
 	} else {
 		/* No backing pages, no fallback, we must force GGTT access */
-		intel_runtime_pm_get(i915);
+		wakeref = intel_runtime_pm_get(i915);
 	}
 
 	vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
@@ -1464,39 +1363,12 @@ out_unpin:
 		i915_vma_unpin(vma);
 	}
 out_rpm:
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 out_unlock:
 	mutex_unlock(&i915->drm.struct_mutex);
 	return ret;
 }
 
-static int
-shmem_pwrite_slow(struct page *page, int offset, int length,
-		  char __user *user_data,
-		  bool page_do_bit17_swizzling,
-		  bool needs_clflush_before,
-		  bool needs_clflush_after)
-{
-	char *vaddr;
-	int ret;
-
-	vaddr = kmap(page);
-	if (unlikely(needs_clflush_before || page_do_bit17_swizzling))
-		shmem_clflush_swizzled_range(vaddr + offset, length,
-					     page_do_bit17_swizzling);
-	if (page_do_bit17_swizzling)
-		ret = __copy_from_user_swizzled(vaddr, offset, user_data,
-						length);
-	else
-		ret = __copy_from_user(vaddr + offset, user_data, length);
-	if (needs_clflush_after)
-		shmem_clflush_swizzled_range(vaddr + offset, length,
-					     page_do_bit17_swizzling);
-	kunmap(page);
-
-	return ret ? -EFAULT : 0;
-}
-
 /* Per-page copy function for the shmem pwrite fastpath.
  * Flushes invalid cachelines before writing to the target if
  * needs_clflush_before is set and flushes out any written cachelines after
@@ -1504,31 +1376,24 @@ shmem_pwrite_slow(struct page *page, int offset, int length,
  */
 static int
 shmem_pwrite(struct page *page, int offset, int len, char __user *user_data,
-	     bool page_do_bit17_swizzling,
 	     bool needs_clflush_before,
 	     bool needs_clflush_after)
 {
+	char *vaddr;
 	int ret;
 
-	ret = -ENODEV;
-	if (!page_do_bit17_swizzling) {
-		char *vaddr = kmap_atomic(page);
+	vaddr = kmap(page);
 
-		if (needs_clflush_before)
-			drm_clflush_virt_range(vaddr + offset, len);
-		ret = __copy_from_user_inatomic(vaddr + offset, user_data, len);
-		if (needs_clflush_after)
-			drm_clflush_virt_range(vaddr + offset, len);
+	if (needs_clflush_before)
+		drm_clflush_virt_range(vaddr + offset, len);
 
-		kunmap_atomic(vaddr);
-	}
-	if (ret == 0)
-		return ret;
+	ret = __copy_from_user(vaddr + offset, user_data, len);
+	if (!ret && needs_clflush_after)
+		drm_clflush_virt_range(vaddr + offset, len);
+
+	kunmap(page);
 
-	return shmem_pwrite_slow(page, offset, len, user_data,
-				 page_do_bit17_swizzling,
-				 needs_clflush_before,
-				 needs_clflush_after);
+	return ret ? -EFAULT : 0;
 }
 
 static int
@@ -1538,7 +1403,6 @@ i915_gem_shmem_pwrite(struct drm_i915_gem_object *obj,
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	void __user *user_data;
 	u64 remain;
-	unsigned int obj_do_bit17_swizzling;
 	unsigned int partial_cacheline_write;
 	unsigned int needs_clflush;
 	unsigned int offset, idx;
@@ -1553,10 +1417,6 @@ i915_gem_shmem_pwrite(struct drm_i915_gem_object *obj,
 	if (ret)
 		return ret;
 
-	obj_do_bit17_swizzling = 0;
-	if (i915_gem_object_needs_bit17_swizzle(obj))
-		obj_do_bit17_swizzling = BIT(17);
-
 	/* If we don't overwrite a cacheline completely we need to be
 	 * careful to have up-to-date data by first clflushing. Don't
 	 * overcomplicate things and flush the entire page.
@@ -1573,7 +1433,6 @@ i915_gem_shmem_pwrite(struct drm_i915_gem_object *obj,
 		unsigned int length = min_t(u64, remain, PAGE_SIZE - offset);
 
 		ret = shmem_pwrite(page, offset, length, user_data,
-				   page_to_phys(page) & obj_do_bit17_swizzling,
 				   (offset | length) & partial_cacheline_write,
 				   needs_clflush & CLFLUSH_AFTER);
 		if (ret)
@@ -1677,23 +1536,21 @@ err:
 
 static void i915_gem_object_bump_inactive_ggtt(struct drm_i915_gem_object *obj)
 {
-	struct drm_i915_private *i915;
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct list_head *list;
 	struct i915_vma *vma;
 
 	GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
 
+	mutex_lock(&i915->ggtt.vm.mutex);
 	for_each_ggtt_vma(vma, obj) {
-		if (i915_vma_is_active(vma))
-			continue;
-
 		if (!drm_mm_node_allocated(&vma->node))
 			continue;
 
-		list_move_tail(&vma->vm_link, &vma->vm->inactive_list);
+		list_move_tail(&vma->vm_link, &vma->vm->bound_list);
 	}
+	mutex_unlock(&i915->ggtt.vm.mutex);
 
-	i915 = to_i915(obj->base.dev);
 	spin_lock(&i915->mm.obj_lock);
 	list = obj->bind_count ? &i915->mm.bound_list : &i915->mm.unbound_list;
 	list_move_tail(&obj->mm.link, list);
@@ -1713,8 +1570,8 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_i915_gem_set_domain *args = data;
 	struct drm_i915_gem_object *obj;
-	uint32_t read_domains = args->read_domains;
-	uint32_t write_domain = args->write_domain;
+	u32 read_domains = args->read_domains;
+	u32 write_domain = args->write_domain;
 	int err;
 
 	/* Only handle setting domains to types used by the CPU. */
@@ -1883,6 +1740,9 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
 	addr = vm_mmap(obj->base.filp, 0, args->size,
 		       PROT_READ | PROT_WRITE, MAP_SHARED,
 		       args->offset);
+	if (IS_ERR_VALUE(addr))
+		goto err;
+
 	if (args->flags & I915_MMAP_WC) {
 		struct mm_struct *mm = current->mm;
 		struct vm_area_struct *vma;
@@ -1898,17 +1758,22 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
 		else
 			addr = -ENOMEM;
 		up_write(&mm->mmap_sem);
+		if (IS_ERR_VALUE(addr))
+			goto err;
 
 		/* This may race, but that's ok, it only gets set */
 		WRITE_ONCE(obj->frontbuffer_ggtt_origin, ORIGIN_CPU);
 	}
 	i915_gem_object_put(obj);
-	if (IS_ERR((void *)addr))
-		return addr;
 
-	args->addr_ptr = (uint64_t) addr;
+	args->addr_ptr = (u64)addr;
 
 	return 0;
+
+err:
+	i915_gem_object_put(obj);
+
+	return addr;
 }
 
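The reworked error handling above leans on the kernel's IS_ERR_VALUE() convention: vm_mmap() returns either a valid userspace address or a small negative errno occupying the top 4095 values of unsigned long. A self-contained restatement of that convention, for illustration only:

/* Userspace model of IS_ERR_VALUE() as used in i915_gem_mmap_ioctl(). */
#include <stdio.h>

#define MAX_ERRNO	4095
#define IS_ERR_VALUE(x)	((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

int main(void)
{
	unsigned long ok = 0x7f0000000000ul;	/* plausible mmap address */
	unsigned long bad = (unsigned long)-12;	/* -ENOMEM encoded as address */

	printf("ok:  %d\n", (int)IS_ERR_VALUE(ok));	/* 0 */
	printf("bad: %d\n", (int)IS_ERR_VALUE(bad));	/* 1 */
	return 0;
}
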
 static unsigned int tile_row_pages(const struct drm_i915_gem_object *obj)
@@ -2019,6 +1884,7 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct i915_ggtt *ggtt = &dev_priv->ggtt;
 	bool write = area->vm_flags & VM_WRITE;
+	intel_wakeref_t wakeref;
 	struct i915_vma *vma;
 	pgoff_t page_offset;
 	int ret;
@@ -2048,7 +1914,7 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
 	if (ret)
 		goto err;
 
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	ret = i915_mutex_lock_interruptible(dev);
 	if (ret)
@@ -2126,7 +1992,7 @@ err_unpin:
 err_unlock:
 	mutex_unlock(&dev->struct_mutex);
 err_rpm:
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 	i915_gem_object_unpin_pages(obj);
 err:
 	switch (ret) {
@@ -2199,6 +2065,7 @@ void
 i915_gem_release_mmap(struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	intel_wakeref_t wakeref;
 
 	/* Serialisation between user GTT access and our code depends upon
 	 * revoking the CPU's PTE whilst the mutex is held. The next user
@@ -2209,7 +2076,7 @@ i915_gem_release_mmap(struct drm_i915_gem_object *obj)
 	 * wakeref.
 	 */
 	lockdep_assert_held(&i915->drm.struct_mutex);
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
 	if (!obj->userfault_count)
 		goto out;
@@ -2226,7 +2093,7 @@ i915_gem_release_mmap(struct drm_i915_gem_object *obj)
 	wmb();
 
 out:
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 }
 
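This hunk is one of many in the series converting intel_runtime_pm_get()/intel_runtime_pm_put() to return and consume an opaque wakeref cookie, so unbalanced or mismatched puts become detectable. A userspace model of the cookie-tracked refcount shape; the names and the tracking scheme here are illustrative, not the driver's implementation:

/* Toy wakeref-cookie refcount: every get hands back a cookie that the
 * matching put must return.
 */
#include <assert.h>
#include <stdio.h>

typedef unsigned long wakeref_t;

static int power_refcount;
static wakeref_t next_cookie = 1;

static wakeref_t pm_get(void)
{
	power_refcount++;
	return next_cookie++;	/* non-zero cookie identifies this get */
}

static void pm_put(wakeref_t wakeref)
{
	assert(wakeref);		/* catches a put without a matching get */
	assert(power_refcount > 0);
	power_refcount--;
}

int main(void)
{
	wakeref_t wakeref = pm_get();

	/* ... touch the hardware while the wakeref is held ... */
	pm_put(wakeref);
	printf("refcount=%d\n", power_refcount);	/* 0 */
	return 0;
}
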
 void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv)
@@ -2306,8 +2173,8 @@ static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
 int
 i915_gem_mmap_gtt(struct drm_file *file,
 		  struct drm_device *dev,
-		  uint32_t handle,
-		  uint64_t *offset)
+		  u32 handle,
+		  u64 *offset)
 {
 	struct drm_i915_gem_object *obj;
 	int ret;
@@ -2454,8 +2321,8 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
 	struct sg_table *pages;
 
 	pages = fetch_and_zero(&obj->mm.pages);
-	if (!pages)
-		return NULL;
+	if (IS_ERR_OR_NULL(pages))
+		return pages;
 
 	spin_lock(&i915->mm.obj_lock);
 	list_del(&obj->mm.link);
@@ -2479,22 +2346,23 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
 	return pages;
 }
 
-void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
-				 enum i915_mm_subclass subclass)
+int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
+				enum i915_mm_subclass subclass)
 {
 	struct sg_table *pages;
+	int ret;
 
 	if (i915_gem_object_has_pinned_pages(obj))
-		return;
+		return -EBUSY;
 
 	GEM_BUG_ON(obj->bind_count);
-	if (!i915_gem_object_has_pages(obj))
-		return;
 
 	/* May be called by shrinker from within get_pages() (on another bo) */
 	mutex_lock_nested(&obj->mm.lock, subclass);
-	if (unlikely(atomic_read(&obj->mm.pages_pin_count)))
+	if (unlikely(atomic_read(&obj->mm.pages_pin_count))) {
+		ret = -EBUSY;
 		goto unlock;
+	}
 
 	/*
 	 * ->put_pages might need to allocate memory for the bit17 swizzle
@@ -2502,11 +2370,24 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
 	 * lists early.
 	 */
 	pages = __i915_gem_object_unset_pages(obj);
+
+	/*
+	 * XXX Temporary hijinx to avoid updating all backends to handle
+	 * NULL pages. In the future, when we have more asynchronous
+	 * get_pages backends we should be better able to handle the
+	 * cancellation of the async task in a more uniform manner.
+	 */
+	if (!pages && !i915_gem_object_needs_async_cancel(obj))
+		pages = ERR_PTR(-EINVAL);
+
 	if (!IS_ERR(pages))
 		obj->ops->put_pages(obj, pages);
 
+	ret = 0;
 unlock:
 	mutex_unlock(&obj->mm.lock);
+
+	return ret;
 }
 
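__i915_gem_object_put_pages() now returns an error instead of silently doing nothing, so callers can tell "pages released" apart from "still pinned". A toy gate with the same shape; struct and field names are hypothetical:

/* Error-propagating release: refuse while a pin count is outstanding. */
#include <errno.h>
#include <stdio.h>

struct object { int pages_pin_count; int has_pages; };

static int put_pages(struct object *obj)
{
	if (obj->pages_pin_count)
		return -EBUSY;	/* someone still needs the backing store */

	obj->has_pages = 0;	/* release the backing store */
	return 0;
}

int main(void)
{
	struct object obj = { .pages_pin_count = 1, .has_pages = 1 };

	printf("%d\n", put_pages(&obj));	/* -16, i.e. -EBUSY */
	obj.pages_pin_count = 0;
	printf("%d\n", put_pages(&obj));	/* 0 */
	return 0;
}
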
 bool i915_sg_trim(struct sg_table *orig_st)
@@ -3010,59 +2891,12 @@ i915_gem_object_pwrite_gtt(struct drm_i915_gem_object *obj,
 	return 0;
 }
 
-static void i915_gem_client_mark_guilty(struct drm_i915_file_private *file_priv,
-					const struct i915_gem_context *ctx)
+static bool match_ring(struct i915_request *rq)
 {
-	unsigned int score;
-	unsigned long prev_hang;
-
-	if (i915_gem_context_is_banned(ctx))
-		score = I915_CLIENT_SCORE_CONTEXT_BAN;
-	else
-		score = 0;
+	struct drm_i915_private *dev_priv = rq->i915;
+	u32 ring = I915_READ(RING_START(rq->engine->mmio_base));
 
-	prev_hang = xchg(&file_priv->hang_timestamp, jiffies);
-	if (time_before(jiffies, prev_hang + I915_CLIENT_FAST_HANG_JIFFIES))
-		score += I915_CLIENT_SCORE_HANG_FAST;
-
-	if (score) {
-		atomic_add(score, &file_priv->ban_score);
-
-		DRM_DEBUG_DRIVER("client %s: gained %u ban score, now %u\n",
-				 ctx->name, score,
-				 atomic_read(&file_priv->ban_score));
-	}
-}
-
-static void i915_gem_context_mark_guilty(struct i915_gem_context *ctx)
-{
-	unsigned int score;
-	bool banned, bannable;
-
-	atomic_inc(&ctx->guilty_count);
-
-	bannable = i915_gem_context_is_bannable(ctx);
-	score = atomic_add_return(CONTEXT_SCORE_GUILTY, &ctx->ban_score);
-	banned = score >= CONTEXT_SCORE_BAN_THRESHOLD;
-
-	/* Cool contexts don't accumulate client ban score */
-	if (!bannable)
-		return;
-
-	if (banned) {
-		DRM_DEBUG_DRIVER("context %s: guilty %d, score %u, banned\n",
-				 ctx->name, atomic_read(&ctx->guilty_count),
-				 score);
-		i915_gem_context_set_banned(ctx);
-	}
-
-	if (!IS_ERR_OR_NULL(ctx->file_priv))
-		i915_gem_client_mark_guilty(ctx->file_priv, ctx);
-}
-
-static void i915_gem_context_mark_innocent(struct i915_gem_context *ctx)
-{
-	atomic_inc(&ctx->active_count);
+	return ring == i915_ggtt_offset(rq->ring->vma);
 }
 
 struct i915_request *
@@ -3084,9 +2918,16 @@ i915_gem_find_active_request(struct intel_engine_cs *engine)
 	 */
 	spin_lock_irqsave(&engine->timeline.lock, flags);
 	list_for_each_entry(request, &engine->timeline.requests, link) {
-		if (__i915_request_completed(request, request->global_seqno))
+		if (i915_request_completed(request))
 			continue;
 
+		if (!i915_request_started(request))
+			break;
+
+		/* More than one preemptible request may match! */
+		if (!match_ring(request))
+			break;
+
 		active = request;
 		break;
 	}
@@ -3095,366 +2936,6 @@ i915_gem_find_active_request(struct intel_engine_cs *engine)
 	return active;
 }
 
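The new scan in i915_gem_find_active_request() walks requests in submission order, skipping completed ones and stopping at the first that has not yet started (or whose ring no longer matches RING_START). A toy model of that walk, with enum states standing in for the breadcrumb and ring checks:

/* Find the first request that has started but not completed. */
#include <stdio.h>

enum rq_state { RQ_COMPLETED, RQ_RUNNING, RQ_QUEUED };

static int find_active(const enum rq_state *rq, int n)
{
	for (int i = 0; i < n; i++) {
		if (rq[i] == RQ_COMPLETED)
			continue;	/* already retired, innocent */
		if (rq[i] == RQ_QUEUED)
			break;		/* never started, cannot be guilty */
		return i;		/* first started-but-incomplete */
	}
	return -1;
}

int main(void)
{
	enum rq_state timeline[] = {
		RQ_COMPLETED, RQ_COMPLETED, RQ_RUNNING, RQ_QUEUED,
	};

	printf("active request: %d\n", find_active(timeline, 4));	/* 2 */
	return 0;
}
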
-/*
- * Ensure irq handler finishes, and not run again.
- * Also return the active request so that we only search for it once.
- */
-struct i915_request *
-i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
-{
-	struct i915_request *request;
-
-	/*
-	 * During the reset sequence, we must prevent the engine from
-	 * entering RC6. As the context state is undefined until we restart
-	 * the engine, if it does enter RC6 during the reset, the state
-	 * written to the powercontext is undefined and so we may lose
-	 * GPU state upon resume, i.e. fail to restart after a reset.
-	 */
-	intel_uncore_forcewake_get(engine->i915, FORCEWAKE_ALL);
-
-	request = engine->reset.prepare(engine);
-	if (request && request->fence.error == -EIO)
-		request = ERR_PTR(-EIO); /* Previous reset failed! */
-
-	return request;
-}
-
-int i915_gem_reset_prepare(struct drm_i915_private *dev_priv)
-{
-	struct intel_engine_cs *engine;
-	struct i915_request *request;
-	enum intel_engine_id id;
-	int err = 0;
-
-	for_each_engine(engine, dev_priv, id) {
-		request = i915_gem_reset_prepare_engine(engine);
-		if (IS_ERR(request)) {
-			err = PTR_ERR(request);
-			continue;
-		}
-
-		engine->hangcheck.active_request = request;
-	}
-
-	i915_gem_revoke_fences(dev_priv);
-	intel_uc_sanitize(dev_priv);
-
-	return err;
-}
-
-static void engine_skip_context(struct i915_request *request)
-{
-	struct intel_engine_cs *engine = request->engine;
-	struct i915_gem_context *hung_ctx = request->gem_context;
-	struct i915_timeline *timeline = request->timeline;
-	unsigned long flags;
-
-	GEM_BUG_ON(timeline == &engine->timeline);
-
-	spin_lock_irqsave(&engine->timeline.lock, flags);
-	spin_lock(&timeline->lock);
-
-	list_for_each_entry_continue(request, &engine->timeline.requests, link)
-		if (request->gem_context == hung_ctx)
-			i915_request_skip(request, -EIO);
-
-	list_for_each_entry(request, &timeline->requests, link)
-		i915_request_skip(request, -EIO);
-
-	spin_unlock(&timeline->lock);
-	spin_unlock_irqrestore(&engine->timeline.lock, flags);
-}
-
-/* Returns the request if it was guilty of the hang */
-static struct i915_request *
-i915_gem_reset_request(struct intel_engine_cs *engine,
-		       struct i915_request *request,
-		       bool stalled)
-{
-	/* The guilty request will get skipped on a hung engine.
-	 *
-	 * Users of client default contexts do not rely on logical
-	 * state preserved between batches so it is safe to execute
-	 * queued requests following the hang. Non default contexts
-	 * rely on preserved state, so skipping a batch loses the
-	 * evolution of the state and it needs to be considered corrupted.
-	 * Executing more queued batches on top of corrupted state is
-	 * risky. But we take the risk by trying to advance through
-	 * the queued requests in order to make the client behaviour
-	 * more predictable around resets, by not throwing away random
-	 * amount of batches it has prepared for execution. Sophisticated
-	 * clients can use gem_reset_stats_ioctl and dma fence status
-	 * (exported via sync_file info ioctl on explicit fences) to observe
-	 * when it loses the context state and should rebuild accordingly.
-	 *
-	 * The context ban, and ultimately the client ban, mechanism are safety
-	 * valves if client submission ends up resulting in nothing more than
-	 * subsequent hangs.
-	 */
-
-	if (i915_request_completed(request)) {
-		GEM_TRACE("%s pardoned global=%d (fence %llx:%d), current %d\n",
-			  engine->name, request->global_seqno,
-			  request->fence.context, request->fence.seqno,
-			  intel_engine_get_seqno(engine));
-		stalled = false;
-	}
-
-	if (stalled) {
-		i915_gem_context_mark_guilty(request->gem_context);
-		i915_request_skip(request, -EIO);
-
-		/* If this context is now banned, skip all pending requests. */
-		if (i915_gem_context_is_banned(request->gem_context))
-			engine_skip_context(request);
-	} else {
-		/*
-		 * Since this is not the hung engine, it may have advanced
-		 * since the hang declaration. Double check by refinding
-		 * the active request at the time of the reset.
-		 */
-		request = i915_gem_find_active_request(engine);
-		if (request) {
-			unsigned long flags;
-
-			i915_gem_context_mark_innocent(request->gem_context);
-			dma_fence_set_error(&request->fence, -EAGAIN);
-
-			/* Rewind the engine to replay the incomplete rq */
-			spin_lock_irqsave(&engine->timeline.lock, flags);
-			request = list_prev_entry(request, link);
-			if (&request->link == &engine->timeline.requests)
-				request = NULL;
-			spin_unlock_irqrestore(&engine->timeline.lock, flags);
-		}
-	}
-
-	return request;
-}
-
-void i915_gem_reset_engine(struct intel_engine_cs *engine,
-			   struct i915_request *request,
-			   bool stalled)
-{
-	/*
-	 * Make sure this write is visible before we re-enable the interrupt
-	 * handlers on another CPU, as tasklet_enable() resolves to just
-	 * a compiler barrier which is insufficient for our purpose here.
-	 */
-	smp_store_mb(engine->irq_posted, 0);
-
-	if (request)
-		request = i915_gem_reset_request(engine, request, stalled);
-
-	/* Setup the CS to resume from the breadcrumb of the hung request */
-	engine->reset.reset(engine, request);
-}
-
-void i915_gem_reset(struct drm_i915_private *dev_priv,
-		    unsigned int stalled_mask)
-{
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-
-	lockdep_assert_held(&dev_priv->drm.struct_mutex);
-
-	i915_retire_requests(dev_priv);
-
-	for_each_engine(engine, dev_priv, id) {
-		struct intel_context *ce;
-
-		i915_gem_reset_engine(engine,
-				      engine->hangcheck.active_request,
-				      stalled_mask & ENGINE_MASK(id));
-		ce = fetch_and_zero(&engine->last_retired_context);
-		if (ce)
-			intel_context_unpin(ce);
-
-		/*
-		 * Ostensibily, we always want a context loaded for powersaving,
-		 * so if the engine is idle after the reset, send a request
-		 * to load our scratch kernel_context.
-		 *
-		 * More mysteriously, if we leave the engine idle after a reset,
-		 * the next userspace batch may hang, with what appears to be
-		 * an incoherent read by the CS (presumably stale TLB). An
-		 * empty request appears sufficient to paper over the glitch.
-		 */
-		if (intel_engine_is_idle(engine)) {
-			struct i915_request *rq;
-
-			rq = i915_request_alloc(engine,
-						dev_priv->kernel_context);
-			if (!IS_ERR(rq))
-				i915_request_add(rq);
-		}
-	}
-
-	i915_gem_restore_fences(dev_priv);
-}
-
-void i915_gem_reset_finish_engine(struct intel_engine_cs *engine)
-{
-	engine->reset.finish(engine);
-
-	intel_uncore_forcewake_put(engine->i915, FORCEWAKE_ALL);
-}
-
-void i915_gem_reset_finish(struct drm_i915_private *dev_priv)
-{
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-
-	lockdep_assert_held(&dev_priv->drm.struct_mutex);
-
-	for_each_engine(engine, dev_priv, id) {
-		engine->hangcheck.active_request = NULL;
-		i915_gem_reset_finish_engine(engine);
-	}
-}
-
-static void nop_submit_request(struct i915_request *request)
-{
-	unsigned long flags;
-
-	GEM_TRACE("%s fence %llx:%d -> -EIO\n",
-		  request->engine->name,
-		  request->fence.context, request->fence.seqno);
-	dma_fence_set_error(&request->fence, -EIO);
-
-	spin_lock_irqsave(&request->engine->timeline.lock, flags);
-	__i915_request_submit(request);
-	intel_engine_init_global_seqno(request->engine, request->global_seqno);
-	spin_unlock_irqrestore(&request->engine->timeline.lock, flags);
-}
-
-void i915_gem_set_wedged(struct drm_i915_private *i915)
-{
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-
-	GEM_TRACE("start\n");
-
-	if (GEM_SHOW_DEBUG()) {
-		struct drm_printer p = drm_debug_printer(__func__);
-
-		for_each_engine(engine, i915, id)
-			intel_engine_dump(engine, &p, "%s\n", engine->name);
-	}
-
-	if (test_and_set_bit(I915_WEDGED, &i915->gpu_error.flags))
-		goto out;
-
-	/*
-	 * First, stop submission to hw, but do not yet complete requests by
-	 * rolling the global seqno forward (since this would complete requests
-	 * for which we haven't set the fence error to EIO yet).
-	 */
-	for_each_engine(engine, i915, id)
-		i915_gem_reset_prepare_engine(engine);
-
-	/* Even if the GPU reset fails, it should still stop the engines */
-	if (INTEL_GEN(i915) >= 5)
-		intel_gpu_reset(i915, ALL_ENGINES);
-
-	for_each_engine(engine, i915, id) {
-		engine->submit_request = nop_submit_request;
-		engine->schedule = NULL;
-	}
-	i915->caps.scheduler = 0;
-
-	/*
-	 * Make sure no request can slip through without getting completed by
-	 * either this call here to intel_engine_init_global_seqno, or the one
-	 * in nop_submit_request.
-	 */
-	synchronize_rcu();
-
-	/* Mark all executing requests as skipped */
-	for_each_engine(engine, i915, id)
-		engine->cancel_requests(engine);
-
-	for_each_engine(engine, i915, id) {
-		i915_gem_reset_finish_engine(engine);
-		intel_engine_wakeup(engine);
-	}
-
-out:
-	GEM_TRACE("end\n");
-
-	wake_up_all(&i915->gpu_error.reset_queue);
-}
-
-bool i915_gem_unset_wedged(struct drm_i915_private *i915)
-{
-	struct i915_timeline *tl;
-
-	lockdep_assert_held(&i915->drm.struct_mutex);
-	if (!test_bit(I915_WEDGED, &i915->gpu_error.flags))
-		return true;
-
-	GEM_TRACE("start\n");
-
-	/*
-	 * Before unwedging, make sure that all pending operations
-	 * are flushed and errored out - we may have requests waiting upon
-	 * third party fences. We marked all inflight requests as EIO, and
-	 * every execbuf since returned EIO, for consistency we want all
-	 * the currently pending requests to also be marked as EIO, which
-	 * is done inside our nop_submit_request - and so we must wait.
-	 *
-	 * No more can be submitted until we reset the wedged bit.
-	 */
-	list_for_each_entry(tl, &i915->gt.timelines, link) {
-		struct i915_request *rq;
-
-		rq = i915_gem_active_peek(&tl->last_request,
-					  &i915->drm.struct_mutex);
-		if (!rq)
-			continue;
-
-		/*
-		 * We can't use our normal waiter as we want to
-		 * avoid recursively trying to handle the current
-		 * reset. The basic dma_fence_default_wait() installs
-		 * a callback for dma_fence_signal(), which is
-		 * triggered by our nop handler (indirectly, the
-		 * callback enables the signaler thread which is
-		 * woken by the nop_submit_request() advancing the seqno
-		 * and when the seqno passes the fence, the signaler
-		 * then signals the fence waking us up).
-		 */
-		if (dma_fence_default_wait(&rq->fence, true,
-					   MAX_SCHEDULE_TIMEOUT) < 0)
-			return false;
-	}
-	i915_retire_requests(i915);
-	GEM_BUG_ON(i915->gt.active_requests);
-
-	if (!intel_gpu_reset(i915, ALL_ENGINES))
-		intel_engines_sanitize(i915);
-
-	/*
-	 * Undo nop_submit_request. We prevent all new i915 requests from
-	 * being queued (by disallowing execbuf whilst wedged) so having
-	 * waited for all active requests above, we know the system is idle
-	 * and do not have to worry about a thread being inside
-	 * engine->submit_request() as we swap over. So unlike installing
-	 * the nop_submit_request on reset, we can do this from normal
-	 * context and do not require stop_machine().
-	 */
-	intel_engines_reset_default_submission(i915);
-	i915_gem_contexts_lost(i915);
-
-	GEM_TRACE("end\n");
-
-	smp_mb__before_atomic(); /* complete takeover before enabling execbuf */
-	clear_bit(I915_WEDGED, &i915->gpu_error.flags);
-
-	return true;
-}
-
 static void
 i915_gem_retire_work_handler(struct work_struct *work)
 {
@@ -3557,7 +3038,7 @@ static void assert_kernel_context_is_current(struct drm_i915_private *i915)
 
 	GEM_BUG_ON(i915->gt.active_requests);
 	for_each_engine(engine, i915, id) {
-		GEM_BUG_ON(__i915_gem_active_peek(&engine->timeline.last_request));
+		GEM_BUG_ON(__i915_active_request_peek(&engine->timeline.last_request));
 		GEM_BUG_ON(engine->last_retired_context !=
 			   to_intel_context(i915->kernel_context, engine));
 	}
@@ -3776,33 +3257,6 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	return ret;
 }
 
-static long wait_for_timeline(struct i915_timeline *tl,
-			      unsigned int flags, long timeout)
-{
-	struct i915_request *rq;
-
-	rq = i915_gem_active_get_unlocked(&tl->last_request);
-	if (!rq)
-		return timeout;
-
-	/*
-	 * "Race-to-idle".
-	 *
-	 * Switching to the kernel context is often used a synchronous
-	 * step prior to idling, e.g. in suspend for flushing all
-	 * current operations to memory before sleeping. These we
-	 * want to complete as quickly as possible to avoid prolonged
-	 * stalls, so allow the gpu to boost to maximum clocks.
-	 */
-	if (flags & I915_WAIT_FOR_IDLE_BOOST)
-		gen6_rps_boost(rq, NULL);
-
-	timeout = i915_request_wait(rq, flags, timeout);
-	i915_request_put(rq);
-
-	return timeout;
-}
-
 static int wait_for_engines(struct drm_i915_private *i915)
 {
 	if (wait_for(intel_engines_are_idle(i915), I915_IDLE_ENGINES_TIMEOUT)) {
@@ -3816,6 +3270,52 @@ static int wait_for_engines(struct drm_i915_private *i915)
 	return 0;
 }
 
+static long
+wait_for_timelines(struct drm_i915_private *i915,
+		   unsigned int flags, long timeout)
+{
+	struct i915_gt_timelines *gt = &i915->gt.timelines;
+	struct i915_timeline *tl;
+
+	if (!READ_ONCE(i915->gt.active_requests))
+		return timeout;
+
+	mutex_lock(&gt->mutex);
+	list_for_each_entry(tl, &gt->active_list, link) {
+		struct i915_request *rq;
+
+		rq = i915_active_request_get_unlocked(&tl->last_request);
+		if (!rq)
+			continue;
+
+		mutex_unlock(&gt->mutex);
+
+		/*
+		 * "Race-to-idle".
+		 *
+		 * Switching to the kernel context is often used as a synchronous
+		 * step prior to idling, e.g. in suspend for flushing all
+		 * current operations to memory before sleeping. These we
+		 * want to complete as quickly as possible to avoid prolonged
+		 * stalls, so allow the gpu to boost to maximum clocks.
+		 */
+		if (flags & I915_WAIT_FOR_IDLE_BOOST)
+			gen6_rps_boost(rq, NULL);
+
+		timeout = i915_request_wait(rq, flags, timeout);
+		i915_request_put(rq);
+		if (timeout < 0)
+			return timeout;
+
+		/* restart after reacquiring the lock */
+		mutex_lock(&gt->mutex);
+		tl = list_entry(&gt->active_list, typeof(*tl), link);
+	}
+	mutex_unlock(&gt->mutex);
+
+	return timeout;
+}
+
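wait_for_timelines() cannot hold the timeline mutex across i915_request_wait(), so it takes a reference, unlocks, waits, relocks, and restarts from the head of the list. A pthread sketch of that unlock-wait-relock-restart pattern over a trivial fixed list; all names are illustrative:

/* Never sleep while holding the list lock: drop it, wait, retake it,
 * and rescan from the start since the list may have changed meanwhile.
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static int waited[3];	/* stands in for per-timeline last_request state */

static void request_wait(int i)
{
	/* stands in for i915_request_wait(); may sleep for a long time */
	waited[i] = 1;
}

int main(void)
{
	pthread_mutex_lock(&list_lock);
	for (int i = 0; i < 3; i++) {
		if (waited[i])
			continue;	/* nothing outstanding here */

		pthread_mutex_unlock(&list_lock);	/* never wait locked */
		request_wait(i);
		pthread_mutex_lock(&list_lock);
		i = -1;		/* list may have changed: restart from head */
	}
	pthread_mutex_unlock(&list_lock);

	printf("all timelines idle\n");
	return 0;
}
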
 int i915_gem_wait_for_idle(struct drm_i915_private *i915,
 			   unsigned int flags, long timeout)
 {
@@ -3827,17 +3327,15 @@ int i915_gem_wait_for_idle(struct drm_i915_private *i915,
 	if (!READ_ONCE(i915->gt.awake))
 		return 0;
 
+	timeout = wait_for_timelines(i915, flags, timeout);
+	if (timeout < 0)
+		return timeout;
+
 	if (flags & I915_WAIT_LOCKED) {
-		struct i915_timeline *tl;
 		int err;
 
 		lockdep_assert_held(&i915->drm.struct_mutex);
 
-		list_for_each_entry(tl, &i915->gt.timelines, link) {
-			timeout = wait_for_timeline(tl, flags, timeout);
-			if (timeout < 0)
-				return timeout;
-		}
 		if (GEM_SHOW_DEBUG() && !timeout) {
 			/* Presume that timeout was non-zero to begin with! */
 			dev_warn(&i915->drm.pdev->dev,
@@ -3851,17 +3349,6 @@ int i915_gem_wait_for_idle(struct drm_i915_private *i915,
 
 		i915_retire_requests(i915);
 		GEM_BUG_ON(i915->gt.active_requests);
-	} else {
-		struct intel_engine_cs *engine;
-		enum intel_engine_id id;
-
-		for_each_engine(engine, i915, id) {
-			struct i915_timeline *tl = &engine->timeline;
-
-			timeout = wait_for_timeline(tl, flags, timeout);
-			if (timeout < 0)
-				return timeout;
-		}
 	}
 
 	return 0;
@@ -4047,7 +3534,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 	 * reading an invalid PTE on older architectures.
 	 */
 restart:
-	list_for_each_entry(vma, &obj->vma_list, obj_link) {
+	list_for_each_entry(vma, &obj->vma.list, obj_link) {
 		if (!drm_mm_node_allocated(&vma->node))
 			continue;
 
@@ -4125,7 +3612,7 @@ restart:
 			 */
 		}
 
-		list_for_each_entry(vma, &obj->vma_list, obj_link) {
+		list_for_each_entry(vma, &obj->vma.list, obj_link) {
 			if (!drm_mm_node_allocated(&vma->node))
 				continue;
 
@@ -4135,7 +3622,7 @@ restart:
 		}
 	}
 
-	list_for_each_entry(vma, &obj->vma_list, obj_link)
+	list_for_each_entry(vma, &obj->vma.list, obj_link)
 		vma->node.color = cache_level;
 	i915_gem_object_set_cache_coherency(obj, cache_level);
 	obj->cache_dirty = true; /* Always invalidate stale cachelines */
@@ -4698,7 +4185,8 @@ out:
 }
 
 static void
-frontbuffer_retire(struct i915_gem_active *active, struct i915_request *request)
+frontbuffer_retire(struct i915_active_request *active,
+		   struct i915_request *request)
 {
 	struct drm_i915_gem_object *obj =
 		container_of(active, typeof(*obj), frontbuffer_write);
@@ -4711,7 +4199,9 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 {
 	mutex_init(&obj->mm.lock);
 
-	INIT_LIST_HEAD(&obj->vma_list);
+	spin_lock_init(&obj->vma.lock);
+	INIT_LIST_HEAD(&obj->vma.list);
+
 	INIT_LIST_HEAD(&obj->lut_list);
 	INIT_LIST_HEAD(&obj->batch_pool_link);
 
@@ -4723,7 +4213,8 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 	obj->resv = &obj->__builtin_resv;
 
 	obj->frontbuffer_ggtt_origin = ORIGIN_GTT;
-	init_request_active(&obj->frontbuffer_write, frontbuffer_retire);
+	i915_active_request_init(&obj->frontbuffer_write,
+				 NULL, frontbuffer_retire);
 
 	obj->mm.madv = I915_MADV_WILLNEED;
 	INIT_RADIX_TREE(&obj->mm.get_page.radix, GFP_KERNEL | __GFP_NOWARN);
@@ -4866,8 +4357,9 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 				    struct llist_node *freed)
 {
 	struct drm_i915_gem_object *obj, *on;
+	intel_wakeref_t wakeref;
 
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 	llist_for_each_entry_safe(obj, on, freed, freed) {
 		struct i915_vma *vma, *vn;
 
@@ -4876,14 +4368,13 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 		mutex_lock(&i915->drm.struct_mutex);
 
 		GEM_BUG_ON(i915_gem_object_is_active(obj));
-		list_for_each_entry_safe(vma, vn,
-					 &obj->vma_list, obj_link) {
+		list_for_each_entry_safe(vma, vn, &obj->vma.list, obj_link) {
 			GEM_BUG_ON(i915_vma_is_active(vma));
 			vma->flags &= ~I915_VMA_PIN_MASK;
 			i915_vma_destroy(vma);
 		}
-		GEM_BUG_ON(!list_empty(&obj->vma_list));
-		GEM_BUG_ON(!RB_EMPTY_ROOT(&obj->vma_tree));
+		GEM_BUG_ON(!list_empty(&obj->vma.list));
+		GEM_BUG_ON(!RB_EMPTY_ROOT(&obj->vma.tree));
 
 		/* This serializes freeing with the shrinker. Since the free
 		 * is delayed, first by RCU then by the workqueue, we want the
@@ -4928,7 +4419,7 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 		if (on)
 			cond_resched();
 	}
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 }
 
 static void i915_gem_flush_free_objects(struct drm_i915_private *i915)
@@ -5037,13 +4528,11 @@ void __i915_gem_object_release_unless_active(struct drm_i915_gem_object *obj)
 
 void i915_gem_sanitize(struct drm_i915_private *i915)
 {
-	int err;
+	intel_wakeref_t wakeref;
 
 	GEM_TRACE("\n");
 
-	mutex_lock(&i915->drm.struct_mutex);
-
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 	intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
 
 	/*
@@ -5063,28 +4552,28 @@ void i915_gem_sanitize(struct drm_i915_private *i915)
 	 * it may impact the display and we are uncertain about the stability
 	 * of the reset, so this could be applied to even earlier gen.
 	 */
-	err = -ENODEV;
-	if (INTEL_GEN(i915) >= 5 && intel_has_gpu_reset(i915))
-		err = WARN_ON(intel_gpu_reset(i915, ALL_ENGINES));
-	if (!err)
-		intel_engines_sanitize(i915);
+	intel_engines_sanitize(i915, false);
 
 	intel_uncore_forcewake_put(i915, FORCEWAKE_ALL);
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 
+	mutex_lock(&i915->drm.struct_mutex);
 	i915_gem_contexts_lost(i915);
 	mutex_unlock(&i915->drm.struct_mutex);
 }
 
 int i915_gem_suspend(struct drm_i915_private *i915)
 {
+	intel_wakeref_t wakeref;
 	int ret;
 
 	GEM_TRACE("\n");
 
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 	intel_suspend_gt_powersave(i915);
 
+	flush_workqueue(i915->wq);
+
 	mutex_lock(&i915->drm.struct_mutex);
 
 	/*
@@ -5114,11 +4603,9 @@ int i915_gem_suspend(struct drm_i915_private *i915)
 	i915_retire_requests(i915); /* ensure we flush after wedging */
 
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_reset_flush(i915);
 
-	intel_uc_suspend(i915);
-
-	cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
-	cancel_delayed_work_sync(&i915->gt.retire_work);
+	drain_delayed_work(&i915->gt.retire_work);
 
 	/*
 	 * As the idle_work is rearming if it detects a race, play safe and
@@ -5126,6 +4613,8 @@ int i915_gem_suspend(struct drm_i915_private *i915)
 	 */
 	drain_delayed_work(&i915->gt.idle_work);
 
+	intel_uc_suspend(i915);
+
 	/*
 	 * Assert that we successfully flushed all the work and
 	 * reset the GPU back to its idle, low power state.
@@ -5134,12 +4623,12 @@ int i915_gem_suspend(struct drm_i915_private *i915)
 	if (WARN_ON(!intel_engines_are_idle(i915)))
 		i915_gem_set_wedged(i915); /* no hope, discard everything */
 
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	return 0;
 
 err_unlock:
 	mutex_unlock(&i915->drm.struct_mutex);
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	return ret;
 }
 
@@ -5233,15 +4722,15 @@ void i915_gem_init_swizzling(struct drm_i915_private *dev_priv)
 	I915_WRITE(DISP_ARB_CTL, I915_READ(DISP_ARB_CTL) |
 				 DISP_TILE_SURFACE_SWIZZLING);
 
-	if (IS_GEN5(dev_priv))
+	if (IS_GEN(dev_priv, 5))
 		return;
 
 	I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_SWZCTL);
-	if (IS_GEN6(dev_priv))
+	if (IS_GEN(dev_priv, 6))
 		I915_WRITE(ARB_MODE, _MASKED_BIT_ENABLE(ARB_MODE_SWIZZLE_SNB));
-	else if (IS_GEN7(dev_priv))
+	else if (IS_GEN(dev_priv, 7))
 		I915_WRITE(ARB_MODE, _MASKED_BIT_ENABLE(ARB_MODE_SWIZZLE_IVB));
-	else if (IS_GEN8(dev_priv))
+	else if (IS_GEN(dev_priv, 8))
 		I915_WRITE(GAMTARBMODE, _MASKED_BIT_ENABLE(ARB_MODE_SWIZZLE_BDW));
 	else
 		BUG();
@@ -5263,10 +4752,10 @@ static void init_unused_rings(struct drm_i915_private *dev_priv)
 		init_unused_ring(dev_priv, SRB1_BASE);
 		init_unused_ring(dev_priv, SRB2_BASE);
 		init_unused_ring(dev_priv, SRB3_BASE);
-	} else if (IS_GEN2(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 2)) {
 		init_unused_ring(dev_priv, SRB0_BASE);
 		init_unused_ring(dev_priv, SRB1_BASE);
-	} else if (IS_GEN3(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 3)) {
 		init_unused_ring(dev_priv, PRB1_BASE);
 		init_unused_ring(dev_priv, PRB2_BASE);
 	}
@@ -5562,6 +5051,8 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 		dev_priv->gt.cleanup_engine = intel_engine_cleanup;
 	}
 
+	i915_timelines_init(dev_priv);
+
 	ret = i915_gem_init_userptr(dev_priv);
 	if (ret)
 		return ret;
@@ -5590,7 +5081,7 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 	}
 
 	ret = i915_gem_init_scratch(dev_priv,
-				    IS_GEN2(dev_priv) ? SZ_256K : PAGE_SIZE);
+				    IS_GEN(dev_priv, 2) ? SZ_256K : PAGE_SIZE);
 	if (ret) {
 		GEM_BUG_ON(ret == -EIO);
 		goto err_ggtt;
@@ -5684,8 +5175,10 @@ err_unlock:
 err_uc_misc:
 	intel_uc_fini_misc(dev_priv);
 
-	if (ret != -EIO)
+	if (ret != -EIO) {
 		i915_gem_cleanup_userptr(dev_priv);
+		i915_timelines_fini(dev_priv);
+	}
 
 	if (ret == -EIO) {
 		mutex_lock(&dev_priv->drm.struct_mutex);
@@ -5736,6 +5229,7 @@ void i915_gem_fini(struct drm_i915_private *dev_priv)
 
 	intel_uc_fini_misc(dev_priv);
 	i915_gem_cleanup_userptr(dev_priv);
+	i915_timelines_fini(dev_priv);
 
 	i915_gem_drain_freed_objects(dev_priv);
 
@@ -5838,7 +5332,6 @@ int i915_gem_init_early(struct drm_i915_private *dev_priv)
 	if (!dev_priv->priorities)
 		goto err_dependencies;
 
-	INIT_LIST_HEAD(&dev_priv->gt.timelines);
 	INIT_LIST_HEAD(&dev_priv->gt.active_rings);
 	INIT_LIST_HEAD(&dev_priv->gt.closed_vma);
 
@@ -5850,6 +5343,7 @@ int i915_gem_init_early(struct drm_i915_private *dev_priv)
 			  i915_gem_idle_work_handler);
 	init_waitqueue_head(&dev_priv->gpu_error.wait_queue);
 	init_waitqueue_head(&dev_priv->gpu_error.reset_queue);
+	mutex_init(&dev_priv->gpu_error.wedge_mutex);
 
 	atomic_set(&dev_priv->mm.bsd_engine_dispatch_index, 0);
 
@@ -5881,7 +5375,6 @@ void i915_gem_cleanup_early(struct drm_i915_private *dev_priv)
 	GEM_BUG_ON(!llist_empty(&dev_priv->mm.free_list));
 	GEM_BUG_ON(atomic_read(&dev_priv->mm.free_count));
 	WARN_ON(dev_priv->mm.object_count);
-	WARN_ON(!list_empty(&dev_priv->gt.timelines));
 
 	kmem_cache_destroy(dev_priv->priorities);
 	kmem_cache_destroy(dev_priv->dependencies);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 371c07087095..280813a4bf82 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -86,10 +86,10 @@
  */
 
 #include <linux/log2.h>
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 #include "i915_trace.h"
+#include "intel_lrc_reg.h"
 #include "intel_workarounds.h"
 
 #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
@@ -311,7 +311,7 @@ static u32 default_desc_template(const struct drm_i915_private *i915,
 		address_mode = INTEL_LEGACY_64B_CONTEXT;
 	desc |= address_mode << GEN8_CTX_ADDRESSING_MODE_SHIFT;
 
-	if (IS_GEN8(i915))
+	if (IS_GEN(i915, 8))
 		desc |= GEN8_CTX_L3LLC_COHERENT;
 
 	/* TODO: WaDisableLiteRestore when we start using semaphore
@@ -322,6 +322,32 @@ static u32 default_desc_template(const struct drm_i915_private *i915,
 	return desc;
 }
 
+static void intel_context_retire(struct i915_active_request *active,
+				 struct i915_request *rq)
+{
+	struct intel_context *ce =
+		container_of(active, typeof(*ce), active_tracker);
+
+	intel_context_unpin(ce);
+}
+
+void
+intel_context_init(struct intel_context *ce,
+		   struct i915_gem_context *ctx,
+		   struct intel_engine_cs *engine)
+{
+	ce->gem_context = ctx;
+
+	INIT_LIST_HEAD(&ce->signal_link);
+	INIT_LIST_HEAD(&ce->signals);
+
+	/* Use the whole device by default */
+	ce->sseu = intel_device_default_sseu(ctx->i915);
+
+	i915_active_request_init(&ce->active_tracker,
+				 NULL, intel_context_retire);
+}
+
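intel_context_retire() above recovers its intel_context from the embedded active_tracker via container_of(), the same idiom frontbuffer_retire() uses a few hunks earlier. A freestanding sketch of that idiom:

/* container_of(): recover the enclosing struct from a pointer to an
 * embedded member.
 */
#include <stddef.h>
#include <stdio.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct tracker { int armed; };

struct context {
	int id;
	struct tracker active_tracker;	/* embedded, as in intel_context */
};

static void retire(struct tracker *t)
{
	struct context *ce = container_of(t, struct context, active_tracker);

	printf("retiring context %d\n", ce->id);
}

int main(void)
{
	struct context ce = { .id = 42 };

	retire(&ce.active_tracker);	/* prints "retiring context 42" */
	return 0;
}
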
 static struct i915_gem_context *
 __create_hw_context(struct drm_i915_private *dev_priv,
 		    struct drm_i915_file_private *file_priv)
@@ -339,11 +365,8 @@ __create_hw_context(struct drm_i915_private *dev_priv,
 	ctx->i915 = dev_priv;
 	ctx->sched.priority = I915_USER_PRIORITY(I915_PRIORITY_NORMAL);
 
-	for (n = 0; n < ARRAY_SIZE(ctx->__engine); n++) {
-		struct intel_context *ce = &ctx->__engine[n];
-
-		ce->gem_context = ctx;
-	}
+	for (n = 0; n < ARRAY_SIZE(ctx->__engine); n++)
+		intel_context_init(&ctx->__engine[n], ctx, dev_priv->engine[n]);
 
 	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
 	INIT_LIST_HEAD(&ctx->handles_list);
@@ -646,10 +669,10 @@ last_request_on_engine(struct i915_timeline *timeline,
 
 	GEM_BUG_ON(timeline == &engine->timeline);
 
-	rq = i915_gem_active_raw(&timeline->last_request,
-				 &engine->i915->drm.struct_mutex);
+	rq = i915_active_request_raw(&timeline->last_request,
+				     &engine->i915->drm.struct_mutex);
 	if (rq && rq->engine == engine) {
-		GEM_TRACE("last request for %s on engine %s: %llx:%d\n",
+		GEM_TRACE("last request for %s on engine %s: %llx:%llu\n",
 			  timeline->name, engine->name,
 			  rq->fence.context, rq->fence.seqno);
 		GEM_BUG_ON(rq->timeline != timeline);
@@ -686,14 +709,14 @@ static bool engine_has_kernel_context_barrier(struct intel_engine_cs *engine)
 		 * switch-to-kernel-context?
 		 */
 		if (!i915_timeline_sync_is_later(barrier, &rq->fence)) {
-			GEM_TRACE("%s needs barrier for %llx:%d\n",
+			GEM_TRACE("%s needs barrier for %llx:%lld\n",
 				  ring->timeline->name,
 				  rq->fence.context,
 				  rq->fence.seqno);
 			return false;
 		}
 
-		GEM_TRACE("%s has barrier after %llx:%d\n",
+		GEM_TRACE("%s has barrier after %llx:%lld\n",
 			  ring->timeline->name,
 			  rq->fence.context,
 			  rq->fence.seqno);
@@ -749,7 +772,7 @@ int i915_gem_switch_to_kernel_context(struct drm_i915_private *i915)
 			if (prev->gem_context == i915->kernel_context)
 				continue;
 
-			GEM_TRACE("add barrier on %s for %llx:%d\n",
+			GEM_TRACE("add barrier on %s for %llx:%lld\n",
 				  engine->name,
 				  prev->fence.context,
 				  prev->fence.seqno);
@@ -840,6 +863,56 @@ out:
 	return 0;
 }
 
+static int get_sseu(struct i915_gem_context *ctx,
+		    struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_param_sseu user_sseu;
+	struct intel_engine_cs *engine;
+	struct intel_context *ce;
+	int ret;
+
+	if (args->size == 0)
+		goto out;
+	else if (args->size < sizeof(user_sseu))
+		return -EINVAL;
+
+	if (copy_from_user(&user_sseu, u64_to_user_ptr(args->value),
+			   sizeof(user_sseu)))
+		return -EFAULT;
+
+	if (user_sseu.flags || user_sseu.rsvd)
+		return -EINVAL;
+
+	engine = intel_engine_lookup_user(ctx->i915,
+					  user_sseu.engine_class,
+					  user_sseu.engine_instance);
+	if (!engine)
+		return -EINVAL;
+
+	/* The mutex here serves only to serialize get_param and set_param. */
+	ret = mutex_lock_interruptible(&ctx->i915->drm.struct_mutex);
+	if (ret)
+		return ret;
+
+	ce = to_intel_context(ctx, engine);
+
+	user_sseu.slice_mask = ce->sseu.slice_mask;
+	user_sseu.subslice_mask = ce->sseu.subslice_mask;
+	user_sseu.min_eus_per_subslice = ce->sseu.min_eus_per_subslice;
+	user_sseu.max_eus_per_subslice = ce->sseu.max_eus_per_subslice;
+
+	mutex_unlock(&ctx->i915->drm.struct_mutex);
+
+	if (copy_to_user(u64_to_user_ptr(args->value), &user_sseu,
+			 sizeof(user_sseu)))
+		return -EFAULT;
+
+out:
+	args->size = sizeof(user_sseu);
+
+	return 0;
+}
+
 int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file)
 {
@@ -852,15 +925,17 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 	if (!ctx)
 		return -ENOENT;
 
-	args->size = 0;
 	switch (args->param) {
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 		ret = -EINVAL;
 		break;
 	case I915_CONTEXT_PARAM_NO_ZEROMAP:
+		args->size = 0;
 		args->value = test_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags);
 		break;
 	case I915_CONTEXT_PARAM_GTT_SIZE:
+		args->size = 0;
+
 		if (ctx->ppgtt)
 			args->value = ctx->ppgtt->vm.total;
 		else if (to_i915(dev)->mm.aliasing_ppgtt)
@@ -869,14 +944,20 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 			args->value = to_i915(dev)->ggtt.vm.total;
 		break;
 	case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE:
+		args->size = 0;
 		args->value = i915_gem_context_no_error_capture(ctx);
 		break;
 	case I915_CONTEXT_PARAM_BANNABLE:
+		args->size = 0;
 		args->value = i915_gem_context_is_bannable(ctx);
 		break;
 	case I915_CONTEXT_PARAM_PRIORITY:
+		args->size = 0;
 		args->value = ctx->sched.priority >> I915_USER_PRIORITY_SHIFT;
 		break;
+	case I915_CONTEXT_PARAM_SSEU:
+		ret = get_sseu(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
@@ -886,6 +967,281 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 	return ret;
 }
 
+static int gen8_emit_rpcs_config(struct i915_request *rq,
+				 struct intel_context *ce,
+				 struct intel_sseu sseu)
+{
+	u64 offset;
+	u32 *cs;
+
+	cs = intel_ring_begin(rq, 4);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	offset = i915_ggtt_offset(ce->state) +
+		 LRC_STATE_PN * PAGE_SIZE +
+		 (CTX_R_PWR_CLK_STATE + 1) * 4;
+
+	*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
+	*cs++ = lower_32_bits(offset);
+	*cs++ = upper_32_bits(offset);
+	*cs++ = gen8_make_rpcs(rq->i915, &sseu);
+
+	intel_ring_advance(rq, cs);
+
+	return 0;
+}
+
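gen8_emit_rpcs_config() packs a four-dword store into the ring: an opcode, the 64-bit GGTT address split into lower/upper halves, then the value. A sketch of the packing arithmetic with a made-up opcode constant; the real MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT encoding lives in the driver's register headers:

/* Pack an "store dword to 64-bit address" style command into a ring
 * buffer. FAKE_STORE_DWORD_OP is illustrative only.
 */
#include <stdint.h>
#include <stdio.h>

#define FAKE_STORE_DWORD_OP	0x10400002u	/* not a real GPU opcode */

static uint32_t *emit_store_dword_addr(uint32_t *cs, uint64_t addr,
				       uint32_t value)
{
	*cs++ = FAKE_STORE_DWORD_OP;
	*cs++ = (uint32_t)(addr & 0xffffffff);	/* lower_32_bits() */
	*cs++ = (uint32_t)(addr >> 32);		/* upper_32_bits() */
	*cs++ = value;
	return cs;
}

int main(void)
{
	uint32_t ring[4];
	uint64_t offset = (0x1234ull << 32) | 0x5678;

	emit_store_dword_addr(ring, offset, 0xdeadbeef);
	printf("%08x %08x %08x %08x\n", ring[0], ring[1], ring[2], ring[3]);
	return 0;
}
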
+static int
+gen8_modify_rpcs_gpu(struct intel_context *ce,
+		     struct intel_engine_cs *engine,
+		     struct intel_sseu sseu)
+{
+	struct drm_i915_private *i915 = engine->i915;
+	struct i915_request *rq, *prev;
+	intel_wakeref_t wakeref;
+	int ret;
+
+	GEM_BUG_ON(!ce->pin_count);
+
+	lockdep_assert_held(&i915->drm.struct_mutex);
+
+	/* Submitting requests etc needs the hw awake. */
+	wakeref = intel_runtime_pm_get(i915);
+
+	rq = i915_request_alloc(engine, i915->kernel_context);
+	if (IS_ERR(rq)) {
+		ret = PTR_ERR(rq);
+		goto out_put;
+	}
+
+	/* Queue this switch after all other activity by this context. */
+	prev = i915_active_request_raw(&ce->ring->timeline->last_request,
+				       &i915->drm.struct_mutex);
+	if (prev && !i915_request_completed(prev)) {
+		ret = i915_request_await_dma_fence(rq, &prev->fence);
+		if (ret < 0)
+			goto out_add;
+	}
+
+	/* Order all following requests to be after. */
+	ret = i915_timeline_set_barrier(ce->ring->timeline, rq);
+	if (ret)
+		goto out_add;
+
+	ret = gen8_emit_rpcs_config(rq, ce, sseu);
+	if (ret)
+		goto out_add;
+
+	/*
+	 * Guarantee that the context image and the timeline remain pinned
+	 * until the modifying request is retired, by setting the ce activity
+	 * tracker.
+	 *
+	 * But we only need to take one pin on account of it. In other words,
+	 * we transfer the pinned ce object to the tracked active request.
+	 */
+	if (!i915_active_request_isset(&ce->active_tracker))
+		__intel_context_pin(ce);
+	__i915_active_request_set(&ce->active_tracker, rq);
+
+out_add:
+	i915_request_add(rq);
+out_put:
+	intel_runtime_pm_put(i915, wakeref);
+
+	return ret;
+}
+
+static int
+__i915_gem_context_reconfigure_sseu(struct i915_gem_context *ctx,
+				    struct intel_engine_cs *engine,
+				    struct intel_sseu sseu)
+{
+	struct intel_context *ce = to_intel_context(ctx, engine);
+	int ret = 0;
+
+	GEM_BUG_ON(INTEL_GEN(ctx->i915) < 8);
+	GEM_BUG_ON(engine->id != RCS);
+
+	/* Nothing to do if unmodified. */
+	if (!memcmp(&ce->sseu, &sseu, sizeof(sseu)))
+		return 0;
+
+	/*
+	 * If the context is not idle, we have to submit an ordered request
+	 * to modify its context image via the kernel context. Pristine and
+	 * idle contexts will be configured on pinning.
+	 */
+	if (ce->pin_count)
+		ret = gen8_modify_rpcs_gpu(ce, engine, sseu);
+
+	if (!ret)
+		ce->sseu = sseu;
+
+	return ret;
+}
+
+static int
+i915_gem_context_reconfigure_sseu(struct i915_gem_context *ctx,
+				  struct intel_engine_cs *engine,
+				  struct intel_sseu sseu)
+{
+	int ret;
+
+	ret = mutex_lock_interruptible(&ctx->i915->drm.struct_mutex);
+	if (ret)
+		return ret;
+
+	ret = __i915_gem_context_reconfigure_sseu(ctx, engine, sseu);
+
+	mutex_unlock(&ctx->i915->drm.struct_mutex);
+
+	return ret;
+}
+
+static int
+user_to_context_sseu(struct drm_i915_private *i915,
+		     const struct drm_i915_gem_context_param_sseu *user,
+		     struct intel_sseu *context)
+{
+	const struct sseu_dev_info *device = &RUNTIME_INFO(i915)->sseu;
+
+	/* No zeros in any field. */
+	if (!user->slice_mask || !user->subslice_mask ||
+	    !user->min_eus_per_subslice || !user->max_eus_per_subslice)
+		return -EINVAL;
+
+	/* Max >= min. */
+	if (user->max_eus_per_subslice < user->min_eus_per_subslice)
+		return -EINVAL;
+
+	/*
+	 * Some future proofing on the types since the uAPI is wider than the
+	 * current internal implementation.
+	 */
+	if (overflows_type(user->slice_mask, context->slice_mask) ||
+	    overflows_type(user->subslice_mask, context->subslice_mask) ||
+	    overflows_type(user->min_eus_per_subslice,
+			   context->min_eus_per_subslice) ||
+	    overflows_type(user->max_eus_per_subslice,
+			   context->max_eus_per_subslice))
+		return -EINVAL;
+
+	/* Check validity against hardware. */
+	if (user->slice_mask & ~device->slice_mask)
+		return -EINVAL;
+
+	if (user->subslice_mask & ~device->subslice_mask[0])
+		return -EINVAL;
+
+	if (user->max_eus_per_subslice > device->max_eus_per_subslice)
+		return -EINVAL;
+
+	context->slice_mask = user->slice_mask;
+	context->subslice_mask = user->subslice_mask;
+	context->min_eus_per_subslice = user->min_eus_per_subslice;
+	context->max_eus_per_subslice = user->max_eus_per_subslice;
+
+	/* Part-specific restrictions. */
+	if (IS_GEN(i915, 11)) {
+		unsigned int hw_s = hweight8(device->slice_mask);
+		unsigned int hw_ss_per_s = hweight8(device->subslice_mask[0]);
+		unsigned int req_s = hweight8(context->slice_mask);
+		unsigned int req_ss = hweight8(context->subslice_mask);
+
+		/*
+		 * Only full subslice enablement is possible if more than one
+		 * slice is turned on.
+		 */
+		if (req_s > 1 && req_ss != hw_ss_per_s)
+			return -EINVAL;
+
+		/*
+		 * If more than four (SScount bitfield limit) subslices are
+		 * requested then the number has to be even.
+		 */
+		if (req_ss > 4 && (req_ss & 1))
+			return -EINVAL;
+
+		/*
+		 * If only one slice is enabled and the subslice count is below
+		 * the device's full enablement, it must be at most half of all
+		 * the available subslices.
+		 */
+		if (req_s == 1 && req_ss < hw_ss_per_s &&
+		    req_ss > (hw_ss_per_s / 2))
+			return -EINVAL;
+
+		/* ABI restriction - VME use case only. */
+
+		/* All slices or one slice only. */
+		if (req_s != 1 && req_s != hw_s)
+			return -EINVAL;
+
+		/*
+		 * Half subslices or full enablement only when one slice is
+		 * enabled.
+		 */
+		if (req_s == 1 &&
+		    (req_ss != hw_ss_per_s && req_ss != (hw_ss_per_s / 2)))
+			return -EINVAL;
+
+		/* No EU configuration changes. */
+		if ((user->min_eus_per_subslice !=
+		     device->max_eus_per_subslice) ||
+		    (user->max_eus_per_subslice !=
+		     device->max_eus_per_subslice))
+			return -EINVAL;
+	}
+
+	return 0;
+}
+
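The gen11 rules in user_to_context_sseu() reduce to simple arithmetic on the requested slice/subslice totals. A compact userspace restatement of three of those checks, taking req_s and req_ss as already counted (the kernel computes them with hweight8() over the masks):

/* Three of the gen11 VME-mode sseu validity rules, restated. */
#include <stdbool.h>
#include <stdio.h>

static bool gen11_sseu_ok(unsigned int req_s, unsigned int req_ss,
			  unsigned int hw_ss_per_s)
{
	/* more than one slice: only full subslice enablement is possible */
	if (req_s > 1 && req_ss != hw_ss_per_s)
		return false;

	/* more than four subslices: count must be even (SScount limit) */
	if (req_ss > 4 && (req_ss & 1))
		return false;

	/* one slice: half of the subslices or full enablement only */
	if (req_s == 1 && req_ss != hw_ss_per_s && req_ss != hw_ss_per_s / 2)
		return false;

	return true;
}

int main(void)
{
	printf("%d\n", gen11_sseu_ok(1, 4, 8));	/* 1: exactly half */
	printf("%d\n", gen11_sseu_ok(1, 3, 8));	/* 0: neither half nor full */
	printf("%d\n", gen11_sseu_ok(2, 6, 8));	/* 0: partial with 2 slices */
	return 0;
}
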
+static int set_sseu(struct i915_gem_context *ctx,
+		    struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_private *i915 = ctx->i915;
+	struct drm_i915_gem_context_param_sseu user_sseu;
+	struct intel_engine_cs *engine;
+	struct intel_sseu sseu;
+	int ret;
+
+	if (args->size < sizeof(user_sseu))
+		return -EINVAL;
+
+	if (!IS_GEN(i915, 11))
+		return -ENODEV;
+
+	if (copy_from_user(&user_sseu, u64_to_user_ptr(args->value),
+			   sizeof(user_sseu)))
+		return -EFAULT;
+
+	if (user_sseu.flags || user_sseu.rsvd)
+		return -EINVAL;
+
+	engine = intel_engine_lookup_user(i915,
+					  user_sseu.engine_class,
+					  user_sseu.engine_instance);
+	if (!engine)
+		return -EINVAL;
+
+	/* Only render engine supports RPCS configuration. */
+	if (engine->class != RENDER_CLASS)
+		return -ENODEV;
+
+	ret = user_to_context_sseu(i915, &user_sseu, &sseu);
+	if (ret)
+		return ret;
+
+	ret = i915_gem_context_reconfigure_sseu(ctx, engine, sseu);
+	if (ret)
+		return ret;
+
+	args->size = sizeof(user_sseu);
+
+	return 0;
+}
+
 int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file)
 {
@@ -948,7 +1304,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 					I915_USER_PRIORITY(priority);
 		}
 		break;
-
+	case I915_CONTEXT_PARAM_SSEU:
+		ret = set_sseu(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index f6d870b1f73e..ca150a764c24 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -31,6 +31,7 @@
 
 #include "i915_gem.h"
 #include "i915_scheduler.h"
+#include "intel_device_info.h"
 
 struct pid;
 
@@ -53,6 +54,16 @@ struct intel_context_ops {
 	void (*destroy)(struct intel_context *ce);
 };
 
+/*
+ * Powergating configuration for a particular (context,engine).
+ */
+struct intel_sseu {
+	u8 slice_mask;
+	u8 subslice_mask;
+	u8 min_eus_per_subslice;
+	u8 max_eus_per_subslice;
+};
+
 /**
  * struct i915_gem_context - client state
  *
@@ -164,13 +175,24 @@ struct i915_gem_context {
 	struct intel_context {
 		struct i915_gem_context *gem_context;
 		struct intel_engine_cs *active;
+		struct list_head signal_link;
+		struct list_head signals;
 		struct i915_vma *state;
 		struct intel_ring *ring;
 		u32 *lrc_reg_state;
 		u64 lrc_desc;
 		int pin_count;
 
+		/**
+		 * active_tracker: Active tracker for the external rq activity
+		 * on this intel_context object.
+		 */
+		struct i915_active_request active_tracker;
+
 		const struct intel_context_ops *ops;
+
+		/** sseu: Control eu/slice partitioning */
+		struct intel_sseu sseu;
 	} __engine[I915_NUM_ENGINES];
 
 	/** ring_size: size for allocating the per-engine ring buffer */
@@ -364,4 +386,8 @@ static inline void i915_gem_context_put(struct i915_gem_context *ctx)
 	kref_put(&ctx->ref, i915_gem_context_release);
 }
 
+void intel_context_init(struct intel_context *ce,
+			struct i915_gem_context *ctx,
+			struct intel_engine_cs *engine);
+
 #endif /* !__I915_GEM_CONTEXT_H__ */
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 82e2ca17a441..02f7298bfe57 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -27,7 +27,6 @@
 #include <linux/dma-buf.h>
 #include <linux/reservation.h>
 
-#include <drm/drmP.h>
 
 #include "i915_drv.h"
 
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 02b83a5ed96c..68d74c50ac39 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -26,7 +26,6 @@
  *
  */
 
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
 
 #include "i915_drv.h"
@@ -127,31 +126,25 @@ i915_gem_evict_something(struct i915_address_space *vm,
 	struct drm_i915_private *dev_priv = vm->i915;
 	struct drm_mm_scan scan;
 	struct list_head eviction_list;
-	struct list_head *phases[] = {
-		&vm->inactive_list,
-		&vm->active_list,
-		NULL,
-	}, **phase;
 	struct i915_vma *vma, *next;
 	struct drm_mm_node *node;
 	enum drm_mm_insert_mode mode;
+	struct i915_vma *active;
 	int ret;
 
 	lockdep_assert_held(&vm->i915->drm.struct_mutex);
 	trace_i915_gem_evict(vm, min_size, alignment, flags);
 
 	/*
-	 * The goal is to evict objects and amalgamate space in LRU order.
-	 * The oldest idle objects reside on the inactive list, which is in
-	 * retirement order. The next objects to retire are those in flight,
-	 * on the active list, again in retirement order.
+	 * The goal is to evict objects and amalgamate space in rough LRU order.
+	 * Since active and inactive objects reside on the same list, in a mix
+	 * of creation and last-scanned order, we sort the list into
+	 * inactive/active as we process it, which keeps the active portion
+	 * in rough MRU order.
 	 *
 	 * The retirement sequence is thus:
-	 *   1. Inactive objects (already retired)
-	 *   2. Active objects (will stall on unbinding)
-	 *
-	 * On each list, the oldest objects lie at the HEAD with the freshest
-	 * object on the TAIL.
+	 *   1. Inactive objects (already retired, random order)
+	 *   2. Active objects (will stall on unbinding, oldest scanned first)
 	 */
 	mode = DRM_MM_INSERT_BEST;
 	if (flags & PIN_HIGH)
@@ -170,17 +163,46 @@ i915_gem_evict_something(struct i915_address_space *vm,
 	 */
 	if (!(flags & PIN_NONBLOCK))
 		i915_retire_requests(dev_priv);
-	else
-		phases[1] = NULL;
 
 search_again:
+	active = NULL;
 	INIT_LIST_HEAD(&eviction_list);
-	phase = phases;
-	do {
-		list_for_each_entry(vma, *phase, vm_link)
-			if (mark_free(&scan, vma, flags, &eviction_list))
-				goto found;
-	} while (*++phase);
+	list_for_each_entry_safe(vma, next, &vm->bound_list, vm_link) {
+		/*
+		 * We keep this list in a rough least-recently scanned order
+		 * of active elements (inactive elements are cheap to reap).
+		 * New entries are added to the end, and we move anything we
+		 * scan to the end. The assumption is that the working set
+		 * of applications is either steady state (and thanks to the
+		 * userspace bo cache it almost always is) or volatile and
+		 * frequently replaced after a frame, which are self-evicting!
+		 * Given that assumption, the MRU order of the scan list is
+		 * fairly static, and keeping it in least-recently scan order
+		 * is suitable.
+		 *
+		 * To notice when we complete one full cycle, we record the
+		 * first active element seen, before moving it to the tail.
+		 */
+		if (i915_vma_is_active(vma)) {
+			if (vma == active) {
+				if (flags & PIN_NONBLOCK)
+					break;
+
+				active = ERR_PTR(-EAGAIN);
+			}
+
+			if (active != ERR_PTR(-EAGAIN)) {
+				if (!active)
+					active = vma;
+
+				list_move_tail(&vma->vm_link, &vm->bound_list);
+				continue;
+			}
+		}
+
+		if (mark_free(&scan, vma, flags, &eviction_list))
+			goto found;
+	}
 
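The loop above reaps inactive vmas immediately, rotates active ones to the tail, and uses the first active entry seen as a sentinel so a completed cycle can be detected. A toy model over arrays, where the id of the first rotated entry plays the sentinel and negative activity marks evictable entries:

/* Single-list eviction scan: reap idle entries, rotate active ones to
 * the tail, stop when the first-seen active entry comes around again.
 */
#include <stdio.h>

#define N 5

int main(void)
{
	int id[2 * N]     = { 10, 11, 12, 13, 14 };	/* room for rotations */
	int active[2 * N] = {  1,  0,  1,  0,  1 };	/* 1 = busy on the GPU */
	int head = 0, tail = N;		/* [head, tail) is the live list */
	int first_active = -1;

	while (head < tail) {
		int i = head++;

		if (active[i]) {
			if (id[i] == first_active)
				break;		/* full cycle: only active left */
			if (first_active < 0)
				first_active = id[i];
			id[tail] = id[i];	/* rotate to the tail */
			active[tail++] = 1;
			continue;
		}
		printf("evict vma %d\n", id[i]);	/* idle: reap now */
	}
	return 0;	/* prints: evict vma 11, evict vma 13 */
}
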
 	/* Nothing found, clean up and bail out! */
 	list_for_each_entry_safe(vma, next, &eviction_list, evict_link) {
@@ -389,11 +411,6 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
  */
 int i915_gem_evict_vm(struct i915_address_space *vm)
 {
-	struct list_head *phases[] = {
-		&vm->inactive_list,
-		&vm->active_list,
-		NULL
-	}, **phase;
 	struct list_head eviction_list;
 	struct i915_vma *vma, *next;
 	int ret;
@@ -413,16 +430,15 @@ int i915_gem_evict_vm(struct i915_address_space *vm)
 	}
 
 	INIT_LIST_HEAD(&eviction_list);
-	phase = phases;
-	do {
-		list_for_each_entry(vma, *phase, vm_link) {
-			if (i915_vma_is_pinned(vma))
-				continue;
+	mutex_lock(&vm->mutex);
+	list_for_each_entry(vma, &vm->bound_list, vm_link) {
+		if (i915_vma_is_pinned(vma))
+			continue;
 
-			__i915_vma_pin(vma);
-			list_add(&vma->evict_link, &eviction_list);
-		}
-	} while (*++phase);
+		__i915_vma_pin(vma);
+		list_add(&vma->evict_link, &eviction_list);
+	}
+	mutex_unlock(&vm->mutex);
 
 	ret = 0;
 	list_for_each_entry_safe(vma, next, &eviction_list, evict_link) {
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 485b259127c3..02adcaf6ebea 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -31,7 +31,6 @@
 #include <linux/sync_file.h>
 #include <linux/uaccess.h>
 
-#include <drm/drmP.h>
 #include <drm/drm_syncobj.h>
 #include <drm/i915_drm.h>
 
@@ -754,6 +753,68 @@ static int eb_select_context(struct i915_execbuffer *eb)
 	return 0;
 }
 
+static struct i915_request *__eb_wait_for_ring(struct intel_ring *ring)
+{
+	struct i915_request *rq;
+
+	/*
+	 * Completely unscientific finger-in-the-air estimates for suitable
+	 * maximum user request size (to avoid blocking) and then backoff.
+	 */
+	if (intel_ring_update_space(ring) >= PAGE_SIZE)
+		return NULL;
+
+	/*
+	 * Find a request such that, after waiting upon it, there will be at
+	 * least half the ring available. The hysteresis allows us to compete
+	 * for the shared ring and should mean that we sleep less often prior to
+	 * claiming our resources, but not so long that the ring completely
+	 * drains before we can submit our next request.
+	 */
+	list_for_each_entry(rq, &ring->request_list, ring_link) {
+		if (__intel_ring_space(rq->postfix,
+				       ring->emit, ring->size) > ring->size / 2)
+			break;
+	}
+	if (&rq->ring_link == &ring->request_list)
+		return NULL; /* weird, we will check again later for real */
+
+	return i915_request_get(rq);
+}
+
+static int eb_wait_for_ring(const struct i915_execbuffer *eb)
+{
+	const struct intel_context *ce;
+	struct i915_request *rq;
+	int ret = 0;
+
+	/*
+	 * Apply a light amount of backpressure to prevent excessive hogs
+	 * from blocking waiting for space whilst holding struct_mutex and
+	 * keeping all of their resources pinned.
+	 */
+
+	ce = to_intel_context(eb->ctx, eb->engine);
+	if (!ce->ring) /* first use, assume empty! */
+		return 0;
+
+	rq = __eb_wait_for_ring(ce->ring);
+	if (rq) {
+		mutex_unlock(&eb->i915->drm.struct_mutex);
+
+		if (i915_request_wait(rq,
+				      I915_WAIT_INTERRUPTIBLE,
+				      MAX_SCHEDULE_TIMEOUT) < 0)
+			ret = -EINTR;
+
+		i915_request_put(rq);
+
+		mutex_lock(&eb->i915->drm.struct_mutex);
+	}
+
+	return ret;
+}
+
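The backoff above keys off how much ring space would be free once a given request's tail retires. Free space in a power-of-two ring is a masked subtraction of tail from head; a sketch of that arithmetic (the driver's __intel_ring_space() also reserves some slack, ignored here):

/* Ring free space with wraparound: bytes writable at tail without
 * overtaking head, for a power-of-two ring size.
 */
#include <stdio.h>

static unsigned int ring_space(unsigned int head, unsigned int tail,
			       unsigned int size)
{
	return (head - tail - 1) & (size - 1);
}

int main(void)
{
	printf("%u\n", ring_space(4096, 1024, 16384));	/* 3071 */
	printf("%u\n", ring_space(1024, 4096, 16384));	/* 13311: wrapped */
	return 0;
}
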
 static int eb_lookup_vmas(struct i915_execbuffer *eb)
 {
 	struct radix_tree_root *handles_vma = &eb->ctx->handles_vma;
@@ -1380,7 +1441,7 @@ eb_relocate_entry(struct i915_execbuffer *eb,
 		 * batchbuffers.
 		 */
 		if (reloc->write_domain == I915_GEM_DOMAIN_INSTRUCTION &&
-		    IS_GEN6(eb->i915)) {
+		    IS_GEN(eb->i915, 6)) {
 			err = i915_vma_bind(target, target->obj->cache_level,
 					    PIN_GLOBAL);
 			if (WARN_ONCE(err,
@@ -1896,7 +1957,7 @@ static int i915_reset_gen7_sol_offsets(struct i915_request *rq)
 	u32 *cs;
 	int i;
 
-	if (!IS_GEN7(rq->i915) || rq->engine->id != RCS) {
+	if (!IS_GEN(rq->i915, 7) || rq->engine->id != RCS) {
 		DRM_DEBUG("sol reset is gen7/rcs only\n");
 		return -EINVAL;
 	}
@@ -1977,6 +2038,18 @@ static int eb_submit(struct i915_execbuffer *eb)
 			return err;
 	}
 
+	/*
+	 * After we have completed waiting for other engines (using HW
+	 * semaphores), we can signal that this request/batch is ready to run.
+	 * This allows us to determine whether the batch is still waiting on
+	 * the GPU or actually running, by checking the breadcrumb.
+	 */
+	if (eb->engine->emit_init_breadcrumb) {
+		err = eb->engine->emit_init_breadcrumb(eb->request);
+		if (err)
+			return err;
+	}
+
 	err = eb->engine->emit_bb_start(eb->request,
 					eb->batch->node.start +
 					eb->batch_start_offset,
@@ -2203,6 +2276,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	struct i915_execbuffer eb;
 	struct dma_fence *in_fence = NULL;
 	struct sync_file *out_fence = NULL;
+	intel_wakeref_t wakeref;
 	int out_fence_fd = -1;
 	int err;
 
@@ -2273,12 +2347,16 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	 * wakeref that we hold until the GPU has been idle for at least
 	 * 100ms.
 	 */
-	intel_runtime_pm_get(eb.i915);
+	wakeref = intel_runtime_pm_get(eb.i915);
 
 	err = i915_mutex_lock_interruptible(dev);
 	if (err)
 		goto err_rpm;
 
+	err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */
+	if (unlikely(err))
+		goto err_unlock;
+
 	err = eb_relocate(&eb);
 	if (err) {
 		/*
@@ -2423,9 +2501,10 @@ err_batch_unpin:
 err_vma:
 	if (eb.exec)
 		eb_release_vmas(&eb);
+err_unlock:
 	mutex_unlock(&dev->struct_mutex);
 err_rpm:
-	intel_runtime_pm_put(eb.i915);
+	intel_runtime_pm_put(eb.i915, wakeref);
 	i915_gem_context_put(eb.ctx);
 err_destroy:
 	eb_destroy(&eb);
diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
index d548ac05ccd7..e037e94792f3 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
@@ -21,7 +21,6 @@
  * IN THE SOFTWARE.
  */
 
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 
@@ -193,9 +192,9 @@ static void fence_write(struct drm_i915_fence_reg *fence,
 	 * and explicitly managed for internal users.
 	 */
 
-	if (IS_GEN2(fence->i915))
+	if (IS_GEN(fence->i915, 2))
 		i830_write_fence_reg(fence, vma);
-	else if (IS_GEN3(fence->i915))
+	else if (IS_GEN(fence->i915, 3))
 		i915_write_fence_reg(fence, vma);
 	else
 		i965_write_fence_reg(fence, vma);
@@ -210,6 +209,7 @@ static void fence_write(struct drm_i915_fence_reg *fence,
 static int fence_update(struct drm_i915_fence_reg *fence,
 			struct i915_vma *vma)
 {
+	intel_wakeref_t wakeref;
 	int ret;
 
 	if (vma) {
@@ -223,7 +223,7 @@ static int fence_update(struct drm_i915_fence_reg *fence,
 			 i915_gem_object_get_tiling(vma->obj)))
 			return -EINVAL;
 
-		ret = i915_gem_active_retire(&vma->last_fence,
+		ret = i915_active_request_retire(&vma->last_fence,
 					     &vma->obj->base.dev->struct_mutex);
 		if (ret)
 			return ret;
@@ -232,7 +232,7 @@ static int fence_update(struct drm_i915_fence_reg *fence,
 	if (fence->vma) {
 		struct i915_vma *old = fence->vma;
 
-		ret = i915_gem_active_retire(&old->last_fence,
+		ret = i915_active_request_retire(&old->last_fence,
 					     &old->obj->base.dev->struct_mutex);
 		if (ret)
 			return ret;
@@ -257,9 +257,10 @@ static int fence_update(struct drm_i915_fence_reg *fence,
 	 * If the device is currently powered down, we will defer the write
 	 * to the runtime resume, see i915_gem_restore_fences().
 	 */
-	if (intel_runtime_pm_get_if_in_use(fence->i915)) {
+	wakeref = intel_runtime_pm_get_if_in_use(fence->i915);
+	if (wakeref) {
 		fence_write(fence, vma);
-		intel_runtime_pm_put(fence->i915);
+		intel_runtime_pm_put(fence->i915, wakeref);
 	}
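
The runtime-PM interface now returns an intel_wakeref_t cookie from every
get, and the matching put must hand the same cookie back, which lets the
wakeref-tracking debug code pair each acquire with its release. The shape
of a converted call site, as in fence_update() above:

	intel_wakeref_t wakeref;

	wakeref = intel_runtime_pm_get_if_in_use(i915);
	if (wakeref) {
		/* ... touch the hardware ... */
		intel_runtime_pm_put(i915, wakeref); /* return the cookie */
	}
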
 
 	if (vma) {
@@ -554,8 +555,8 @@ void i915_gem_restore_fences(struct drm_i915_private *dev_priv)
 void
 i915_gem_detect_bit_6_swizzle(struct drm_i915_private *dev_priv)
 {
-	uint32_t swizzle_x = I915_BIT_6_SWIZZLE_UNKNOWN;
-	uint32_t swizzle_y = I915_BIT_6_SWIZZLE_UNKNOWN;
+	u32 swizzle_x = I915_BIT_6_SWIZZLE_UNKNOWN;
+	u32 swizzle_y = I915_BIT_6_SWIZZLE_UNKNOWN;
 
 	if (INTEL_GEN(dev_priv) >= 8 || IS_VALLEYVIEW(dev_priv)) {
 		/*
@@ -578,7 +579,7 @@ i915_gem_detect_bit_6_swizzle(struct drm_i915_private *dev_priv)
 				swizzle_y = I915_BIT_6_SWIZZLE_NONE;
 			}
 		} else {
-			uint32_t dimm_c0, dimm_c1;
+			u32 dimm_c0, dimm_c1;
 			dimm_c0 = I915_READ(MAD_DIMM_C0);
 			dimm_c1 = I915_READ(MAD_DIMM_C1);
 			dimm_c0 &= MAD_DIMM_A_SIZE_MASK | MAD_DIMM_B_SIZE_MASK;
@@ -596,13 +597,13 @@ i915_gem_detect_bit_6_swizzle(struct drm_i915_private *dev_priv)
 				swizzle_y = I915_BIT_6_SWIZZLE_NONE;
 			}
 		}
-	} else if (IS_GEN5(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 5)) {
 		/* On Ironlake, whatever the DRAM config, the GPU always
 		 * does the same swizzling setup.
 		 */
 		swizzle_x = I915_BIT_6_SWIZZLE_9_10;
 		swizzle_y = I915_BIT_6_SWIZZLE_9;
-	} else if (IS_GEN2(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 2)) {
 		/* As far as we know, the 865 doesn't have these bit 6
 		 * swizzling issues.
 		 */
@@ -610,7 +611,7 @@ i915_gem_detect_bit_6_swizzle(struct drm_i915_private *dev_priv)
 		swizzle_y = I915_BIT_6_SWIZZLE_NONE;
 	} else if (IS_MOBILE(dev_priv) ||
 		   IS_I915G(dev_priv) || IS_I945G(dev_priv)) {
-		uint32_t dcc;
+		u32 dcc;
 
 		/* On 9xx chipsets, channel interleave by the CPU is
 		 * determined by DCC.  For single-channel, neither the CPU
@@ -647,7 +648,7 @@ i915_gem_detect_bit_6_swizzle(struct drm_i915_private *dev_priv)
 		}
 
 		/* check for L-shaped memory aka modified enhanced addressing */
-		if (IS_GEN4(dev_priv) &&
+		if (IS_GEN(dev_priv, 4) &&
 		    !(I915_READ(DCC2) & DCC2_MODIFIED_ENHANCED_DISABLE)) {
 			swizzle_x = I915_BIT_6_SWIZZLE_UNKNOWN;
 			swizzle_y = I915_BIT_6_SWIZZLE_UNKNOWN;
diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.h b/drivers/gpu/drm/i915/i915_gem_fence_reg.h
index 99a31ded4dfd..09dcaf14121b 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.h
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.h
@@ -50,4 +50,3 @@ struct drm_i915_fence_reg {
 };
 
 #endif
-
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index bd17dd1f5da5..d646d37eec2f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -33,11 +33,11 @@
 
 #include <asm/set_memory.h>
 
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
 
 #include "i915_drv.h"
 #include "i915_vgpu.h"
+#include "i915_reset.h"
 #include "i915_trace.h"
 #include "intel_drv.h"
 #include "intel_frontbuffer.h"
@@ -474,8 +474,7 @@ static void vm_free_page(struct i915_address_space *vm, struct page *page)
 	spin_unlock(&vm->free_pages.lock);
 }
 
-static void i915_address_space_init(struct i915_address_space *vm,
-				    struct drm_i915_private *dev_priv)
+static void i915_address_space_init(struct i915_address_space *vm, int subclass)
 {
 	/*
 	 * The vm->mutex must be reclaim safe (for use in the shrinker).
@@ -483,7 +482,8 @@ static void i915_address_space_init(struct i915_address_space *vm,
 	 * attempt holding the lock is immediately reported by lockdep.
 	 */
 	mutex_init(&vm->mutex);
-	i915_gem_shrinker_taints_mutex(&vm->mutex);
+	lockdep_set_subclass(&vm->mutex, subclass);
+	i915_gem_shrinker_taints_mutex(vm->i915, &vm->mutex);
 
 	GEM_BUG_ON(!vm->total);
 	drm_mm_init(&vm->mm, 0, vm->total);
@@ -491,9 +491,8 @@ static void i915_address_space_init(struct i915_address_space *vm,
 
 	stash_init(&vm->free_pages);
 
-	INIT_LIST_HEAD(&vm->active_list);
-	INIT_LIST_HEAD(&vm->inactive_list);
 	INIT_LIST_HEAD(&vm->unbound_list);
+	INIT_LIST_HEAD(&vm->bound_list);
 }
 
 static void i915_address_space_fini(struct i915_address_space *vm)
@@ -1423,8 +1422,6 @@ static int gen8_ppgtt_alloc_pdp(struct i915_address_space *vm,
 			gen8_initialize_pd(vm, pd);
 			gen8_ppgtt_set_pdpe(vm, pdp, pd, pdpe);
 			GEM_BUG_ON(pdp->used_pdpes > i915_pdpes_per_pdp(vm));
-
-			mark_tlbs_dirty(i915_vm_to_ppgtt(vm));
 		}
 
 		ret = gen8_ppgtt_alloc_pd(vm, pd, start, length);
@@ -1490,84 +1487,6 @@ unwind:
 	return -ENOMEM;
 }
 
-static void gen8_dump_pdp(struct i915_hw_ppgtt *ppgtt,
-			  struct i915_page_directory_pointer *pdp,
-			  u64 start, u64 length,
-			  gen8_pte_t scratch_pte,
-			  struct seq_file *m)
-{
-	struct i915_address_space *vm = &ppgtt->vm;
-	struct i915_page_directory *pd;
-	u32 pdpe;
-
-	gen8_for_each_pdpe(pd, pdp, start, length, pdpe) {
-		struct i915_page_table *pt;
-		u64 pd_len = length;
-		u64 pd_start = start;
-		u32 pde;
-
-		if (pdp->page_directory[pdpe] == ppgtt->vm.scratch_pd)
-			continue;
-
-		seq_printf(m, "\tPDPE #%d\n", pdpe);
-		gen8_for_each_pde(pt, pd, pd_start, pd_len, pde) {
-			u32 pte;
-			gen8_pte_t *pt_vaddr;
-
-			if (pd->page_table[pde] == ppgtt->vm.scratch_pt)
-				continue;
-
-			pt_vaddr = kmap_atomic_px(pt);
-			for (pte = 0; pte < GEN8_PTES; pte += 4) {
-				u64 va = (pdpe << GEN8_PDPE_SHIFT |
-					  pde << GEN8_PDE_SHIFT |
-					  pte << GEN8_PTE_SHIFT);
-				int i;
-				bool found = false;
-
-				for (i = 0; i < 4; i++)
-					if (pt_vaddr[pte + i] != scratch_pte)
-						found = true;
-				if (!found)
-					continue;
-
-				seq_printf(m, "\t\t0x%llx [%03d,%03d,%04d]: =", va, pdpe, pde, pte);
-				for (i = 0; i < 4; i++) {
-					if (pt_vaddr[pte + i] != scratch_pte)
-						seq_printf(m, " %llx", pt_vaddr[pte + i]);
-					else
-						seq_puts(m, "  SCRATCH ");
-				}
-				seq_puts(m, "\n");
-			}
-			kunmap_atomic(pt_vaddr);
-		}
-	}
-}
-
-static void gen8_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
-{
-	struct i915_address_space *vm = &ppgtt->vm;
-	const gen8_pte_t scratch_pte = vm->scratch_pte;
-	u64 start = 0, length = ppgtt->vm.total;
-
-	if (use_4lvl(vm)) {
-		u64 pml4e;
-		struct i915_pml4 *pml4 = &ppgtt->pml4;
-		struct i915_page_directory_pointer *pdp;
-
-		gen8_for_each_pml4e(pdp, pml4, start, length, pml4e) {
-			if (pml4->pdps[pml4e] == ppgtt->vm.scratch_pdp)
-				continue;
-
-			seq_printf(m, "    PML4E #%llu\n", pml4e);
-			gen8_dump_pdp(ppgtt, pdp, start, length, scratch_pte, m);
-		}
-	} else {
-		gen8_dump_pdp(ppgtt, &ppgtt->pdp, start, length, scratch_pte, m);
-	}
-}
-
 static int gen8_preallocate_top_level_pdp(struct i915_hw_ppgtt *ppgtt)
 {
 	struct i915_address_space *vm = &ppgtt->vm;
@@ -1628,7 +1547,7 @@ static struct i915_hw_ppgtt *gen8_ppgtt_create(struct drm_i915_private *i915)
 	/* From bdw, there is support for read-only pages in the PPGTT. */
 	ppgtt->vm.has_read_only = true;
 
-	i915_address_space_init(&ppgtt->vm, i915);
+	i915_address_space_init(&ppgtt->vm, VM_CLASS_PPGTT);
 
 	/* There are only a few exceptions for gen >= 6: chv and bxt.
 	 * And we are not sure about the latter, so play safe for now.
@@ -1672,7 +1591,6 @@ static struct i915_hw_ppgtt *gen8_ppgtt_create(struct drm_i915_private *i915)
 		gen8_ppgtt_notify_vgt(ppgtt, true);
 
 	ppgtt->vm.cleanup = gen8_ppgtt_cleanup;
-	ppgtt->debug_dump = gen8_dump_ppgtt;
 
 	ppgtt->vm.vma_ops.bind_vma    = ppgtt_bind_vma;
 	ppgtt->vm.vma_ops.unbind_vma  = ppgtt_unbind_vma;
@@ -1688,60 +1606,6 @@ err_free:
 	return ERR_PTR(err);
 }
 
-static void gen6_dump_ppgtt(struct i915_hw_ppgtt *base, struct seq_file *m)
-{
-	struct gen6_hw_ppgtt *ppgtt = to_gen6_ppgtt(base);
-	const gen6_pte_t scratch_pte = base->vm.scratch_pte;
-	struct i915_page_table *pt;
-	u32 pte, pde;
-
-	gen6_for_all_pdes(pt, &base->pd, pde) {
-		gen6_pte_t *vaddr;
-
-		if (pt == base->vm.scratch_pt)
-			continue;
-
-		if (i915_vma_is_bound(ppgtt->vma, I915_VMA_GLOBAL_BIND)) {
-			u32 expected =
-				GEN6_PDE_ADDR_ENCODE(px_dma(pt)) |
-				GEN6_PDE_VALID;
-			u32 pd_entry = readl(ppgtt->pd_addr + pde);
-
-			if (pd_entry != expected)
-				seq_printf(m,
-					   "\tPDE #%d mismatch: Actual PDE: %x Expected PDE: %x\n",
-					   pde,
-					   pd_entry,
-					   expected);
-
-			seq_printf(m, "\tPDE: %x\n", pd_entry);
-		}
-
-		vaddr = kmap_atomic_px(base->pd.page_table[pde]);
-		for (pte = 0; pte < GEN6_PTES; pte += 4) {
-			int i;
-
-			for (i = 0; i < 4; i++)
-				if (vaddr[pte + i] != scratch_pte)
-					break;
-			if (i == 4)
-				continue;
-
-			seq_printf(m, "\t\t(%03d, %04d) %08llx: ",
-				   pde, pte,
-				   (pde * GEN6_PTES + pte) * I915_GTT_PAGE_SIZE);
-			for (i = 0; i < 4; i++) {
-				if (vaddr[pte + i] != scratch_pte)
-					seq_printf(m, " %08x", vaddr[pte + i]);
-				else
-					seq_puts(m, "  SCRATCH");
-			}
-			seq_puts(m, "\n");
-		}
-		kunmap_atomic(vaddr);
-	}
-}
-
 /* Write pde (index) from the page directory @pd to the page table @pt */
 static inline void gen6_write_pde(const struct gen6_hw_ppgtt *ppgtt,
 				  const unsigned int pde,
@@ -2053,21 +1917,23 @@ static struct i915_vma *pd_vma_create(struct gen6_hw_ppgtt *ppgtt, int size)
 	if (!vma)
 		return ERR_PTR(-ENOMEM);
 
-	init_request_active(&vma->last_fence, NULL);
+	i915_active_init(i915, &vma->active, NULL);
+	INIT_ACTIVE_REQUEST(&vma->last_fence);
 
 	vma->vm = &ggtt->vm;
 	vma->ops = &pd_vma_ops;
 	vma->private = ppgtt;
 
-	vma->active = RB_ROOT;
-
 	vma->size = size;
 	vma->fence_size = size;
 	vma->flags = I915_VMA_GGTT;
 	vma->ggtt_view.type = I915_GGTT_VIEW_ROTATED; /* prevent fencing */
 
 	INIT_LIST_HEAD(&vma->obj_link);
+
+	mutex_lock(&vma->vm->mutex);
 	list_add(&vma->vm_link, &vma->vm->unbound_list);
+	mutex_unlock(&vma->vm->mutex);
 
 	return vma;
 }
@@ -2132,13 +1998,12 @@ static struct i915_hw_ppgtt *gen6_ppgtt_create(struct drm_i915_private *i915)
 
 	ppgtt->base.vm.total = I915_PDES * GEN6_PTES * I915_GTT_PAGE_SIZE;
 
-	i915_address_space_init(&ppgtt->base.vm, i915);
+	i915_address_space_init(&ppgtt->base.vm, VM_CLASS_PPGTT);
 
 	ppgtt->base.vm.allocate_va_range = gen6_alloc_va_range;
 	ppgtt->base.vm.clear_range = gen6_ppgtt_clear_range;
 	ppgtt->base.vm.insert_entries = gen6_ppgtt_insert_entries;
 	ppgtt->base.vm.cleanup = gen6_ppgtt_cleanup;
-	ppgtt->base.debug_dump = gen6_dump_ppgtt;
 
 	ppgtt->base.vm.vma_ops.bind_vma    = ppgtt_bind_vma;
 	ppgtt->base.vm.vma_ops.unbind_vma  = ppgtt_unbind_vma;
@@ -2204,9 +2069,9 @@ int i915_ppgtt_init_hw(struct drm_i915_private *dev_priv)
 {
 	gtt_write_workarounds(dev_priv);
 
-	if (IS_GEN6(dev_priv))
+	if (IS_GEN(dev_priv, 6))
 		gen6_ppgtt_enable(dev_priv);
-	else if (IS_GEN7(dev_priv))
+	else if (IS_GEN(dev_priv, 7))
 		gen7_ppgtt_enable(dev_priv);
 
 	return 0;
@@ -2247,8 +2112,7 @@ void i915_ppgtt_close(struct i915_address_space *vm)
 static void ppgtt_destroy_vma(struct i915_address_space *vm)
 {
 	struct list_head *phases[] = {
-		&vm->active_list,
-		&vm->inactive_list,
+		&vm->bound_list,
 		&vm->unbound_list,
 		NULL,
 	}, **phase;
@@ -2271,8 +2135,7 @@ void i915_ppgtt_release(struct kref *kref)
 
 	ppgtt_destroy_vma(&ppgtt->vm);
 
-	GEM_BUG_ON(!list_empty(&ppgtt->vm.active_list));
-	GEM_BUG_ON(!list_empty(&ppgtt->vm.inactive_list));
+	GEM_BUG_ON(!list_empty(&ppgtt->vm.bound_list));
 	GEM_BUG_ON(!list_empty(&ppgtt->vm.unbound_list));
 
 	ppgtt->vm.cleanup(&ppgtt->vm);
@@ -2288,7 +2151,7 @@ static bool needs_idle_maps(struct drm_i915_private *dev_priv)
 	/* Query intel_iommu to see if we need the workaround. Presumably that
 	 * was loaded first.
 	 */
-	return IS_GEN5(dev_priv) && IS_MOBILE(dev_priv) && intel_vtd_active();
+	return IS_GEN(dev_priv, 5) && IS_MOBILE(dev_priv) && intel_vtd_active();
 }
 
 static void gen6_check_faults(struct drm_i915_private *dev_priv)
@@ -2381,7 +2244,8 @@ int i915_gem_gtt_prepare_pages(struct drm_i915_gem_object *obj,
 				     DMA_ATTR_NO_WARN))
 			return 0;
 
-		/* If the DMA remap fails, one cause can be that we have
+		/*
+		 * If the DMA remap fails, one cause can be that we have
 		 * too many objects pinned in a small remapping table,
 		 * such as swiotlb. Incrementally purge all other objects and
 		 * try again - if there are no more pages to remove from
@@ -2391,8 +2255,7 @@ int i915_gem_gtt_prepare_pages(struct drm_i915_gem_object *obj,
 	} while (i915_gem_shrink(to_i915(obj->base.dev),
 				 obj->base.size >> PAGE_SHIFT, NULL,
 				 I915_SHRINK_BOUND |
-				 I915_SHRINK_UNBOUND |
-				 I915_SHRINK_ACTIVE));
+				 I915_SHRINK_UNBOUND));
 
 	return -ENOSPC;
 }
@@ -2664,6 +2527,7 @@ static int ggtt_bind_vma(struct i915_vma *vma,
 {
 	struct drm_i915_private *i915 = vma->vm->i915;
 	struct drm_i915_gem_object *obj = vma->obj;
+	intel_wakeref_t wakeref;
 	u32 pte_flags;
 
 	/* Applicable to VLV (gen8+ do not support RO in the GGTT) */
@@ -2671,9 +2535,8 @@ static int ggtt_bind_vma(struct i915_vma *vma,
 	if (i915_gem_object_is_readonly(obj))
 		pte_flags |= PTE_READ_ONLY;
 
-	intel_runtime_pm_get(i915);
-	vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
-	intel_runtime_pm_put(i915);
+	with_intel_runtime_pm(i915, wakeref)
+		vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
 
 	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 
@@ -2690,10 +2553,10 @@ static int ggtt_bind_vma(struct i915_vma *vma,
 static void ggtt_unbind_vma(struct i915_vma *vma)
 {
 	struct drm_i915_private *i915 = vma->vm->i915;
+	intel_wakeref_t wakeref;
 
-	intel_runtime_pm_get(i915);
-	vma->vm->clear_range(vma->vm, vma->node.start, vma->size);
-	intel_runtime_pm_put(i915);
+	with_intel_runtime_pm(i915, wakeref)
+		vma->vm->clear_range(vma->vm, vma->node.start, vma->size);
 }
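
The with_intel_runtime_pm() helper seen here wraps the get/put pair around
a single statement or block. Its definition is, to the best of our
recollection of this era's intel_drv.h, a for-loop that takes the wakeref
on entry and releases it on exit; verify against the tree before relying
on the exact form:

	#define with_intel_runtime_pm(i915, wf) \
		for ((wf) = intel_runtime_pm_get(i915); (wf); \
		     intel_runtime_pm_put((i915), (wf)), (wf) = 0)
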
 
 static int aliasing_gtt_bind_vma(struct i915_vma *vma,
@@ -2725,9 +2588,12 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
 	}
 
 	if (flags & I915_VMA_GLOBAL_BIND) {
-		intel_runtime_pm_get(i915);
-		vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
-		intel_runtime_pm_put(i915);
+		intel_wakeref_t wakeref;
+
+		with_intel_runtime_pm(i915, wakeref) {
+			vma->vm->insert_entries(vma->vm, vma,
+						cache_level, pte_flags);
+		}
 	}
 
 	return 0;
@@ -2738,9 +2604,11 @@ static void aliasing_gtt_unbind_vma(struct i915_vma *vma)
 	struct drm_i915_private *i915 = vma->vm->i915;
 
 	if (vma->flags & I915_VMA_GLOBAL_BIND) {
-		intel_runtime_pm_get(i915);
-		vma->vm->clear_range(vma->vm, vma->node.start, vma->size);
-		intel_runtime_pm_put(i915);
+		struct i915_address_space *vm = vma->vm;
+		intel_wakeref_t wakeref;
+
+		with_intel_runtime_pm(i915, wakeref)
+			vm->clear_range(vm, vma->node.start, vma->size);
 	}
 
 	if (vma->flags & I915_VMA_LOCAL_BIND) {
@@ -2932,8 +2800,7 @@ void i915_ggtt_cleanup_hw(struct drm_i915_private *dev_priv)
 	mutex_lock(&dev_priv->drm.struct_mutex);
 	i915_gem_fini_aliasing_ppgtt(dev_priv);
 
-	GEM_BUG_ON(!list_empty(&ggtt->vm.active_list));
-	list_for_each_entry_safe(vma, vn, &ggtt->vm.inactive_list, vm_link)
+	list_for_each_entry_safe(vma, vn, &ggtt->vm.bound_list, vm_link)
 		WARN_ON(i915_vma_unbind(vma));
 
 	if (drm_mm_node_allocated(&ggtt->error_capture))
@@ -3364,7 +3231,8 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
 	ggtt->vm.insert_entries = gen8_ggtt_insert_entries;
 
 	/* Serialize GTT updates with aperture access on BXT if VT-d is on. */
-	if (intel_ggtt_update_needs_vtd_wa(dev_priv)) {
+	if (intel_ggtt_update_needs_vtd_wa(dev_priv) ||
+	    IS_CHERRYVIEW(dev_priv) /* fails with concurrent use/update */) {
 		ggtt->vm.insert_entries = bxt_vtd_ggtt_insert_entries__BKL;
 		ggtt->vm.insert_page    = bxt_vtd_ggtt_insert_page__BKL;
 		if (ggtt->vm.clear_range != nop_clear_range)
@@ -3565,7 +3433,7 @@ int i915_ggtt_init_hw(struct drm_i915_private *dev_priv)
 	 * and beyond the end of the GTT if we do not provide a guard.
 	 */
 	mutex_lock(&dev_priv->drm.struct_mutex);
-	i915_address_space_init(&ggtt->vm, dev_priv);
+	i915_address_space_init(&ggtt->vm, VM_CLASS_GGTT);
 
 	ggtt->vm.is_ggtt = true;
 
@@ -3638,32 +3506,39 @@ void i915_gem_restore_gtt_mappings(struct drm_i915_private *dev_priv)
 
 	i915_check_and_clear_faults(dev_priv);
 
+	mutex_lock(&ggtt->vm.mutex);
+
 	/* First fill our portion of the GTT with scratch pages */
 	ggtt->vm.clear_range(&ggtt->vm, 0, ggtt->vm.total);
-
 	ggtt->vm.closed = true; /* skip rewriting PTE on VMA unbind */
 
 	/* clflush objects bound into the GGTT and rebind them. */
-	GEM_BUG_ON(!list_empty(&ggtt->vm.active_list));
-	list_for_each_entry_safe(vma, vn, &ggtt->vm.inactive_list, vm_link) {
+	list_for_each_entry_safe(vma, vn, &ggtt->vm.bound_list, vm_link) {
 		struct drm_i915_gem_object *obj = vma->obj;
 
 		if (!(vma->flags & I915_VMA_GLOBAL_BIND))
 			continue;
 
+		mutex_unlock(&ggtt->vm.mutex);
+
 		if (!i915_vma_unbind(vma))
-			continue;
+			goto lock;
 
 		WARN_ON(i915_vma_bind(vma,
 				      obj ? obj->cache_level : 0,
 				      PIN_UPDATE));
 		if (obj)
 			WARN_ON(i915_gem_object_set_to_gtt_domain(obj, false));
+
+lock:
+		mutex_lock(&ggtt->vm.mutex);
 	}
 
 	ggtt->vm.closed = false;
 	i915_ggtt_invalidate(dev_priv);
 
+	mutex_unlock(&ggtt->vm.mutex);
+
 	if (INTEL_GEN(dev_priv) >= 8) {
 		struct intel_ppat *ppat = &dev_priv->ppat;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 4874da09a3c4..03ade71b8d9a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -39,6 +39,7 @@
 #include <linux/pagevec.h>
 
 #include "i915_request.h"
+#include "i915_reset.h"
 #include "i915_selftest.h"
 #include "i915_timeline.h"
 
@@ -288,6 +289,8 @@ struct i915_address_space {
 	bool closed;
 
 	struct mutex mutex; /* protects vma and our lists */
+#define VM_CLASS_GGTT 0
+#define VM_CLASS_PPGTT 1
 
 	u64 scratch_pte;
 	struct i915_page_dma scratch_page;
@@ -296,32 +299,12 @@ struct i915_address_space {
 	struct i915_page_directory_pointer *scratch_pdp; /* GEN8+ & 48b PPGTT */
 
 	/**
-	 * List of objects currently involved in rendering.
-	 *
-	 * Includes buffers having the contents of their GPU caches
-	 * flushed, not necessarily primitives. last_read_req
-	 * represents when the rendering involved will be completed.
-	 *
-	 * A reference is held on the buffer while on this list.
+	 * List of vma currently bound.
 	 */
-	struct list_head active_list;
+	struct list_head bound_list;
 
 	/**
-	 * LRU list of objects which are not in the ringbuffer and
-	 * are ready to unbind, but are still in the GTT.
-	 *
-	 * last_read_req is NULL while an object is in this list.
-	 *
-	 * A reference is not held on the buffer while on this list,
-	 * as merely being GTT-bound shouldn't prevent its being
-	 * freed, and we'll pull it off the list in the free path.
-	 */
-	struct list_head inactive_list;
-
-	/**
-	 * List of vma that have been unbound.
-	 *
-	 * A reference is not held on the buffer while on this list.
+	 * List of vma that are not bound.
 	 */
 	struct list_head unbound_list;
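
With the MRU active/inactive split gone, binding and unbinding reduce to a
single list move under vm->mutex. A sketch of the new discipline,
mirroring the list_move_tail() call sites elsewhere in this series:

	mutex_lock(&vm->mutex);
	list_move_tail(&vma->vm_link, &vm->bound_list);	/* on bind */
	/* ...or back to &vm->unbound_list on unbind */
	mutex_unlock(&vm->mutex);
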
 
@@ -413,8 +396,6 @@ struct i915_hw_ppgtt {
 		struct i915_page_directory_pointer pdp;	/* GEN8+ */
 		struct i915_page_directory pd;		/* GEN6-7 */
 	};
-
-	void (*debug_dump)(struct i915_hw_ppgtt *ppgtt, struct seq_file *m);
 };
 
 struct gen6_hw_ppgtt {
@@ -661,19 +642,19 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 
 /* Flags used by pin/bind&friends. */
 #define PIN_NONBLOCK		BIT_ULL(0)
-#define PIN_MAPPABLE		BIT_ULL(1)
-#define PIN_ZONE_4G		BIT_ULL(2)
-#define PIN_NONFAULT		BIT_ULL(3)
-#define PIN_NOEVICT		BIT_ULL(4)
-
-#define PIN_MBZ			BIT_ULL(5) /* I915_VMA_PIN_OVERFLOW */
-#define PIN_GLOBAL		BIT_ULL(6) /* I915_VMA_GLOBAL_BIND */
-#define PIN_USER		BIT_ULL(7) /* I915_VMA_LOCAL_BIND */
-#define PIN_UPDATE		BIT_ULL(8)
-
-#define PIN_HIGH		BIT_ULL(9)
-#define PIN_OFFSET_BIAS		BIT_ULL(10)
-#define PIN_OFFSET_FIXED	BIT_ULL(11)
+#define PIN_NONFAULT		BIT_ULL(1)
+#define PIN_NOEVICT		BIT_ULL(2)
+#define PIN_MAPPABLE		BIT_ULL(3)
+#define PIN_ZONE_4G		BIT_ULL(4)
+#define PIN_HIGH		BIT_ULL(5)
+#define PIN_OFFSET_BIAS		BIT_ULL(6)
+#define PIN_OFFSET_FIXED	BIT_ULL(7)
+
+#define PIN_MBZ			BIT_ULL(8) /* I915_VMA_PIN_OVERFLOW */
+#define PIN_GLOBAL		BIT_ULL(9) /* I915_VMA_GLOBAL_BIND */
+#define PIN_USER		BIT_ULL(10) /* I915_VMA_LOCAL_BIND */
+#define PIN_UPDATE		BIT_ULL(11)
+
 #define PIN_OFFSET_MASK		(-I915_GTT_PAGE_SIZE)
 
 #endif
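
The reshuffle groups the pure allocation hints into the low bits and keeps
the flags that must mirror vma state in the bit positions named by the
comments above. A compile-time check of that correspondence might look
like the sketch below (illustrative; the I915_VMA_* names are taken from
the comments in this hunk):

	static inline void assert_pin_flags_match_vma_flags(void)
	{
		BUILD_BUG_ON(PIN_MBZ != I915_VMA_PIN_OVERFLOW);
		BUILD_BUG_ON(PIN_GLOBAL != I915_VMA_GLOBAL_BIND);
		BUILD_BUG_ON(PIN_USER != I915_VMA_LOCAL_BIND);
	}
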
diff --git a/drivers/gpu/drm/i915/i915_gem_internal.c b/drivers/gpu/drm/i915/i915_gem_internal.c
index 0d0144b2104c..fddde1033e74 100644
--- a/drivers/gpu/drm/i915/i915_gem_internal.c
+++ b/drivers/gpu/drm/i915/i915_gem_internal.c
@@ -22,7 +22,6 @@
  *
  */
 
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index a6dd7c46de0d..fab040331cdb 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -29,7 +29,8 @@
 
 #include <drm/drm_vma_manager.h>
 #include <drm/drm_gem.h>
-#include <drm/drmP.h>
+#include <drm/drm_file.h>
+#include <drm/drm_device.h>
 
 #include <drm/i915_drm.h>
 
@@ -56,6 +57,7 @@ struct drm_i915_gem_object_ops {
 #define I915_GEM_OBJECT_HAS_STRUCT_PAGE	BIT(0)
 #define I915_GEM_OBJECT_IS_SHRINKABLE	BIT(1)
 #define I915_GEM_OBJECT_IS_PROXY	BIT(2)
+#define I915_GEM_OBJECT_ASYNC_CANCEL	BIT(3)
 
 	/* Interface between the GEM object and its backing storage.
 	 * get_pages() is called once prior to the use of the associated set
@@ -85,24 +87,33 @@ struct drm_i915_gem_object {
 
 	const struct drm_i915_gem_object_ops *ops;
 
-	/**
-	 * @vma_list: List of VMAs backed by this object
-	 *
-	 * The VMA on this list are ordered by type, all GGTT vma are placed
-	 * at the head and all ppGTT vma are placed at the tail. The different
-	 * types of GGTT vma are unordered between themselves, use the
-	 * @vma_tree (which has a defined order between all VMA) to find an
-	 * exact match.
-	 */
-	struct list_head vma_list;
-	/**
-	 * @vma_tree: Ordered tree of VMAs backed by this object
-	 *
-	 * All VMA created for this object are placed in the @vma_tree for
-	 * fast retrieval via a binary search in i915_vma_instance().
-	 * They are also added to @vma_list for easy iteration.
-	 */
-	struct rb_root vma_tree;
+	struct {
+		/**
+		 * @vma.lock: protect the list/tree of vmas
+		 */
+		spinlock_t lock;
+
+		/**
+		 * @vma.list: List of VMAs backed by this object
+		 *
+		 * The VMA on this list are ordered by type, all GGTT vma are
+		 * placed at the head and all ppGTT vma are placed at the tail.
+		 * The different types of GGTT vma are unordered between
+		 * themselves, use the @vma.tree (which has a defined order
+		 * between all VMA) to quickly find an exact match.
+		 */
+		struct list_head list;
+
+		/**
+		 * @vma.tree: Ordered tree of VMAs backed by this object
+		 *
+		 * All VMA created for this object are placed in the @vma.tree
+		 * for fast retrieval via a binary search in
+		 * i915_vma_instance(). They are also added to @vma.list for
+		 * easy iteration.
+		 */
+		struct rb_root tree;
+	} vma;
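
Grouping the list, the tree and a dedicated spinlock into obj->vma makes
the locking requirement explicit at every use. A sketch of walking the
per-object vma list under the new lock (field names as in the struct
above; the loop body is illustrative):

	struct i915_vma *vma;

	spin_lock(&obj->vma.lock);
	list_for_each_entry(vma, &obj->vma.list, obj_link) {
		/* GGTT vma sit at the head, ppGTT vma at the tail */
	}
	spin_unlock(&obj->vma.lock);
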
 
 	/**
 	 * @lut_list: List of vma lookup entries in use for this object.
@@ -164,7 +175,7 @@ struct drm_i915_gem_object {
 
 	atomic_t frontbuffer_bits;
 	unsigned int frontbuffer_ggtt_origin; /* write once */
-	struct i915_gem_active frontbuffer_write;
+	struct i915_active_request frontbuffer_write;
 
 	/** Current tiling stride for the object, if it's tiled. */
 	unsigned int tiling_and_stride;
@@ -387,6 +398,12 @@ i915_gem_object_is_proxy(const struct drm_i915_gem_object *obj)
 }
 
 static inline bool
+i915_gem_object_needs_async_cancel(const struct drm_i915_gem_object *obj)
+{
+	return obj->ops->flags & I915_GEM_OBJECT_ASYNC_CANCEL;
+}
+
+static inline bool
 i915_gem_object_is_active(const struct drm_i915_gem_object *obj)
 {
 	return obj->active_count;
diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index ea90d3a0d511..6da795c7e62e 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -30,30 +30,27 @@
 #include <linux/pci.h>
 #include <linux/dma-buf.h>
 #include <linux/vmalloc.h>
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
 
 #include "i915_drv.h"
 #include "i915_trace.h"
 
-static bool shrinker_lock(struct drm_i915_private *i915, bool *unlock)
+static bool shrinker_lock(struct drm_i915_private *i915,
+			  unsigned int flags,
+			  bool *unlock)
 {
-	switch (mutex_trylock_recursive(&i915->drm.struct_mutex)) {
+	struct mutex *m = &i915->drm.struct_mutex;
+
+	switch (mutex_trylock_recursive(m)) {
 	case MUTEX_TRYLOCK_RECURSIVE:
 		*unlock = false;
 		return true;
 
 	case MUTEX_TRYLOCK_FAILED:
 		*unlock = false;
-		preempt_disable();
-		do {
-			cpu_relax();
-			if (mutex_trylock(&i915->drm.struct_mutex)) {
-				*unlock = true;
-				break;
-			}
-		} while (!need_resched());
-		preempt_enable();
+		if (flags & I915_SHRINK_ACTIVE &&
+		    mutex_lock_killable_nested(m, I915_MM_SHRINKER) == 0)
+			*unlock = true;
 		return *unlock;
 
 	case MUTEX_TRYLOCK_SUCCESS:
@@ -156,11 +153,12 @@ i915_gem_shrink(struct drm_i915_private *i915,
 		{ &i915->mm.bound_list, I915_SHRINK_BOUND },
 		{ NULL, 0 },
 	}, *phase;
+	intel_wakeref_t wakeref = 0;
 	unsigned long count = 0;
 	unsigned long scanned = 0;
 	bool unlock;
 
-	if (!shrinker_lock(i915, &unlock))
+	if (!shrinker_lock(i915, flags, &unlock))
 		return 0;
 
 	/*
@@ -185,9 +183,11 @@ i915_gem_shrink(struct drm_i915_private *i915,
 	 * device just to recover a little memory. If absolutely necessary,
 	 * we will force the wake during oom-notifier.
 	 */
-	if ((flags & I915_SHRINK_BOUND) &&
-	    !intel_runtime_pm_get_if_in_use(i915))
-		flags &= ~I915_SHRINK_BOUND;
+	if (flags & I915_SHRINK_BOUND) {
+		wakeref = intel_runtime_pm_get_if_in_use(i915);
+		if (!wakeref)
+			flags &= ~I915_SHRINK_BOUND;
+	}
 
 	/*
 	 * As we may completely rewrite the (un)bound list whilst unbinding
@@ -268,7 +268,7 @@ i915_gem_shrink(struct drm_i915_private *i915,
 	}
 
 	if (flags & I915_SHRINK_BOUND)
-		intel_runtime_pm_put(i915);
+		intel_runtime_pm_put(i915, wakeref);
 
 	i915_retire_requests(i915);
 
@@ -295,14 +295,15 @@ i915_gem_shrink(struct drm_i915_private *i915,
  */
 unsigned long i915_gem_shrink_all(struct drm_i915_private *i915)
 {
-	unsigned long freed;
-
-	intel_runtime_pm_get(i915);
-	freed = i915_gem_shrink(i915, -1UL, NULL,
-				I915_SHRINK_BOUND |
-				I915_SHRINK_UNBOUND |
-				I915_SHRINK_ACTIVE);
-	intel_runtime_pm_put(i915);
+	intel_wakeref_t wakeref;
+	unsigned long freed = 0;
+
+	with_intel_runtime_pm(i915, wakeref) {
+		freed = i915_gem_shrink(i915, -1UL, NULL,
+					I915_SHRINK_BOUND |
+					I915_SHRINK_UNBOUND |
+					I915_SHRINK_ACTIVE);
+	}
 
 	return freed;
 }
@@ -357,7 +358,7 @@ i915_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 
 	sc->nr_scanned = 0;
 
-	if (!shrinker_lock(i915, &unlock))
+	if (!shrinker_lock(i915, 0, &unlock))
 		return SHRINK_STOP;
 
 	freed = i915_gem_shrink(i915,
@@ -373,14 +374,16 @@ i915_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 					 I915_SHRINK_BOUND |
 					 I915_SHRINK_UNBOUND);
 	if (sc->nr_scanned < sc->nr_to_scan && current_is_kswapd()) {
-		intel_runtime_pm_get(i915);
-		freed += i915_gem_shrink(i915,
-					 sc->nr_to_scan - sc->nr_scanned,
-					 &sc->nr_scanned,
-					 I915_SHRINK_ACTIVE |
-					 I915_SHRINK_BOUND |
-					 I915_SHRINK_UNBOUND);
-		intel_runtime_pm_put(i915);
+		intel_wakeref_t wakeref;
+
+		with_intel_runtime_pm(i915, wakeref) {
+			freed += i915_gem_shrink(i915,
+						 sc->nr_to_scan - sc->nr_scanned,
+						 &sc->nr_scanned,
+						 I915_SHRINK_ACTIVE |
+						 I915_SHRINK_BOUND |
+						 I915_SHRINK_UNBOUND);
+		}
 	}
 
 	shrinker_unlock(i915, unlock);
@@ -388,31 +391,6 @@ i915_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 	return sc->nr_scanned ? freed : SHRINK_STOP;
 }
 
-static bool
-shrinker_lock_uninterruptible(struct drm_i915_private *i915, bool *unlock,
-			      int timeout_ms)
-{
-	unsigned long timeout = jiffies + msecs_to_jiffies_timeout(timeout_ms);
-
-	do {
-		if (i915_gem_wait_for_idle(i915,
-					   0, MAX_SCHEDULE_TIMEOUT) == 0 &&
-		    shrinker_lock(i915, unlock))
-			break;
-
-		schedule_timeout_killable(1);
-		if (fatal_signal_pending(current))
-			return false;
-
-		if (time_after(jiffies, timeout)) {
-			pr_err("Unable to lock GPU to purge memory.\n");
-			return false;
-		}
-	} while (1);
-
-	return true;
-}
-
 static int
 i915_gem_shrinker_oom(struct notifier_block *nb, unsigned long event, void *ptr)
 {
@@ -420,8 +398,13 @@ i915_gem_shrinker_oom(struct notifier_block *nb, unsigned long event, void *ptr)
 		container_of(nb, struct drm_i915_private, mm.oom_notifier);
 	struct drm_i915_gem_object *obj;
 	unsigned long unevictable, bound, unbound, freed_pages;
+	intel_wakeref_t wakeref;
 
-	freed_pages = i915_gem_shrink_all(i915);
+	freed_pages = 0;
+	with_intel_runtime_pm(i915, wakeref)
+		freed_pages += i915_gem_shrink(i915, -1UL, NULL,
+					       I915_SHRINK_BOUND |
+					       I915_SHRINK_UNBOUND);
 
 	/* Because we may be allocating inside our own driver, we cannot
 	 * assert that there are no objects with pinned pages that are not
@@ -447,10 +430,6 @@ i915_gem_shrinker_oom(struct notifier_block *nb, unsigned long event, void *ptr)
 		pr_info("Purging GPU memory, %lu pages freed, "
 			"%lu pages still pinned.\n",
 			freed_pages, unevictable);
-	if (unbound || bound)
-		pr_err("%lu and %lu pages still available in the "
-		       "bound and unbound GPU page lists.\n",
-		       bound, unbound);
 
 	*(unsigned long *)ptr += freed_pages;
 	return NOTIFY_DONE;
@@ -463,34 +442,39 @@ i915_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr
 		container_of(nb, struct drm_i915_private, mm.vmap_notifier);
 	struct i915_vma *vma, *next;
 	unsigned long freed_pages = 0;
+	intel_wakeref_t wakeref;
 	bool unlock;
-	int ret;
 
-	if (!shrinker_lock_uninterruptible(i915, &unlock, 5000))
+	if (!shrinker_lock(i915, 0, &unlock))
 		return NOTIFY_DONE;
 
 	/* Force everything onto the inactive lists */
-	ret = i915_gem_wait_for_idle(i915,
-				     I915_WAIT_LOCKED,
-				     MAX_SCHEDULE_TIMEOUT);
-	if (ret)
+	if (i915_gem_wait_for_idle(i915,
+				   I915_WAIT_LOCKED,
+				   MAX_SCHEDULE_TIMEOUT))
 		goto out;
 
-	intel_runtime_pm_get(i915);
-	freed_pages += i915_gem_shrink(i915, -1UL, NULL,
-				       I915_SHRINK_BOUND |
-				       I915_SHRINK_UNBOUND |
-				       I915_SHRINK_ACTIVE |
-				       I915_SHRINK_VMAPS);
-	intel_runtime_pm_put(i915);
+	with_intel_runtime_pm(i915, wakeref)
+		freed_pages += i915_gem_shrink(i915, -1UL, NULL,
+					       I915_SHRINK_BOUND |
+					       I915_SHRINK_UNBOUND |
+					       I915_SHRINK_VMAPS);
 
 	/* We also want to clear any cached iomaps as they wrap vmap */
+	mutex_lock(&i915->ggtt.vm.mutex);
 	list_for_each_entry_safe(vma, next,
-				 &i915->ggtt.vm.inactive_list, vm_link) {
+				 &i915->ggtt.vm.bound_list, vm_link) {
 		unsigned long count = vma->node.size >> PAGE_SHIFT;
-		if (vma->iomap && i915_vma_unbind(vma) == 0)
+
+		if (!vma->iomap || i915_vma_is_active(vma))
+			continue;
+
+		mutex_unlock(&i915->ggtt.vm.mutex);
+		if (i915_vma_unbind(vma) == 0)
 			freed_pages += count;
+		mutex_lock(&i915->ggtt.vm.mutex);
 	}
+	mutex_unlock(&i915->ggtt.vm.mutex);
 
 out:
 	shrinker_unlock(i915, unlock);
@@ -533,13 +517,40 @@ void i915_gem_shrinker_unregister(struct drm_i915_private *i915)
 	unregister_shrinker(&i915->mm.shrinker);
 }
 
-void i915_gem_shrinker_taints_mutex(struct mutex *mutex)
+void i915_gem_shrinker_taints_mutex(struct drm_i915_private *i915,
+				    struct mutex *mutex)
 {
+	bool unlock = false;
+
 	if (!IS_ENABLED(CONFIG_LOCKDEP))
 		return;
 
+	if (!lockdep_is_held_type(&i915->drm.struct_mutex, -1)) {
+		mutex_acquire(&i915->drm.struct_mutex.dep_map,
+			      I915_MM_NORMAL, 0, _RET_IP_);
+		unlock = true;
+	}
+
 	fs_reclaim_acquire(GFP_KERNEL);
-	mutex_lock(mutex);
-	mutex_unlock(mutex);
+
+	/*
+	 * As we invariably rely on the struct_mutex within the shrinker,
+	 * but have a complicated recursion dance, taint all the mutexes used
+	 * within the shrinker with the struct_mutex. For completeness, we
+	 * taint with all subclasses of struct_mutex, even though we should
+	 * only need tainting by I915_MM_NORMAL to catch possible ABBA
+	 * deadlocks from using struct_mutex inside @mutex.
+	 */
+	mutex_acquire(&i915->drm.struct_mutex.dep_map,
+		      I915_MM_SHRINKER, 0, _RET_IP_);
+
+	mutex_acquire(&mutex->dep_map, 0, 0, _RET_IP_);
+	mutex_release(&mutex->dep_map, 0, _RET_IP_);
+
+	mutex_release(&i915->drm.struct_mutex.dep_map, 0, _RET_IP_);
+
 	fs_reclaim_release(GFP_KERNEL);
+
+	if (unlock)
+		mutex_release(&i915->drm.struct_mutex.dep_map, 0, _RET_IP_);
 }
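
Nothing in the function above ever blocks: mutex_acquire() and
mutex_release() only feed lockdep's dependency graph. Stripped of the
conditionals, the core of the taint is a fictitious nesting recorded once
at init time:

	/* record: fs_reclaim -> struct_mutex/I915_MM_SHRINKER -> mutex */
	fs_reclaim_acquire(GFP_KERNEL);
	mutex_acquire(&i915->drm.struct_mutex.dep_map,
		      I915_MM_SHRINKER, 0, _RET_IP_);
	mutex_acquire(&mutex->dep_map, 0, 0, _RET_IP_);
	mutex_release(&mutex->dep_map, 0, _RET_IP_);
	mutex_release(&i915->drm.struct_mutex.dep_map, 0, _RET_IP_);
	fs_reclaim_release(GFP_KERNEL);
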
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index f29a7ff7c362..74a9661479ca 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -26,7 +26,6 @@
  *
  */
 
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 
@@ -102,7 +101,7 @@ static int i915_adjust_stolen(struct drm_i915_private *dev_priv,
 		resource_size_t ggtt_start;
 
 		ggtt_start = I915_READ(PGTBL_CTL);
-		if (IS_GEN4(dev_priv))
+		if (IS_GEN(dev_priv, 4))
 			ggtt_start = (ggtt_start & PGTBL_ADDRESS_LO_MASK) |
 				     (ggtt_start & PGTBL_ADDRESS_HI_MASK) << 28;
 		else
@@ -156,7 +155,7 @@ static int i915_adjust_stolen(struct drm_i915_private *dev_priv,
 		 * GEN3 firmware likes to smash pci bridges into the stolen
 		 * range. Apparently this works.
 		 */
-		if (r == NULL && !IS_GEN3(dev_priv)) {
+		if (r == NULL && !IS_GEN(dev_priv, 3)) {
 			DRM_ERROR("conflict detected with stolen region: %pR\n",
 				  dsm);
 
@@ -194,7 +193,8 @@ static void g4x_get_stolen_reserved(struct drm_i915_private *dev_priv,
 	 * Whether ILK really reuses the ELK register for this is unclear.
 	 * Let's see if we catch anyone with this supposedly enabled on ILK.
 	 */
-	WARN(IS_GEN5(dev_priv), "ILK stolen reserved found? 0x%08x\n", reg_val);
+	WARN(IS_GEN(dev_priv, 5), "ILK stolen reserved found? 0x%08x\n",
+	     reg_val);
 
 	if (!(reg_val & G4X_STOLEN_RESERVED_ADDR2_MASK))
 		return;
@@ -701,7 +701,10 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *dev_priv
 	vma->pages = obj->mm.pages;
 	vma->flags |= I915_VMA_GLOBAL_BIND;
 	__i915_vma_set_map_and_fenceable(vma);
-	list_move_tail(&vma->vm_link, &ggtt->vm.inactive_list);
+
+	mutex_lock(&ggtt->vm.mutex);
+	list_move_tail(&vma->vm_link, &ggtt->vm.bound_list);
+	mutex_unlock(&ggtt->vm.mutex);
 
 	spin_lock(&dev_priv->mm.obj_lock);
 	list_move_tail(&obj->mm.link, &dev_priv->mm.bound_list);
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index d9dc9df523b5..16cc9ddbce34 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -27,7 +27,6 @@
 
 #include <linux/string.h>
 #include <linux/bitops.h>
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 
@@ -87,7 +86,7 @@ u32 i915_gem_fence_size(struct drm_i915_private *i915,
 	}
 
 	/* Previous chips need a power-of-two fence region when tiling */
-	if (IS_GEN3(i915))
+	if (IS_GEN(i915, 3))
 		ggtt_size = 1024*1024;
 	else
 		ggtt_size = 512*1024;
@@ -162,7 +161,7 @@ i915_tiling_ok(struct drm_i915_gem_object *obj,
 			return false;
 	}
 
-	if (IS_GEN2(i915) ||
+	if (IS_GEN(i915, 2) ||
 	    (tiling == I915_TILING_Y && HAS_128_BYTE_Y_TILING(i915)))
 		tile_width = 128;
 	else
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 9558582c105e..1d3f9a31ad61 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -22,7 +22,6 @@
  *
  */
 
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 #include "i915_trace.h"
@@ -50,77 +49,67 @@ struct i915_mmu_notifier {
 	struct hlist_node node;
 	struct mmu_notifier mn;
 	struct rb_root_cached objects;
-	struct workqueue_struct *wq;
+	struct i915_mm_struct *mm;
 };
 
 struct i915_mmu_object {
 	struct i915_mmu_notifier *mn;
 	struct drm_i915_gem_object *obj;
 	struct interval_tree_node it;
-	struct list_head link;
-	struct work_struct work;
-	bool attached;
 };
 
-static void cancel_userptr(struct work_struct *work)
+static void add_object(struct i915_mmu_object *mo)
 {
-	struct i915_mmu_object *mo = container_of(work, typeof(*mo), work);
-	struct drm_i915_gem_object *obj = mo->obj;
-	struct work_struct *active;
-
-	/* Cancel any active worker and force us to re-evaluate gup */
-	mutex_lock(&obj->mm.lock);
-	active = fetch_and_zero(&obj->userptr.work);
-	mutex_unlock(&obj->mm.lock);
-	if (active)
-		goto out;
-
-	i915_gem_object_wait(obj, I915_WAIT_ALL, MAX_SCHEDULE_TIMEOUT, NULL);
-
-	mutex_lock(&obj->base.dev->struct_mutex);
-
-	/* We are inside a kthread context and can't be interrupted */
-	if (i915_gem_object_unbind(obj) == 0)
-		__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
-	WARN_ONCE(i915_gem_object_has_pages(obj),
-		  "Failed to release pages: bind_count=%d, pages_pin_count=%d, pin_global=%d\n",
-		  obj->bind_count,
-		  atomic_read(&obj->mm.pages_pin_count),
-		  obj->pin_global);
-
-	mutex_unlock(&obj->base.dev->struct_mutex);
-
-out:
-	i915_gem_object_put(obj);
+	GEM_BUG_ON(!RB_EMPTY_NODE(&mo->it.rb));
+	interval_tree_insert(&mo->it, &mo->mn->objects);
 }
 
-static void add_object(struct i915_mmu_object *mo)
+static void del_object(struct i915_mmu_object *mo)
 {
-	if (mo->attached)
+	if (RB_EMPTY_NODE(&mo->it.rb))
 		return;
 
-	interval_tree_insert(&mo->it, &mo->mn->objects);
-	mo->attached = true;
+	interval_tree_remove(&mo->it, &mo->mn->objects);
+	RB_CLEAR_NODE(&mo->it.rb);
 }
 
-static void del_object(struct i915_mmu_object *mo)
+static void
+__i915_gem_userptr_set_active(struct drm_i915_gem_object *obj, bool value)
 {
-	if (!mo->attached)
+	struct i915_mmu_object *mo = obj->userptr.mmu_object;
+
+	/*
+	 * During mm_invalidate_range we need to cancel any userptr that
+	 * overlaps the range being invalidated. Doing so requires the
+	 * struct_mutex, and that risks recursion. In order to cause
+	 * recursion, the user must alias the userptr address space with
+	 * a GTT mmapping (possible with a MAP_FIXED) - then when we have
+	 * to invalidate that mmapping, mm_invalidate_range is called with
+	 * the userptr address *and* the struct_mutex held.  To prevent that
+	 * we set a flag under the i915_mmu_notifier spinlock to indicate
+	 * whether this object is valid.
+	 */
+	if (!mo)
 		return;
 
-	interval_tree_remove(&mo->it, &mo->mn->objects);
-	mo->attached = false;
+	spin_lock(&mo->mn->lock);
+	if (value)
+		add_object(mo);
+	else
+		del_object(mo);
+	spin_unlock(&mo->mn->lock);
 }
 
-static int i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
-			const struct mmu_notifier_range *range)
+static int
+userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
+				  const struct mmu_notifier_range *range)
 {
 	struct i915_mmu_notifier *mn =
 		container_of(_mn, struct i915_mmu_notifier, mn);
-	struct i915_mmu_object *mo;
 	struct interval_tree_node *it;
-	LIST_HEAD(cancelled);
+	struct mutex *unlock = NULL;
 	unsigned long end;
+	int ret = 0;
 
 	if (RB_EMPTY_ROOT(&mn->objects.rb_root))
 		return 0;
@@ -131,11 +120,15 @@ static int i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
 	spin_lock(&mn->lock);
 	it = interval_tree_iter_first(&mn->objects, range->start, end);
 	while (it) {
+		struct drm_i915_gem_object *obj;
+
 		if (!range->blockable) {
-			spin_unlock(&mn->lock);
-			return -EAGAIN;
+			ret = -EAGAIN;
+			break;
 		}
-		/* The mmu_object is released late when destroying the
+
+		/*
+		 * The mmu_object is released late when destroying the
 		 * GEM object so it is entirely possible to gain a
 		 * reference on an object in the process of being freed
 		 * since our serialisation is via the spinlock and not
@@ -144,29 +137,65 @@ static int i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
 		 * use-after-free we only acquire a reference on the
 		 * object if it is not in the process of being destroyed.
 		 */
-		mo = container_of(it, struct i915_mmu_object, it);
-		if (kref_get_unless_zero(&mo->obj->base.refcount))
-			queue_work(mn->wq, &mo->work);
+		obj = container_of(it, struct i915_mmu_object, it)->obj;
+		if (!kref_get_unless_zero(&obj->base.refcount)) {
+			it = interval_tree_iter_next(it, range->start, end);
+			continue;
+		}
+		spin_unlock(&mn->lock);
+
+		if (!unlock) {
+			unlock = &mn->mm->i915->drm.struct_mutex;
+
+			switch (mutex_trylock_recursive(unlock)) {
+			default:
+			case MUTEX_TRYLOCK_FAILED:
+				if (mutex_lock_killable_nested(unlock, I915_MM_SHRINKER)) {
+					i915_gem_object_put(obj);
+					return -EINTR;
+				}
+				/* fall through */
+			case MUTEX_TRYLOCK_SUCCESS:
+				break;
+
+			case MUTEX_TRYLOCK_RECURSIVE:
+				unlock = ERR_PTR(-EEXIST);
+				break;
+			}
+		}
+
+		ret = i915_gem_object_unbind(obj);
+		if (ret == 0)
+			ret = __i915_gem_object_put_pages(obj, I915_MM_SHRINKER);
+		i915_gem_object_put(obj);
+		if (ret)
+			goto unlock;
 
-		list_add(&mo->link, &cancelled);
-		it = interval_tree_iter_next(it, range->start, end);
+		spin_lock(&mn->lock);
+
+		/*
+		 * As we do not (yet) protect the mmu from concurrent insertion
+		 * over this range, there is no guarantee that this search will
+		 * terminate given a pathological workload.
+		 */
+		it = interval_tree_iter_first(&mn->objects, range->start, end);
 	}
-	list_for_each_entry(mo, &cancelled, link)
-		del_object(mo);
 	spin_unlock(&mn->lock);
 
-	if (!list_empty(&cancelled))
-		flush_workqueue(mn->wq);
+unlock:
+	if (!IS_ERR_OR_NULL(unlock))
+		mutex_unlock(unlock);
+
+	return ret;
-
-	return 0;
 }
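
Because the notifier now drops mn->lock around the blocking unbind, the
interval-tree iterator cannot be trusted across the gap, so the loop above
restarts the search from scratch each time. The generic shape of that
pattern, with do_blocking_work() as a stand-in for the unbind/put_pages
step:

	spin_lock(&mn->lock);
	it = interval_tree_iter_first(&mn->objects, start, end);
	while (it) {
		spin_unlock(&mn->lock);
		do_blocking_work(it);
		spin_lock(&mn->lock);
		/* the tree may have changed: restart, do not iter_next */
		it = interval_tree_iter_first(&mn->objects, start, end);
	}
	spin_unlock(&mn->lock);
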
 
 static const struct mmu_notifier_ops i915_gem_userptr_notifier = {
-	.invalidate_range_start = i915_gem_userptr_mn_invalidate_range_start,
+	.invalidate_range_start = userptr_mn_invalidate_range_start,
 };
 
 static struct i915_mmu_notifier *
-i915_mmu_notifier_create(struct mm_struct *mm)
+i915_mmu_notifier_create(struct i915_mm_struct *mm)
 {
 	struct i915_mmu_notifier *mn;
 
@@ -177,13 +206,7 @@ i915_mmu_notifier_create(struct mm_struct *mm)
 	spin_lock_init(&mn->lock);
 	mn->mn.ops = &i915_gem_userptr_notifier;
 	mn->objects = RB_ROOT_CACHED;
-	mn->wq = alloc_workqueue("i915-userptr-release",
-				 WQ_UNBOUND | WQ_MEM_RECLAIM,
-				 0);
-	if (mn->wq == NULL) {
-		kfree(mn);
-		return ERR_PTR(-ENOMEM);
-	}
+	mn->mm = mm;
 
 	return mn;
 }
@@ -193,16 +216,14 @@ i915_gem_userptr_release__mmu_notifier(struct drm_i915_gem_object *obj)
 {
 	struct i915_mmu_object *mo;
 
-	mo = obj->userptr.mmu_object;
-	if (mo == NULL)
+	mo = fetch_and_zero(&obj->userptr.mmu_object);
+	if (!mo)
 		return;
 
 	spin_lock(&mo->mn->lock);
 	del_object(mo);
 	spin_unlock(&mo->mn->lock);
 	kfree(mo);
-
-	obj->userptr.mmu_object = NULL;
 }
 
 static struct i915_mmu_notifier *
@@ -215,7 +236,7 @@ i915_mmu_notifier_find(struct i915_mm_struct *mm)
 	if (mn)
 		return mn;
 
-	mn = i915_mmu_notifier_create(mm->mm);
+	mn = i915_mmu_notifier_create(mm);
 	if (IS_ERR(mn))
 		err = PTR_ERR(mn);
 
@@ -238,10 +259,8 @@ i915_mmu_notifier_find(struct i915_mm_struct *mm)
 	mutex_unlock(&mm->i915->mm_lock);
 	up_write(&mm->mm->mmap_sem);
 
-	if (mn && !IS_ERR(mn)) {
-		destroy_workqueue(mn->wq);
+	if (mn && !IS_ERR(mn))
 		kfree(mn);
-	}
 
 	return err ? ERR_PTR(err) : mm->mn;
 }
@@ -264,14 +283,14 @@ i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj,
 		return PTR_ERR(mn);
 
 	mo = kzalloc(sizeof(*mo), GFP_KERNEL);
-	if (mo == NULL)
+	if (!mo)
 		return -ENOMEM;
 
 	mo->mn = mn;
 	mo->obj = obj;
 	mo->it.start = obj->userptr.ptr;
 	mo->it.last = obj->userptr.ptr + obj->base.size - 1;
-	INIT_WORK(&mo->work, cancel_userptr);
+	RB_CLEAR_NODE(&mo->it.rb);
 
 	obj->userptr.mmu_object = mo;
 	return 0;
@@ -285,13 +304,17 @@ i915_mmu_notifier_free(struct i915_mmu_notifier *mn,
 		return;
 
 	mmu_notifier_unregister(&mn->mn, mm);
-	destroy_workqueue(mn->wq);
 	kfree(mn);
 }
 
 #else
 
 static void
+__i915_gem_userptr_set_active(struct drm_i915_gem_object *obj, bool value)
+{
+}
+
+static void
 i915_gem_userptr_release__mmu_notifier(struct drm_i915_gem_object *obj)
 {
 }
@@ -459,42 +482,6 @@ alloc_table:
 	return st;
 }
 
-static int
-__i915_gem_userptr_set_active(struct drm_i915_gem_object *obj,
-			      bool value)
-{
-	int ret = 0;
-
-	/* During mm_invalidate_range we need to cancel any userptr that
-	 * overlaps the range being invalidated. Doing so requires the
-	 * struct_mutex, and that risks recursion. In order to cause
-	 * recursion, the user must alias the userptr address space with
-	 * a GTT mmapping (possible with a MAP_FIXED) - then when we have
-	 * to invalidate that mmaping, mm_invalidate_range is called with
-	 * the userptr address *and* the struct_mutex held.  To prevent that
-	 * we set a flag under the i915_mmu_notifier spinlock to indicate
-	 * whether this object is valid.
-	 */
-#if defined(CONFIG_MMU_NOTIFIER)
-	if (obj->userptr.mmu_object == NULL)
-		return 0;
-
-	spin_lock(&obj->userptr.mmu_object->mn->lock);
-	/* In order to serialise get_pages with an outstanding
-	 * cancel_userptr, we must drop the struct_mutex and try again.
-	 */
-	if (!value)
-		del_object(obj->userptr.mmu_object);
-	else if (!work_pending(&obj->userptr.mmu_object->work))
-		add_object(obj->userptr.mmu_object);
-	else
-		ret = -EAGAIN;
-	spin_unlock(&obj->userptr.mmu_object->mn->lock);
-#endif
-
-	return ret;
-}
-
 static void
 __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
 {
@@ -680,8 +667,11 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
 	struct sgt_iter sgt_iter;
 	struct page *page;
 
-	BUG_ON(obj->userptr.work != NULL);
+	/* Cancel any in-flight work and force it to restart its gup */
+	obj->userptr.work = NULL;
 	__i915_gem_userptr_set_active(obj, false);
+	if (!pages)
+		return;
 
 	if (obj->mm.madv != I915_MADV_WILLNEED)
 		obj->mm.dirty = false;
@@ -719,7 +709,8 @@ i915_gem_userptr_dmabuf_export(struct drm_i915_gem_object *obj)
 
 static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = {
 	.flags = I915_GEM_OBJECT_HAS_STRUCT_PAGE |
-		 I915_GEM_OBJECT_IS_SHRINKABLE,
+		 I915_GEM_OBJECT_IS_SHRINKABLE |
+		 I915_GEM_OBJECT_ASYNC_CANCEL,
 	.get_pages = i915_gem_userptr_get_pages,
 	.put_pages = i915_gem_userptr_put_pages,
 	.dmabuf_export = i915_gem_userptr_dmabuf_export,
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 3f9ce403c755..9a65341fec09 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -447,9 +447,14 @@ static void error_print_request(struct drm_i915_error_state_buf *m,
 	if (!erq->seqno)
 		return;
 
-	err_printf(m, "%s pid %d, ban score %d, seqno %8x:%08x, prio %d, emitted %dms, start %08x, head %08x, tail %08x\n",
+	err_printf(m, "%s pid %d, ban score %d, seqno %8x:%08x%s%s, prio %d, emitted %dms, start %08x, head %08x, tail %08x\n",
 		   prefix, erq->pid, erq->ban_score,
-		   erq->context, erq->seqno, erq->sched_attr.priority,
+		   erq->context, erq->seqno,
+		   test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
+			    &erq->flags) ? "!" : "",
+		   test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+			    &erq->flags) ? "+" : "",
+		   erq->sched_attr.priority,
 		   jiffies_to_msecs(erq->jiffies - epoch),
 		   erq->start, erq->head, erq->tail);
 }
@@ -530,13 +535,9 @@ static void error_print_engine(struct drm_i915_error_state_buf *m,
 	}
 	err_printf(m, "  seqno: 0x%08x\n", ee->seqno);
 	err_printf(m, "  last_seqno: 0x%08x\n", ee->last_seqno);
-	err_printf(m, "  waiting: %s\n", yesno(ee->waiting));
 	err_printf(m, "  ring->head: 0x%08x\n", ee->cpu_ring_head);
 	err_printf(m, "  ring->tail: 0x%08x\n", ee->cpu_ring_tail);
-	err_printf(m, "  hangcheck stall: %s\n", yesno(ee->hangcheck_stalled));
-	err_printf(m, "  hangcheck action: %s\n",
-		   hangcheck_action_to_str(ee->hangcheck_action));
-	err_printf(m, "  hangcheck action timestamp: %dms (%lu%s)\n",
+	err_printf(m, "  hangcheck timestamp: %dms (%lu%s)\n",
 		   jiffies_to_msecs(ee->hangcheck_timestamp - epoch),
 		   ee->hangcheck_timestamp,
 		   ee->hangcheck_timestamp == epoch ? "; epoch" : "");
@@ -594,13 +595,14 @@ static void print_error_obj(struct drm_i915_error_state_buf *m,
 
 static void err_print_capabilities(struct drm_i915_error_state_buf *m,
 				   const struct intel_device_info *info,
+				   const struct intel_runtime_info *runtime,
 				   const struct intel_driver_caps *caps)
 {
 	struct drm_printer p = i915_error_printer(m);
 
 	intel_device_info_dump_flags(info, &p);
 	intel_driver_caps_print(caps, &p);
-	intel_device_info_dump_topology(&info->sseu, &p);
+	intel_device_info_dump_topology(&runtime->sseu, &p);
 }
 
 static void err_print_params(struct drm_i915_error_state_buf *m,
@@ -664,7 +666,9 @@ static void __err_print_to_sgl(struct drm_i915_error_state_buf *m,
 
 	if (*error->error_msg)
 		err_printf(m, "%s\n", error->error_msg);
-	err_printf(m, "Kernel: %s\n", init_utsname()->release);
+	err_printf(m, "Kernel: %s %s\n",
+		   init_utsname()->release,
+		   init_utsname()->machine);
 	ts = ktime_to_timespec64(error->time);
 	err_printf(m, "Time: %lld s %ld us\n",
 		   (s64)ts.tv_sec, ts.tv_nsec / NSEC_PER_USEC);
@@ -681,15 +685,15 @@ static void __err_print_to_sgl(struct drm_i915_error_state_buf *m,
 		   jiffies_to_msecs(error->capture - error->epoch));
 
 	for (i = 0; i < ARRAY_SIZE(error->engine); i++) {
-		if (error->engine[i].hangcheck_stalled &&
-		    error->engine[i].context.pid) {
-			err_printf(m, "Active process (on ring %s): %s [%d], score %d%s\n",
-				   engine_name(m->i915, i),
-				   error->engine[i].context.comm,
-				   error->engine[i].context.pid,
-				   error->engine[i].context.ban_score,
-				   bannable(&error->engine[i].context));
-		}
+		if (!error->engine[i].context.pid)
+			continue;
+
+		err_printf(m, "Active process (on ring %s): %s [%d], score %d%s\n",
+			   engine_name(m->i915, i),
+			   error->engine[i].context.comm,
+			   error->engine[i].context.pid,
+			   error->engine[i].context.ban_score,
+			   bannable(&error->engine[i].context));
 	}
 	err_printf(m, "Reset count: %u\n", error->reset_count);
 	err_printf(m, "Suspend count: %u\n", error->suspend_count);
@@ -719,8 +723,6 @@ static void __err_print_to_sgl(struct drm_i915_error_state_buf *m,
 	err_printf(m, "FORCEWAKE: 0x%08x\n", error->forcewake);
 	err_printf(m, "DERRMR: 0x%08x\n", error->derrmr);
 	err_printf(m, "CCID: 0x%08x\n", error->ccid);
-	err_printf(m, "Missed interrupts: 0x%08lx\n",
-		   m->i915->gpu_error.missed_irq_rings);
 
 	for (i = 0; i < error->nfence; i++)
 		err_printf(m, "  fence[%d] = %08llx\n", i, error->fence[i]);
@@ -735,7 +737,7 @@ static void __err_print_to_sgl(struct drm_i915_error_state_buf *m,
 		err_printf(m, "DONE_REG: 0x%08x\n", error->done_reg);
 	}
 
-	if (IS_GEN7(m->i915))
+	if (IS_GEN(m->i915, 7))
 		err_printf(m, "ERR_INT: 0x%08x\n", error->err_int);
 
 	for (i = 0; i < ARRAY_SIZE(error->engine); i++) {
@@ -804,21 +806,6 @@ static void __err_print_to_sgl(struct drm_i915_error_state_buf *m,
 						    error->epoch);
 		}
 
-		if (IS_ERR(ee->waiters)) {
-			err_printf(m, "%s --- ? waiters [unable to acquire spinlock]\n",
-				   m->i915->engine[i]->name);
-		} else if (ee->num_waiters) {
-			err_printf(m, "%s --- %d waiters\n",
-				   m->i915->engine[i]->name,
-				   ee->num_waiters);
-			for (j = 0; j < ee->num_waiters; j++) {
-				err_printf(m, " seqno 0x%08x for %s [%d]\n",
-					   ee->waiters[j].seqno,
-					   ee->waiters[j].comm,
-					   ee->waiters[j].pid);
-			}
-		}
-
 		print_error_obj(m, m->i915->engine[i],
 				"ringbuffer", ee->ringbuffer);
 
@@ -844,7 +831,8 @@ static void __err_print_to_sgl(struct drm_i915_error_state_buf *m,
 	if (error->display)
 		intel_display_print_error_state(m, error->display);
 
-	err_print_capabilities(m, &error->device_info, &error->driver_caps);
+	err_print_capabilities(m, &error->device_info, &error->runtime_info,
+			       &error->driver_caps);
 	err_print_params(m, &error->params);
 	err_print_uc(m, &error->uc);
 }
@@ -963,17 +951,10 @@ static void i915_error_object_free(struct drm_i915_error_object *obj)
 	kfree(obj);
 }
 
-static __always_inline void free_param(const char *type, void *x)
-{
-	if (!__builtin_strcmp(type, "char *"))
-		kfree(*(void **)x);
-}
 
 static void cleanup_params(struct i915_gpu_state *error)
 {
-#define FREE(T, x, ...) free_param(#T, &error->params.x);
-	I915_PARAMS_FOR_EACH(FREE);
-#undef FREE
+	i915_params_free(&error->params);
 }
 
 static void cleanup_uc_state(struct i915_gpu_state *error)
@@ -1006,8 +987,6 @@ void __i915_gpu_state_free(struct kref *error_ref)
 		i915_error_object_free(ee->wa_ctx);
 
 		kfree(ee->requests);
-		if (!IS_ERR_OR_NULL(ee->waiters))
-			kfree(ee->waiters);
 	}
 
 	for (i = 0; i < ARRAY_SIZE(error->active_bo); i++)
@@ -1037,7 +1016,7 @@ i915_error_object_create(struct drm_i915_private *i915,
 	dma_addr_t dma;
 	int ret;
 
-	if (!vma)
+	if (!vma || !vma->pages)
 		return NULL;
 
 	num_pages = min_t(u64, vma->size, vma->obj->base.size) >> PAGE_SHIFT;
@@ -1083,23 +1062,23 @@ i915_error_object_create(struct drm_i915_private *i915,
 }
 
 /* The error capture is special as it tries to run underneath the normal
- * locking rules - so we use the raw version of the i915_gem_active lookup.
+ * locking rules - so we use the raw version of the i915_active_request lookup.
  */
-static inline uint32_t
-__active_get_seqno(struct i915_gem_active *active)
+static inline u32
+__active_get_seqno(struct i915_active_request *active)
 {
 	struct i915_request *request;
 
-	request = __i915_gem_active_peek(active);
+	request = __i915_active_request_peek(active);
 	return request ? request->global_seqno : 0;
 }
 
 static inline int
-__active_get_engine_id(struct i915_gem_active *active)
+__active_get_engine_id(struct i915_active_request *active)
 {
 	struct i915_request *request;
 
-	request = __i915_gem_active_peek(active);
+	request = __i915_active_request_peek(active);
 	return request ? request->engine->id : -1;
 }
 
@@ -1127,7 +1106,9 @@ static void capture_bo(struct drm_i915_error_buffer *err,
 
 static u32 capture_error_bo(struct drm_i915_error_buffer *err,
 			    int count, struct list_head *head,
-			    bool pinned_only)
+			    unsigned int flags)
+#define ACTIVE_ONLY BIT(0)
+#define PINNED_ONLY BIT(1)
 {
 	struct i915_vma *vma;
 	int i = 0;
@@ -1136,7 +1117,10 @@ static u32 capture_error_bo(struct drm_i915_error_buffer *err,
 		if (!vma->obj)
 			continue;
 
-		if (pinned_only && !i915_vma_is_pinned(vma))
+		if (flags & ACTIVE_ONLY && !i915_vma_is_active(vma))
+			continue;
+
+		if (flags & PINNED_ONLY && !i915_vma_is_pinned(vma))
 			continue;
 
 		capture_bo(err++, vma);
@@ -1147,7 +1131,8 @@ static u32 capture_error_bo(struct drm_i915_error_buffer *err,
 	return i;
 }
 
-/* Generate a semi-unique error code. The code is not meant to have meaning, The
+/*
+ * Generate a semi-unique error code. The code is not meant to have meaning, The
  * code's only purpose is to try to prevent false duplicated bug reports by
  * grossly estimating a GPU error state.
  *
@@ -1156,29 +1141,23 @@ static u32 capture_error_bo(struct drm_i915_error_buffer *err,
  *
  * It's only a small step better than a random number in its current form.
  */
-static uint32_t i915_error_generate_code(struct drm_i915_private *dev_priv,
-					 struct i915_gpu_state *error,
-					 int *engine_id)
+static u32 i915_error_generate_code(struct i915_gpu_state *error,
+				    unsigned long engine_mask)
 {
-	uint32_t error_code = 0;
-	int i;
-
-	/* IPEHR would be an ideal way to detect errors, as it's the gross
+	/*
+	 * IPEHR would be an ideal way to detect errors, as it's the gross
 	 * measure of "the command that hung." However, it contains some very
 	 * common synchronization commands which almost always appear in cases
 	 * that are strictly a client bug. Use instdone to help differentiate.
 	 */
-	for (i = 0; i < I915_NUM_ENGINES; i++) {
-		if (error->engine[i].hangcheck_stalled) {
-			if (engine_id)
-				*engine_id = i;
+	if (engine_mask) {
+		struct drm_i915_error_engine *ee =
+			&error->engine[ffs(engine_mask) - 1];
 
-			return error->engine[i].ipehr ^
-			       error->engine[i].instdone.instdone;
-		}
+		return ee->ipehr ^ ee->instdone.instdone;
 	}
 
-	return error_code;
+	return 0;
 }
 
 static void gem_record_fences(struct i915_gpu_state *error)
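
Note that the ecode is simply IPEHR XOR INSTDONE of the first hung engine, and that ffs() is 1-based, which is why the engine-array index above needs the explicit -1. A two-line standalone check of that convention:

/* ffs() counts bits from 1, so a 0-based engine array needs ffs() - 1. */
#include <stdio.h>
#include <strings.h>

int main(void)
{
	unsigned int engine_mask = 0x4; /* only engine 2 flagged as hung */

	printf("ffs=%d index=%d\n", ffs(engine_mask), ffs(engine_mask) - 1);
	return 0; /* prints: ffs=3 index=2 */
}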
@@ -1211,59 +1190,6 @@ static void gen6_record_semaphore_state(struct intel_engine_cs *engine,
 			I915_READ(RING_SYNC_2(engine->mmio_base));
 }
 
-static void error_record_engine_waiters(struct intel_engine_cs *engine,
-					struct drm_i915_error_engine *ee)
-{
-	struct intel_breadcrumbs *b = &engine->breadcrumbs;
-	struct drm_i915_error_waiter *waiter;
-	struct rb_node *rb;
-	int count;
-
-	ee->num_waiters = 0;
-	ee->waiters = NULL;
-
-	if (RB_EMPTY_ROOT(&b->waiters))
-		return;
-
-	if (!spin_trylock_irq(&b->rb_lock)) {
-		ee->waiters = ERR_PTR(-EDEADLK);
-		return;
-	}
-
-	count = 0;
-	for (rb = rb_first(&b->waiters); rb != NULL; rb = rb_next(rb))
-		count++;
-	spin_unlock_irq(&b->rb_lock);
-
-	waiter = NULL;
-	if (count)
-		waiter = kmalloc_array(count,
-				       sizeof(struct drm_i915_error_waiter),
-				       GFP_ATOMIC);
-	if (!waiter)
-		return;
-
-	if (!spin_trylock_irq(&b->rb_lock)) {
-		kfree(waiter);
-		ee->waiters = ERR_PTR(-EDEADLK);
-		return;
-	}
-
-	ee->waiters = waiter;
-	for (rb = rb_first(&b->waiters); rb; rb = rb_next(rb)) {
-		struct intel_wait *w = rb_entry(rb, typeof(*w), node);
-
-		strcpy(waiter->comm, w->tsk->comm);
-		waiter->pid = w->tsk->pid;
-		waiter->seqno = w->seqno;
-		waiter++;
-
-		if (++ee->num_waiters == count)
-			break;
-	}
-	spin_unlock_irq(&b->rb_lock);
-}
-
 static void error_record_engine_registers(struct i915_gpu_state *error,
 					  struct intel_engine_cs *engine,
 					  struct drm_i915_error_engine *ee)
@@ -1299,7 +1225,6 @@ static void error_record_engine_registers(struct i915_gpu_state *error,
 
 	intel_engine_get_instdone(engine, &ee->instdone);
 
-	ee->waiting = intel_engine_has_waiter(engine);
 	ee->instpm = I915_READ(RING_INSTPM(engine->mmio_base));
 	ee->acthd = intel_engine_get_active_head(engine);
 	ee->seqno = intel_engine_get_seqno(engine);
@@ -1314,7 +1239,7 @@ static void error_record_engine_registers(struct i915_gpu_state *error,
 	if (!HWS_NEEDS_PHYSICAL(dev_priv)) {
 		i915_reg_t mmio;
 
-		if (IS_GEN7(dev_priv)) {
+		if (IS_GEN(dev_priv, 7)) {
 			switch (engine->id) {
 			default:
 			case RCS:
@@ -1330,7 +1255,7 @@ static void error_record_engine_registers(struct i915_gpu_state *error,
 				mmio = VEBOX_HWS_PGA_GEN7;
 				break;
 			}
-		} else if (IS_GEN6(engine->i915)) {
+		} else if (IS_GEN(engine->i915, 6)) {
 			mmio = RING_HWS_PGA_GEN6(engine->mmio_base);
 		} else {
 			/* XXX: gen8 returns to sanity */
@@ -1341,9 +1266,8 @@ static void error_record_engine_registers(struct i915_gpu_state *error,
 	}
 
 	ee->idle = intel_engine_is_idle(engine);
-	ee->hangcheck_timestamp = engine->hangcheck.action_timestamp;
-	ee->hangcheck_action = engine->hangcheck.action;
-	ee->hangcheck_stalled = engine->hangcheck.stalled;
+	if (!ee->idle)
+		ee->hangcheck_timestamp = engine->hangcheck.action_timestamp;
 	ee->reset_count = i915_reset_engine_count(&dev_priv->gpu_error,
 						  engine);
 
@@ -1352,10 +1276,10 @@ static void error_record_engine_registers(struct i915_gpu_state *error,
 
 		ee->vm_info.gfx_mode = I915_READ(RING_MODE_GEN7(engine));
 
-		if (IS_GEN6(dev_priv))
+		if (IS_GEN(dev_priv, 6))
 			ee->vm_info.pp_dir_base =
 				I915_READ(RING_PP_DIR_BASE_READ(engine));
-		else if (IS_GEN7(dev_priv))
+		else if (IS_GEN(dev_priv, 7))
 			ee->vm_info.pp_dir_base =
 				I915_READ(RING_PP_DIR_BASE(engine));
 		else if (INTEL_GEN(dev_priv) >= 8)
@@ -1374,6 +1298,7 @@ static void record_request(struct i915_request *request,
 {
 	struct i915_gem_context *ctx = request->gem_context;
 
+	erq->flags = request->fence.flags;
 	erq->context = ctx->hw_id;
 	erq->sched_attr = request->sched.attr;
 	erq->ban_score = atomic_read(&ctx->ban_score);
@@ -1549,7 +1474,6 @@ static void gem_record_rings(struct i915_gpu_state *error)
 		ee->engine_id = i;
 
 		error_record_engine_registers(error, engine, ee);
-		error_record_engine_waiters(engine, ee);
 		error_record_engine_execlists(engine, ee);
 
 		request = i915_gem_find_active_request(engine);
@@ -1613,14 +1537,17 @@ static void gem_capture_vm(struct i915_gpu_state *error,
 	int count;
 
 	count = 0;
-	list_for_each_entry(vma, &vm->active_list, vm_link)
-		count++;
+	list_for_each_entry(vma, &vm->bound_list, vm_link)
+		if (i915_vma_is_active(vma))
+			count++;
 
 	active_bo = NULL;
 	if (count)
 		active_bo = kcalloc(count, sizeof(*active_bo), GFP_ATOMIC);
 	if (active_bo)
-		count = capture_error_bo(active_bo, count, &vm->active_list, false);
+		count = capture_error_bo(active_bo,
+					 count, &vm->bound_list,
+					 ACTIVE_ONLY);
 	else
 		count = 0;
 
@@ -1658,28 +1585,20 @@ static void capture_pinned_buffers(struct i915_gpu_state *error)
 	struct i915_address_space *vm = &error->i915->ggtt.vm;
 	struct drm_i915_error_buffer *bo;
 	struct i915_vma *vma;
-	int count_inactive, count_active;
-
-	count_inactive = 0;
-	list_for_each_entry(vma, &vm->inactive_list, vm_link)
-		count_inactive++;
+	int count;
 
-	count_active = 0;
-	list_for_each_entry(vma, &vm->active_list, vm_link)
-		count_active++;
+	count = 0;
+	list_for_each_entry(vma, &vm->bound_list, vm_link)
+		count++;
 
 	bo = NULL;
-	if (count_inactive + count_active)
-		bo = kcalloc(count_inactive + count_active,
-			     sizeof(*bo), GFP_ATOMIC);
+	if (count)
+		bo = kcalloc(count, sizeof(*bo), GFP_ATOMIC);
 	if (!bo)
 		return;
 
-	count_inactive = capture_error_bo(bo, count_inactive,
-					  &vm->active_list, true);
-	count_active = capture_error_bo(bo + count_inactive, count_active,
-					&vm->inactive_list, true);
-	error->pinned_bo_count = count_inactive + count_active;
+	error->pinned_bo_count =
+		capture_error_bo(bo, count, &vm->bound_list, PINNED_ONLY);
 	error->pinned_bo = bo;
 }
 
@@ -1725,7 +1644,7 @@ static void capture_reg_state(struct i915_gpu_state *error)
 		error->forcewake = I915_READ_FW(FORCEWAKE_VLV);
 	}
 
-	if (IS_GEN7(dev_priv))
+	if (IS_GEN(dev_priv, 7))
 		error->err_int = I915_READ(GEN7_ERR_INT);
 
 	if (INTEL_GEN(dev_priv) >= 8) {
@@ -1733,7 +1652,7 @@ static void capture_reg_state(struct i915_gpu_state *error)
 		error->fault_data1 = I915_READ(GEN8_FAULT_TLB_DATA1);
 	}
 
-	if (IS_GEN6(dev_priv)) {
+	if (IS_GEN(dev_priv, 6)) {
 		error->forcewake = I915_READ_FW(FORCEWAKE);
 		error->gab_ctl = I915_READ(GAB_CTL);
 		error->gfx_mode = I915_READ(GFX_MODE);
@@ -1753,7 +1672,7 @@ static void capture_reg_state(struct i915_gpu_state *error)
 		error->ccid = I915_READ(CCID);
 
 	/* 3: Feature specific registers */
-	if (IS_GEN6(dev_priv) || IS_GEN7(dev_priv)) {
+	if (IS_GEN_RANGE(dev_priv, 6, 7)) {
 		error->gam_ecochk = I915_READ(GAM_ECOCHK);
 		error->gac_eco = I915_READ(GAC_ECO_BITS);
 	}
@@ -1777,7 +1696,7 @@ static void capture_reg_state(struct i915_gpu_state *error)
 		error->ier = I915_READ(DEIER);
 		error->gtier[0] = I915_READ(GTIER);
 		error->ngtier = 1;
-	} else if (IS_GEN2(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 2)) {
 		error->ier = I915_READ16(IER);
 	} else if (!IS_VALLEYVIEW(dev_priv)) {
 		error->ier = I915_READ(IER);
@@ -1786,31 +1705,35 @@ static void capture_reg_state(struct i915_gpu_state *error)
 	error->pgtbl_er = I915_READ(PGTBL_ER);
 }
 
-static void i915_error_capture_msg(struct drm_i915_private *dev_priv,
-				   struct i915_gpu_state *error,
-				   u32 engine_mask,
-				   const char *error_msg)
+static const char *
+error_msg(struct i915_gpu_state *error, unsigned long engines, const char *msg)
 {
-	u32 ecode;
-	int engine_id = -1, len;
+	int len;
+	int i;
 
-	ecode = i915_error_generate_code(dev_priv, error, &engine_id);
+	for (i = 0; i < ARRAY_SIZE(error->engine); i++)
+		if (!error->engine[i].context.pid)
+			engines &= ~BIT(i);
 
 	len = scnprintf(error->error_msg, sizeof(error->error_msg),
-			"GPU HANG: ecode %d:%d:0x%08x",
-			INTEL_GEN(dev_priv), engine_id, ecode);
-
-	if (engine_id != -1 && error->engine[engine_id].context.pid)
+			"GPU HANG: ecode %d:%lx:0x%08x",
+			INTEL_GEN(error->i915), engines,
+			i915_error_generate_code(error, engines));
+	if (engines) {
+		/* Just show the first executing process; more is confusing */
+		i = ffs(engines) - 1;
 		len += scnprintf(error->error_msg + len,
 				 sizeof(error->error_msg) - len,
 				 ", in %s [%d]",
-				 error->engine[engine_id].context.comm,
-				 error->engine[engine_id].context.pid);
+				 error->engine[i].context.comm,
+				 error->engine[i].context.pid);
+	}
+	if (msg)
+		len += scnprintf(error->error_msg + len,
+				 sizeof(error->error_msg) - len,
+				 ", %s", msg);
 
-	scnprintf(error->error_msg + len, sizeof(error->error_msg) - len,
-		  ", reason: %s, action: %s",
-		  error_msg,
-		  engine_mask ? "reset" : "continue");
+	return error->error_msg;
 }
 
 static void capture_gen_state(struct i915_gpu_state *error)
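
For reference, the reworked header produced by error_msg() comes out roughly as the line below; the gen, engine mask, ecode and process values are invented for illustration. A userspace approximation of the same formatting (snprintf standing in for the kernel's scnprintf):

#include <stdio.h>

int main(void)
{
	char buf[128];
	int len;

	/* gen 9, engine mask 0x1 (rcs0), made-up ecode */
	len = snprintf(buf, sizeof(buf), "GPU HANG: ecode %d:%lx:0x%08x",
		       9, 0x1UL, 0x85dffffbU);
	snprintf(buf + len, sizeof(buf) - len, ", in %s [%d]",
		 "gnome-shell", 1234); /* first executing process only */
	puts(buf); /* GPU HANG: ecode 9:1:0x85dffffb, in gnome-shell [1234] */
	return 0;
}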
@@ -1831,21 +1754,15 @@ static void capture_gen_state(struct i915_gpu_state *error)
 	memcpy(&error->device_info,
 	       INTEL_INFO(i915),
 	       sizeof(error->device_info));
+	memcpy(&error->runtime_info,
+	       RUNTIME_INFO(i915),
+	       sizeof(error->runtime_info));
 	error->driver_caps = i915->caps;
 }
 
-static __always_inline void dup_param(const char *type, void *x)
-{
-	if (!__builtin_strcmp(type, "char *"))
-		*(void **)x = kstrdup(*(void **)x, GFP_ATOMIC);
-}
-
 static void capture_params(struct i915_gpu_state *error)
 {
-	error->params = i915_modparams;
-#define DUP(T, x, ...) dup_param(#T, &error->params.x);
-	I915_PARAMS_FOR_EACH(DUP);
-#undef DUP
+	i915_params_copy(&error->params, &i915_modparams);
 }
 
 static unsigned long capture_find_epoch(const struct i915_gpu_state *error)
@@ -1856,7 +1773,7 @@ static unsigned long capture_find_epoch(const struct i915_gpu_state *error)
 	for (i = 0; i < ARRAY_SIZE(error->engine); i++) {
 		const struct drm_i915_error_engine *ee = &error->engine[i];
 
-		if (ee->hangcheck_stalled &&
+		if (ee->hangcheck_timestamp &&
 		    time_before(ee->hangcheck_timestamp, epoch))
 			epoch = ee->hangcheck_timestamp;
 	}
@@ -1930,7 +1847,7 @@ i915_capture_gpu_state(struct drm_i915_private *i915)
  * i915_capture_error_state - capture an error record for later analysis
  * @i915: i915 device
  * @engine_mask: the mask of engines triggering the hang
- * @error_msg: a message to insert into the error capture header
+ * @msg: a message to insert into the error capture header
  *
  * Should be called when an error is detected (either a hang or an error
  * interrupt) to capture error state from the time of the error.  Fills
@@ -1938,8 +1855,8 @@ i915_capture_gpu_state(struct drm_i915_private *i915)
  * to pick up.
  */
 void i915_capture_error_state(struct drm_i915_private *i915,
-			      u32 engine_mask,
-			      const char *error_msg)
+			      unsigned long engine_mask,
+			      const char *msg)
 {
 	static bool warned;
 	struct i915_gpu_state *error;
@@ -1955,8 +1872,7 @@ void i915_capture_error_state(struct drm_i915_private *i915,
 	if (IS_ERR(error))
 		return;
 
-	i915_error_capture_msg(i915, error, engine_mask, error_msg);
-	DRM_INFO("%s\n", error->error_msg);
+	dev_info(i915->drm.dev, "%s\n", error_msg(error, engine_mask, msg));
 
 	if (!error->simulated) {
 		spin_lock_irqsave(&i915->gpu_error.lock, flags);
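
With engine_mask widened to unsigned long, a call into the capture path takes roughly the shape below; the mask and the message are illustrative and the surrounding hang detection is elided.

/* Hedged caller sketch, not a verbatim call site from this series. */
unsigned long hung = BIT(RCS); /* engines judged hung */

i915_capture_error_state(i915, hung, "hang on rcs0");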
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
index ff2652bbb0b0..53b1f22dd365 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.h
+++ b/drivers/gpu/drm/i915/i915_gpu_error.h
@@ -45,6 +45,7 @@ struct i915_gpu_state {
 	u32 reset_count;
 	u32 suspend_count;
 	struct intel_device_info device_info;
+	struct intel_runtime_info runtime_info;
 	struct intel_driver_caps driver_caps;
 	struct i915_params params;
 
@@ -81,11 +82,7 @@ struct i915_gpu_state {
 		int engine_id;
 		/* Software tracked state */
 		bool idle;
-		bool waiting;
-		int num_waiters;
 		unsigned long hangcheck_timestamp;
-		bool hangcheck_stalled;
-		enum intel_engine_hangcheck_action hangcheck_action;
 		struct i915_address_space *vm;
 		int num_requests;
 		u32 reset_count;
@@ -148,6 +145,7 @@ struct i915_gpu_state {
 		struct drm_i915_error_object *default_state;
 
 		struct drm_i915_error_request {
+			unsigned long flags;
 			long jiffies;
 			pid_t pid;
 			u32 context;
@@ -160,12 +158,6 @@ struct i915_gpu_state {
 		} *requests, execlist[EXECLIST_MAX_PORTS];
 		unsigned int num_ports;
 
-		struct drm_i915_error_waiter {
-			char comm[TASK_COMM_LEN];
-			pid_t pid;
-			u32 seqno;
-		} *waiters;
-
 		struct {
 			u32 gfx_mode;
 			union {
@@ -196,6 +188,8 @@ struct i915_gpu_state {
 	struct scatterlist *sgl, *fit;
 };
 
+struct i915_gpu_restart;
+
 struct i915_gpu_error {
 	/* For hangcheck timer */
 #define DRM_I915_HANGCHECK_PERIOD 1500 /* in ms */
@@ -210,8 +204,6 @@ struct i915_gpu_error {
 
 	atomic_t pending_fb_pin;
 
-	unsigned long missed_irq_rings;
-
 	/**
 	 * State variable controlling the reset flow and count
 	 *
@@ -246,15 +238,6 @@ struct i915_gpu_error {
 	 * i915_mutex_lock_interruptible()?). I915_RESET_BACKOFF serves a
 	 * secondary role in preventing two concurrent global reset attempts.
 	 *
-	 * #I915_RESET_HANDOFF - To perform the actual GPU reset, we need the
-	 * struct_mutex. We try to acquire the struct_mutex in the reset worker,
-	 * but it may be held by some long running waiter (that we cannot
-	 * interrupt without causing trouble). Once we are ready to do the GPU
-	 * reset, we set the I915_RESET_HANDOFF bit and wakeup any waiters. If
-	 * they already hold the struct_mutex and want to participate they can
-	 * inspect the bit and do the reset directly, otherwise the worker
-	 * waits for the struct_mutex.
-	 *
 	 * #I915_RESET_ENGINE[num_engines] - Since the driver doesn't need to
 	 * acquire the struct_mutex to reset an engine, we need an explicit
 	 * flag to prevent two concurrent reset attempts in the same engine.
@@ -268,19 +251,14 @@ struct i915_gpu_error {
 	 */
 	unsigned long flags;
 #define I915_RESET_BACKOFF	0
-#define I915_RESET_HANDOFF	1
-#define I915_RESET_MODESET	2
+#define I915_RESET_MODESET	1
+#define I915_RESET_ENGINE	2
 #define I915_WEDGED		(BITS_PER_LONG - 1)
-#define I915_RESET_ENGINE	(I915_WEDGED - I915_NUM_ENGINES)
 
 	/** Number of times an engine has been reset */
 	u32 reset_engine_count[I915_NUM_ENGINES];
 
-	/** Set of stalled engines with guilty requests, in the current reset */
-	u32 stalled_mask;
-
-	/** Reason for the current *global* reset */
-	const char *reason;
+	struct mutex wedge_mutex; /* serialises wedging/unwedging */
 
 	/**
 	 * Waitqueue to signal when a hang is detected. Used for waiters
@@ -294,8 +272,7 @@ struct i915_gpu_error {
 	 */
 	wait_queue_head_t reset_queue;
 
-	/* For missed irq/seqno simulation. */
-	unsigned long test_irq_rings;
+	struct i915_gpu_restart *restart;
 };
 
 struct drm_i915_error_state_buf {
@@ -317,7 +294,7 @@ void i915_error_printf(struct drm_i915_error_state_buf *e, const char *f, ...);
 
 struct i915_gpu_state *i915_capture_gpu_state(struct drm_i915_private *i915);
 void i915_capture_error_state(struct drm_i915_private *dev_priv,
-			      u32 engine_mask,
+			      unsigned long engine_mask,
 			      const char *error_msg);
 
 static inline struct i915_gpu_state *
diff --git a/drivers/gpu/drm/i915/i915_ioc32.c b/drivers/gpu/drm/i915/i915_ioc32.c
index e869daf9c8a9..c1007245f46d 100644
--- a/drivers/gpu/drm/i915/i915_ioc32.c
+++ b/drivers/gpu/drm/i915/i915_ioc32.c
@@ -28,8 +28,8 @@
  */
 #include <linux/compat.h>
 
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
+#include <drm/drm_ioctl.h>
 #include "i915_drv.h"
 
 struct drm_i915_getparam32 {
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index d447d7d508f4..441d2674b272 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -31,7 +31,8 @@
 #include <linux/sysrq.h>
 #include <linux/slab.h>
 #include <linux/circ_buf.h>
-#include <drm/drmP.h>
+#include <drm/drm_irq.h>
+#include <drm/drm_drv.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 #include "i915_trace.h"
@@ -224,10 +225,10 @@ static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir);
 /* For display hotplug interrupt */
 static inline void
 i915_hotplug_interrupt_update_locked(struct drm_i915_private *dev_priv,
-				     uint32_t mask,
-				     uint32_t bits)
+				     u32 mask,
+				     u32 bits)
 {
-	uint32_t val;
+	u32 val;
 
 	lockdep_assert_held(&dev_priv->irq_lock);
 	WARN_ON(bits & ~mask);
@@ -251,8 +252,8 @@ i915_hotplug_interrupt_update_locked(struct drm_i915_private *dev_priv,
  * version is also available.
  */
 void i915_hotplug_interrupt_update(struct drm_i915_private *dev_priv,
-				   uint32_t mask,
-				   uint32_t bits)
+				   u32 mask,
+				   u32 bits)
 {
 	spin_lock_irq(&dev_priv->irq_lock);
 	i915_hotplug_interrupt_update_locked(dev_priv, mask, bits);
@@ -301,10 +302,10 @@ static bool gen11_reset_one_iir(struct drm_i915_private * const i915,
  * @enabled_irq_mask: mask of interrupt bits to enable
  */
 void ilk_update_display_irq(struct drm_i915_private *dev_priv,
-			    uint32_t interrupt_mask,
-			    uint32_t enabled_irq_mask)
+			    u32 interrupt_mask,
+			    u32 enabled_irq_mask)
 {
-	uint32_t new_val;
+	u32 new_val;
 
 	lockdep_assert_held(&dev_priv->irq_lock);
 
@@ -331,8 +332,8 @@ void ilk_update_display_irq(struct drm_i915_private *dev_priv,
  * @enabled_irq_mask: mask of interrupt bits to enable
  */
 static void ilk_update_gt_irq(struct drm_i915_private *dev_priv,
-			      uint32_t interrupt_mask,
-			      uint32_t enabled_irq_mask)
+			      u32 interrupt_mask,
+			      u32 enabled_irq_mask)
 {
 	lockdep_assert_held(&dev_priv->irq_lock);
 
@@ -346,13 +347,13 @@ static void ilk_update_gt_irq(struct drm_i915_private *dev_priv,
 	I915_WRITE(GTIMR, dev_priv->gt_irq_mask);
 }
 
-void gen5_enable_gt_irq(struct drm_i915_private *dev_priv, uint32_t mask)
+void gen5_enable_gt_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
 	ilk_update_gt_irq(dev_priv, mask, mask);
 	POSTING_READ_FW(GTIMR);
 }
 
-void gen5_disable_gt_irq(struct drm_i915_private *dev_priv, uint32_t mask)
+void gen5_disable_gt_irq(struct drm_i915_private *dev_priv, u32 mask)
 {
 	ilk_update_gt_irq(dev_priv, mask, 0);
 }
@@ -391,10 +392,10 @@ static i915_reg_t gen6_pm_ier(struct drm_i915_private *dev_priv)
  * @enabled_irq_mask: mask of interrupt bits to enable
  */
 static void snb_update_pm_irq(struct drm_i915_private *dev_priv,
-			      uint32_t interrupt_mask,
-			      uint32_t enabled_irq_mask)
+			      u32 interrupt_mask,
+			      u32 enabled_irq_mask)
 {
-	uint32_t new_val;
+	u32 new_val;
 
 	WARN_ON(enabled_irq_mask & ~interrupt_mask);
 
@@ -578,11 +579,11 @@ void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv)
  * @enabled_irq_mask: mask of interrupt bits to enable
  */
 static void bdw_update_port_irq(struct drm_i915_private *dev_priv,
-				uint32_t interrupt_mask,
-				uint32_t enabled_irq_mask)
+				u32 interrupt_mask,
+				u32 enabled_irq_mask)
 {
-	uint32_t new_val;
-	uint32_t old_val;
+	u32 new_val;
+	u32 old_val;
 
 	lockdep_assert_held(&dev_priv->irq_lock);
 
@@ -612,10 +613,10 @@ static void bdw_update_port_irq(struct drm_i915_private *dev_priv,
  */
 void bdw_update_pipe_irq(struct drm_i915_private *dev_priv,
 			 enum pipe pipe,
-			 uint32_t interrupt_mask,
-			 uint32_t enabled_irq_mask)
+			 u32 interrupt_mask,
+			 u32 enabled_irq_mask)
 {
-	uint32_t new_val;
+	u32 new_val;
 
 	lockdep_assert_held(&dev_priv->irq_lock);
 
@@ -642,10 +643,10 @@ void bdw_update_pipe_irq(struct drm_i915_private *dev_priv,
  * @enabled_irq_mask: mask of interrupt bits to enable
  */
 void ibx_display_interrupt_update(struct drm_i915_private *dev_priv,
-				  uint32_t interrupt_mask,
-				  uint32_t enabled_irq_mask)
+				  u32 interrupt_mask,
+				  u32 enabled_irq_mask)
 {
-	uint32_t sdeimr = I915_READ(SDEIMR);
+	u32 sdeimr = I915_READ(SDEIMR);
 	sdeimr &= ~interrupt_mask;
 	sdeimr |= (~enabled_irq_mask & interrupt_mask);
 
@@ -822,11 +823,26 @@ static void i915_enable_asle_pipestat(struct drm_i915_private *dev_priv)
 static u32 i915_get_vblank_counter(struct drm_device *dev, unsigned int pipe)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
+	const struct drm_display_mode *mode = &vblank->hwmode;
 	i915_reg_t high_frame, low_frame;
 	u32 high1, high2, low, pixel, vbl_start, hsync_start, htotal;
-	const struct drm_display_mode *mode = &dev->vblank[pipe].hwmode;
 	unsigned long irqflags;
 
+	/*
+	 * On i965gm TV output the frame counter only works up to
+	 * the point when we enable the TV encoder. After that the
+	 * frame counter ceases to work and reads zero. We need a
+	 * vblank wait before enabling the TV encoder and so we
+	 * have to enable vblank interrupts while the frame counter
+	 * is still in a working state. However the core vblank code
+	 * does not like us returning non-zero frame counter values
+	 * when we've told it that we don't have a working frame
+	 * counter. Thus we must stop non-zero values leaking out.
+	 */
+	if (!vblank->max_vblank_count)
+		return 0;
+
 	htotal = mode->crtc_htotal;
 	hsync_start = mode->crtc_hsync_start;
 	vbl_start = mode->crtc_vblank_start;
@@ -950,7 +966,7 @@ static int __intel_get_crtc_scanline(struct intel_crtc *crtc)
 	if (mode->flags & DRM_MODE_FLAG_INTERLACE)
 		vtotal /= 2;
 
-	if (IS_GEN2(dev_priv))
+	if (IS_GEN(dev_priv, 2))
 		position = I915_READ_FW(PIPEDSL(pipe)) & DSL_LINEMASK_GEN2;
 	else
 		position = I915_READ_FW(PIPEDSL(pipe)) & DSL_LINEMASK_GEN3;
@@ -998,6 +1014,9 @@ static bool i915_get_crtc_scanoutpos(struct drm_device *dev, unsigned int pipe,
 	int position;
 	int vbl_start, vbl_end, hsync_start, htotal, vtotal;
 	unsigned long irqflags;
+	bool use_scanline_counter = INTEL_GEN(dev_priv) >= 5 ||
+		IS_G4X(dev_priv) || IS_GEN(dev_priv, 2) ||
+		mode->private_flags & I915_MODE_FLAG_USE_SCANLINE_COUNTER;
 
 	if (WARN_ON(!mode->crtc_clock)) {
 		DRM_DEBUG_DRIVER("trying to get scanoutpos for disabled "
@@ -1030,7 +1049,7 @@ static bool i915_get_crtc_scanoutpos(struct drm_device *dev, unsigned int pipe,
 	if (stime)
 		*stime = ktime_get();
 
-	if (IS_GEN2(dev_priv) || IS_G4X(dev_priv) || INTEL_GEN(dev_priv) >= 5) {
+	if (use_scanline_counter) {
 		/* No obvious pixelcount register. Only query vertical
 		 * scanout position from Display scan line register.
 		 */
@@ -1090,7 +1109,7 @@ static bool i915_get_crtc_scanoutpos(struct drm_device *dev, unsigned int pipe,
 	else
 		position += vtotal - vbl_end;
 
-	if (IS_GEN2(dev_priv) || IS_G4X(dev_priv) || INTEL_GEN(dev_priv) >= 5) {
+	if (use_scanline_counter) {
 		*vpos = position;
 		*hpos = 0;
 	} else {
@@ -1152,76 +1171,6 @@ static void ironlake_rps_change_irq_handler(struct drm_i915_private *dev_priv)
 	return;
 }
 
-static void notify_ring(struct intel_engine_cs *engine)
-{
-	const u32 seqno = intel_engine_get_seqno(engine);
-	struct i915_request *rq = NULL;
-	struct task_struct *tsk = NULL;
-	struct intel_wait *wait;
-
-	if (unlikely(!engine->breadcrumbs.irq_armed))
-		return;
-
-	rcu_read_lock();
-
-	spin_lock(&engine->breadcrumbs.irq_lock);
-	wait = engine->breadcrumbs.irq_wait;
-	if (wait) {
-		/*
-		 * We use a callback from the dma-fence to submit
-		 * requests after waiting on our own requests. To
-		 * ensure minimum delay in queuing the next request to
-		 * hardware, signal the fence now rather than wait for
-		 * the signaler to be woken up. We still wake up the
-		 * waiter in order to handle the irq-seqno coherency
-		 * issues (we may receive the interrupt before the
-		 * seqno is written, see __i915_request_irq_complete())
-		 * and to handle coalescing of multiple seqno updates
-		 * and many waiters.
-		 */
-		if (i915_seqno_passed(seqno, wait->seqno)) {
-			struct i915_request *waiter = wait->request;
-
-			if (waiter &&
-			    !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
-				      &waiter->fence.flags) &&
-			    intel_wait_check_request(wait, waiter))
-				rq = i915_request_get(waiter);
-
-			tsk = wait->tsk;
-		} else {
-			if (engine->irq_seqno_barrier &&
-			    i915_seqno_passed(seqno, wait->seqno - 1)) {
-				set_bit(ENGINE_IRQ_BREADCRUMB,
-					&engine->irq_posted);
-				tsk = wait->tsk;
-			}
-		}
-
-		engine->breadcrumbs.irq_count++;
-	} else {
-		if (engine->breadcrumbs.irq_armed)
-			__intel_engine_disarm_breadcrumbs(engine);
-	}
-	spin_unlock(&engine->breadcrumbs.irq_lock);
-
-	if (rq) {
-		spin_lock(&rq->lock);
-		dma_fence_signal_locked(&rq->fence);
-		GEM_BUG_ON(!i915_request_completed(rq));
-		spin_unlock(&rq->lock);
-
-		i915_request_put(rq);
-	}
-
-	if (tsk && tsk->state & TASK_NORMAL)
-		wake_up_process(tsk);
-
-	rcu_read_unlock();
-
-	trace_intel_engine_notify(engine, wait);
-}
-
 static void vlv_c0_read(struct drm_i915_private *dev_priv,
 			struct intel_rps_ei *ei)
 {
@@ -1376,8 +1325,8 @@ static void ivybridge_parity_work(struct work_struct *work)
 		container_of(work, typeof(*dev_priv), l3_parity.error_work);
 	u32 error_status, row, bank, subbank;
 	char *parity_event[6];
-	uint32_t misccpctl;
-	uint8_t slice = 0;
+	u32 misccpctl;
+	u8 slice = 0;
 
 	/* We must turn off DOP level clock gating to access the L3 registers.
 	 * In order to prevent a get/put style interface, acquire struct mutex
@@ -1466,20 +1415,20 @@ static void ilk_gt_irq_handler(struct drm_i915_private *dev_priv,
 			       u32 gt_iir)
 {
 	if (gt_iir & GT_RENDER_USER_INTERRUPT)
-		notify_ring(dev_priv->engine[RCS]);
+		intel_engine_breadcrumbs_irq(dev_priv->engine[RCS]);
 	if (gt_iir & ILK_BSD_USER_INTERRUPT)
-		notify_ring(dev_priv->engine[VCS]);
+		intel_engine_breadcrumbs_irq(dev_priv->engine[VCS]);
 }
 
 static void snb_gt_irq_handler(struct drm_i915_private *dev_priv,
 			       u32 gt_iir)
 {
 	if (gt_iir & GT_RENDER_USER_INTERRUPT)
-		notify_ring(dev_priv->engine[RCS]);
+		intel_engine_breadcrumbs_irq(dev_priv->engine[RCS]);
 	if (gt_iir & GT_BSD_USER_INTERRUPT)
-		notify_ring(dev_priv->engine[VCS]);
+		intel_engine_breadcrumbs_irq(dev_priv->engine[VCS]);
 	if (gt_iir & GT_BLT_USER_INTERRUPT)
-		notify_ring(dev_priv->engine[BCS]);
+		intel_engine_breadcrumbs_irq(dev_priv->engine[BCS]);
 
 	if (gt_iir & (GT_BLT_CS_ERROR_INTERRUPT |
 		      GT_BSD_CS_ERROR_INTERRUPT |
@@ -1499,7 +1448,7 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir)
 		tasklet = true;
 
 	if (iir & GT_RENDER_USER_INTERRUPT) {
-		notify_ring(engine);
+		intel_engine_breadcrumbs_irq(engine);
 		tasklet |= USES_GUC_SUBMISSION(engine->i915);
 	}
 
@@ -1738,13 +1687,13 @@ static void dp_aux_irq_handler(struct drm_i915_private *dev_priv)
 #if defined(CONFIG_DEBUG_FS)
 static void display_pipe_crc_irq_handler(struct drm_i915_private *dev_priv,
 					 enum pipe pipe,
-					 uint32_t crc0, uint32_t crc1,
-					 uint32_t crc2, uint32_t crc3,
-					 uint32_t crc4)
+					 u32 crc0, u32 crc1,
+					 u32 crc2, u32 crc3,
+					 u32 crc4)
 {
 	struct intel_pipe_crc *pipe_crc = &dev_priv->pipe_crc[pipe];
 	struct intel_crtc *crtc = intel_get_crtc_for_pipe(dev_priv, pipe);
-	uint32_t crcs[5];
+	u32 crcs[5];
 
 	spin_lock(&pipe_crc->lock);
 	/*
@@ -1776,9 +1725,9 @@ static void display_pipe_crc_irq_handler(struct drm_i915_private *dev_priv,
 static inline void
 display_pipe_crc_irq_handler(struct drm_i915_private *dev_priv,
 			     enum pipe pipe,
-			     uint32_t crc0, uint32_t crc1,
-			     uint32_t crc2, uint32_t crc3,
-			     uint32_t crc4) {}
+			     u32 crc0, u32 crc1,
+			     u32 crc2, u32 crc3,
+			     u32 crc4) {}
 #endif
 
 
@@ -1804,7 +1753,7 @@ static void ivb_pipe_crc_irq_handler(struct drm_i915_private *dev_priv,
 static void i9xx_pipe_crc_irq_handler(struct drm_i915_private *dev_priv,
 				      enum pipe pipe)
 {
-	uint32_t res1, res2;
+	u32 res1, res2;
 
 	if (INTEL_GEN(dev_priv) >= 3)
 		res1 = I915_READ(PIPE_CRC_RES_RES1_I915(pipe));
@@ -1845,7 +1794,7 @@ static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir)
 
 	if (HAS_VEBOX(dev_priv)) {
 		if (pm_iir & PM_VEBOX_USER_INTERRUPT)
-			notify_ring(dev_priv->engine[VECS]);
+			intel_engine_breadcrumbs_irq(dev_priv->engine[VECS]);
 
 		if (pm_iir & PM_VEBOX_CS_ERROR_INTERRUPT)
 			DRM_DEBUG("Command parser error, pm_iir 0x%08x\n", pm_iir);
@@ -2547,7 +2496,7 @@ static void ilk_display_irq_handler(struct drm_i915_private *dev_priv,
 		I915_WRITE(SDEIIR, pch_iir);
 	}
 
-	if (IS_GEN5(dev_priv) && de_iir & DE_PCU_EVENT)
+	if (IS_GEN(dev_priv, 5) && de_iir & DE_PCU_EVENT)
 		ironlake_rps_change_irq_handler(dev_priv);
 }
 
@@ -2938,46 +2887,6 @@ static irqreturn_t gen8_irq_handler(int irq, void *arg)
 	return IRQ_HANDLED;
 }
 
-struct wedge_me {
-	struct delayed_work work;
-	struct drm_i915_private *i915;
-	const char *name;
-};
-
-static void wedge_me(struct work_struct *work)
-{
-	struct wedge_me *w = container_of(work, typeof(*w), work.work);
-
-	dev_err(w->i915->drm.dev,
-		"%s timed out, cancelling all in-flight rendering.\n",
-		w->name);
-	i915_gem_set_wedged(w->i915);
-}
-
-static void __init_wedge(struct wedge_me *w,
-			 struct drm_i915_private *i915,
-			 long timeout,
-			 const char *name)
-{
-	w->i915 = i915;
-	w->name = name;
-
-	INIT_DELAYED_WORK_ONSTACK(&w->work, wedge_me);
-	schedule_delayed_work(&w->work, timeout);
-}
-
-static void __fini_wedge(struct wedge_me *w)
-{
-	cancel_delayed_work_sync(&w->work);
-	destroy_delayed_work_on_stack(&w->work);
-	w->i915 = NULL;
-}
-
-#define i915_wedge_on_timeout(W, DEV, TIMEOUT)				\
-	for (__init_wedge((W), (DEV), (TIMEOUT), __func__);		\
-	     (W)->i915;							\
-	     __fini_wedge((W)))
-
 static u32
 gen11_gt_engine_identity(struct drm_i915_private * const i915,
 			 const unsigned int bank, const unsigned int bit)
@@ -3188,203 +3097,6 @@ static irqreturn_t gen11_irq_handler(int irq, void *arg)
 	return IRQ_HANDLED;
 }
 
-static void i915_reset_device(struct drm_i915_private *dev_priv,
-			      u32 engine_mask,
-			      const char *reason)
-{
-	struct i915_gpu_error *error = &dev_priv->gpu_error;
-	struct kobject *kobj = &dev_priv->drm.primary->kdev->kobj;
-	char *error_event[] = { I915_ERROR_UEVENT "=1", NULL };
-	char *reset_event[] = { I915_RESET_UEVENT "=1", NULL };
-	char *reset_done_event[] = { I915_ERROR_UEVENT "=0", NULL };
-	struct wedge_me w;
-
-	kobject_uevent_env(kobj, KOBJ_CHANGE, error_event);
-
-	DRM_DEBUG_DRIVER("resetting chip\n");
-	kobject_uevent_env(kobj, KOBJ_CHANGE, reset_event);
-
-	/* Use a watchdog to ensure that our reset completes */
-	i915_wedge_on_timeout(&w, dev_priv, 5*HZ) {
-		intel_prepare_reset(dev_priv);
-
-		error->reason = reason;
-		error->stalled_mask = engine_mask;
-
-		/* Signal that locked waiters should reset the GPU */
-		smp_mb__before_atomic();
-		set_bit(I915_RESET_HANDOFF, &error->flags);
-		wake_up_all(&error->wait_queue);
-
-		/* Wait for anyone holding the lock to wakeup, without
-		 * blocking indefinitely on struct_mutex.
-		 */
-		do {
-			if (mutex_trylock(&dev_priv->drm.struct_mutex)) {
-				i915_reset(dev_priv, engine_mask, reason);
-				mutex_unlock(&dev_priv->drm.struct_mutex);
-			}
-		} while (wait_on_bit_timeout(&error->flags,
-					     I915_RESET_HANDOFF,
-					     TASK_UNINTERRUPTIBLE,
-					     1));
-
-		error->stalled_mask = 0;
-		error->reason = NULL;
-
-		intel_finish_reset(dev_priv);
-	}
-
-	if (!test_bit(I915_WEDGED, &error->flags))
-		kobject_uevent_env(kobj, KOBJ_CHANGE, reset_done_event);
-}
-
-void i915_clear_error_registers(struct drm_i915_private *dev_priv)
-{
-	u32 eir;
-
-	if (!IS_GEN2(dev_priv))
-		I915_WRITE(PGTBL_ER, I915_READ(PGTBL_ER));
-
-	if (INTEL_GEN(dev_priv) < 4)
-		I915_WRITE(IPEIR, I915_READ(IPEIR));
-	else
-		I915_WRITE(IPEIR_I965, I915_READ(IPEIR_I965));
-
-	I915_WRITE(EIR, I915_READ(EIR));
-	eir = I915_READ(EIR);
-	if (eir) {
-		/*
-		 * some errors might have become stuck,
-		 * mask them.
-		 */
-		DRM_DEBUG_DRIVER("EIR stuck: 0x%08x, masking\n", eir);
-		I915_WRITE(EMR, I915_READ(EMR) | eir);
-		I915_WRITE(IIR, I915_MASTER_ERROR_INTERRUPT);
-	}
-
-	if (INTEL_GEN(dev_priv) >= 8) {
-		I915_WRITE(GEN8_RING_FAULT_REG,
-			   I915_READ(GEN8_RING_FAULT_REG) & ~RING_FAULT_VALID);
-		POSTING_READ(GEN8_RING_FAULT_REG);
-	} else if (INTEL_GEN(dev_priv) >= 6) {
-		struct intel_engine_cs *engine;
-		enum intel_engine_id id;
-
-		for_each_engine(engine, dev_priv, id) {
-			I915_WRITE(RING_FAULT_REG(engine),
-				   I915_READ(RING_FAULT_REG(engine)) &
-				   ~RING_FAULT_VALID);
-		}
-		POSTING_READ(RING_FAULT_REG(dev_priv->engine[RCS]));
-	}
-}
-
-/**
- * i915_handle_error - handle a gpu error
- * @dev_priv: i915 device private
- * @engine_mask: mask representing engines that are hung
- * @flags: control flags
- * @fmt: Error message format string
- *
- * Do some basic checking of register state at error time and
- * dump it to the syslog.  Also call i915_capture_error_state() to make
- * sure we get a record and make it available in debugfs.  Fire a uevent
- * so userspace knows something bad happened (should trigger collection
- * of a ring dump etc.).
- */
-void i915_handle_error(struct drm_i915_private *dev_priv,
-		       u32 engine_mask,
-		       unsigned long flags,
-		       const char *fmt, ...)
-{
-	struct intel_engine_cs *engine;
-	unsigned int tmp;
-	char error_msg[80];
-	char *msg = NULL;
-
-	if (fmt) {
-		va_list args;
-
-		va_start(args, fmt);
-		vscnprintf(error_msg, sizeof(error_msg), fmt, args);
-		va_end(args);
-
-		msg = error_msg;
-	}
-
-	/*
-	 * In most cases it's guaranteed that we get here with an RPM
-	 * reference held, for example because there is a pending GPU
-	 * request that won't finish until the reset is done. This
-	 * isn't the case at least when we get here by doing a
-	 * simulated reset via debugfs, so get an RPM reference.
-	 */
-	intel_runtime_pm_get(dev_priv);
-
-	engine_mask &= INTEL_INFO(dev_priv)->ring_mask;
-
-	if (flags & I915_ERROR_CAPTURE) {
-		i915_capture_error_state(dev_priv, engine_mask, msg);
-		i915_clear_error_registers(dev_priv);
-	}
-
-	/*
-	 * Try engine reset when available. We fall back to full reset if
-	 * single reset fails.
-	 */
-	if (intel_has_reset_engine(dev_priv) &&
-	    !i915_terminally_wedged(&dev_priv->gpu_error)) {
-		for_each_engine_masked(engine, dev_priv, engine_mask, tmp) {
-			BUILD_BUG_ON(I915_RESET_MODESET >= I915_RESET_ENGINE);
-			if (test_and_set_bit(I915_RESET_ENGINE + engine->id,
-					     &dev_priv->gpu_error.flags))
-				continue;
-
-			if (i915_reset_engine(engine, msg) == 0)
-				engine_mask &= ~intel_engine_flag(engine);
-
-			clear_bit(I915_RESET_ENGINE + engine->id,
-				  &dev_priv->gpu_error.flags);
-			wake_up_bit(&dev_priv->gpu_error.flags,
-				    I915_RESET_ENGINE + engine->id);
-		}
-	}
-
-	if (!engine_mask)
-		goto out;
-
-	/* Full reset needs the mutex, stop any other user trying to do so. */
-	if (test_and_set_bit(I915_RESET_BACKOFF, &dev_priv->gpu_error.flags)) {
-		wait_event(dev_priv->gpu_error.reset_queue,
-			   !test_bit(I915_RESET_BACKOFF,
-				     &dev_priv->gpu_error.flags));
-		goto out;
-	}
-
-	/* Prevent any other reset-engine attempt. */
-	for_each_engine(engine, dev_priv, tmp) {
-		while (test_and_set_bit(I915_RESET_ENGINE + engine->id,
-					&dev_priv->gpu_error.flags))
-			wait_on_bit(&dev_priv->gpu_error.flags,
-				    I915_RESET_ENGINE + engine->id,
-				    TASK_UNINTERRUPTIBLE);
-	}
-
-	i915_reset_device(dev_priv, engine_mask, msg);
-
-	for_each_engine(engine, dev_priv, tmp) {
-		clear_bit(I915_RESET_ENGINE + engine->id,
-			  &dev_priv->gpu_error.flags);
-	}
-
-	clear_bit(I915_RESET_BACKOFF, &dev_priv->gpu_error.flags);
-	wake_up_all(&dev_priv->gpu_error.reset_queue);
-
-out:
-	intel_runtime_pm_put(dev_priv);
-}
-
 /* Called from drm generic code, passed 'crtc' which
  * we use as a pipe index
  */
@@ -3417,7 +3129,7 @@ static int ironlake_enable_vblank(struct drm_device *dev, unsigned int pipe)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	unsigned long irqflags;
-	uint32_t bit = INTEL_GEN(dev_priv) >= 7 ?
+	u32 bit = INTEL_GEN(dev_priv) >= 7 ?
 		DE_PIPE_VBLANK_IVB(pipe) : DE_PIPE_VBLANK(pipe);
 
 	spin_lock_irqsave(&dev_priv->irq_lock, irqflags);
@@ -3479,7 +3191,7 @@ static void ironlake_disable_vblank(struct drm_device *dev, unsigned int pipe)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	unsigned long irqflags;
-	uint32_t bit = INTEL_GEN(dev_priv) >= 7 ?
+	u32 bit = INTEL_GEN(dev_priv) >= 7 ?
 		DE_PIPE_VBLANK_IVB(pipe) : DE_PIPE_VBLANK(pipe);
 
 	spin_lock_irqsave(&dev_priv->irq_lock, irqflags);
@@ -3586,11 +3298,8 @@ static void ironlake_irq_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
 
-	if (IS_GEN5(dev_priv))
-		I915_WRITE(HWSTAM, 0xffffffff);
-
 	GEN3_IRQ_RESET(DE);
-	if (IS_GEN7(dev_priv))
+	if (IS_GEN(dev_priv, 7))
 		I915_WRITE(GEN7_ERR_INT, 0xffffffff);
 
 	if (IS_HASWELL(dev_priv)) {
@@ -3700,7 +3409,7 @@ static void gen11_irq_reset(struct drm_device *dev)
 void gen8_irq_power_well_post_enable(struct drm_i915_private *dev_priv,
 				     u8 pipe_mask)
 {
-	uint32_t extra_ier = GEN8_PIPE_VBLANK | GEN8_PIPE_FIFO_UNDERRUN;
+	u32 extra_ier = GEN8_PIPE_VBLANK | GEN8_PIPE_FIFO_UNDERRUN;
 	enum pipe pipe;
 
 	spin_lock_irq(&dev_priv->irq_lock);
@@ -4045,7 +3754,7 @@ static void gen5_gt_irq_postinstall(struct drm_device *dev)
 	}
 
 	gt_irqs |= GT_RENDER_USER_INTERRUPT;
-	if (IS_GEN5(dev_priv)) {
+	if (IS_GEN(dev_priv, 5)) {
 		gt_irqs |= ILK_BSD_USER_INTERRUPT;
 	} else {
 		gt_irqs |= GT_BLT_USER_INTERRUPT | GT_BSD_USER_INTERRUPT;
@@ -4169,7 +3878,7 @@ static int valleyview_irq_postinstall(struct drm_device *dev)
 static void gen8_gt_irq_postinstall(struct drm_i915_private *dev_priv)
 {
 	/* These are interrupts we'll toggle with the ring mask register */
-	uint32_t gt_interrupts[] = {
+	u32 gt_interrupts[] = {
 		GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
 			GT_CONTEXT_SWITCH_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
 			GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT |
@@ -4183,9 +3892,6 @@ static void gen8_gt_irq_postinstall(struct drm_i915_private *dev_priv)
 			GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VECS_IRQ_SHIFT
 		};
 
-	if (HAS_L3_DPF(dev_priv))
-		gt_interrupts[0] |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
-
 	dev_priv->pm_ier = 0x0;
 	dev_priv->pm_imr = ~dev_priv->pm_ier;
 	GEN8_IRQ_INIT_NDX(GT, 0, ~gt_interrupts[0], gt_interrupts[0]);
@@ -4200,8 +3906,8 @@ static void gen8_gt_irq_postinstall(struct drm_i915_private *dev_priv)
 
 static void gen8_de_irq_postinstall(struct drm_i915_private *dev_priv)
 {
-	uint32_t de_pipe_masked = GEN8_PIPE_CDCLK_CRC_DONE;
-	uint32_t de_pipe_enables;
+	u32 de_pipe_masked = GEN8_PIPE_CDCLK_CRC_DONE;
+	u32 de_pipe_enables;
 	u32 de_port_masked = GEN8_AUX_CHANNEL_A;
 	u32 de_port_enables;
 	u32 de_misc_masked = GEN8_DE_EDP_PSR;
@@ -4341,6 +4047,7 @@ static int gen11_irq_postinstall(struct drm_device *dev)
 	I915_WRITE(GEN11_DISPLAY_INT_CTL, GEN11_DISPLAY_IRQ_ENABLE);
 
 	gen11_master_intr_enable(dev_priv->regs);
+	POSTING_READ(GEN11_GFX_MSTR_IRQ);
 
 	return 0;
 }
@@ -4368,8 +4075,6 @@ static void i8xx_irq_reset(struct drm_device *dev)
 
 	i9xx_pipestat_irq_reset(dev_priv);
 
-	I915_WRITE16(HWSTAM, 0xffff);
-
 	GEN2_IRQ_RESET();
 }
 
@@ -4513,7 +4218,7 @@ static irqreturn_t i8xx_irq_handler(int irq, void *arg)
 		I915_WRITE16(IIR, iir);
 
 		if (iir & I915_USER_INTERRUPT)
-			notify_ring(dev_priv->engine[RCS]);
+			intel_engine_breadcrumbs_irq(dev_priv->engine[RCS]);
 
 		if (iir & I915_MASTER_ERROR_INTERRUPT)
 			i8xx_error_irq_handler(dev_priv, eir, eir_stuck);
@@ -4537,8 +4242,6 @@ static void i915_irq_reset(struct drm_device *dev)
 
 	i9xx_pipestat_irq_reset(dev_priv);
 
-	I915_WRITE(HWSTAM, 0xffffffff);
-
 	GEN3_IRQ_RESET();
 }
 
@@ -4623,7 +4326,7 @@ static irqreturn_t i915_irq_handler(int irq, void *arg)
 		I915_WRITE(IIR, iir);
 
 		if (iir & I915_USER_INTERRUPT)
-			notify_ring(dev_priv->engine[RCS]);
+			intel_engine_breadcrumbs_irq(dev_priv->engine[RCS]);
 
 		if (iir & I915_MASTER_ERROR_INTERRUPT)
 			i9xx_error_irq_handler(dev_priv, eir, eir_stuck);
@@ -4648,8 +4351,6 @@ static void i965_irq_reset(struct drm_device *dev)
 
 	i9xx_pipestat_irq_reset(dev_priv);
 
-	I915_WRITE(HWSTAM, 0xffffffff);
-
 	GEN3_IRQ_RESET();
 }
 
@@ -4770,10 +4471,10 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 		I915_WRITE(IIR, iir);
 
 		if (iir & I915_USER_INTERRUPT)
-			notify_ring(dev_priv->engine[RCS]);
+			intel_engine_breadcrumbs_irq(dev_priv->engine[RCS]);
 
 		if (iir & I915_BSD_USER_INTERRUPT)
-			notify_ring(dev_priv->engine[VCS]);
+			intel_engine_breadcrumbs_irq(dev_priv->engine[VCS]);
 
 		if (iir & I915_MASTER_ERROR_INTERRUPT)
 			i9xx_error_irq_handler(dev_priv, eir, eir_stuck);
@@ -4836,23 +4537,17 @@ void intel_irq_init(struct drm_i915_private *dev_priv)
 	if (INTEL_GEN(dev_priv) >= 8)
 		rps->pm_intrmsk_mbz |= GEN8_PMINTR_DISABLE_REDIRECT_TO_GUC;
 
-	if (IS_GEN2(dev_priv)) {
-		/* Gen2 doesn't have a hardware frame counter */
-		dev->max_vblank_count = 0;
-	} else if (IS_G4X(dev_priv) || INTEL_GEN(dev_priv) >= 5) {
-		dev->max_vblank_count = 0xffffffff; /* full 32 bit counter */
+	if (INTEL_GEN(dev_priv) >= 5 || IS_G4X(dev_priv))
 		dev->driver->get_vblank_counter = g4x_get_vblank_counter;
-	} else {
+	else if (INTEL_GEN(dev_priv) >= 3)
 		dev->driver->get_vblank_counter = i915_get_vblank_counter;
-		dev->max_vblank_count = 0xffffff; /* only 24 bits of frame count */
-	}
 
 	/*
 	 * Opt out of the vblank disable timer on everything except gen2.
 	 * Gen2 doesn't have a hardware frame counter and so depends on
 	 * vblank interrupts to produce sane vblank sequence numbers.
 	 */
-	if (!IS_GEN2(dev_priv))
+	if (!IS_GEN(dev_priv, 2))
 		dev->vblank_disable_immediate = true;
 
 	/* Most platforms treat the display irq block as an always-on
@@ -4924,14 +4619,14 @@ void intel_irq_init(struct drm_i915_private *dev_priv)
 		dev->driver->disable_vblank = ironlake_disable_vblank;
 		dev_priv->display.hpd_irq_setup = ilk_hpd_irq_setup;
 	} else {
-		if (IS_GEN2(dev_priv)) {
+		if (IS_GEN(dev_priv, 2)) {
 			dev->driver->irq_preinstall = i8xx_irq_reset;
 			dev->driver->irq_postinstall = i8xx_irq_postinstall;
 			dev->driver->irq_handler = i8xx_irq_handler;
 			dev->driver->irq_uninstall = i8xx_irq_reset;
 			dev->driver->enable_vblank = i8xx_enable_vblank;
 			dev->driver->disable_vblank = i8xx_disable_vblank;
-		} else if (IS_GEN3(dev_priv)) {
+		} else if (IS_GEN(dev_priv, 3)) {
 			dev->driver->irq_preinstall = i915_irq_reset;
 			dev->driver->irq_postinstall = i915_irq_postinstall;
 			dev->driver->irq_uninstall = i915_irq_reset;
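
Throughout this file IS_GENx(dev_priv) becomes IS_GEN(dev_priv, x), and adjacent gen checks collapse into IS_GEN_RANGE(). A mask-per-generation implementation along the following lines (an assumption for illustration; the driver's real macros live elsewhere) makes the range test a single AND:

#include <stdio.h>

#define BIT(n)          (1u << (n))
#define GENMASK(h, l)   (((~0u) >> (31 - (h))) & ((~0u) << (l)))

/* One bit per hardware generation, set once at probe time. */
struct dev_info { unsigned int gen_mask; };

#define IS_GEN(info, n)          ((info)->gen_mask & BIT(n))
#define IS_GEN_RANGE(info, s, e) ((info)->gen_mask & GENMASK(e, s))

int main(void)
{
	const struct dev_info ivb = { .gen_mask = BIT(7) }; /* a gen7 part */

	printf("gen7=%d gen6-7=%d gen8-9=%d\n",
	       !!IS_GEN(&ivb, 7),
	       !!IS_GEN_RANGE(&ivb, 6, 7),
	       !!IS_GEN_RANGE(&ivb, 8, 9)); /* 1 1 0 */
	return 0;
}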
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 2e0356561839..b5be0abbba35 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -77,7 +77,7 @@ i915_param_named(error_capture, bool, 0600,
 	"triaging and debugging hangs.");
 #endif
 
-i915_param_named_unsafe(enable_hangcheck, bool, 0644,
+i915_param_named_unsafe(enable_hangcheck, bool, 0600,
 	"Periodically check GPU activity for detecting hangs. "
 	"WARNING: Disabling this can cause system wide hangs. "
 	"(default: true)");
@@ -97,8 +97,10 @@ i915_param_named_unsafe(disable_power_well, int, 0400,
 
 i915_param_named_unsafe(enable_ips, int, 0600, "Enable IPS (default: true)");
 
-i915_param_named(fastboot, bool, 0600,
-	"Try to skip unnecessary mode sets at boot time (default: false)");
+i915_param_named(fastboot, int, 0600,
+	"Try to skip unnecessary mode sets at boot time "
+	"(0=disabled, 1=enabled) "
+	"Default: -1 (use per-chip default)");
 
 i915_param_named_unsafe(prefault_disable, bool, 0600,
 	"Disable page prefaulting for pread/pwrite/reloc (default:false). "
@@ -203,3 +205,33 @@ void i915_params_dump(const struct i915_params *params, struct drm_printer *p)
 	I915_PARAMS_FOR_EACH(PRINT);
 #undef PRINT
 }
+
+static __always_inline void dup_param(const char *type, void *x)
+{
+	if (!__builtin_strcmp(type, "char *"))
+		*(void **)x = kstrdup(*(void **)x, GFP_ATOMIC);
+}
+
+void i915_params_copy(struct i915_params *dest, const struct i915_params *src)
+{
+	*dest = *src;
+#define DUP(T, x, ...) dup_param(#T, &dest->x);
+	I915_PARAMS_FOR_EACH(DUP);
+#undef DUP
+}
+
+static __always_inline void free_param(const char *type, void *x)
+{
+	if (!__builtin_strcmp(type, "char *")) {
+		kfree(*(void **)x);
+		*(void **)x = NULL;
+	}
+}
+
+/* free the allocated members, *not* the passed in params itself */
+void i915_params_free(struct i915_params *params)
+{
+#define FREE(T, x, ...) free_param(#T, &params->x);
+	I915_PARAMS_FOR_EACH(FREE);
+#undef FREE
+}
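
The dup_param()/free_param() helpers dispatch on the stringified parameter type: __builtin_strcmp() of two string literals folds to a compile-time constant, so for every non-pointer member the call compiles away entirely. A standalone demonstration of the same trick:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct params { char *name; int level; };

static void dup_param(const char *type, void *x)
{
	/* Folds to a no-op at compile time unless the member is a char *. */
	if (!__builtin_strcmp(type, "char *"))
		*(char **)x = strdup(*(char **)x);
}

#define PARAMS_FOR_EACH(op) \
	op(char *, name) \
	op(int, level)

int main(void)
{
	struct params src = { .name = strdup("deep"), .level = 3 };
	struct params dst = src; /* shallow copy first, like *dest = *src */

#define DUP(T, x) dup_param(#T, &dst.x);
	PARAMS_FOR_EACH(DUP) /* deep-copies only the string member */
#undef DUP

	printf("%s %d distinct=%d\n", dst.name, dst.level,
	       dst.name != src.name); /* deep 3 distinct=1 */
	free(src.name);
	free(dst.name);
	return 0;
}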
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index 7e56c516c815..3f14e9881a0d 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -33,6 +33,15 @@ struct drm_printer;
 #define ENABLE_GUC_SUBMISSION		BIT(0)
 #define ENABLE_GUC_LOAD_HUC		BIT(1)
 
+/*
+ * Invoke param, a function-like macro, for each i915 param, with arguments:
+ *
+ * param(type, name, value)
+ *
+ * type: parameter type, one of {bool, int, unsigned int, char *}
+ * name: name of the parameter
+ * value: initial/default value of the parameter
+ */
 #define I915_PARAMS_FOR_EACH(param) \
 	param(char *, vbt_firmware, NULL) \
 	param(int, modeset, -1) \
@@ -54,10 +63,10 @@ struct drm_printer;
 	param(int, edp_vswing, 0) \
 	param(int, reset, 2) \
 	param(unsigned int, inject_load_failure, 0) \
+	param(int, fastboot, -1) \
 	/* leave bools at the end to not create holes */ \
 	param(bool, alpha_support, IS_ENABLED(CONFIG_DRM_I915_ALPHA_SUPPORT)) \
 	param(bool, enable_hangcheck, true) \
-	param(bool, fastboot, false) \
 	param(bool, prefault_disable, false) \
 	param(bool, load_detect_test, false) \
 	param(bool, force_reset_modeset_test, false) \
@@ -78,6 +87,8 @@ struct i915_params {
 extern struct i915_params i915_modparams __read_mostly;
 
 void i915_params_dump(const struct i915_params *params, struct drm_printer *p);
+void i915_params_copy(struct i915_params *dest, const struct i915_params *src);
+void i915_params_free(struct i915_params *params);
 
 #endif
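
The same I915_PARAMS_FOR_EACH() list presumably also declares the members of struct i915_params itself, in the usual X-macro style; a sketch of that pattern (the MEMBER name here is an assumption):

/* Hypothetical sketch: the value argument is swallowed by the ... */
#define MEMBER(T, member, ...) T member;
struct i915_params_sketch {
	I915_PARAMS_FOR_EACH(MEMBER)
};
#undef MEMBER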
 
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 6350db5503cd..66f82f3f050f 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -26,6 +26,9 @@
 #include <linux/vgaarb.h>
 #include <linux/vga_switcheroo.h>
 
+#include <drm/drm_drv.h>
+
+#include "i915_active.h"
 #include "i915_drv.h"
 #include "i915_selftest.h"
 
@@ -67,9 +70,15 @@
 #define BDW_COLORS \
 	.color = { .degamma_lut_size = 512, .gamma_lut_size = 512 }
 #define CHV_COLORS \
-	.color = { .degamma_lut_size = 65, .gamma_lut_size = 257 }
+	.color = { .degamma_lut_size = 65, .gamma_lut_size = 257, \
+		   .degamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING, \
+		   .gamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING, \
+	}
 #define GLK_COLORS \
-	.color = { .degamma_lut_size = 0, .gamma_lut_size = 1024 }
+	.color = { .degamma_lut_size = 0, .gamma_lut_size = 1024, \
+		   .degamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING | \
+					DRM_COLOR_LUT_EQUAL_CHANNELS, \
+	}
 
 /* Keep in gen based order, and chronological order within a gen */
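
The new degamma_lut_tests/gamma_lut_tests fields let the core reject LUTs the hardware cannot program; as I read the flags, DRM_COLOR_LUT_NON_DECREASING demands per-channel monotonic entries and DRM_COLOR_LUT_EQUAL_CHANNELS demands r == g == b in every entry (GLK degamma programs a single value per entry). A hedged standalone model of those two checks:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct lut_entry { uint16_t red, green, blue; }; /* stand-in for drm_color_lut */

static bool lut_ok(const struct lut_entry *lut, int n,
		   bool non_decreasing, bool equal_channels)
{
	for (int i = 0; i < n; i++) {
		if (equal_channels &&
		    (lut[i].red != lut[i].green || lut[i].green != lut[i].blue))
			return false; /* channels must match per entry */
		if (non_decreasing && i &&
		    (lut[i].red < lut[i - 1].red ||
		     lut[i].green < lut[i - 1].green ||
		     lut[i].blue < lut[i - 1].blue))
			return false; /* table must be monotonic */
	}
	return true;
}

int main(void)
{
	const struct lut_entry bad[] = { { 2, 2, 2 }, { 1, 1, 1 } };

	printf("%d\n", lut_ok(bad, 2, true, true)); /* 0: decreasing table */
	return 0;
}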
 
@@ -81,7 +90,8 @@
 	.num_pipes = 1, \
 	.display.has_overlay = 1, \
 	.display.overlay_needs_physical = 1, \
-	.display.has_gmch_display = 1, \
+	.display.has_gmch = 1, \
+	.gpu_reset_clobbers_display = true, \
 	.hws_needs_physical = 1, \
 	.unfenced_needs_alignment = 1, \
 	.ring_mask = RENDER_RING, \
@@ -121,7 +131,8 @@ static const struct intel_device_info intel_i865g_info = {
 #define GEN3_FEATURES \
 	GEN(3), \
 	.num_pipes = 2, \
-	.display.has_gmch_display = 1, \
+	.display.has_gmch = 1, \
+	.gpu_reset_clobbers_display = true, \
 	.ring_mask = RENDER_RING, \
 	.has_snoop = true, \
 	.has_coherent_ggtt = true, \
@@ -197,7 +208,8 @@ static const struct intel_device_info intel_pineview_info = {
 	GEN(4), \
 	.num_pipes = 2, \
 	.display.has_hotplug = 1, \
-	.display.has_gmch_display = 1, \
+	.display.has_gmch = 1, \
+	.gpu_reset_clobbers_display = true, \
 	.ring_mask = RENDER_RING, \
 	.has_snoop = true, \
 	.has_coherent_ggtt = true, \
@@ -228,6 +240,7 @@ static const struct intel_device_info intel_g45_info = {
 	GEN4_FEATURES,
 	PLATFORM(INTEL_G45),
 	.ring_mask = RENDER_RING | BSD_RING,
+	.gpu_reset_clobbers_display = false,
 };
 
 static const struct intel_device_info intel_gm45_info = {
@@ -237,6 +250,7 @@ static const struct intel_device_info intel_gm45_info = {
 	.display.has_fbc = 1,
 	.display.supports_tv = 1,
 	.ring_mask = RENDER_RING | BSD_RING,
+	.gpu_reset_clobbers_display = false,
 };
 
 #define GEN5_FEATURES \
@@ -370,7 +384,7 @@ static const struct intel_device_info intel_valleyview_info = {
 	.num_pipes = 2,
 	.has_runtime_pm = 1,
 	.has_rc6 = 1,
-	.display.has_gmch_display = 1,
+	.display.has_gmch = 1,
 	.display.has_hotplug = 1,
 	.ppgtt = INTEL_PPGTT_FULL,
 	.has_snoop = true,
@@ -462,7 +476,7 @@ static const struct intel_device_info intel_cherryview_info = {
 	.has_runtime_pm = 1,
 	.has_rc6 = 1,
 	.has_logical_ring_contexts = 1,
-	.display.has_gmch_display = 1,
+	.display.has_gmch = 1,
 	.ppgtt = INTEL_PPGTT_FULL,
 	.has_reset_engine = 1,
 	.has_snoop = true,
@@ -532,7 +546,6 @@ static const struct intel_device_info intel_skylake_gt4_info = {
 	.display.has_fbc = 1, \
 	.display.has_psr = 1, \
 	.has_runtime_pm = 1, \
-	.has_pooled_eu = 0, \
 	.display.has_csr = 1, \
 	.has_rc6 = 1, \
 	.display.has_dp_mst = 1, \
@@ -701,6 +714,7 @@ static const struct pci_device_id pciidlist[] = {
 	INTEL_AML_KBL_GT2_IDS(&intel_kabylake_gt2_info),
 	INTEL_CFL_S_GT1_IDS(&intel_coffeelake_gt1_info),
 	INTEL_CFL_S_GT2_IDS(&intel_coffeelake_gt2_info),
+	INTEL_CFL_H_GT1_IDS(&intel_coffeelake_gt1_info),
 	INTEL_CFL_H_GT2_IDS(&intel_coffeelake_gt2_info),
 	INTEL_CFL_U_GT2_IDS(&intel_coffeelake_gt2_info),
 	INTEL_CFL_U_GT3_IDS(&intel_coffeelake_gt3_info),
@@ -787,6 +801,8 @@ static int __init i915_init(void)
 	bool use_kms = true;
 	int err;
 
+	i915_global_active_init();
+
 	err = i915_mock_selftests();
 	if (err)
 		return err > 0 ? 0 : err;
@@ -818,6 +834,7 @@ static void __exit i915_exit(void)
 		return;
 
 	pci_unregister_driver(&i915_pci_driver);
+	i915_global_active_exit();
 }
 
 module_init(i915_init);
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 2b2eb57ca71f..9ebf99f3d8d3 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1365,7 +1365,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
 	free_oa_buffer(dev_priv);
 
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, stream->wakeref);
 
 	if (stream->ctx)
 		oa_put_render_ctx_id(stream);
@@ -1677,6 +1677,11 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx,
 
 		CTX_REG(reg_state, state_offset, flex_regs[i], value);
 	}
+
+	CTX_REG(reg_state, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
+		gen8_make_rpcs(dev_priv,
+			       &to_intel_context(ctx,
+						 dev_priv->engine[RCS])->sseu));
 }
 
 /*
@@ -1796,7 +1801,7 @@ static int gen8_enable_metric_set(struct i915_perf_stream *stream)
 	 * be read back from automatically triggered reports, as part of the
 	 * RPT_ID field.
 	 */
-	if (IS_GEN(dev_priv, 9, 11)) {
+	if (IS_GEN_RANGE(dev_priv, 9, 11)) {
 		I915_WRITE(GEN8_OA_DEBUG,
 			   _MASKED_BIT_ENABLE(GEN9_OA_DEBUG_DISABLE_CLK_RATIO_REPORTS |
 					      GEN9_OA_DEBUG_INCLUDE_CLK_RATIO));
@@ -2087,7 +2092,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	 *   In our case we are expecting that taking pm + FORCEWAKE
 	 *   references will effectively disable RC6.
 	 */
-	intel_runtime_pm_get(dev_priv);
+	stream->wakeref = intel_runtime_pm_get(dev_priv);
 	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
 
 	ret = alloc_oa_buffer(dev_priv);
@@ -2098,21 +2103,21 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	if (ret)
 		goto err_lock;
 
+	stream->ops = &i915_oa_stream_ops;
+	dev_priv->perf.oa.exclusive_stream = stream;
+
 	ret = dev_priv->perf.oa.ops.enable_metric_set(stream);
 	if (ret) {
 		DRM_DEBUG("Unable to enable metric set\n");
 		goto err_enable;
 	}
 
-	stream->ops = &i915_oa_stream_ops;
-
-	dev_priv->perf.oa.exclusive_stream = stream;
-
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
 	return 0;
 
 err_enable:
+	dev_priv->perf.oa.exclusive_stream = NULL;
 	dev_priv->perf.oa.ops.disable_metric_set(dev_priv);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
@@ -2123,7 +2128,7 @@ err_oa_buf_alloc:
 	put_oa_config(dev_priv, stream->oa_config);
 
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, stream->wakeref);
 
 err_config:
 	if (stream->ctx)
@@ -2646,7 +2651,7 @@ err:
 static u64 oa_exponent_to_ns(struct drm_i915_private *dev_priv, int exponent)
 {
 	return div64_u64(1000000000ULL * (2ULL << exponent),
-			 1000ULL * INTEL_INFO(dev_priv)->cs_timestamp_frequency_khz);
+			 1000ULL * RUNTIME_INFO(dev_priv)->cs_timestamp_frequency_khz);
 }
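
The OA exponent selects a power-of-two number of timestamp ticks between reports, so each step doubles the sampling period. A worked example of the conversion above, assuming a 19200 kHz timestamp clock (the real rate is read from RUNTIME_INFO):

#include <stdio.h>

int main(void)
{
	const unsigned long long khz = 19200; /* assumed CS timestamp clock */

	for (int exponent = 0; exponent <= 2; exponent++)
		printf("exponent %d -> %llu ns\n", exponent,
		       1000000000ULL * (2ULL << exponent) / (1000ULL * khz));
	return 0; /* 104 ns, 208 ns, 416 ns */
}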
 
 /**
@@ -3021,7 +3026,7 @@ static bool chv_is_valid_mux_addr(struct drm_i915_private *dev_priv, u32 addr)
 		(addr >= 0x182300 && addr <= 0x1823A4);
 }
 
-static uint32_t mask_reg_value(u32 reg, u32 val)
+static u32 mask_reg_value(u32 reg, u32 val)
 {
 	/* HALF_SLICE_CHICKEN2 is programmed with the
 	 * WaDisableSTUnitPowerOptimization workaround. Make sure the value
@@ -3415,7 +3420,7 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 		dev_priv->perf.oa.ops.read = gen8_oa_read;
 		dev_priv->perf.oa.ops.oa_hw_tail_read = gen8_oa_hw_tail_read;
 
-		if (IS_GEN8(dev_priv) || IS_GEN9(dev_priv)) {
+		if (IS_GEN_RANGE(dev_priv, 8, 9)) {
 			dev_priv->perf.oa.ops.is_valid_b_counter_reg =
 				gen7_is_valid_b_counter_addr;
 			dev_priv->perf.oa.ops.is_valid_mux_reg =
@@ -3431,7 +3436,7 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 			dev_priv->perf.oa.ops.enable_metric_set = gen8_enable_metric_set;
 			dev_priv->perf.oa.ops.disable_metric_set = gen8_disable_metric_set;
 
-			if (IS_GEN8(dev_priv)) {
+			if (IS_GEN(dev_priv, 8)) {
 				dev_priv->perf.oa.ctx_oactxctrl_offset = 0x120;
 				dev_priv->perf.oa.ctx_flexeu0_offset = 0x2ce;
 
@@ -3442,7 +3447,7 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 
 				dev_priv->perf.oa.gen8_valid_ctx_bit = (1<<16);
 			}
-		} else if (IS_GEN(dev_priv, 10, 11)) {
+		} else if (IS_GEN_RANGE(dev_priv, 10, 11)) {
 			dev_priv->perf.oa.ops.is_valid_b_counter_reg =
 				gen7_is_valid_b_counter_addr;
 			dev_priv->perf.oa.ops.is_valid_mux_reg =
@@ -3471,7 +3476,7 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 		spin_lock_init(&dev_priv->perf.oa.oa_buffer.ptr_lock);
 
 		oa_sample_rate_hard_limit = 1000 *
-			(INTEL_INFO(dev_priv)->cs_timestamp_frequency_khz / 2);
+			(RUNTIME_INFO(dev_priv)->cs_timestamp_frequency_khz / 2);
 		dev_priv->perf.sysctl_header = register_sysctl_table(dev_root);
 
 		mutex_init(&dev_priv->perf.metrics_lock);
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index cf7c66bb3ed9..b745c49a5af6 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -168,6 +168,7 @@ engines_sample(struct drm_i915_private *dev_priv, unsigned int period_ns)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
 	bool fw = false;
 
 	if ((dev_priv->pmu.enable & ENGINE_SAMPLE_MASK) == 0)
@@ -176,7 +177,8 @@ engines_sample(struct drm_i915_private *dev_priv, unsigned int period_ns)
 	if (!dev_priv->gt.awake)
 		return;
 
-	if (!intel_runtime_pm_get_if_in_use(dev_priv))
+	wakeref = intel_runtime_pm_get_if_in_use(dev_priv);
+	if (!wakeref)
 		return;
 
 	for_each_engine(engine, dev_priv, id) {
@@ -211,7 +213,7 @@ engines_sample(struct drm_i915_private *dev_priv, unsigned int period_ns)
 	if (fw)
 		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 }
 
 static void
@@ -228,11 +230,12 @@ frequency_sample(struct drm_i915_private *dev_priv, unsigned int period_ns)
 		u32 val;
 
 		val = dev_priv->gt_pm.rps.cur_freq;
-		if (dev_priv->gt.awake &&
-		    intel_runtime_pm_get_if_in_use(dev_priv)) {
-			val = intel_get_cagf(dev_priv,
-					     I915_READ_NOTRACE(GEN6_RPSTAT1));
-			intel_runtime_pm_put(dev_priv);
+		if (dev_priv->gt.awake) {
+			intel_wakeref_t wakeref;
+
+			with_intel_runtime_pm_if_in_use(dev_priv, wakeref)
+				val = intel_get_cagf(dev_priv,
+						     I915_READ_NOTRACE(GEN6_RPSTAT1));
 		}
 
 		add_sample_mult(&dev_priv->pmu.sample[__I915_SAMPLE_FREQ_ACT],
@@ -444,12 +447,14 @@ static u64 __get_rc6(struct drm_i915_private *i915)
 static u64 get_rc6(struct drm_i915_private *i915)
 {
 #if IS_ENABLED(CONFIG_PM)
+	intel_wakeref_t wakeref;
 	unsigned long flags;
 	u64 val;
 
-	if (intel_runtime_pm_get_if_in_use(i915)) {
+	wakeref = intel_runtime_pm_get_if_in_use(i915);
+	if (wakeref) {
 		val = __get_rc6(i915);
-		intel_runtime_pm_put(i915);
+		intel_runtime_pm_put(i915, wakeref);
 
 		/*
 		 * If we are coming back from being runtime suspended we must
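
The wakeref conversions above make every runtime-PM "get" return an
intel_wakeref_t cookie that must be passed back to the matching "put", so a
leaked reference can be attributed to its acquirer when wakeref tracking is
enabled. The scoped form used in frequency_sample() can be built from the
plain get/put pair, roughly:

	/* Roughly how the scoped helper can be expressed; the body runs
	 * only if the device was already awake, and the reference is
	 * dropped when the block exits. */
	#define with_intel_runtime_pm_if_in_use(i915, wf) \
		for ((wf) = intel_runtime_pm_get_if_in_use(i915); (wf); \
		     intel_runtime_pm_put((i915), (wf)), (wf) = 0)
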
diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
index fe56465cdfd6..cbcb957b7141 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -13,7 +13,7 @@
 static int query_topology_info(struct drm_i915_private *dev_priv,
 			       struct drm_i915_query_item *query_item)
 {
-	const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu;
+	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	struct drm_i915_query_topology_info topo;
 	u32 slice_length, subslice_length, eu_length, total_length;
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 067054cf4a86..638a586469f9 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -117,14 +117,14 @@
  */
 
 typedef struct {
-	uint32_t reg;
+	u32 reg;
 } i915_reg_t;
 
 #define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
 
 #define INVALID_MMIO_REG _MMIO(0)
 
-static inline uint32_t i915_mmio_reg_offset(i915_reg_t reg)
+static inline u32 i915_mmio_reg_offset(i915_reg_t reg)
 {
 	return reg.reg;
 }
@@ -139,6 +139,12 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 	return !i915_mmio_reg_equal(reg, INVALID_MMIO_REG);
 }
 
+#define VLV_DISPLAY_BASE		0x180000
+#define VLV_MIPI_BASE			VLV_DISPLAY_BASE
+#define BXT_MIPI_BASE			0x60000
+
+#define DISPLAY_MMIO_BASE(dev_priv)	(INTEL_INFO(dev_priv)->display_mmio_offset)
+
 /*
  * Given the first two numbers __a and __b of arbitrarily many evenly spaced
  * numbers, pick the 0-based __index'th value.
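
A worked example of the helper this comment describes, assuming the usual
definition _PICK_EVEN(__index, __a, __b) == (__a) + (__index) * ((__b) - (__a)):

	/* Evenly spaced registers at stride 4 starting at 0x6014:
	 *   _PICK_EVEN(0, 0x6014, 0x6018) == 0x6014
	 *   _PICK_EVEN(1, 0x6014, 0x6018) == 0x6018
	 *   _PICK_EVEN(2, 0x6014, 0x6018) == 0x601c
	 */
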
@@ -179,15 +185,15 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
  * Device info offset array based helpers for groups of registers with unevenly
  * spaced base offsets.
  */
-#define _MMIO_PIPE2(pipe, reg)		_MMIO(dev_priv->info.pipe_offsets[pipe] - \
-					      dev_priv->info.pipe_offsets[PIPE_A] + (reg) + \
-					      dev_priv->info.display_mmio_offset)
-#define _MMIO_TRANS2(pipe, reg)		_MMIO(dev_priv->info.trans_offsets[(pipe)] - \
-					      dev_priv->info.trans_offsets[TRANSCODER_A] + (reg) + \
-					      dev_priv->info.display_mmio_offset)
-#define _CURSOR2(pipe, reg)		_MMIO(dev_priv->info.cursor_offsets[(pipe)] - \
-					      dev_priv->info.cursor_offsets[PIPE_A] + (reg) + \
-					      dev_priv->info.display_mmio_offset)
+#define _MMIO_PIPE2(pipe, reg)		_MMIO(INTEL_INFO(dev_priv)->pipe_offsets[pipe] - \
+					      INTEL_INFO(dev_priv)->pipe_offsets[PIPE_A] + (reg) + \
+					      DISPLAY_MMIO_BASE(dev_priv))
+#define _MMIO_TRANS2(pipe, reg)		_MMIO(INTEL_INFO(dev_priv)->trans_offsets[(pipe)] - \
+					      INTEL_INFO(dev_priv)->trans_offsets[TRANSCODER_A] + (reg) + \
+					      DISPLAY_MMIO_BASE(dev_priv))
+#define _CURSOR2(pipe, reg)		_MMIO(INTEL_INFO(dev_priv)->cursor_offsets[(pipe)] - \
+					      INTEL_INFO(dev_priv)->cursor_offsets[PIPE_A] + (reg) + \
+					      DISPLAY_MMIO_BASE(dev_priv))
 
 #define __MASKED_FIELD(mask, value) ((mask) << 16 | (value))
 #define _MASKED_FIELD(mask, value) ({					   \
@@ -347,6 +353,24 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define  GEN11_GRDOM_MEDIA4		(1 << 8)
 #define  GEN11_GRDOM_VECS		(1 << 13)
 #define  GEN11_GRDOM_VECS2		(1 << 14)
+#define  GEN11_GRDOM_SFC0		(1 << 17)
+#define  GEN11_GRDOM_SFC1		(1 << 18)
+
+#define  GEN11_VCS_SFC_RESET_BIT(instance)	(GEN11_GRDOM_SFC0 << ((instance) >> 1))
+#define  GEN11_VECS_SFC_RESET_BIT(instance)	(GEN11_GRDOM_SFC0 << (instance))
+
+#define GEN11_VCS_SFC_FORCED_LOCK(engine)	_MMIO((engine)->mmio_base + 0x88C)
+#define   GEN11_VCS_SFC_FORCED_LOCK_BIT		(1 << 0)
+#define GEN11_VCS_SFC_LOCK_STATUS(engine)	_MMIO((engine)->mmio_base + 0x890)
+#define   GEN11_VCS_SFC_USAGE_BIT		(1 << 0)
+#define   GEN11_VCS_SFC_LOCK_ACK_BIT		(1 << 1)
+
+#define GEN11_VECS_SFC_FORCED_LOCK(engine)	_MMIO((engine)->mmio_base + 0x201C)
+#define   GEN11_VECS_SFC_FORCED_LOCK_BIT	(1 << 0)
+#define GEN11_VECS_SFC_LOCK_ACK(engine)		_MMIO((engine)->mmio_base + 0x2018)
+#define   GEN11_VECS_SFC_LOCK_ACK_BIT		(1 << 0)
+#define GEN11_VECS_SFC_USAGE(engine)		_MMIO((engine)->mmio_base + 0x2014)
+#define   GEN11_VECS_SFC_USAGE_BIT		(1 << 0)
 
 #define RING_PP_DIR_BASE(engine)	_MMIO((engine)->mmio_base + 0x228)
 #define RING_PP_DIR_BASE_READ(engine)	_MMIO((engine)->mmio_base + 0x518)
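
On Gen11 each SFC (Scaler and Format Converter) is shared between a pair of
video decode (VCS) engines and the corresponding video enhancement (VECS)
engine, which is why the VCS macro halves the instance while the VECS macro
maps it directly:

	/* Worked examples of the mapping:
	 *   GEN11_VCS_SFC_RESET_BIT(0) == GEN11_VCS_SFC_RESET_BIT(1)
	 *                              == GEN11_GRDOM_SFC0 (1 << 17)
	 *   GEN11_VCS_SFC_RESET_BIT(2) == GEN11_VCS_SFC_RESET_BIT(3)
	 *                              == GEN11_GRDOM_SFC1 (1 << 18)
	 *   GEN11_VECS_SFC_RESET_BIT(1) == GEN11_GRDOM_SFC1
	 */
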
@@ -2596,10 +2620,6 @@ enum i915_power_well_id {
 
 #define   GEN11_GFX_DISABLE_LEGACY_MODE	(1 << 3)
 
-#define VLV_DISPLAY_BASE 0x180000
-#define VLV_MIPI_BASE VLV_DISPLAY_BASE
-#define BXT_MIPI_BASE 0x60000
-
 #define VLV_GU_CTL0	_MMIO(VLV_DISPLAY_BASE + 0x2030)
 #define VLV_GU_CTL1	_MMIO(VLV_DISPLAY_BASE + 0x2034)
 #define SCPD0		_MMIO(0x209c) /* 915+ only */
@@ -2781,6 +2801,9 @@ enum i915_power_well_id {
 #define GEN6_RCS_PWR_FSM _MMIO(0x22ac)
 #define GEN9_RCS_FE_FSM2 _MMIO(0x22a4)
 
+#define GEN10_CACHE_MODE_SS			_MMIO(0xe420)
+#define   FLOAT_BLEND_OPTIMIZATION_ENABLE	(1 << 4)
+
 /* Fuse readout registers for GT */
 #define HSW_PAVP_FUSE1			_MMIO(0x911C)
 #define   HSW_F1_EU_DIS_SHIFT		16
@@ -3156,9 +3179,9 @@ enum i915_power_well_id {
 /*
  * Clock control & power management
  */
-#define _DPLL_A (dev_priv->info.display_mmio_offset + 0x6014)
-#define _DPLL_B (dev_priv->info.display_mmio_offset + 0x6018)
-#define _CHV_DPLL_C (dev_priv->info.display_mmio_offset + 0x6030)
+#define _DPLL_A (DISPLAY_MMIO_BASE(dev_priv) + 0x6014)
+#define _DPLL_B (DISPLAY_MMIO_BASE(dev_priv) + 0x6018)
+#define _CHV_DPLL_C (DISPLAY_MMIO_BASE(dev_priv) + 0x6030)
 #define DPLL(pipe) _MMIO_PIPE3((pipe), _DPLL_A, _DPLL_B, _CHV_DPLL_C)
 
 #define VGA0	_MMIO(0x6000)
@@ -3255,9 +3278,9 @@ enum i915_power_well_id {
 #define   SDVO_MULTIPLIER_SHIFT_HIRES		4
 #define   SDVO_MULTIPLIER_SHIFT_VGA		0
 
-#define _DPLL_A_MD (dev_priv->info.display_mmio_offset + 0x601c)
-#define _DPLL_B_MD (dev_priv->info.display_mmio_offset + 0x6020)
-#define _CHV_DPLL_C_MD (dev_priv->info.display_mmio_offset + 0x603c)
+#define _DPLL_A_MD (DISPLAY_MMIO_BASE(dev_priv) + 0x601c)
+#define _DPLL_B_MD (DISPLAY_MMIO_BASE(dev_priv) + 0x6020)
+#define _CHV_DPLL_C_MD (DISPLAY_MMIO_BASE(dev_priv) + 0x603c)
 #define DPLL_MD(pipe) _MMIO_PIPE3((pipe), _DPLL_A_MD, _DPLL_B_MD, _CHV_DPLL_C_MD)
 
 /*
@@ -3329,7 +3352,7 @@ enum i915_power_well_id {
 #define  DSTATE_PLL_D3_OFF			(1 << 3)
 #define  DSTATE_GFX_CLOCK_GATING		(1 << 1)
 #define  DSTATE_DOT_CLOCK_GATING		(1 << 0)
-#define DSPCLK_GATE_D	_MMIO(dev_priv->info.display_mmio_offset + 0x6200)
+#define DSPCLK_GATE_D	_MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x6200)
 # define DPUNIT_B_CLOCK_GATE_DISABLE		(1 << 30) /* 965 */
 # define VSUNIT_CLOCK_GATE_DISABLE		(1 << 29) /* 965 */
 # define VRHUNIT_CLOCK_GATE_DISABLE		(1 << 28) /* 965 */
@@ -3469,7 +3492,7 @@ enum i915_power_well_id {
 #define _PALETTE_A		0xa000
 #define _PALETTE_B		0xa800
 #define _CHV_PALETTE_C		0xc000
-#define PALETTE(pipe, i)	_MMIO(dev_priv->info.display_mmio_offset + \
+#define PALETTE(pipe, i)	_MMIO(DISPLAY_MMIO_BASE(dev_priv) + \
 				      _PICK((pipe), _PALETTE_A,		\
 					    _PALETTE_B, _CHV_PALETTE_C) + \
 				      (i) * 4)
@@ -4252,6 +4275,15 @@ enum {
 #define EDP_PSR2_STATUS_STATE_MASK     (0xf << 28)
 #define EDP_PSR2_STATUS_STATE_SHIFT    28
 
+#define _PSR2_SU_STATUS_0		0x6F914
+#define _PSR2_SU_STATUS_1		0x6F918
+#define _PSR2_SU_STATUS_2		0x6F91C
+#define _PSR2_SU_STATUS(index)		_MMIO(_PICK_EVEN((index), _PSR2_SU_STATUS_0, _PSR2_SU_STATUS_1))
+#define PSR2_SU_STATUS(frame)		(_PSR2_SU_STATUS((frame) / 3))
+#define PSR2_SU_STATUS_SHIFT(frame)	(((frame) % 3) * 10)
+#define PSR2_SU_STATUS_MASK(frame)	(0x3ff << PSR2_SU_STATUS_SHIFT(frame))
+#define PSR2_SU_STATUS_FRAMES		8
+
 /* VGA port control */
 #define ADPA			_MMIO(0x61100)
 #define PCH_ADPA                _MMIO(0xe1100)
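
The new PSR2_SU_STATUS bookkeeping packs three 10-bit selective-update block
counts into each 32-bit register: frame / 3 picks one of the three registers
via _PICK_EVEN and (frame % 3) * 10 the field within it. A hypothetical read
helper to illustrate the arithmetic:

	/* Hypothetical helper; frame in [0, PSR2_SU_STATUS_FRAMES).
	 * E.g. frame 4 reads _PSR2_SU_STATUS_1, bits 19:10. */
	static u32 psr2_su_blocks(struct drm_i915_private *dev_priv, int frame)
	{
		u32 val = I915_READ(PSR2_SU_STATUS(frame));

		return (val & PSR2_SU_STATUS_MASK(frame)) >>
			PSR2_SU_STATUS_SHIFT(frame);
	}
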
@@ -4302,7 +4334,7 @@ enum {
 
 
 /* Hotplug control (945+ only) */
-#define PORT_HOTPLUG_EN		_MMIO(dev_priv->info.display_mmio_offset + 0x61110)
+#define PORT_HOTPLUG_EN		_MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x61110)
 #define   PORTB_HOTPLUG_INT_EN			(1 << 29)
 #define   PORTC_HOTPLUG_INT_EN			(1 << 28)
 #define   PORTD_HOTPLUG_INT_EN			(1 << 27)
@@ -4332,7 +4364,7 @@ enum {
 #define CRT_HOTPLUG_DETECT_VOLTAGE_325MV	(0 << 2)
 #define CRT_HOTPLUG_DETECT_VOLTAGE_475MV	(1 << 2)
 
-#define PORT_HOTPLUG_STAT	_MMIO(dev_priv->info.display_mmio_offset + 0x61114)
+#define PORT_HOTPLUG_STAT	_MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x61114)
 /*
  * HDMI/DP bits are g4x+
  *
@@ -4414,7 +4446,7 @@ enum {
 
 #define PORT_DFT_I9XX				_MMIO(0x61150)
 #define   DC_BALANCE_RESET			(1 << 25)
-#define PORT_DFT2_G4X		_MMIO(dev_priv->info.display_mmio_offset + 0x61154)
+#define PORT_DFT2_G4X		_MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x61154)
 #define   DC_BALANCE_RESET_VLV			(1 << 31)
 #define   PIPE_SCRAMBLE_RESET_MASK		((1 << 14) | (0x3 << 0))
 #define   PIPE_C_SCRAMBLE_RESET			(1 << 14) /* chv */
@@ -4667,7 +4699,6 @@ enum {
 #define  EDP_FORCE_VDD			(1 << 3)
 #define  EDP_BLC_ENABLE			(1 << 2)
 #define  PANEL_POWER_RESET		(1 << 1)
-#define  PANEL_POWER_OFF		(0 << 0)
 #define  PANEL_POWER_ON			(1 << 0)
 
 #define _PP_ON_DELAYS			0x61208
@@ -4699,7 +4730,7 @@ enum {
 #define  PANEL_POWER_CYCLE_DELAY_SHIFT	0
 
 /* Panel fitting */
-#define PFIT_CONTROL	_MMIO(dev_priv->info.display_mmio_offset + 0x61230)
+#define PFIT_CONTROL	_MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x61230)
 #define   PFIT_ENABLE		(1 << 31)
 #define   PFIT_PIPE_MASK	(3 << 29)
 #define   PFIT_PIPE_SHIFT	29
@@ -4717,7 +4748,7 @@ enum {
 #define   PFIT_SCALING_PROGRAMMED (1 << 26)
 #define   PFIT_SCALING_PILLAR	(2 << 26)
 #define   PFIT_SCALING_LETTER	(3 << 26)
-#define PFIT_PGM_RATIOS _MMIO(dev_priv->info.display_mmio_offset + 0x61234)
+#define PFIT_PGM_RATIOS _MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x61234)
 /* Pre-965 */
 #define		PFIT_VERT_SCALE_SHIFT		20
 #define		PFIT_VERT_SCALE_MASK		0xfff00000
@@ -4729,25 +4760,25 @@ enum {
 #define		PFIT_HORIZ_SCALE_SHIFT_965	0
 #define		PFIT_HORIZ_SCALE_MASK_965	0x00001fff
 
-#define PFIT_AUTO_RATIOS _MMIO(dev_priv->info.display_mmio_offset + 0x61238)
+#define PFIT_AUTO_RATIOS _MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x61238)
 
-#define _VLV_BLC_PWM_CTL2_A (dev_priv->info.display_mmio_offset + 0x61250)
-#define _VLV_BLC_PWM_CTL2_B (dev_priv->info.display_mmio_offset + 0x61350)
+#define _VLV_BLC_PWM_CTL2_A (DISPLAY_MMIO_BASE(dev_priv) + 0x61250)
+#define _VLV_BLC_PWM_CTL2_B (DISPLAY_MMIO_BASE(dev_priv) + 0x61350)
 #define VLV_BLC_PWM_CTL2(pipe) _MMIO_PIPE(pipe, _VLV_BLC_PWM_CTL2_A, \
 					 _VLV_BLC_PWM_CTL2_B)
 
-#define _VLV_BLC_PWM_CTL_A (dev_priv->info.display_mmio_offset + 0x61254)
-#define _VLV_BLC_PWM_CTL_B (dev_priv->info.display_mmio_offset + 0x61354)
+#define _VLV_BLC_PWM_CTL_A (DISPLAY_MMIO_BASE(dev_priv) + 0x61254)
+#define _VLV_BLC_PWM_CTL_B (DISPLAY_MMIO_BASE(dev_priv) + 0x61354)
 #define VLV_BLC_PWM_CTL(pipe) _MMIO_PIPE(pipe, _VLV_BLC_PWM_CTL_A, \
 					_VLV_BLC_PWM_CTL_B)
 
-#define _VLV_BLC_HIST_CTL_A (dev_priv->info.display_mmio_offset + 0x61260)
-#define _VLV_BLC_HIST_CTL_B (dev_priv->info.display_mmio_offset + 0x61360)
+#define _VLV_BLC_HIST_CTL_A (DISPLAY_MMIO_BASE(dev_priv) + 0x61260)
+#define _VLV_BLC_HIST_CTL_B (DISPLAY_MMIO_BASE(dev_priv) + 0x61360)
 #define VLV_BLC_HIST_CTL(pipe) _MMIO_PIPE(pipe, _VLV_BLC_HIST_CTL_A, \
 					 _VLV_BLC_HIST_CTL_B)
 
 /* Backlight control */
-#define BLC_PWM_CTL2	_MMIO(dev_priv->info.display_mmio_offset + 0x61250) /* 965+ only */
+#define BLC_PWM_CTL2	_MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x61250) /* 965+ only */
 #define   BLM_PWM_ENABLE		(1 << 31)
 #define   BLM_COMBINATION_MODE		(1 << 30) /* gen4 only */
 #define   BLM_PIPE_SELECT		(1 << 29)
@@ -4770,7 +4801,7 @@ enum {
 #define   BLM_PHASE_IN_COUNT_MASK	(0xff << 8)
 #define   BLM_PHASE_IN_INCR_SHIFT	(0)
 #define   BLM_PHASE_IN_INCR_MASK	(0xff << 0)
-#define BLC_PWM_CTL	_MMIO(dev_priv->info.display_mmio_offset + 0x61254)
+#define BLC_PWM_CTL	_MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x61254)
 /*
  * These are the most significant 15 bits of the number of backlight cycles in a
  * complete cycle of the modulated backlight control.
@@ -4792,7 +4823,7 @@ enum {
 #define   BACKLIGHT_DUTY_CYCLE_MASK_PNV		(0xfffe)
 #define   BLM_POLARITY_PNV			(1 << 0) /* pnv only */
 
-#define BLC_HIST_CTL	_MMIO(dev_priv->info.display_mmio_offset + 0x61260)
+#define BLC_HIST_CTL	_MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x61260)
 #define  BLM_HISTOGRAM_ENABLE			(1 << 31)
 
 /* New registers for PCH-split platforms. Safe where new bits show up, the
@@ -4867,6 +4898,7 @@ enum {
 # define TV_OVERSAMPLE_NONE		(2 << 18)
 /* Selects 8x oversampling */
 # define TV_OVERSAMPLE_8X		(3 << 18)
+# define TV_OVERSAMPLE_MASK		(3 << 18)
 /* Selects progressive mode rather than interlaced */
 # define TV_PROGRESSIVE			(1 << 17)
 /* Sets the colorburst to PAL mode.  Required for non-M PAL modes. */
@@ -5416,47 +5448,47 @@ enum {
  * is 20 bytes in each direction, hence the 5 fixed
  * data registers
  */
-#define _DPA_AUX_CH_CTL		(dev_priv->info.display_mmio_offset + 0x64010)
-#define _DPA_AUX_CH_DATA1	(dev_priv->info.display_mmio_offset + 0x64014)
-#define _DPA_AUX_CH_DATA2	(dev_priv->info.display_mmio_offset + 0x64018)
-#define _DPA_AUX_CH_DATA3	(dev_priv->info.display_mmio_offset + 0x6401c)
-#define _DPA_AUX_CH_DATA4	(dev_priv->info.display_mmio_offset + 0x64020)
-#define _DPA_AUX_CH_DATA5	(dev_priv->info.display_mmio_offset + 0x64024)
-
-#define _DPB_AUX_CH_CTL		(dev_priv->info.display_mmio_offset + 0x64110)
-#define _DPB_AUX_CH_DATA1	(dev_priv->info.display_mmio_offset + 0x64114)
-#define _DPB_AUX_CH_DATA2	(dev_priv->info.display_mmio_offset + 0x64118)
-#define _DPB_AUX_CH_DATA3	(dev_priv->info.display_mmio_offset + 0x6411c)
-#define _DPB_AUX_CH_DATA4	(dev_priv->info.display_mmio_offset + 0x64120)
-#define _DPB_AUX_CH_DATA5	(dev_priv->info.display_mmio_offset + 0x64124)
-
-#define _DPC_AUX_CH_CTL		(dev_priv->info.display_mmio_offset + 0x64210)
-#define _DPC_AUX_CH_DATA1	(dev_priv->info.display_mmio_offset + 0x64214)
-#define _DPC_AUX_CH_DATA2	(dev_priv->info.display_mmio_offset + 0x64218)
-#define _DPC_AUX_CH_DATA3	(dev_priv->info.display_mmio_offset + 0x6421c)
-#define _DPC_AUX_CH_DATA4	(dev_priv->info.display_mmio_offset + 0x64220)
-#define _DPC_AUX_CH_DATA5	(dev_priv->info.display_mmio_offset + 0x64224)
-
-#define _DPD_AUX_CH_CTL		(dev_priv->info.display_mmio_offset + 0x64310)
-#define _DPD_AUX_CH_DATA1	(dev_priv->info.display_mmio_offset + 0x64314)
-#define _DPD_AUX_CH_DATA2	(dev_priv->info.display_mmio_offset + 0x64318)
-#define _DPD_AUX_CH_DATA3	(dev_priv->info.display_mmio_offset + 0x6431c)
-#define _DPD_AUX_CH_DATA4	(dev_priv->info.display_mmio_offset + 0x64320)
-#define _DPD_AUX_CH_DATA5	(dev_priv->info.display_mmio_offset + 0x64324)
-
-#define _DPE_AUX_CH_CTL		(dev_priv->info.display_mmio_offset + 0x64410)
-#define _DPE_AUX_CH_DATA1	(dev_priv->info.display_mmio_offset + 0x64414)
-#define _DPE_AUX_CH_DATA2	(dev_priv->info.display_mmio_offset + 0x64418)
-#define _DPE_AUX_CH_DATA3	(dev_priv->info.display_mmio_offset + 0x6441c)
-#define _DPE_AUX_CH_DATA4	(dev_priv->info.display_mmio_offset + 0x64420)
-#define _DPE_AUX_CH_DATA5	(dev_priv->info.display_mmio_offset + 0x64424)
-
-#define _DPF_AUX_CH_CTL		(dev_priv->info.display_mmio_offset + 0x64510)
-#define _DPF_AUX_CH_DATA1	(dev_priv->info.display_mmio_offset + 0x64514)
-#define _DPF_AUX_CH_DATA2	(dev_priv->info.display_mmio_offset + 0x64518)
-#define _DPF_AUX_CH_DATA3	(dev_priv->info.display_mmio_offset + 0x6451c)
-#define _DPF_AUX_CH_DATA4	(dev_priv->info.display_mmio_offset + 0x64520)
-#define _DPF_AUX_CH_DATA5	(dev_priv->info.display_mmio_offset + 0x64524)
+#define _DPA_AUX_CH_CTL		(DISPLAY_MMIO_BASE(dev_priv) + 0x64010)
+#define _DPA_AUX_CH_DATA1	(DISPLAY_MMIO_BASE(dev_priv) + 0x64014)
+#define _DPA_AUX_CH_DATA2	(DISPLAY_MMIO_BASE(dev_priv) + 0x64018)
+#define _DPA_AUX_CH_DATA3	(DISPLAY_MMIO_BASE(dev_priv) + 0x6401c)
+#define _DPA_AUX_CH_DATA4	(DISPLAY_MMIO_BASE(dev_priv) + 0x64020)
+#define _DPA_AUX_CH_DATA5	(DISPLAY_MMIO_BASE(dev_priv) + 0x64024)
+
+#define _DPB_AUX_CH_CTL		(DISPLAY_MMIO_BASE(dev_priv) + 0x64110)
+#define _DPB_AUX_CH_DATA1	(DISPLAY_MMIO_BASE(dev_priv) + 0x64114)
+#define _DPB_AUX_CH_DATA2	(DISPLAY_MMIO_BASE(dev_priv) + 0x64118)
+#define _DPB_AUX_CH_DATA3	(DISPLAY_MMIO_BASE(dev_priv) + 0x6411c)
+#define _DPB_AUX_CH_DATA4	(DISPLAY_MMIO_BASE(dev_priv) + 0x64120)
+#define _DPB_AUX_CH_DATA5	(DISPLAY_MMIO_BASE(dev_priv) + 0x64124)
+
+#define _DPC_AUX_CH_CTL		(DISPLAY_MMIO_BASE(dev_priv) + 0x64210)
+#define _DPC_AUX_CH_DATA1	(DISPLAY_MMIO_BASE(dev_priv) + 0x64214)
+#define _DPC_AUX_CH_DATA2	(DISPLAY_MMIO_BASE(dev_priv) + 0x64218)
+#define _DPC_AUX_CH_DATA3	(DISPLAY_MMIO_BASE(dev_priv) + 0x6421c)
+#define _DPC_AUX_CH_DATA4	(DISPLAY_MMIO_BASE(dev_priv) + 0x64220)
+#define _DPC_AUX_CH_DATA5	(DISPLAY_MMIO_BASE(dev_priv) + 0x64224)
+
+#define _DPD_AUX_CH_CTL		(DISPLAY_MMIO_BASE(dev_priv) + 0x64310)
+#define _DPD_AUX_CH_DATA1	(DISPLAY_MMIO_BASE(dev_priv) + 0x64314)
+#define _DPD_AUX_CH_DATA2	(DISPLAY_MMIO_BASE(dev_priv) + 0x64318)
+#define _DPD_AUX_CH_DATA3	(DISPLAY_MMIO_BASE(dev_priv) + 0x6431c)
+#define _DPD_AUX_CH_DATA4	(DISPLAY_MMIO_BASE(dev_priv) + 0x64320)
+#define _DPD_AUX_CH_DATA5	(DISPLAY_MMIO_BASE(dev_priv) + 0x64324)
+
+#define _DPE_AUX_CH_CTL		(DISPLAY_MMIO_BASE(dev_priv) + 0x64410)
+#define _DPE_AUX_CH_DATA1	(DISPLAY_MMIO_BASE(dev_priv) + 0x64414)
+#define _DPE_AUX_CH_DATA2	(DISPLAY_MMIO_BASE(dev_priv) + 0x64418)
+#define _DPE_AUX_CH_DATA3	(DISPLAY_MMIO_BASE(dev_priv) + 0x6441c)
+#define _DPE_AUX_CH_DATA4	(DISPLAY_MMIO_BASE(dev_priv) + 0x64420)
+#define _DPE_AUX_CH_DATA5	(DISPLAY_MMIO_BASE(dev_priv) + 0x64424)
+
+#define _DPF_AUX_CH_CTL		(DISPLAY_MMIO_BASE(dev_priv) + 0x64510)
+#define _DPF_AUX_CH_DATA1	(DISPLAY_MMIO_BASE(dev_priv) + 0x64514)
+#define _DPF_AUX_CH_DATA2	(DISPLAY_MMIO_BASE(dev_priv) + 0x64518)
+#define _DPF_AUX_CH_DATA3	(DISPLAY_MMIO_BASE(dev_priv) + 0x6451c)
+#define _DPF_AUX_CH_DATA4	(DISPLAY_MMIO_BASE(dev_priv) + 0x64520)
+#define _DPF_AUX_CH_DATA5	(DISPLAY_MMIO_BASE(dev_priv) + 0x64524)
 
 #define DP_AUX_CH_CTL(aux_ch)	_MMIO_PORT(aux_ch, _DPA_AUX_CH_CTL, _DPB_AUX_CH_CTL)
 #define DP_AUX_CH_DATA(aux_ch, i)	_MMIO(_PORT(aux_ch, _DPA_AUX_CH_DATA1, _DPB_AUX_CH_DATA1) + (i) * 4) /* 5 registers */
@@ -5681,6 +5713,12 @@ enum {
 #define   PIPEMISC_DITHER_TYPE_SP	(0 << 2)
 #define PIPEMISC(pipe)			_MMIO_PIPE2(pipe, _PIPE_MISC_A)
 
+/* Skylake+ pipe bottom (background) color */
+#define _SKL_BOTTOM_COLOR_A		0x70034
+#define   SKL_BOTTOM_COLOR_GAMMA_ENABLE	(1 << 31)
+#define   SKL_BOTTOM_COLOR_CSC_ENABLE	(1 << 30)
+#define SKL_BOTTOM_COLOR(pipe)		_MMIO_PIPE2(pipe, _SKL_BOTTOM_COLOR_A)
+
 #define VLV_DPFLIPSTAT				_MMIO(VLV_DISPLAY_BASE + 0x70028)
 #define   PIPEB_LINE_COMPARE_INT_EN		(1 << 29)
 #define   PIPEB_HLINE_INT_EN			(1 << 28)
@@ -5732,7 +5770,7 @@ enum {
 #define   DPINVGTT_STATUS_MASK			0xff
 #define   DPINVGTT_STATUS_MASK_CHV		0xfff
 
-#define DSPARB			_MMIO(dev_priv->info.display_mmio_offset + 0x70030)
+#define DSPARB			_MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x70030)
 #define   DSPARB_CSTART_MASK	(0x7f << 7)
 #define   DSPARB_CSTART_SHIFT	7
 #define   DSPARB_BSTART_MASK	(0x7f)
@@ -5767,7 +5805,7 @@ enum {
 #define   DSPARB_SPRITEF_MASK_VLV	(0xff << 8)
 
 /* pnv/gen4/g4x/vlv/chv */
-#define DSPFW1		_MMIO(dev_priv->info.display_mmio_offset + 0x70034)
+#define DSPFW1		_MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x70034)
 #define   DSPFW_SR_SHIFT		23
 #define   DSPFW_SR_MASK			(0x1ff << 23)
 #define   DSPFW_CURSORB_SHIFT		16
@@ -5778,7 +5816,7 @@ enum {
 #define   DSPFW_PLANEA_SHIFT		0
 #define   DSPFW_PLANEA_MASK		(0x7f << 0)
 #define   DSPFW_PLANEA_MASK_VLV		(0xff << 0) /* vlv/chv */
-#define DSPFW2		_MMIO(dev_priv->info.display_mmio_offset + 0x70038)
+#define DSPFW2		_MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x70038)
 #define   DSPFW_FBC_SR_EN		(1 << 31)	  /* g4x */
 #define   DSPFW_FBC_SR_SHIFT		28
 #define   DSPFW_FBC_SR_MASK		(0x7 << 28) /* g4x */
@@ -5794,7 +5832,7 @@ enum {
 #define   DSPFW_SPRITEA_SHIFT		0
 #define   DSPFW_SPRITEA_MASK		(0x7f << 0) /* g4x */
 #define   DSPFW_SPRITEA_MASK_VLV	(0xff << 0) /* vlv/chv */
-#define DSPFW3		_MMIO(dev_priv->info.display_mmio_offset + 0x7003c)
+#define DSPFW3		_MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x7003c)
 #define   DSPFW_HPLL_SR_EN		(1 << 31)
 #define   PINEVIEW_SELF_REFRESH_EN	(1 << 30)
 #define   DSPFW_CURSOR_SR_SHIFT		24
@@ -5962,7 +6000,7 @@ enum {
 #define   PLANE_WM_EN		(1 << 31)
 #define   PLANE_WM_LINES_SHIFT	14
 #define   PLANE_WM_LINES_MASK	0x1f
-#define   PLANE_WM_BLOCKS_MASK	0x3ff
+#define   PLANE_WM_BLOCKS_MASK	0x7ff /* skl+: 10 bits, icl+: 11 bits */
 
 #define _CUR_WM_0(pipe) _PIPE(pipe, _CUR_WM_A_0, _CUR_WM_B_0)
 #define CUR_WM(pipe, level) _MMIO(_CUR_WM_0(pipe) + ((4) * (level)))
@@ -6210,35 +6248,35 @@ enum {
  * [10:1f] all
  * [30:32] all
  */
-#define SWF0(i)	_MMIO(dev_priv->info.display_mmio_offset + 0x70410 + (i) * 4)
-#define SWF1(i)	_MMIO(dev_priv->info.display_mmio_offset + 0x71410 + (i) * 4)
-#define SWF3(i)	_MMIO(dev_priv->info.display_mmio_offset + 0x72414 + (i) * 4)
+#define SWF0(i)	_MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x70410 + (i) * 4)
+#define SWF1(i)	_MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x71410 + (i) * 4)
+#define SWF3(i)	_MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x72414 + (i) * 4)
 #define SWF_ILK(i)	_MMIO(0x4F000 + (i) * 4)
 
 /* Pipe B */
-#define _PIPEBDSL		(dev_priv->info.display_mmio_offset + 0x71000)
-#define _PIPEBCONF		(dev_priv->info.display_mmio_offset + 0x71008)
-#define _PIPEBSTAT		(dev_priv->info.display_mmio_offset + 0x71024)
+#define _PIPEBDSL		(DISPLAY_MMIO_BASE(dev_priv) + 0x71000)
+#define _PIPEBCONF		(DISPLAY_MMIO_BASE(dev_priv) + 0x71008)
+#define _PIPEBSTAT		(DISPLAY_MMIO_BASE(dev_priv) + 0x71024)
 #define _PIPEBFRAMEHIGH		0x71040
 #define _PIPEBFRAMEPIXEL	0x71044
-#define _PIPEB_FRMCOUNT_G4X	(dev_priv->info.display_mmio_offset + 0x71040)
-#define _PIPEB_FLIPCOUNT_G4X	(dev_priv->info.display_mmio_offset + 0x71044)
+#define _PIPEB_FRMCOUNT_G4X	(DISPLAY_MMIO_BASE(dev_priv) + 0x71040)
+#define _PIPEB_FLIPCOUNT_G4X	(DISPLAY_MMIO_BASE(dev_priv) + 0x71044)
 
 
 /* Display B control */
-#define _DSPBCNTR		(dev_priv->info.display_mmio_offset + 0x71180)
+#define _DSPBCNTR		(DISPLAY_MMIO_BASE(dev_priv) + 0x71180)
 #define   DISPPLANE_ALPHA_TRANS_ENABLE		(1 << 15)
 #define   DISPPLANE_ALPHA_TRANS_DISABLE		0
 #define   DISPPLANE_SPRITE_ABOVE_DISPLAY	0
 #define   DISPPLANE_SPRITE_ABOVE_OVERLAY	(1)
-#define _DSPBADDR		(dev_priv->info.display_mmio_offset + 0x71184)
-#define _DSPBSTRIDE		(dev_priv->info.display_mmio_offset + 0x71188)
-#define _DSPBPOS		(dev_priv->info.display_mmio_offset + 0x7118C)
-#define _DSPBSIZE		(dev_priv->info.display_mmio_offset + 0x71190)
-#define _DSPBSURF		(dev_priv->info.display_mmio_offset + 0x7119C)
-#define _DSPBTILEOFF		(dev_priv->info.display_mmio_offset + 0x711A4)
-#define _DSPBOFFSET		(dev_priv->info.display_mmio_offset + 0x711A4)
-#define _DSPBSURFLIVE		(dev_priv->info.display_mmio_offset + 0x711AC)
+#define _DSPBADDR		(DISPLAY_MMIO_BASE(dev_priv) + 0x71184)
+#define _DSPBSTRIDE		(DISPLAY_MMIO_BASE(dev_priv) + 0x71188)
+#define _DSPBPOS		(DISPLAY_MMIO_BASE(dev_priv) + 0x7118C)
+#define _DSPBSIZE		(DISPLAY_MMIO_BASE(dev_priv) + 0x71190)
+#define _DSPBSURF		(DISPLAY_MMIO_BASE(dev_priv) + 0x7119C)
+#define _DSPBTILEOFF		(DISPLAY_MMIO_BASE(dev_priv) + 0x711A4)
+#define _DSPBOFFSET		(DISPLAY_MMIO_BASE(dev_priv) + 0x711A4)
+#define _DSPBSURFLIVE		(DISPLAY_MMIO_BASE(dev_priv) + 0x711AC)
 
 /* ICL DSI 0 and 1 */
 #define _PIPEDSI0CONF		0x7b008
@@ -6746,8 +6784,7 @@ enum {
 
 #define _PLANE_BUF_CFG_1_B			0x7127c
 #define _PLANE_BUF_CFG_2_B			0x7137c
-#define  SKL_DDB_ENTRY_MASK			0x3FF
-#define  ICL_DDB_ENTRY_MASK			0x7FF
+#define  DDB_ENTRY_MASK				0x7FF /* skl+: 10 bits, icl+: 11 bits */
 #define  DDB_ENTRY_END_SHIFT			16
 #define _PLANE_BUF_CFG_1(pipe)	\
 	_PIPE(pipe, _PLANE_BUF_CFG_1_A, _PLANE_BUF_CFG_1_B)
@@ -7580,6 +7617,7 @@ enum {
 #define _PIPEB_CHICKEN			0x71038
 #define _PIPEC_CHICKEN			0x72038
 #define  PER_PIXEL_ALPHA_BYPASS_EN	(1 << 7)
+#define  PM_FILL_MAINTAIN_DBUF_FULLNESS	(1 << 0)
 #define PIPE_CHICKEN(pipe)		_MMIO_PIPE(pipe, _PIPEA_CHICKEN,\
 						   _PIPEB_CHICKEN)
 
@@ -8790,7 +8828,7 @@ enum {
 #define   GEN9_ENABLE_GPGPU_PREEMPTION	(1 << 2)
 
 /* Audio */
-#define G4X_AUD_VID_DID			_MMIO(dev_priv->info.display_mmio_offset + 0x62020)
+#define G4X_AUD_VID_DID			_MMIO(DISPLAY_MMIO_BASE(dev_priv) + 0x62020)
 #define   INTEL_AUDIO_DEVCL		0x808629FB
 #define   INTEL_AUDIO_DEVBLC		0x80862801
 #define   INTEL_AUDIO_DEVCTG		0x80862802
@@ -9525,7 +9563,7 @@ enum skl_power_gate {
 #define _MG_PLL3_ENABLE		0x46038
 #define _MG_PLL4_ENABLE		0x4603C
 /* Bits are the same as DPLL0_ENABLE */
-#define MG_PLL_ENABLE(port)	_MMIO_PORT((port) - PORT_C, _MG_PLL1_ENABLE, \
+#define MG_PLL_ENABLE(tc_port)	_MMIO_PORT((tc_port), _MG_PLL1_ENABLE, \
 					   _MG_PLL2_ENABLE)
 
 #define _MG_REFCLKIN_CTL_PORT1				0x16892C
@@ -9534,9 +9572,9 @@ enum skl_power_gate {
 #define _MG_REFCLKIN_CTL_PORT4				0x16B92C
 #define   MG_REFCLKIN_CTL_OD_2_MUX(x)			((x) << 8)
 #define   MG_REFCLKIN_CTL_OD_2_MUX_MASK			(0x7 << 8)
-#define MG_REFCLKIN_CTL(port) _MMIO_PORT((port) - PORT_C, \
-					 _MG_REFCLKIN_CTL_PORT1, \
-					 _MG_REFCLKIN_CTL_PORT2)
+#define MG_REFCLKIN_CTL(tc_port) _MMIO_PORT((tc_port), \
+					    _MG_REFCLKIN_CTL_PORT1, \
+					    _MG_REFCLKIN_CTL_PORT2)
 
 #define _MG_CLKTOP2_CORECLKCTL1_PORT1			0x1688D8
 #define _MG_CLKTOP2_CORECLKCTL1_PORT2			0x1698D8
@@ -9546,9 +9584,9 @@ enum skl_power_gate {
 #define   MG_CLKTOP2_CORECLKCTL1_B_DIVRATIO_MASK	(0xff << 16)
 #define   MG_CLKTOP2_CORECLKCTL1_A_DIVRATIO(x)		((x) << 8)
 #define   MG_CLKTOP2_CORECLKCTL1_A_DIVRATIO_MASK	(0xff << 8)
-#define MG_CLKTOP2_CORECLKCTL1(port) _MMIO_PORT((port) - PORT_C, \
-						_MG_CLKTOP2_CORECLKCTL1_PORT1, \
-						_MG_CLKTOP2_CORECLKCTL1_PORT2)
+#define MG_CLKTOP2_CORECLKCTL1(tc_port) _MMIO_PORT((tc_port), \
+						   _MG_CLKTOP2_CORECLKCTL1_PORT1, \
+						   _MG_CLKTOP2_CORECLKCTL1_PORT2)
 
 #define _MG_CLKTOP2_HSCLKCTL_PORT1			0x1688D4
 #define _MG_CLKTOP2_HSCLKCTL_PORT2			0x1698D4
@@ -9566,9 +9604,9 @@ enum skl_power_gate {
 #define   MG_CLKTOP2_HSCLKCTL_DSDIV_RATIO(x)		((x) << 8)
 #define   MG_CLKTOP2_HSCLKCTL_DSDIV_RATIO_SHIFT		8
 #define   MG_CLKTOP2_HSCLKCTL_DSDIV_RATIO_MASK		(0xf << 8)
-#define MG_CLKTOP2_HSCLKCTL(port) _MMIO_PORT((port) - PORT_C, \
-					     _MG_CLKTOP2_HSCLKCTL_PORT1, \
-					     _MG_CLKTOP2_HSCLKCTL_PORT2)
+#define MG_CLKTOP2_HSCLKCTL(tc_port) _MMIO_PORT((tc_port), \
+						_MG_CLKTOP2_HSCLKCTL_PORT1, \
+						_MG_CLKTOP2_HSCLKCTL_PORT2)
 
 #define _MG_PLL_DIV0_PORT1				0x168A00
 #define _MG_PLL_DIV0_PORT2				0x169A00
@@ -9580,8 +9618,8 @@ enum skl_power_gate {
 #define   MG_PLL_DIV0_FBDIV_FRAC(x)			((x) << 8)
 #define   MG_PLL_DIV0_FBDIV_INT_MASK			(0xff << 0)
 #define   MG_PLL_DIV0_FBDIV_INT(x)			((x) << 0)
-#define MG_PLL_DIV0(port) _MMIO_PORT((port) - PORT_C, _MG_PLL_DIV0_PORT1, \
-				     _MG_PLL_DIV0_PORT2)
+#define MG_PLL_DIV0(tc_port) _MMIO_PORT((tc_port), _MG_PLL_DIV0_PORT1, \
+					_MG_PLL_DIV0_PORT2)
 
 #define _MG_PLL_DIV1_PORT1				0x168A04
 #define _MG_PLL_DIV1_PORT2				0x169A04
@@ -9595,8 +9633,8 @@ enum skl_power_gate {
 #define   MG_PLL_DIV1_NDIVRATIO(x)			((x) << 4)
 #define   MG_PLL_DIV1_FBPREDIV_MASK			(0xf << 0)
 #define   MG_PLL_DIV1_FBPREDIV(x)			((x) << 0)
-#define MG_PLL_DIV1(port) _MMIO_PORT((port) - PORT_C, _MG_PLL_DIV1_PORT1, \
-				     _MG_PLL_DIV1_PORT2)
+#define MG_PLL_DIV1(tc_port) _MMIO_PORT((tc_port), _MG_PLL_DIV1_PORT1, \
+					_MG_PLL_DIV1_PORT2)
 
 #define _MG_PLL_LF_PORT1				0x168A08
 #define _MG_PLL_LF_PORT2				0x169A08
@@ -9608,8 +9646,8 @@ enum skl_power_gate {
 #define   MG_PLL_LF_GAINCTRL(x)				((x) << 16)
 #define   MG_PLL_LF_INT_COEFF(x)			((x) << 8)
 #define   MG_PLL_LF_PROP_COEFF(x)			((x) << 0)
-#define MG_PLL_LF(port) _MMIO_PORT((port) - PORT_C, _MG_PLL_LF_PORT1, \
-				   _MG_PLL_LF_PORT2)
+#define MG_PLL_LF(tc_port) _MMIO_PORT((tc_port), _MG_PLL_LF_PORT1, \
+				      _MG_PLL_LF_PORT2)
 
 #define _MG_PLL_FRAC_LOCK_PORT1				0x168A0C
 #define _MG_PLL_FRAC_LOCK_PORT2				0x169A0C
@@ -9621,9 +9659,9 @@ enum skl_power_gate {
 #define   MG_PLL_FRAC_LOCK_DCODITHEREN			(1 << 10)
 #define   MG_PLL_FRAC_LOCK_FEEDFWRDCAL_EN		(1 << 8)
 #define   MG_PLL_FRAC_LOCK_FEEDFWRDGAIN(x)		((x) << 0)
-#define MG_PLL_FRAC_LOCK(port) _MMIO_PORT((port) - PORT_C, \
-					  _MG_PLL_FRAC_LOCK_PORT1, \
-					  _MG_PLL_FRAC_LOCK_PORT2)
+#define MG_PLL_FRAC_LOCK(tc_port) _MMIO_PORT((tc_port), \
+					     _MG_PLL_FRAC_LOCK_PORT1, \
+					     _MG_PLL_FRAC_LOCK_PORT2)
 
 #define _MG_PLL_SSC_PORT1				0x168A10
 #define _MG_PLL_SSC_PORT2				0x169A10
@@ -9635,8 +9673,8 @@ enum skl_power_gate {
 #define   MG_PLL_SSC_STEPNUM(x)				((x) << 10)
 #define   MG_PLL_SSC_FLLEN				(1 << 9)
 #define   MG_PLL_SSC_STEPSIZE(x)			((x) << 0)
-#define MG_PLL_SSC(port) _MMIO_PORT((port) - PORT_C, _MG_PLL_SSC_PORT1, \
-				    _MG_PLL_SSC_PORT2)
+#define MG_PLL_SSC(tc_port) _MMIO_PORT((tc_port), _MG_PLL_SSC_PORT1, \
+				       _MG_PLL_SSC_PORT2)
 
 #define _MG_PLL_BIAS_PORT1				0x168A14
 #define _MG_PLL_BIAS_PORT2				0x169A14
@@ -9655,8 +9693,8 @@ enum skl_power_gate {
 #define   MG_PLL_BIAS_VREF_RDAC_MASK			(0x7 << 5)
 #define   MG_PLL_BIAS_IREFTRIM(x)			((x) << 0)
 #define   MG_PLL_BIAS_IREFTRIM_MASK			(0x1f << 0)
-#define MG_PLL_BIAS(port) _MMIO_PORT((port) - PORT_C, _MG_PLL_BIAS_PORT1, \
-				     _MG_PLL_BIAS_PORT2)
+#define MG_PLL_BIAS(tc_port) _MMIO_PORT((tc_port), _MG_PLL_BIAS_PORT1, \
+					_MG_PLL_BIAS_PORT2)
 
 #define _MG_PLL_TDC_COLDST_BIAS_PORT1			0x168A18
 #define _MG_PLL_TDC_COLDST_BIAS_PORT2			0x169A18
@@ -9667,9 +9705,9 @@ enum skl_power_gate {
 #define   MG_PLL_TDC_COLDST_COLDSTART			(1 << 16)
 #define   MG_PLL_TDC_TDCOVCCORR_EN			(1 << 2)
 #define   MG_PLL_TDC_TDCSEL(x)				((x) << 0)
-#define MG_PLL_TDC_COLDST_BIAS(port) _MMIO_PORT((port) - PORT_C, \
-						_MG_PLL_TDC_COLDST_BIAS_PORT1, \
-						_MG_PLL_TDC_COLDST_BIAS_PORT2)
+#define MG_PLL_TDC_COLDST_BIAS(tc_port) _MMIO_PORT((tc_port), \
+						   _MG_PLL_TDC_COLDST_BIAS_PORT1, \
+						   _MG_PLL_TDC_COLDST_BIAS_PORT2)
 
 #define _CNL_DPLL0_CFGCR0		0x6C000
 #define _CNL_DPLL1_CFGCR0		0x6C080
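
All the MG PHY accessors above now take a Type-C port index rather than an
enum port, hoisting the (port) - PORT_C translation out to the callers so the
macros no longer silently assume PORT_C is the first Type-C-capable port. At
a call site the conversion happens once, along the lines of:

	/* Illustrative call-site change; the helper name is an
	 * assumption about the surrounding code. */
	enum tc_port tc_port = intel_port_to_tc(dev_priv, port);

	val = I915_READ(MG_PLL_DIV0(tc_port));	/* was MG_PLL_DIV0(port) */
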
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index ca95ab2f4cfa..c2a5c48c7541 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -29,6 +29,8 @@
 #include <linux/sched/signal.h>
 
 #include "i915_drv.h"
+#include "i915_active.h"
+#include "i915_reset.h"
 
 static const char *i915_fence_get_driver_name(struct dma_fence *fence)
 {
@@ -59,7 +61,7 @@ static bool i915_fence_signaled(struct dma_fence *fence)
 
 static bool i915_fence_enable_signaling(struct dma_fence *fence)
 {
-	return intel_engine_enable_signaling(to_request(fence), true);
+	return i915_request_enable_breadcrumb(to_request(fence));
 }
 
 static signed long i915_fence_wait(struct dma_fence *fence,
@@ -111,99 +113,10 @@ i915_request_remove_from_client(struct i915_request *request)
 	spin_unlock(&file_priv->mm.lock);
 }
 
-static int reset_all_global_seqno(struct drm_i915_private *i915, u32 seqno)
+static void reserve_gt(struct drm_i915_private *i915)
 {
-	struct intel_engine_cs *engine;
-	struct i915_timeline *timeline;
-	enum intel_engine_id id;
-	int ret;
-
-	/* Carefully retire all requests without writing to the rings */
-	ret = i915_gem_wait_for_idle(i915,
-				     I915_WAIT_INTERRUPTIBLE |
-				     I915_WAIT_LOCKED,
-				     MAX_SCHEDULE_TIMEOUT);
-	if (ret)
-		return ret;
-
-	GEM_BUG_ON(i915->gt.active_requests);
-
-	/* If the seqno wraps around, we need to clear the breadcrumb rbtree */
-	for_each_engine(engine, i915, id) {
-		GEM_TRACE("%s seqno %d (current %d) -> %d\n",
-			  engine->name,
-			  engine->timeline.seqno,
-			  intel_engine_get_seqno(engine),
-			  seqno);
-
-		if (seqno == engine->timeline.seqno)
-			continue;
-
-		kthread_park(engine->breadcrumbs.signaler);
-
-		if (!i915_seqno_passed(seqno, engine->timeline.seqno)) {
-			/* Flush any waiters before we reuse the seqno */
-			intel_engine_disarm_breadcrumbs(engine);
-			intel_engine_init_hangcheck(engine);
-			GEM_BUG_ON(!list_empty(&engine->breadcrumbs.signals));
-		}
-
-		/* Check we are idle before we fiddle with hw state! */
-		GEM_BUG_ON(!intel_engine_is_idle(engine));
-		GEM_BUG_ON(i915_gem_active_isset(&engine->timeline.last_request));
-
-		/* Finally reset hw state */
-		intel_engine_init_global_seqno(engine, seqno);
-		engine->timeline.seqno = seqno;
-
-		kthread_unpark(engine->breadcrumbs.signaler);
-	}
-
-	list_for_each_entry(timeline, &i915->gt.timelines, link)
-		memset(timeline->global_sync, 0, sizeof(timeline->global_sync));
-
-	i915->gt.request_serial = seqno;
-
-	return 0;
-}
-
-int i915_gem_set_global_seqno(struct drm_device *dev, u32 seqno)
-{
-	struct drm_i915_private *i915 = to_i915(dev);
-
-	lockdep_assert_held(&i915->drm.struct_mutex);
-
-	if (seqno == 0)
-		return -EINVAL;
-
-	/* HWS page needs to be set less than what we will inject to ring */
-	return reset_all_global_seqno(i915, seqno - 1);
-}
-
-static int reserve_gt(struct drm_i915_private *i915)
-{
-	int ret;
-
-	/*
-	 * Reservation is fine until we may need to wrap around
-	 *
-	 * By incrementing the serial for every request, we know that no
-	 * individual engine may exceed that serial (as each is reset to 0
-	 * on any wrap). This protects even the most pessimistic of migrations
-	 * of every request from all engines onto just one.
-	 */
-	while (unlikely(++i915->gt.request_serial == 0)) {
-		ret = reset_all_global_seqno(i915, 0);
-		if (ret) {
-			i915->gt.request_serial--;
-			return ret;
-		}
-	}
-
 	if (!i915->gt.active_requests++)
 		i915_gem_unpark(i915);
-
-	return 0;
 }
 
 static void unreserve_gt(struct drm_i915_private *i915)
@@ -213,12 +126,6 @@ static void unreserve_gt(struct drm_i915_private *i915)
 		i915_gem_park(i915);
 }
 
-void i915_gem_retire_noop(struct i915_gem_active *active,
-			  struct i915_request *request)
-{
-	/* Space left intentionally blank */
-}
-
 static void advance_ring(struct i915_request *request)
 {
 	struct intel_ring *ring = request->ring;
@@ -270,10 +177,11 @@ static void free_capture_list(struct i915_request *request)
 static void __retire_engine_request(struct intel_engine_cs *engine,
 				    struct i915_request *rq)
 {
-	GEM_TRACE("%s(%s) fence %llx:%d, global=%d, current %d\n",
+	GEM_TRACE("%s(%s) fence %llx:%lld, global=%d, current %d:%d\n",
 		  __func__, engine->name,
 		  rq->fence.context, rq->fence.seqno,
 		  rq->global_seqno,
+		  hwsp_seqno(rq),
 		  intel_engine_get_seqno(engine));
 
 	GEM_BUG_ON(!i915_request_completed(rq));
@@ -286,10 +194,11 @@ static void __retire_engine_request(struct intel_engine_cs *engine,
 	spin_unlock(&engine->timeline.lock);
 
 	spin_lock(&rq->lock);
-	if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags))
+	i915_request_mark_complete(rq);
+	if (!i915_request_signaled(rq))
 		dma_fence_signal_locked(&rq->fence);
 	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &rq->fence.flags))
-		intel_engine_cancel_signaling(rq);
+		i915_request_cancel_breadcrumb(rq);
 	if (rq->waitboost) {
 		GEM_BUG_ON(!atomic_read(&rq->i915->gt_pm.rps.num_waiters));
 		atomic_dec(&rq->i915->gt_pm.rps.num_waiters);
@@ -330,12 +239,13 @@ static void __retire_engine_upto(struct intel_engine_cs *engine,
 
 static void i915_request_retire(struct i915_request *request)
 {
-	struct i915_gem_active *active, *next;
+	struct i915_active_request *active, *next;
 
-	GEM_TRACE("%s fence %llx:%d, global=%d, current %d\n",
+	GEM_TRACE("%s fence %llx:%lld, global=%d, current %d:%d\n",
 		  request->engine->name,
 		  request->fence.context, request->fence.seqno,
 		  request->global_seqno,
+		  hwsp_seqno(request),
 		  intel_engine_get_seqno(request->engine));
 
 	lockdep_assert_held(&request->i915->drm.struct_mutex);
@@ -363,10 +273,10 @@ static void i915_request_retire(struct i915_request *request)
 		 * we may spend an inordinate amount of time simply handling
 		 * the retirement of requests and processing their callbacks.
 		 * Of which, this loop itself is particularly hot due to the
-		 * cache misses when jumping around the list of i915_gem_active.
-		 * So we try to keep this loop as streamlined as possible and
-		 * also prefetch the next i915_gem_active to try and hide
-		 * the likely cache miss.
+		 * cache misses when jumping around the list of
+		 * i915_active_request.  So we try to keep this loop as
+		 * streamlined as possible and also prefetch the next
+		 * i915_active_request to try and hide the likely cache miss.
 		 */
 		prefetchw(next);
 
@@ -395,10 +305,11 @@ void i915_request_retire_upto(struct i915_request *rq)
 	struct intel_ring *ring = rq->ring;
 	struct i915_request *tmp;
 
-	GEM_TRACE("%s fence %llx:%d, global=%d, current %d\n",
+	GEM_TRACE("%s fence %llx:%lld, global=%d, current %d:%d\n",
 		  rq->engine->name,
 		  rq->fence.context, rq->fence.seqno,
 		  rq->global_seqno,
+		  hwsp_seqno(rq),
 		  intel_engine_get_seqno(rq->engine));
 
 	lockdep_assert_held(&rq->i915->drm.struct_mutex);
@@ -417,7 +328,7 @@ void i915_request_retire_upto(struct i915_request *rq)
 
 static u32 timeline_get_seqno(struct i915_timeline *tl)
 {
-	return ++tl->seqno;
+	return tl->seqno += 1 + tl->has_initial_breadcrumb;
 }
 
 static void move_to_timeline(struct i915_request *request,
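
timeline_get_seqno() now advances by two on timelines that emit an initial
breadcrumb: with per-timeline status pages, seqno - 1 is written as the
request starts executing (which is what the new rq->hwsp_seqno tracking keys
off) and seqno when it completes, so both values must be reserved per request:

	/* Worked example of the reservation:
	 *   has_initial_breadcrumb: tl->seqno 10 -> request seqno 12;
	 *     the HWSP shows 11 once started, 12 once finished.
	 *   otherwise:              tl->seqno 10 -> request seqno 11.
	 */
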
@@ -431,15 +342,23 @@ static void move_to_timeline(struct i915_request *request,
 	spin_unlock(&request->timeline->lock);
 }
 
+static u32 next_global_seqno(struct i915_timeline *tl)
+{
+	if (!++tl->seqno)
+		++tl->seqno;
+	return tl->seqno;
+}
+
 void __i915_request_submit(struct i915_request *request)
 {
 	struct intel_engine_cs *engine = request->engine;
 	u32 seqno;
 
-	GEM_TRACE("%s fence %llx:%d -> global=%d, current %d\n",
+	GEM_TRACE("%s fence %llx:%lld -> global=%d, current %d:%d\n",
 		  engine->name,
 		  request->fence.context, request->fence.seqno,
 		  engine->timeline.seqno + 1,
+		  hwsp_seqno(request),
 		  intel_engine_get_seqno(engine));
 
 	GEM_BUG_ON(!irqs_disabled());
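
next_global_seqno() preserves the invariant that a global seqno of 0 means
"not yet submitted" (note the GEM_BUG_ON(!seqno) just below), so on wrap it
steps over zero:

	/* e.g. engine->timeline.seqno == 0xffffffff: the increment wraps
	 * to 0, which is reserved, so it is bumped once more and the
	 * request is submitted with global_seqno == 1. */
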
@@ -447,26 +366,27 @@ void __i915_request_submit(struct i915_request *request)
 
 	GEM_BUG_ON(request->global_seqno);
 
-	seqno = timeline_get_seqno(&engine->timeline);
+	seqno = next_global_seqno(&engine->timeline);
 	GEM_BUG_ON(!seqno);
 	GEM_BUG_ON(intel_engine_signaled(engine, seqno));
 
 	/* We may be recursing from the signal callback of another i915 fence */
 	spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
+	GEM_BUG_ON(test_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags));
+	set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags);
 	request->global_seqno = seqno;
-	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags))
-		intel_engine_enable_signaling(request, false);
+	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags) &&
+	    !i915_request_enable_breadcrumb(request))
+		intel_engine_queue_breadcrumbs(engine);
 	spin_unlock(&request->lock);
 
-	engine->emit_breadcrumb(request,
-				request->ring->vaddr + request->postfix);
+	engine->emit_fini_breadcrumb(request,
+				     request->ring->vaddr + request->postfix);
 
 	/* Transfer from per-context onto the global per-engine timeline */
 	move_to_timeline(request, &engine->timeline);
 
 	trace_i915_request_execute(request);
-
-	wake_up_all(&request->execute);
 }
 
 void i915_request_submit(struct i915_request *request)
@@ -486,10 +406,11 @@ void __i915_request_unsubmit(struct i915_request *request)
 {
 	struct intel_engine_cs *engine = request->engine;
 
-	GEM_TRACE("%s fence %llx:%d <- global=%d, current %d\n",
+	GEM_TRACE("%s fence %llx:%lld <- global=%d, current %d:%d\n",
 		  engine->name,
 		  request->fence.context, request->fence.seqno,
 		  request->global_seqno,
+		  hwsp_seqno(request),
 		  intel_engine_get_seqno(engine));
 
 	GEM_BUG_ON(!irqs_disabled());
@@ -508,7 +429,9 @@ void __i915_request_unsubmit(struct i915_request *request)
 	spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
 	request->global_seqno = 0;
 	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags))
-		intel_engine_cancel_signaling(request);
+		i915_request_cancel_breadcrumb(request);
+	GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags));
+	clear_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags);
 	spin_unlock(&request->lock);
 
 	/* Transfer back from the global per-engine timeline to per-context */
@@ -566,6 +489,43 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
 	return NOTIFY_DONE;
 }
 
+static void ring_retire_requests(struct intel_ring *ring)
+{
+	struct i915_request *rq, *rn;
+
+	list_for_each_entry_safe(rq, rn, &ring->request_list, ring_link) {
+		if (!i915_request_completed(rq))
+			break;
+
+		i915_request_retire(rq);
+	}
+}
+
+static noinline struct i915_request *
+i915_request_alloc_slow(struct intel_context *ce)
+{
+	struct intel_ring *ring = ce->ring;
+	struct i915_request *rq;
+
+	if (list_empty(&ring->request_list))
+		goto out;
+
+	/* Ratelimit ourselves to prevent oom from malicious clients */
+	rq = list_last_entry(&ring->request_list, typeof(*rq), ring_link);
+	cond_synchronize_rcu(rq->rcustate);
+
+	/* Retire our old requests in the hope that we free some */
+	ring_retire_requests(ring);
+
+out:
+	return kmem_cache_alloc(ce->gem_context->i915->requests, GFP_KERNEL);
+}
+
+static int add_timeline_barrier(struct i915_request *rq)
+{
+	return i915_request_await_active_request(rq, &rq->timeline->barrier);
+}
+
 /**
  * i915_request_alloc - allocate a request structure
  *
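
i915_request_alloc_slow() consolidates the old inline slow path: it throttles
by waiting out the RCU grace period of the oldest request still on this ring,
retires only that ring's completed requests, and lets the caller retry a
blocking allocation. The caller's shape (mirroring the hunk further down):

	/* rq = kmem_cache_alloc(i915->requests,
	 *		GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
	 * if (unlikely(!rq))
	 *	rq = i915_request_alloc_slow(ce); // may sleep in cond_synchronize_rcu()
	 *
	 * cond_synchronize_rcu(rq->rcustate) blocks only if the grace period
	 * sampled when that oldest request was allocated has not yet elapsed,
	 * rate-limiting clients that outrun RCU-deferred frees.
	 */
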
@@ -608,13 +568,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	if (IS_ERR(ce))
 		return ERR_CAST(ce);
 
-	ret = reserve_gt(i915);
-	if (ret)
-		goto err_unpin;
-
-	ret = intel_ring_wait_for_space(ce->ring, MIN_SPACE_FOR_ADD_REQUEST);
-	if (ret)
-		goto err_unreserve;
+	reserve_gt(i915);
 
 	/* Move our oldest request to the slab-cache (if not in use!) */
 	rq = list_first_entry(&ce->ring->request_list, typeof(*rq), ring_link);
@@ -628,7 +582,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 * We use RCU to look up requests in flight. The lookups may
 	 * race with the request being allocated from the slab freelist.
 	 * That is the request we are writing to here, may be in the process
-	 * of being read by __i915_gem_active_get_rcu(). As such,
+	 * of being read by __i915_active_request_get_rcu(). As such,
 	 * we have to be very careful when overwriting the contents. During
 	 * the RCU lookup, we chase the request->engine pointer,
 	 * read the request->global_seqno and increment the reference count.
@@ -654,15 +608,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	rq = kmem_cache_alloc(i915->requests,
 			      GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
 	if (unlikely(!rq)) {
-		i915_retire_requests(i915);
-
-		/* Ratelimit ourselves to prevent oom from malicious clients */
-		rq = i915_gem_active_raw(&ce->ring->timeline->last_request,
-					 &i915->drm.struct_mutex);
-		if (rq)
-			cond_synchronize_rcu(rq->rcustate);
-
-		rq = kmem_cache_alloc(i915->requests, GFP_KERNEL);
+		rq = i915_request_alloc_slow(ce);
 		if (!rq) {
 			ret = -ENOMEM;
 			goto err_unreserve;
@@ -679,6 +625,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	rq->ring = ce->ring;
 	rq->timeline = ce->ring->timeline;
 	GEM_BUG_ON(rq->timeline == &engine->timeline);
+	rq->hwsp_seqno = rq->timeline->hwsp_seqno;
 
 	spin_lock_init(&rq->lock);
 	dma_fence_init(&rq->fence,
@@ -689,13 +636,11 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 
 	/* We bump the ref for the fence chain */
 	i915_sw_fence_init(&i915_request_get(rq)->submit, submit_notify);
-	init_waitqueue_head(&rq->execute);
 
 	i915_sched_node_init(&rq->sched);
 
 	/* No zalloc, must clear what we need by hand */
 	rq->global_seqno = 0;
-	rq->signaling.wait.seqno = 0;
 	rq->file_priv = NULL;
 	rq->batch = NULL;
 	rq->capture_list = NULL;
@@ -707,9 +652,13 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 * i915_request_add() call can't fail. Note that the reserve may need
 	 * to be redone if the request is not actually submitted straight
 	 * away, e.g. because a GPU scheduler has deferred it.
+	 *
+	 * Note that due to how we add reserved_space to intel_ring_begin()
+	 * we need to double our reservation to ensure that if we need to wrap
+	 * around inside i915_request_add() there is sufficient space at
+	 * the beginning of the ring as well.
 	 */
-	rq->reserved_space = MIN_SPACE_FOR_ADD_REQUEST;
-	GEM_BUG_ON(rq->reserved_space < engine->emit_breadcrumb_sz);
+	rq->reserved_space = 2 * engine->emit_fini_breadcrumb_dw * sizeof(u32);
 
 	/*
 	 * Record the position of the start of the request so that
@@ -719,8 +668,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 */
 	rq->head = rq->ring->emit;
 
-	/* Unconditionally invalidate GPU caches and TLBs. */
-	ret = engine->emit_flush(rq, EMIT_INVALIDATE);
+	ret = add_timeline_barrier(rq);
 	if (ret)
 		goto err_unwind;
 
@@ -748,7 +696,6 @@ err_unwind:
 	kmem_cache_free(i915->requests, rq);
 err_unreserve:
 	unreserve_gt(i915);
-err_unpin:
 	intel_context_unpin(ce);
 	return ERR_PTR(ret);
 }
@@ -776,34 +723,12 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from)
 		ret = i915_sw_fence_await_sw_fence_gfp(&to->submit,
 						       &from->submit,
 						       I915_FENCE_GFP);
-		return ret < 0 ? ret : 0;
-	}
-
-	if (to->engine->semaphore.sync_to) {
-		u32 seqno;
-
-		GEM_BUG_ON(!from->engine->semaphore.signal);
-
-		seqno = i915_request_global_seqno(from);
-		if (!seqno)
-			goto await_dma_fence;
-
-		if (seqno <= to->timeline->global_sync[from->engine->id])
-			return 0;
-
-		trace_i915_gem_ring_sync_to(to, from);
-		ret = to->engine->semaphore.sync_to(to, from);
-		if (ret)
-			return ret;
-
-		to->timeline->global_sync[from->engine->id] = seqno;
-		return 0;
+	} else {
+		ret = i915_sw_fence_await_dma_fence(&to->submit,
+						    &from->fence, 0,
+						    I915_FENCE_GFP);
 	}
 
-await_dma_fence:
-	ret = i915_sw_fence_await_dma_fence(&to->submit,
-					    &from->fence, 0,
-					    I915_FENCE_GFP);
 	return ret < 0 ? ret : 0;
 }
 
@@ -961,7 +886,7 @@ void i915_request_add(struct i915_request *request)
 	struct i915_request *prev;
 	u32 *cs;
 
-	GEM_TRACE("%s fence %llx:%d\n",
+	GEM_TRACE("%s fence %llx:%lld\n",
 		  engine->name, request->fence.context, request->fence.seqno);
 
 	lockdep_assert_held(&request->i915->drm.struct_mutex);
@@ -979,8 +904,8 @@ void i915_request_add(struct i915_request *request)
 	 * should already have been reserved in the ring buffer. Let the ring
 	 * know that it is time to use that space up.
 	 */
+	GEM_BUG_ON(request->reserved_space > request->ring->space);
 	request->reserved_space = 0;
-	engine->emit_flush(request, EMIT_FLUSH);
 
 	/*
 	 * Record the position of the start of the breadcrumb so that
@@ -988,7 +913,7 @@ void i915_request_add(struct i915_request *request)
 	 * GPU processing the request, we never over-estimate the
 	 * position of the ring's HEAD.
 	 */
-	cs = intel_ring_begin(request, engine->emit_breadcrumb_sz);
+	cs = intel_ring_begin(request, engine->emit_fini_breadcrumb_dw);
 	GEM_BUG_ON(IS_ERR(cs));
 	request->postfix = intel_ring_offset(request, cs);
 
@@ -999,8 +924,8 @@ void i915_request_add(struct i915_request *request)
 	 * see a more recent value in the hws than we are tracking.
 	 */
 
-	prev = i915_gem_active_raw(&timeline->last_request,
-				   &request->i915->drm.struct_mutex);
+	prev = i915_active_request_raw(&timeline->last_request,
+				       &request->i915->drm.struct_mutex);
 	if (prev && !i915_request_completed(prev)) {
 		i915_sw_fence_await_sw_fence(&request->submit, &prev->submit,
 					     &request->submitq);
@@ -1016,7 +941,7 @@ void i915_request_add(struct i915_request *request)
 	spin_unlock_irq(&timeline->lock);
 
 	GEM_BUG_ON(timeline->seqno != request->fence.seqno);
-	i915_gem_active_set(&timeline->last_request, request);
+	__i915_active_request_set(&timeline->last_request, request);
 
 	list_add_tail(&request->ring_link, &ring->request_list);
 	if (list_is_first(&request->ring_link, &ring->request_list)) {
@@ -1047,7 +972,7 @@ void i915_request_add(struct i915_request *request)
 		 * Allow interactive/synchronous clients to jump ahead of
 		 * the bulk clients. (FQ_CODEL)
 		 */
-		if (!prev || i915_request_completed(prev))
+		if (list_empty(&request->sched.signalers_list))
 			attr.priority |= I915_PRIORITY_NEWCLIENT;
 
 		engine->schedule(request, &attr);
@@ -1110,13 +1035,10 @@ static bool busywait_stop(unsigned long timeout, unsigned int cpu)
 	return this_cpu != cpu;
 }
 
-static bool __i915_spin_request(const struct i915_request *rq,
-				u32 seqno, int state, unsigned long timeout_us)
+static bool __i915_spin_request(const struct i915_request * const rq,
+				int state, unsigned long timeout_us)
 {
-	struct intel_engine_cs *engine = rq->engine;
-	unsigned int irq, cpu;
-
-	GEM_BUG_ON(!seqno);
+	unsigned int cpu;
 
 	/*
 	 * Only wait for the request if we know it is likely to complete.
@@ -1124,12 +1046,12 @@ static bool __i915_spin_request(const struct i915_request *rq,
 	 * We don't track the timestamps around requests, nor the average
 	 * request length, so we do not have a good indicator that this
 	 * request will complete within the timeout. What we do know is the
-	 * order in which requests are executed by the engine and so we can
-	 * tell if the request has started. If the request hasn't started yet,
-	 * it is a fair assumption that it will not complete within our
-	 * relatively short timeout.
+	 * order in which requests are executed by the context and so we can
+	 * tell if the request has been started. If the request is not even
+	 * running yet, it is a fair assumption that it will not complete
+	 * within our relatively short timeout.
 	 */
-	if (!intel_engine_has_started(engine, seqno))
+	if (!i915_request_is_running(rq))
 		return false;
 
 	/*
@@ -1143,20 +1065,10 @@ static bool __i915_spin_request(const struct i915_request *rq,
 	 * takes to sleep on a request, on the order of a microsecond.
 	 */
 
-	irq = READ_ONCE(engine->breadcrumbs.irq_count);
 	timeout_us += local_clock_us(&cpu);
 	do {
-		if (intel_engine_has_completed(engine, seqno))
-			return seqno == i915_request_global_seqno(rq);
-
-		/*
-		 * Seqno are meant to be ordered *before* the interrupt. If
-		 * we see an interrupt without a corresponding seqno advance,
-		 * assume we won't see one in the near future but require
-		 * the engine->seqno_barrier() to fixup coherency.
-		 */
-		if (READ_ONCE(engine->breadcrumbs.irq_count) != irq)
-			break;
+		if (i915_request_completed(rq))
+			return true;
 
 		if (signal_pending_state(state, current))
 			break;
@@ -1170,16 +1082,16 @@ static bool __i915_spin_request(const struct i915_request *rq,
 	return false;
 }
 
-static bool __i915_wait_request_check_and_reset(struct i915_request *request)
-{
-	struct i915_gpu_error *error = &request->i915->gpu_error;
+struct request_wait {
+	struct dma_fence_cb cb;
+	struct task_struct *tsk;
+};
 
-	if (likely(!i915_reset_handoff(error)))
-		return false;
+static void request_wait_wake(struct dma_fence *fence, struct dma_fence_cb *cb)
+{
+	struct request_wait *wait = container_of(cb, typeof(*wait), cb);
 
-	__set_current_state(TASK_RUNNING);
-	i915_reset(request->i915, error->stalled_mask, error->reason);
-	return true;
+	wake_up_process(wait->tsk);
 }
 
 /**
@@ -1207,17 +1119,9 @@ long i915_request_wait(struct i915_request *rq,
 {
 	const int state = flags & I915_WAIT_INTERRUPTIBLE ?
 		TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
-	wait_queue_head_t *errq = &rq->i915->gpu_error.wait_queue;
-	DEFINE_WAIT_FUNC(reset, default_wake_function);
-	DEFINE_WAIT_FUNC(exec, default_wake_function);
-	struct intel_wait wait;
+	struct request_wait wait;
 
 	might_sleep();
-#if IS_ENABLED(CONFIG_LOCKDEP)
-	GEM_BUG_ON(debug_locks &&
-		   !!lockdep_is_held(&rq->i915->drm.struct_mutex) !=
-		   !!(flags & I915_WAIT_LOCKED));
-#endif
 	GEM_BUG_ON(timeout < 0);
 
 	if (i915_request_completed(rq))
@@ -1228,57 +1132,23 @@ long i915_request_wait(struct i915_request *rq,
 
 	trace_i915_request_wait_begin(rq, flags);
 
-	add_wait_queue(&rq->execute, &exec);
-	if (flags & I915_WAIT_LOCKED)
-		add_wait_queue(errq, &reset);
+	/* Optimistic short spin before touching IRQs */
+	if (__i915_spin_request(rq, state, 5))
+		goto out;
 
-	intel_wait_init(&wait);
 	if (flags & I915_WAIT_PRIORITY)
 		i915_schedule_bump_priority(rq, I915_PRIORITY_WAIT);
 
-restart:
-	do {
-		set_current_state(state);
-		if (intel_wait_update_request(&wait, rq))
-			break;
-
-		if (flags & I915_WAIT_LOCKED &&
-		    __i915_wait_request_check_and_reset(rq))
-			continue;
-
-		if (signal_pending_state(state, current)) {
-			timeout = -ERESTARTSYS;
-			goto complete;
-		}
-
-		if (!timeout) {
-			timeout = -ETIME;
-			goto complete;
-		}
-
-		timeout = io_schedule_timeout(timeout);
-	} while (1);
-
-	GEM_BUG_ON(!intel_wait_has_seqno(&wait));
-	GEM_BUG_ON(!i915_sw_fence_signaled(&rq->submit));
+	wait.tsk = current;
+	if (dma_fence_add_callback(&rq->fence, &wait.cb, request_wait_wake))
+		goto out;
 
-	/* Optimistic short spin before touching IRQs */
-	if (__i915_spin_request(rq, wait.seqno, state, 5))
-		goto complete;
-
-	set_current_state(state);
-	if (intel_engine_add_wait(rq->engine, &wait))
-		/*
-		 * In order to check that we haven't missed the interrupt
-		 * as we enabled it, we need to kick ourselves to do a
-		 * coherent check on the seqno before we sleep.
-		 */
-		goto wakeup;
+	for (;;) {
+		set_current_state(state);
 
-	if (flags & I915_WAIT_LOCKED)
-		__i915_wait_request_check_and_reset(rq);
+		if (i915_request_completed(rq))
+			break;
 
-	for (;;) {
 		if (signal_pending_state(state, current)) {
 			timeout = -ERESTARTSYS;
 			break;
@@ -1290,70 +1160,14 @@ restart:
 		}
 
 		timeout = io_schedule_timeout(timeout);
-
-		if (intel_wait_complete(&wait) &&
-		    intel_wait_check_request(&wait, rq))
-			break;
-
-		set_current_state(state);
-
-wakeup:
-		/*
-		 * Carefully check if the request is complete, giving time
-		 * for the seqno to be visible following the interrupt.
-		 * We also have to check in case we are kicked by the GPU
-		 * reset in order to drop the struct_mutex.
-		 */
-		if (__i915_request_irq_complete(rq))
-			break;
-
-		/*
-		 * If the GPU is hung, and we hold the lock, reset the GPU
-		 * and then check for completion. On a full reset, the engine's
-		 * HW seqno will be advanced passed us and we are complete.
-		 * If we do a partial reset, we have to wait for the GPU to
-		 * resume and update the breadcrumb.
-		 *
-		 * If we don't hold the mutex, we can just wait for the worker
-		 * to come along and update the breadcrumb (either directly
-		 * itself, or indirectly by recovering the GPU).
-		 */
-		if (flags & I915_WAIT_LOCKED &&
-		    __i915_wait_request_check_and_reset(rq))
-			continue;
-
-		/* Only spin if we know the GPU is processing this request */
-		if (__i915_spin_request(rq, wait.seqno, state, 2))
-			break;
-
-		if (!intel_wait_check_request(&wait, rq)) {
-			intel_engine_remove_wait(rq->engine, &wait);
-			goto restart;
-		}
 	}
-
-	intel_engine_remove_wait(rq->engine, &wait);
-complete:
 	__set_current_state(TASK_RUNNING);
-	if (flags & I915_WAIT_LOCKED)
-		remove_wait_queue(errq, &reset);
-	remove_wait_queue(&rq->execute, &exec);
-	trace_i915_request_wait_end(rq);
-
-	return timeout;
-}
 
-static void ring_retire_requests(struct intel_ring *ring)
-{
-	struct i915_request *request, *next;
+	dma_fence_remove_callback(&rq->fence, &wait.cb);
 
-	list_for_each_entry_safe(request, next,
-				 &ring->request_list, ring_link) {
-		if (!i915_request_completed(request))
-			break;
-
-		i915_request_retire(request);
-	}
+out:
+	trace_i915_request_wait_end(rq);
+	return timeout;
 }
 
 void i915_retire_requests(struct drm_i915_private *i915)
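
The rewritten i915_request_wait() is now a conventional dma-fence wait: an
optimistic short spin, then a dma_fence callback that wakes the sleeping
task. The same pattern reduced to its core, using only the dma-fence API
(uninterruptible and without i915's spin heuristics, as a sketch):

	#include <linux/dma-fence.h>
	#include <linux/sched.h>

	struct wait_cb {
		struct dma_fence_cb cb;
		struct task_struct *tsk;
	};

	static void wake_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
	{
		wake_up_process(container_of(cb, struct wait_cb, cb)->tsk);
	}

	static long wait_on_fence(struct dma_fence *fence, long timeout)
	{
		struct wait_cb wait = { .tsk = current };

		/* Returns -ENOENT if the fence has already signaled. */
		if (dma_fence_add_callback(fence, &wait.cb, wake_cb))
			return timeout;

		for (;;) {
			set_current_state(TASK_UNINTERRUPTIBLE);
			if (dma_fence_is_signaled(fence) || !timeout)
				break;
			timeout = io_schedule_timeout(timeout);
		}
		__set_current_state(TASK_RUNNING);
		dma_fence_remove_callback(fence, &wait.cb);

		return timeout; /* 0 on timeout, else ticks remaining */
	}
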
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 90e9d170a0cd..40f3e8dcbdd5 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -30,7 +30,6 @@
 #include "i915_gem.h"
 #include "i915_scheduler.h"
 #include "i915_sw_fence.h"
-#include "i915_scheduler.h"
 
 #include <uapi/drm/i915_drm.h>
 
@@ -39,23 +38,34 @@ struct drm_i915_gem_object;
 struct i915_request;
 struct i915_timeline;
 
-struct intel_wait {
-	struct rb_node node;
-	struct task_struct *tsk;
-	struct i915_request *request;
-	u32 seqno;
-};
-
-struct intel_signal_node {
-	struct intel_wait wait;
-	struct list_head link;
-};
-
 struct i915_capture_list {
 	struct i915_capture_list *next;
 	struct i915_vma *vma;
 };
 
+enum {
+	/*
+	 * I915_FENCE_FLAG_ACTIVE - this request is currently submitted to HW.
+	 *
+	 * Set by __i915_request_submit() on handing over to HW, and cleared
+	 * by __i915_request_unsubmit() if we preempt this request.
+	 *
+	 * Finally cleared for consistency on retiring the request, when
+	 * we know the HW is no longer running this request.
+	 *
+	 * See i915_request_is_active()
+	 */
+	I915_FENCE_FLAG_ACTIVE = DMA_FENCE_FLAG_USER_BITS,
+
+	/*
+	 * I915_FENCE_FLAG_SIGNAL - this request is currently on signal_list
+	 *
+	 * Internal bookkeeping used by the breadcrumb code to track when
+	 * a request is on the various signal_list.
+	 */
+	I915_FENCE_FLAG_SIGNAL,
+};
+
 /**
  * Request queue structure.
  *
@@ -98,7 +108,7 @@ struct i915_request {
 	struct intel_context *hw_context;
 	struct intel_ring *ring;
 	struct i915_timeline *timeline;
-	struct intel_signal_node signaling;
+	struct list_head signal_link;
 
 	/*
 	 * The rcu epoch of when this request was allocated. Used to judiciously
@@ -117,7 +127,6 @@ struct i915_request {
 	 */
 	struct i915_sw_fence submit;
 	wait_queue_entry_t submitq;
-	wait_queue_head_t execute;
 
 	/*
 	 * A list of everyone we wait upon, and everyone who waits upon us.
@@ -131,6 +140,13 @@ struct i915_request {
 	struct i915_sched_node sched;
 	struct i915_dependency dep;
 
+	/*
+	 * A convenience pointer to the current breadcrumb value stored in
+	 * the HW status page (or our timeline's local equivalent). The full
+	 * path would be rq->hw_context->ring->timeline->hwsp_seqno.
+	 */
+	const u32 *hwsp_seqno;
+
 	/**
 	 * GEM sequence number associated with this request on the
 	 * global execution timeline. It is zero when the request is not
@@ -249,7 +265,7 @@ i915_request_put(struct i915_request *rq)
  * that it has passed the global seqno and the global seqno is unchanged
  * after the read, it is indeed complete).
  */
-static u32
+static inline u32
 i915_request_global_seqno(const struct i915_request *request)
 {
 	return READ_ONCE(request->global_seqno);
@@ -271,6 +287,10 @@ void i915_request_skip(struct i915_request *request, int error);
 void __i915_request_unsubmit(struct i915_request *request);
 void i915_request_unsubmit(struct i915_request *request);
 
+/* Note: part of the intel_breadcrumbs family */
+bool i915_request_enable_breadcrumb(struct i915_request *request);
+void i915_request_cancel_breadcrumb(struct i915_request *request);
+
 long i915_request_wait(struct i915_request *rq,
 		       unsigned int flags,
 		       long timeout)
@@ -281,441 +301,106 @@ long i915_request_wait(struct i915_request *rq,
 #define I915_WAIT_ALL		BIT(3) /* used by i915_gem_object_wait() */
 #define I915_WAIT_FOR_IDLE_BOOST BIT(4)
 
-static inline bool intel_engine_has_started(struct intel_engine_cs *engine,
-					    u32 seqno);
-static inline bool intel_engine_has_completed(struct intel_engine_cs *engine,
-					      u32 seqno);
-
-/**
- * Returns true if seq1 is later than seq2.
- */
-static inline bool i915_seqno_passed(u32 seq1, u32 seq2)
-{
-	return (s32)(seq1 - seq2) >= 0;
-}
-
-/**
- * i915_request_started - check if the request has begun being executed
- * @rq: the request
- *
- * Returns true if the request has been submitted to hardware, and the hardware
- * has advanced passed the end of the previous request and so should be either
- * currently processing the request (though it may be preempted and so
- * not necessarily the next request to complete) or have completed the request.
- */
-static inline bool i915_request_started(const struct i915_request *rq)
-{
-	u32 seqno;
-
-	seqno = i915_request_global_seqno(rq);
-	if (!seqno) /* not yet submitted to HW */
-		return false;
-
-	return intel_engine_has_started(rq->engine, seqno);
-}
-
-static inline bool
-__i915_request_completed(const struct i915_request *rq, u32 seqno)
+static inline bool i915_request_signaled(const struct i915_request *rq)
 {
-	GEM_BUG_ON(!seqno);
-	return intel_engine_has_completed(rq->engine, seqno) &&
-		seqno == i915_request_global_seqno(rq);
+	/* The request may live longer than its HWSP, so check flags first! */
+	return test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags);
 }
 
-static inline bool i915_request_completed(const struct i915_request *rq)
+static inline bool i915_request_is_active(const struct i915_request *rq)
 {
-	u32 seqno;
-
-	seqno = i915_request_global_seqno(rq);
-	if (!seqno)
-		return false;
-
-	return __i915_request_completed(rq, seqno);
+	return test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags);
 }
 
-void i915_retire_requests(struct drm_i915_private *i915);
-
-/*
- * We treat requests as fences. This is not be to confused with our
- * "fence registers" but pipeline synchronisation objects ala GL_ARB_sync.
- * We use the fences to synchronize access from the CPU with activity on the
- * GPU, for example, we should not rewrite an object's PTE whilst the GPU
- * is reading them. We also track fences at a higher level to provide
- * implicit synchronisation around GEM objects, e.g. set-domain will wait
- * for outstanding GPU rendering before marking the object ready for CPU
- * access, or a pageflip will wait until the GPU is complete before showing
- * the frame on the scanout.
- *
- * In order to use a fence, the object must track the fence it needs to
- * serialise with. For example, GEM objects want to track both read and
- * write access so that we can perform concurrent read operations between
- * the CPU and GPU engines, as well as waiting for all rendering to
- * complete, or waiting for the last GPU user of a "fence register". The
- * object then embeds a #i915_gem_active to track the most recent (in
- * retirement order) request relevant for the desired mode of access.
- * The #i915_gem_active is updated with i915_gem_active_set() to track the
- * most recent fence request, typically this is done as part of
- * i915_vma_move_to_active().
- *
- * When the #i915_gem_active completes (is retired), it will
- * signal its completion to the owner through a callback as well as mark
- * itself as idle (i915_gem_active.request == NULL). The owner
- * can then perform any action, such as delayed freeing of an active
- * resource including itself.
- */
-struct i915_gem_active;
-
-typedef void (*i915_gem_retire_fn)(struct i915_gem_active *,
-				   struct i915_request *);
-
-struct i915_gem_active {
-	struct i915_request __rcu *request;
-	struct list_head link;
-	i915_gem_retire_fn retire;
-};
-
-void i915_gem_retire_noop(struct i915_gem_active *,
-			  struct i915_request *request);
-
 /**
- * init_request_active - prepares the activity tracker for use
- * @active - the active tracker
- * @func - a callback when then the tracker is retired (becomes idle),
- *         can be NULL
- *
- * init_request_active() prepares the embedded @active struct for use as
- * an activity tracker, that is for tracking the last known active request
- * associated with it. When the last request becomes idle, when it is retired
- * after completion, the optional callback @func is invoked.
- */
-static inline void
-init_request_active(struct i915_gem_active *active,
-		    i915_gem_retire_fn retire)
-{
-	RCU_INIT_POINTER(active->request, NULL);
-	INIT_LIST_HEAD(&active->link);
-	active->retire = retire ?: i915_gem_retire_noop;
-}
-
-/**
- * i915_gem_active_set - updates the tracker to watch the current request
- * @active - the active tracker
- * @request - the request to watch
- *
- * i915_gem_active_set() watches the given @request for completion. Whilst
- * that @request is busy, the @active reports busy. When that @request is
- * retired, the @active tracker is updated to report idle.
- */
-static inline void
-i915_gem_active_set(struct i915_gem_active *active,
-		    struct i915_request *request)
-{
-	list_move(&active->link, &request->active_list);
-	rcu_assign_pointer(active->request, request);
-}
-
-/**
- * i915_gem_active_set_retire_fn - updates the retirement callback
- * @active - the active tracker
- * @fn - the routine called when the request is retired
- * @mutex - struct_mutex used to guard retirements
- *
- * i915_gem_active_set_retire_fn() updates the function pointer that
- * is called when the final request associated with the @active tracker
- * is retired.
+ * Returns true if seq1 is later than, or the same as, seq2.
  */
-static inline void
-i915_gem_active_set_retire_fn(struct i915_gem_active *active,
-			      i915_gem_retire_fn fn,
-			      struct mutex *mutex)
+static inline bool i915_seqno_passed(u32 seq1, u32 seq2)
 {
-	lockdep_assert_held(mutex);
-	active->retire = fn ?: i915_gem_retire_noop;
+	return (s32)(seq1 - seq2) >= 0;
 }
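Working the signed arithmetic through a few values shows why the comparison survives u32 wraparound (illustrative values only):

	i915_seqno_passed(5, 3);		/* (s32)(5 - 3) ==  2 >= 0: true  */
	i915_seqno_passed(3, 5);		/* (s32)(3 - 5) == -2 <  0: false */
	i915_seqno_passed(1, 0xffffffff);	/* (s32)0x00000002 >= 0:   true,
						 * 1 comes "after" the wrap */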
 
-static inline struct i915_request *
-__i915_gem_active_peek(const struct i915_gem_active *active)
+static inline u32 __hwsp_seqno(const struct i915_request *rq)
 {
-	/*
-	 * Inside the error capture (running with the driver in an unknown
-	 * state), we want to bend the rules slightly (a lot).
-	 *
-	 * Work is in progress to make it safer, in the meantime this keeps
-	 * the known issue from spamming the logs.
-	 */
-	return rcu_dereference_protected(active->request, 1);
+	return READ_ONCE(*rq->hwsp_seqno);
 }
 
 /**
- * i915_gem_active_raw - return the active request
- * @active - the active tracker
+ * hwsp_seqno - the current breadcrumb value in the HW status page
+ * @rq: the request, to chase the relevant HW status page
  *
- * i915_gem_active_raw() returns the current request being tracked, or NULL.
- * It does not obtain a reference on the request for the caller, so the caller
- * must hold struct_mutex.
- */
-static inline struct i915_request *
-i915_gem_active_raw(const struct i915_gem_active *active, struct mutex *mutex)
-{
-	return rcu_dereference_protected(active->request,
-					 lockdep_is_held(mutex));
-}
-
-/**
- * i915_gem_active_peek - report the active request being monitored
- * @active - the active tracker
+ * The emphasis in naming here is that hwsp_seqno() is not a property of the
+ * request, but an indication of the current HW state (associated with this
+ * request). Its value will change as the GPU executes more requests.
  *
- * i915_gem_active_peek() returns the current request being tracked if
- * still active, or NULL. It does not obtain a reference on the request
- * for the caller, so the caller must hold struct_mutex.
+ * Returns the current breadcrumb value in the associated HW status page (or
+ * the local timeline's equivalent) for this request. The request itself
+ * has the associated breadcrumb value of rq->fence.seqno; when the HW
+ * status page shows that breadcrumb or later, this request is complete.
  */
-static inline struct i915_request *
-i915_gem_active_peek(const struct i915_gem_active *active, struct mutex *mutex)
+static inline u32 hwsp_seqno(const struct i915_request *rq)
 {
-	struct i915_request *request;
+	u32 seqno;
 
-	request = i915_gem_active_raw(active, mutex);
-	if (!request || i915_request_completed(request))
-		return NULL;
+	rcu_read_lock(); /* the HWSP may be freed at runtime */
+	seqno = __hwsp_seqno(rq);
+	rcu_read_unlock();
 
-	return request;
+	return seqno;
 }
 
-/**
- * i915_gem_active_get - return a reference to the active request
- * @active - the active tracker
- *
- * i915_gem_active_get() returns a reference to the active request, or NULL
- * if the active tracker is idle. The caller must hold struct_mutex.
- */
-static inline struct i915_request *
-i915_gem_active_get(const struct i915_gem_active *active, struct mutex *mutex)
+static inline bool __i915_request_has_started(const struct i915_request *rq)
 {
-	return i915_request_get(i915_gem_active_peek(active, mutex));
+	return i915_seqno_passed(hwsp_seqno(rq), rq->fence.seqno - 1);
 }
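The "- 1" chases the breadcrumb of the request's predecessor on the same timeline: once the HWSP shows at least rq->fence.seqno - 1, everything queued before rq has completed, so rq itself has at least been handed to the execution stream. With hypothetical seqnos, for a HWSP currently reading 41:

	i915_seqno_passed(41, 42 - 1);	/* true:  rq with seqno 42 has started */
	i915_seqno_passed(41, 43 - 1);	/* false: seqno 43 is still queued behind 42 */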
 
 /**
- * __i915_gem_active_get_rcu - return a reference to the active request
- * @active - the active tracker
- *
- * __i915_gem_active_get() returns a reference to the active request, or NULL
- * if the active tracker is idle. The caller must hold the RCU read lock, but
- * the returned pointer is safe to use outside of RCU.
- */
-static inline struct i915_request *
-__i915_gem_active_get_rcu(const struct i915_gem_active *active)
-{
-	/*
-	 * Performing a lockless retrieval of the active request is super
-	 * tricky. SLAB_TYPESAFE_BY_RCU merely guarantees that the backing
-	 * slab of request objects will not be freed whilst we hold the
-	 * RCU read lock. It does not guarantee that the request itself
-	 * will not be freed and then *reused*. Viz,
-	 *
-	 * Thread A			Thread B
-	 *
-	 * rq = active.request
-	 *				retire(rq) -> free(rq);
-	 *				(rq is now first on the slab freelist)
-	 *				active.request = NULL
-	 *
-	 *				rq = new submission on a new object
-	 * ref(rq)
-	 *
-	 * To prevent the request from being reused whilst the caller
-	 * uses it, we take a reference like normal. Whilst acquiring
-	 * the reference we check that it is not in a destroyed state
-	 * (refcnt == 0). That prevents the request being reallocated
-	 * whilst the caller holds on to it. To check that the request
-	 * was not reallocated as we acquired the reference we have to
-	 * check that our request remains the active request across
-	 * the lookup, in the same manner as a seqlock. The visibility
-	 * of the pointer versus the reference counting is controlled
-	 * by using RCU barriers (rcu_dereference and rcu_assign_pointer).
-	 *
-	 * In the middle of all that, we inspect whether the request is
-	 * complete. Retiring is lazy so the request may be completed long
-	 * before the active tracker is updated. Querying whether the
-	 * request is complete is far cheaper (as it involves no locked
-	 * instructions setting cachelines to exclusive) than acquiring
-	 * the reference, so we do it first. The RCU read lock ensures the
-	 * pointer dereference is valid, but does not ensure that the
-	 * seqno nor HWS is the right one! However, if the request was
-	 * reallocated, that means the active tracker's request was complete.
-	 * If the new request is also complete, then both are and we can
-	 * just report the active tracker is idle. If the new request is
-	 * incomplete, then we acquire a reference on it and check that
-	 * it remained the active request.
-	 *
-	 * It is then imperative that we do not zero the request on
-	 * reallocation, so that we can chase the dangling pointers!
-	 * See i915_request_alloc().
-	 */
-	do {
-		struct i915_request *request;
-
-		request = rcu_dereference(active->request);
-		if (!request || i915_request_completed(request))
-			return NULL;
-
-		/*
-		 * An especially silly compiler could decide to recompute the
-		 * result of i915_request_completed, more specifically
-		 * re-emit the load for request->fence.seqno. A race would catch
-		 * a later seqno value, which could flip the result from true to
-		 * false. Which means part of the instructions below might not
-		 * be executed, while later on instructions are executed. Due to
-		 * barriers within the refcounting the inconsistency can't reach
-		 * past the call to i915_request_get_rcu, but not executing
-		 * that while still executing i915_request_put() creates
-		 * havoc enough.  Prevent this with a compiler barrier.
-		 */
-		barrier();
-
-		request = i915_request_get_rcu(request);
-
-		/*
-		 * What stops the following rcu_access_pointer() from occurring
-		 * before the above i915_request_get_rcu()? If we were
-		 * to read the value before pausing to get the reference to
-		 * the request, we may not notice a change in the active
-		 * tracker.
-		 *
-		 * The rcu_access_pointer() is a mere compiler barrier, which
-		 * means both the CPU and compiler are free to perform the
-		 * memory read without constraint. The compiler only has to
-		 * ensure that any operations after the rcu_access_pointer()
-		 * occur afterwards in program order. This means the read may
-		 * be performed earlier by an out-of-order CPU, or adventurous
-		 * compiler.
-		 *
-		 * The atomic operation at the heart of
-		 * i915_request_get_rcu(), see dma_fence_get_rcu(), is
-		 * atomic_inc_not_zero() which is only a full memory barrier
-		 * when successful. That is, if i915_request_get_rcu()
-		 * returns the request (and so with the reference counted
-		 * incremented) then the following read for rcu_access_pointer()
-		 * must occur after the atomic operation and so confirm
-		 * that this request is the one currently being tracked.
-		 *
-		 * The corresponding write barrier is part of
-		 * rcu_assign_pointer().
-		 */
-		if (!request || request == rcu_access_pointer(active->request))
-			return rcu_pointer_handoff(request);
-
-		i915_request_put(request);
-	} while (1);
-}
-
-/**
- * i915_gem_active_get_unlocked - return a reference to the active request
- * @active - the active tracker
- *
- * i915_gem_active_get_unlocked() returns a reference to the active request,
- * or NULL if the active tracker is idle. The reference is obtained under RCU,
- * so no locking is required by the caller.
+ * i915_request_started - check if the request has begun being executed
+ * @rq: the request
  *
- * The reference should be freed with i915_request_put().
+ * Returns true if the request has been submitted to hardware, and the hardware
+ * has advanced past the end of the previous request and so should be either
+ * currently processing the request (though it may be preempted and so
+ * not necessarily the next request to complete) or have completed the request.
  */
-static inline struct i915_request *
-i915_gem_active_get_unlocked(const struct i915_gem_active *active)
+static inline bool i915_request_started(const struct i915_request *rq)
 {
-	struct i915_request *request;
+	if (i915_request_signaled(rq))
+		return true;
 
-	rcu_read_lock();
-	request = __i915_gem_active_get_rcu(active);
-	rcu_read_unlock();
-
-	return request;
+	/* Remember: started but may have since been preempted! */
+	return __i915_request_has_started(rq);
 }
 
 /**
- * i915_gem_active_isset - report whether the active tracker is assigned
- * @active - the active tracker
+ * i915_request_is_running - check if the request may actually be executing
+ * @rq: the request
  *
- * i915_gem_active_isset() returns true if the active tracker is currently
- * assigned to a request. Due to the lazy retiring, that request may be idle
- * and this may report stale information.
+ * Returns true if the request is currently submitted to hardware and has passed
+ * its start point (i.e. the context is set up and not busywaiting). Note that
+ * it may no longer be running by the time the function returns!
  */
-static inline bool
-i915_gem_active_isset(const struct i915_gem_active *active)
+static inline bool i915_request_is_running(const struct i915_request *rq)
 {
-	return rcu_access_pointer(active->request);
+	if (!i915_request_is_active(rq))
+		return false;
+
+	return __i915_request_has_started(rq);
 }
 
-/**
- * i915_gem_active_wait - waits until the request is completed
- * @active - the active request on which to wait
- * @flags - how to wait
- * @timeout - how long to wait at most
- * @rps - userspace client to charge for a waitboost
- *
- * i915_gem_active_wait() waits until the request is completed before
- * returning, without requiring any locks to be held. Note that it does not
- * retire any requests before returning.
- *
- * This function relies on RCU in order to acquire the reference to the active
- * request without holding any locks. See __i915_gem_active_get_rcu() for the
- * glory details on how that is managed. Once the reference is acquired, we
- * can then wait upon the request, and afterwards release our reference,
- * free of any locking.
- *
- * This function wraps i915_request_wait(), see it for the full details on
- * the arguments.
- *
- * Returns 0 if successful, or a negative error code.
- */
-static inline int
-i915_gem_active_wait(const struct i915_gem_active *active, unsigned int flags)
+static inline bool i915_request_completed(const struct i915_request *rq)
 {
-	struct i915_request *request;
-	long ret = 0;
-
-	request = i915_gem_active_get_unlocked(active);
-	if (request) {
-		ret = i915_request_wait(request, flags, MAX_SCHEDULE_TIMEOUT);
-		i915_request_put(request);
-	}
+	if (i915_request_signaled(rq))
+		return true;
 
-	return ret < 0 ? ret : 0;
+	return i915_seqno_passed(hwsp_seqno(rq), rq->fence.seqno);
 }
 
-/**
- * i915_gem_active_retire - waits until the request is retired
- * @active - the active request on which to wait
- *
- * i915_gem_active_retire() waits until the request is completed,
- * and then ensures that at least the retirement handler for this
- * @active tracker is called before returning. If the @active
- * tracker is idle, the function returns immediately.
- */
-static inline int __must_check
-i915_gem_active_retire(struct i915_gem_active *active,
-		       struct mutex *mutex)
+static inline void i915_request_mark_complete(struct i915_request *rq)
 {
-	struct i915_request *request;
-	long ret;
-
-	request = i915_gem_active_raw(active, mutex);
-	if (!request)
-		return 0;
-
-	ret = i915_request_wait(request,
-				I915_WAIT_INTERRUPTIBLE | I915_WAIT_LOCKED,
-				MAX_SCHEDULE_TIMEOUT);
-	if (ret < 0)
-		return ret;
-
-	list_del_init(&active->link);
-	RCU_INIT_POINTER(active->request, NULL);
-
-	active->retire(active, request);
-
-	return 0;
+	rq->hwsp_seqno = (u32 *)&rq->fence.seqno; /* decouple from HWSP */
 }
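Re-pointing hwsp_seqno at the request's own fence.seqno means every later completion check compares the seqno against itself, so the request reads as complete forever, independent of any recycled HWSP (on little-endian the u32 load picks up the low word of the u64 seqno):

	/* after i915_request_mark_complete(rq): */
	hwsp_seqno(rq);				/* == rq->fence.seqno (low 32 bits) */
	i915_seqno_passed(hwsp_seqno(rq),
			  rq->fence.seqno);	/* (s32)0 >= 0: always true */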
 
-#define for_each_active(mask, idx) \
-	for (; mask ? idx = ffs(mask) - 1, 1 : 0; mask &= ~BIT(idx))
+void i915_retire_requests(struct drm_i915_private *i915);
 
 #endif /* I915_REQUEST_H */
diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/i915_reset.c
new file mode 100644
index 000000000000..0e0ddf2e6815
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_reset.c
@@ -0,0 +1,1349 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2008-2018 Intel Corporation
+ */
+
+#include <linux/sched/mm.h>
+#include <linux/stop_machine.h>
+
+#include "i915_drv.h"
+#include "i915_gpu_error.h"
+#include "i915_reset.h"
+
+#include "intel_guc.h"
+
+#define RESET_MAX_RETRIES 3
+
+/* XXX How to handle concurrent GGTT updates using tiling registers? */
+#define RESET_UNDER_STOP_MACHINE 0
+
+static void engine_skip_context(struct i915_request *rq)
+{
+	struct intel_engine_cs *engine = rq->engine;
+	struct i915_gem_context *hung_ctx = rq->gem_context;
+	struct i915_timeline *timeline = rq->timeline;
+
+	lockdep_assert_held(&engine->timeline.lock);
+	GEM_BUG_ON(timeline == &engine->timeline);
+
+	spin_lock(&timeline->lock);
+
+	if (i915_request_is_active(rq)) {
+		list_for_each_entry_continue(rq,
+					     &engine->timeline.requests, link)
+			if (rq->gem_context == hung_ctx)
+				i915_request_skip(rq, -EIO);
+	}
+
+	list_for_each_entry(rq, &timeline->requests, link)
+		i915_request_skip(rq, -EIO);
+
+	spin_unlock(&timeline->lock);
+}
+
+static void client_mark_guilty(struct drm_i915_file_private *file_priv,
+			       const struct i915_gem_context *ctx)
+{
+	unsigned int score;
+	unsigned long prev_hang;
+
+	if (i915_gem_context_is_banned(ctx))
+		score = I915_CLIENT_SCORE_CONTEXT_BAN;
+	else
+		score = 0;
+
+	prev_hang = xchg(&file_priv->hang_timestamp, jiffies);
+	if (time_before(jiffies, prev_hang + I915_CLIENT_FAST_HANG_JIFFIES))
+		score += I915_CLIENT_SCORE_HANG_FAST;
+
+	if (score) {
+		atomic_add(score, &file_priv->ban_score);
+
+		DRM_DEBUG_DRIVER("client %s: gained %u ban score, now %u\n",
+				 ctx->name, score,
+				 atomic_read(&file_priv->ban_score));
+	}
+}
+
+static bool context_mark_guilty(struct i915_gem_context *ctx)
+{
+	unsigned int score;
+	bool banned, bannable;
+
+	atomic_inc(&ctx->guilty_count);
+
+	bannable = i915_gem_context_is_bannable(ctx);
+	score = atomic_add_return(CONTEXT_SCORE_GUILTY, &ctx->ban_score);
+	banned = score >= CONTEXT_SCORE_BAN_THRESHOLD;
+
+	/* Cool contexts don't accumulate client ban score */
+	if (!bannable)
+		return false;
+
+	if (banned) {
+		DRM_DEBUG_DRIVER("context %s: guilty %d, score %u, banned\n",
+				 ctx->name, atomic_read(&ctx->guilty_count),
+				 score);
+		i915_gem_context_set_banned(ctx);
+	}
+
+	if (!IS_ERR_OR_NULL(ctx->file_priv))
+		client_mark_guilty(ctx->file_priv, ctx);
+
+	return banned;
+}
+
+static void context_mark_innocent(struct i915_gem_context *ctx)
+{
+	atomic_inc(&ctx->active_count);
+}
+
+void i915_reset_request(struct i915_request *rq, bool guilty)
+{
+	lockdep_assert_held(&rq->engine->timeline.lock);
+	GEM_BUG_ON(i915_request_completed(rq));
+
+	if (guilty) {
+		i915_request_skip(rq, -EIO);
+		if (context_mark_guilty(rq->gem_context))
+			engine_skip_context(rq);
+	} else {
+		dma_fence_set_error(&rq->fence, -EAGAIN);
+		context_mark_innocent(rq->gem_context);
+	}
+}
+
+static void gen3_stop_engine(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	const u32 base = engine->mmio_base;
+
+	if (intel_engine_stop_cs(engine))
+		DRM_DEBUG_DRIVER("%s: timed out on STOP_RING\n", engine->name);
+
+	I915_WRITE_FW(RING_HEAD(base), I915_READ_FW(RING_TAIL(base)));
+	POSTING_READ_FW(RING_HEAD(base)); /* paranoia */
+
+	I915_WRITE_FW(RING_HEAD(base), 0);
+	I915_WRITE_FW(RING_TAIL(base), 0);
+	POSTING_READ_FW(RING_TAIL(base));
+
+	/* The ring must be empty before it is disabled */
+	I915_WRITE_FW(RING_CTL(base), 0);
+
+	/* Check acts as a post */
+	if (I915_READ_FW(RING_HEAD(base)) != 0)
+		DRM_DEBUG_DRIVER("%s: ring head not parked\n",
+				 engine->name);
+}
+
+static void i915_stop_engines(struct drm_i915_private *i915,
+			      unsigned int engine_mask)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	if (INTEL_GEN(i915) < 3)
+		return;
+
+	for_each_engine_masked(engine, i915, engine_mask, id)
+		gen3_stop_engine(engine);
+}
+
+static bool i915_in_reset(struct pci_dev *pdev)
+{
+	u8 gdrst;
+
+	pci_read_config_byte(pdev, I915_GDRST, &gdrst);
+	return gdrst & GRDOM_RESET_STATUS;
+}
+
+static int i915_do_reset(struct drm_i915_private *i915,
+			 unsigned int engine_mask,
+			 unsigned int retry)
+{
+	struct pci_dev *pdev = i915->drm.pdev;
+	int err;
+
+	/* Assert reset for at least 20 usec, and wait for acknowledgement. */
+	pci_write_config_byte(pdev, I915_GDRST, GRDOM_RESET_ENABLE);
+	udelay(50);
+	err = wait_for_atomic(i915_in_reset(pdev), 50);
+
+	/* Clear the reset request. */
+	pci_write_config_byte(pdev, I915_GDRST, 0);
+	udelay(50);
+	if (!err)
+		err = wait_for_atomic(!i915_in_reset(pdev), 50);
+
+	return err;
+}
+
+static bool g4x_reset_complete(struct pci_dev *pdev)
+{
+	u8 gdrst;
+
+	pci_read_config_byte(pdev, I915_GDRST, &gdrst);
+	return (gdrst & GRDOM_RESET_ENABLE) == 0;
+}
+
+static int g33_do_reset(struct drm_i915_private *i915,
+			unsigned int engine_mask,
+			unsigned int retry)
+{
+	struct pci_dev *pdev = i915->drm.pdev;
+
+	pci_write_config_byte(pdev, I915_GDRST, GRDOM_RESET_ENABLE);
+	return wait_for_atomic(g4x_reset_complete(pdev), 50);
+}
+
+static int g4x_do_reset(struct drm_i915_private *dev_priv,
+			unsigned int engine_mask,
+			unsigned int retry)
+{
+	struct pci_dev *pdev = dev_priv->drm.pdev;
+	int ret;
+
+	/* WaVcpClkGateDisableForMediaReset:ctg,elk */
+	I915_WRITE_FW(VDECCLK_GATE_D,
+		      I915_READ(VDECCLK_GATE_D) | VCP_UNIT_CLOCK_GATE_DISABLE);
+	POSTING_READ_FW(VDECCLK_GATE_D);
+
+	pci_write_config_byte(pdev, I915_GDRST,
+			      GRDOM_MEDIA | GRDOM_RESET_ENABLE);
+	ret = wait_for_atomic(g4x_reset_complete(pdev), 50);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Wait for media reset failed\n");
+		goto out;
+	}
+
+	pci_write_config_byte(pdev, I915_GDRST,
+			      GRDOM_RENDER | GRDOM_RESET_ENABLE);
+	ret = wait_for_atomic(g4x_reset_complete(pdev), 50);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Wait for render reset failed\n");
+		goto out;
+	}
+
+out:
+	pci_write_config_byte(pdev, I915_GDRST, 0);
+
+	I915_WRITE_FW(VDECCLK_GATE_D,
+		      I915_READ(VDECCLK_GATE_D) & ~VCP_UNIT_CLOCK_GATE_DISABLE);
+	POSTING_READ_FW(VDECCLK_GATE_D);
+
+	return ret;
+}
+
+static int ironlake_do_reset(struct drm_i915_private *dev_priv,
+			     unsigned int engine_mask,
+			     unsigned int retry)
+{
+	int ret;
+
+	I915_WRITE_FW(ILK_GDSR, ILK_GRDOM_RENDER | ILK_GRDOM_RESET_ENABLE);
+	ret = __intel_wait_for_register_fw(dev_priv, ILK_GDSR,
+					   ILK_GRDOM_RESET_ENABLE, 0,
+					   5000, 0,
+					   NULL);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Wait for render reset failed\n");
+		goto out;
+	}
+
+	I915_WRITE_FW(ILK_GDSR, ILK_GRDOM_MEDIA | ILK_GRDOM_RESET_ENABLE);
+	ret = __intel_wait_for_register_fw(dev_priv, ILK_GDSR,
+					   ILK_GRDOM_RESET_ENABLE, 0,
+					   5000, 0,
+					   NULL);
+	if (ret) {
+		DRM_DEBUG_DRIVER("Wait for media reset failed\n");
+		goto out;
+	}
+
+out:
+	I915_WRITE_FW(ILK_GDSR, 0);
+	POSTING_READ_FW(ILK_GDSR);
+	return ret;
+}
+
+/* Reset the hardware domains (GENX_GRDOM_*) specified by mask */
+static int gen6_hw_domain_reset(struct drm_i915_private *dev_priv,
+				u32 hw_domain_mask)
+{
+	int err;
+
+	/*
+	 * GEN6_GDRST is not in the gt power well, no need to check
+	 * for fifo space for the write or forcewake the chip for
+	 * the read
+	 */
+	I915_WRITE_FW(GEN6_GDRST, hw_domain_mask);
+
+	/* Wait for the device to ack the reset requests */
+	err = __intel_wait_for_register_fw(dev_priv,
+					   GEN6_GDRST, hw_domain_mask, 0,
+					   500, 0,
+					   NULL);
+	if (err)
+		DRM_DEBUG_DRIVER("Wait for 0x%08x engines reset failed\n",
+				 hw_domain_mask);
+
+	return err;
+}
+
+static int gen6_reset_engines(struct drm_i915_private *i915,
+			      unsigned int engine_mask,
+			      unsigned int retry)
+{
+	struct intel_engine_cs *engine;
+	const u32 hw_engine_mask[I915_NUM_ENGINES] = {
+		[RCS] = GEN6_GRDOM_RENDER,
+		[BCS] = GEN6_GRDOM_BLT,
+		[VCS] = GEN6_GRDOM_MEDIA,
+		[VCS2] = GEN8_GRDOM_MEDIA2,
+		[VECS] = GEN6_GRDOM_VECS,
+	};
+	u32 hw_mask;
+
+	if (engine_mask == ALL_ENGINES) {
+		hw_mask = GEN6_GRDOM_FULL;
+	} else {
+		unsigned int tmp;
+
+		hw_mask = 0;
+		for_each_engine_masked(engine, i915, engine_mask, tmp)
+			hw_mask |= hw_engine_mask[engine->id];
+	}
+
+	return gen6_hw_domain_reset(i915, hw_mask);
+}
+
+static u32 gen11_lock_sfc(struct drm_i915_private *dev_priv,
+			  struct intel_engine_cs *engine)
+{
+	u8 vdbox_sfc_access = RUNTIME_INFO(dev_priv)->vdbox_sfc_access;
+	i915_reg_t sfc_forced_lock, sfc_forced_lock_ack;
+	u32 sfc_forced_lock_bit, sfc_forced_lock_ack_bit;
+	i915_reg_t sfc_usage;
+	u32 sfc_usage_bit;
+	u32 sfc_reset_bit;
+
+	switch (engine->class) {
+	case VIDEO_DECODE_CLASS:
+		if ((BIT(engine->instance) & vdbox_sfc_access) == 0)
+			return 0;
+
+		sfc_forced_lock = GEN11_VCS_SFC_FORCED_LOCK(engine);
+		sfc_forced_lock_bit = GEN11_VCS_SFC_FORCED_LOCK_BIT;
+
+		sfc_forced_lock_ack = GEN11_VCS_SFC_LOCK_STATUS(engine);
+		sfc_forced_lock_ack_bit  = GEN11_VCS_SFC_LOCK_ACK_BIT;
+
+		sfc_usage = GEN11_VCS_SFC_LOCK_STATUS(engine);
+		sfc_usage_bit = GEN11_VCS_SFC_USAGE_BIT;
+		sfc_reset_bit = GEN11_VCS_SFC_RESET_BIT(engine->instance);
+		break;
+
+	case VIDEO_ENHANCEMENT_CLASS:
+		sfc_forced_lock = GEN11_VECS_SFC_FORCED_LOCK(engine);
+		sfc_forced_lock_bit = GEN11_VECS_SFC_FORCED_LOCK_BIT;
+
+		sfc_forced_lock_ack = GEN11_VECS_SFC_LOCK_ACK(engine);
+		sfc_forced_lock_ack_bit  = GEN11_VECS_SFC_LOCK_ACK_BIT;
+
+		sfc_usage = GEN11_VECS_SFC_USAGE(engine);
+		sfc_usage_bit = GEN11_VECS_SFC_USAGE_BIT;
+		sfc_reset_bit = GEN11_VECS_SFC_RESET_BIT(engine->instance);
+		break;
+
+	default:
+		return 0;
+	}
+
+	/*
+	 * Tell the engine that a software reset is going to happen. The engine
+	 * will then try to force lock the SFC (if currently locked, it will
+	 * remain so until we tell the engine it is safe to unlock; if currently
+	 * unlocked, it will ignore this and all new lock requests). If the SFC
+	 * ends up being locked to the engine we want to reset, we have to reset
+	 * it as well (we will unlock it once the reset sequence is completed).
+	 */
+	I915_WRITE_FW(sfc_forced_lock,
+		      I915_READ_FW(sfc_forced_lock) | sfc_forced_lock_bit);
+
+	if (__intel_wait_for_register_fw(dev_priv,
+					 sfc_forced_lock_ack,
+					 sfc_forced_lock_ack_bit,
+					 sfc_forced_lock_ack_bit,
+					 1000, 0, NULL)) {
+		DRM_DEBUG_DRIVER("Wait for SFC forced lock ack failed\n");
+		return 0;
+	}
+
+	if (I915_READ_FW(sfc_usage) & sfc_usage_bit)
+		return sfc_reset_bit;
+
+	return 0;
+}
+
+static void gen11_unlock_sfc(struct drm_i915_private *dev_priv,
+			     struct intel_engine_cs *engine)
+{
+	u8 vdbox_sfc_access = RUNTIME_INFO(dev_priv)->vdbox_sfc_access;
+	i915_reg_t sfc_forced_lock;
+	u32 sfc_forced_lock_bit;
+
+	switch (engine->class) {
+	case VIDEO_DECODE_CLASS:
+		if ((BIT(engine->instance) & vdbox_sfc_access) == 0)
+			return;
+
+		sfc_forced_lock = GEN11_VCS_SFC_FORCED_LOCK(engine);
+		sfc_forced_lock_bit = GEN11_VCS_SFC_FORCED_LOCK_BIT;
+		break;
+
+	case VIDEO_ENHANCEMENT_CLASS:
+		sfc_forced_lock = GEN11_VECS_SFC_FORCED_LOCK(engine);
+		sfc_forced_lock_bit = GEN11_VECS_SFC_FORCED_LOCK_BIT;
+		break;
+
+	default:
+		return;
+	}
+
+	I915_WRITE_FW(sfc_forced_lock,
+		      I915_READ_FW(sfc_forced_lock) & ~sfc_forced_lock_bit);
+}
+
+static int gen11_reset_engines(struct drm_i915_private *i915,
+			       unsigned int engine_mask,
+			       unsigned int retry)
+{
+	const u32 hw_engine_mask[I915_NUM_ENGINES] = {
+		[RCS] = GEN11_GRDOM_RENDER,
+		[BCS] = GEN11_GRDOM_BLT,
+		[VCS] = GEN11_GRDOM_MEDIA,
+		[VCS2] = GEN11_GRDOM_MEDIA2,
+		[VCS3] = GEN11_GRDOM_MEDIA3,
+		[VCS4] = GEN11_GRDOM_MEDIA4,
+		[VECS] = GEN11_GRDOM_VECS,
+		[VECS2] = GEN11_GRDOM_VECS2,
+	};
+	struct intel_engine_cs *engine;
+	unsigned int tmp;
+	u32 hw_mask;
+	int ret;
+
+	BUILD_BUG_ON(VECS2 + 1 != I915_NUM_ENGINES);
+
+	if (engine_mask == ALL_ENGINES) {
+		hw_mask = GEN11_GRDOM_FULL;
+	} else {
+		hw_mask = 0;
+		for_each_engine_masked(engine, i915, engine_mask, tmp) {
+			hw_mask |= hw_engine_mask[engine->id];
+			hw_mask |= gen11_lock_sfc(i915, engine);
+		}
+	}
+
+	ret = gen6_hw_domain_reset(i915, hw_mask);
+
+	if (engine_mask != ALL_ENGINES)
+		for_each_engine_masked(engine, i915, engine_mask, tmp)
+			gen11_unlock_sfc(i915, engine);
+
+	return ret;
+}
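Condensed, the selective-reset dance spread across the three functions above is (an ordering sketch of the code in this hunk, not new code):

	u32 mask = 0;

	for_each_engine_masked(engine, i915, engine_mask, tmp) {
		mask |= hw_engine_mask[engine->id];	/* engine's own GRDOM bit */
		mask |= gen11_lock_sfc(i915, engine);	/* + its SFC, iff in use */
	}
	gen6_hw_domain_reset(i915, mask);		/* one write resets all domains */
	for_each_engine_masked(engine, i915, engine_mask, tmp)
		gen11_unlock_sfc(i915, engine);		/* drop the forced locks */

gen11_lock_sfc() deliberately returns 0 when the shared SFC is idle or inaccessible, so the caller can accumulate the extra reset domains with a plain OR.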
+
+static int gen8_engine_reset_prepare(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	int ret;
+
+	I915_WRITE_FW(RING_RESET_CTL(engine->mmio_base),
+		      _MASKED_BIT_ENABLE(RESET_CTL_REQUEST_RESET));
+
+	ret = __intel_wait_for_register_fw(dev_priv,
+					   RING_RESET_CTL(engine->mmio_base),
+					   RESET_CTL_READY_TO_RESET,
+					   RESET_CTL_READY_TO_RESET,
+					   700, 0,
+					   NULL);
+	if (ret)
+		DRM_ERROR("%s: reset request timeout\n", engine->name);
+
+	return ret;
+}
+
+static void gen8_engine_reset_cancel(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+
+	I915_WRITE_FW(RING_RESET_CTL(engine->mmio_base),
+		      _MASKED_BIT_DISABLE(RESET_CTL_REQUEST_RESET));
+}
+
+static int gen8_reset_engines(struct drm_i915_private *i915,
+			      unsigned int engine_mask,
+			      unsigned int retry)
+{
+	struct intel_engine_cs *engine;
+	const bool reset_non_ready = retry >= 1;
+	unsigned int tmp;
+	int ret;
+
+	for_each_engine_masked(engine, i915, engine_mask, tmp) {
+		ret = gen8_engine_reset_prepare(engine);
+		if (ret && !reset_non_ready)
+			goto skip_reset;
+
+		/*
+		 * If this is not the first failed attempt to prepare,
+		 * we decide to proceed anyway.
+		 *
+		 * By doing so we risk context corruption and with
+		 * some gens (kbl), possible system hang if reset
+		 * happens during active bb execution.
+		 *
+		 * We would rather accept context corruption than a
+		 * failed reset with a wedged driver/gpu. The active
+		 * bb execution case should be covered by the call to
+		 * i915_stop_engines() we make before the reset.
+		 */
+	}
+
+	if (INTEL_GEN(i915) >= 11)
+		ret = gen11_reset_engines(i915, engine_mask, retry);
+	else
+		ret = gen6_reset_engines(i915, engine_mask, retry);
+
+skip_reset:
+	for_each_engine_masked(engine, i915, engine_mask, tmp)
+		gen8_engine_reset_cancel(engine);
+
+	return ret;
+}
+
+typedef int (*reset_func)(struct drm_i915_private *,
+			  unsigned int engine_mask,
+			  unsigned int retry);
+
+static reset_func intel_get_gpu_reset(struct drm_i915_private *i915)
+{
+	if (!i915_modparams.reset)
+		return NULL;
+
+	if (INTEL_GEN(i915) >= 8)
+		return gen8_reset_engines;
+	else if (INTEL_GEN(i915) >= 6)
+		return gen6_reset_engines;
+	else if (INTEL_GEN(i915) >= 5)
+		return ironlake_do_reset;
+	else if (IS_G4X(i915))
+		return g4x_do_reset;
+	else if (IS_G33(i915) || IS_PINEVIEW(i915))
+		return g33_do_reset;
+	else if (INTEL_GEN(i915) >= 3)
+		return i915_do_reset;
+	else
+		return NULL;
+}
+
+int intel_gpu_reset(struct drm_i915_private *i915, unsigned int engine_mask)
+{
+	const int retries = engine_mask == ALL_ENGINES ? RESET_MAX_RETRIES : 1;
+	reset_func reset;
+	int ret = -ETIMEDOUT;
+	int retry;
+
+	reset = intel_get_gpu_reset(i915);
+	if (!reset)
+		return -ENODEV;
+
+	/*
+	 * If the power well sleeps during the reset, the reset
+	 * request may be dropped and never completes (causing -EIO).
+	 */
+	intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
+	for (retry = 0; ret == -ETIMEDOUT && retry < retries; retry++) {
+		/*
+		 * We stop engines, otherwise we might get a failed reset and a
+		 * dead gpu (on elk). Even a gpu as modern as kbl can suffer
+		 * a system hang if a batchbuffer is progressing when
+		 * the reset is issued, regardless of the READY_TO_RESET ack.
+		 * Thus assume it is best to stop engines on all gens
+		 * where we have a gpu reset.
+		 *
+		 * WaKBLVECSSemaphoreWaitPoll:kbl (on ALL_ENGINES)
+		 *
+		 * WaMediaResetMainRingCleanup:ctg,elk (presumably)
+		 *
+		 * FIXME: Wa for more modern gens needs to be validated
+		 */
+		i915_stop_engines(i915, engine_mask);
+
+		GEM_TRACE("engine_mask=%x\n", engine_mask);
+		preempt_disable();
+		ret = reset(i915, engine_mask, retry);
+		preempt_enable();
+	}
+	intel_uncore_forcewake_put(i915, FORCEWAKE_ALL);
+
+	return ret;
+}
+
+bool intel_has_gpu_reset(struct drm_i915_private *i915)
+{
+	if (USES_GUC(i915))
+		return false;
+
+	return intel_get_gpu_reset(i915);
+}
+
+bool intel_has_reset_engine(struct drm_i915_private *i915)
+{
+	return INTEL_INFO(i915)->has_reset_engine && i915_modparams.reset >= 2;
+}
+
+int intel_reset_guc(struct drm_i915_private *i915)
+{
+	u32 guc_domain =
+		INTEL_GEN(i915) >= 11 ? GEN11_GRDOM_GUC : GEN9_GRDOM_GUC;
+	int ret;
+
+	GEM_BUG_ON(!HAS_GUC(i915));
+
+	intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
+	ret = gen6_hw_domain_reset(i915, guc_domain);
+	intel_uncore_forcewake_put(i915, FORCEWAKE_ALL);
+
+	return ret;
+}
+
+/*
+ * Ensure the irq handler finishes, and is not run again.
+ */
+static void reset_prepare_engine(struct intel_engine_cs *engine)
+{
+	/*
+	 * During the reset sequence, we must prevent the engine from
+	 * entering RC6. As the context state is undefined until we restart
+	 * the engine, if it does enter RC6 during the reset, the state
+	 * written to the powercontext is undefined and so we may lose
+	 * GPU state upon resume, i.e. fail to restart after a reset.
+	 */
+	intel_uncore_forcewake_get(engine->i915, FORCEWAKE_ALL);
+	engine->reset.prepare(engine);
+}
+
+static void reset_prepare(struct drm_i915_private *i915)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	for_each_engine(engine, i915, id)
+		reset_prepare_engine(engine);
+
+	intel_uc_sanitize(i915);
+}
+
+static int gt_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int err;
+
+	/*
+	 * Everything depends on having the GTT running, so we need to start
+	 * there.
+	 */
+	err = i915_ggtt_enable_hw(i915);
+	if (err)
+		return err;
+
+	for_each_engine(engine, i915, id)
+		intel_engine_reset(engine, stalled_mask & ENGINE_MASK(id));
+
+	i915_gem_restore_fences(i915);
+
+	return err;
+}
+
+static void reset_finish_engine(struct intel_engine_cs *engine)
+{
+	engine->reset.finish(engine);
+	intel_uncore_forcewake_put(engine->i915, FORCEWAKE_ALL);
+}
+
+struct i915_gpu_restart {
+	struct work_struct work;
+	struct drm_i915_private *i915;
+};
+
+static void restart_work(struct work_struct *work)
+{
+	struct i915_gpu_restart *arg = container_of(work, typeof(*arg), work);
+	struct drm_i915_private *i915 = arg->i915;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
+
+	wakeref = intel_runtime_pm_get(i915);
+	mutex_lock(&i915->drm.struct_mutex);
+	WRITE_ONCE(i915->gpu_error.restart, NULL);
+
+	for_each_engine(engine, i915, id) {
+		struct i915_request *rq;
+
+		/*
+		 * Ostensibly, we always want a context loaded for powersaving,
+		 * so if the engine is idle after the reset, send a request
+		 * to load our scratch kernel_context.
+		 */
+		if (!intel_engine_is_idle(engine))
+			continue;
+
+		rq = i915_request_alloc(engine, i915->kernel_context);
+		if (!IS_ERR(rq))
+			i915_request_add(rq);
+	}
+
+	mutex_unlock(&i915->drm.struct_mutex);
+	intel_runtime_pm_put(i915, wakeref);
+
+	kfree(arg);
+}
+
+static void reset_finish(struct drm_i915_private *i915)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	for_each_engine(engine, i915, id)
+		reset_finish_engine(engine);
+}
+
+static void reset_restart(struct drm_i915_private *i915)
+{
+	struct i915_gpu_restart *arg;
+
+	/*
+	 * Following the reset, ensure that we always reload a context, both for
+	 * powersaving and to correct engine->last_retired_context. Since
+	 * this requires us to submit a request, queue a worker to do that
+	 * task for us to evade any locking here.
+	 */
+	if (READ_ONCE(i915->gpu_error.restart))
+		return;
+
+	arg = kmalloc(sizeof(*arg), GFP_KERNEL);
+	if (arg) {
+		arg->i915 = i915;
+		INIT_WORK(&arg->work, restart_work);
+
+		WRITE_ONCE(i915->gpu_error.restart, arg);
+		queue_work(i915->wq, &arg->work);
+	}
+}
+
+static void nop_submit_request(struct i915_request *request)
+{
+	struct intel_engine_cs *engine = request->engine;
+	unsigned long flags;
+
+	GEM_TRACE("%s fence %llx:%lld -> -EIO\n",
+		  engine->name, request->fence.context, request->fence.seqno);
+	dma_fence_set_error(&request->fence, -EIO);
+
+	spin_lock_irqsave(&engine->timeline.lock, flags);
+	__i915_request_submit(request);
+	i915_request_mark_complete(request);
+	intel_engine_write_global_seqno(engine, request->global_seqno);
+	spin_unlock_irqrestore(&engine->timeline.lock, flags);
+
+	intel_engine_queue_breadcrumbs(engine);
+}
+
+void i915_gem_set_wedged(struct drm_i915_private *i915)
+{
+	struct i915_gpu_error *error = &i915->gpu_error;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	mutex_lock(&error->wedge_mutex);
+	if (test_bit(I915_WEDGED, &error->flags)) {
+		mutex_unlock(&error->wedge_mutex);
+		return;
+	}
+
+	if (GEM_SHOW_DEBUG() && !intel_engines_are_idle(i915)) {
+		struct drm_printer p = drm_debug_printer(__func__);
+
+		for_each_engine(engine, i915, id)
+			intel_engine_dump(engine, &p, "%s\n", engine->name);
+	}
+
+	GEM_TRACE("start\n");
+
+	/*
+	 * First, stop submission to hw, but do not yet complete requests by
+	 * rolling the global seqno forward (since this would complete requests
+	 * for which we haven't set the fence error to EIO yet).
+	 */
+	for_each_engine(engine, i915, id)
+		reset_prepare_engine(engine);
+
+	/* Even if the GPU reset fails, it should still stop the engines */
+	if (INTEL_GEN(i915) >= 5)
+		intel_gpu_reset(i915, ALL_ENGINES);
+
+	for_each_engine(engine, i915, id) {
+		engine->submit_request = nop_submit_request;
+		engine->schedule = NULL;
+	}
+	i915->caps.scheduler = 0;
+
+	/*
+	 * Make sure no request can slip through without getting completed by
+	 * either this call here to intel_engine_write_global_seqno, or the one
+	 * in nop_submit_request.
+	 */
+	synchronize_rcu();
+
+	/* Mark all executing requests as skipped */
+	for_each_engine(engine, i915, id)
+		engine->cancel_requests(engine);
+
+	for_each_engine(engine, i915, id) {
+		reset_finish_engine(engine);
+		intel_engine_signal_breadcrumbs(engine);
+	}
+
+	smp_mb__before_atomic();
+	set_bit(I915_WEDGED, &error->flags);
+
+	GEM_TRACE("end\n");
+	mutex_unlock(&error->wedge_mutex);
+
+	wake_up_all(&error->reset_queue);
+}
+
+bool i915_gem_unset_wedged(struct drm_i915_private *i915)
+{
+	struct i915_gpu_error *error = &i915->gpu_error;
+	struct i915_timeline *tl;
+	bool ret = false;
+
+	if (!test_bit(I915_WEDGED, &error->flags))
+		return true;
+
+	if (!i915->gt.scratch) /* Never fully initialised, recovery impossible */
+		return false;
+
+	mutex_lock(&error->wedge_mutex);
+
+	GEM_TRACE("start\n");
+
+	/*
+	 * Before unwedging, make sure that all pending operations
+	 * are flushed and errored out - we may have requests waiting upon
+	 * third party fences. We marked all inflight requests as EIO, and
+	 * every execbuf since returned EIO, for consistency we want all
+	 * the currently pending requests to also be marked as EIO, which
+	 * is done inside our nop_submit_request - and so we must wait.
+	 *
+	 * No more can be submitted until we reset the wedged bit.
+	 */
+	mutex_lock(&i915->gt.timelines.mutex);
+	list_for_each_entry(tl, &i915->gt.timelines.active_list, link) {
+		struct i915_request *rq;
+		long timeout;
+
+		rq = i915_active_request_get_unlocked(&tl->last_request);
+		if (!rq)
+			continue;
+
+		/*
+		 * We can't use our normal waiter as we want to
+		 * avoid recursively trying to handle the current
+		 * reset. The basic dma_fence_default_wait() installs
+		 * a callback for dma_fence_signal(), which is
+		 * triggered by our nop handler (indirectly: the
+		 * callback enables the signaler thread, which is
+		 * woken by nop_submit_request() advancing the seqno;
+		 * once the seqno passes the fence, the signaler
+		 * signals the fence, waking us up).
+		 */
+		timeout = dma_fence_default_wait(&rq->fence, true,
+						 MAX_SCHEDULE_TIMEOUT);
+		i915_request_put(rq);
+		if (timeout < 0) {
+			mutex_unlock(&i915->gt.timelines.mutex);
+			goto unlock;
+		}
+	}
+	mutex_unlock(&i915->gt.timelines.mutex);
+
+	intel_engines_sanitize(i915, false);
+
+	/*
+	 * Undo nop_submit_request. We prevent all new i915 requests from
+	 * being queued (by disallowing execbuf whilst wedged) so having
+	 * waited for all active requests above, we know the system is idle
+	 * and do not have to worry about a thread being inside
+	 * engine->submit_request() as we swap over. So unlike installing
+	 * the nop_submit_request on reset, we can do this from normal
+	 * context and do not require stop_machine().
+	 */
+	intel_engines_reset_default_submission(i915);
+
+	GEM_TRACE("end\n");
+
+	smp_mb__before_atomic(); /* complete takeover before enabling execbuf */
+	clear_bit(I915_WEDGED, &i915->gpu_error.flags);
+	ret = true;
+unlock:
+	mutex_unlock(&i915->gpu_error.wedge_mutex);
+
+	return ret;
+}
+
+struct __i915_reset {
+	struct drm_i915_private *i915;
+	unsigned int stalled_mask;
+};
+
+static int __i915_reset__BKL(void *data)
+{
+	struct __i915_reset *arg = data;
+	int err;
+
+	err = intel_gpu_reset(arg->i915, ALL_ENGINES);
+	if (err)
+		return err;
+
+	return gt_reset(arg->i915, arg->stalled_mask);
+}
+
+#if RESET_UNDER_STOP_MACHINE
+/*
+ * XXX An alternative to using stop_machine would be to park only the
+ * processes that have a GGTT mmap. By remote parking the threads (SIGSTOP)
+ * we should be able to prevent their memory accesses via the lost fence
+ * registers over the course of the reset without the potential recursion
+ * on mutexes between the pagefault handler and reset.
+ *
+ * See igt/gem_mmap_gtt/hang
+ */
+#define __do_reset(fn, arg) stop_machine(fn, arg, NULL)
+#else
+#define __do_reset(fn, arg) fn(arg)
+#endif
+
+static int do_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
+{
+	struct __i915_reset arg = { i915, stalled_mask };
+	int err, i;
+
+	err = __do_reset(__i915_reset__BKL, &arg);
+	for (i = 0; err && i < RESET_MAX_RETRIES; i++) {
+		msleep(100);
+		err = __do_reset(__i915_reset__BKL, &arg);
+	}
+
+	return err;
+}
+
+/**
+ * i915_reset - reset chip after a hang
+ * @i915: #drm_i915_private to reset
+ * @stalled_mask: mask of the stalled engines with the guilty requests
+ * @reason: user error message for why we are resetting
+ *
+ * Reset the chip.  Useful if a hang is detected. Marks the device as wedged
+ * on failure.
+ *
+ * Caller must hold the struct_mutex.
+ *
+ * Procedure is fairly simple:
+ *   - reset the chip using the reset reg
+ *   - re-init context state
+ *   - re-init hardware status page
+ *   - re-init ring buffer
+ *   - re-init interrupt state
+ *   - re-init display
+ */
+void i915_reset(struct drm_i915_private *i915,
+		unsigned int stalled_mask,
+		const char *reason)
+{
+	struct i915_gpu_error *error = &i915->gpu_error;
+	int ret;
+
+	GEM_TRACE("flags=%lx\n", error->flags);
+
+	might_sleep();
+	assert_rpm_wakelock_held(i915);
+	GEM_BUG_ON(!test_bit(I915_RESET_BACKOFF, &error->flags));
+
+	/* Clear any previous failed attempts at recovery. Time to try again. */
+	if (!i915_gem_unset_wedged(i915))
+		return;
+
+	if (reason)
+		dev_notice(i915->drm.dev, "Resetting chip for %s\n", reason);
+	error->reset_count++;
+
+	reset_prepare(i915);
+
+	if (!intel_has_gpu_reset(i915)) {
+		if (i915_modparams.reset)
+			dev_err(i915->drm.dev, "GPU reset not supported\n");
+		else
+			DRM_DEBUG_DRIVER("GPU reset disabled\n");
+		goto error;
+	}
+
+	if (do_reset(i915, stalled_mask)) {
+		dev_err(i915->drm.dev, "Failed to reset chip\n");
+		goto taint;
+	}
+
+	intel_overlay_reset(i915);
+
+	/*
+	 * Next we need to restore the context, but we don't use those
+	 * yet either...
+	 *
+	 * Ring buffer needs to be re-initialized in the KMS case, or if X
+	 * was running at the time of the reset (i.e. we weren't VT
+	 * switched away).
+	 */
+	ret = i915_gem_init_hw(i915);
+	if (ret) {
+		DRM_ERROR("Failed to initialise HW following reset (%d)\n",
+			  ret);
+		goto error;
+	}
+
+	i915_queue_hangcheck(i915);
+
+finish:
+	reset_finish(i915);
+	if (!i915_terminally_wedged(error))
+		reset_restart(i915);
+	return;
+
+taint:
+	/*
+	 * History tells us that if we cannot reset the GPU now, we
+	 * never will. This then impacts everything that is run
+	 * subsequently. On failing the reset, we mark the driver
+	 * as wedged, preventing further execution on the GPU.
+	 * We also want to go one step further and add a taint to the
+	 * kernel so that any subsequent faults can be traced back to
+	 * this failure. This is important for CI, where if the
+	 * GPU/driver fails we would like to reboot and restart testing
+	 * rather than continue on into oblivion. For everyone else,
+	 * the system should still plod along, but they have been warned!
+	 */
+	add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
+error:
+	i915_gem_set_wedged(i915);
+	goto finish;
+}
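The goto ladder above folds three outcomes into a single exit path (a control-flow summary of the function, not new code):

	/*
	 * success:		do_reset -> init_hw -> hangcheck -> finish -> restart
	 * reset failed:	taint (TAINT_WARN) -> error -> set_wedged -> finish
	 * reset unsupported:	error -> set_wedged -> finish
	 */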
+
+static inline int intel_gt_reset_engine(struct drm_i915_private *i915,
+					struct intel_engine_cs *engine)
+{
+	return intel_gpu_reset(i915, intel_engine_flag(engine));
+}
+
+/**
+ * i915_reset_engine - reset GPU engine to recover from a hang
+ * @engine: engine to reset
+ * @msg: reason for GPU reset; or NULL for no dev_notice()
+ *
+ * Reset a specific GPU engine. Useful if a hang is detected.
+ * Returns zero on successful reset or otherwise an error code.
+ *
+ * Procedure is:
+ *  - identifies the request that caused the hang and it is dropped
+ *  - reset engine (which will force the engine to idle)
+ *  - re-init/configure engine
+ */
+int i915_reset_engine(struct intel_engine_cs *engine, const char *msg)
+{
+	struct i915_gpu_error *error = &engine->i915->gpu_error;
+	int ret;
+
+	GEM_TRACE("%s flags=%lx\n", engine->name, error->flags);
+	GEM_BUG_ON(!test_bit(I915_RESET_ENGINE + engine->id, &error->flags));
+
+	reset_prepare_engine(engine);
+
+	if (msg)
+		dev_notice(engine->i915->drm.dev,
+			   "Resetting %s for %s\n", engine->name, msg);
+	error->reset_engine_count[engine->id]++;
+
+	if (!engine->i915->guc.execbuf_client)
+		ret = intel_gt_reset_engine(engine->i915, engine);
+	else
+		ret = intel_guc_reset_engine(&engine->i915->guc, engine);
+	if (ret) {
+		/* If we fail here, we expect to fallback to a global reset */
+		DRM_DEBUG_DRIVER("%sFailed to reset %s, ret=%d\n",
+				 engine->i915->guc.execbuf_client ? "GuC " : "",
+				 engine->name, ret);
+		goto out;
+	}
+
+	/*
+	 * The request that caused the hang is stuck on the elsp; we know the
+	 * active request and can drop it, then adjust the head to skip the
+	 * offending request and resume executing the remaining requests in
+	 * the queue.
+	 */
+	intel_engine_reset(engine, true);
+
+	/*
+	 * The engine and its registers (and workarounds in case of render)
+	 * have been reset to their default values. Follow the init_ring
+	 * process to program RING_MODE, HWSP and re-enable submission.
+	 */
+	ret = engine->init_hw(engine);
+	if (ret)
+		goto out;
+
+out:
+	intel_engine_cancel_stop_cs(engine);
+	reset_finish_engine(engine);
+	return ret;
+}
+
+static void i915_reset_device(struct drm_i915_private *i915,
+			      u32 engine_mask,
+			      const char *reason)
+{
+	struct i915_gpu_error *error = &i915->gpu_error;
+	struct kobject *kobj = &i915->drm.primary->kdev->kobj;
+	char *error_event[] = { I915_ERROR_UEVENT "=1", NULL };
+	char *reset_event[] = { I915_RESET_UEVENT "=1", NULL };
+	char *reset_done_event[] = { I915_ERROR_UEVENT "=0", NULL };
+	struct i915_wedge_me w;
+
+	kobject_uevent_env(kobj, KOBJ_CHANGE, error_event);
+
+	DRM_DEBUG_DRIVER("resetting chip\n");
+	kobject_uevent_env(kobj, KOBJ_CHANGE, reset_event);
+
+	/* Use a watchdog to ensure that our reset completes */
+	i915_wedge_on_timeout(&w, i915, 5 * HZ) {
+		intel_prepare_reset(i915);
+
+		i915_reset(i915, engine_mask, reason);
+
+		intel_finish_reset(i915);
+	}
+
+	if (!test_bit(I915_WEDGED, &error->flags))
+		kobject_uevent_env(kobj, KOBJ_CHANGE, reset_done_event);
+}
+
+void i915_clear_error_registers(struct drm_i915_private *dev_priv)
+{
+	u32 eir;
+
+	if (!IS_GEN(dev_priv, 2))
+		I915_WRITE(PGTBL_ER, I915_READ(PGTBL_ER));
+
+	if (INTEL_GEN(dev_priv) < 4)
+		I915_WRITE(IPEIR, I915_READ(IPEIR));
+	else
+		I915_WRITE(IPEIR_I965, I915_READ(IPEIR_I965));
+
+	I915_WRITE(EIR, I915_READ(EIR));
+	eir = I915_READ(EIR);
+	if (eir) {
+		/*
+		 * some errors might have become stuck,
+		 * mask them.
+		 */
+		DRM_DEBUG_DRIVER("EIR stuck: 0x%08x, masking\n", eir);
+		I915_WRITE(EMR, I915_READ(EMR) | eir);
+		I915_WRITE(IIR, I915_MASTER_ERROR_INTERRUPT);
+	}
+
+	if (INTEL_GEN(dev_priv) >= 8) {
+		I915_WRITE(GEN8_RING_FAULT_REG,
+			   I915_READ(GEN8_RING_FAULT_REG) & ~RING_FAULT_VALID);
+		POSTING_READ(GEN8_RING_FAULT_REG);
+	} else if (INTEL_GEN(dev_priv) >= 6) {
+		struct intel_engine_cs *engine;
+		enum intel_engine_id id;
+
+		for_each_engine(engine, dev_priv, id) {
+			I915_WRITE(RING_FAULT_REG(engine),
+				   I915_READ(RING_FAULT_REG(engine)) &
+				   ~RING_FAULT_VALID);
+		}
+		POSTING_READ(RING_FAULT_REG(dev_priv->engine[RCS]));
+	}
+}
+
+/**
+ * i915_handle_error - handle a gpu error
+ * @i915: i915 device private
+ * @engine_mask: mask representing engines that are hung
+ * @flags: control flags
+ * @fmt: Error message format string
+ *
+ * Do some basic checking of register state at error time and
+ * dump it to the syslog.  Also call i915_capture_error_state() to make
+ * sure we get a record and make it available in debugfs.  Fire a uevent
+ * so userspace knows something bad happened (should trigger collection
+ * of a ring dump etc.).
+ */
+void i915_handle_error(struct drm_i915_private *i915,
+		       u32 engine_mask,
+		       unsigned long flags,
+		       const char *fmt, ...)
+{
+	struct intel_engine_cs *engine;
+	intel_wakeref_t wakeref;
+	unsigned int tmp;
+	char error_msg[80];
+	char *msg = NULL;
+
+	if (fmt) {
+		va_list args;
+
+		va_start(args, fmt);
+		vscnprintf(error_msg, sizeof(error_msg), fmt, args);
+		va_end(args);
+
+		msg = error_msg;
+	}
+
+	/*
+	 * In most cases it's guaranteed that we get here with an RPM
+	 * reference held, for example because there is a pending GPU
+	 * request that won't finish until the reset is done. This
+	 * isn't the case at least when we get here by doing a
+	 * simulated reset via debugfs, so get an RPM reference.
+	 */
+	wakeref = intel_runtime_pm_get(i915);
+
+	engine_mask &= INTEL_INFO(i915)->ring_mask;
+
+	if (flags & I915_ERROR_CAPTURE) {
+		i915_capture_error_state(i915, engine_mask, msg);
+		i915_clear_error_registers(i915);
+	}
+
+	/*
+	 * Try engine reset when available. We fall back to full reset if
+	 * single reset fails.
+	 */
+	if (intel_has_reset_engine(i915) &&
+	    !i915_terminally_wedged(&i915->gpu_error)) {
+		for_each_engine_masked(engine, i915, engine_mask, tmp) {
+			BUILD_BUG_ON(I915_RESET_MODESET >= I915_RESET_ENGINE);
+			if (test_and_set_bit(I915_RESET_ENGINE + engine->id,
+					     &i915->gpu_error.flags))
+				continue;
+
+			if (i915_reset_engine(engine, msg) == 0)
+				engine_mask &= ~intel_engine_flag(engine);
+
+			clear_bit(I915_RESET_ENGINE + engine->id,
+				  &i915->gpu_error.flags);
+			wake_up_bit(&i915->gpu_error.flags,
+				    I915_RESET_ENGINE + engine->id);
+		}
+	}
+
+	if (!engine_mask)
+		goto out;
+
+	/* Full reset needs the mutex, stop any other user trying to do so. */
+	if (test_and_set_bit(I915_RESET_BACKOFF, &i915->gpu_error.flags)) {
+		wait_event(i915->gpu_error.reset_queue,
+			   !test_bit(I915_RESET_BACKOFF,
+				     &i915->gpu_error.flags));
+		goto out;
+	}
+
+	/* Prevent any other reset-engine attempt. */
+	for_each_engine(engine, i915, tmp) {
+		while (test_and_set_bit(I915_RESET_ENGINE + engine->id,
+					&i915->gpu_error.flags))
+			wait_on_bit(&i915->gpu_error.flags,
+				    I915_RESET_ENGINE + engine->id,
+				    TASK_UNINTERRUPTIBLE);
+	}
+
+	i915_reset_device(i915, engine_mask, msg);
+
+	for_each_engine(engine, i915, tmp) {
+		clear_bit(I915_RESET_ENGINE + engine->id,
+			  &i915->gpu_error.flags);
+	}
+
+	clear_bit(I915_RESET_BACKOFF, &i915->gpu_error.flags);
+	wake_up_all(&i915->gpu_error.reset_queue);
+
+out:
+	intel_runtime_pm_put(i915, wakeref);
+}
+
+bool i915_reset_flush(struct drm_i915_private *i915)
+{
+	int err;
+
+	cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
+
+	flush_workqueue(i915->wq);
+	GEM_BUG_ON(READ_ONCE(i915->gpu_error.restart));
+
+	mutex_lock(&i915->drm.struct_mutex);
+	err = i915_gem_wait_for_idle(i915,
+				     I915_WAIT_LOCKED |
+				     I915_WAIT_FOR_IDLE_BOOST,
+				     MAX_SCHEDULE_TIMEOUT);
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	return !err;
+}
+
+static void i915_wedge_me(struct work_struct *work)
+{
+	struct i915_wedge_me *w = container_of(work, typeof(*w), work.work);
+
+	dev_err(w->i915->drm.dev,
+		"%s timed out, cancelling all in-flight rendering.\n",
+		w->name);
+	i915_gem_set_wedged(w->i915);
+}
+
+void __i915_init_wedge(struct i915_wedge_me *w,
+		       struct drm_i915_private *i915,
+		       long timeout,
+		       const char *name)
+{
+	w->i915 = i915;
+	w->name = name;
+
+	INIT_DELAYED_WORK_ONSTACK(&w->work, i915_wedge_me);
+	schedule_delayed_work(&w->work, timeout);
+}
+
+void __i915_fini_wedge(struct i915_wedge_me *w)
+{
+	cancel_delayed_work_sync(&w->work);
+	destroy_delayed_work_on_stack(&w->work);
+	w->i915 = NULL;
+}
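
The per-engine/full-device split in i915_handle_error() above rests entirely on the I915_RESET_ENGINE and I915_RESET_BACKOFF bits in gpu_error.flags: each engine reset is claimed with a test-and-set, and a single backoff bit serializes the full-reset fallback. A hedged userspace sketch of that coordination pattern, with C11 atomics standing in for test_and_set_bit()/clear_bit() (every name and both reset hooks below are invented for illustration):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define NUM_ENGINES 4
#define BACKOFF_BIT NUM_ENGINES	/* one bit above the per-engine bits */

static atomic_ulong reset_flags;	/* bit i: engine i resetting; BACKOFF_BIT: full reset */

/* Hypothetical reset hooks. */
static int try_engine_reset(unsigned int e) { printf("reset engine %u\n", e); return 0; }
static void full_device_reset(void) { puts("full device reset"); }

/* Claim bit 'b'; true only if we were the one to set it. */
static bool claim_bit(unsigned int b)
{
	return !(atomic_fetch_or(&reset_flags, 1UL << b) & (1UL << b));
}

static void release_bit(unsigned int b)
{
	atomic_fetch_and(&reset_flags, ~(1UL << b));
}

static void handle_error(unsigned int engine_mask)
{
	/* First try to reset each hung engine on its own. */
	for (unsigned int e = 0; e < NUM_ENGINES; e++) {
		if (!(engine_mask & (1u << e)) || !claim_bit(e))
			continue;
		if (try_engine_reset(e) == 0)
			engine_mask &= ~(1u << e);	/* recovered */
		release_bit(e);
	}
	if (!engine_mask)
		return;

	/* Full reset needs exclusive ownership of the backoff bit. */
	if (!claim_bit(BACKOFF_BIT))
		return;	/* the kernel waits on reset_queue here instead */
	full_device_reset();
	release_bit(BACKOFF_BIT);
}

int main(void)
{
	handle_error(0x3);	/* engines 0 and 1 hung */
	return 0;
}

Unlike this sketch, the real path also re-claims every engine bit before the device reset so that no concurrent i915_reset_engine() can run underneath it.
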
diff --git a/drivers/gpu/drm/i915/i915_reset.h b/drivers/gpu/drm/i915/i915_reset.h
new file mode 100644
index 000000000000..f2d347f319df
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_reset.h
@@ -0,0 +1,59 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2008-2018 Intel Corporation
+ */
+
+#ifndef I915_RESET_H
+#define I915_RESET_H
+
+#include <linux/compiler.h>
+#include <linux/types.h>
+
+struct drm_i915_private;
+struct intel_engine_cs;
+struct intel_guc;
+
+__printf(4, 5)
+void i915_handle_error(struct drm_i915_private *i915,
+		       u32 engine_mask,
+		       unsigned long flags,
+		       const char *fmt, ...);
+#define I915_ERROR_CAPTURE BIT(0)
+
+void i915_clear_error_registers(struct drm_i915_private *i915);
+
+void i915_reset(struct drm_i915_private *i915,
+		unsigned int stalled_mask,
+		const char *reason);
+int i915_reset_engine(struct intel_engine_cs *engine,
+		      const char *reason);
+
+void i915_reset_request(struct i915_request *rq, bool guilty);
+bool i915_reset_flush(struct drm_i915_private *i915);
+
+bool intel_has_gpu_reset(struct drm_i915_private *i915);
+bool intel_has_reset_engine(struct drm_i915_private *i915);
+
+int intel_gpu_reset(struct drm_i915_private *i915, u32 engine_mask);
+
+int intel_reset_guc(struct drm_i915_private *i915);
+
+struct i915_wedge_me {
+	struct delayed_work work;
+	struct drm_i915_private *i915;
+	const char *name;
+};
+
+void __i915_init_wedge(struct i915_wedge_me *w,
+		       struct drm_i915_private *i915,
+		       long timeout,
+		       const char *name);
+void __i915_fini_wedge(struct i915_wedge_me *w);
+
+#define i915_wedge_on_timeout(W, DEV, TIMEOUT)				\
+	for (__i915_init_wedge((W), (DEV), (TIMEOUT), __func__);	\
+	     (W)->i915;							\
+	     __i915_fini_wedge((W)))
+
+#endif /* I915_RESET_H */
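
i915_wedge_on_timeout() is a for-loop scope macro: the init clause arms a watchdog, the condition runs the body exactly once (while (W)->i915 is set), and the "increment" clause disarms it. A minimal userspace sketch of the same shape, assuming alarm() as a stand-in for the delayed work (the guard struct and names are hypothetical):

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

struct timeout_guard { int armed; };

static void on_alarm(int sig)
{
	(void)sig;
	(void)write(STDERR_FILENO, "watchdog fired\n", 15);
}

static void guard_init(struct timeout_guard *g, unsigned int secs)
{
	signal(SIGALRM, on_alarm);
	alarm(secs);	/* arm the watchdog */
	g->armed = 1;
}

static void guard_fini(struct timeout_guard *g)
{
	alarm(0);	/* body completed in time: disarm */
	g->armed = 0;
}

/* Run the body once with a watchdog armed for its duration. */
#define with_timeout(G, SECS) \
	for (guard_init((G), (SECS)); (G)->armed; guard_fini(G))

int main(void)
{
	struct timeout_guard g;

	with_timeout(&g, 5) {
		puts("guarded work");	/* cf. the reset sequence above */
	}
	return 0;
}

In the kernel version the watchdog does not merely warn: i915_wedge_me() declares the GPU wedged, cancelling all in-flight rendering, which is why the reset sequence is wrapped in it.
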
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 340faea6c08a..d01683167c77 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -127,8 +127,7 @@ static inline struct i915_priolist *to_priolist(struct rb_node *rb)
 	return rb_entry(rb, struct i915_priolist, node);
 }
 
-static void assert_priolists(struct intel_engine_execlists * const execlists,
-			     long queue_priority)
+static void assert_priolists(struct intel_engine_execlists * const execlists)
 {
 	struct rb_node *rb;
 	long last_prio, i;
@@ -139,7 +138,7 @@ static void assert_priolists(struct intel_engine_execlists * const execlists,
 	GEM_BUG_ON(rb_first_cached(&execlists->queue) !=
 		   rb_first(&execlists->queue.rb_root));
 
-	last_prio = (queue_priority >> I915_USER_PRIORITY_SHIFT) + 1;
+	last_prio = (INT_MAX >> I915_USER_PRIORITY_SHIFT) + 1;
 	for (rb = rb_first_cached(&execlists->queue); rb; rb = rb_next(rb)) {
 		const struct i915_priolist *p = to_priolist(rb);
 
@@ -166,7 +165,7 @@ i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio)
 	int idx, i;
 
 	lockdep_assert_held(&engine->timeline.lock);
-	assert_priolists(execlists, INT_MAX);
+	assert_priolists(execlists);
 
 	/* buckets sorted from highest [in slot 0] to lowest priority */
 	idx = I915_PRIORITY_COUNT - (prio & I915_PRIORITY_MASK) - 1;
@@ -239,6 +238,18 @@ sched_lock_engine(struct i915_sched_node *node, struct intel_engine_cs *locked)
 	return engine;
 }
 
+static bool inflight(const struct i915_request *rq,
+		     const struct intel_engine_cs *engine)
+{
+	const struct i915_request *active;
+
+	if (!i915_request_is_active(rq))
+		return false;
+
+	active = port_request(engine->execlists.port);
+	return active->hw_context == rq->hw_context;
+}
+
 static void __i915_schedule(struct i915_request *rq,
 			    const struct i915_sched_attr *attr)
 {
@@ -328,6 +339,7 @@ static void __i915_schedule(struct i915_request *rq,
 		INIT_LIST_HEAD(&dep->dfs_link);
 
 		engine = sched_lock_engine(node, engine);
+		lockdep_assert_held(&engine->timeline.lock);
 
 		/* Recheck after acquiring the engine->timeline.lock */
 		if (prio <= node->attr.priority || node_signaled(node))
@@ -353,20 +365,19 @@ static void __i915_schedule(struct i915_request *rq,
 				continue;
 		}
 
-		if (prio <= engine->execlists.queue_priority)
+		if (prio <= engine->execlists.queue_priority_hint)
 			continue;
 
+		engine->execlists.queue_priority_hint = prio;
+
 		/*
 		 * If we are already the currently executing context, don't
 		 * bother evaluating if we should preempt ourselves.
 		 */
-		if (node_to_request(node)->global_seqno &&
-		    i915_seqno_passed(port_request(engine->execlists.port)->global_seqno,
-				      node_to_request(node)->global_seqno))
+		if (inflight(node_to_request(node), engine))
 			continue;
 
 		/* Defer (tasklet) submission until after all of our updates. */
-		engine->execlists.queue_priority = prio;
 		tasklet_hi_schedule(&engine->execlists.tasklet);
 	}
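
Renaming queue_priority to queue_priority_hint captures the new contract: the value is advisory and may go stale, since it is now updated before the inflight() check rather than only when a kick is issued, but it is used solely to skip redundant tasklet kicks, so staleness costs at most one spurious run of the submission tasklet. A tiny sketch of the idea (names hypothetical):

#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

static int queue_priority_hint = INT_MIN;

/* Should we kick the submission tasklet for a request of new_prio? */
static bool need_kick(int new_prio)
{
	if (new_prio <= queue_priority_hint)
		return false;	/* something at least as urgent is already queued */

	queue_priority_hint = new_prio;	/* advisory: may go stale, never blocks */
	return true;
}

int main(void)
{
	printf("%d %d %d\n", need_kick(0), need_kick(-5), need_kick(10));	/* 1 0 1 */
	return 0;
}
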
 
diff --git a/drivers/gpu/drm/i915/i915_selftest.h b/drivers/gpu/drm/i915/i915_selftest.h
index a73472dd12fd..207e21b478f2 100644
--- a/drivers/gpu/drm/i915/i915_selftest.h
+++ b/drivers/gpu/drm/i915/i915_selftest.h
@@ -31,6 +31,7 @@ struct i915_selftest {
 	unsigned long timeout_jiffies;
 	unsigned int timeout_ms;
 	unsigned int random_seed;
+	char *filter;
 	int mock;
 	int live;
 };
diff --git a/drivers/gpu/drm/i915/i915_suspend.c b/drivers/gpu/drm/i915/i915_suspend.c
index 8f3aa4dc0c98..d2f2a9c2fabd 100644
--- a/drivers/gpu/drm/i915/i915_suspend.c
+++ b/drivers/gpu/drm/i915/i915_suspend.c
@@ -24,7 +24,6 @@
  * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
  */
 
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
 #include "intel_drv.h"
 #include "i915_reg.h"
@@ -65,7 +64,7 @@ int i915_save_state(struct drm_i915_private *dev_priv)
 
 	i915_save_display(dev_priv);
 
-	if (IS_GEN4(dev_priv))
+	if (IS_GEN(dev_priv, 4))
 		pci_read_config_word(pdev, GCDGMBUS,
 				     &dev_priv->regfile.saveGCDGMBUS);
 
@@ -77,17 +76,17 @@ int i915_save_state(struct drm_i915_private *dev_priv)
 	dev_priv->regfile.saveMI_ARB_STATE = I915_READ(MI_ARB_STATE);
 
 	/* Scratch space */
-	if (IS_GEN2(dev_priv) && IS_MOBILE(dev_priv)) {
+	if (IS_GEN(dev_priv, 2) && IS_MOBILE(dev_priv)) {
 		for (i = 0; i < 7; i++) {
 			dev_priv->regfile.saveSWF0[i] = I915_READ(SWF0(i));
 			dev_priv->regfile.saveSWF1[i] = I915_READ(SWF1(i));
 		}
 		for (i = 0; i < 3; i++)
 			dev_priv->regfile.saveSWF3[i] = I915_READ(SWF3(i));
-	} else if (IS_GEN2(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 2)) {
 		for (i = 0; i < 7; i++)
 			dev_priv->regfile.saveSWF1[i] = I915_READ(SWF1(i));
-	} else if (HAS_GMCH_DISPLAY(dev_priv)) {
+	} else if (HAS_GMCH(dev_priv)) {
 		for (i = 0; i < 16; i++) {
 			dev_priv->regfile.saveSWF0[i] = I915_READ(SWF0(i));
 			dev_priv->regfile.saveSWF1[i] = I915_READ(SWF1(i));
@@ -108,7 +107,7 @@ int i915_restore_state(struct drm_i915_private *dev_priv)
 
 	mutex_lock(&dev_priv->drm.struct_mutex);
 
-	if (IS_GEN4(dev_priv))
+	if (IS_GEN(dev_priv, 4))
 		pci_write_config_word(pdev, GCDGMBUS,
 				      dev_priv->regfile.saveGCDGMBUS);
 	i915_restore_display(dev_priv);
@@ -122,17 +121,17 @@ int i915_restore_state(struct drm_i915_private *dev_priv)
 	I915_WRITE(MI_ARB_STATE, dev_priv->regfile.saveMI_ARB_STATE | 0xffff0000);
 
 	/* Scratch space */
-	if (IS_GEN2(dev_priv) && IS_MOBILE(dev_priv)) {
+	if (IS_GEN(dev_priv, 2) && IS_MOBILE(dev_priv)) {
 		for (i = 0; i < 7; i++) {
 			I915_WRITE(SWF0(i), dev_priv->regfile.saveSWF0[i]);
 			I915_WRITE(SWF1(i), dev_priv->regfile.saveSWF1[i]);
 		}
 		for (i = 0; i < 3; i++)
 			I915_WRITE(SWF3(i), dev_priv->regfile.saveSWF3[i]);
-	} else if (IS_GEN2(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 2)) {
 		for (i = 0; i < 7; i++)
 			I915_WRITE(SWF1(i), dev_priv->regfile.saveSWF1[i]);
-	} else if (HAS_GMCH_DISPLAY(dev_priv)) {
+	} else if (HAS_GMCH(dev_priv)) {
 		for (i = 0; i < 16; i++) {
 			I915_WRITE(SWF0(i), dev_priv->regfile.saveSWF0[i]);
 			I915_WRITE(SWF1(i), dev_priv->regfile.saveSWF1[i]);
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
index fc2eeab823b7..7c58b049ecb5 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -390,7 +390,7 @@ static void timer_i915_sw_fence_wake(struct timer_list *t)
 	if (!fence)
 		return;
 
-	pr_notice("Asynchronous wait on fence %s:%s:%x timed out (hint:%pS)\n",
+	pr_notice("Asynchronous wait on fence %s:%s:%llx timed out (hint:%pS)\n",
 		  cb->dma->ops->get_driver_name(cb->dma),
 		  cb->dma->ops->get_timeline_name(cb->dma),
 		  cb->dma->seqno,
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index c0cfe7ae2ba5..41313005af42 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -42,11 +42,11 @@ static inline struct drm_i915_private *kdev_minor_to_i915(struct device *kdev)
 static u32 calc_residency(struct drm_i915_private *dev_priv,
 			  i915_reg_t reg)
 {
-	u64 res;
+	intel_wakeref_t wakeref;
+	u64 res = 0;
 
-	intel_runtime_pm_get(dev_priv);
-	res = intel_rc6_residency_us(dev_priv, reg);
-	intel_runtime_pm_put(dev_priv);
+	with_intel_runtime_pm(dev_priv, wakeref)
+		res = intel_rc6_residency_us(dev_priv, reg);
 
 	return DIV_ROUND_CLOSEST_ULL(res, 1000);
 }
@@ -258,9 +258,10 @@ static ssize_t gt_act_freq_mhz_show(struct device *kdev,
 				    struct device_attribute *attr, char *buf)
 {
 	struct drm_i915_private *dev_priv = kdev_minor_to_i915(kdev);
+	intel_wakeref_t wakeref;
 	int ret;
 
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	mutex_lock(&dev_priv->pcu_lock);
 	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
@@ -274,7 +275,7 @@ static ssize_t gt_act_freq_mhz_show(struct device *kdev,
 	}
 	mutex_unlock(&dev_priv->pcu_lock);
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 
 	return snprintf(buf, PAGE_SIZE, "%d\n", ret);
 }
@@ -354,6 +355,7 @@ static ssize_t gt_max_freq_mhz_store(struct device *kdev,
 {
 	struct drm_i915_private *dev_priv = kdev_minor_to_i915(kdev);
 	struct intel_rps *rps = &dev_priv->gt_pm.rps;
+	intel_wakeref_t wakeref;
 	u32 val;
 	ssize_t ret;
 
@@ -361,7 +363,7 @@ static ssize_t gt_max_freq_mhz_store(struct device *kdev,
 	if (ret)
 		return ret;
 
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	mutex_lock(&dev_priv->pcu_lock);
 
@@ -371,7 +373,7 @@ static ssize_t gt_max_freq_mhz_store(struct device *kdev,
 	    val > rps->max_freq ||
 	    val < rps->min_freq_softlimit) {
 		mutex_unlock(&dev_priv->pcu_lock);
-		intel_runtime_pm_put(dev_priv);
+		intel_runtime_pm_put(dev_priv, wakeref);
 		return -EINVAL;
 	}
 
@@ -392,7 +394,7 @@ static ssize_t gt_max_freq_mhz_store(struct device *kdev,
 
 	mutex_unlock(&dev_priv->pcu_lock);
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 
 	return ret ?: count;
 }
@@ -412,6 +414,7 @@ static ssize_t gt_min_freq_mhz_store(struct device *kdev,
 {
 	struct drm_i915_private *dev_priv = kdev_minor_to_i915(kdev);
 	struct intel_rps *rps = &dev_priv->gt_pm.rps;
+	intel_wakeref_t wakeref;
 	u32 val;
 	ssize_t ret;
 
@@ -419,7 +422,7 @@ static ssize_t gt_min_freq_mhz_store(struct device *kdev,
 	if (ret)
 		return ret;
 
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	mutex_lock(&dev_priv->pcu_lock);
 
@@ -429,7 +432,7 @@ static ssize_t gt_min_freq_mhz_store(struct device *kdev,
 	    val > rps->max_freq ||
 	    val > rps->max_freq_softlimit) {
 		mutex_unlock(&dev_priv->pcu_lock);
-		intel_runtime_pm_put(dev_priv);
+		intel_runtime_pm_put(dev_priv, wakeref);
 		return -EINVAL;
 	}
 
@@ -446,7 +449,7 @@ static ssize_t gt_min_freq_mhz_store(struct device *kdev,
 
 	mutex_unlock(&dev_priv->pcu_lock);
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 
 	return ret ?: count;
 }
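
All of these sysfs conversions follow the new cookie-based runtime-PM tracking: intel_runtime_pm_get() now returns an intel_wakeref_t that must be handed back to intel_runtime_pm_put(), so debug builds can pair every release with its acquire; with_intel_runtime_pm() is the scoped form of the same pairing. A hedged userspace analogue of the cookie idea (counters and names invented):

#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

typedef unsigned long wakeref_t;

static atomic_ulong next_cookie = 1;	/* 0 is reserved: "no reference held" */
static atomic_int power_refcount;

static wakeref_t runtime_pm_get(void)
{
	atomic_fetch_add(&power_refcount, 1);
	/* A real tracker would also record the acquirer for this cookie. */
	return atomic_fetch_add(&next_cookie, 1);
}

static void runtime_pm_put(wakeref_t cookie)
{
	assert(cookie);	/* catches a put without a matching get */
	atomic_fetch_sub(&power_refcount, 1);
}

int main(void)
{
	wakeref_t w = runtime_pm_get();

	printf("holding cookie %lu\n", w);
	runtime_pm_put(w);
	return 0;
}
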
diff --git a/drivers/gpu/drm/i915/i915_timeline.c b/drivers/gpu/drm/i915/i915_timeline.c
index 4667cc08c416..b2202d2e58a2 100644
--- a/drivers/gpu/drm/i915/i915_timeline.c
+++ b/drivers/gpu/drm/i915/i915_timeline.c
@@ -9,34 +9,199 @@
 #include "i915_timeline.h"
 #include "i915_syncmap.h"
 
-void i915_timeline_init(struct drm_i915_private *i915,
-			struct i915_timeline *timeline,
-			const char *name)
+struct i915_timeline_hwsp {
+	struct i915_vma *vma;
+	struct list_head free_link;
+	u64 free_bitmap;
+};
+
+static inline struct i915_timeline_hwsp *
+i915_timeline_hwsp(const struct i915_timeline *tl)
+{
+	return tl->hwsp_ggtt->private;
+}
+
+static struct i915_vma *__hwsp_alloc(struct drm_i915_private *i915)
+{
+	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
+
+	obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
+	if (IS_ERR(obj))
+		return ERR_CAST(obj);
+
+	i915_gem_object_set_cache_coherency(obj, I915_CACHE_LLC);
+
+	vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL);
+	if (IS_ERR(vma))
+		i915_gem_object_put(obj);
+
+	return vma;
+}
+
+static struct i915_vma *
+hwsp_alloc(struct i915_timeline *timeline, unsigned int *cacheline)
 {
-	lockdep_assert_held(&i915->drm.struct_mutex);
+	struct drm_i915_private *i915 = timeline->i915;
+	struct i915_gt_timelines *gt = &i915->gt.timelines;
+	struct i915_timeline_hwsp *hwsp;
+
+	BUILD_BUG_ON(BITS_PER_TYPE(u64) * CACHELINE_BYTES > PAGE_SIZE);
+
+	spin_lock(&gt->hwsp_lock);
+
+	/* hwsp_free_list only contains HWSP that have available cachelines */
+	hwsp = list_first_entry_or_null(&gt->hwsp_free_list,
+					typeof(*hwsp), free_link);
+	if (!hwsp) {
+		struct i915_vma *vma;
+
+		spin_unlock(&gt->hwsp_lock);
+
+		hwsp = kmalloc(sizeof(*hwsp), GFP_KERNEL);
+		if (!hwsp)
+			return ERR_PTR(-ENOMEM);
+
+		vma = __hwsp_alloc(i915);
+		if (IS_ERR(vma)) {
+			kfree(hwsp);
+			return vma;
+		}
+
+		vma->private = hwsp;
+		hwsp->vma = vma;
+		hwsp->free_bitmap = ~0ull;
+
+		spin_lock(&gt->hwsp_lock);
+		list_add(&hwsp->free_link, &gt->hwsp_free_list);
+	}
+
+	GEM_BUG_ON(!hwsp->free_bitmap);
+	*cacheline = __ffs64(hwsp->free_bitmap);
+	hwsp->free_bitmap &= ~BIT_ULL(*cacheline);
+	if (!hwsp->free_bitmap)
+		list_del(&hwsp->free_link);
+
+	spin_unlock(&gt->hwsp_lock);
+
+	GEM_BUG_ON(hwsp->vma->private != hwsp);
+	return hwsp->vma;
+}
+
+static void hwsp_free(struct i915_timeline *timeline)
+{
+	struct i915_gt_timelines *gt = &timeline->i915->gt.timelines;
+	struct i915_timeline_hwsp *hwsp;
+
+	hwsp = i915_timeline_hwsp(timeline);
+	if (!hwsp) /* leave global HWSP alone! */
+		return;
+
+	spin_lock(&gt->hwsp_lock);
+
+	/* As a cacheline becomes available, publish the HWSP on the freelist */
+	if (!hwsp->free_bitmap)
+		list_add_tail(&hwsp->free_link, &gt->hwsp_free_list);
+
+	hwsp->free_bitmap |= BIT_ULL(timeline->hwsp_offset / CACHELINE_BYTES);
+
+	/* And if no one is left using it, give the page back to the system */
+	if (hwsp->free_bitmap == ~0ull) {
+		i915_vma_put(hwsp->vma);
+		list_del(&hwsp->free_link);
+		kfree(hwsp);
+	}
+
+	spin_unlock(&gt->hwsp_lock);
+}
+
+int i915_timeline_init(struct drm_i915_private *i915,
+		       struct i915_timeline *timeline,
+		       const char *name,
+		       struct i915_vma *hwsp)
+{
+	void *vaddr;
 
 	/*
 	 * Ideally we want a set of engines on a single leaf as we expect
 	 * to mostly be tracking synchronisation between engines. It is not
 	 * a huge issue if this is not the case, but we may want to mitigate
 	 * any page crossing penalties if they become an issue.
+	 *
+	 * Called during early_init before we know how many engines there are.
 	 */
 	BUILD_BUG_ON(KSYNCMAP < I915_NUM_ENGINES);
 
+	timeline->i915 = i915;
 	timeline->name = name;
+	timeline->pin_count = 0;
+	timeline->has_initial_breadcrumb = !hwsp;
 
-	list_add(&timeline->link, &i915->gt.timelines);
+	timeline->hwsp_offset = I915_GEM_HWS_SEQNO_ADDR;
+	if (!hwsp) {
+		unsigned int cacheline;
+
+		hwsp = hwsp_alloc(timeline, &cacheline);
+		if (IS_ERR(hwsp))
+			return PTR_ERR(hwsp);
+
+		timeline->hwsp_offset = cacheline * CACHELINE_BYTES;
+	}
+	timeline->hwsp_ggtt = i915_vma_get(hwsp);
+
+	vaddr = i915_gem_object_pin_map(hwsp->obj, I915_MAP_WB);
+	if (IS_ERR(vaddr)) {
+		hwsp_free(timeline);
+		i915_vma_put(hwsp);
+		return PTR_ERR(vaddr);
+	}
 
-	/* Called during early_init before we know how many engines there are */
+	timeline->hwsp_seqno =
+		memset(vaddr + timeline->hwsp_offset, 0, CACHELINE_BYTES);
 
 	timeline->fence_context = dma_fence_context_alloc(1);
 
 	spin_lock_init(&timeline->lock);
 
-	init_request_active(&timeline->last_request, NULL);
+	INIT_ACTIVE_REQUEST(&timeline->barrier);
+	INIT_ACTIVE_REQUEST(&timeline->last_request);
 	INIT_LIST_HEAD(&timeline->requests);
 
 	i915_syncmap_init(&timeline->sync);
+
+	return 0;
+}
+
+void i915_timelines_init(struct drm_i915_private *i915)
+{
+	struct i915_gt_timelines *gt = &i915->gt.timelines;
+
+	mutex_init(&gt->mutex);
+	INIT_LIST_HEAD(&gt->active_list);
+
+	spin_lock_init(&gt->hwsp_lock);
+	INIT_LIST_HEAD(&gt->hwsp_free_list);
+
+	/* via i915_gem_wait_for_idle() */
+	i915_gem_shrinker_taints_mutex(i915, &gt->mutex);
+}
+
+static void timeline_add_to_active(struct i915_timeline *tl)
+{
+	struct i915_gt_timelines *gt = &tl->i915->gt.timelines;
+
+	mutex_lock(&gt->mutex);
+	list_add(&tl->link, &gt->active_list);
+	mutex_unlock(&gt->mutex);
+}
+
+static void timeline_remove_from_active(struct i915_timeline *tl)
+{
+	struct i915_gt_timelines *gt = &tl->i915->gt.timelines;
+
+	mutex_lock(&gt->mutex);
+	list_del(&tl->link);
+	mutex_unlock(&gt->mutex);
 }
 
 /**
@@ -51,11 +216,11 @@ void i915_timeline_init(struct drm_i915_private *i915,
  */
 void i915_timelines_park(struct drm_i915_private *i915)
 {
+	struct i915_gt_timelines *gt = &i915->gt.timelines;
 	struct i915_timeline *timeline;
 
-	lockdep_assert_held(&i915->drm.struct_mutex);
-
-	list_for_each_entry(timeline, &i915->gt.timelines, link) {
+	mutex_lock(&gt->mutex);
+	list_for_each_entry(timeline, &gt->active_list, link) {
 		/*
 		 * All known fences are completed so we can scrap
 		 * the current sync point tracking and start afresh,
@@ -64,32 +229,88 @@ void i915_timelines_park(struct drm_i915_private *i915)
 		 */
 		i915_syncmap_free(&timeline->sync);
 	}
+	mutex_unlock(&gt->mutex);
 }
 
 void i915_timeline_fini(struct i915_timeline *timeline)
 {
+	GEM_BUG_ON(timeline->pin_count);
 	GEM_BUG_ON(!list_empty(&timeline->requests));
+	GEM_BUG_ON(i915_active_request_isset(&timeline->barrier));
 
 	i915_syncmap_free(&timeline->sync);
+	hwsp_free(timeline);
 
-	list_del(&timeline->link);
+	i915_gem_object_unpin_map(timeline->hwsp_ggtt->obj);
+	i915_vma_put(timeline->hwsp_ggtt);
 }
 
 struct i915_timeline *
-i915_timeline_create(struct drm_i915_private *i915, const char *name)
+i915_timeline_create(struct drm_i915_private *i915,
+		     const char *name,
+		     struct i915_vma *global_hwsp)
 {
 	struct i915_timeline *timeline;
+	int err;
 
 	timeline = kzalloc(sizeof(*timeline), GFP_KERNEL);
 	if (!timeline)
 		return ERR_PTR(-ENOMEM);
 
-	i915_timeline_init(i915, timeline, name);
+	err = i915_timeline_init(i915, timeline, name, global_hwsp);
+	if (err) {
+		kfree(timeline);
+		return ERR_PTR(err);
+	}
+
 	kref_init(&timeline->kref);
 
 	return timeline;
 }
 
+int i915_timeline_pin(struct i915_timeline *tl)
+{
+	int err;
+
+	if (tl->pin_count++)
+		return 0;
+	GEM_BUG_ON(!tl->pin_count);
+
+	err = i915_vma_pin(tl->hwsp_ggtt, 0, 0, PIN_GLOBAL | PIN_HIGH);
+	if (err)
+		goto unpin;
+
+	tl->hwsp_offset =
+		i915_ggtt_offset(tl->hwsp_ggtt) +
+		offset_in_page(tl->hwsp_offset);
+
+	timeline_add_to_active(tl);
+
+	return 0;
+
+unpin:
+	tl->pin_count = 0;
+	return err;
+}
+
+void i915_timeline_unpin(struct i915_timeline *tl)
+{
+	GEM_BUG_ON(!tl->pin_count);
+	if (--tl->pin_count)
+		return;
+
+	timeline_remove_from_active(tl);
+
+	/*
+	 * Since this timeline is idle, all barriers upon which we were waiting
+	 * must also be complete and so we can discard the last used barriers
+	 * without loss of information.
+	 */
+	i915_syncmap_free(&tl->sync);
+
+	__i915_vma_unpin(tl->hwsp_ggtt);
+}
+
 void __i915_timeline_free(struct kref *kref)
 {
 	struct i915_timeline *timeline =
@@ -99,6 +320,16 @@ void __i915_timeline_free(struct kref *kref)
 	kfree(timeline);
 }
 
+void i915_timelines_fini(struct drm_i915_private *i915)
+{
+	struct i915_gt_timelines *gt = &i915->gt.timelines;
+
+	GEM_BUG_ON(!list_empty(&gt->active_list));
+	GEM_BUG_ON(!list_empty(&gt->hwsp_free_list));
+
+	mutex_destroy(&gt->mutex);
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/mock_timeline.c"
 #include "selftests/i915_timeline.c"
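
hwsp_alloc()/hwsp_free() above form a small bitmap suballocator: each backing page holds 64 cachelines, a set bit in free_bitmap marks a free slot, __ffs64() picks the lowest one, and a page stays on hwsp_free_list only while it has slots left (the BUILD_BUG_ON checks that 64 cachelines indeed fit a page). A self-contained sketch of the same bookkeeping, for a single page without locking:

#include <stdint.h>
#include <stdio.h>

#define CACHELINE_BYTES 64	/* 64 slots x 64 bytes = one 4KiB page */

struct hwsp_page {
	uint64_t free_bitmap;	/* bit set => cacheline is free */
};

static int suballoc(struct hwsp_page *p)
{
	int slot;

	if (!p->free_bitmap)
		return -1;	/* exhausted: drop the page from the freelist */

	slot = __builtin_ctzll(p->free_bitmap);	/* lowest free slot, cf. __ffs64() */
	p->free_bitmap &= ~(1ULL << slot);
	return slot * CACHELINE_BYTES;	/* byte offset within the page */
}

static void subfree(struct hwsp_page *p, int offset)
{
	/* If this makes free_bitmap non-zero again, republish the page;
	 * once it is all ones, the whole page can be returned. */
	p->free_bitmap |= 1ULL << (offset / CACHELINE_BYTES);
}

int main(void)
{
	struct hwsp_page p = { .free_bitmap = ~0ULL };
	int a = suballoc(&p), b = suballoc(&p);

	printf("offsets: %d %d\n", a, b);	/* 0 64 */
	subfree(&p, 0);
	printf("reused:  %d\n", suballoc(&p));	/* 0 again */
	return 0;
}
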
diff --git a/drivers/gpu/drm/i915/i915_timeline.h b/drivers/gpu/drm/i915/i915_timeline.h
index ebd71b487220..7bec7d2e45bf 100644
--- a/drivers/gpu/drm/i915/i915_timeline.h
+++ b/drivers/gpu/drm/i915/i915_timeline.h
@@ -28,10 +28,14 @@
 #include <linux/list.h>
 #include <linux/kref.h>
 
+#include "i915_active.h"
 #include "i915_request.h"
 #include "i915_syncmap.h"
 #include "i915_utils.h"
 
+struct i915_vma;
+struct i915_timeline_hwsp;
+
 struct i915_timeline {
 	u64 fence_context;
 	u32 seqno;
@@ -40,6 +44,13 @@ struct i915_timeline {
 #define TIMELINE_CLIENT 0 /* default subclass */
 #define TIMELINE_ENGINE 1
 
+	unsigned int pin_count;
+	const u32 *hwsp_seqno;
+	struct i915_vma *hwsp_ggtt;
+	u32 hwsp_offset;
+
+	bool has_initial_breadcrumb;
+
 	/**
 	 * List of breadcrumbs associated with GPU requests currently
 	 * outstanding.
@@ -48,10 +59,10 @@ struct i915_timeline {
 
 	/* Contains an RCU guarded pointer to the last request. No reference is
 	 * held to the request, users must carefully acquire a reference to
-	 * the request using i915_gem_active_get_request_rcu(), or hold the
+	 * the request using i915_active_request_get_request_rcu(), or hold the
 	 * struct_mutex.
 	 */
-	struct i915_gem_active last_request;
+	struct i915_active_request last_request;
 
 	/**
 	 * We track the most recent seqno that we wait on in every context so
@@ -63,24 +74,28 @@ struct i915_timeline {
 	 * redundant and we can discard it without loss of generality.
 	 */
 	struct i915_syncmap *sync;
+
 	/**
-	 * Separately to the inter-context seqno map above, we track the last
-	 * barrier (e.g. semaphore wait) to the global engine timelines. Note
-	 * that this tracks global_seqno rather than the context.seqno, and
-	 * so it is subject to the limitations of hw wraparound and that we
-	 * may need to revoke global_seqno (on pre-emption).
+	 * The barrier provides a way to serialize ordering between different
+	 * timelines.
+	 *
+	 * Users can call i915_timeline_set_barrier(), which will make all
+	 * subsequent submissions to this timeline execute only after the
+	 * barrier has completed.
 	 */
-	u32 global_sync[I915_NUM_ENGINES];
+	struct i915_active_request barrier;
 
 	struct list_head link;
 	const char *name;
+	struct drm_i915_private *i915;
 
 	struct kref kref;
 };
 
-void i915_timeline_init(struct drm_i915_private *i915,
-			struct i915_timeline *tl,
-			const char *name);
+int i915_timeline_init(struct drm_i915_private *i915,
+		       struct i915_timeline *tl,
+		       const char *name,
+		       struct i915_vma *hwsp);
 void i915_timeline_fini(struct i915_timeline *tl);
 
 static inline void
@@ -103,7 +118,9 @@ i915_timeline_set_subclass(struct i915_timeline *timeline,
 }
 
 struct i915_timeline *
-i915_timeline_create(struct drm_i915_private *i915, const char *name);
+i915_timeline_create(struct drm_i915_private *i915,
+		     const char *name,
+		     struct i915_vma *global_hwsp);
 
 static inline struct i915_timeline *
 i915_timeline_get(struct i915_timeline *timeline)
@@ -142,6 +159,26 @@ static inline bool i915_timeline_sync_is_later(struct i915_timeline *tl,
 	return __i915_timeline_sync_is_later(tl, fence->context, fence->seqno);
 }
 
+int i915_timeline_pin(struct i915_timeline *tl);
+void i915_timeline_unpin(struct i915_timeline *tl);
+
+void i915_timelines_init(struct drm_i915_private *i915);
 void i915_timelines_park(struct drm_i915_private *i915);
+void i915_timelines_fini(struct drm_i915_private *i915);
+
+/**
+ * i915_timeline_set_barrier - orders submission between different timelines
+ * @timeline: timeline to set the barrier on
+ * @rq: request after which new submissions can proceed
+ *
+ * Sets the passed in request as the serialization point for all subsequent
+ * submissions on @timeline. Subsequent requests will not be submitted to GPU
+ * until the barrier has been completed.
+ */
+static inline int
+i915_timeline_set_barrier(struct i915_timeline *tl, struct i915_request *rq)
+{
+	return i915_active_request_set(&tl->barrier, rq);
+}
 
 #endif
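
The barrier is simply an i915_active_request slot that the submission path consults before letting requests onto the GPU. A loose userspace analogue of those semantics (fence and timeline structs invented; the real slot is retired by the request machinery rather than polled):

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct fence { bool signaled; };

struct timeline {
	struct fence *barrier;	/* analogue of the i915_active_request slot */
};

static void timeline_set_barrier(struct timeline *tl, struct fence *f)
{
	tl->barrier = f;
}

static bool timeline_can_submit(struct timeline *tl)
{
	if (tl->barrier && !tl->barrier->signaled)
		return false;	/* hold back until the barrier completes */
	tl->barrier = NULL;	/* completed barriers are dropped */
	return true;
}

int main(void)
{
	struct fence f = { .signaled = false };
	struct timeline tl = { .barrier = NULL };

	timeline_set_barrier(&tl, &f);
	printf("before signal: %s\n", timeline_can_submit(&tl) ? "submit" : "wait");
	f.signaled = true;
	printf("after signal:  %s\n", timeline_can_submit(&tl) ? "submit" : "wait");
	return 0;
}
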
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index b50c6b829715..eab313c3163c 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -6,7 +6,8 @@
 #include <linux/types.h>
 #include <linux/tracepoint.h>
 
-#include <drm/drmP.h>
+#include <drm/drm_drv.h>
+
 #include "i915_drv.h"
 #include "intel_drv.h"
 #include "intel_ringbuffer.h"
@@ -585,35 +586,6 @@ TRACE_EVENT(i915_gem_evict_vm,
 	    TP_printk("dev=%d, vm=%p", __entry->dev, __entry->vm)
 );
 
-TRACE_EVENT(i915_gem_ring_sync_to,
-	    TP_PROTO(struct i915_request *to, struct i915_request *from),
-	    TP_ARGS(to, from),
-
-	    TP_STRUCT__entry(
-			     __field(u32, dev)
-			     __field(u32, from_class)
-			     __field(u32, from_instance)
-			     __field(u32, to_class)
-			     __field(u32, to_instance)
-			     __field(u32, seqno)
-			     ),
-
-	    TP_fast_assign(
-			   __entry->dev = from->i915->drm.primary->index;
-			   __entry->from_class = from->engine->uabi_class;
-			   __entry->from_instance = from->engine->instance;
-			   __entry->to_class = to->engine->uabi_class;
-			   __entry->to_instance = to->engine->instance;
-			   __entry->seqno = from->global_seqno;
-			   ),
-
-	    TP_printk("dev=%u, sync-from=%u:%u, sync-to=%u:%u, seqno=%u",
-		      __entry->dev,
-		      __entry->from_class, __entry->from_instance,
-		      __entry->to_class, __entry->to_instance,
-		      __entry->seqno)
-);
-
 TRACE_EVENT(i915_request_queue,
 	    TP_PROTO(struct i915_request *rq, u32 flags),
 	    TP_ARGS(rq, flags),
@@ -780,31 +752,6 @@ trace_i915_request_out(struct i915_request *rq)
 #endif
 #endif
 
-TRACE_EVENT(intel_engine_notify,
-	    TP_PROTO(struct intel_engine_cs *engine, bool waiters),
-	    TP_ARGS(engine, waiters),
-
-	    TP_STRUCT__entry(
-			     __field(u32, dev)
-			     __field(u16, class)
-			     __field(u16, instance)
-			     __field(u32, seqno)
-			     __field(bool, waiters)
-			     ),
-
-	    TP_fast_assign(
-			   __entry->dev = engine->i915->drm.primary->index;
-			   __entry->class = engine->uabi_class;
-			   __entry->instance = engine->instance;
-			   __entry->seqno = intel_engine_get_seqno(engine);
-			   __entry->waiters = waiters;
-			   ),
-
-	    TP_printk("dev=%u, engine=%u:%u, seqno=%u, waiters=%u",
-		      __entry->dev, __entry->class, __entry->instance,
-		      __entry->seqno, __entry->waiters)
-);
-
 DEFINE_EVENT(i915_request, i915_request_retire,
 	    TP_PROTO(struct i915_request *rq),
 	    TP_ARGS(rq)
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 5b4d78cdb4ca..b713bed20c38 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -63,24 +63,22 @@ static void vma_print_allocator(struct i915_vma *vma, const char *reason)
 
 #endif
 
-struct i915_vma_active {
-	struct i915_gem_active base;
-	struct i915_vma *vma;
-	struct rb_node node;
-	u64 timeline;
-};
-
-static void
-__i915_vma_retire(struct i915_vma *vma, struct i915_request *rq)
+static void obj_bump_mru(struct drm_i915_gem_object *obj)
 {
-	struct drm_i915_gem_object *obj = vma->obj;
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 
-	GEM_BUG_ON(!i915_vma_is_active(vma));
-	if (--vma->active_count)
-		return;
+	spin_lock(&i915->mm.obj_lock);
+	if (obj->bind_count)
+		list_move_tail(&obj->mm.link, &i915->mm.bound_list);
+	spin_unlock(&i915->mm.obj_lock);
 
-	GEM_BUG_ON(!drm_mm_node_allocated(&vma->node));
-	list_move_tail(&vma->vm_link, &vma->vm->inactive_list);
+	obj->mm.dirty = true; /* be paranoid  */
+}
+
+static void __i915_vma_retire(struct i915_active *ref)
+{
+	struct i915_vma *vma = container_of(ref, typeof(*vma), active);
+	struct drm_i915_gem_object *obj = vma->obj;
 
 	GEM_BUG_ON(!i915_gem_object_is_active(obj));
 	if (--obj->active_count)
@@ -93,16 +91,12 @@ __i915_vma_retire(struct i915_vma *vma, struct i915_request *rq)
 		reservation_object_unlock(obj->resv);
 	}
 
-	/* Bump our place on the bound list to keep it roughly in LRU order
+	/*
+	 * Bump our place on the bound list to keep it roughly in LRU order
 	 * so that we don't steal from recently used but inactive objects
 	 * (unless we are forced to ofc!)
 	 */
-	spin_lock(&rq->i915->mm.obj_lock);
-	if (obj->bind_count)
-		list_move_tail(&obj->mm.link, &rq->i915->mm.bound_list);
-	spin_unlock(&rq->i915->mm.obj_lock);
-
-	obj->mm.dirty = true; /* be paranoid  */
+	obj_bump_mru(obj);
 
 	if (i915_gem_object_has_active_reference(obj)) {
 		i915_gem_object_clear_active_reference(obj);
@@ -110,21 +104,6 @@ __i915_vma_retire(struct i915_vma *vma, struct i915_request *rq)
 	}
 }
 
-static void
-i915_vma_retire(struct i915_gem_active *base, struct i915_request *rq)
-{
-	struct i915_vma_active *active =
-		container_of(base, typeof(*active), base);
-
-	__i915_vma_retire(active->vma, rq);
-}
-
-static void
-i915_vma_last_retire(struct i915_gem_active *base, struct i915_request *rq)
-{
-	__i915_vma_retire(container_of(base, struct i915_vma, last_active), rq);
-}
-
 static struct i915_vma *
 vma_create(struct drm_i915_gem_object *obj,
 	   struct i915_address_space *vm,
@@ -140,10 +119,9 @@ vma_create(struct drm_i915_gem_object *obj,
 	if (vma == NULL)
 		return ERR_PTR(-ENOMEM);
 
-	vma->active = RB_ROOT;
+	i915_active_init(vm->i915, &vma->active, __i915_vma_retire);
+	INIT_ACTIVE_REQUEST(&vma->last_fence);
 
-	init_request_active(&vma->last_active, i915_vma_last_retire);
-	init_request_active(&vma->last_fence, NULL);
 	vma->vm = vm;
 	vma->ops = &vm->vma_ops;
 	vma->obj = obj;
@@ -190,33 +168,56 @@ vma_create(struct drm_i915_gem_object *obj,
 								i915_gem_object_get_stride(obj));
 		GEM_BUG_ON(!is_power_of_2(vma->fence_alignment));
 
-		/*
-		 * We put the GGTT vma at the start of the vma-list, followed
-		 * by the ppGGTT vma. This allows us to break early when
-		 * iterating over only the GGTT vma for an object, see
-		 * for_each_ggtt_vma()
-		 */
 		vma->flags |= I915_VMA_GGTT;
-		list_add(&vma->obj_link, &obj->vma_list);
-	} else {
-		list_add_tail(&vma->obj_link, &obj->vma_list);
 	}
 
+	spin_lock(&obj->vma.lock);
+
 	rb = NULL;
-	p = &obj->vma_tree.rb_node;
+	p = &obj->vma.tree.rb_node;
 	while (*p) {
 		struct i915_vma *pos;
+		long cmp;
 
 		rb = *p;
 		pos = rb_entry(rb, struct i915_vma, obj_node);
-		if (i915_vma_compare(pos, vm, view) < 0)
+
+		/*
+		 * If the view already exists in the tree, another thread
+		 * already created a matching vma, so return the older instance
+		 * and dispose of ours.
+		 */
+		cmp = i915_vma_compare(pos, vm, view);
+		if (cmp == 0) {
+			spin_unlock(&obj->vma.lock);
+			kmem_cache_free(vm->i915->vmas, vma);
+			return pos;
+		}
+
+		if (cmp < 0)
 			p = &rb->rb_right;
 		else
 			p = &rb->rb_left;
 	}
 	rb_link_node(&vma->obj_node, rb, p);
-	rb_insert_color(&vma->obj_node, &obj->vma_tree);
+	rb_insert_color(&vma->obj_node, &obj->vma.tree);
+
+	if (i915_vma_is_ggtt(vma))
+		/*
+		 * We put the GGTT vma at the start of the vma-list, followed
+		 * by the ppGGTT vma. This allows us to break early when
+		 * iterating over only the GGTT vma for an object, see
+		 * for_each_ggtt_vma()
+		 */
+		list_add(&vma->obj_link, &obj->vma.list);
+	else
+		list_add_tail(&vma->obj_link, &obj->vma.list);
+
+	spin_unlock(&obj->vma.lock);
+
+	mutex_lock(&vm->mutex);
 	list_add(&vma->vm_link, &vm->unbound_list);
+	mutex_unlock(&vm->mutex);
 
 	return vma;
 
@@ -232,7 +233,7 @@ vma_lookup(struct drm_i915_gem_object *obj,
 {
 	struct rb_node *rb;
 
-	rb = obj->vma_tree.rb_node;
+	rb = obj->vma.tree.rb_node;
 	while (rb) {
 		struct i915_vma *vma = rb_entry(rb, struct i915_vma, obj_node);
 		long cmp;
@@ -272,16 +273,18 @@ i915_vma_instance(struct drm_i915_gem_object *obj,
 {
 	struct i915_vma *vma;
 
-	lockdep_assert_held(&obj->base.dev->struct_mutex);
 	GEM_BUG_ON(view && !i915_is_ggtt(vm));
 	GEM_BUG_ON(vm->closed);
 
+	spin_lock(&obj->vma.lock);
 	vma = vma_lookup(obj, vm, view);
-	if (!vma)
+	spin_unlock(&obj->vma.lock);
+
+	/* vma_create() will resolve the race if another creates the vma */
+	if (unlikely(!vma))
 		vma = vma_create(obj, vm, view);
 
 	GEM_BUG_ON(!IS_ERR(vma) && i915_vma_compare(vma, vm, view));
-	GEM_BUG_ON(!IS_ERR(vma) && vma_lookup(obj, vm, view) != vma);
 	return vma;
 }
 
@@ -659,7 +662,9 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	GEM_BUG_ON(!drm_mm_node_allocated(&vma->node));
 	GEM_BUG_ON(!i915_gem_valid_gtt_space(vma, cache_level));
 
-	list_move_tail(&vma->vm_link, &vma->vm->inactive_list);
+	mutex_lock(&vma->vm->mutex);
+	list_move_tail(&vma->vm_link, &vma->vm->bound_list);
+	mutex_unlock(&vma->vm->mutex);
 
 	if (vma->obj) {
 		struct drm_i915_gem_object *obj = vma->obj;
@@ -692,8 +697,10 @@ i915_vma_remove(struct i915_vma *vma)
 
 	vma->ops->clear_pages(vma);
 
+	mutex_lock(&vma->vm->mutex);
 	drm_mm_remove_node(&vma->node);
 	list_move_tail(&vma->vm_link, &vma->vm->unbound_list);
+	mutex_unlock(&vma->vm->mutex);
 
 	/*
 	 * Since the unbound list is global, only move to that list if
@@ -797,23 +804,27 @@ void i915_vma_reopen(struct i915_vma *vma)
 static void __i915_vma_destroy(struct i915_vma *vma)
 {
 	struct drm_i915_private *i915 = vma->vm->i915;
-	struct i915_vma_active *iter, *n;
 
 	GEM_BUG_ON(vma->node.allocated);
 	GEM_BUG_ON(vma->fence);
 
-	GEM_BUG_ON(i915_gem_active_isset(&vma->last_fence));
+	GEM_BUG_ON(i915_active_request_isset(&vma->last_fence));
 
-	list_del(&vma->obj_link);
+	mutex_lock(&vma->vm->mutex);
 	list_del(&vma->vm_link);
-	if (vma->obj)
-		rb_erase(&vma->obj_node, &vma->obj->vma_tree);
+	mutex_unlock(&vma->vm->mutex);
+
+	if (vma->obj) {
+		struct drm_i915_gem_object *obj = vma->obj;
 
-	rbtree_postorder_for_each_entry_safe(iter, n, &vma->active, node) {
-		GEM_BUG_ON(i915_gem_active_isset(&iter->base));
-		kfree(iter);
+		spin_lock(&obj->vma.lock);
+		list_del(&vma->obj_link);
+		rb_erase(&vma->obj_node, &vma->obj->vma.tree);
+		spin_unlock(&obj->vma.lock);
 	}
 
+	i915_active_fini(&vma->active);
+
 	kmem_cache_free(i915->vmas, vma);
 }
 
@@ -897,104 +908,15 @@ static void export_fence(struct i915_vma *vma,
 	reservation_object_unlock(resv);
 }
 
-static struct i915_gem_active *active_instance(struct i915_vma *vma, u64 idx)
-{
-	struct i915_vma_active *active;
-	struct rb_node **p, *parent;
-	struct i915_request *old;
-
-	/*
-	 * We track the most recently used timeline to skip a rbtree search
-	 * for the common case, under typical loads we never need the rbtree
-	 * at all. We can reuse the last_active slot if it is empty, that is
-	 * after the previous activity has been retired, or if the active
-	 * matches the current timeline.
-	 *
-	 * Note that we allow the timeline to be active simultaneously in
-	 * the rbtree and the last_active cache. We do this to avoid having
-	 * to search and replace the rbtree element for a new timeline, with
-	 * the cost being that we must be aware that the vma may be retired
-	 * twice for the same timeline (as the older rbtree element will be
-	 * retired before the new request added to last_active).
-	 */
-	old = i915_gem_active_raw(&vma->last_active,
-				  &vma->vm->i915->drm.struct_mutex);
-	if (!old || old->fence.context == idx)
-		goto out;
-
-	/* Move the currently active fence into the rbtree */
-	idx = old->fence.context;
-
-	parent = NULL;
-	p = &vma->active.rb_node;
-	while (*p) {
-		parent = *p;
-
-		active = rb_entry(parent, struct i915_vma_active, node);
-		if (active->timeline == idx)
-			goto replace;
-
-		if (active->timeline < idx)
-			p = &parent->rb_right;
-		else
-			p = &parent->rb_left;
-	}
-
-	active = kmalloc(sizeof(*active), GFP_KERNEL);
-
-	/* kmalloc may retire the vma->last_active request (thanks shrinker)! */
-	if (unlikely(!i915_gem_active_raw(&vma->last_active,
-					  &vma->vm->i915->drm.struct_mutex))) {
-		kfree(active);
-		goto out;
-	}
-
-	if (unlikely(!active))
-		return ERR_PTR(-ENOMEM);
-
-	init_request_active(&active->base, i915_vma_retire);
-	active->vma = vma;
-	active->timeline = idx;
-
-	rb_link_node(&active->node, parent, p);
-	rb_insert_color(&active->node, &vma->active);
-
-replace:
-	/*
-	 * Overwrite the previous active slot in the rbtree with last_active,
-	 * leaving last_active zeroed. If the previous slot is still active,
-	 * we must be careful as we now only expect to receive one retire
-	 * callback not two, and so much undo the active counting for the
-	 * overwritten slot.
-	 */
-	if (i915_gem_active_isset(&active->base)) {
-		/* Retire ourselves from the old rq->active_list */
-		__list_del_entry(&active->base.link);
-		vma->active_count--;
-		GEM_BUG_ON(!vma->active_count);
-	}
-	GEM_BUG_ON(list_empty(&vma->last_active.link));
-	list_replace_init(&vma->last_active.link, &active->base.link);
-	active->base.request = fetch_and_zero(&vma->last_active.request);
-
-out:
-	return &vma->last_active;
-}
-
 int i915_vma_move_to_active(struct i915_vma *vma,
 			    struct i915_request *rq,
 			    unsigned int flags)
 {
 	struct drm_i915_gem_object *obj = vma->obj;
-	struct i915_gem_active *active;
 
 	lockdep_assert_held(&rq->i915->drm.struct_mutex);
 	GEM_BUG_ON(!drm_mm_node_allocated(&vma->node));
 
-	active = active_instance(vma, rq->fence.context);
-	if (IS_ERR(active))
-		return PTR_ERR(active);
-
 	/*
 	 * Add a reference if we're newly entering the active list.
 	 * The order in which we add operations to the retirement queue is
@@ -1003,11 +925,15 @@ int i915_vma_move_to_active(struct i915_vma *vma,
 	 * add the active reference first and queue for it to be dropped
 	 * *last*.
 	 */
-	if (!i915_gem_active_isset(active) && !vma->active_count++) {
-		list_move_tail(&vma->vm_link, &vma->vm->active_list);
+	if (!vma->active.count)
 		obj->active_count++;
+
+	if (unlikely(i915_active_ref(&vma->active, rq->fence.context, rq))) {
+		if (!vma->active.count)
+			obj->active_count--;
+		return -ENOMEM;
 	}
-	i915_gem_active_set(active, rq);
+
 	GEM_BUG_ON(!i915_vma_is_active(vma));
 	GEM_BUG_ON(!obj->active_count);
 
@@ -1016,14 +942,14 @@ int i915_vma_move_to_active(struct i915_vma *vma,
 		obj->write_domain = I915_GEM_DOMAIN_RENDER;
 
 		if (intel_fb_obj_invalidate(obj, ORIGIN_CS))
-			i915_gem_active_set(&obj->frontbuffer_write, rq);
+			__i915_active_request_set(&obj->frontbuffer_write, rq);
 
 		obj->read_domains = 0;
 	}
 	obj->read_domains |= I915_GEM_GPU_DOMAINS;
 
 	if (flags & EXEC_OBJECT_NEEDS_FENCE)
-		i915_gem_active_set(&vma->last_fence, rq);
+		__i915_active_request_set(&vma->last_fence, rq);
 
 	export_fence(vma, rq, flags);
 	return 0;
@@ -1041,8 +967,6 @@ int i915_vma_unbind(struct i915_vma *vma)
 	 */
 	might_sleep();
 	if (i915_vma_is_active(vma)) {
-		struct i915_vma_active *active, *n;
-
 		/*
 		 * When a closed VMA is retired, it is unbound - eek.
 		 * In order to prevent it from being recursively closed,
@@ -1058,21 +982,12 @@ int i915_vma_unbind(struct i915_vma *vma)
 		 */
 		__i915_vma_pin(vma);
 
-		ret = i915_gem_active_retire(&vma->last_active,
-					     &vma->vm->i915->drm.struct_mutex);
+		ret = i915_active_wait(&vma->active);
 		if (ret)
 			goto unpin;
 
-		rbtree_postorder_for_each_entry_safe(active, n,
-						     &vma->active, node) {
-			ret = i915_gem_active_retire(&active->base,
-						     &vma->vm->i915->drm.struct_mutex);
-			if (ret)
-				goto unpin;
-		}
-
-		ret = i915_gem_active_retire(&vma->last_fence,
-					     &vma->vm->i915->drm.struct_mutex);
+		ret = i915_active_request_retire(&vma->last_fence,
+					      &vma->vm->i915->drm.struct_mutex);
 unpin:
 		__i915_vma_unpin(vma);
 		if (ret)
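
i915_vma_instance() now takes obj->vma.lock only for a quick lookup; on a miss, vma_create() builds the new vma outside the lock and re-walks the tree while inserting, handing back the older instance (and freeing its own) if another thread raced in. A minimal sketch of that lookup-or-create pattern, with a pthread mutex for the spinlock and a linked list standing in for the rbtree:

#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

struct vma { int key; struct vma *next; };

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static struct vma *tree;	/* stand-in for obj->vma.tree */

static struct vma *lookup(int key)
{
	for (struct vma *v = tree; v; v = v->next)
		if (v->key == key)
			return v;
	return NULL;
}

static struct vma *vma_instance(int key)
{
	struct vma *v, *pos;

	pthread_mutex_lock(&lock);
	v = lookup(key);
	pthread_mutex_unlock(&lock);
	if (v)
		return v;	/* fast path: no allocation */

	v = malloc(sizeof(*v));	/* construct outside the lock */
	if (!v)
		return NULL;
	v->key = key;

	pthread_mutex_lock(&lock);
	pos = lookup(key);	/* re-check: someone may have raced us */
	if (pos) {
		pthread_mutex_unlock(&lock);
		free(v);	/* lose the race gracefully */
		return pos;
	}
	v->next = tree;
	tree = v;
	pthread_mutex_unlock(&lock);
	return v;
}

int main(void)
{
	return vma_instance(1) == vma_instance(1) ? 0 : 1;
}
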
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 4f7c1c7599f4..7c742027f866 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -34,6 +34,7 @@
 #include "i915_gem_fence_reg.h"
 #include "i915_gem_object.h"
 
+#include "i915_active.h"
 #include "i915_request.h"
 
 enum i915_cache_level;
@@ -71,34 +72,45 @@ struct i915_vma {
 	unsigned int open_count;
 	unsigned long flags;
 	/**
-	 * How many users have pinned this object in GTT space. The following
-	 * users can each hold at most one reference: pwrite/pread, execbuffer
-	 * (objects are not allowed multiple times for the same batchbuffer),
-	 * and the framebuffer code. When switching/pageflipping, the
-	 * framebuffer code has at most two buffers pinned per crtc.
+	 * How many users have pinned this object in GTT space.
 	 *
-	 * In the worst case this is 1 + 1 + 1 + 2*2 = 7. That would fit into 3
-	 * bits with absolutely no headroom. So use 4 bits.
+	 * This is a tightly bound, fairly small number of users, so we
+	 * stuff inside the flags field so that we can both check for overflow
+	 * stuff it inside the flags field so that we can both check for overflow
+	 * pinning the vma.
+	 *
+	 * The worst case display setup would have the same vma pinned for
+	 * use on each plane on each crtc, while also building the next atomic
+	 * state and holding a pin for the length of the cleanup queue. In the
+	 * future, the flip queue may be increased from 1.
+	 * Estimated worst case: 3 [qlen] * 4 [max crtcs] * 7 [max planes] = 84
+	 *
+	 * For GEM, the number of concurrent users for pwrite/pread is
+	 * unbounded. For execbuffer, it is currently one but will in future
+	 * be extended to allow multiple clients to pin vma concurrently.
+	 *
+	 * We also use suballocated pages, with each suballocation claiming
+	 * its own pin on the shared vma. At present, this is limited to
+	 * exclusive cachelines of a single page, so a maximum of 64 possible
+	 * users.
 	 */
-#define I915_VMA_PIN_MASK 0xf
-#define I915_VMA_PIN_OVERFLOW	BIT(5)
+#define I915_VMA_PIN_MASK 0xff
+#define I915_VMA_PIN_OVERFLOW	BIT(8)
 
 	/** Flags and address space this VMA is bound to */
-#define I915_VMA_GLOBAL_BIND	BIT(6)
-#define I915_VMA_LOCAL_BIND	BIT(7)
+#define I915_VMA_GLOBAL_BIND	BIT(9)
+#define I915_VMA_LOCAL_BIND	BIT(10)
 #define I915_VMA_BIND_MASK (I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND | I915_VMA_PIN_OVERFLOW)
 
-#define I915_VMA_GGTT		BIT(8)
-#define I915_VMA_CAN_FENCE	BIT(9)
-#define I915_VMA_CLOSED		BIT(10)
-#define I915_VMA_USERFAULT_BIT	11
+#define I915_VMA_GGTT		BIT(11)
+#define I915_VMA_CAN_FENCE	BIT(12)
+#define I915_VMA_CLOSED		BIT(13)
+#define I915_VMA_USERFAULT_BIT	14
 #define I915_VMA_USERFAULT	BIT(I915_VMA_USERFAULT_BIT)
-#define I915_VMA_GGTT_WRITE	BIT(12)
+#define I915_VMA_GGTT_WRITE	BIT(15)
 
-	unsigned int active_count;
-	struct rb_root active;
-	struct i915_gem_active last_active;
-	struct i915_gem_active last_fence;
+	struct i915_active active;
+	struct i915_active_request last_fence;
 
 	/**
 	 * Support different GGTT views into the same object.
@@ -141,9 +153,9 @@ i915_vma_instance(struct drm_i915_gem_object *obj,
 void i915_vma_unpin_and_release(struct i915_vma **p_vma, unsigned int flags);
 #define I915_VMA_RELEASE_MAP BIT(0)
 
-static inline bool i915_vma_is_active(struct i915_vma *vma)
+static inline bool i915_vma_is_active(const struct i915_vma *vma)
 {
-	return vma->active_count;
+	return !i915_active_is_idle(&vma->active);
 }
 
 int __must_check i915_vma_move_to_active(struct i915_vma *vma,
@@ -425,7 +437,7 @@ void i915_vma_parked(struct drm_i915_private *i915);
  * or the list is empty ofc.
  */
 #define for_each_ggtt_vma(V, OBJ) \
-	list_for_each_entry(V, &(OBJ)->vma_list, obj_link)		\
+	list_for_each_entry(V, &(OBJ)->vma.list, obj_link)		\
 		for_each_until(!i915_vma_is_ggtt(V))
 
 #endif
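
Widening I915_VMA_PIN_MASK to 0xff and shifting the bind/status bits up makes room for the worst case estimated in the comment above (84 display pins plus the suballocation users). Packing the counter into the low bits of the flags word means a single add both bumps the pin and, via the carry, flags an overflow. A hedged sketch of that packing (the bit layout mirrors the header; everything else is invented):

#include <assert.h>
#include <stdio.h>

#define PIN_MASK	0xffu		/* low 8 bits: pin count */
#define PIN_OVERFLOW	(1u << 8)	/* set by the carry when the count wraps */
#define GLOBAL_BIND	(1u << 9)	/* bind flags live above the counter */

static unsigned int flags;

static void vma_pin(void)
{
	flags += 1;	/* carries into PIN_OVERFLOW after 255 pins */
	assert(!(flags & PIN_OVERFLOW));
}

static void vma_unpin(void)
{
	assert(flags & PIN_MASK);
	flags -= 1;
}

int main(void)
{
	vma_pin();
	/* One read answers both "is it pinned?" and "did we overflow?". */
	printf("pinned=%u bound=%d\n", flags & PIN_MASK, !!(flags & GLOBAL_BIND));
	vma_unpin();
	return 0;
}
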
diff --git a/drivers/gpu/drm/i915/icl_dsi.c b/drivers/gpu/drm/i915/icl_dsi.c
index 4dd793b78996..73a7bee24a66 100644
--- a/drivers/gpu/drm/i915/icl_dsi.c
+++ b/drivers/gpu/drm/i915/icl_dsi.c
@@ -337,9 +337,11 @@ static void gen11_dsi_enable_io_power(struct intel_encoder *encoder)
 	}
 
 	for_each_dsi_port(port, intel_dsi->ports) {
-		intel_display_power_get(dev_priv, port == PORT_A ?
-					POWER_DOMAIN_PORT_DDI_A_IO :
-					POWER_DOMAIN_PORT_DDI_B_IO);
+		intel_dsi->io_wakeref[port] =
+			intel_display_power_get(dev_priv,
+						port == PORT_A ?
+						POWER_DOMAIN_PORT_DDI_A_IO :
+						POWER_DOMAIN_PORT_DDI_B_IO);
 	}
 }
 
@@ -1125,10 +1127,18 @@ static void gen11_dsi_disable_io_power(struct intel_encoder *encoder)
 	enum port port;
 	u32 tmp;
 
-	intel_display_power_put(dev_priv, POWER_DOMAIN_PORT_DDI_A_IO);
-
-	if (intel_dsi->dual_link)
-		intel_display_power_put(dev_priv, POWER_DOMAIN_PORT_DDI_B_IO);
+	for_each_dsi_port(port, intel_dsi->ports) {
+		intel_wakeref_t wakeref;
+
+		wakeref = fetch_and_zero(&intel_dsi->io_wakeref[port]);
+		if (wakeref) {
+			intel_display_power_put(dev_priv,
+						port == PORT_A ?
+						POWER_DOMAIN_PORT_DDI_A_IO :
+						POWER_DOMAIN_PORT_DDI_B_IO,
+						wakeref);
+		}
+	}
 
 	/* set mode to DDI */
 	for_each_dsi_port(port, intel_dsi->ports) {
@@ -1178,9 +1188,9 @@ static void gen11_dsi_get_config(struct intel_encoder *encoder,
 	pipe_config->output_types |= BIT(INTEL_OUTPUT_DSI);
 }
 
-static bool gen11_dsi_compute_config(struct intel_encoder *encoder,
-				     struct intel_crtc_state *pipe_config,
-				     struct drm_connector_state *conn_state)
+static int gen11_dsi_compute_config(struct intel_encoder *encoder,
+				    struct intel_crtc_state *pipe_config,
+				    struct drm_connector_state *conn_state)
 {
 	struct intel_dsi *intel_dsi = container_of(encoder, struct intel_dsi,
 						   base);
@@ -1205,7 +1215,7 @@ static bool gen11_dsi_compute_config(struct intel_encoder *encoder,
 	pipe_config->clock_set = true;
 	pipe_config->port_clock = intel_dsi_bitrate(intel_dsi) / 5;
 
-	return true;
+	return 0;
 }
 
 static u64 gen11_dsi_get_power_domains(struct intel_encoder *encoder,
@@ -1229,13 +1239,15 @@ static bool gen11_dsi_get_hw_state(struct intel_encoder *encoder,
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	struct intel_dsi *intel_dsi = enc_to_intel_dsi(&encoder->base);
-	u32 tmp;
-	enum port port;
 	enum transcoder dsi_trans;
+	intel_wakeref_t wakeref;
+	enum port port;
 	bool ret = false;
+	u32 tmp;
 
-	if (!intel_display_power_get_if_enabled(dev_priv,
-						encoder->power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     encoder->power_domain);
+	if (!wakeref)
 		return false;
 
 	for_each_dsi_port(port, intel_dsi->ports) {
@@ -1260,7 +1272,7 @@ static bool gen11_dsi_get_hw_state(struct intel_encoder *encoder,
 		ret = tmp & PIPECONF_ENABLE;
 	}
 out:
-	intel_display_power_put(dev_priv, encoder->power_domain);
+	intel_display_power_put(dev_priv, encoder->power_domain, wakeref);
 	return ret;
 }
 
@@ -1378,6 +1390,7 @@ void icl_dsi_init(struct drm_i915_private *dev_priv)
 	encoder->disable = gen11_dsi_disable;
 	encoder->port = port;
 	encoder->get_config = gen11_dsi_get_config;
+	encoder->update_pipe = intel_panel_update_backlight;
 	encoder->compute_config = gen11_dsi_compute_config;
 	encoder->get_hw_state = gen11_dsi_get_hw_state;
 	encoder->type = INTEL_OUTPUT_DSI;
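
The disable path drains the stored wakerefs with fetch_and_zero(), the i915 take-and-clear idiom: read the slot, leave zero behind, and release only if a cookie was actually held, which turns a second disable into a harmless no-op. A sketch of the idiom in kernel-style GNU C (the macro mirrors the i915_utils.h helper; the usage around it is invented):

#include <stdio.h>

/* Take the value and leave 0 behind (GNU C statement expression). */
#define fetch_and_zero(ptr) ({			\
	__typeof__(*(ptr)) __val = *(ptr);	\
	*(ptr) = 0;				\
	__val;					\
})

int main(void)
{
	unsigned long io_wakeref = 42;	/* pretend cookie from a power get */
	unsigned long taken;

	taken = fetch_and_zero(&io_wakeref);
	if (taken)
		printf("releasing cookie %lu\n", taken);

	/* Second disable is a no-op: the slot is already zero. */
	if (!fetch_and_zero(&io_wakeref))
		puts("nothing to release");
	return 0;
}
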
diff --git a/drivers/gpu/drm/i915/intel_acpi.c b/drivers/gpu/drm/i915/intel_acpi.c
index 6ba478e57b9b..9d142d038a7d 100644
--- a/drivers/gpu/drm/i915/intel_acpi.c
+++ b/drivers/gpu/drm/i915/intel_acpi.c
@@ -6,7 +6,6 @@
  */
 #include <linux/pci.h>
 #include <linux/acpi.h>
-#include <drm/drmP.h>
 #include "i915_drv.h"
 
 #define INTEL_DSM_REVISION_ID 1 /* For Calpella anyway... */
diff --git a/drivers/gpu/drm/i915/intel_atomic.c b/drivers/gpu/drm/i915/intel_atomic.c
index 8cb02f28d30c..7cf9290ea34a 100644
--- a/drivers/gpu/drm/i915/intel_atomic.c
+++ b/drivers/gpu/drm/i915/intel_atomic.c
@@ -29,10 +29,11 @@
  * See intel_atomic_plane.c for the plane-specific atomic functionality.
  */
 
-#include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_fourcc.h>
 #include <drm/drm_plane_helper.h>
+
 #include "intel_drv.h"
 
 /**
@@ -47,7 +48,7 @@
 int intel_digital_connector_atomic_get_property(struct drm_connector *connector,
 						const struct drm_connector_state *state,
 						struct drm_property *property,
-						uint64_t *val)
+						u64 *val)
 {
 	struct drm_device *dev = connector->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
@@ -79,7 +80,7 @@ int intel_digital_connector_atomic_get_property(struct drm_connector *connector,
 int intel_digital_connector_atomic_set_property(struct drm_connector *connector,
 						struct drm_connector_state *state,
 						struct drm_property *property,
-						uint64_t val)
+						u64 val)
 {
 	struct drm_device *dev = connector->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
@@ -233,7 +234,7 @@ static void intel_atomic_setup_scaler(struct intel_crtc_scaler_state *scaler_sta
 	if (plane_state && plane_state->base.fb &&
 	    plane_state->base.fb->format->is_yuv &&
 	    plane_state->base.fb->format->num_planes > 1) {
-		if (IS_GEN9(dev_priv) &&
+		if (IS_GEN(dev_priv, 9) &&
 		    !IS_GEMINILAKE(dev_priv)) {
 			mode = SKL_PS_SCALER_MODE_NV12;
 		} else if (icl_is_hdr_plane(to_intel_plane(plane_state->base.plane))) {
diff --git a/drivers/gpu/drm/i915/intel_atomic_plane.c b/drivers/gpu/drm/i915/intel_atomic_plane.c
index 0a73e6e65c20..db0965904439 100644
--- a/drivers/gpu/drm/i915/intel_atomic_plane.c
+++ b/drivers/gpu/drm/i915/intel_atomic_plane.c
@@ -31,9 +31,10 @@
  * prepare/check/commit/cleanup steps.
  */
 
-#include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_fourcc.h>
 #include <drm/drm_plane_helper.h>
+
 #include "intel_drv.h"
 
 struct intel_plane *intel_plane_alloc(void)
@@ -111,41 +112,39 @@ intel_plane_destroy_state(struct drm_plane *plane,
 }
 
 int intel_plane_atomic_check_with_state(const struct intel_crtc_state *old_crtc_state,
-					struct intel_crtc_state *crtc_state,
+					struct intel_crtc_state *new_crtc_state,
 					const struct intel_plane_state *old_plane_state,
-					struct intel_plane_state *intel_state)
+					struct intel_plane_state *new_plane_state)
 {
-	struct drm_plane *plane = intel_state->base.plane;
-	struct drm_plane_state *state = &intel_state->base;
-	struct intel_plane *intel_plane = to_intel_plane(plane);
+	struct intel_plane *plane = to_intel_plane(new_plane_state->base.plane);
 	int ret;
 
-	crtc_state->active_planes &= ~BIT(intel_plane->id);
-	crtc_state->nv12_planes &= ~BIT(intel_plane->id);
-	intel_state->base.visible = false;
+	new_crtc_state->active_planes &= ~BIT(plane->id);
+	new_crtc_state->nv12_planes &= ~BIT(plane->id);
+	new_plane_state->base.visible = false;
 
-	/* If this is a cursor plane, no further checks are needed. */
-	if (!intel_state->base.crtc && !old_plane_state->base.crtc)
+	if (!new_plane_state->base.crtc && !old_plane_state->base.crtc)
 		return 0;
 
-	ret = intel_plane->check_plane(crtc_state, intel_state);
+	ret = plane->check_plane(new_crtc_state, new_plane_state);
 	if (ret)
 		return ret;
 
 	/* FIXME pre-g4x don't work like this */
-	if (state->visible)
-		crtc_state->active_planes |= BIT(intel_plane->id);
+	if (new_plane_state->base.visible)
+		new_crtc_state->active_planes |= BIT(plane->id);
 
-	if (state->visible && state->fb->format->format == DRM_FORMAT_NV12)
-		crtc_state->nv12_planes |= BIT(intel_plane->id);
+	if (new_plane_state->base.visible &&
+	    new_plane_state->base.fb->format->format == DRM_FORMAT_NV12)
+		new_crtc_state->nv12_planes |= BIT(plane->id);
 
-	if (state->visible || old_plane_state->base.visible)
-		crtc_state->update_planes |= BIT(intel_plane->id);
+	if (new_plane_state->base.visible || old_plane_state->base.visible)
+		new_crtc_state->update_planes |= BIT(plane->id);
 
 	return intel_plane_atomic_calc_changes(old_crtc_state,
-					       &crtc_state->base,
+					       &new_crtc_state->base,
 					       old_plane_state,
-					       state);
+					       &new_plane_state->base);
 }
 
 static int intel_plane_atomic_check(struct drm_plane *plane,
@@ -312,7 +311,7 @@ int
 intel_plane_atomic_get_property(struct drm_plane *plane,
 				const struct drm_plane_state *state,
 				struct drm_property *property,
-				uint64_t *val)
+				u64 *val)
 {
 	DRM_DEBUG_KMS("Unknown property [PROP:%d:%s]\n",
 		      property->base.id, property->name);
@@ -335,7 +334,7 @@ int
 intel_plane_atomic_set_property(struct drm_plane *plane,
 				struct drm_plane_state *state,
 				struct drm_property *property,
-				uint64_t val)
+				u64 val)
 {
 	DRM_DEBUG_KMS("Unknown property [PROP:%d:%s]\n",
 		      property->base.id, property->name);
diff --git a/drivers/gpu/drm/i915/intel_audio.c b/drivers/gpu/drm/i915/intel_audio.c
index b32681632f30..5104c6bbd66f 100644
--- a/drivers/gpu/drm/i915/intel_audio.c
+++ b/drivers/gpu/drm/i915/intel_audio.c
@@ -27,7 +27,6 @@
 #include <drm/intel_lpe_audio.h>
 #include "intel_drv.h"
 
-#include <drm/drmP.h>
 #include <drm/drm_edid.h>
 #include "i915_drv.h"
 
@@ -749,7 +748,8 @@ static void i915_audio_component_get_power(struct device *kdev)
 
 static void i915_audio_component_put_power(struct device *kdev)
 {
-	intel_display_power_put(kdev_to_i915(kdev), POWER_DOMAIN_AUDIO);
+	intel_display_power_put_unchecked(kdev_to_i915(kdev),
+					  POWER_DOMAIN_AUDIO);
 }
 
 static void i915_audio_component_codec_wake_override(struct device *kdev,
@@ -758,7 +758,7 @@ static void i915_audio_component_codec_wake_override(struct device *kdev,
 	struct drm_i915_private *dev_priv = kdev_to_i915(kdev);
 	u32 tmp;
 
-	if (!IS_GEN9(dev_priv))
+	if (!IS_GEN(dev_priv, 9))
 		return;
 
 	i915_audio_component_get_power(kdev);
diff --git a/drivers/gpu/drm/i915/intel_bios.c b/drivers/gpu/drm/i915/intel_bios.c
index 6d3e0260d49c..b508d8a735e0 100644
--- a/drivers/gpu/drm/i915/intel_bios.c
+++ b/drivers/gpu/drm/i915/intel_bios.c
@@ -26,7 +26,6 @@
  */
 
 #include <drm/drm_dp_helper.h>
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 
@@ -453,7 +452,7 @@ parse_sdvo_device_mapping(struct drm_i915_private *dev_priv, u8 bdb_version)
 	 * Only parse SDVO mappings on gens that could have SDVO. This isn't
 	 * accurate and doesn't have to be, as long as it's not too strict.
 	 */
-	if (!IS_GEN(dev_priv, 3, 7)) {
+	if (!IS_GEN_RANGE(dev_priv, 3, 7)) {
 		DRM_DEBUG_KMS("Skipping SDVO device mapping\n");
 		return;
 	}
@@ -1386,8 +1385,15 @@ static void parse_ddi_port(struct drm_i915_private *dev_priv, enum port port,
 	info->supports_dp = is_dp;
 	info->supports_edp = is_edp;
 
-	DRM_DEBUG_KMS("Port %c VBT info: DP:%d HDMI:%d DVI:%d EDP:%d CRT:%d\n",
-		      port_name(port), is_dp, is_hdmi, is_dvi, is_edp, is_crt);
+	if (bdb_version >= 195)
+		info->supports_typec_usb = child->dp_usb_type_c;
+
+	if (bdb_version >= 209)
+		info->supports_tbt = child->tbt;
+
+	DRM_DEBUG_KMS("Port %c VBT info: DP:%d HDMI:%d DVI:%d EDP:%d CRT:%d TCUSB:%d TBT:%d\n",
+		      port_name(port), is_dp, is_hdmi, is_dvi, is_edp, is_crt,
+		      info->supports_typec_usb, info->supports_tbt);
 
 	if (is_edp && is_dvi)
 		DRM_DEBUG_KMS("Internal DP port %c is TMDS compatible\n",
@@ -1657,6 +1663,13 @@ init_vbt_missing_defaults(struct drm_i915_private *dev_priv)
 		struct ddi_vbt_port_info *info =
 			&dev_priv->vbt.ddi_port_info[port];
 
+		/*
+		 * VBT has the TypeC mode (native, TBT/USB) and we don't
+		 * want to detect it.
+		 */
+		if (intel_port_is_tc(dev_priv, port))
+			continue;
+
 		info->supports_dvi = (port != PORT_A && port != PORT_E);
 		info->supports_hdmi = info->supports_dvi;
 		info->supports_dp = (port != PORT_E);
@@ -1940,6 +1953,15 @@ bool intel_bios_is_port_present(struct drm_i915_private *dev_priv, enum port por
 	};
 	int i;
 
+	if (HAS_DDI(dev_priv)) {
+		const struct ddi_vbt_port_info *port_info =
+			&dev_priv->vbt.ddi_port_info[port];
+
+		return port_info->supports_dp ||
+		       port_info->supports_dvi ||
+		       port_info->supports_hdmi;
+	}
+
 	/* FIXME maybe deal with port A as well? */
 	if (WARN_ON(port == PORT_A) || port >= ARRAY_SIZE(port_mapping))
 		return false;
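
parse_ddi_port() only reads the new Type-C and Thunderbolt child-device fields when the BDB version is recent enough to define them. The pattern in isolation, with the field names taken from the hunk above but the surrounding types reduced to hypothetical miniatures:

#include <stdbool.h>

/* Hypothetical miniatures of the VBT child device and DDI port info. */
struct child_device { unsigned int dp_usb_type_c:1, tbt:1; };
struct ddi_port_info { bool supports_typec_usb, supports_tbt; };

static void parse_port_caps(struct ddi_port_info *info,
			    const struct child_device *child,
			    int bdb_version)
{
	/*
	 * Fields added in later VBT revisions read as garbage on older
	 * BIOSes, so only trust them past the version that defined them.
	 */
	if (bdb_version >= 195)
		info->supports_typec_usb = child->dp_usb_type_c;

	if (bdb_version >= 209)
		info->supports_tbt = child->tbt;
}
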
diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index 447c5256f63a..cacaa1d04d17 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -29,180 +29,146 @@
 
 #define task_asleep(tsk) ((tsk)->state & TASK_NORMAL && !(tsk)->on_rq)
 
-static unsigned int __intel_breadcrumbs_wakeup(struct intel_breadcrumbs *b)
+static void irq_enable(struct intel_engine_cs *engine)
 {
-	struct intel_wait *wait;
-	unsigned int result = 0;
-
-	lockdep_assert_held(&b->irq_lock);
-
-	wait = b->irq_wait;
-	if (wait) {
-		/*
-		 * N.B. Since task_asleep() and ttwu are not atomic, the
-		 * waiter may actually go to sleep after the check, causing
-		 * us to suppress a valid wakeup. We prefer to reduce the
-		 * number of false positive missed_breadcrumb() warnings
-		 * at the expense of a few false negatives, as it it easy
-		 * to trigger a false positive under heavy load. Enough
-		 * signal should remain from genuine missed_breadcrumb()
-		 * for us to detect in CI.
-		 */
-		bool was_asleep = task_asleep(wait->tsk);
-
-		result = ENGINE_WAKEUP_WAITER;
-		if (wake_up_process(wait->tsk) && was_asleep)
-			result |= ENGINE_WAKEUP_ASLEEP;
-	}
+	if (!engine->irq_enable)
+		return;
 
-	return result;
+	/* Caller disables interrupts */
+	spin_lock(&engine->i915->irq_lock);
+	engine->irq_enable(engine);
+	spin_unlock(&engine->i915->irq_lock);
 }
 
-unsigned int intel_engine_wakeup(struct intel_engine_cs *engine)
+static void irq_disable(struct intel_engine_cs *engine)
 {
-	struct intel_breadcrumbs *b = &engine->breadcrumbs;
-	unsigned long flags;
-	unsigned int result;
-
-	spin_lock_irqsave(&b->irq_lock, flags);
-	result = __intel_breadcrumbs_wakeup(b);
-	spin_unlock_irqrestore(&b->irq_lock, flags);
-
-	return result;
-}
+	if (!engine->irq_disable)
+		return;
 
-static unsigned long wait_timeout(void)
-{
-	return round_jiffies_up(jiffies + DRM_I915_HANGCHECK_JIFFIES);
+	/* Caller disables interrupts */
+	spin_lock(&engine->i915->irq_lock);
+	engine->irq_disable(engine);
+	spin_unlock(&engine->i915->irq_lock);
 }
 
-static noinline void missed_breadcrumb(struct intel_engine_cs *engine)
+static void __intel_breadcrumbs_disarm_irq(struct intel_breadcrumbs *b)
 {
-	if (GEM_SHOW_DEBUG()) {
-		struct drm_printer p = drm_debug_printer(__func__);
+	lockdep_assert_held(&b->irq_lock);
 
-		intel_engine_dump(engine, &p,
-				  "%s missed breadcrumb at %pS\n",
-				  engine->name, __builtin_return_address(0));
-	}
+	GEM_BUG_ON(!b->irq_enabled);
+	if (!--b->irq_enabled)
+		irq_disable(container_of(b,
+					 struct intel_engine_cs,
+					 breadcrumbs));
 
-	set_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
+	b->irq_armed = false;
 }
 
-static void intel_breadcrumbs_hangcheck(struct timer_list *t)
+void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine)
 {
-	struct intel_engine_cs *engine =
-		from_timer(engine, t, breadcrumbs.hangcheck);
 	struct intel_breadcrumbs *b = &engine->breadcrumbs;
-	unsigned int irq_count;
 
 	if (!b->irq_armed)
 		return;
 
-	irq_count = READ_ONCE(b->irq_count);
-	if (b->hangcheck_interrupts != irq_count) {
-		b->hangcheck_interrupts = irq_count;
-		mod_timer(&b->hangcheck, wait_timeout());
-		return;
-	}
+	spin_lock_irq(&b->irq_lock);
+	if (b->irq_armed)
+		__intel_breadcrumbs_disarm_irq(b);
+	spin_unlock_irq(&b->irq_lock);
+}
 
-	/* We keep the hangcheck timer alive until we disarm the irq, even
-	 * if there are no waiters at present.
-	 *
-	 * If the waiter was currently running, assume it hasn't had a chance
-	 * to process the pending interrupt (e.g, low priority task on a loaded
-	 * system) and wait until it sleeps before declaring a missed interrupt.
-	 *
-	 * If the waiter was asleep (and not even pending a wakeup), then we
-	 * must have missed an interrupt as the GPU has stopped advancing
-	 * but we still have a waiter. Assuming all batches complete within
-	 * DRM_I915_HANGCHECK_JIFFIES [1.5s]!
-	 */
-	if (intel_engine_wakeup(engine) & ENGINE_WAKEUP_ASLEEP) {
-		missed_breadcrumb(engine);
-		mod_timer(&b->fake_irq, jiffies + 1);
-	} else {
-		mod_timer(&b->hangcheck, wait_timeout());
-	}
+static inline bool __request_completed(const struct i915_request *rq)
+{
+	return i915_seqno_passed(__hwsp_seqno(rq), rq->fence.seqno);
 }
 
-static void intel_breadcrumbs_fake_irq(struct timer_list *t)
+bool intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
 {
-	struct intel_engine_cs *engine =
-		from_timer(engine, t, breadcrumbs.fake_irq);
 	struct intel_breadcrumbs *b = &engine->breadcrumbs;
+	struct intel_context *ce, *cn;
+	struct list_head *pos, *next;
+	LIST_HEAD(signal);
 
-	/*
-	 * The timer persists in case we cannot enable interrupts,
-	 * or if we have previously seen seqno/interrupt incoherency
-	 * ("missed interrupt" syndrome, better known as a "missed breadcrumb").
-	 * Here the worker will wake up every jiffie in order to kick the
-	 * oldest waiter to do the coherent seqno check.
-	 */
+	spin_lock(&b->irq_lock);
 
-	spin_lock_irq(&b->irq_lock);
-	if (b->irq_armed && !__intel_breadcrumbs_wakeup(b))
-		__intel_engine_disarm_breadcrumbs(engine);
-	spin_unlock_irq(&b->irq_lock);
-	if (!b->irq_armed)
-		return;
+	if (b->irq_armed && list_empty(&b->signalers))
+		__intel_breadcrumbs_disarm_irq(b);
 
-	/* If the user has disabled the fake-irq, restore the hangchecking */
-	if (!test_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings)) {
-		mod_timer(&b->hangcheck, wait_timeout());
-		return;
-	}
+	list_for_each_entry_safe(ce, cn, &b->signalers, signal_link) {
+		GEM_BUG_ON(list_empty(&ce->signals));
 
-	mod_timer(&b->fake_irq, jiffies + 1);
-}
+		list_for_each_safe(pos, next, &ce->signals) {
+			struct i915_request *rq =
+				list_entry(pos, typeof(*rq), signal_link);
 
-static void irq_enable(struct intel_engine_cs *engine)
-{
-	/*
-	 * FIXME: Ideally we want this on the API boundary, but for the
-	 * sake of testing with mock breadcrumbs (no HW so unable to
-	 * enable irqs) we place it deep within the bowels, at the point
-	 * of no return.
-	 */
-	GEM_BUG_ON(!intel_irqs_enabled(engine->i915));
+			if (!__request_completed(rq))
+				break;
 
-	/* Enabling the IRQ may miss the generation of the interrupt, but
-	 * we still need to force the barrier before reading the seqno,
-	 * just in case.
-	 */
-	set_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
+			GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_SIGNAL,
+					     &rq->fence.flags));
+			clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
 
-	/* Caller disables interrupts */
-	if (engine->irq_enable) {
-		spin_lock(&engine->i915->irq_lock);
-		engine->irq_enable(engine);
-		spin_unlock(&engine->i915->irq_lock);
+			/*
+			 * We may race with direct invocation of
+			 * dma_fence_signal(), e.g. i915_request_retire(),
+			 * in which case we can skip processing it ourselves.
+			 */
+			if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
+				     &rq->fence.flags))
+				continue;
+
+			/*
+			 * Queue for execution after dropping the signaling
+			 * spinlock as the callback chain may end up adding
+			 * more signalers to the same context or engine.
+			 */
+			i915_request_get(rq);
+			list_add_tail(&rq->signal_link, &signal);
+		}
+
+		/*
+		 * We process the list deletion in bulk, only using a list_add
+		 * (not list_move) above but keeping the status of
+		 * rq->signal_link known with the I915_FENCE_FLAG_SIGNAL bit.
+		 */
+		if (!list_is_first(pos, &ce->signals)) {
+			/* Advance the list to the first incomplete request */
+			__list_del_many(&ce->signals, pos);
+			if (&ce->signals == pos) /* now empty */
+				list_del_init(&ce->signal_link);
+		}
 	}
-}
 
-static void irq_disable(struct intel_engine_cs *engine)
-{
-	/* Caller disables interrupts */
-	if (engine->irq_disable) {
-		spin_lock(&engine->i915->irq_lock);
-		engine->irq_disable(engine);
-		spin_unlock(&engine->i915->irq_lock);
+	spin_unlock(&b->irq_lock);
+
+	list_for_each_safe(pos, next, &signal) {
+		struct i915_request *rq =
+			list_entry(pos, typeof(*rq), signal_link);
+
+		dma_fence_signal(&rq->fence);
+		i915_request_put(rq);
 	}
+
+	return !list_empty(&signal);
 }
 
-void __intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine)
+bool intel_engine_signal_breadcrumbs(struct intel_engine_cs *engine)
 {
-	struct intel_breadcrumbs *b = &engine->breadcrumbs;
+	bool result;
 
-	lockdep_assert_held(&b->irq_lock);
-	GEM_BUG_ON(b->irq_wait);
-	GEM_BUG_ON(!b->irq_armed);
+	local_irq_disable();
+	result = intel_engine_breadcrumbs_irq(engine);
+	local_irq_enable();
 
-	GEM_BUG_ON(!b->irq_enabled);
-	if (!--b->irq_enabled)
-		irq_disable(engine);
+	return result;
+}
 
-	b->irq_armed = false;
+static void signal_irq_work(struct irq_work *work)
+{
+	struct intel_engine_cs *engine =
+		container_of(work, typeof(*engine), breadcrumbs.irq_work);
+
+	intel_engine_breadcrumbs_irq(engine);
 }
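
signal_irq_work() is the new bottom half: instead of a dedicated signaler kthread, the engine interrupt queues an irq_work so fence processing runs in interrupt context right after the hardirq. A sketch of the irq_work idiom, using the real <linux/irq_work.h> API around a hypothetical engine type:

#include <linux/irq_work.h>
#include <linux/kernel.h>

/* Hypothetical engine wrapper; the real one is intel_engine_cs. */
struct engine {
	struct irq_work irq_work;
};

static void engine_signal_work(struct irq_work *work)
{
	struct engine *e = container_of(work, struct engine, irq_work);

	/* Runs in interrupt context shortly after the queueing hardirq
	 * returns; a safe place for the list walk and dma_fence_signal(). */
	(void)e;
}

static void engine_init(struct engine *e)
{
	init_irq_work(&e->irq_work, engine_signal_work);
}

static void engine_user_interrupt(struct engine *e)
{
	/* Keep the hardirq handler minimal: just kick the bottom half. */
	irq_work_queue(&e->irq_work);
}
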
 
 void intel_engine_pin_breadcrumbs_irq(struct intel_engine_cs *engine)
@@ -227,666 +193,155 @@ void intel_engine_unpin_breadcrumbs_irq(struct intel_engine_cs *engine)
 	spin_unlock_irq(&b->irq_lock);
 }
 
-void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine)
-{
-	struct intel_breadcrumbs *b = &engine->breadcrumbs;
-	struct intel_wait *wait, *n;
-
-	if (!b->irq_armed)
-		return;
-
-	/*
-	 * We only disarm the irq when we are idle (all requests completed),
-	 * so if the bottom-half remains asleep, it missed the request
-	 * completion.
-	 */
-	if (intel_engine_wakeup(engine) & ENGINE_WAKEUP_ASLEEP)
-		missed_breadcrumb(engine);
-
-	spin_lock_irq(&b->rb_lock);
-
-	spin_lock(&b->irq_lock);
-	b->irq_wait = NULL;
-	if (b->irq_armed)
-		__intel_engine_disarm_breadcrumbs(engine);
-	spin_unlock(&b->irq_lock);
-
-	rbtree_postorder_for_each_entry_safe(wait, n, &b->waiters, node) {
-		GEM_BUG_ON(!intel_engine_signaled(engine, wait->seqno));
-		RB_CLEAR_NODE(&wait->node);
-		wake_up_process(wait->tsk);
-	}
-	b->waiters = RB_ROOT;
-
-	spin_unlock_irq(&b->rb_lock);
-}
-
-static bool use_fake_irq(const struct intel_breadcrumbs *b)
-{
-	const struct intel_engine_cs *engine =
-		container_of(b, struct intel_engine_cs, breadcrumbs);
-
-	if (!test_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings))
-		return false;
-
-	/*
-	 * Only start with the heavy weight fake irq timer if we have not
-	 * seen any interrupts since enabling it the first time. If the
-	 * interrupts are still arriving, it means we made a mistake in our
-	 * engine->seqno_barrier(), a timing error that should be transient
-	 * and unlikely to reoccur.
-	 */
-	return READ_ONCE(b->irq_count) == b->hangcheck_interrupts;
-}
-
-static void enable_fake_irq(struct intel_breadcrumbs *b)
-{
-	/* Ensure we never sleep indefinitely */
-	if (!b->irq_enabled || use_fake_irq(b))
-		mod_timer(&b->fake_irq, jiffies + 1);
-	else
-		mod_timer(&b->hangcheck, wait_timeout());
-}
-
-static bool __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
+static void __intel_breadcrumbs_arm_irq(struct intel_breadcrumbs *b)
 {
 	struct intel_engine_cs *engine =
 		container_of(b, struct intel_engine_cs, breadcrumbs);
-	struct drm_i915_private *i915 = engine->i915;
-	bool enabled;
 
 	lockdep_assert_held(&b->irq_lock);
 	if (b->irq_armed)
-		return false;
+		return;
 
-	/* The breadcrumb irq will be disarmed on the interrupt after the
+	/*
+	 * The breadcrumb irq will be disarmed on the interrupt after the
 	 * waiters are signaled. This gives us a single interrupt window in
 	 * which we can add a new waiter and avoid the cost of re-enabling
 	 * the irq.
 	 */
 	b->irq_armed = true;
 
-	if (I915_SELFTEST_ONLY(b->mock)) {
-		/* For our mock objects we want to avoid interaction
-		 * with the real hardware (which is not set up). So
-		 * we simply pretend we have enabled the powerwell
-		 * and the irq, and leave it up to the mock
-		 * implementation to call intel_engine_wakeup()
-		 * itself when it wants to simulate a user interrupt,
-		 */
-		return true;
-	}
-
-	/* Since we are waiting on a request, the GPU should be busy
+	/*
+	 * Since we are waiting on a request, the GPU should be busy
 	 * and should have its own rpm reference. This is tracked
 	 * by i915->gt.awake, so we can forgo holding our own wakeref
 	 * for the interrupt, as we disarm the breadcrumbs before
 	 * i915->gt.awake is released (when the driver is idle).
 	 */
 
-	/* No interrupts? Kick the waiter every jiffie! */
-	enabled = false;
-	if (!b->irq_enabled++ &&
-	    !test_bit(engine->id, &i915->gpu_error.test_irq_rings)) {
+	if (!b->irq_enabled++)
 		irq_enable(engine);
-		enabled = true;
-	}
-
-	enable_fake_irq(b);
-	return enabled;
-}
-
-static inline struct intel_wait *to_wait(struct rb_node *node)
-{
-	return rb_entry(node, struct intel_wait, node);
-}
-
-static inline void __intel_breadcrumbs_finish(struct intel_breadcrumbs *b,
-					      struct intel_wait *wait)
-{
-	lockdep_assert_held(&b->rb_lock);
-	GEM_BUG_ON(b->irq_wait == wait);
-
-	/*
-	 * This request is completed, so remove it from the tree, mark it as
-	 * complete, and *then* wake up the associated task. N.B. when the
-	 * task wakes up, it will find the empty rb_node, discern that it
-	 * has already been removed from the tree and skip the serialisation
-	 * of the b->rb_lock and b->irq_lock. This means that the destruction
-	 * of the intel_wait is not serialised with the interrupt handler
-	 * by the waiter - it must instead be serialised by the caller.
-	 */
-	rb_erase(&wait->node, &b->waiters);
-	RB_CLEAR_NODE(&wait->node);
-
-	if (wait->tsk->state != TASK_RUNNING)
-		wake_up_process(wait->tsk); /* implicit smp_wmb() */
-}
-
-static inline void __intel_breadcrumbs_next(struct intel_engine_cs *engine,
-					    struct rb_node *next)
-{
-	struct intel_breadcrumbs *b = &engine->breadcrumbs;
-
-	spin_lock(&b->irq_lock);
-	GEM_BUG_ON(!b->irq_armed);
-	GEM_BUG_ON(!b->irq_wait);
-	b->irq_wait = to_wait(next);
-	spin_unlock(&b->irq_lock);
-
-	/* We always wake up the next waiter that takes over as the bottom-half
-	 * as we may delegate not only the irq-seqno barrier to the next waiter
-	 * but also the task of waking up concurrent waiters.
-	 */
-	if (next)
-		wake_up_process(to_wait(next)->tsk);
 }
 
-static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
-				    struct intel_wait *wait)
+void intel_engine_init_breadcrumbs(struct intel_engine_cs *engine)
 {
 	struct intel_breadcrumbs *b = &engine->breadcrumbs;
-	struct rb_node **p, *parent, *completed;
-	bool first, armed;
-	u32 seqno;
-
-	GEM_BUG_ON(!wait->seqno);
-
-	/* Insert the request into the retirement ordered list
-	 * of waiters by walking the rbtree. If we are the oldest
-	 * seqno in the tree (the first to be retired), then
-	 * set ourselves as the bottom-half.
-	 *
-	 * As we descend the tree, prune completed branches since we hold the
-	 * spinlock we know that the first_waiter must be delayed and can
-	 * reduce some of the sequential wake up latency if we take action
-	 * ourselves and wake up the completed tasks in parallel. Also, by
-	 * removing stale elements in the tree, we may be able to reduce the
-	 * ping-pong between the old bottom-half and ourselves as first-waiter.
-	 */
-	armed = false;
-	first = true;
-	parent = NULL;
-	completed = NULL;
-	seqno = intel_engine_get_seqno(engine);
-
-	 /* If the request completed before we managed to grab the spinlock,
-	  * return now before adding ourselves to the rbtree. We let the
-	  * current bottom-half handle any pending wakeups and instead
-	  * try and get out of the way quickly.
-	  */
-	if (i915_seqno_passed(seqno, wait->seqno)) {
-		RB_CLEAR_NODE(&wait->node);
-		return first;
-	}
-
-	p = &b->waiters.rb_node;
-	while (*p) {
-		parent = *p;
-		if (wait->seqno == to_wait(parent)->seqno) {
-			/* We have multiple waiters on the same seqno, select
-			 * the highest priority task (that with the smallest
-			 * task->prio) to serve as the bottom-half for this
-			 * group.
-			 */
-			if (wait->tsk->prio > to_wait(parent)->tsk->prio) {
-				p = &parent->rb_right;
-				first = false;
-			} else {
-				p = &parent->rb_left;
-			}
-		} else if (i915_seqno_passed(wait->seqno,
-					     to_wait(parent)->seqno)) {
-			p = &parent->rb_right;
-			if (i915_seqno_passed(seqno, to_wait(parent)->seqno))
-				completed = parent;
-			else
-				first = false;
-		} else {
-			p = &parent->rb_left;
-		}
-	}
-	rb_link_node(&wait->node, parent, p);
-	rb_insert_color(&wait->node, &b->waiters);
-
-	if (first) {
-		spin_lock(&b->irq_lock);
-		b->irq_wait = wait;
-		/* After assigning ourselves as the new bottom-half, we must
-		 * perform a cursory check to prevent a missed interrupt.
-		 * Either we miss the interrupt whilst programming the hardware,
-		 * or if there was a previous waiter (for a later seqno) they
-		 * may be woken instead of us (due to the inherent race
-		 * in the unlocked read of b->irq_seqno_bh in the irq handler)
-		 * and so we miss the wake up.
-		 */
-		armed = __intel_breadcrumbs_enable_irq(b);
-		spin_unlock(&b->irq_lock);
-	}
 
-	if (completed) {
-		/* Advance the bottom-half (b->irq_wait) before we wake up
-		 * the waiters who may scribble over their intel_wait
-		 * just as the interrupt handler is dereferencing it via
-		 * b->irq_wait.
-		 */
-		if (!first) {
-			struct rb_node *next = rb_next(completed);
-			GEM_BUG_ON(next == &wait->node);
-			__intel_breadcrumbs_next(engine, next);
-		}
-
-		do {
-			struct intel_wait *crumb = to_wait(completed);
-			completed = rb_prev(completed);
-			__intel_breadcrumbs_finish(b, crumb);
-		} while (completed);
-	}
-
-	GEM_BUG_ON(!b->irq_wait);
-	GEM_BUG_ON(!b->irq_armed);
-	GEM_BUG_ON(rb_first(&b->waiters) != &b->irq_wait->node);
+	spin_lock_init(&b->irq_lock);
+	INIT_LIST_HEAD(&b->signalers);
 
-	return armed;
+	init_irq_work(&b->irq_work, signal_irq_work);
 }
 
-bool intel_engine_add_wait(struct intel_engine_cs *engine,
-			   struct intel_wait *wait)
+void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
 {
 	struct intel_breadcrumbs *b = &engine->breadcrumbs;
-	bool armed;
-
-	spin_lock_irq(&b->rb_lock);
-	armed = __intel_engine_add_wait(engine, wait);
-	spin_unlock_irq(&b->rb_lock);
-	if (armed)
-		return armed;
-
-	/* Make the caller recheck if its request has already started. */
-	return intel_engine_has_started(engine, wait->seqno);
-}
+	unsigned long flags;
 
-static inline bool chain_wakeup(struct rb_node *rb, int priority)
-{
-	return rb && to_wait(rb)->tsk->prio <= priority;
-}
+	spin_lock_irqsave(&b->irq_lock, flags);
 
-static inline int wakeup_priority(struct intel_breadcrumbs *b,
-				  struct task_struct *tsk)
-{
-	if (tsk == b->signaler)
-		return INT_MIN;
+	if (b->irq_enabled)
+		irq_enable(engine);
 	else
-		return tsk->prio;
-}
-
-static void __intel_engine_remove_wait(struct intel_engine_cs *engine,
-				       struct intel_wait *wait)
-{
-	struct intel_breadcrumbs *b = &engine->breadcrumbs;
-
-	lockdep_assert_held(&b->rb_lock);
-
-	if (RB_EMPTY_NODE(&wait->node))
-		goto out;
-
-	if (b->irq_wait == wait) {
-		const int priority = wakeup_priority(b, wait->tsk);
-		struct rb_node *next;
-
-		/* We are the current bottom-half. Find the next candidate,
-		 * the first waiter in the queue on the remaining oldest
-		 * request. As multiple seqnos may complete in the time it
-		 * takes us to wake up and find the next waiter, we have to
-		 * wake up that waiter for it to perform its own coherent
-		 * completion check.
-		 */
-		next = rb_next(&wait->node);
-		if (chain_wakeup(next, priority)) {
-			/* If the next waiter is already complete,
-			 * wake it up and continue onto the next waiter. So
-			 * if have a small herd, they will wake up in parallel
-			 * rather than sequentially, which should reduce
-			 * the overall latency in waking all the completed
-			 * clients.
-			 *
-			 * However, waking up a chain adds extra latency to
-			 * the first_waiter. This is undesirable if that
-			 * waiter is a high priority task.
-			 */
-			u32 seqno = intel_engine_get_seqno(engine);
-
-			while (i915_seqno_passed(seqno, to_wait(next)->seqno)) {
-				struct rb_node *n = rb_next(next);
-
-				__intel_breadcrumbs_finish(b, to_wait(next));
-				next = n;
-				if (!chain_wakeup(next, priority))
-					break;
-			}
-		}
-
-		__intel_breadcrumbs_next(engine, next);
-	} else {
-		GEM_BUG_ON(rb_first(&b->waiters) == &wait->node);
-	}
-
-	GEM_BUG_ON(RB_EMPTY_NODE(&wait->node));
-	rb_erase(&wait->node, &b->waiters);
-	RB_CLEAR_NODE(&wait->node);
+		irq_disable(engine);
 
-out:
-	GEM_BUG_ON(b->irq_wait == wait);
-	GEM_BUG_ON(rb_first(&b->waiters) !=
-		   (b->irq_wait ? &b->irq_wait->node : NULL));
+	spin_unlock_irqrestore(&b->irq_lock, flags);
 }
 
-void intel_engine_remove_wait(struct intel_engine_cs *engine,
-			      struct intel_wait *wait)
+void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine)
 {
-	struct intel_breadcrumbs *b = &engine->breadcrumbs;
-
-	/* Quick check to see if this waiter was already decoupled from
-	 * the tree by the bottom-half to avoid contention on the spinlock
-	 * by the herd.
-	 */
-	if (RB_EMPTY_NODE(&wait->node)) {
-		GEM_BUG_ON(READ_ONCE(b->irq_wait) == wait);
-		return;
-	}
-
-	spin_lock_irq(&b->rb_lock);
-	__intel_engine_remove_wait(engine, wait);
-	spin_unlock_irq(&b->rb_lock);
 }
 
-static void signaler_set_rtpriority(void)
+bool i915_request_enable_breadcrumb(struct i915_request *rq)
 {
-	 struct sched_param param = { .sched_priority = 1 };
-
-	 sched_setscheduler_nocheck(current, SCHED_FIFO, &param);
-}
+	struct intel_breadcrumbs *b = &rq->engine->breadcrumbs;
 
-static int intel_breadcrumbs_signaler(void *arg)
-{
-	struct intel_engine_cs *engine = arg;
-	struct intel_breadcrumbs *b = &engine->breadcrumbs;
-	struct i915_request *rq, *n;
+	GEM_BUG_ON(test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags));
 
-	/* Install ourselves with high priority to reduce signalling latency */
-	signaler_set_rtpriority();
+	if (!test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags))
+		return true;
 
-	do {
-		bool do_schedule = true;
-		LIST_HEAD(list);
-		u32 seqno;
+	spin_lock(&b->irq_lock);
+	if (test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags) &&
+	    !__request_completed(rq)) {
+		struct intel_context *ce = rq->hw_context;
+		struct list_head *pos;
 
-		set_current_state(TASK_INTERRUPTIBLE);
-		if (list_empty(&b->signals))
-			goto sleep;
+		__intel_breadcrumbs_arm_irq(b);
 
 		/*
-		 * We are either woken up by the interrupt bottom-half,
-		 * or by a client adding a new signaller. In both cases,
-		 * the GPU seqno may have advanced beyond our oldest signal.
-		 * If it has, propagate the signal, remove the waiter and
-		 * check again with the next oldest signal. Otherwise we
-		 * need to wait for a new interrupt from the GPU or for
-		 * a new client.
+		 * We keep the seqno in retirement order, so we can break
+		 * inside intel_engine_breadcrumbs_irq as soon as we've passed
+		 * the last completed request (or seen a request that hasn't
+		 * even started). We could iterate the timeline->requests list,
+		 * but keeping a separate signalers_list has the advantage of
+		 * hopefully being much smaller than the full list, and so
+		 * provides faster iteration and detection when there are no
+		 * more interrupts required for this context.
+		 *
+		 * We typically expect to add new signalers in order, so we
+		 * start looking for our insertion point from the tail of
+		 * the list.
 		 */
-		seqno = intel_engine_get_seqno(engine);
-
-		spin_lock_irq(&b->rb_lock);
-		list_for_each_entry_safe(rq, n, &b->signals, signaling.link) {
-			u32 this = rq->signaling.wait.seqno;
-
-			GEM_BUG_ON(!rq->signaling.wait.seqno);
-
-			if (!i915_seqno_passed(seqno, this))
-				break;
-
-			if (likely(this == i915_request_global_seqno(rq))) {
-				__intel_engine_remove_wait(engine,
-							   &rq->signaling.wait);
+		list_for_each_prev(pos, &ce->signals) {
+			struct i915_request *it =
+				list_entry(pos, typeof(*it), signal_link);
 
-				rq->signaling.wait.seqno = 0;
-				__list_del_entry(&rq->signaling.link);
-
-				if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
-					      &rq->fence.flags)) {
-					list_add_tail(&rq->signaling.link,
-						      &list);
-					i915_request_get(rq);
-				}
-			}
-		}
-		spin_unlock_irq(&b->rb_lock);
-
-		if (!list_empty(&list)) {
-			local_bh_disable();
-			list_for_each_entry_safe(rq, n, &list, signaling.link) {
-				dma_fence_signal(&rq->fence);
-				GEM_BUG_ON(!i915_request_completed(rq));
-				i915_request_put(rq);
-			}
-			local_bh_enable(); /* kick start the tasklets */
-
-			/*
-			 * If the engine is saturated we may be continually
-			 * processing completed requests. This angers the
-			 * NMI watchdog if we never let anything else
-			 * have access to the CPU. Let's pretend to be nice
-			 * and relinquish the CPU if we burn through the
-			 * entire RT timeslice!
-			 */
-			do_schedule = need_resched();
-		}
-
-		if (unlikely(do_schedule)) {
-			/* Before we sleep, check for a missed seqno */
-			if (current->state & TASK_NORMAL &&
-			    !list_empty(&b->signals) &&
-			    engine->irq_seqno_barrier &&
-			    test_and_clear_bit(ENGINE_IRQ_BREADCRUMB,
-					       &engine->irq_posted)) {
-				engine->irq_seqno_barrier(engine);
-				intel_engine_wakeup(engine);
-			}
-
-sleep:
-			if (kthread_should_park())
-				kthread_parkme();
-
-			if (unlikely(kthread_should_stop()))
+			if (i915_seqno_passed(rq->fence.seqno, it->fence.seqno))
 				break;
-
-			schedule();
 		}
-	} while (1);
-	__set_current_state(TASK_RUNNING);
+		list_add(&rq->signal_link, pos);
+		if (pos == &ce->signals) /* catch transitions from empty list */
+			list_move_tail(&ce->signal_link, &b->signalers);
 
-	return 0;
-}
-
-static void insert_signal(struct intel_breadcrumbs *b,
-			  struct i915_request *request,
-			  const u32 seqno)
-{
-	struct i915_request *iter;
-
-	lockdep_assert_held(&b->rb_lock);
-
-	/*
-	 * A reasonable assumption is that we are called to add signals
-	 * in sequence, as the requests are submitted for execution and
-	 * assigned a global_seqno. This will be the case for the majority
-	 * of internally generated signals (inter-engine signaling).
-	 *
-	 * Out of order waiters triggering random signaling enabling will
-	 * be more problematic, but hopefully rare enough and the list
-	 * small enough that the O(N) insertion sort is not an issue.
-	 */
-
-	list_for_each_entry_reverse(iter, &b->signals, signaling.link)
-		if (i915_seqno_passed(seqno, iter->signaling.wait.seqno))
-			break;
-
-	list_add(&request->signaling.link, &iter->signaling.link);
-}
-
-bool intel_engine_enable_signaling(struct i915_request *request, bool wakeup)
-{
-	struct intel_engine_cs *engine = request->engine;
-	struct intel_breadcrumbs *b = &engine->breadcrumbs;
-	struct intel_wait *wait = &request->signaling.wait;
-	u32 seqno;
-
-	/*
-	 * Note that we may be called from an interrupt handler on another
-	 * device (e.g. nouveau signaling a fence completion causing us
-	 * to submit a request, and so enable signaling). As such,
-	 * we need to make sure that all other users of b->rb_lock protect
-	 * against interrupts, i.e. use spin_lock_irqsave.
-	 */
-
-	/* locked by dma_fence_enable_sw_signaling() (irqsafe fence->lock) */
-	GEM_BUG_ON(!irqs_disabled());
-	lockdep_assert_held(&request->lock);
-
-	seqno = i915_request_global_seqno(request);
-	if (!seqno) /* will be enabled later upon execution */
-		return true;
-
-	GEM_BUG_ON(wait->seqno);
-	wait->tsk = b->signaler;
-	wait->request = request;
-	wait->seqno = seqno;
-
-	/*
-	 * Add ourselves into the list of waiters, but registering our
-	 * bottom-half as the signaller thread. As per usual, only the oldest
-	 * waiter (not just signaller) is tasked as the bottom-half waking
-	 * up all completed waiters after the user interrupt.
-	 *
-	 * If we are the oldest waiter, enable the irq (after which we
-	 * must double check that the seqno did not complete).
-	 */
-	spin_lock(&b->rb_lock);
-	insert_signal(b, request, seqno);
-	wakeup &= __intel_engine_add_wait(engine, wait);
-	spin_unlock(&b->rb_lock);
-
-	if (wakeup) {
-		wake_up_process(b->signaler);
-		return !intel_wait_complete(wait);
+		set_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
 	}
+	spin_unlock(&b->irq_lock);
 
-	return true;
+	return !__request_completed(rq);
 }
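
__request_completed() and the insertion search above both lean on i915_seqno_passed(), the wraparound-safe seqno comparison. The well-known idiom is a signed subtraction; a sketch (the in-tree helper compares u32 seqnos even though fence.seqno is now 64-bit):

#include <stdbool.h>
#include <stdint.h>

/*
 * Seqno "a" has passed "b" when the signed distance from b to a is
 * non-negative; correct across u32 wraparound as long as the two
 * values stay within 2^31 of each other.
 */
static inline bool seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/* e.g. seqno_passed(1, 0xffffffff) is true: 1 follows the wrap. */
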
 
-void intel_engine_cancel_signaling(struct i915_request *request)
+void i915_request_cancel_breadcrumb(struct i915_request *rq)
 {
-	struct intel_engine_cs *engine = request->engine;
-	struct intel_breadcrumbs *b = &engine->breadcrumbs;
+	struct intel_breadcrumbs *b = &rq->engine->breadcrumbs;
 
-	GEM_BUG_ON(!irqs_disabled());
-	lockdep_assert_held(&request->lock);
-
-	if (!READ_ONCE(request->signaling.wait.seqno))
+	if (!test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags))
 		return;
 
-	spin_lock(&b->rb_lock);
-	__intel_engine_remove_wait(engine, &request->signaling.wait);
-	if (fetch_and_zero(&request->signaling.wait.seqno))
-		__list_del_entry(&request->signaling.link);
-	spin_unlock(&b->rb_lock);
-}
-
-int intel_engine_init_breadcrumbs(struct intel_engine_cs *engine)
-{
-	struct intel_breadcrumbs *b = &engine->breadcrumbs;
-	struct task_struct *tsk;
-
-	spin_lock_init(&b->rb_lock);
-	spin_lock_init(&b->irq_lock);
-
-	timer_setup(&b->fake_irq, intel_breadcrumbs_fake_irq, 0);
-	timer_setup(&b->hangcheck, intel_breadcrumbs_hangcheck, 0);
-
-	INIT_LIST_HEAD(&b->signals);
-
-	/* Spawn a thread to provide a common bottom-half for all signals.
-	 * As this is an asynchronous interface we cannot steal the current
-	 * task for handling the bottom-half to the user interrupt, therefore
-	 * we create a thread to do the coherent seqno dance after the
-	 * interrupt and then signal the waitqueue (via the dma-buf/fence).
-	 */
-	tsk = kthread_run(intel_breadcrumbs_signaler, engine,
-			  "i915/signal:%d", engine->id);
-	if (IS_ERR(tsk))
-		return PTR_ERR(tsk);
-
-	b->signaler = tsk;
-
-	return 0;
-}
-
-static void cancel_fake_irq(struct intel_engine_cs *engine)
-{
-	struct intel_breadcrumbs *b = &engine->breadcrumbs;
-
-	del_timer_sync(&b->fake_irq); /* may queue b->hangcheck */
-	del_timer_sync(&b->hangcheck);
-	clear_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
-}
-
-void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine)
-{
-	struct intel_breadcrumbs *b = &engine->breadcrumbs;
-	unsigned long flags;
-
-	spin_lock_irqsave(&b->irq_lock, flags);
-
-	/*
-	 * Leave the fake_irq timer enabled (if it is running), but clear the
-	 * bit so that it turns itself off on its next wake up and goes back
-	 * to the long hangcheck interval if still required.
-	 */
-	clear_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
-
-	if (b->irq_enabled)
-		irq_enable(engine);
-	else
-		irq_disable(engine);
+	spin_lock(&b->irq_lock);
+	if (test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags)) {
+		struct intel_context *ce = rq->hw_context;
 
-	/*
-	 * We set the IRQ_BREADCRUMB bit when we enable the irq presuming the
-	 * GPU is active and may have already executed the MI_USER_INTERRUPT
-	 * before the CPU is ready to receive. However, the engine is currently
-	 * idle (we haven't started it yet), there is no possibility for a
-	 * missed interrupt as we enabled the irq and so we can clear the
-	 * immediate wakeup (until a real interrupt arrives for the waiter).
-	 */
-	clear_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
+		list_del(&rq->signal_link);
+		if (list_empty(&ce->signals))
+			list_del_init(&ce->signal_link);
 
-	spin_unlock_irqrestore(&b->irq_lock, flags);
+		clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
+	}
+	spin_unlock(&b->irq_lock);
 }
 
-void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine)
+void intel_engine_print_breadcrumbs(struct intel_engine_cs *engine,
+				    struct drm_printer *p)
 {
 	struct intel_breadcrumbs *b = &engine->breadcrumbs;
+	struct intel_context *ce;
+	struct i915_request *rq;
 
-	/* The engines should be idle and all requests accounted for! */
-	WARN_ON(READ_ONCE(b->irq_wait));
-	WARN_ON(!RB_EMPTY_ROOT(&b->waiters));
-	WARN_ON(!list_empty(&b->signals));
+	if (list_empty(&b->signalers))
+		return;
 
-	if (!IS_ERR_OR_NULL(b->signaler))
-		kthread_stop(b->signaler);
+	drm_printf(p, "Signals:\n");
 
-	cancel_fake_irq(engine);
+	spin_lock_irq(&b->irq_lock);
+	list_for_each_entry(ce, &b->signalers, signal_link) {
+		list_for_each_entry(rq, &ce->signals, signal_link) {
+			drm_printf(p, "\t[%llx:%llx%s] @ %dms\n",
+				   rq->fence.context, rq->fence.seqno,
+				   i915_request_completed(rq) ? "!" :
+				   i915_request_started(rq) ? "*" :
+				   "",
+				   jiffies_to_msecs(jiffies - rq->emitted_jiffies));
+		}
+	}
+	spin_unlock_irq(&b->irq_lock);
 }
-
-#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-#include "selftests/intel_breadcrumbs.c"
-#endif
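
Taken together, the rewritten breadcrumbs code follows a collect-then-signal shape: completed requests are moved to a local list under b->irq_lock, and dma_fence_signal() only runs after the lock is dropped, because fence callbacks may add new signalers and re-take the lock. A generic sketch of the pattern with hypothetical item types:

#include <linux/list.h>
#include <linux/spinlock.h>

struct item {
	struct list_head link;
	bool done;
	void (*callback)(struct item *);
};

static void drain_completed(spinlock_t *lock, struct list_head *pending)
{
	struct item *it, *next;
	LIST_HEAD(completed);

	/* Phase 1: detach completed items while holding the lock. */
	spin_lock(lock);
	list_for_each_entry_safe(it, next, pending, link)
		if (it->done)
			list_move_tail(&it->link, &completed);
	spin_unlock(lock);

	/* Phase 2: run callbacks outside the lock; they may take it again. */
	list_for_each_entry_safe(it, next, &completed, link)
		it->callback(it);
}
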
diff --git a/drivers/gpu/drm/i915/intel_cdclk.c b/drivers/gpu/drm/i915/intel_cdclk.c
index 25e3aba9cded..15ba950dee00 100644
--- a/drivers/gpu/drm/i915/intel_cdclk.c
+++ b/drivers/gpu/drm/i915/intel_cdclk.c
@@ -218,7 +218,7 @@ static unsigned int intel_hpll_vco(struct drm_i915_private *dev_priv)
 	};
 	const unsigned int *vco_table;
 	unsigned int vco;
-	uint8_t tmp = 0;
+	u8 tmp = 0;
 
 	/* FIXME other chipsets? */
 	if (IS_GM45(dev_priv))
@@ -249,13 +249,13 @@ static void g33_get_cdclk(struct drm_i915_private *dev_priv,
 			  struct intel_cdclk_state *cdclk_state)
 {
 	struct pci_dev *pdev = dev_priv->drm.pdev;
-	static const uint8_t div_3200[] = { 12, 10,  8,  7, 5, 16 };
-	static const uint8_t div_4000[] = { 14, 12, 10,  8, 6, 20 };
-	static const uint8_t div_4800[] = { 20, 14, 12, 10, 8, 24 };
-	static const uint8_t div_5333[] = { 20, 16, 12, 12, 8, 28 };
-	const uint8_t *div_table;
+	static const u8 div_3200[] = { 12, 10,  8,  7, 5, 16 };
+	static const u8 div_4000[] = { 14, 12, 10,  8, 6, 20 };
+	static const u8 div_4800[] = { 20, 14, 12, 10, 8, 24 };
+	static const u8 div_5333[] = { 20, 16, 12, 12, 8, 28 };
+	const u8 *div_table;
 	unsigned int cdclk_sel;
-	uint16_t tmp = 0;
+	u16 tmp = 0;
 
 	cdclk_state->vco = intel_hpll_vco(dev_priv);
 
@@ -330,12 +330,12 @@ static void i965gm_get_cdclk(struct drm_i915_private *dev_priv,
 			     struct intel_cdclk_state *cdclk_state)
 {
 	struct pci_dev *pdev = dev_priv->drm.pdev;
-	static const uint8_t div_3200[] = { 16, 10,  8 };
-	static const uint8_t div_4000[] = { 20, 12, 10 };
-	static const uint8_t div_5333[] = { 24, 16, 14 };
-	const uint8_t *div_table;
+	static const u8 div_3200[] = { 16, 10,  8 };
+	static const u8 div_4000[] = { 20, 12, 10 };
+	static const u8 div_5333[] = { 24, 16, 14 };
+	const u8 *div_table;
 	unsigned int cdclk_sel;
-	uint16_t tmp = 0;
+	u16 tmp = 0;
 
 	cdclk_state->vco = intel_hpll_vco(dev_priv);
 
@@ -375,7 +375,7 @@ static void gm45_get_cdclk(struct drm_i915_private *dev_priv,
 {
 	struct pci_dev *pdev = dev_priv->drm.pdev;
 	unsigned int cdclk_sel;
-	uint16_t tmp = 0;
+	u16 tmp = 0;
 
 	cdclk_state->vco = intel_hpll_vco(dev_priv);
 
@@ -403,8 +403,8 @@ static void gm45_get_cdclk(struct drm_i915_private *dev_priv,
 static void hsw_get_cdclk(struct drm_i915_private *dev_priv,
 			  struct intel_cdclk_state *cdclk_state)
 {
-	uint32_t lcpll = I915_READ(LCPLL_CTL);
-	uint32_t freq = lcpll & LCPLL_CLK_FREQ_MASK;
+	u32 lcpll = I915_READ(LCPLL_CTL);
+	u32 freq = lcpll & LCPLL_CLK_FREQ_MASK;
 
 	if (lcpll & LCPLL_CD_SOURCE_FCLK)
 		cdclk_state->cdclk = 800000;
@@ -520,6 +520,7 @@ static void vlv_set_cdclk(struct drm_i915_private *dev_priv,
 {
 	int cdclk = cdclk_state->cdclk;
 	u32 val, cmd = cdclk_state->voltage_level;
+	intel_wakeref_t wakeref;
 
 	switch (cdclk) {
 	case 400000:
@@ -539,7 +540,7 @@ static void vlv_set_cdclk(struct drm_i915_private *dev_priv,
 	 * a system suspend.  So grab the PIPE-A domain, which covers
 	 * the HW blocks needed for the following programming.
 	 */
-	intel_display_power_get(dev_priv, POWER_DOMAIN_PIPE_A);
+	wakeref = intel_display_power_get(dev_priv, POWER_DOMAIN_PIPE_A);
 
 	mutex_lock(&dev_priv->pcu_lock);
 	val = vlv_punit_read(dev_priv, PUNIT_REG_DSPFREQ);
@@ -593,7 +594,7 @@ static void vlv_set_cdclk(struct drm_i915_private *dev_priv,
 
 	vlv_program_pfi_credits(dev_priv);
 
-	intel_display_power_put(dev_priv, POWER_DOMAIN_PIPE_A);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_PIPE_A, wakeref);
 }
 
 static void chv_set_cdclk(struct drm_i915_private *dev_priv,
@@ -601,6 +602,7 @@ static void chv_set_cdclk(struct drm_i915_private *dev_priv,
 {
 	int cdclk = cdclk_state->cdclk;
 	u32 val, cmd = cdclk_state->voltage_level;
+	intel_wakeref_t wakeref;
 
 	switch (cdclk) {
 	case 333333:
@@ -619,7 +621,7 @@ static void chv_set_cdclk(struct drm_i915_private *dev_priv,
 	 * a system suspend.  So grab the PIPE-A domain, which covers
 	 * the HW blocks needed for the following programming.
 	 */
-	intel_display_power_get(dev_priv, POWER_DOMAIN_PIPE_A);
+	wakeref = intel_display_power_get(dev_priv, POWER_DOMAIN_PIPE_A);
 
 	mutex_lock(&dev_priv->pcu_lock);
 	val = vlv_punit_read(dev_priv, PUNIT_REG_DSPFREQ);
@@ -637,7 +639,7 @@ static void chv_set_cdclk(struct drm_i915_private *dev_priv,
 
 	vlv_program_pfi_credits(dev_priv);
 
-	intel_display_power_put(dev_priv, POWER_DOMAIN_PIPE_A);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_PIPE_A, wakeref);
 }
 
 static int bdw_calc_cdclk(int min_cdclk)
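
Both vlv_set_cdclk() and chv_set_cdclk() now thread an intel_wakeref_t cookie from intel_display_power_get() to the matching put; unpaired puts use the _unchecked variant seen in the intel_audio.c hunk earlier. A sketch of the cookie idea, assuming a debug build records each acquisition so leaks can name their owner (types hypothetical):

#include <stdint.h>

typedef uintptr_t wakeref_t;	/* opaque cookie identifying one 'get' */

struct power_domain { int count; };

static wakeref_t power_get(struct power_domain *pd)
{
	pd->count++;
	/* A debug build would record a unique cookie here (e.g. a
	 * stack-trace handle) so that leaks identify their acquirer. */
	return (wakeref_t)pd;
}

static void power_put(struct power_domain *pd, wakeref_t wakeref)
{
	(void)wakeref;	/* debug builds pair this with a live get */
	pd->count--;
}
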
@@ -670,8 +672,8 @@ static u8 bdw_calc_voltage_level(int cdclk)
 static void bdw_get_cdclk(struct drm_i915_private *dev_priv,
 			  struct intel_cdclk_state *cdclk_state)
 {
-	uint32_t lcpll = I915_READ(LCPLL_CTL);
-	uint32_t freq = lcpll & LCPLL_CLK_FREQ_MASK;
+	u32 lcpll = I915_READ(LCPLL_CTL);
+	u32 freq = lcpll & LCPLL_CLK_FREQ_MASK;
 
 	if (lcpll & LCPLL_CD_SOURCE_FCLK)
 		cdclk_state->cdclk = 800000;
@@ -698,7 +700,7 @@ static void bdw_set_cdclk(struct drm_i915_private *dev_priv,
 			  const struct intel_cdclk_state *cdclk_state)
 {
 	int cdclk = cdclk_state->cdclk;
-	uint32_t val;
+	u32 val;
 	int ret;
 
 	if (WARN((I915_READ(LCPLL_CTL) &
@@ -1081,7 +1083,7 @@ static void skl_set_cdclk(struct drm_i915_private *dev_priv,
 
 static void skl_sanitize_cdclk(struct drm_i915_private *dev_priv)
 {
-	uint32_t cdctl, expected;
+	u32 cdctl, expected;
 
 	/*
 	 * check if the pre-os initialized the display
@@ -2140,7 +2142,7 @@ static int intel_pixel_rate_to_cdclk(struct drm_i915_private *dev_priv,
 {
 	if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv))
 		return DIV_ROUND_UP(pixel_rate, 2);
-	else if (IS_GEN9(dev_priv) ||
+	else if (IS_GEN(dev_priv, 9) ||
 		 IS_BROADWELL(dev_priv) || IS_HASWELL(dev_priv))
 		return pixel_rate;
 	else if (IS_CHERRYVIEW(dev_priv))
@@ -2176,7 +2178,7 @@ int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
 		if (IS_CANNONLAKE(dev_priv) || IS_GEMINILAKE(dev_priv)) {
 			/* Display WA #1145: glk,cnl */
 			min_cdclk = max(316800, min_cdclk);
-		} else if (IS_GEN9(dev_priv) || IS_BROADWELL(dev_priv)) {
+		} else if (IS_GEN(dev_priv, 9) || IS_BROADWELL(dev_priv)) {
 			/* Display WA #1144: skl,bxt */
 			min_cdclk = max(432000, min_cdclk);
 		}
@@ -2537,7 +2539,7 @@ static int intel_compute_max_dotclk(struct drm_i915_private *dev_priv)
 
 	if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv))
 		return 2 * max_cdclk_freq;
-	else if (IS_GEN9(dev_priv) ||
+	else if (IS_GEN(dev_priv, 9) ||
 		 IS_BROADWELL(dev_priv) || IS_HASWELL(dev_priv))
 		return max_cdclk_freq;
 	else if (IS_CHERRYVIEW(dev_priv))
@@ -2688,7 +2690,7 @@ static int vlv_hrawclk(struct drm_i915_private *dev_priv)
 
 static int g4x_hrawclk(struct drm_i915_private *dev_priv)
 {
-	uint32_t clkcfg;
+	u32 clkcfg;
 
 	/* hrawclock is 1/4 the FSB frequency */
 	clkcfg = I915_READ(CLKCFG);
@@ -2785,9 +2787,9 @@ void intel_init_cdclk_hooks(struct drm_i915_private *dev_priv)
 		dev_priv->display.get_cdclk = hsw_get_cdclk;
 	else if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
 		dev_priv->display.get_cdclk = vlv_get_cdclk;
-	else if (IS_GEN6(dev_priv) || IS_IVYBRIDGE(dev_priv))
+	else if (IS_GEN(dev_priv, 6) || IS_IVYBRIDGE(dev_priv))
 		dev_priv->display.get_cdclk = fixed_400mhz_get_cdclk;
-	else if (IS_GEN5(dev_priv))
+	else if (IS_GEN(dev_priv, 5))
 		dev_priv->display.get_cdclk = fixed_450mhz_get_cdclk;
 	else if (IS_GM45(dev_priv))
 		dev_priv->display.get_cdclk = gm45_get_cdclk;
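
The g33/i965gm readout helpers derive cdclk by dividing the HPLL VCO frequency by a table entry selected from a strap register. A hedged sketch of the derivation, reusing the div_3200[] values from the hunk above:

#include <stdint.h>

#define DIV_ROUND_CLOSEST(x, d) (((x) + (d) / 2) / (d))

/*
 * Sketch: cdclk is the HPLL VCO divided by a per-VCO table entry; the
 * divider values are the div_3200[] row from the hunk above. Whether
 * the driver rounds exactly this way is an assumption of this sketch.
 */
static int g33_cdclk_khz(unsigned int vco_khz, unsigned int cdclk_sel)
{
	static const uint8_t div_3200[] = { 12, 10, 8, 7, 5, 16 };

	if (cdclk_sel >= sizeof(div_3200) / sizeof(div_3200[0]))
		return -1; /* the real code warns and picks a fallback */

	return DIV_ROUND_CLOSEST(vco_khz, div_3200[cdclk_sel]);
}
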
diff --git a/drivers/gpu/drm/i915/intel_color.c b/drivers/gpu/drm/i915/intel_color.c
index 5127da286a2b..71a1f12c6b2a 100644
--- a/drivers/gpu/drm/i915/intel_color.c
+++ b/drivers/gpu/drm/i915/intel_color.c
@@ -74,12 +74,17 @@
 #define ILK_CSC_COEFF_1_0		\
 	((7 << 12) | ILK_CSC_COEFF_FP(CTM_COEFF_1_0, 8))
 
-static bool crtc_state_is_legacy_gamma(struct drm_crtc_state *state)
+static bool lut_is_legacy(const struct drm_property_blob *lut)
 {
-	return !state->degamma_lut &&
-		!state->ctm &&
-		state->gamma_lut &&
-		drm_color_lut_size(state->gamma_lut) == LEGACY_LUT_LENGTH;
+	return drm_color_lut_size(lut) == LEGACY_LUT_LENGTH;
+}
+
+static bool crtc_state_is_legacy_gamma(const struct intel_crtc_state *crtc_state)
+{
+	return !crtc_state->base.degamma_lut &&
+		!crtc_state->base.ctm &&
+		crtc_state->base.gamma_lut &&
+		lut_is_legacy(crtc_state->base.gamma_lut);
 }
 
 /*
@@ -108,10 +113,10 @@ static u64 *ctm_mult_by_limited(u64 *result, const u64 *input)
 	return result;
 }
 
-static void ilk_load_ycbcr_conversion_matrix(struct intel_crtc *intel_crtc)
+static void ilk_load_ycbcr_conversion_matrix(struct intel_crtc *crtc)
 {
-	int pipe = intel_crtc->pipe;
-	struct drm_i915_private *dev_priv = to_i915(intel_crtc->base.dev);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	enum pipe pipe = crtc->pipe;
 
 	I915_WRITE(PIPE_CSC_PREOFF_HI(pipe), 0);
 	I915_WRITE(PIPE_CSC_PREOFF_ME(pipe), 0);
@@ -132,29 +137,28 @@ static void ilk_load_ycbcr_conversion_matrix(struct intel_crtc *intel_crtc)
 	I915_WRITE(PIPE_CSC_MODE(pipe), 0);
 }
 
-static void ilk_load_csc_matrix(struct drm_crtc_state *crtc_state)
+static void ilk_load_csc_matrix(const struct intel_crtc_state *crtc_state)
 {
-	struct drm_crtc *crtc = crtc_state->crtc;
-	struct drm_i915_private *dev_priv = to_i915(crtc->dev);
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	int i, pipe = intel_crtc->pipe;
-	uint16_t coeffs[9] = { 0, };
-	struct intel_crtc_state *intel_crtc_state = to_intel_crtc_state(crtc_state);
+	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 	bool limited_color_range = false;
+	enum pipe pipe = crtc->pipe;
+	u16 coeffs[9] = {};
+	int i;
 
 	/*
 	 * FIXME if there's a gamma LUT after the CSC, we should
 	 * do the range compression using the gamma LUT instead.
 	 */
 	if (INTEL_GEN(dev_priv) >= 8 || IS_HASWELL(dev_priv))
-		limited_color_range = intel_crtc_state->limited_color_range;
+		limited_color_range = crtc_state->limited_color_range;
 
-	if (intel_crtc_state->output_format == INTEL_OUTPUT_FORMAT_YCBCR420 ||
-	    intel_crtc_state->output_format == INTEL_OUTPUT_FORMAT_YCBCR444) {
-		ilk_load_ycbcr_conversion_matrix(intel_crtc);
+	if (crtc_state->output_format == INTEL_OUTPUT_FORMAT_YCBCR420 ||
+	    crtc_state->output_format == INTEL_OUTPUT_FORMAT_YCBCR444) {
+		ilk_load_ycbcr_conversion_matrix(crtc);
 		return;
-	} else if (crtc_state->ctm) {
-		struct drm_color_ctm *ctm = crtc_state->ctm->data;
+	} else if (crtc_state->base.ctm) {
+		struct drm_color_ctm *ctm = crtc_state->base.ctm->data;
 		const u64 *input;
 		u64 temp[9];
 
@@ -168,7 +172,7 @@ static void ilk_load_csc_matrix(struct drm_crtc_state *crtc_state)
 		 * hardware.
 		 */
 		for (i = 0; i < ARRAY_SIZE(coeffs); i++) {
-			uint64_t abs_coeff = ((1ULL << 63) - 1) & input[i];
+			u64 abs_coeff = ((1ULL << 63) - 1) & input[i];
 
 			/*
 			 * Clamp input value to min/max supported by
@@ -230,7 +234,7 @@ static void ilk_load_csc_matrix(struct drm_crtc_state *crtc_state)
 	I915_WRITE(PIPE_CSC_PREOFF_LO(pipe), 0);
 
 	if (INTEL_GEN(dev_priv) > 6) {
-		uint16_t postoff = 0;
+		u16 postoff = 0;
 
 		if (limited_color_range)
 			postoff = (16 * (1 << 12) / 255) & 0x1fff;
@@ -241,7 +245,7 @@ static void ilk_load_csc_matrix(struct drm_crtc_state *crtc_state)
 
 		I915_WRITE(PIPE_CSC_MODE(pipe), 0);
 	} else {
-		uint32_t mode = CSC_MODE_YUV_TO_RGB;
+		u32 mode = CSC_MODE_YUV_TO_RGB;
 
 		if (limited_color_range)
 			mode |= CSC_BLACK_SCREEN_OFFSET;
@@ -253,21 +257,20 @@ static void ilk_load_csc_matrix(struct drm_crtc_state *crtc_state)
 /*
  * Set up the pipe CSC unit on CherryView.
  */
-static void cherryview_load_csc_matrix(struct drm_crtc_state *state)
+static void cherryview_load_csc_matrix(const struct intel_crtc_state *crtc_state)
 {
-	struct drm_crtc *crtc = state->crtc;
-	struct drm_device *dev = crtc->dev;
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	int pipe = to_intel_crtc(crtc)->pipe;
-	uint32_t mode;
-
-	if (state->ctm) {
-		struct drm_color_ctm *ctm = state->ctm->data;
-		uint16_t coeffs[9] = { 0, };
+	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	enum pipe pipe = crtc->pipe;
+	u32 mode;
+
+	if (crtc_state->base.ctm) {
+		const struct drm_color_ctm *ctm = crtc_state->base.ctm->data;
+		u16 coeffs[9] = {};
 		int i;
 
 		for (i = 0; i < ARRAY_SIZE(coeffs); i++) {
-			uint64_t abs_coeff =
+			u64 abs_coeff =
 				((1ULL << 63) - 1) & ctm->matrix[i];
 
 			/* Round coefficient. */
@@ -293,35 +296,24 @@ static void cherryview_load_csc_matrix(struct drm_crtc_state *state)
 		I915_WRITE(CGM_PIPE_CSC_COEFF8(pipe), coeffs[8]);
 	}
 
-	mode = (state->ctm ? CGM_PIPE_MODE_CSC : 0);
-	if (!crtc_state_is_legacy_gamma(state)) {
-		mode |= (state->degamma_lut ? CGM_PIPE_MODE_DEGAMMA : 0) |
-			(state->gamma_lut ? CGM_PIPE_MODE_GAMMA : 0);
+	mode = (crtc_state->base.ctm ? CGM_PIPE_MODE_CSC : 0);
+	if (!crtc_state_is_legacy_gamma(crtc_state)) {
+		mode |= (crtc_state->base.degamma_lut ? CGM_PIPE_MODE_DEGAMMA : 0) |
+			(crtc_state->base.gamma_lut ? CGM_PIPE_MODE_GAMMA : 0);
 	}
 	I915_WRITE(CGM_PIPE_MODE(pipe), mode);
 }
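
Both CSC loaders start from drm_color_ctm coefficients, which DRM defines as S31.32 sign-magnitude fixed point: bit 63 carries the sign and the remaining 63 bits the magnitude, hence the ((1ULL << 63) - 1) mask above. A sketch of splitting a coefficient before scaling it to the hardware format:

#include <stdbool.h>
#include <stdint.h>

struct s3132 {
	bool negative;
	uint32_t integer;	/* |coeff| integer part */
	uint32_t fraction;	/* |coeff| fractional part, /2^32 */
};

/* drm_color_ctm entries are S31.32 sign-magnitude fixed point. */
static struct s3132 ctm_split(uint64_t coeff)
{
	uint64_t abs = coeff & ~(1ULL << 63);	/* drop the sign bit */

	return (struct s3132){
		.negative = coeff >> 63,
		.integer  = abs >> 32,
		.fraction = (uint32_t)abs,
	};
}
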
 
-void intel_color_set_csc(struct drm_crtc_state *crtc_state)
-{
-	struct drm_device *dev = crtc_state->crtc->dev;
-	struct drm_i915_private *dev_priv = to_i915(dev);
-
-	if (dev_priv->display.load_csc_matrix)
-		dev_priv->display.load_csc_matrix(crtc_state);
-}
-
 /* Loads the legacy palette/gamma unit for the CRTC. */
-static void i9xx_load_luts_internal(struct drm_crtc *crtc,
-				    struct drm_property_blob *blob,
-				    struct intel_crtc_state *crtc_state)
+static void i9xx_load_luts_internal(const struct intel_crtc_state *crtc_state,
+				    const struct drm_property_blob *blob)
 {
-	struct drm_device *dev = crtc->dev;
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	enum pipe pipe = intel_crtc->pipe;
+	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	enum pipe pipe = crtc->pipe;
 	int i;
 
-	if (HAS_GMCH_DISPLAY(dev_priv)) {
+	if (HAS_GMCH(dev_priv)) {
 		if (intel_crtc_has_type(crtc_state, INTEL_OUTPUT_DSI))
 			assert_dsi_pll_enabled(dev_priv);
 		else
@@ -329,23 +321,24 @@ static void i9xx_load_luts_internal(struct drm_crtc *crtc,
 	}
 
 	if (blob) {
-		struct drm_color_lut *lut = blob->data;
+		const struct drm_color_lut *lut = blob->data;
+
 		for (i = 0; i < 256; i++) {
-			uint32_t word =
+			u32 word =
 				(drm_color_lut_extract(lut[i].red, 8) << 16) |
 				(drm_color_lut_extract(lut[i].green, 8) << 8) |
 				drm_color_lut_extract(lut[i].blue, 8);
 
-			if (HAS_GMCH_DISPLAY(dev_priv))
+			if (HAS_GMCH(dev_priv))
 				I915_WRITE(PALETTE(pipe, i), word);
 			else
 				I915_WRITE(LGC_PALETTE(pipe, i), word);
 		}
 	} else {
 		for (i = 0; i < 256; i++) {
-			uint32_t word = (i << 16) | (i << 8) | i;
+			u32 word = (i << 16) | (i << 8) | i;
 
-			if (HAS_GMCH_DISPLAY(dev_priv))
+			if (HAS_GMCH(dev_priv))
 				I915_WRITE(PALETTE(pipe, i), word);
 			else
 				I915_WRITE(LGC_PALETTE(pipe, i), word);
@@ -353,56 +346,37 @@ static void i9xx_load_luts_internal(struct drm_crtc *crtc,
 	}
 }
 
-static void i9xx_load_luts(struct drm_crtc_state *crtc_state)
+static void i9xx_load_luts(const struct intel_crtc_state *crtc_state)
 {
-	i9xx_load_luts_internal(crtc_state->crtc, crtc_state->gamma_lut,
-				to_intel_crtc_state(crtc_state));
+	i9xx_load_luts_internal(crtc_state, crtc_state->base.gamma_lut);
 }
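
i9xx_load_luts_internal() packs each palette entry into one 8:8:8 register word with drm_color_lut_extract(), which scales a 16-bit LUT channel down to hardware precision with rounding. A sketch of that helper and the packing (approximate; see drm_color_mgmt.h for the canonical version):

#include <stdint.h>

/*
 * Approximate sketch of drm_color_lut_extract(): scale a 16-bit LUT
 * channel down to @bits of precision with rounding (assumes bits < 16).
 */
static uint32_t lut_extract(uint16_t chan, unsigned int bits)
{
	uint32_t max = (1u << bits) - 1;
	uint32_t val = chan;

	val += 1u << (16 - bits - 1);	/* round to nearest */
	val >>= 16 - bits;

	return val > max ? max : val;
}

/* Pack an R:G:B triple into the 8:8:8 legacy palette register word. */
static uint32_t pack_lgc_palette(uint16_t r, uint16_t g, uint16_t b)
{
	return (lut_extract(r, 8) << 16) |
	       (lut_extract(g, 8) << 8) |
	       lut_extract(b, 8);
}
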
 
-/* Loads the legacy palette/gamma unit for the CRTC on Haswell. */
-static void haswell_load_luts(struct drm_crtc_state *crtc_state)
+static void hsw_color_commit(const struct intel_crtc_state *crtc_state)
 {
-	struct drm_crtc *crtc = crtc_state->crtc;
-	struct drm_device *dev = crtc->dev;
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	struct intel_crtc_state *intel_crtc_state =
-		to_intel_crtc_state(crtc_state);
-	bool reenable_ips = false;
+	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 
-	/*
-	 * Workaround : Do not read or write the pipe palette/gamma data while
-	 * GAMMA_MODE is configured for split gamma and IPS_CTL has IPS enabled.
-	 */
-	if (IS_HASWELL(dev_priv) && intel_crtc_state->ips_enabled &&
-	    (intel_crtc_state->gamma_mode == GAMMA_MODE_MODE_SPLIT)) {
-		hsw_disable_ips(intel_crtc_state);
-		reenable_ips = true;
-	}
+	I915_WRITE(GAMMA_MODE(crtc->pipe), crtc_state->gamma_mode);
 
-	intel_crtc_state->gamma_mode = GAMMA_MODE_MODE_8BIT;
-	I915_WRITE(GAMMA_MODE(intel_crtc->pipe), GAMMA_MODE_MODE_8BIT);
-
-	i9xx_load_luts(crtc_state);
-
-	if (reenable_ips)
-		hsw_enable_ips(intel_crtc_state);
+	ilk_load_csc_matrix(crtc_state);
 }
 
-static void bdw_load_degamma_lut(struct drm_crtc_state *state)
+static void bdw_load_degamma_lut(const struct intel_crtc_state *crtc_state)
 {
-	struct drm_i915_private *dev_priv = to_i915(state->crtc->dev);
-	enum pipe pipe = to_intel_crtc(state->crtc)->pipe;
-	uint32_t i, lut_size = INTEL_INFO(dev_priv)->color.degamma_lut_size;
+	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	const struct drm_property_blob *degamma_lut = crtc_state->base.degamma_lut;
+	u32 i, lut_size = INTEL_INFO(dev_priv)->color.degamma_lut_size;
+	enum pipe pipe = crtc->pipe;
 
 	I915_WRITE(PREC_PAL_INDEX(pipe),
 		   PAL_PREC_SPLIT_MODE | PAL_PREC_AUTO_INCREMENT);
 
-	if (state->degamma_lut) {
-		struct drm_color_lut *lut = state->degamma_lut->data;
+	if (degamma_lut) {
+		const struct drm_color_lut *lut = degamma_lut->data;
 
 		for (i = 0; i < lut_size; i++) {
-			uint32_t word =
+			u32 word =
 			drm_color_lut_extract(lut[i].red, 10) << 20 |
 			drm_color_lut_extract(lut[i].green, 10) << 10 |
 			drm_color_lut_extract(lut[i].blue, 10);
@@ -411,7 +385,7 @@ static void bdw_load_degamma_lut(struct drm_crtc_state *state)
 		}
 	} else {
 		for (i = 0; i < lut_size; i++) {
-			uint32_t v = (i * ((1 << 10) - 1)) / (lut_size - 1);
+			u32 v = (i * ((1 << 10) - 1)) / (lut_size - 1);
 
 			I915_WRITE(PREC_PAL_DATA(pipe),
 				   (v << 20) | (v << 10) | v);
@@ -419,11 +393,13 @@ static void bdw_load_degamma_lut(struct drm_crtc_state *state)
 	}
 }
 
-static void bdw_load_gamma_lut(struct drm_crtc_state *state, u32 offset)
+static void bdw_load_gamma_lut(const struct intel_crtc_state *crtc_state, u32 offset)
 {
-	struct drm_i915_private *dev_priv = to_i915(state->crtc->dev);
-	enum pipe pipe = to_intel_crtc(state->crtc)->pipe;
-	uint32_t i, lut_size = INTEL_INFO(dev_priv)->color.gamma_lut_size;
+	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	const struct drm_property_blob *gamma_lut = crtc_state->base.gamma_lut;
+	u32 i, lut_size = INTEL_INFO(dev_priv)->color.gamma_lut_size;
+	enum pipe pipe = crtc->pipe;
 
 	WARN_ON(offset & ~PAL_PREC_INDEX_VALUE_MASK);
 
@@ -432,11 +408,11 @@ static void bdw_load_gamma_lut(struct drm_crtc_state *state, u32 offset)
 		   PAL_PREC_AUTO_INCREMENT |
 		   offset);
 
-	if (state->gamma_lut) {
-		struct drm_color_lut *lut = state->gamma_lut->data;
+	if (gamma_lut) {
+		const struct drm_color_lut *lut = gamma_lut->data;
 
 		for (i = 0; i < lut_size; i++) {
-			uint32_t word =
+			u32 word =
 			(drm_color_lut_extract(lut[i].red, 10) << 20) |
 			(drm_color_lut_extract(lut[i].green, 10) << 10) |
 			drm_color_lut_extract(lut[i].blue, 10);
@@ -454,7 +430,7 @@ static void bdw_load_gamma_lut(struct drm_crtc_state *state, u32 offset)
 			   drm_color_lut_extract(lut[i].blue, 16));
 	} else {
 		for (i = 0; i < lut_size; i++) {
-			uint32_t v = (i * ((1 << 10) - 1)) / (lut_size - 1);
+			u32 v = (i * ((1 << 10) - 1)) / (lut_size - 1);
 
 			I915_WRITE(PREC_PAL_DATA(pipe),
 				   (v << 20) | (v << 10) | v);
@@ -467,38 +443,34 @@ static void bdw_load_gamma_lut(struct drm_crtc_state *state, u32 offset)
 }
 
 /* Loads the palette/gamma unit for the CRTC on Broadwell+. */
-static void broadwell_load_luts(struct drm_crtc_state *state)
+static void broadwell_load_luts(const struct intel_crtc_state *crtc_state)
 {
-	struct drm_i915_private *dev_priv = to_i915(state->crtc->dev);
-	struct intel_crtc_state *intel_state = to_intel_crtc_state(state);
-	enum pipe pipe = to_intel_crtc(state->crtc)->pipe;
-
-	if (crtc_state_is_legacy_gamma(state)) {
-		haswell_load_luts(state);
-		return;
-	}
+	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	enum pipe pipe = crtc->pipe;
 
-	bdw_load_degamma_lut(state);
-	bdw_load_gamma_lut(state,
-			   INTEL_INFO(dev_priv)->color.degamma_lut_size);
-
-	intel_state->gamma_mode = GAMMA_MODE_MODE_SPLIT;
-	I915_WRITE(GAMMA_MODE(pipe), GAMMA_MODE_MODE_SPLIT);
-	POSTING_READ(GAMMA_MODE(pipe));
+	if (crtc_state_is_legacy_gamma(crtc_state)) {
+		i9xx_load_luts(crtc_state);
+	} else {
+		bdw_load_degamma_lut(crtc_state);
+		bdw_load_gamma_lut(crtc_state,
+				   INTEL_INFO(dev_priv)->color.degamma_lut_size);
 
-	/*
-	 * Reset the index, otherwise it prevents the legacy palette to be
-	 * written properly.
-	 */
-	I915_WRITE(PREC_PAL_INDEX(pipe), 0);
+		/*
+		 * Reset the index, otherwise it prevents the legacy palette
+		 * from being written properly.
+		 */
+		I915_WRITE(PREC_PAL_INDEX(pipe), 0);
+	}
 }
 
-static void glk_load_degamma_lut(struct drm_crtc_state *state)
+static void glk_load_degamma_lut(const struct intel_crtc_state *crtc_state)
 {
-	struct drm_i915_private *dev_priv = to_i915(state->crtc->dev);
-	enum pipe pipe = to_intel_crtc(state->crtc)->pipe;
-	const uint32_t lut_size = 33;
-	uint32_t i;
+	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	enum pipe pipe = crtc->pipe;
+	const u32 lut_size = 33;
+	u32 i;
 
 	/*
 	 * When setting the auto-increment bit, the hardware seems to
@@ -513,7 +485,7 @@ static void glk_load_degamma_lut(struct drm_crtc_state *state)
 	 *  different values per channel, so this just loads a linear table.
 	 */
 	for (i = 0; i < lut_size; i++) {
-		uint32_t v = (i * (1 << 16)) / (lut_size - 1);
+		u32 v = (i * (1 << 16)) / (lut_size - 1);
 
 		I915_WRITE(PRE_CSC_GAMC_DATA(pipe), v);
 	}
@@ -523,51 +495,49 @@ static void glk_load_degamma_lut(struct drm_crtc_state *state)
 		I915_WRITE(PRE_CSC_GAMC_DATA(pipe), (1 << 16));
 }
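
With lut_size = 33 the loop above programs a linear U0.16 ramp,
v = i * 65536 / 32 = i * 2048, i.e. 0, 2048, 4096, ..., 65536; the trailing
PRE_CSC_GAMC_DATA write of (1 << 16) just above fills the remaining
clamping entries at full scale (1.0).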
 
-static void glk_load_luts(struct drm_crtc_state *state)
+static void glk_load_luts(const struct intel_crtc_state *crtc_state)
 {
-	struct drm_crtc *crtc = state->crtc;
-	struct drm_device *dev = crtc->dev;
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct intel_crtc_state *intel_state = to_intel_crtc_state(state);
-	enum pipe pipe = to_intel_crtc(crtc)->pipe;
-
-	glk_load_degamma_lut(state);
+	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	enum pipe pipe = crtc->pipe;
 
-	if (crtc_state_is_legacy_gamma(state)) {
-		haswell_load_luts(state);
-		return;
-	}
+	glk_load_degamma_lut(crtc_state);
 
-	bdw_load_gamma_lut(state, 0);
+	if (crtc_state_is_legacy_gamma(crtc_state)) {
+		i9xx_load_luts(crtc_state);
+	} else {
+		bdw_load_gamma_lut(crtc_state, 0);
 
-	intel_state->gamma_mode = GAMMA_MODE_MODE_10BIT;
-	I915_WRITE(GAMMA_MODE(pipe), GAMMA_MODE_MODE_10BIT);
-	POSTING_READ(GAMMA_MODE(pipe));
+		/*
+		 * Reset the index, otherwise it prevents the legacy palette
+		 * from being written properly.
+		 */
+		I915_WRITE(PREC_PAL_INDEX(pipe), 0);
+	}
 }
 
-/* Loads the palette/gamma unit for the CRTC on CherryView. */
-static void cherryview_load_luts(struct drm_crtc_state *state)
+static void cherryview_load_luts(const struct intel_crtc_state *crtc_state)
 {
-	struct drm_crtc *crtc = state->crtc;
-	struct drm_i915_private *dev_priv = to_i915(crtc->dev);
-	enum pipe pipe = to_intel_crtc(crtc)->pipe;
-	struct drm_color_lut *lut;
-	uint32_t i, lut_size;
-	uint32_t word0, word1;
-
-	if (crtc_state_is_legacy_gamma(state)) {
-		/* Turn off degamma/gamma on CGM block. */
-		I915_WRITE(CGM_PIPE_MODE(pipe),
-			   (state->ctm ? CGM_PIPE_MODE_CSC : 0));
-		i9xx_load_luts_internal(crtc, state->gamma_lut,
-					to_intel_crtc_state(state));
+	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	const struct drm_property_blob *gamma_lut = crtc_state->base.gamma_lut;
+	const struct drm_property_blob *degamma_lut = crtc_state->base.degamma_lut;
+	enum pipe pipe = crtc->pipe;
+
+	cherryview_load_csc_matrix(crtc_state);
+
+	if (crtc_state_is_legacy_gamma(crtc_state)) {
+		i9xx_load_luts_internal(crtc_state, gamma_lut);
 		return;
 	}
 
-	if (state->degamma_lut) {
-		lut = state->degamma_lut->data;
-		lut_size = INTEL_INFO(dev_priv)->color.degamma_lut_size;
+	if (degamma_lut) {
+		const struct drm_color_lut *lut = degamma_lut->data;
+		int i, lut_size = INTEL_INFO(dev_priv)->color.degamma_lut_size;
+
 		for (i = 0; i < lut_size; i++) {
+			u32 word0, word1;
+
 			/* Write LUT in U0.14 format. */
 			word0 =
 			(drm_color_lut_extract(lut[i].green, 14) << 16) |
@@ -579,10 +549,13 @@ static void cherryview_load_luts(struct drm_crtc_state *state)
 		}
 	}
 
-	if (state->gamma_lut) {
-		lut = state->gamma_lut->data;
-		lut_size = INTEL_INFO(dev_priv)->color.gamma_lut_size;
+	if (gamma_lut) {
+		const struct drm_color_lut *lut = gamma_lut->data;
+		int i, lut_size = INTEL_INFO(dev_priv)->color.gamma_lut_size;
+
 		for (i = 0; i < lut_size; i++) {
+			u32 word0, word1;
+
 			/* Write LUT in U0.10 format. */
 			word0 =
 			(drm_color_lut_extract(lut[i].green, 10) << 16) |
@@ -594,74 +567,100 @@ static void cherryview_load_luts(struct drm_crtc_state *state)
 		}
 	}
 
-	I915_WRITE(CGM_PIPE_MODE(pipe),
-		   (state->ctm ? CGM_PIPE_MODE_CSC : 0) |
-		   (state->degamma_lut ? CGM_PIPE_MODE_DEGAMMA : 0) |
-		   (state->gamma_lut ? CGM_PIPE_MODE_GAMMA : 0));
-
 	/*
 	 * Also program a linear LUT in the legacy block (behind the
 	 * CGM block).
 	 */
-	i9xx_load_luts_internal(crtc, NULL, to_intel_crtc_state(state));
+	i9xx_load_luts_internal(crtc_state, NULL);
 }
 
-void intel_color_load_luts(struct drm_crtc_state *crtc_state)
+void intel_color_load_luts(const struct intel_crtc_state *crtc_state)
 {
-	struct drm_device *dev = crtc_state->crtc->dev;
-	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct drm_i915_private *dev_priv = to_i915(crtc_state->base.crtc->dev);
 
 	dev_priv->display.load_luts(crtc_state);
 }
 
-int intel_color_check(struct drm_crtc *crtc,
-		      struct drm_crtc_state *crtc_state)
+void intel_color_commit(const struct intel_crtc_state *crtc_state)
 {
-	struct drm_i915_private *dev_priv = to_i915(crtc->dev);
-	size_t gamma_length, degamma_length;
+	struct drm_i915_private *dev_priv = to_i915(crtc_state->base.crtc->dev);
+
+	if (dev_priv->display.color_commit)
+		dev_priv->display.color_commit(crtc_state);
+}
+
+static int check_lut_size(const struct drm_property_blob *lut, int expected)
+{
+	int len;
+
+	if (!lut)
+		return 0;
+
+	len = drm_color_lut_size(lut);
+	if (len != expected) {
+		DRM_DEBUG_KMS("Invalid LUT size; got %d, expected %d\n",
+			      len, expected);
+		return -EINVAL;
+	}
+
+	return 0;
+}
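
check_lut_size() builds on drm_color_lut_size(), which simply derives the
entry count from the blob length. A sketch of the idea, assuming the blob
payload is a bare array of entries as in include/drm/drm_color_mgmt.h:

/* Sketch only: a LUT property blob carries an array of
 * struct drm_color_lut, so the entry count is length / entry size. */
static inline int lut_size_sketch(const struct drm_property_blob *blob)
{
	return blob->length / sizeof(struct drm_color_lut);
}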
+
+int intel_color_check(struct intel_crtc_state *crtc_state)
+{
+	struct drm_i915_private *dev_priv = to_i915(crtc_state->base.crtc->dev);
+	const struct drm_property_blob *gamma_lut = crtc_state->base.gamma_lut;
+	const struct drm_property_blob *degamma_lut = crtc_state->base.degamma_lut;
+	int gamma_length, degamma_length;
+	u32 gamma_tests, degamma_tests;
 
 	degamma_length = INTEL_INFO(dev_priv)->color.degamma_lut_size;
 	gamma_length = INTEL_INFO(dev_priv)->color.gamma_lut_size;
+	degamma_tests = INTEL_INFO(dev_priv)->color.degamma_lut_tests;
+	gamma_tests = INTEL_INFO(dev_priv)->color.gamma_lut_tests;
 
-	/*
-	 * We allow both degamma & gamma luts at the right size or
-	 * NULL.
-	 */
-	if ((!crtc_state->degamma_lut ||
-	     drm_color_lut_size(crtc_state->degamma_lut) == degamma_length) &&
-	    (!crtc_state->gamma_lut ||
-	     drm_color_lut_size(crtc_state->gamma_lut) == gamma_length))
+	/* Always allow legacy gamma LUT with no further checking. */
+	if (crtc_state_is_legacy_gamma(crtc_state)) {
+		crtc_state->gamma_mode = GAMMA_MODE_MODE_8BIT;
 		return 0;
+	}
 
-	/*
-	 * We also allow no degamma lut/ctm and a gamma lut at the legacy
-	 * size (256 entries).
-	 */
-	if (crtc_state_is_legacy_gamma(crtc_state))
-		return 0;
+	if (check_lut_size(degamma_lut, degamma_length) ||
+	    check_lut_size(gamma_lut, gamma_length))
+		return -EINVAL;
 
-	return -EINVAL;
+	if (drm_color_lut_check(degamma_lut, degamma_tests) ||
+	    drm_color_lut_check(gamma_lut, gamma_tests))
+		return -EINVAL;
+
+	if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv))
+		crtc_state->gamma_mode = GAMMA_MODE_MODE_10BIT;
+	else if (INTEL_GEN(dev_priv) >= 9 || IS_BROADWELL(dev_priv))
+		crtc_state->gamma_mode = GAMMA_MODE_MODE_SPLIT;
+	else
+		crtc_state->gamma_mode = GAMMA_MODE_MODE_8BIT;
+
+	return 0;
 }
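
With this change the gamma mode is chosen once, at atomic check time, and
the commit side only programs registers. A minimal sketch of the resulting
flow, assuming a simplified atomic path:

/* Sketch (simplified): validate and pick the mode at check time,
 * program the hardware at commit time. */
static int color_flow_sketch(struct intel_crtc_state *crtc_state)
{
	int ret;

	ret = intel_color_check(crtc_state);	/* sets crtc_state->gamma_mode */
	if (ret)
		return ret;

	/* ...later, during the commit phase: */
	intel_color_commit(crtc_state);		/* e.g. hsw_color_commit() writes GAMMA_MODE */
	intel_color_load_luts(crtc_state);	/* fill the palette/LUT entries */

	return 0;
}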
 
-void intel_color_init(struct drm_crtc *crtc)
+void intel_color_init(struct intel_crtc *crtc)
 {
-	struct drm_i915_private *dev_priv = to_i915(crtc->dev);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 
-	drm_mode_crtc_set_gamma_size(crtc, 256);
+	drm_mode_crtc_set_gamma_size(&crtc->base, 256);
 
 	if (IS_CHERRYVIEW(dev_priv)) {
-		dev_priv->display.load_csc_matrix = cherryview_load_csc_matrix;
 		dev_priv->display.load_luts = cherryview_load_luts;
 	} else if (IS_HASWELL(dev_priv)) {
-		dev_priv->display.load_csc_matrix = ilk_load_csc_matrix;
-		dev_priv->display.load_luts = haswell_load_luts;
+		dev_priv->display.load_luts = i9xx_load_luts;
+		dev_priv->display.color_commit = hsw_color_commit;
 	} else if (IS_BROADWELL(dev_priv) || IS_GEN9_BC(dev_priv) ||
 		   IS_BROXTON(dev_priv)) {
-		dev_priv->display.load_csc_matrix = ilk_load_csc_matrix;
 		dev_priv->display.load_luts = broadwell_load_luts;
+		dev_priv->display.color_commit = hsw_color_commit;
 	} else if (IS_GEMINILAKE(dev_priv) || IS_CANNONLAKE(dev_priv)) {
-		dev_priv->display.load_csc_matrix = ilk_load_csc_matrix;
 		dev_priv->display.load_luts = glk_load_luts;
+		dev_priv->display.color_commit = hsw_color_commit;
 	} else {
 		dev_priv->display.load_luts = i9xx_load_luts;
 	}
@@ -669,7 +668,7 @@ void intel_color_init(struct drm_crtc *crtc)
 	/* Enable color management support when we have degamma & gamma LUTs. */
 	if (INTEL_INFO(dev_priv)->color.degamma_lut_size != 0 &&
 	    INTEL_INFO(dev_priv)->color.gamma_lut_size != 0)
-		drm_crtc_enable_color_mgmt(crtc,
+		drm_crtc_enable_color_mgmt(&crtc->base,
 					   INTEL_INFO(dev_priv)->color.degamma_lut_size,
 					   true,
 					   INTEL_INFO(dev_priv)->color.gamma_lut_size);
diff --git a/drivers/gpu/drm/i915/intel_connector.c b/drivers/gpu/drm/i915/intel_connector.c
index 18e370f607bc..ee16758747c5 100644
--- a/drivers/gpu/drm/i915/intel_connector.c
+++ b/drivers/gpu/drm/i915/intel_connector.c
@@ -27,7 +27,6 @@
 #include <linux/i2c.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_edid.h>
-#include <drm/drmP.h>
 #include "intel_drv.h"
 #include "i915_drv.h"
 
@@ -95,6 +94,10 @@ void intel_connector_destroy(struct drm_connector *connector)
 	intel_panel_fini(&intel_connector->panel);
 
 	drm_connector_cleanup(connector);
+
+	if (intel_connector->port)
+		drm_dp_mst_put_port_malloc(intel_connector->port);
+
 	kfree(connector);
 }
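
The new put pairs with a drm_dp_mst_get_port_malloc() taken when the MST
connector was created, keeping the port's memory (as opposed to its
topology state) alive for the connector's lifetime. A sketch of the
pairing, assuming a simplified, hypothetical constructor:

/* Sketch only (hypothetical helper): take the malloc reference when
 * the connector starts pointing at the port; intel_connector_destroy()
 * above drops it. */
static struct intel_connector *
mst_connector_create_sketch(struct drm_dp_mst_port *port)
{
	struct intel_connector *connector = intel_connector_alloc();

	if (!connector)
		return NULL;

	connector->port = port;
	drm_dp_mst_get_port_malloc(port);

	return connector;
}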
 
diff --git a/drivers/gpu/drm/i915/intel_crt.c b/drivers/gpu/drm/i915/intel_crt.c
index 68f2fb89ece3..3716b2ee362f 100644
--- a/drivers/gpu/drm/i915/intel_crt.c
+++ b/drivers/gpu/drm/i915/intel_crt.c
@@ -27,11 +27,10 @@
 #include <linux/dmi.h>
 #include <linux/i2c.h>
 #include <linux/slab.h>
-#include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
+#include <drm/drm_probe_helper.h>
 #include "intel_drv.h"
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
@@ -84,15 +83,17 @@ static bool intel_crt_get_hw_state(struct intel_encoder *encoder,
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	struct intel_crt *crt = intel_encoder_to_crt(encoder);
+	intel_wakeref_t wakeref;
 	bool ret;
 
-	if (!intel_display_power_get_if_enabled(dev_priv,
-						encoder->power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     encoder->power_domain);
+	if (!wakeref)
 		return false;
 
 	ret = intel_crt_port_enabled(dev_priv, crt->adpa_reg, pipe);
 
-	intel_display_power_put(dev_priv, encoder->power_domain);
+	intel_display_power_put(dev_priv, encoder->power_domain, wakeref);
 
 	return ret;
 }
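
This is the recurring change in this series: display power references are
now tracked, so every get returns an intel_wakeref_t cookie that must be
handed back to the matching put. A minimal sketch of the usage pattern,
assuming simplified register access:

/* Sketch only: a zero wakeref means the power well was not enabled,
 * so there is nothing to read and nothing to release. */
static bool hw_state_sketch(struct drm_i915_private *dev_priv,
			    enum intel_display_power_domain domain)
{
	intel_wakeref_t wakeref;
	bool enabled = false;

	wakeref = intel_display_power_get_if_enabled(dev_priv, domain);
	if (!wakeref)
		return false;

	/* ... read registers while the domain is powered ... */
	enabled = true;

	intel_display_power_put(dev_priv, domain, wakeref);

	return enabled;
}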
@@ -322,7 +323,7 @@ intel_crt_mode_valid(struct drm_connector *connector,
 		 * DAC limit supposedly 355 MHz.
 		 */
 		max_clock = 270000;
-	else if (IS_GEN3(dev_priv) || IS_GEN4(dev_priv))
+	else if (IS_GEN_RANGE(dev_priv, 3, 4))
 		max_clock = 400000;
 	else
 		max_clock = 350000;
@@ -344,51 +345,52 @@ intel_crt_mode_valid(struct drm_connector *connector,
 	return MODE_OK;
 }
 
-static bool intel_crt_compute_config(struct intel_encoder *encoder,
-				     struct intel_crtc_state *pipe_config,
-				     struct drm_connector_state *conn_state)
+static int intel_crt_compute_config(struct intel_encoder *encoder,
+				    struct intel_crtc_state *pipe_config,
+				    struct drm_connector_state *conn_state)
 {
 	struct drm_display_mode *adjusted_mode =
 		&pipe_config->base.adjusted_mode;
 
 	if (adjusted_mode->flags & DRM_MODE_FLAG_DBLSCAN)
-		return false;
+		return -EINVAL;
 
 	pipe_config->output_format = INTEL_OUTPUT_FORMAT_RGB;
-	return true;
+
+	return 0;
 }
 
-static bool pch_crt_compute_config(struct intel_encoder *encoder,
-				   struct intel_crtc_state *pipe_config,
-				   struct drm_connector_state *conn_state)
+static int pch_crt_compute_config(struct intel_encoder *encoder,
+				  struct intel_crtc_state *pipe_config,
+				  struct drm_connector_state *conn_state)
 {
 	struct drm_display_mode *adjusted_mode =
 		&pipe_config->base.adjusted_mode;
 
 	if (adjusted_mode->flags & DRM_MODE_FLAG_DBLSCAN)
-		return false;
+		return -EINVAL;
 
 	pipe_config->has_pch_encoder = true;
 	pipe_config->output_format = INTEL_OUTPUT_FORMAT_RGB;
 
-	return true;
+	return 0;
 }
 
-static bool hsw_crt_compute_config(struct intel_encoder *encoder,
-				   struct intel_crtc_state *pipe_config,
-				   struct drm_connector_state *conn_state)
+static int hsw_crt_compute_config(struct intel_encoder *encoder,
+				  struct intel_crtc_state *pipe_config,
+				  struct drm_connector_state *conn_state)
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	struct drm_display_mode *adjusted_mode =
 		&pipe_config->base.adjusted_mode;
 
 	if (adjusted_mode->flags & DRM_MODE_FLAG_DBLSCAN)
-		return false;
+		return -EINVAL;
 
 	/* HSW/BDW FDI limited to 4k */
 	if (adjusted_mode->crtc_hdisplay > 4096 ||
 	    adjusted_mode->crtc_hblank_start > 4096)
-		return false;
+		return -EINVAL;
 
 	pipe_config->has_pch_encoder = true;
 	pipe_config->output_format = INTEL_OUTPUT_FORMAT_RGB;
@@ -397,7 +399,7 @@ static bool hsw_crt_compute_config(struct intel_encoder *encoder,
 	if (HAS_PCH_LPT(dev_priv)) {
 		if (pipe_config->bw_constrained && pipe_config->pipe_bpp < 24) {
 			DRM_DEBUG_KMS("LPT only supports 24bpp\n");
-			return false;
+			return -EINVAL;
 		}
 
 		pipe_config->pipe_bpp = 24;
@@ -406,7 +408,7 @@ static bool hsw_crt_compute_config(struct intel_encoder *encoder,
 	/* FDI must always be 2.7 GHz */
 	pipe_config->port_clock = 135000 * 2;
 
-	return true;
+	return 0;
 }
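
All of the ->compute_config() hooks in this series move from bool to int,
so encoder config failures can propagate a real errno instead of being
flattened to a generic failure at the call site. A sketch of the new
calling convention, assuming a hypothetical caller:

/* Sketch only (hypothetical caller). */
static int compute_config_sketch(struct intel_encoder *encoder,
				 struct intel_crtc_state *crtc_state,
				 struct drm_connector_state *conn_state)
{
	int ret;

	ret = encoder->compute_config(encoder, crtc_state, conn_state);
	if (ret)
		DRM_DEBUG_KMS("encoder config failed: %d\n", ret);

	return ret;
}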
 
 static bool intel_ironlake_crt_detect_hotplug(struct drm_connector *connector)
@@ -629,19 +631,19 @@ static bool intel_crt_detect_ddc(struct drm_connector *connector)
 }
 
 static enum drm_connector_status
-intel_crt_load_detect(struct intel_crt *crt, uint32_t pipe)
+intel_crt_load_detect(struct intel_crt *crt, u32 pipe)
 {
 	struct drm_device *dev = crt->base.base.dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
-	uint32_t save_bclrpat;
-	uint32_t save_vtotal;
-	uint32_t vtotal, vactive;
-	uint32_t vsample;
-	uint32_t vblank, vblank_start, vblank_end;
-	uint32_t dsl;
+	u32 save_bclrpat;
+	u32 save_vtotal;
+	u32 vtotal, vactive;
+	u32 vsample;
+	u32 vblank, vblank_start, vblank_end;
+	u32 dsl;
 	i915_reg_t bclrpat_reg, vtotal_reg,
 		vblank_reg, vsync_reg, pipeconf_reg, pipe_dsl_reg;
-	uint8_t	st00;
+	u8 st00;
 	enum drm_connector_status status;
 
 	DRM_DEBUG_KMS("starting load-detect on CRT\n");
@@ -666,8 +668,8 @@ intel_crt_load_detect(struct intel_crt *crt, uint32_t pipe)
 	/* Set the border color to purple. */
 	I915_WRITE(bclrpat_reg, 0x500050);
 
-	if (!IS_GEN2(dev_priv)) {
-		uint32_t pipeconf = I915_READ(pipeconf_reg);
+	if (!IS_GEN(dev_priv, 2)) {
+		u32 pipeconf = I915_READ(pipeconf_reg);
 		I915_WRITE(pipeconf_reg, pipeconf | PIPECONF_FORCE_BORDER);
 		POSTING_READ(pipeconf_reg);
 		/* Wait for next Vblank to substitute
@@ -688,8 +690,8 @@ intel_crt_load_detect(struct intel_crt *crt, uint32_t pipe)
 		* Yes, this will flicker
 		*/
 		if (vblank_start <= vactive && vblank_end >= vtotal) {
-			uint32_t vsync = I915_READ(vsync_reg);
-			uint32_t vsync_start = (vsync & 0xffff) + 1;
+			u32 vsync = I915_READ(vsync_reg);
+			u32 vsync_start = (vsync & 0xffff) + 1;
 
 			vblank_start = vsync_start;
 			I915_WRITE(vblank_reg,
@@ -777,6 +779,7 @@ intel_crt_detect(struct drm_connector *connector,
 	struct drm_i915_private *dev_priv = to_i915(connector->dev);
 	struct intel_crt *crt = intel_attached_crt(connector);
 	struct intel_encoder *intel_encoder = &crt->base;
+	intel_wakeref_t wakeref;
 	int status, ret;
 	struct intel_load_detect_pipe tmp;
 
@@ -785,7 +788,8 @@ intel_crt_detect(struct drm_connector *connector,
 		      force);
 
 	if (i915_modparams.load_detect_test) {
-		intel_display_power_get(dev_priv, intel_encoder->power_domain);
+		wakeref = intel_display_power_get(dev_priv,
+						  intel_encoder->power_domain);
 		goto load_detect;
 	}
 
@@ -793,7 +797,8 @@ intel_crt_detect(struct drm_connector *connector,
 	if (dmi_check_system(intel_spurious_crt_detect))
 		return connector_status_disconnected;
 
-	intel_display_power_get(dev_priv, intel_encoder->power_domain);
+	wakeref = intel_display_power_get(dev_priv,
+					  intel_encoder->power_domain);
 
 	if (I915_HAS_HOTPLUG(dev_priv)) {
 		/* We can not rely on the HPD pin always being correctly wired
@@ -848,7 +853,7 @@ load_detect:
 	}
 
 out:
-	intel_display_power_put(dev_priv, intel_encoder->power_domain);
+	intel_display_power_put(dev_priv, intel_encoder->power_domain, wakeref);
 	return status;
 }
 
@@ -858,10 +863,12 @@ static int intel_crt_get_modes(struct drm_connector *connector)
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_crt *crt = intel_attached_crt(connector);
 	struct intel_encoder *intel_encoder = &crt->base;
-	int ret;
+	intel_wakeref_t wakeref;
 	struct i2c_adapter *i2c;
+	int ret;
 
-	intel_display_power_get(dev_priv, intel_encoder->power_domain);
+	wakeref = intel_display_power_get(dev_priv,
+					  intel_encoder->power_domain);
 
 	i2c = intel_gmbus_get_adapter(dev_priv, dev_priv->vbt.crt_ddc_pin);
 	ret = intel_crt_ddc_get_modes(connector, i2c);
@@ -873,7 +880,7 @@ static int intel_crt_get_modes(struct drm_connector *connector)
 	ret = intel_crt_ddc_get_modes(connector, i2c);
 
 out:
-	intel_display_power_put(dev_priv, intel_encoder->power_domain);
+	intel_display_power_put(dev_priv, intel_encoder->power_domain, wakeref);
 
 	return ret;
 }
@@ -981,7 +988,7 @@ void intel_crt_init(struct drm_i915_private *dev_priv)
 	else
 		crt->base.crtc_mask = (1 << 0) | (1 << 1) | (1 << 2);
 
-	if (IS_GEN2(dev_priv))
+	if (IS_GEN(dev_priv, 2))
 		connector->interlace_allowed = 0;
 	else
 		connector->interlace_allowed = 1;
diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
index a516697bf57d..e8ac04c33e29 100644
--- a/drivers/gpu/drm/i915/intel_csr.c
+++ b/drivers/gpu/drm/i915/intel_csr.c
@@ -70,50 +70,50 @@ MODULE_FIRMWARE(BXT_CSR_PATH);
 
 struct intel_css_header {
 	/* 0x09 for DMC */
-	uint32_t module_type;
+	u32 module_type;
 
 	/* Includes the DMC specific header in dwords */
-	uint32_t header_len;
+	u32 header_len;
 
 	/* value is always 0x10000 */
-	uint32_t header_ver;
+	u32 header_ver;
 
 	/* Not used */
-	uint32_t module_id;
+	u32 module_id;
 
 	/* Not used */
-	uint32_t module_vendor;
+	u32 module_vendor;
 
 	/* in YYYYMMDD format */
-	uint32_t date;
+	u32 date;
 
 	/* Size in dwords (CSS_Headerlen + PackageHeaderLen + dmc FWsLen)/4 */
-	uint32_t size;
+	u32 size;
 
 	/* Not used */
-	uint32_t key_size;
+	u32 key_size;
 
 	/* Not used */
-	uint32_t modulus_size;
+	u32 modulus_size;
 
 	/* Not used */
-	uint32_t exponent_size;
+	u32 exponent_size;
 
 	/* Not used */
-	uint32_t reserved1[12];
+	u32 reserved1[12];
 
 	/* Major Minor */
-	uint32_t version;
+	u32 version;
 
 	/* Not used */
-	uint32_t reserved2[8];
+	u32 reserved2[8];
 
 	/* Not used */
-	uint32_t kernel_header_info;
+	u32 kernel_header_info;
 } __packed;
 
 struct intel_fw_info {
-	uint16_t reserved1;
+	u16 reserved1;
 
 	/* Stepping (A, B, C, ..., *). * is a wildcard */
 	char stepping;
@@ -121,8 +121,8 @@ struct intel_fw_info {
 	/* Sub-stepping (0, 1, ..., *). * is a wildcard */
 	char substepping;
 
-	uint32_t offset;
-	uint32_t reserved2;
+	u32 offset;
+	u32 reserved2;
 } __packed;
 
 struct intel_package_header {
@@ -135,14 +135,14 @@ struct intel_package_header {
 	unsigned char reserved[10];
 
 	/* Number of valid entries in the FWInfo array below */
-	uint32_t num_entries;
+	u32 num_entries;
 
 	struct intel_fw_info fw_info[20];
 } __packed;
 
 struct intel_dmc_header {
 	/* value is always 0x40403E3E */
-	uint32_t signature;
+	u32 signature;
 
 	/* DMC binary header length */
 	unsigned char header_len;
@@ -151,30 +151,30 @@ struct intel_dmc_header {
 	unsigned char header_ver;
 
 	/* Reserved */
-	uint16_t dmcc_ver;
+	u16 dmcc_ver;
 
 	/* Major, Minor */
-	uint32_t	project;
+	u32 project;
 
 	/* Firmware program size (excluding header) in dwords */
-	uint32_t	fw_size;
+	u32 fw_size;
 
 	/* Major Minor version */
-	uint32_t fw_version;
+	u32 fw_version;
 
 	/* Number of valid MMIO cycles present. */
-	uint32_t mmio_count;
+	u32 mmio_count;
 
 	/* MMIO address */
-	uint32_t mmioaddr[8];
+	u32 mmioaddr[8];
 
 	/* MMIO data */
-	uint32_t mmiodata[8];
+	u32 mmiodata[8];
 
 	/* FW filename  */
 	unsigned char dfile[32];
 
-	uint32_t reserved1[2];
+	u32 reserved1[2];
 } __packed;
 
 struct stepping_info {
@@ -230,7 +230,7 @@ intel_get_stepping_info(struct drm_i915_private *dev_priv)
 
 static void gen9_set_dc_state_debugmask(struct drm_i915_private *dev_priv)
 {
-	uint32_t val, mask;
+	u32 val, mask;
 
 	mask = DC_STATE_DEBUG_MASK_MEMORY_UP;
 
@@ -257,7 +257,7 @@ static void gen9_set_dc_state_debugmask(struct drm_i915_private *dev_priv)
 void intel_csr_load_program(struct drm_i915_private *dev_priv)
 {
 	u32 *payload = dev_priv->csr.dmc_payload;
-	uint32_t i, fw_size;
+	u32 i, fw_size;
 
 	if (!HAS_CSR(dev_priv)) {
 		DRM_ERROR("No CSR support available for this platform\n");
@@ -289,17 +289,17 @@ void intel_csr_load_program(struct drm_i915_private *dev_priv)
 	gen9_set_dc_state_debugmask(dev_priv);
 }
 
-static uint32_t *parse_csr_fw(struct drm_i915_private *dev_priv,
-			      const struct firmware *fw)
+static u32 *parse_csr_fw(struct drm_i915_private *dev_priv,
+			 const struct firmware *fw)
 {
 	struct intel_css_header *css_header;
 	struct intel_package_header *package_header;
 	struct intel_dmc_header *dmc_header;
 	struct intel_csr *csr = &dev_priv->csr;
 	const struct stepping_info *si = intel_get_stepping_info(dev_priv);
-	uint32_t dmc_offset = CSR_DEFAULT_FW_OFFSET, readcount = 0, nbytes;
-	uint32_t i;
-	uint32_t *dmc_payload;
+	u32 dmc_offset = CSR_DEFAULT_FW_OFFSET, readcount = 0, nbytes;
+	u32 i;
+	u32 *dmc_payload;
 
 	if (!fw)
 		return NULL;
@@ -409,6 +409,21 @@ static uint32_t *parse_csr_fw(struct drm_i915_private *dev_priv,
 	return memcpy(dmc_payload, &fw->data[readcount], nbytes);
 }
 
+static void intel_csr_runtime_pm_get(struct drm_i915_private *dev_priv)
+{
+	WARN_ON(dev_priv->csr.wakeref);
+	dev_priv->csr.wakeref =
+		intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
+}
+
+static void intel_csr_runtime_pm_put(struct drm_i915_private *dev_priv)
+{
+	intel_wakeref_t wakeref __maybe_unused =
+		fetch_and_zero(&dev_priv->csr.wakeref);
+
+	intel_display_power_put(dev_priv, POWER_DOMAIN_INIT, wakeref);
+}
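
fetch_and_zero() reads the stored cookie and clears the slot in one
expression, so a double put hands a zero (invalid) wakeref to the tracking
code instead of silently reusing a stale one. A sketch of the idiom,
assuming the shape of the macro in i915_utils.h:

/* Sketch only: read *ptr and zero it, returning the old value. */
#define FETCH_AND_ZERO_SKETCH(ptr) ({		\
	typeof(*(ptr)) __val = *(ptr);		\
	*(ptr) = (typeof(*(ptr)))0;		\
	__val;					\
})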
+
 static void csr_load_work_fn(struct work_struct *work)
 {
 	struct drm_i915_private *dev_priv;
@@ -424,8 +439,7 @@ static void csr_load_work_fn(struct work_struct *work)
 
 	if (dev_priv->csr.dmc_payload) {
 		intel_csr_load_program(dev_priv);
-
-		intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
+		intel_csr_runtime_pm_put(dev_priv);
 
 		DRM_INFO("Finished loading DMC firmware %s (v%u.%u)\n",
 			 dev_priv->csr.fw_path,
@@ -467,7 +481,7 @@ void intel_csr_ucode_init(struct drm_i915_private *dev_priv)
 	 * suspend as runtime suspend *requires* a working CSR for whatever
 	 * reason.
 	 */
-	intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
+	intel_csr_runtime_pm_get(dev_priv);
 
 	if (INTEL_GEN(dev_priv) >= 12) {
 		/* Allow to load fw via parameter using the last known size */
@@ -538,7 +552,7 @@ void intel_csr_ucode_suspend(struct drm_i915_private *dev_priv)
 
 	/* Drop the reference held in case DMC isn't loaded. */
 	if (!dev_priv->csr.dmc_payload)
-		intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
+		intel_csr_runtime_pm_put(dev_priv);
 }
 
 /**
@@ -558,7 +572,7 @@ void intel_csr_ucode_resume(struct drm_i915_private *dev_priv)
 	 * loaded.
 	 */
 	if (!dev_priv->csr.dmc_payload)
-		intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
+		intel_csr_runtime_pm_get(dev_priv);
 }
 
 /**
@@ -574,6 +588,7 @@ void intel_csr_ucode_fini(struct drm_i915_private *dev_priv)
 		return;
 
 	intel_csr_ucode_suspend(dev_priv);
+	WARN_ON(dev_priv->csr.wakeref);
 
 	kfree(dev_priv->csr.dmc_payload);
 }
diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index 7edce1b7b348..ca705546a0ab 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -974,7 +974,7 @@ static void intel_wait_ddi_buf_idle(struct drm_i915_private *dev_priv,
 	DRM_ERROR("Timeout waiting for DDI BUF %c idle bit\n", port_name(port));
 }
 
-static uint32_t hsw_pll_to_ddi_pll_sel(const struct intel_shared_dpll *pll)
+static u32 hsw_pll_to_ddi_pll_sel(const struct intel_shared_dpll *pll)
 {
 	switch (pll->info->id) {
 	case DPLL_ID_WRPLL1:
@@ -995,8 +995,8 @@ static uint32_t hsw_pll_to_ddi_pll_sel(const struct intel_shared_dpll *pll)
 	}
 }
 
-static uint32_t icl_pll_to_ddi_pll_sel(struct intel_encoder *encoder,
-				       const struct intel_crtc_state *crtc_state)
+static u32 icl_pll_to_ddi_clk_sel(struct intel_encoder *encoder,
+				  const struct intel_crtc_state *crtc_state)
 {
 	const struct intel_shared_dpll *pll = crtc_state->shared_dpll;
 	int clock = crtc_state->port_clock;
@@ -1004,10 +1004,11 @@ static uint32_t icl_pll_to_ddi_pll_sel(struct intel_encoder *encoder,
 
 	switch (id) {
 	default:
+		/*
+		 * DPLL_ID_ICL_DPLL0 and DPLL_ID_ICL_DPLL1 should not be used
+		 * here, so do warn if one gets passed in
+		 */
 		MISSING_CASE(id);
-		/* fall through */
-	case DPLL_ID_ICL_DPLL0:
-	case DPLL_ID_ICL_DPLL1:
 		return DDI_CLK_SEL_NONE;
 	case DPLL_ID_ICL_TBTPLL:
 		switch (clock) {
@@ -1243,8 +1244,8 @@ static int skl_calc_wrpll_link(struct drm_i915_private *dev_priv,
 			       enum intel_dpll_id pll_id)
 {
 	i915_reg_t cfgcr1_reg, cfgcr2_reg;
-	uint32_t cfgcr1_val, cfgcr2_val;
-	uint32_t p0, p1, p2, dco_freq;
+	u32 cfgcr1_val, cfgcr2_val;
+	u32 p0, p1, p2, dco_freq;
 
 	cfgcr1_reg = DPLL_CFGCR1(pll_id);
 	cfgcr2_reg = DPLL_CFGCR2(pll_id);
@@ -1296,14 +1297,17 @@ static int skl_calc_wrpll_link(struct drm_i915_private *dev_priv,
 	dco_freq += (((cfgcr1_val & DPLL_CFGCR1_DCO_FRACTION_MASK) >> 9) * 24 *
 		1000) / 0x8000;
 
+	if (WARN_ON(p0 == 0 || p1 == 0 || p2 == 0))
+		return 0;
+
 	return dco_freq / (p0 * p1 * p2 * 5);
 }
 
 int cnl_calc_wrpll_link(struct drm_i915_private *dev_priv,
 			enum intel_dpll_id pll_id)
 {
-	uint32_t cfgcr0, cfgcr1;
-	uint32_t p0, p1, p2, dco_freq, ref_clock;
+	u32 cfgcr0, cfgcr1;
+	u32 p0, p1, p2, dco_freq, ref_clock;
 
 	if (INTEL_GEN(dev_priv) >= 11) {
 		cfgcr0 = I915_READ(ICL_DPLL_CFGCR0(pll_id));
@@ -1388,16 +1392,17 @@ static int icl_calc_tbt_pll_link(struct drm_i915_private *dev_priv,
 static int icl_calc_mg_pll_link(struct drm_i915_private *dev_priv,
 				enum port port)
 {
+	enum tc_port tc_port = intel_port_to_tc(dev_priv, port);
 	u32 mg_pll_div0, mg_clktop_hsclkctl;
 	u32 m1, m2_int, m2_frac, div1, div2, refclk;
 	u64 tmp;
 
 	refclk = dev_priv->cdclk.hw.ref;
 
-	mg_pll_div0 = I915_READ(MG_PLL_DIV0(port));
-	mg_clktop_hsclkctl = I915_READ(MG_CLKTOP2_HSCLKCTL(port));
+	mg_pll_div0 = I915_READ(MG_PLL_DIV0(tc_port));
+	mg_clktop_hsclkctl = I915_READ(MG_CLKTOP2_HSCLKCTL(tc_port));
 
-	m1 = I915_READ(MG_PLL_DIV1(port)) & MG_PLL_DIV1_FBPREDIV_MASK;
+	m1 = I915_READ(MG_PLL_DIV1(tc_port)) & MG_PLL_DIV1_FBPREDIV_MASK;
 	m2_int = mg_pll_div0 & MG_PLL_DIV0_FBDIV_INT_MASK;
 	m2_frac = (mg_pll_div0 & MG_PLL_DIV0_FRACNEN_H) ?
 		  (mg_pll_div0 & MG_PLL_DIV0_FBDIV_FRAC_MASK) >>
@@ -1468,7 +1473,7 @@ static void icl_ddi_clock_get(struct intel_encoder *encoder,
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	enum port port = encoder->port;
 	int link_clock = 0;
-	uint32_t pll_id;
+	u32 pll_id;
 
 	pll_id = intel_get_shared_dpll_id(dev_priv, pipe_config->shared_dpll);
 	if (intel_port_is_combophy(dev_priv, port)) {
@@ -1493,7 +1498,7 @@ static void cnl_ddi_clock_get(struct intel_encoder *encoder,
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	int link_clock = 0;
-	uint32_t cfgcr0;
+	u32 cfgcr0;
 	enum intel_dpll_id pll_id;
 
 	pll_id = intel_get_shared_dpll_id(dev_priv, pipe_config->shared_dpll);
@@ -1547,7 +1552,7 @@ static void skl_ddi_clock_get(struct intel_encoder *encoder,
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	int link_clock = 0;
-	uint32_t dpll_ctl1;
+	u32 dpll_ctl1;
 	enum intel_dpll_id pll_id;
 
 	pll_id = intel_get_shared_dpll_id(dev_priv, pipe_config->shared_dpll);
@@ -1736,7 +1741,7 @@ void intel_ddi_set_vc_payload_alloc(const struct intel_crtc_state *crtc_state,
 	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 	enum transcoder cpu_transcoder = crtc_state->cpu_transcoder;
-	uint32_t temp;
+	u32 temp;
 
 	temp = I915_READ(TRANS_DDI_FUNC_CTL(cpu_transcoder));
 	if (state == true)
@@ -1754,7 +1759,7 @@ void intel_ddi_enable_transcoder_func(const struct intel_crtc_state *crtc_state)
 	enum pipe pipe = crtc->pipe;
 	enum transcoder cpu_transcoder = crtc_state->cpu_transcoder;
 	enum port port = encoder->port;
-	uint32_t temp;
+	u32 temp;
 
 	/* Enable TRANS_DDI_FUNC_CTL for the pipe to work in HDMI mode */
 	temp = TRANS_DDI_FUNC_ENABLE;
@@ -1815,7 +1820,7 @@ void intel_ddi_enable_transcoder_func(const struct intel_crtc_state *crtc_state)
 			temp |= TRANS_DDI_MODE_SELECT_DVI;
 
 		if (crtc_state->hdmi_scrambling)
-			temp |= TRANS_DDI_HDMI_SCRAMBLING_MASK;
+			temp |= TRANS_DDI_HDMI_SCRAMBLING;
 		if (crtc_state->hdmi_high_tmds_clock_ratio)
 			temp |= TRANS_DDI_HIGH_TMDS_CHAR_RATE;
 	} else if (intel_crtc_has_type(crtc_state, INTEL_OUTPUT_ANALOG)) {
@@ -1838,7 +1843,7 @@ void intel_ddi_disable_transcoder_func(const struct intel_crtc_state *crtc_state
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 	enum transcoder cpu_transcoder = crtc_state->cpu_transcoder;
 	i915_reg_t reg = TRANS_DDI_FUNC_CTL(cpu_transcoder);
-	uint32_t val = I915_READ(reg);
+	u32 val = I915_READ(reg);
 
 	val &= ~(TRANS_DDI_FUNC_ENABLE | TRANS_DDI_PORT_MASK | TRANS_DDI_DP_VC_PAYLOAD_ALLOC);
 	val |= TRANS_DDI_PORT_NONE;
@@ -1857,12 +1862,14 @@ int intel_ddi_toggle_hdcp_signalling(struct intel_encoder *intel_encoder,
 {
 	struct drm_device *dev = intel_encoder->base.dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
+	intel_wakeref_t wakeref;
 	enum pipe pipe = 0;
 	int ret = 0;
-	uint32_t tmp;
+	u32 tmp;
 
-	if (WARN_ON(!intel_display_power_get_if_enabled(dev_priv,
-						intel_encoder->power_domain)))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     intel_encoder->power_domain);
+	if (WARN_ON(!wakeref))
 		return -ENXIO;
 
 	if (WARN_ON(!intel_encoder->get_hw_state(intel_encoder, &pipe))) {
@@ -1877,7 +1884,7 @@ int intel_ddi_toggle_hdcp_signalling(struct intel_encoder *intel_encoder,
 		tmp &= ~TRANS_DDI_HDCP_SIGNALLING;
 	I915_WRITE(TRANS_DDI_FUNC_CTL(pipe), tmp);
 out:
-	intel_display_power_put(dev_priv, intel_encoder->power_domain);
+	intel_display_power_put(dev_priv, intel_encoder->power_domain, wakeref);
 	return ret;
 }
 
@@ -1888,13 +1895,15 @@ bool intel_ddi_connector_get_hw_state(struct intel_connector *intel_connector)
 	struct intel_encoder *encoder = intel_connector->encoder;
 	int type = intel_connector->base.connector_type;
 	enum port port = encoder->port;
-	enum pipe pipe = 0;
 	enum transcoder cpu_transcoder;
-	uint32_t tmp;
+	intel_wakeref_t wakeref;
+	enum pipe pipe = 0;
+	u32 tmp;
 	bool ret;
 
-	if (!intel_display_power_get_if_enabled(dev_priv,
-						encoder->power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     encoder->power_domain);
+	if (!wakeref)
 		return false;
 
 	if (!encoder->get_hw_state(encoder, &pipe)) {
@@ -1936,7 +1945,7 @@ bool intel_ddi_connector_get_hw_state(struct intel_connector *intel_connector)
 	}
 
 out:
-	intel_display_power_put(dev_priv, encoder->power_domain);
+	intel_display_power_put(dev_priv, encoder->power_domain, wakeref);
 
 	return ret;
 }
@@ -1947,6 +1956,7 @@ static void intel_ddi_get_encoder_pipes(struct intel_encoder *encoder,
 	struct drm_device *dev = encoder->base.dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	enum port port = encoder->port;
+	intel_wakeref_t wakeref;
 	enum pipe p;
 	u32 tmp;
 	u8 mst_pipe_mask;
@@ -1954,8 +1964,9 @@ static void intel_ddi_get_encoder_pipes(struct intel_encoder *encoder,
 	*pipe_mask = 0;
 	*is_dp_mst = false;
 
-	if (!intel_display_power_get_if_enabled(dev_priv,
-						encoder->power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     encoder->power_domain);
+	if (!wakeref)
 		return;
 
 	tmp = I915_READ(DDI_BUF_CTL(port));
@@ -2026,7 +2037,7 @@ out:
 				  "(PHY_CTL %08x)\n", port_name(port), tmp);
 	}
 
-	intel_display_power_put(dev_priv, encoder->power_domain);
+	intel_display_power_put(dev_priv, encoder->power_domain, wakeref);
 }
 
 bool intel_ddi_get_hw_state(struct intel_encoder *encoder,
@@ -2123,7 +2134,7 @@ void intel_ddi_disable_pipe_clock(const struct intel_crtc_state *crtc_state)
 }
 
 static void _skl_ddi_set_iboost(struct drm_i915_private *dev_priv,
-				enum port port, uint8_t iboost)
+				enum port port, u8 iboost)
 {
 	u32 tmp;
 
@@ -2142,7 +2153,7 @@ static void skl_ddi_set_iboost(struct intel_encoder *encoder,
 	struct intel_digital_port *intel_dig_port = enc_to_dig_port(&encoder->base);
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	enum port port = encoder->port;
-	uint8_t iboost;
+	u8 iboost;
 
 	if (type == INTEL_OUTPUT_HDMI)
 		iboost = dev_priv->vbt.ddi_port_info[port].hdmi_boost_level;
@@ -2656,7 +2667,7 @@ static void icl_ddi_vswing_sequence(struct intel_encoder *encoder,
 		icl_mg_phy_ddi_vswing_sequence(encoder, link_clock, level);
 }
 
-static uint32_t translate_signal_level(int signal_levels)
+static u32 translate_signal_level(int signal_levels)
 {
 	int i;
 
@@ -2671,9 +2682,9 @@ static uint32_t translate_signal_level(int signal_levels)
 	return 0;
 }
 
-static uint32_t intel_ddi_dp_level(struct intel_dp *intel_dp)
+static u32 intel_ddi_dp_level(struct intel_dp *intel_dp)
 {
-	uint8_t train_set = intel_dp->train_set[0];
+	u8 train_set = intel_dp->train_set[0];
 	int signal_levels = train_set & (DP_TRAIN_VOLTAGE_SWING_MASK |
 					 DP_TRAIN_PRE_EMPHASIS_MASK);
 
@@ -2698,7 +2709,7 @@ u32 bxt_signal_levels(struct intel_dp *intel_dp)
 	return 0;
 }
 
-uint32_t ddi_signal_levels(struct intel_dp *intel_dp)
+u32 ddi_signal_levels(struct intel_dp *intel_dp)
 {
 	struct intel_digital_port *dport = dp_to_dig_port(intel_dp);
 	struct drm_i915_private *dev_priv = to_i915(dport->base.base.dev);
@@ -2712,8 +2723,8 @@ uint32_t ddi_signal_levels(struct intel_dp *intel_dp)
 }
 
 static inline
-uint32_t icl_dpclka_cfgcr0_clk_off(struct drm_i915_private *dev_priv,
-				   enum port port)
+u32 icl_dpclka_cfgcr0_clk_off(struct drm_i915_private *dev_priv,
+			      enum port port)
 {
 	if (intel_port_is_combophy(dev_priv, port)) {
 		return ICL_DPCLKA_CFGCR0_DDI_CLK_OFF(port);
@@ -2848,7 +2859,7 @@ static void intel_ddi_clk_select(struct intel_encoder *encoder,
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	enum port port = encoder->port;
-	uint32_t val;
+	u32 val;
 	const struct intel_shared_dpll *pll = crtc_state->shared_dpll;
 
 	if (WARN_ON(!pll))
@@ -2859,7 +2870,7 @@ static void intel_ddi_clk_select(struct intel_encoder *encoder,
 	if (IS_ICELAKE(dev_priv)) {
 		if (!intel_port_is_combophy(dev_priv, port))
 			I915_WRITE(DDI_CLK_SEL(port),
-				   icl_pll_to_ddi_pll_sel(encoder, crtc_state));
+				   icl_pll_to_ddi_clk_sel(encoder, crtc_state));
 	} else if (IS_CANNONLAKE(dev_priv)) {
 		/* Configure DPCLKA_CFGCR0 to map the DPLL to the DDI. */
 		val = I915_READ(DPCLKA_CFGCR0);
@@ -3283,7 +3294,8 @@ static void intel_ddi_post_disable_dp(struct intel_encoder *encoder,
 	intel_edp_panel_vdd_on(intel_dp);
 	intel_edp_panel_off(intel_dp);
 
-	intel_display_power_put(dev_priv, dig_port->ddi_io_power_domain);
+	intel_display_power_put_unchecked(dev_priv,
+					  dig_port->ddi_io_power_domain);
 
 	intel_ddi_clk_disable(encoder);
 }
@@ -3303,7 +3315,8 @@ static void intel_ddi_post_disable_hdmi(struct intel_encoder *encoder,
 
 	intel_disable_ddi_buf(encoder, old_crtc_state);
 
-	intel_display_power_put(dev_priv, dig_port->ddi_io_power_domain);
+	intel_display_power_put_unchecked(dev_priv,
+					  dig_port->ddi_io_power_domain);
 
 	intel_ddi_clk_disable(encoder);
 
@@ -3345,7 +3358,7 @@ void intel_ddi_fdi_post_disable(struct intel_encoder *encoder,
 				const struct drm_connector_state *old_conn_state)
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
-	uint32_t val;
+	u32 val;
 
 	/*
 	 * Bspec lists this as both step 13 (before DDI_BUF_CTL disable)
@@ -3537,6 +3550,26 @@ static void intel_disable_ddi(struct intel_encoder *encoder,
 		intel_disable_ddi_dp(encoder, old_crtc_state, old_conn_state);
 }
 
+static void intel_ddi_update_pipe_dp(struct intel_encoder *encoder,
+				     const struct intel_crtc_state *crtc_state,
+				     const struct drm_connector_state *conn_state)
+{
+	struct intel_dp *intel_dp = enc_to_intel_dp(&encoder->base);
+
+	intel_psr_enable(intel_dp, crtc_state);
+	intel_edp_drrs_enable(intel_dp, crtc_state);
+
+	intel_panel_update_backlight(encoder, crtc_state, conn_state);
+}
+
+static void intel_ddi_update_pipe(struct intel_encoder *encoder,
+				  const struct intel_crtc_state *crtc_state,
+				  const struct drm_connector_state *conn_state)
+{
+	if (!intel_crtc_has_type(crtc_state, INTEL_OUTPUT_HDMI))
+		intel_ddi_update_pipe_dp(encoder, crtc_state, conn_state);
+}
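
->update_pipe() is the fastset counterpart of ->enable(): the commit path
calls it when the pipe keeps running but connector-level state such as
PSR, DRRS or backlight may change. A sketch of the call site, assuming a
simplified commit helper:

/* Sketch only (hypothetical call site in the commit path). */
static void fastset_update_sketch(struct intel_encoder *encoder,
				  const struct intel_crtc_state *crtc_state,
				  const struct drm_connector_state *conn_state)
{
	if (encoder->update_pipe)
		encoder->update_pipe(encoder, crtc_state, conn_state);
}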
+
 static void intel_ddi_set_fia_lane_count(struct intel_encoder *encoder,
 					 const struct intel_crtc_state *pipe_config,
 					 enum port port)
@@ -3605,8 +3638,8 @@ intel_ddi_post_pll_disable(struct intel_encoder *encoder,
 
 	if (intel_crtc_has_dp_encoder(crtc_state) ||
 	    intel_port_is_tc(dev_priv, encoder->port))
-		intel_display_power_put(dev_priv,
-					intel_ddi_main_link_aux_domain(dig_port));
+		intel_display_power_put_unchecked(dev_priv,
+						  intel_ddi_main_link_aux_domain(dig_port));
 }
 
 void intel_ddi_prepare_link_retrain(struct intel_dp *intel_dp)
@@ -3615,7 +3648,7 @@ void intel_ddi_prepare_link_retrain(struct intel_dp *intel_dp)
 	struct drm_i915_private *dev_priv =
 		to_i915(intel_dig_port->base.base.dev);
 	enum port port = intel_dig_port->base.port;
-	uint32_t val;
+	u32 val;
 	bool wait = false;
 
 	if (I915_READ(DP_TP_CTL(port)) & DP_TP_CTL_ENABLE) {
@@ -3727,8 +3760,7 @@ void intel_ddi_get_config(struct intel_encoder *encoder,
 		if (intel_dig_port->infoframe_enabled(encoder, pipe_config))
 			pipe_config->has_infoframe = true;
 
-		if ((temp & TRANS_DDI_HDMI_SCRAMBLING_MASK) ==
-			TRANS_DDI_HDMI_SCRAMBLING_MASK)
+		if (temp & TRANS_DDI_HDMI_SCRAMBLING)
 			pipe_config->hdmi_scrambling = true;
 		if (temp & TRANS_DDI_HIGH_TMDS_CHAR_RATE)
 			pipe_config->hdmi_high_tmds_clock_ratio = true;
@@ -3809,9 +3841,9 @@ intel_ddi_compute_output_type(struct intel_encoder *encoder,
 	}
 }
 
-static bool intel_ddi_compute_config(struct intel_encoder *encoder,
-				     struct intel_crtc_state *pipe_config,
-				     struct drm_connector_state *conn_state)
+static int intel_ddi_compute_config(struct intel_encoder *encoder,
+				    struct intel_crtc_state *pipe_config,
+				    struct drm_connector_state *conn_state)
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	enum port port = encoder->port;
@@ -3835,9 +3867,50 @@ static bool intel_ddi_compute_config(struct intel_encoder *encoder,
 
 }
 
+static void intel_ddi_encoder_suspend(struct intel_encoder *encoder)
+{
+	struct intel_digital_port *dig_port = enc_to_dig_port(&encoder->base);
+	struct drm_i915_private *i915 = to_i915(encoder->base.dev);
+
+	intel_dp_encoder_suspend(encoder);
+
+	/*
+	 * TODO: also disconnect from USB DP alternate mode once we have a
+	 * way to handle the modeset restore in that mode during resume
+	 * even if the sink has disappeared while being suspended.
+	 */
+	if (dig_port->tc_legacy_port)
+		icl_tc_phy_disconnect(i915, dig_port);
+}
+
+static void intel_ddi_encoder_reset(struct drm_encoder *drm_encoder)
+{
+	struct intel_digital_port *dig_port = enc_to_dig_port(drm_encoder);
+	struct drm_i915_private *i915 = to_i915(drm_encoder->dev);
+
+	if (intel_port_is_tc(i915, dig_port->base.port))
+		intel_digital_port_connected(&dig_port->base);
+
+	intel_dp_encoder_reset(drm_encoder);
+}
+
+static void intel_ddi_encoder_destroy(struct drm_encoder *encoder)
+{
+	struct intel_digital_port *dig_port = enc_to_dig_port(encoder);
+	struct drm_i915_private *i915 = to_i915(encoder->dev);
+
+	intel_dp_encoder_flush_work(encoder);
+
+	if (intel_port_is_tc(i915, dig_port->base.port))
+		icl_tc_phy_disconnect(i915, dig_port);
+
+	drm_encoder_cleanup(encoder);
+	kfree(dig_port);
+}
+
 static const struct drm_encoder_funcs intel_ddi_funcs = {
-	.reset = intel_dp_encoder_reset,
-	.destroy = intel_dp_encoder_destroy,
+	.reset = intel_ddi_encoder_reset,
+	.destroy = intel_ddi_encoder_destroy,
 };
 
 static struct intel_connector *
@@ -4081,16 +4154,16 @@ intel_ddi_max_lanes(struct intel_digital_port *intel_dport)
 
 void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port)
 {
+	struct ddi_vbt_port_info *port_info =
+		&dev_priv->vbt.ddi_port_info[port];
 	struct intel_digital_port *intel_dig_port;
 	struct intel_encoder *intel_encoder;
 	struct drm_encoder *encoder;
 	bool init_hdmi, init_dp, init_lspcon = false;
 	enum pipe pipe;
 
-
-	init_hdmi = (dev_priv->vbt.ddi_port_info[port].supports_dvi ||
-		     dev_priv->vbt.ddi_port_info[port].supports_hdmi);
-	init_dp = dev_priv->vbt.ddi_port_info[port].supports_dp;
+	init_hdmi = port_info->supports_dvi || port_info->supports_hdmi;
+	init_dp = port_info->supports_dp;
 
 	if (intel_bios_is_lspcon_present(dev_priv, port)) {
 		/*
@@ -4129,9 +4202,10 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port)
 	intel_encoder->pre_enable = intel_ddi_pre_enable;
 	intel_encoder->disable = intel_disable_ddi;
 	intel_encoder->post_disable = intel_ddi_post_disable;
+	intel_encoder->update_pipe = intel_ddi_update_pipe;
 	intel_encoder->get_hw_state = intel_ddi_get_hw_state;
 	intel_encoder->get_config = intel_ddi_get_config;
-	intel_encoder->suspend = intel_dp_encoder_suspend;
+	intel_encoder->suspend = intel_ddi_encoder_suspend;
 	intel_encoder->get_power_domains = intel_ddi_get_power_domains;
 	intel_encoder->type = INTEL_OUTPUT_DDI;
 	intel_encoder->power_domain = intel_port_to_power_domain(port);
@@ -4150,6 +4224,10 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port)
 	intel_dig_port->max_lanes = intel_ddi_max_lanes(intel_dig_port);
 	intel_dig_port->aux_ch = intel_bios_port_aux_ch(dev_priv, port);
 
+	intel_dig_port->tc_legacy_port = intel_port_is_tc(dev_priv, port) &&
+					 !port_info->supports_typec_usb &&
+					 !port_info->supports_tbt;
+
 	switch (port) {
 	case PORT_A:
 		intel_dig_port->ddi_io_power_domain =
@@ -4208,6 +4286,10 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port)
 	}
 
 	intel_infoframe_init(intel_dig_port);
+
+	if (intel_port_is_tc(dev_priv, port))
+		intel_digital_port_connected(intel_encoder);
+
 	return;
 
 err:
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index 1e56319334f3..855a5074ad77 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -104,7 +104,7 @@ static void sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p)
 	drm_printf(p, "has EU power gating: %s\n", yesno(sseu->has_eu_pg));
 }
 
-void intel_device_info_dump_runtime(const struct intel_device_info *info,
+void intel_device_info_dump_runtime(const struct intel_runtime_info *info,
 				    struct drm_printer *p)
 {
 	sseu_dump(&info->sseu, p);
@@ -113,21 +113,6 @@ void intel_device_info_dump_runtime(const struct intel_device_info *info,
 		   info->cs_timestamp_frequency_khz);
 }
 
-void intel_device_info_dump(const struct intel_device_info *info,
-			    struct drm_printer *p)
-{
-	struct drm_i915_private *dev_priv =
-		container_of(info, struct drm_i915_private, info);
-
-	drm_printf(p, "pciid=0x%04x rev=0x%02x platform=%s gen=%i\n",
-		   INTEL_DEVID(dev_priv),
-		   INTEL_REVID(dev_priv),
-		   intel_platform_name(info->platform),
-		   info->gen);
-
-	intel_device_info_dump_flags(info, p);
-}
-
 void intel_device_info_dump_topology(const struct sseu_dev_info *sseu,
 				     struct drm_printer *p)
 {
@@ -164,7 +149,7 @@ static u16 compute_eu_total(const struct sseu_dev_info *sseu)
 
 static void gen11_sseu_info_init(struct drm_i915_private *dev_priv)
 {
-	struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
+	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	u8 s_en;
 	u32 ss_en, ss_en_mask;
 	u8 eu_en;
@@ -203,7 +188,7 @@ static void gen11_sseu_info_init(struct drm_i915_private *dev_priv)
 
 static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
 {
-	struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
+	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	const u32 fuse2 = I915_READ(GEN8_FUSE2);
 	int s, ss;
 	const int eu_mask = 0xff;
@@ -280,7 +265,7 @@ static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
 
 static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 {
-	struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
+	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	u32 fuse;
 
 	fuse = I915_READ(CHV_FUSE_GT);
@@ -334,7 +319,7 @@ static void cherryview_sseu_info_init(struct drm_i915_private *dev_priv)
 static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 {
 	struct intel_device_info *info = mkwrite_device_info(dev_priv);
-	struct sseu_dev_info *sseu = &info->sseu;
+	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	int s, ss;
 	u32 fuse2, eu_disable, subslice_mask;
 	const u8 eu_mask = 0xff;
@@ -437,7 +422,7 @@ static void gen9_sseu_info_init(struct drm_i915_private *dev_priv)
 
 static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
 {
-	struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
+	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	int s, ss;
 	u32 fuse2, subslice_mask, eu_disable[3]; /* s_max */
 
@@ -519,8 +504,7 @@ static void broadwell_sseu_info_init(struct drm_i915_private *dev_priv)
 
 static void haswell_sseu_info_init(struct drm_i915_private *dev_priv)
 {
-	struct intel_device_info *info = mkwrite_device_info(dev_priv);
-	struct sseu_dev_info *sseu = &info->sseu;
+	struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	u32 fuse1;
 	int s, ss;
 
@@ -528,9 +512,9 @@ static void haswell_sseu_info_init(struct drm_i915_private *dev_priv)
 	 * There isn't a register to tell us how many slices/subslices. We
 	 * work off the PCI-ids here.
 	 */
-	switch (info->gt) {
+	switch (INTEL_INFO(dev_priv)->gt) {
 	default:
-		MISSING_CASE(info->gt);
+		MISSING_CASE(INTEL_INFO(dev_priv)->gt);
 		/* fall through */
 	case 1:
 		sseu->slice_mask = BIT(0);
@@ -725,7 +709,7 @@ static u32 read_timestamp_frequency(struct drm_i915_private *dev_priv)
 
 /**
  * intel_device_info_runtime_init - initialize runtime info
- * @info: intel device info struct
+ * @dev_priv: the i915 device
  *
  * Determine various intel_device_info fields at runtime.
  *
@@ -739,29 +723,29 @@ static u32 read_timestamp_frequency(struct drm_i915_private *dev_priv)
  *   - after the PCH has been detected,
  *   - before the first usage of the fields it can tweak.
  */
-void intel_device_info_runtime_init(struct intel_device_info *info)
+void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv =
-		container_of(info, struct drm_i915_private, info);
+	struct intel_device_info *info = mkwrite_device_info(dev_priv);
+	struct intel_runtime_info *runtime = RUNTIME_INFO(dev_priv);
 	enum pipe pipe;
 
 	if (INTEL_GEN(dev_priv) >= 10) {
 		for_each_pipe(dev_priv, pipe)
-			info->num_scalers[pipe] = 2;
-	} else if (IS_GEN9(dev_priv)) {
-		info->num_scalers[PIPE_A] = 2;
-		info->num_scalers[PIPE_B] = 2;
-		info->num_scalers[PIPE_C] = 1;
+			runtime->num_scalers[pipe] = 2;
+	} else if (IS_GEN(dev_priv, 9)) {
+		runtime->num_scalers[PIPE_A] = 2;
+		runtime->num_scalers[PIPE_B] = 2;
+		runtime->num_scalers[PIPE_C] = 1;
 	}
 
 	BUILD_BUG_ON(I915_NUM_ENGINES > BITS_PER_TYPE(intel_ring_mask_t));
 
-	if (IS_GEN11(dev_priv))
+	if (IS_GEN(dev_priv, 11))
 		for_each_pipe(dev_priv, pipe)
-			info->num_sprites[pipe] = 6;
-	else if (IS_GEN10(dev_priv) || IS_GEMINILAKE(dev_priv))
+			runtime->num_sprites[pipe] = 6;
+	else if (IS_GEN(dev_priv, 10) || IS_GEMINILAKE(dev_priv))
 		for_each_pipe(dev_priv, pipe)
-			info->num_sprites[pipe] = 3;
+			runtime->num_sprites[pipe] = 3;
 	else if (IS_BROXTON(dev_priv)) {
 		/*
 		 * Skylake and Broxton currently don't expose the topmost plane as its
@@ -772,22 +756,22 @@ void intel_device_info_runtime_init(struct intel_device_info *info)
 		 * down the line.
 		 */
 
-		info->num_sprites[PIPE_A] = 2;
-		info->num_sprites[PIPE_B] = 2;
-		info->num_sprites[PIPE_C] = 1;
+		runtime->num_sprites[PIPE_A] = 2;
+		runtime->num_sprites[PIPE_B] = 2;
+		runtime->num_sprites[PIPE_C] = 1;
 	} else if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
 		for_each_pipe(dev_priv, pipe)
-			info->num_sprites[pipe] = 2;
+			runtime->num_sprites[pipe] = 2;
 	} else if (INTEL_GEN(dev_priv) >= 5 || IS_G4X(dev_priv)) {
 		for_each_pipe(dev_priv, pipe)
-			info->num_sprites[pipe] = 1;
+			runtime->num_sprites[pipe] = 1;
 	}
 
 	if (i915_modparams.disable_display) {
 		DRM_INFO("Display disabled (module parameter)\n");
 		info->num_pipes = 0;
 	} else if (HAS_DISPLAY(dev_priv) &&
-		   (IS_GEN7(dev_priv) || IS_GEN8(dev_priv)) &&
+		   (IS_GEN_RANGE(dev_priv, 7, 8)) &&
 		   HAS_PCH_SPLIT(dev_priv)) {
 		u32 fuse_strap = I915_READ(FUSE_STRAP);
 		u32 sfuse_strap = I915_READ(SFUSE_STRAP);
@@ -811,7 +795,7 @@ void intel_device_info_runtime_init(struct intel_device_info *info)
 			DRM_INFO("PipeC fused off\n");
 			info->num_pipes -= 1;
 		}
-	} else if (HAS_DISPLAY(dev_priv) && IS_GEN9(dev_priv)) {
+	} else if (HAS_DISPLAY(dev_priv) && INTEL_GEN(dev_priv) >= 9) {
 		u32 dfsm = I915_READ(SKL_DFSM);
 		u8 disabled_mask = 0;
 		bool invalid;
@@ -851,20 +835,20 @@ void intel_device_info_runtime_init(struct intel_device_info *info)
 		cherryview_sseu_info_init(dev_priv);
 	else if (IS_BROADWELL(dev_priv))
 		broadwell_sseu_info_init(dev_priv);
-	else if (IS_GEN9(dev_priv))
+	else if (IS_GEN(dev_priv, 9))
 		gen9_sseu_info_init(dev_priv);
-	else if (IS_GEN10(dev_priv))
+	else if (IS_GEN(dev_priv, 10))
 		gen10_sseu_info_init(dev_priv);
 	else if (INTEL_GEN(dev_priv) >= 11)
 		gen11_sseu_info_init(dev_priv);
 
-	if (IS_GEN6(dev_priv) && intel_vtd_active()) {
+	if (IS_GEN(dev_priv, 6) && intel_vtd_active()) {
 		DRM_INFO("Disabling ppGTT for VT-d support\n");
 		info->ppgtt = INTEL_PPGTT_NONE;
 	}
 
 	/* Initialize command stream timestamp frequency */
-	info->cs_timestamp_frequency_khz = read_timestamp_frequency(dev_priv);
+	runtime->cs_timestamp_frequency_khz = read_timestamp_frequency(dev_priv);
 }
 
 void intel_driver_caps_print(const struct intel_driver_caps *caps,
@@ -884,35 +868,44 @@ void intel_driver_caps_print(const struct intel_driver_caps *caps,
 void intel_device_info_init_mmio(struct drm_i915_private *dev_priv)
 {
 	struct intel_device_info *info = mkwrite_device_info(dev_priv);
-	u32 media_fuse;
+	unsigned int logical_vdbox = 0;
 	unsigned int i;
+	u32 media_fuse;
 
 	if (INTEL_GEN(dev_priv) < 11)
 		return;
 
 	media_fuse = ~I915_READ(GEN11_GT_VEBOX_VDBOX_DISABLE);
 
-	info->vdbox_enable = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK;
-	info->vebox_enable = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
-			     GEN11_GT_VEBOX_DISABLE_SHIFT;
+	RUNTIME_INFO(dev_priv)->vdbox_enable = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK;
+	RUNTIME_INFO(dev_priv)->vebox_enable = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
+		GEN11_GT_VEBOX_DISABLE_SHIFT;
 
-	DRM_DEBUG_DRIVER("vdbox enable: %04x\n", info->vdbox_enable);
+	DRM_DEBUG_DRIVER("vdbox enable: %04x\n", RUNTIME_INFO(dev_priv)->vdbox_enable);
 	for (i = 0; i < I915_MAX_VCS; i++) {
 		if (!HAS_ENGINE(dev_priv, _VCS(i)))
 			continue;
 
-		if (!(BIT(i) & info->vdbox_enable)) {
+		if (!(BIT(i) & RUNTIME_INFO(dev_priv)->vdbox_enable)) {
 			info->ring_mask &= ~ENGINE_MASK(_VCS(i));
 			DRM_DEBUG_DRIVER("vcs%u fused off\n", i);
+			continue;
 		}
+
+		/*
+		 * In Gen11, only even numbered logical VDBOXes are
+		 * hooked up to an SFC (Scaler & Format Converter) unit.
+		 */
+		if (logical_vdbox++ % 2 == 0)
+			RUNTIME_INFO(dev_priv)->vdbox_sfc_access |= BIT(i);
 	}
 
-	DRM_DEBUG_DRIVER("vebox enable: %04x\n", info->vebox_enable);
+	DRM_DEBUG_DRIVER("vebox enable: %04x\n", RUNTIME_INFO(dev_priv)->vebox_enable);
 	for (i = 0; i < I915_MAX_VECS; i++) {
 		if (!HAS_ENGINE(dev_priv, _VECS(i)))
 			continue;
 
-		if (!(BIT(i) & info->vebox_enable)) {
+		if (!(BIT(i) & RUNTIME_INFO(dev_priv)->vebox_enable)) {
 			info->ring_mask &= ~ENGINE_MASK(_VECS(i));
 			DRM_DEBUG_DRIVER("vecs%u fused off\n", i);
 		}
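
A worked example of the SFC rule above, with illustrative fuse values:
for vdbox_enable = 0b0101 (VCS0 and VCS2 present), the logical numbering
skips the fused-off engine, so VCS0 is logical 0 (even, gets an SFC) and
VCS2 is logical 1 (odd, no SFC), giving vdbox_sfc_access = BIT(0).

/* Sketch only: recompute vdbox_sfc_access from a vdbox enable mask. */
static u8 sfc_access_sketch(u8 vdbox_enable)
{
	unsigned int i, logical_vdbox = 0;
	u8 sfc = 0;

	for (i = 0; i < 8; i++) {
		if (!(vdbox_enable & BIT(i)))
			continue;
		if (logical_vdbox++ % 2 == 0)
			sfc |= BIT(i);
	}

	return sfc;
}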
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index 1caf24e2cf0b..e8b8661df746 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -89,6 +89,7 @@ enum intel_ppgtt {
 	func(is_alpha_support); \
 	/* Keep has_* in alphabetical order */ \
 	func(has_64bit_reloc); \
+	func(gpu_reset_clobbers_display); \
 	func(has_reset_engine); \
 	func(has_fpga_dbg); \
 	func(has_guc); \
@@ -114,7 +115,7 @@ enum intel_ppgtt {
 	func(has_ddi); \
 	func(has_dp_mst); \
 	func(has_fbc); \
-	func(has_gmch_display); \
+	func(has_gmch); \
 	func(has_hotplug); \
 	func(has_ipc); \
 	func(has_overlay); \
@@ -152,12 +153,10 @@ struct sseu_dev_info {
 typedef u8 intel_ring_mask_t;
 
 struct intel_device_info {
-	u16 device_id;
 	u16 gen_mask;
 
 	u8 gen;
 	u8 gt; /* GT number, 0 if undefined */
-	u8 num_rings;
 	intel_ring_mask_t ring_mask; /* Rings supported by the HW */
 
 	enum intel_platform platform;
@@ -169,8 +168,6 @@ struct intel_device_info {
 	u32 display_mmio_offset;
 
 	u8 num_pipes;
-	u8 num_sprites[I915_MAX_PIPES];
-	u8 num_scalers[I915_MAX_PIPES];
 
 #define DEFINE_FLAG(name) u8 name:1
 	DEV_INFO_FOR_EACH_FLAG(DEFINE_FLAG);
@@ -189,6 +186,22 @@ struct intel_device_info {
 	int trans_offsets[I915_MAX_TRANSCODERS];
 	int cursor_offsets[I915_MAX_PIPES];
 
+	struct color_luts {
+		u16 degamma_lut_size;
+		u16 gamma_lut_size;
+		u32 degamma_lut_tests;
+		u32 gamma_lut_tests;
+	} color;
+};
+
+struct intel_runtime_info {
+	u16 device_id;
+
+	u8 num_sprites[I915_MAX_PIPES];
+	u8 num_scalers[I915_MAX_PIPES];
+
+	u8 num_rings;
+
 	/* Slice/subslice/EU info */
 	struct sseu_dev_info sseu;
 
@@ -198,10 +211,8 @@ struct intel_device_info {
 	u8 vdbox_enable;
 	u8 vebox_enable;
 
-	struct color_luts {
-		u16 degamma_lut_size;
-		u16 gamma_lut_size;
-	} color;
+	/* Media engine access to SFC per instance */
+	u8 vdbox_sfc_access;
 };
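
The split keeps per-PCI-ID constants in intel_device_info and moves
everything probed from fuses or registers at load time into
intel_runtime_info, so only the runtime side needs to stay writable after
probe. A sketch of the resulting access pattern, assuming the INTEL_INFO()
and RUNTIME_INFO() accessors used elsewhere in this series:

/* Sketch only: static data via INTEL_INFO(), probed data via
 * RUNTIME_INFO(). */
static void info_split_sketch(struct drm_i915_private *dev_priv)
{
	DRM_DEBUG_DRIVER("gen %u, %u rings, %u sprites on pipe A\n",
			 INTEL_INFO(dev_priv)->gen,
			 RUNTIME_INFO(dev_priv)->num_rings,
			 RUNTIME_INFO(dev_priv)->num_sprites[PIPE_A]);
}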
 
 struct intel_driver_caps {
@@ -258,12 +269,10 @@ static inline void sseu_set_eus(struct sseu_dev_info *sseu,
 
 const char *intel_platform_name(enum intel_platform platform);
 
-void intel_device_info_runtime_init(struct intel_device_info *info);
-void intel_device_info_dump(const struct intel_device_info *info,
-			    struct drm_printer *p);
+void intel_device_info_runtime_init(struct drm_i915_private *dev_priv);
 void intel_device_info_dump_flags(const struct intel_device_info *info,
 				  struct drm_printer *p);
-void intel_device_info_dump_runtime(const struct intel_device_info *info,
+void intel_device_info_dump_runtime(const struct intel_runtime_info *info,
 				    struct drm_printer *p);
 void intel_device_info_dump_topology(const struct sseu_dev_info *sseu,
 				     struct drm_printer *p);
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 248128126422..ccb616351bba 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -24,33 +24,44 @@
  *	Eric Anholt <eric@anholt.net>
  */
 
-#include <linux/module.h>
-#include <linux/input.h>
 #include <linux/i2c.h>
+#include <linux/input.h>
+#include <linux/intel-iommu.h>
 #include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/reservation.h>
 #include <linux/slab.h>
 #include <linux/vgaarb.h>
+
+#include <drm/drm_atomic.h>
+#include <drm/drm_atomic_helper.h>
+#include <drm/drm_atomic_uapi.h>
+#include <drm/drm_dp_helper.h>
 #include <drm/drm_edid.h>
-#include <drm/drmP.h>
-#include "intel_drv.h"
-#include "intel_frontbuffer.h"
+#include <drm/drm_fourcc.h>
+#include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
+#include <drm/drm_rect.h>
 #include <drm/i915_drm.h>
+
 #include "i915_drv.h"
 #include "i915_gem_clflush.h"
+#include "i915_reset.h"
+#include "i915_trace.h"
+#include "intel_drv.h"
 #include "intel_dsi.h"
+#include "intel_frontbuffer.h"
-#include <drm/drm_atomic.h>
-#include <drm/drm_atomic_helper.h>
-#include <drm/drm_dp_helper.h>
-#include <drm/drm_crtc_helper.h>
-#include <drm/drm_plane_helper.h>
-#include <drm/drm_rect.h>
-#include <drm/drm_atomic_uapi.h>
-#include <linux/intel-iommu.h>
-#include <linux/reservation.h>
 
 /* Primary plane formats for gen <= 3 */
-static const uint32_t i8xx_primary_formats[] = {
+static const u32 i8xx_primary_formats[] = {
 	DRM_FORMAT_C8,
 	DRM_FORMAT_RGB565,
 	DRM_FORMAT_XRGB1555,
@@ -58,7 +69,7 @@ static const uint32_t i8xx_primary_formats[] = {
 };
 
 /* Primary plane formats for gen >= 4 */
-static const uint32_t i965_primary_formats[] = {
+static const u32 i965_primary_formats[] = {
 	DRM_FORMAT_C8,
 	DRM_FORMAT_RGB565,
 	DRM_FORMAT_XRGB8888,
@@ -67,18 +78,18 @@ static const uint32_t i965_primary_formats[] = {
 	DRM_FORMAT_XBGR2101010,
 };
 
-static const uint64_t i9xx_format_modifiers[] = {
+static const u64 i9xx_format_modifiers[] = {
 	I915_FORMAT_MOD_X_TILED,
 	DRM_FORMAT_MOD_LINEAR,
 	DRM_FORMAT_MOD_INVALID
 };
 
 /* Cursor formats */
-static const uint32_t intel_cursor_formats[] = {
+static const u32 intel_cursor_formats[] = {
 	DRM_FORMAT_ARGB8888,
 };
 
-static const uint64_t cursor_format_modifiers[] = {
+static const u64 cursor_format_modifiers[] = {
 	DRM_FORMAT_MOD_LINEAR,
 	DRM_FORMAT_MOD_INVALID
 };
@@ -494,7 +505,7 @@ static int pnv_calc_dpll_params(int refclk, struct dpll *clock)
 	return clock->dot;
 }
 
-static uint32_t i9xx_dpll_compute_m(struct dpll *dpll)
+static u32 i9xx_dpll_compute_m(struct dpll *dpll)
 {
 	return 5 * (dpll->m1 + 2) + (dpll->m2 + 2);
 }
@@ -529,8 +540,8 @@ int chv_calc_dpll_params(int refclk, struct dpll *clock)
 	clock->p = clock->p1 * clock->p2;
 	if (WARN_ON(clock->n == 0 || clock->p == 0))
 		return 0;
-	clock->vco = DIV_ROUND_CLOSEST_ULL((uint64_t)refclk * clock->m,
-			clock->n << 22);
+	clock->vco = DIV_ROUND_CLOSEST_ULL((u64)refclk * clock->m,
+					   clock->n << 22);
 	clock->dot = DIV_ROUND_CLOSEST(clock->vco, clock->p);
 
 	return clock->dot / 5;
@@ -892,7 +903,7 @@ chv_find_best_dpll(const struct intel_limit *limit,
 	struct drm_device *dev = crtc->base.dev;
 	unsigned int best_error_ppm;
 	struct dpll clock;
-	uint64_t m2;
+	u64 m2;
 	int found = false;
 
 	memset(best_clock, 0, sizeof(*best_clock));
@@ -914,7 +925,7 @@ chv_find_best_dpll(const struct intel_limit *limit,
 
 			clock.p = clock.p1 * clock.p2;
 
-			m2 = DIV_ROUND_CLOSEST_ULL(((uint64_t)target * clock.p *
+			m2 = DIV_ROUND_CLOSEST_ULL(((u64)target * clock.p *
 					clock.n) << 22, refclk * clock.m1);
 
 			if (m2 > INT_MAX/clock.m1)
@@ -984,7 +995,7 @@ static bool pipe_scanline_is_moving(struct drm_i915_private *dev_priv,
 	u32 line1, line2;
 	u32 line_mask;
 
-	if (IS_GEN2(dev_priv))
+	if (IS_GEN(dev_priv, 2))
 		line_mask = DSL_LINEMASK_GEN2;
 	else
 		line_mask = DSL_LINEMASK_GEN3;
@@ -1110,7 +1121,7 @@ static void assert_fdi_tx_pll_enabled(struct drm_i915_private *dev_priv,
 	u32 val;
 
 	/* ILK FDI PLL is always enabled */
-	if (IS_GEN5(dev_priv))
+	if (IS_GEN(dev_priv, 5))
 		return;
 
 	/* On Haswell, DDI ports are responsible for the FDI PLL setup */
@@ -1198,17 +1209,19 @@ void assert_pipe(struct drm_i915_private *dev_priv,
 	enum transcoder cpu_transcoder = intel_pipe_to_cpu_transcoder(dev_priv,
 								      pipe);
 	enum intel_display_power_domain power_domain;
+	intel_wakeref_t wakeref;
 
 	/* we keep both pipes enabled on 830 */
 	if (IS_I830(dev_priv))
 		state = true;
 
 	power_domain = POWER_DOMAIN_TRANSCODER(cpu_transcoder);
-	if (intel_display_power_get_if_enabled(dev_priv, power_domain)) {
+	wakeref = intel_display_power_get_if_enabled(dev_priv, power_domain);
+	if (wakeref) {
 		u32 val = I915_READ(PIPECONF(cpu_transcoder));
 		cur_state = !!(val & PIPECONF_ENABLE);
 
-		intel_display_power_put(dev_priv, power_domain);
+		intel_display_power_put(dev_priv, power_domain, wakeref);
 	} else {
 		cur_state = false;
 	}
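
As this hunk shows, intel_display_power_get_if_enabled() now returns an intel_wakeref_t cookie, and intel_display_power_put() takes that cookie back, so a leaked reference can be attributed to the exact acquisition that was never released. A rough standalone sketch of the cookie pattern, assuming a simplified single-counter tracker (the real code records per-acquisition debug state):

#include <stdio.h>
#include <stdint.h>

typedef uintptr_t wakeref_t;

static unsigned int power_refcount;
static wakeref_t next_cookie = 1;

/* Returns 0 on failure, otherwise an opaque cookie naming this get. */
static wakeref_t power_get_if_enabled(void)
{
	power_refcount++;
	return next_cookie++;
}

static void power_put(wakeref_t wakeref)
{
	if (!wakeref)	/* a zero cookie: the get never succeeded */
		return;
	power_refcount--;
	printf("released cookie %lu, %u refs still held\n",
	       (unsigned long)wakeref, power_refcount);
}

int main(void)
{
	wakeref_t wakeref = power_get_if_enabled();

	if (wakeref) {
		/* ... safe to touch the hardware here ... */
		power_put(wakeref);
	}
	return 0;
}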
@@ -1609,7 +1622,7 @@ static void ironlake_enable_pch_transcoder(const struct intel_crtc_state *crtc_s
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 	enum pipe pipe = crtc->pipe;
 	i915_reg_t reg;
-	uint32_t val, pipeconf_val;
+	u32 val, pipeconf_val;
 
 	/* Make sure PCH DPLL is enabled */
 	assert_shared_dpll_enabled(dev_priv, crtc_state->shared_dpll);
@@ -1697,7 +1710,7 @@ static void ironlake_disable_pch_transcoder(struct drm_i915_private *dev_priv,
 					    enum pipe pipe)
 {
 	i915_reg_t reg;
-	uint32_t val;
+	u32 val;
 
 	/* FDI relies on the transcoder */
 	assert_fdi_tx_disabled(dev_priv, pipe);
@@ -1754,6 +1767,35 @@ enum pipe intel_crtc_pch_transcoder(struct intel_crtc *crtc)
 		return crtc->pipe;
 }
 
+static u32 intel_crtc_max_vblank_count(const struct intel_crtc_state *crtc_state)
+{
+	struct drm_i915_private *dev_priv = to_i915(crtc_state->base.crtc->dev);
+
+	/*
+	 * On i965gm the hardware frame counter reads
+	 * zero when the TV encoder is enabled :(
+	 */
+	if (IS_I965GM(dev_priv) &&
+	    (crtc_state->output_types & BIT(INTEL_OUTPUT_TVOUT)))
+		return 0;
+
+	if (INTEL_GEN(dev_priv) >= 5 || IS_G4X(dev_priv))
+		return 0xffffffff; /* full 32 bit counter */
+	else if (INTEL_GEN(dev_priv) >= 3)
+		return 0xffffff; /* only 24 bits of frame count */
+	else
+		return 0; /* Gen2 doesn't have a hardware frame counter */
+}
+
+static void intel_crtc_vblank_on(const struct intel_crtc_state *crtc_state)
+{
+	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+
+	drm_crtc_set_max_vblank_count(&crtc->base,
+				      intel_crtc_max_vblank_count(crtc_state));
+	drm_crtc_vblank_on(&crtc->base);
+}
+
 static void intel_enable_pipe(const struct intel_crtc_state *new_crtc_state)
 {
 	struct intel_crtc *crtc = to_intel_crtc(new_crtc_state->base.crtc);
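
intel_crtc_max_vblank_count() in the hunk above makes the frame counter width a per-crtc-state decision, which the i965gm TV-out quirk requires, instead of the single device-wide drm.max_vblank_count. A standalone sketch of the same selection logic (the config struct is illustrative):

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

struct crtc_cfg {
	int gen;
	bool is_g4x;
	bool is_i965gm;
	bool tv_output;
};

static uint32_t max_vblank_count(const struct crtc_cfg *c)
{
	/* i965gm's hardware frame counter reads zero with TV-out enabled */
	if (c->is_i965gm && c->tv_output)
		return 0;

	if (c->gen >= 5 || c->is_g4x)
		return 0xffffffff;	/* full 32 bit counter */
	else if (c->gen >= 3)
		return 0xffffff;	/* only 24 bits of frame count */
	else
		return 0;		/* gen2: no hardware frame counter */
}

int main(void)
{
	struct crtc_cfg tv = { .gen = 4, .is_i965gm = true, .tv_output = true };

	printf("max vblank count: %#x\n", max_vblank_count(&tv));
	return 0;
}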
@@ -1772,7 +1814,7 @@ static void intel_enable_pipe(const struct intel_crtc_state *new_crtc_state)
 	 * a plane.  On ILK+ the pipe PLLs are integrated, so we don't
 	 * need the check.
 	 */
-	if (HAS_GMCH_DISPLAY(dev_priv)) {
+	if (HAS_GMCH(dev_priv)) {
 		if (intel_crtc_has_type(new_crtc_state, INTEL_OUTPUT_DSI))
 			assert_dsi_pll_enabled(dev_priv);
 		else
@@ -1806,7 +1848,7 @@ static void intel_enable_pipe(const struct intel_crtc_state *new_crtc_state)
 	 * when it's derived from the timestamps. So let's wait for the
 	 * pipe to start properly before we call drm_crtc_vblank_on()
 	 */
-	if (dev_priv->drm.max_vblank_count == 0)
+	if (intel_crtc_max_vblank_count(new_crtc_state) == 0)
 		intel_wait_for_pipe_scanline_moving(crtc);
 }
 
@@ -1850,7 +1892,7 @@ static void intel_disable_pipe(const struct intel_crtc_state *old_crtc_state)
 
 static unsigned int intel_tile_size(const struct drm_i915_private *dev_priv)
 {
-	return IS_GEN2(dev_priv) ? 2048 : 4096;
+	return IS_GEN(dev_priv, 2) ? 2048 : 4096;
 }
 
 static unsigned int
@@ -1863,7 +1905,7 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, int color_plane)
 	case DRM_FORMAT_MOD_LINEAR:
 		return cpp;
 	case I915_FORMAT_MOD_X_TILED:
-		if (IS_GEN2(dev_priv))
+		if (IS_GEN(dev_priv, 2))
 			return 128;
 		else
 			return 512;
@@ -1872,7 +1914,7 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, int color_plane)
 			return 128;
 		/* fall through */
 	case I915_FORMAT_MOD_Y_TILED:
-		if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_priv))
+		if (IS_GEN(dev_priv, 2) || HAS_128_BYTE_Y_TILING(dev_priv))
 			return 128;
 		else
 			return 512;
@@ -2024,6 +2066,7 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb,
 	struct drm_device *dev = fb->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct drm_i915_gem_object *obj = intel_fb_obj(fb);
+	intel_wakeref_t wakeref;
 	struct i915_vma *vma;
 	unsigned int pinctl;
 	u32 alignment;
@@ -2047,7 +2090,7 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb,
 	 * intel_runtime_pm_put(), so it is correct to wrap only the
 	 * pin/unpin/fence and not more.
 	 */
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	atomic_inc(&dev_priv->gpu_error.pending_fb_pin);
 
@@ -2060,7 +2103,7 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb,
 	 * complicated than this. For example, Cherryview appears quite
 	 * happy to scanout from anywhere within its global aperture.
 	 */
-	if (HAS_GMCH_DISPLAY(dev_priv))
+	if (HAS_GMCH(dev_priv))
 		pinctl |= PIN_MAPPABLE;
 
 	vma = i915_gem_object_pin_to_display_plane(obj,
@@ -2102,7 +2145,7 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb,
 err:
 	atomic_dec(&dev_priv->gpu_error.pending_fb_pin);
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 	return vma;
 }
 
@@ -2373,7 +2416,7 @@ static int intel_fb_offset_to_xy(int *x, int *y,
 	return 0;
 }
 
-static unsigned int intel_fb_modifier_to_tiling(uint64_t fb_modifier)
+static unsigned int intel_fb_modifier_to_tiling(u64 fb_modifier)
 {
 	switch (fb_modifier) {
 	case I915_FORMAT_MOD_X_TILED:
@@ -3161,7 +3204,7 @@ i9xx_plane_max_stride(struct intel_plane *plane,
 {
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
 
-	if (!HAS_GMCH_DISPLAY(dev_priv)) {
+	if (!HAS_GMCH(dev_priv)) {
 		return 32*1024;
 	} else if (INTEL_GEN(dev_priv) >= 4) {
 		if (modifier == I915_FORMAT_MOD_X_TILED)
@@ -3181,28 +3224,38 @@ i9xx_plane_max_stride(struct intel_plane *plane,
 	}
 }
 
+static u32 i9xx_plane_ctl_crtc(const struct intel_crtc_state *crtc_state)
+{
+	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	u32 dspcntr = 0;
+
+	dspcntr |= DISPPLANE_GAMMA_ENABLE;
+
+	if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
+		dspcntr |= DISPPLANE_PIPE_CSC_ENABLE;
+
+	if (INTEL_GEN(dev_priv) < 5)
+		dspcntr |= DISPPLANE_SEL_PIPE(crtc->pipe);
+
+	return dspcntr;
+}
+
 static u32 i9xx_plane_ctl(const struct intel_crtc_state *crtc_state,
 			  const struct intel_plane_state *plane_state)
 {
 	struct drm_i915_private *dev_priv =
 		to_i915(plane_state->base.plane->dev);
-	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
 	const struct drm_framebuffer *fb = plane_state->base.fb;
 	unsigned int rotation = plane_state->base.rotation;
 	u32 dspcntr;
 
-	dspcntr = DISPLAY_PLANE_ENABLE | DISPPLANE_GAMMA_ENABLE;
+	dspcntr = DISPLAY_PLANE_ENABLE;
 
-	if (IS_G4X(dev_priv) || IS_GEN5(dev_priv) ||
-	    IS_GEN6(dev_priv) || IS_IVYBRIDGE(dev_priv))
+	if (IS_G4X(dev_priv) || IS_GEN(dev_priv, 5) ||
+	    IS_GEN(dev_priv, 6) || IS_IVYBRIDGE(dev_priv))
 		dspcntr |= DISPPLANE_TRICKLE_FEED_DISABLE;
 
-	if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
-		dspcntr |= DISPPLANE_PIPE_CSC_ENABLE;
-
-	if (INTEL_GEN(dev_priv) < 5)
-		dspcntr |= DISPPLANE_SEL_PIPE(crtc->pipe);
-
 	switch (fb->format->format) {
 	case DRM_FORMAT_C8:
 		dspcntr |= DISPPLANE_8BPP;
@@ -3330,11 +3383,13 @@ static void i9xx_update_plane(struct intel_plane *plane,
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
 	enum i9xx_plane_id i9xx_plane = plane->i9xx_plane;
 	u32 linear_offset;
-	u32 dspcntr = plane_state->ctl;
 	int x = plane_state->color_plane[0].x;
 	int y = plane_state->color_plane[0].y;
 	unsigned long irqflags;
 	u32 dspaddr_offset;
+	u32 dspcntr;
+
+	dspcntr = plane_state->ctl | i9xx_plane_ctl_crtc(crtc_state);
 
 	linear_offset = intel_fb_xy_to_linear(x, y, plane_state, 0);
 
@@ -3394,10 +3449,23 @@ static void i9xx_disable_plane(struct intel_plane *plane,
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
 	enum i9xx_plane_id i9xx_plane = plane->i9xx_plane;
 	unsigned long irqflags;
+	u32 dspcntr;
+
+	/*
+	 * DSPCNTR pipe gamma enable on g4x+ and pipe csc
+	 * enable on ilk+ affect the pipe bottom color as
+	 * well, so we must configure them even if the plane
+	 * is disabled.
+	 *
+	 * On pre-g4x there is no way to gamma correct the
+	 * pipe bottom color but we'll keep on doing this
+	 * anyway.
+	 */
+	dspcntr = i9xx_plane_ctl_crtc(crtc_state);
 
 	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
-	I915_WRITE_FW(DSPCNTR(i9xx_plane), 0);
+	I915_WRITE_FW(DSPCNTR(i9xx_plane), dspcntr);
 	if (INTEL_GEN(dev_priv) >= 4)
 		I915_WRITE_FW(DSPSURF(i9xx_plane), 0);
 	else
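
These hunks split each plane control value into a crtc-derived half (i9xx_plane_ctl_crtc() and friends) and a plane-derived half, so the disable path can keep programming the gamma/CSC bits that also affect the pipe bottom color. A compilable sketch of the split; the bit names below are made up for illustration:

#include <stdio.h>
#include <stdint.h>

#define CTL_ENABLE		(1u << 31)
#define CTL_GAMMA_ENABLE	(1u << 30)
#define CTL_FORMAT_8BPP		(2u << 26)

/* Bits that follow the crtc: still valid with the plane off. */
static uint32_t plane_ctl_crtc(void)
{
	return CTL_GAMMA_ENABLE;
}

/* Bits that only make sense while the plane scans out. */
static uint32_t plane_ctl_plane(void)
{
	return CTL_ENABLE | CTL_FORMAT_8BPP;
}

int main(void)
{
	uint32_t enable = plane_ctl_crtc() | plane_ctl_plane();
	uint32_t disable = plane_ctl_crtc(); /* gamma kept for bottom color */

	printf("enable: %#x, disable: %#x\n", enable, disable);
	return 0;
}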
@@ -3412,6 +3480,7 @@ static bool i9xx_plane_get_hw_state(struct intel_plane *plane,
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
 	enum intel_display_power_domain power_domain;
 	enum i9xx_plane_id i9xx_plane = plane->i9xx_plane;
+	intel_wakeref_t wakeref;
 	bool ret;
 	u32 val;
 
@@ -3421,7 +3490,8 @@ static bool i9xx_plane_get_hw_state(struct intel_plane *plane,
 	 * display power wells.
 	 */
 	power_domain = POWER_DOMAIN_PIPE(plane->pipe);
-	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv, power_domain);
+	if (!wakeref)
 		return false;
 
 	val = I915_READ(DSPCNTR(i9xx_plane));
@@ -3434,7 +3504,7 @@ static bool i9xx_plane_get_hw_state(struct intel_plane *plane,
 		*pipe = (val & DISPPLANE_SEL_PIPE_MASK) >>
 			DISPPLANE_SEL_PIPE_SHIFT;
 
-	intel_display_power_put(dev_priv, power_domain);
+	intel_display_power_put(dev_priv, power_domain, wakeref);
 
 	return ret;
 }
@@ -3503,7 +3573,7 @@ u32 skl_plane_stride(const struct intel_plane_state *plane_state,
 	return stride / skl_plane_stride_mult(fb, color_plane, rotation);
 }
 
-static u32 skl_plane_ctl_format(uint32_t pixel_format)
+static u32 skl_plane_ctl_format(u32 pixel_format)
 {
 	switch (pixel_format) {
 	case DRM_FORMAT_C8:
@@ -3573,7 +3643,7 @@ static u32 glk_plane_color_ctl_alpha(const struct intel_plane_state *plane_state
 	}
 }
 
-static u32 skl_plane_ctl_tiling(uint64_t fb_modifier)
+static u32 skl_plane_ctl_tiling(u64 fb_modifier)
 {
 	switch (fb_modifier) {
 	case DRM_FORMAT_MOD_LINEAR:
@@ -3632,6 +3702,20 @@ static u32 cnl_plane_ctl_flip(unsigned int reflect)
 	return 0;
 }
 
+u32 skl_plane_ctl_crtc(const struct intel_crtc_state *crtc_state)
+{
+	struct drm_i915_private *dev_priv = to_i915(crtc_state->base.crtc->dev);
+	u32 plane_ctl = 0;
+
+	if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv))
+		return plane_ctl;
+
+	plane_ctl |= PLANE_CTL_PIPE_GAMMA_ENABLE;
+	plane_ctl |= PLANE_CTL_PIPE_CSC_ENABLE;
+
+	return plane_ctl;
+}
+
 u32 skl_plane_ctl(const struct intel_crtc_state *crtc_state,
 		  const struct intel_plane_state *plane_state)
 {
@@ -3646,10 +3730,7 @@ u32 skl_plane_ctl(const struct intel_crtc_state *crtc_state,
 
 	if (INTEL_GEN(dev_priv) < 10 && !IS_GEMINILAKE(dev_priv)) {
 		plane_ctl |= skl_plane_ctl_alpha(plane_state);
-		plane_ctl |=
-			PLANE_CTL_PIPE_GAMMA_ENABLE |
-			PLANE_CTL_PIPE_CSC_ENABLE |
-			PLANE_CTL_PLANE_GAMMA_DISABLE;
+		plane_ctl |= PLANE_CTL_PLANE_GAMMA_DISABLE;
 
 		if (plane_state->base.color_encoding == DRM_COLOR_YCBCR_BT709)
 			plane_ctl |= PLANE_CTL_YUV_TO_RGB_CSC_FORMAT_BT709;
@@ -3674,19 +3755,27 @@ u32 skl_plane_ctl(const struct intel_crtc_state *crtc_state,
 	return plane_ctl;
 }
 
+u32 glk_plane_color_ctl_crtc(const struct intel_crtc_state *crtc_state)
+{
+	struct drm_i915_private *dev_priv = to_i915(crtc_state->base.crtc->dev);
+	u32 plane_color_ctl = 0;
+
+	if (INTEL_GEN(dev_priv) >= 11)
+		return plane_color_ctl;
+
+	plane_color_ctl |= PLANE_COLOR_PIPE_GAMMA_ENABLE;
+	plane_color_ctl |= PLANE_COLOR_PIPE_CSC_ENABLE;
+
+	return plane_color_ctl;
+}
+
 u32 glk_plane_color_ctl(const struct intel_crtc_state *crtc_state,
 			const struct intel_plane_state *plane_state)
 {
-	struct drm_i915_private *dev_priv =
-		to_i915(plane_state->base.plane->dev);
 	const struct drm_framebuffer *fb = plane_state->base.fb;
 	struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
 	u32 plane_color_ctl = 0;
 
-	if (INTEL_GEN(dev_priv) < 11) {
-		plane_color_ctl |= PLANE_COLOR_PIPE_GAMMA_ENABLE;
-		plane_color_ctl |= PLANE_COLOR_PIPE_CSC_ENABLE;
-	}
 	plane_color_ctl |= PLANE_COLOR_PLANE_GAMMA_DISABLE;
 	plane_color_ctl |= glk_plane_color_ctl_alpha(plane_state);
 
@@ -3735,7 +3824,7 @@ __intel_display_resume(struct drm_device *dev,
 	}
 
 	/* ignore any reset values/BIOS leftovers in the WM registers */
-	if (!HAS_GMCH_DISPLAY(to_i915(dev)))
+	if (!HAS_GMCH(to_i915(dev)))
 		to_intel_atomic_state(state)->skip_intermediate_wm = true;
 
 	ret = drm_atomic_helper_commit_duplicated_state(state, ctx);
@@ -3746,8 +3835,8 @@ __intel_display_resume(struct drm_device *dev,
 
 static bool gpu_reset_clobbers_display(struct drm_i915_private *dev_priv)
 {
-	return intel_has_gpu_reset(dev_priv) &&
-		INTEL_GEN(dev_priv) < 5 && !IS_G4X(dev_priv);
+	return (INTEL_INFO(dev_priv)->gpu_reset_clobbers_display &&
+		intel_has_gpu_reset(dev_priv));
 }
 
 void intel_prepare_reset(struct drm_i915_private *dev_priv)
@@ -3860,6 +3949,30 @@ unlock:
 	clear_bit(I915_RESET_MODESET, &dev_priv->gpu_error.flags);
 }
 
+static void icl_set_pipe_chicken(struct intel_crtc *crtc)
+{
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	enum pipe pipe = crtc->pipe;
+	u32 tmp;
+
+	tmp = I915_READ(PIPE_CHICKEN(pipe));
+
+	/*
+	 * Display WA #1153: icl
+	 * enable hardware to bypass the alpha math
+	 * and rounding for per-pixel values 00 and 0xff
+	 */
+	tmp |= PER_PIXEL_ALPHA_BYPASS_EN;
+
+	/*
+	 * W/A for underruns with linear/X-tiled with
+	 * WM1+ disabled.
+	 */
+	tmp |= PM_FILL_MAINTAIN_DBUF_FULLNESS;
+
+	I915_WRITE(PIPE_CHICKEN(pipe), tmp);
+}
+
 static void intel_update_pipe_config(const struct intel_crtc_state *old_crtc_state,
 				     const struct intel_crtc_state *new_crtc_state)
 {
@@ -3894,6 +4007,19 @@ static void intel_update_pipe_config(const struct intel_crtc_state *old_crtc_sta
 		else if (old_crtc_state->pch_pfit.enabled)
 			ironlake_pfit_disable(old_crtc_state);
 	}
+
+	/*
+	 * We don't (yet) allow userspace to control the pipe background color,
+	 * so force it to black, but apply pipe gamma and CSC so that its
+	 * handling will match how we program our planes.
+	 */
+	if (INTEL_GEN(dev_priv) >= 9)
+		I915_WRITE(SKL_BOTTOM_COLOR(crtc->pipe),
+			   SKL_BOTTOM_COLOR_GAMMA_ENABLE |
+			   SKL_BOTTOM_COLOR_CSC_ENABLE);
+
+	if (INTEL_GEN(dev_priv) >= 11)
+		icl_set_pipe_chicken(crtc);
 }
 
 static void intel_fdi_normal_train(struct intel_crtc *crtc)
@@ -4120,7 +4246,7 @@ static void gen6_fdi_link_train(struct intel_crtc *crtc,
 	temp = I915_READ(reg);
 	temp &= ~FDI_LINK_TRAIN_NONE;
 	temp |= FDI_LINK_TRAIN_PATTERN_2;
-	if (IS_GEN6(dev_priv)) {
+	if (IS_GEN(dev_priv, 6)) {
 		temp &= ~FDI_LINK_TRAIN_VOL_EMP_MASK;
 		/* SNB-B */
 		temp |= FDI_LINK_TRAIN_400MV_0DB_SNB_B;
@@ -4593,7 +4719,7 @@ static void ironlake_pch_transcoder_set_timings(const struct intel_crtc_state *c
 
 static void cpt_set_fdi_bc_bifurcation(struct drm_i915_private *dev_priv, bool enable)
 {
-	uint32_t temp;
+	u32 temp;
 
 	temp = I915_READ(SOUTH_CHICKEN1);
 	if (!!(temp & FDI_BC_BIFURCATION_SELECT) == enable)
@@ -4919,10 +5045,10 @@ skl_update_scaler(struct intel_crtc_state *crtc_state, bool force_detach,
 	/* range checks */
 	if (src_w < SKL_MIN_SRC_W || src_h < SKL_MIN_SRC_H ||
 	    dst_w < SKL_MIN_DST_W || dst_h < SKL_MIN_DST_H ||
-	    (IS_GEN11(dev_priv) &&
+	    (IS_GEN(dev_priv, 11) &&
 	     (src_w > ICL_MAX_SRC_W || src_h > ICL_MAX_SRC_H ||
 	      dst_w > ICL_MAX_DST_W || dst_h > ICL_MAX_DST_H)) ||
-	    (!IS_GEN11(dev_priv) &&
+	    (!IS_GEN(dev_priv, 11) &&
 	     (src_w > SKL_MAX_SRC_W || src_h > SKL_MAX_SRC_H ||
 	      dst_w > SKL_MAX_DST_W || dst_h > SKL_MAX_DST_H)))	{
 		DRM_DEBUG_KMS("scaler_user index %u.%u: src %ux%u dst %ux%u "
@@ -5213,7 +5339,7 @@ intel_post_enable_primary(struct drm_crtc *crtc,
 	 * FIXME: Need to fix the logic to work when we turn off all planes
 	 * but leave the pipe running.
 	 */
-	if (IS_GEN2(dev_priv))
+	if (IS_GEN(dev_priv, 2))
 		intel_set_cpu_fifo_underrun_reporting(dev_priv, pipe, true);
 
 	/* Underruns don't always raise interrupts, so check manually. */
@@ -5234,7 +5360,7 @@ intel_pre_disable_primary_noatomic(struct drm_crtc *crtc)
 	 * Gen2 reports pipe underruns whenever all planes are disabled.
 	 * So disable underrun reporting before all the planes get disabled.
 	 */
-	if (IS_GEN2(dev_priv))
+	if (IS_GEN(dev_priv, 2))
 		intel_set_cpu_fifo_underrun_reporting(dev_priv, pipe, false);
 
 	hsw_disable_ips(to_intel_crtc_state(crtc->state));
@@ -5248,7 +5374,7 @@ intel_pre_disable_primary_noatomic(struct drm_crtc *crtc)
 	 * event which is after the vblank start event, so we need to have a
 	 * wait-for-vblank between disabling the plane and the pipe.
 	 */
-	if (HAS_GMCH_DISPLAY(dev_priv) &&
+	if (HAS_GMCH(dev_priv) &&
 	    intel_set_memory_cxsr(dev_priv, false))
 		intel_wait_for_vblank(dev_priv, pipe);
 }
@@ -5256,18 +5382,36 @@ intel_pre_disable_primary_noatomic(struct drm_crtc *crtc)
 static bool hsw_pre_update_disable_ips(const struct intel_crtc_state *old_crtc_state,
 				       const struct intel_crtc_state *new_crtc_state)
 {
+	struct intel_crtc *crtc = to_intel_crtc(new_crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+
 	if (!old_crtc_state->ips_enabled)
 		return false;
 
 	if (needs_modeset(&new_crtc_state->base))
 		return true;
 
+	/*
+	 * Workaround: Do not read or write the pipe palette/gamma data while

+	 * GAMMA_MODE is configured for split gamma and IPS_CTL has IPS enabled.
+	 *
+	 * Disable IPS before we program the LUT.
+	 */
+	if (IS_HASWELL(dev_priv) &&
+	    (new_crtc_state->base.color_mgmt_changed ||
+	     new_crtc_state->update_pipe) &&
+	    new_crtc_state->gamma_mode == GAMMA_MODE_MODE_SPLIT)
+		return true;
+
 	return !new_crtc_state->ips_enabled;
 }
 
 static bool hsw_post_update_enable_ips(const struct intel_crtc_state *old_crtc_state,
 				       const struct intel_crtc_state *new_crtc_state)
 {
+	struct intel_crtc *crtc = to_intel_crtc(new_crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+
 	if (!new_crtc_state->ips_enabled)
 		return false;
 
@@ -5275,6 +5419,18 @@ static bool hsw_post_update_enable_ips(const struct intel_crtc_state *old_crtc_s
 		return true;
 
 	/*
+	 * Workaround: Do not read or write the pipe palette/gamma data while
+	 * GAMMA_MODE is configured for split gamma and IPS_CTL has IPS enabled.
+	 *
+	 * Re-enable IPS after the LUT has been programmed.
+	 */
+	if (IS_HASWELL(dev_priv) &&
+	    (new_crtc_state->base.color_mgmt_changed ||
+	     new_crtc_state->update_pipe) &&
+	    new_crtc_state->gamma_mode == GAMMA_MODE_MODE_SPLIT)
+		return true;
+
+	/*
 	 * We can't read out IPS on broadwell, assume the worst and
 	 * forcibly enable IPS on the first fastset.
 	 */
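
With both predicates returning true for a Haswell split-gamma LUT change, the commit path now turns IPS off before the LUT is written and back on afterwards. A standalone illustration of that bracketing, using stand-in types and helpers rather than the driver's:

#include <stdio.h>
#include <stdbool.h>

/* Stand-ins for the real crtc state and hardware accessors. */
struct crtc {
	bool is_haswell;
	bool split_gamma;
	bool ips_enabled;
};

static bool needs_ips_toggle(const struct crtc *c)
{
	return c->is_haswell && c->split_gamma;
}

static void write_gamma_lut(const struct crtc *c)
{
	/* Must not run while IPS is on in split gamma mode. */
	printf("LUT write, IPS %s\n", c->ips_enabled ? "ON (bad!)" : "off");
}

int main(void)
{
	struct crtc c = { .is_haswell = true, .split_gamma = true,
			  .ips_enabled = true };

	if (needs_ips_toggle(&c))
		c.ips_enabled = false;	/* pre-update: disable IPS */

	write_gamma_lut(&c);

	if (needs_ips_toggle(&c))
		c.ips_enabled = true;	/* post-update: re-enable IPS */
	return 0;
}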
@@ -5292,7 +5448,7 @@ static bool needs_nv12_wa(struct drm_i915_private *dev_priv,
 		return false;
 
 	/* WA Display #0827: Gen9:all */
-	if (IS_GEN9(dev_priv) && !IS_GEMINILAKE(dev_priv))
+	if (IS_GEN(dev_priv, 9) && !IS_GEMINILAKE(dev_priv))
 		return true;
 
 	return false;
@@ -5365,7 +5521,7 @@ static void intel_pre_plane_update(struct intel_crtc_state *old_crtc_state,
 		 * Gen2 reports pipe underruns whenever all planes are disabled.
 		 * So disable underrun reporting before all the planes get disabled.
 		 */
-		if (IS_GEN2(dev_priv) && old_primary_state->visible &&
+		if (IS_GEN(dev_priv, 2) && old_primary_state->visible &&
 		    (modeset || !new_primary_state->base.visible))
 			intel_set_cpu_fifo_underrun_reporting(dev_priv, crtc->pipe, false);
 	}
@@ -5385,7 +5541,7 @@ static void intel_pre_plane_update(struct intel_crtc_state *old_crtc_state,
 	 * event which is after the vblank start event, so we need to have a
 	 * wait-for-vblank between disabling the plane and the pipe.
 	 */
-	if (HAS_GMCH_DISPLAY(dev_priv) && old_crtc_state->base.active &&
+	if (HAS_GMCH(dev_priv) && old_crtc_state->base.active &&
 	    pipe_config->disable_cxsr && intel_set_memory_cxsr(dev_priv, false))
 		intel_wait_for_vblank(dev_priv, crtc->pipe);
 
@@ -5578,6 +5734,26 @@ static void intel_encoders_post_pll_disable(struct drm_crtc *crtc,
 	}
 }
 
+static void intel_encoders_update_pipe(struct drm_crtc *crtc,
+				       struct intel_crtc_state *crtc_state,
+				       struct drm_atomic_state *old_state)
+{
+	struct drm_connector_state *conn_state;
+	struct drm_connector *conn;
+	int i;
+
+	for_each_new_connector_in_state(old_state, conn, conn_state, i) {
+		struct intel_encoder *encoder =
+			to_intel_encoder(conn_state->best_encoder);
+
+		if (conn_state->crtc != crtc)
+			continue;
+
+		if (encoder->update_pipe)
+			encoder->update_pipe(encoder, crtc_state, conn_state);
+	}
+}
+
 static void ironlake_crtc_enable(struct intel_crtc_state *pipe_config,
 				 struct drm_atomic_state *old_state)
 {
@@ -5641,7 +5817,8 @@ static void ironlake_crtc_enable(struct intel_crtc_state *pipe_config,
 	 * On ILK+ LUT must be loaded before the pipe is running but with
 	 * clocks enabled
 	 */
-	intel_color_load_luts(&pipe_config->base);
+	intel_color_load_luts(pipe_config);
+	intel_color_commit(pipe_config);
 
 	if (dev_priv->display.initial_watermarks != NULL)
 		dev_priv->display.initial_watermarks(old_intel_state, pipe_config);
@@ -5651,7 +5828,7 @@ static void ironlake_crtc_enable(struct intel_crtc_state *pipe_config,
 		ironlake_pch_enable(old_intel_state, pipe_config);
 
 	assert_vblank_disabled(crtc);
-	drm_crtc_vblank_on(crtc);
+	intel_crtc_vblank_on(pipe_config);
 
 	intel_encoders_enable(crtc, pipe_config, old_state);
 
@@ -5696,7 +5873,7 @@ static void icl_pipe_mbus_enable(struct intel_crtc *crtc)
 {
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 	enum pipe pipe = crtc->pipe;
-	uint32_t val;
+	u32 val;
 
 	val = MBUS_DBOX_A_CREDIT(2);
 	val |= MBUS_DBOX_BW_CREDIT(1);
@@ -5716,7 +5893,6 @@ static void haswell_crtc_enable(struct intel_crtc_state *pipe_config,
 	struct intel_atomic_state *old_intel_state =
 		to_intel_atomic_state(old_state);
 	bool psl_clkgate_wa;
-	u32 pipe_chicken;
 
 	if (WARN_ON(intel_crtc->active))
 		return;
@@ -5752,8 +5928,6 @@ static void haswell_crtc_enable(struct intel_crtc_state *pipe_config,
 
 	haswell_set_pipemisc(pipe_config);
 
-	intel_color_set_csc(&pipe_config->base);
-
 	intel_crtc->active = true;
 
 	/* Display WA #1180: WaDisableScalarClockGating: glk, cnl */
@@ -5771,18 +5945,11 @@ static void haswell_crtc_enable(struct intel_crtc_state *pipe_config,
 	 * On ILK+ LUT must be loaded before the pipe is running but with
 	 * clocks enabled
 	 */
-	intel_color_load_luts(&pipe_config->base);
+	intel_color_load_luts(pipe_config);
+	intel_color_commit(pipe_config);
 
-	/*
-	 * Display WA #1153: enable hardware to bypass the alpha math
-	 * and rounding for per-pixel values 00 and 0xff
-	 */
-	if (INTEL_GEN(dev_priv) >= 11) {
-		pipe_chicken = I915_READ(PIPE_CHICKEN(pipe));
-		if (!(pipe_chicken & PER_PIXEL_ALPHA_BYPASS_EN))
-			I915_WRITE_FW(PIPE_CHICKEN(pipe),
-				      pipe_chicken | PER_PIXEL_ALPHA_BYPASS_EN);
-	}
+	if (INTEL_GEN(dev_priv) >= 11)
+		icl_set_pipe_chicken(intel_crtc);
 
 	intel_ddi_set_pipe_settings(pipe_config);
 	if (!transcoder_is_dsi(cpu_transcoder))
@@ -5805,7 +5972,7 @@ static void haswell_crtc_enable(struct intel_crtc_state *pipe_config,
 		intel_ddi_set_vc_payload_alloc(pipe_config, true);
 
 	assert_vblank_disabled(crtc);
-	drm_crtc_vblank_on(crtc);
+	intel_crtc_vblank_on(pipe_config);
 
 	intel_encoders_enable(crtc, pipe_config, old_state);
 
@@ -6087,7 +6254,7 @@ static void modeset_put_power_domains(struct drm_i915_private *dev_priv,
 	enum intel_display_power_domain domain;
 
 	for_each_power_domain(domain, domains)
-		intel_display_power_put(dev_priv, domain);
+		intel_display_power_put_unchecked(dev_priv, domain);
 }
 
 static void valleyview_crtc_enable(struct intel_crtc_state *pipe_config,
@@ -6117,8 +6284,6 @@ static void valleyview_crtc_enable(struct intel_crtc_state *pipe_config,
 
 	i9xx_set_pipeconf(pipe_config);
 
-	intel_color_set_csc(&pipe_config->base);
-
 	intel_crtc->active = true;
 
 	intel_set_cpu_fifo_underrun_reporting(dev_priv, pipe, true);
@@ -6137,14 +6302,15 @@ static void valleyview_crtc_enable(struct intel_crtc_state *pipe_config,
 
 	i9xx_pfit_enable(pipe_config);
 
-	intel_color_load_luts(&pipe_config->base);
+	intel_color_load_luts(pipe_config);
+	intel_color_commit(pipe_config);
 
 	dev_priv->display.initial_watermarks(old_intel_state,
 					     pipe_config);
 	intel_enable_pipe(pipe_config);
 
 	assert_vblank_disabled(crtc);
-	drm_crtc_vblank_on(crtc);
+	intel_crtc_vblank_on(pipe_config);
 
 	intel_encoders_enable(crtc, pipe_config, old_state);
 }
@@ -6184,7 +6350,7 @@ static void i9xx_crtc_enable(struct intel_crtc_state *pipe_config,
 
 	intel_crtc->active = true;
 
-	if (!IS_GEN2(dev_priv))
+	if (!IS_GEN(dev_priv, 2))
 		intel_set_cpu_fifo_underrun_reporting(dev_priv, pipe, true);
 
 	intel_encoders_pre_enable(crtc, pipe_config, old_state);
@@ -6193,7 +6359,8 @@ static void i9xx_crtc_enable(struct intel_crtc_state *pipe_config,
 
 	i9xx_pfit_enable(pipe_config);
 
-	intel_color_load_luts(&pipe_config->base);
+	intel_color_load_luts(pipe_config);
+	intel_color_commit(pipe_config);
 
 	if (dev_priv->display.initial_watermarks != NULL)
 		dev_priv->display.initial_watermarks(old_intel_state,
@@ -6203,7 +6370,7 @@ static void i9xx_crtc_enable(struct intel_crtc_state *pipe_config,
 	intel_enable_pipe(pipe_config);
 
 	assert_vblank_disabled(crtc);
-	drm_crtc_vblank_on(crtc);
+	intel_crtc_vblank_on(pipe_config);
 
 	intel_encoders_enable(crtc, pipe_config, old_state);
 }
@@ -6236,7 +6403,7 @@ static void i9xx_crtc_disable(struct intel_crtc_state *old_crtc_state,
 	 * On gen2 planes are double buffered but the pipe isn't, so we must
 	 * wait for planes to fully turn off before disabling the pipe.
 	 */
-	if (IS_GEN2(dev_priv))
+	if (IS_GEN(dev_priv, 2))
 		intel_wait_for_vblank(dev_priv, pipe);
 
 	intel_encoders_disable(crtc, old_crtc_state, old_state);
@@ -6261,7 +6428,7 @@ static void i9xx_crtc_disable(struct intel_crtc_state *old_crtc_state,
 
 	intel_encoders_post_pll_disable(crtc, old_crtc_state, old_state);
 
-	if (!IS_GEN2(dev_priv))
+	if (!IS_GEN(dev_priv, 2))
 		intel_set_cpu_fifo_underrun_reporting(dev_priv, pipe, false);
 
 	if (!dev_priv->display.initial_watermarks)
@@ -6334,7 +6501,7 @@ static void intel_crtc_disable_noatomic(struct drm_crtc *crtc,
 
 	domains = intel_crtc->enabled_power_domains;
 	for_each_power_domain(domain, domains)
-		intel_display_power_put(dev_priv, domain);
+		intel_display_power_put_unchecked(dev_priv, domain);
 	intel_crtc->enabled_power_domains = 0;
 
 	dev_priv->active_crtcs &= ~(1 << intel_crtc->pipe);
@@ -6600,9 +6767,9 @@ static bool intel_crtc_supports_double_wide(const struct intel_crtc *crtc)
 		(crtc->pipe == PIPE_A || IS_I915G(dev_priv));
 }
 
-static uint32_t ilk_pipe_pixel_rate(const struct intel_crtc_state *pipe_config)
+static u32 ilk_pipe_pixel_rate(const struct intel_crtc_state *pipe_config)
 {
-	uint32_t pixel_rate;
+	u32 pixel_rate;
 
 	pixel_rate = pipe_config->base.adjusted_mode.crtc_clock;
 
@@ -6612,8 +6779,8 @@ static uint32_t ilk_pipe_pixel_rate(const struct intel_crtc_state *pipe_config)
 	 */
 
 	if (pipe_config->pch_pfit.enabled) {
-		uint64_t pipe_w, pipe_h, pfit_w, pfit_h;
-		uint32_t pfit_size = pipe_config->pch_pfit.size;
+		u64 pipe_w, pipe_h, pfit_w, pfit_h;
+		u32 pfit_size = pipe_config->pch_pfit.size;
 
 		pipe_w = pipe_config->pipe_src_w;
 		pipe_h = pipe_config->pipe_src_h;
@@ -6628,7 +6795,7 @@ static uint32_t ilk_pipe_pixel_rate(const struct intel_crtc_state *pipe_config)
 		if (WARN_ON(!pfit_w || !pfit_h))
 			return pixel_rate;
 
-		pixel_rate = div_u64((uint64_t) pixel_rate * pipe_w * pipe_h,
+		pixel_rate = div_u64((u64)pixel_rate * pipe_w * pipe_h,
 				     pfit_w * pfit_h);
 	}
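
For reference, ilk_pipe_pixel_rate() scales the crtc clock by the panel-fitter downscaling ratio using 64-bit intermediates, which is what the u64/u32 conversions above touch. A standalone version with a worked example (the rate and sizes are arbitrary):

#include <stdio.h>
#include <stdint.h>

/*
 * Scale the pipe pixel rate by the panel fitter downscaling factor.
 * Sizes are in pixels, the rate in kHz; the multiply is done in 64 bits
 * to avoid overflow, mirroring the code above.
 */
static uint32_t pfit_pixel_rate(uint32_t pixel_rate,
				uint64_t pipe_w, uint64_t pipe_h,
				uint64_t pfit_w, uint64_t pfit_h)
{
	if (!pfit_w || !pfit_h)
		return pixel_rate;

	return (uint64_t)pixel_rate * pipe_w * pipe_h / (pfit_w * pfit_h);
}

int main(void)
{
	/* 1920x1200 source squeezed into a 1920x1080 panel fitter window */
	printf("%u kHz\n", pfit_pixel_rate(193250, 1920, 1200, 1920, 1080));
	return 0;
}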
 
@@ -6639,7 +6806,7 @@ static void intel_crtc_compute_pixel_rate(struct intel_crtc_state *crtc_state)
 {
 	struct drm_i915_private *dev_priv = to_i915(crtc_state->base.crtc->dev);
 
-	if (HAS_GMCH_DISPLAY(dev_priv))
+	if (HAS_GMCH(dev_priv))
 		/* FIXME calculate proper pipe pixel rate for GMCH pfit */
 		crtc_state->pixel_rate =
 			crtc_state->base.adjusted_mode.crtc_clock;
@@ -6724,7 +6891,7 @@ static int intel_crtc_compute_config(struct intel_crtc *crtc,
 }
 
 static void
-intel_reduce_m_n_ratio(uint32_t *num, uint32_t *den)
+intel_reduce_m_n_ratio(u32 *num, u32 *den)
 {
 	while (*num > DATA_LINK_M_N_MASK ||
 	       *den > DATA_LINK_M_N_MASK) {
@@ -6734,7 +6901,7 @@ intel_reduce_m_n_ratio(uint32_t *num, uint32_t *den)
 }
 
 static void compute_m_n(unsigned int m, unsigned int n,
-			uint32_t *ret_m, uint32_t *ret_n,
+			u32 *ret_m, u32 *ret_n,
 			bool constant_n)
 {
 	/*
@@ -6749,7 +6916,7 @@ static void compute_m_n(unsigned int m, unsigned int n,
 	else
 		*ret_n = min_t(unsigned int, roundup_pow_of_two(n), DATA_LINK_N_MAX);
 
-	*ret_m = div_u64((uint64_t) m * *ret_n, n);
+	*ret_m = div_u64((u64)m * *ret_n, n);
 	intel_reduce_m_n_ratio(ret_m, ret_n);
 }
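
compute_m_n() chooses N (a fixed 0x8000 when constant_n is set, otherwise the next power of two capped at N_MAX), scales M by the same factor with rounding, and intel_reduce_m_n_ratio() halves both until they fit the 24-bit register fields. A standalone sketch with a worked example; the constants mirror the driver's, the helper bodies are simplified:

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define DATA_LINK_M_N_MASK	0xffffff
#define DATA_LINK_N_MAX		0x800000

static uint32_t roundup_pow_of_two(uint32_t n)
{
	uint32_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

static void reduce_m_n_ratio(uint32_t *num, uint32_t *den)
{
	while (*num > DATA_LINK_M_N_MASK || *den > DATA_LINK_M_N_MASK) {
		*num >>= 1;
		*den >>= 1;
	}
}

static void compute_m_n(unsigned int m, unsigned int n,
			uint32_t *ret_m, uint32_t *ret_n, bool constant_n)
{
	uint32_t pow2 = roundup_pow_of_two(n);

	if (constant_n)
		*ret_n = 0x8000;	/* fixed N, wanted by some DP sinks */
	else
		*ret_n = pow2 < DATA_LINK_N_MAX ? pow2 : DATA_LINK_N_MAX;

	/* scale m by the same factor n was scaled by, rounding to nearest */
	*ret_m = ((uint64_t)m * *ret_n + n / 2) / n;
	reduce_m_n_ratio(ret_m, ret_n);
}

int main(void)
{
	uint32_t m, n;

	/* e.g. link M/N for a 148500 kHz stream on a 270000 kHz link */
	compute_m_n(148500, 270000, &m, &n, true);
	printf("M=%u N=%u (ratio %.6f)\n", m, n, (double)m / n);
	return 0;
}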
 
@@ -6779,12 +6946,12 @@ static inline bool intel_panel_use_ssc(struct drm_i915_private *dev_priv)
 		&& !(dev_priv->quirks & QUIRK_LVDS_SSC_DISABLE);
 }
 
-static uint32_t pnv_dpll_compute_fp(struct dpll *dpll)
+static u32 pnv_dpll_compute_fp(struct dpll *dpll)
 {
 	return (1 << dpll->n) << 16 | dpll->m2;
 }
 
-static uint32_t i9xx_dpll_compute_fp(struct dpll *dpll)
+static u32 i9xx_dpll_compute_fp(struct dpll *dpll)
 {
 	return dpll->n << 16 | dpll->m1 << 8 | dpll->m2;
 }
@@ -6868,7 +7035,7 @@ static bool transcoder_has_m2_n2(struct drm_i915_private *dev_priv,
 	 * Strictly speaking some registers are available before
 	 * gen7, but we only support DRRS on gen7+
 	 */
-	return IS_GEN7(dev_priv) || IS_CHERRYVIEW(dev_priv);
+	return IS_GEN(dev_priv, 7) || IS_CHERRYVIEW(dev_priv);
 }
 
 static void intel_cpu_transcoder_set_m_n(const struct intel_crtc_state *crtc_state,
@@ -7340,7 +7507,7 @@ static void intel_set_pipe_timings(const struct intel_crtc_state *crtc_state)
 	enum pipe pipe = crtc->pipe;
 	enum transcoder cpu_transcoder = crtc_state->cpu_transcoder;
 	const struct drm_display_mode *adjusted_mode = &crtc_state->base.adjusted_mode;
-	uint32_t crtc_vtotal, crtc_vblank_end;
+	u32 crtc_vtotal, crtc_vblank_end;
 	int vsyncshift = 0;
 
 	/* We need to be careful not to change the adjusted mode, for otherwise
@@ -7415,7 +7582,7 @@ static void intel_get_pipe_timings(struct intel_crtc *crtc,
 	struct drm_device *dev = crtc->base.dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	enum transcoder cpu_transcoder = pipe_config->cpu_transcoder;
-	uint32_t tmp;
+	u32 tmp;
 
 	tmp = I915_READ(HTOTAL(cpu_transcoder));
 	pipe_config->base.adjusted_mode.crtc_hdisplay = (tmp & 0xffff) + 1;
@@ -7486,7 +7653,7 @@ static void i9xx_set_pipeconf(const struct intel_crtc_state *crtc_state)
 {
 	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
-	uint32_t pipeconf;
+	u32 pipeconf;
 
 	pipeconf = 0;
 
@@ -7731,7 +7898,7 @@ static void i9xx_get_pfit_config(struct intel_crtc *crtc,
 				 struct intel_crtc_state *pipe_config)
 {
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
-	uint32_t tmp;
+	u32 tmp;
 
 	if (INTEL_GEN(dev_priv) <= 3 &&
 	    (IS_I830(dev_priv) || !IS_MOBILE(dev_priv)))
@@ -7946,11 +8113,13 @@ static bool i9xx_get_pipe_config(struct intel_crtc *crtc,
 {
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 	enum intel_display_power_domain power_domain;
-	uint32_t tmp;
+	intel_wakeref_t wakeref;
+	u32 tmp;
 	bool ret;
 
 	power_domain = POWER_DOMAIN_PIPE(crtc->pipe);
-	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv, power_domain);
+	if (!wakeref)
 		return false;
 
 	pipe_config->output_format = INTEL_OUTPUT_FORMAT_RGB;
@@ -8051,7 +8220,7 @@ static bool i9xx_get_pipe_config(struct intel_crtc *crtc,
 	ret = true;
 
 out:
-	intel_display_power_put(dev_priv, power_domain);
+	intel_display_power_put(dev_priv, power_domain, wakeref);
 
 	return ret;
 }
@@ -8225,7 +8394,7 @@ static void ironlake_init_pch_refclk(struct drm_i915_private *dev_priv)
 
 static void lpt_reset_fdi_mphy(struct drm_i915_private *dev_priv)
 {
-	uint32_t tmp;
+	u32 tmp;
 
 	tmp = I915_READ(SOUTH_CHICKEN2);
 	tmp |= FDI_MPHY_IOSFSB_RESET_CTL;
@@ -8247,7 +8416,7 @@ static void lpt_reset_fdi_mphy(struct drm_i915_private *dev_priv)
 /* WaMPhyProgramming:hsw */
 static void lpt_program_fdi_mphy(struct drm_i915_private *dev_priv)
 {
-	uint32_t tmp;
+	u32 tmp;
 
 	tmp = intel_sbi_read(dev_priv, 0x8008, SBI_MPHY);
 	tmp &= ~(0xFF << 24);
@@ -8328,7 +8497,7 @@ static void lpt_program_fdi_mphy(struct drm_i915_private *dev_priv)
 static void lpt_enable_clkout_dp(struct drm_i915_private *dev_priv,
 				 bool with_spread, bool with_fdi)
 {
-	uint32_t reg, tmp;
+	u32 reg, tmp;
 
 	if (WARN(with_fdi && !with_spread, "FDI requires downspread\n"))
 		with_spread = true;
@@ -8367,7 +8536,7 @@ static void lpt_enable_clkout_dp(struct drm_i915_private *dev_priv,
 /* Sequence to disable CLKOUT_DP */
 static void lpt_disable_clkout_dp(struct drm_i915_private *dev_priv)
 {
-	uint32_t reg, tmp;
+	u32 reg, tmp;
 
 	mutex_lock(&dev_priv->sb_lock);
 
@@ -8392,7 +8561,7 @@ static void lpt_disable_clkout_dp(struct drm_i915_private *dev_priv)
 
 #define BEND_IDX(steps) ((50 + (steps)) / 5)
 
-static const uint16_t sscdivintphase[] = {
+static const u16 sscdivintphase[] = {
 	[BEND_IDX( 50)] = 0x3B23,
 	[BEND_IDX( 45)] = 0x3B23,
 	[BEND_IDX( 40)] = 0x3C23,
@@ -8424,7 +8593,7 @@ static const uint16_t sscdivintphase[] = {
  */
 static void lpt_bend_clkout_dp(struct drm_i915_private *dev_priv, int steps)
 {
-	uint32_t tmp;
+	u32 tmp;
 	int idx = BEND_IDX(steps);
 
 	if (WARN_ON(steps % 5 != 0))
@@ -8490,7 +8659,7 @@ static void ironlake_set_pipeconf(const struct intel_crtc_state *crtc_state)
 	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 	enum pipe pipe = crtc->pipe;
-	uint32_t val;
+	u32 val;
 
 	val = 0;
 
@@ -8837,7 +9006,7 @@ static void skylake_get_pfit_config(struct intel_crtc *crtc,
 	struct drm_device *dev = crtc->base.dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_crtc_scaler_state *scaler_state = &pipe_config->scaler_state;
-	uint32_t ps_ctrl = 0;
+	u32 ps_ctrl = 0;
 	int id = -1;
 	int i;
 
@@ -8849,6 +9018,7 @@ static void skylake_get_pfit_config(struct intel_crtc *crtc,
 			pipe_config->pch_pfit.enabled = true;
 			pipe_config->pch_pfit.pos = I915_READ(SKL_PS_WIN_POS(crtc->pipe, i));
 			pipe_config->pch_pfit.size = I915_READ(SKL_PS_WIN_SZ(crtc->pipe, i));
+			scaler_state->scalers[i].in_use = true;
 			break;
 		}
 	}
@@ -8993,7 +9163,7 @@ static void ironlake_get_pfit_config(struct intel_crtc *crtc,
 {
 	struct drm_device *dev = crtc->base.dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
-	uint32_t tmp;
+	u32 tmp;
 
 	tmp = I915_READ(PF_CTL(crtc->pipe));
 
@@ -9005,7 +9175,7 @@ static void ironlake_get_pfit_config(struct intel_crtc *crtc,
 		/* We currently do not free assignments of panel fitters on
 		 * ivb/hsw (since we don't use the higher upscaling modes which
 		 * differentiate them) so just WARN about this case for now. */
-		if (IS_GEN7(dev_priv)) {
+		if (IS_GEN(dev_priv, 7)) {
 			WARN_ON((tmp & PF_PIPE_SEL_MASK_IVB) !=
 				PF_PIPE_SEL_IVB(crtc->pipe));
 		}
@@ -9018,11 +9188,13 @@ static bool ironlake_get_pipe_config(struct intel_crtc *crtc,
 	struct drm_device *dev = crtc->base.dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	enum intel_display_power_domain power_domain;
-	uint32_t tmp;
+	intel_wakeref_t wakeref;
+	u32 tmp;
 	bool ret;
 
 	power_domain = POWER_DOMAIN_PIPE(crtc->pipe);
-	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv, power_domain);
+	if (!wakeref)
 		return false;
 
 	pipe_config->output_format = INTEL_OUTPUT_FORMAT_RGB;
@@ -9105,7 +9277,7 @@ static bool ironlake_get_pipe_config(struct intel_crtc *crtc,
 	ret = true;
 
 out:
-	intel_display_power_put(dev_priv, power_domain);
+	intel_display_power_put(dev_priv, power_domain, wakeref);
 
 	return ret;
 }
@@ -9145,7 +9317,7 @@ static void assert_can_disable_lcpll(struct drm_i915_private *dev_priv)
 	I915_STATE_WARN(intel_irqs_enabled(dev_priv), "IRQs enabled\n");
 }
 
-static uint32_t hsw_read_dcomp(struct drm_i915_private *dev_priv)
+static u32 hsw_read_dcomp(struct drm_i915_private *dev_priv)
 {
 	if (IS_HASWELL(dev_priv))
 		return I915_READ(D_COMP_HSW);
@@ -9153,7 +9325,7 @@ static uint32_t hsw_read_dcomp(struct drm_i915_private *dev_priv)
 		return I915_READ(D_COMP_BDW);
 }
 
-static void hsw_write_dcomp(struct drm_i915_private *dev_priv, uint32_t val)
+static void hsw_write_dcomp(struct drm_i915_private *dev_priv, u32 val)
 {
 	if (IS_HASWELL(dev_priv)) {
 		mutex_lock(&dev_priv->pcu_lock);
@@ -9178,7 +9350,7 @@ static void hsw_write_dcomp(struct drm_i915_private *dev_priv, uint32_t val)
 static void hsw_disable_lcpll(struct drm_i915_private *dev_priv,
 			      bool switch_to_fclk, bool allow_power_down)
 {
-	uint32_t val;
+	u32 val;
 
 	assert_can_disable_lcpll(dev_priv);
 
@@ -9225,7 +9397,7 @@ static void hsw_disable_lcpll(struct drm_i915_private *dev_priv,
  */
 static void hsw_restore_lcpll(struct drm_i915_private *dev_priv)
 {
-	uint32_t val;
+	u32 val;
 
 	val = I915_READ(LCPLL_CTL);
 
@@ -9300,7 +9472,7 @@ static void hsw_restore_lcpll(struct drm_i915_private *dev_priv)
  */
 void hsw_enable_pc8(struct drm_i915_private *dev_priv)
 {
-	uint32_t val;
+	u32 val;
 
 	DRM_DEBUG_KMS("Enabling package C8+\n");
 
@@ -9316,7 +9488,7 @@ void hsw_enable_pc8(struct drm_i915_private *dev_priv)
 
 void hsw_disable_pc8(struct drm_i915_private *dev_priv)
 {
-	uint32_t val;
+	u32 val;
 
 	DRM_DEBUG_KMS("Disabling package C8+\n");
 
@@ -9384,7 +9556,7 @@ static void icelake_get_ddi_pll(struct drm_i915_private *dev_priv,
 		if (WARN_ON(!intel_dpll_is_combophy(id)))
 			return;
 	} else if (intel_port_is_tc(dev_priv, port)) {
-		id = icl_port_to_mg_pll_id(port);
+		id = icl_tc_port_to_pll_id(intel_port_to_tc(dev_priv, port));
 	} else {
 		WARN(1, "Invalid port %x\n", port);
 		return;
@@ -9438,7 +9610,7 @@ static void haswell_get_ddi_pll(struct drm_i915_private *dev_priv,
 				struct intel_crtc_state *pipe_config)
 {
 	enum intel_dpll_id id;
-	uint32_t ddi_pll_sel = I915_READ(PORT_CLK_SEL(port));
+	u32 ddi_pll_sel = I915_READ(PORT_CLK_SEL(port));
 
 	switch (ddi_pll_sel) {
 	case PORT_CLK_SEL_WRPLL1:
@@ -9495,7 +9667,9 @@ static bool hsw_get_transcoder_state(struct intel_crtc *crtc,
 	 * XXX: Do intel_display_power_get_if_enabled before reading this (for
 	 * consistency and less surprising code; it's in always on power).
 	 */
-	for_each_set_bit(panel_transcoder, &panel_transcoder_mask, 32) {
+	for_each_set_bit(panel_transcoder,
+			 &panel_transcoder_mask,
+			 ARRAY_SIZE(INTEL_INFO(dev_priv)->trans_offsets)) {
 		enum pipe trans_pipe;
 
 		tmp = I915_READ(TRANS_DDI_FUNC_CTL(panel_transcoder));
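
Bounding the walk by ARRAY_SIZE(trans_offsets) instead of a hard-coded 32 keeps stray mask bits from indexing past the per-transcoder tables. A tiny standalone equivalent of the bounded bit walk:

#include <stdio.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

int main(void)
{
	/* stand-in for INTEL_INFO(dev_priv)->trans_offsets */
	const int trans_offsets[6] = {
		0x100, 0x200, 0x300, 0x400, 0x500, 0x600
	};
	unsigned long mask = (1ul << 1) | (1ul << 5) | (1ul << 9);
	unsigned int bit;

	/* bit 9 is silently skipped: there is no trans_offsets[9] */
	for (bit = 0; bit < ARRAY_SIZE(trans_offsets); bit++)
		if (mask & (1ul << bit))
			printf("transcoder %u at offset %#x\n",
			       bit, trans_offsets[bit]);
	return 0;
}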
@@ -9541,6 +9715,8 @@ static bool hsw_get_transcoder_state(struct intel_crtc *crtc,
 	power_domain = POWER_DOMAIN_TRANSCODER(pipe_config->cpu_transcoder);
 	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
 		return false;
+
+	WARN_ON(*power_domain_mask & BIT_ULL(power_domain));
 	*power_domain_mask |= BIT_ULL(power_domain);
 
 	tmp = I915_READ(PIPECONF(pipe_config->cpu_transcoder));
@@ -9568,6 +9744,8 @@ static bool bxt_get_dsi_transcoder_state(struct intel_crtc *crtc,
 		power_domain = POWER_DOMAIN_TRANSCODER(cpu_transcoder);
 		if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
 			continue;
+
+		WARN_ON(*power_domain_mask & BIT_ULL(power_domain));
 		*power_domain_mask |= BIT_ULL(power_domain);
 
 		/*
@@ -9602,7 +9780,7 @@ static void haswell_get_ddi_port_state(struct intel_crtc *crtc,
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 	struct intel_shared_dpll *pll;
 	enum port port;
-	uint32_t tmp;
+	u32 tmp;
 
 	tmp = I915_READ(TRANS_DDI_FUNC_CTL(pipe_config->cpu_transcoder));
 
@@ -9684,7 +9862,9 @@ static bool haswell_get_pipe_config(struct intel_crtc *crtc,
 
 	power_domain = POWER_DOMAIN_PIPE_PANEL_FITTER(crtc->pipe);
 	if (intel_display_power_get_if_enabled(dev_priv, power_domain)) {
+		WARN_ON(power_domain_mask & BIT_ULL(power_domain));
 		power_domain_mask |= BIT_ULL(power_domain);
+
 		if (INTEL_GEN(dev_priv) >= 9)
 			skylake_get_pfit_config(crtc, pipe_config);
 		else
@@ -9714,7 +9894,7 @@ static bool haswell_get_pipe_config(struct intel_crtc *crtc,
 
 out:
 	for_each_power_domain(power_domain, power_domain_mask)
-		intel_display_power_put(dev_priv, power_domain);
+		intel_display_power_put_unchecked(dev_priv, power_domain);
 
 	return active;
 }
@@ -9735,7 +9915,7 @@ static u32 intel_cursor_base(const struct intel_plane_state *plane_state)
 	base += plane_state->color_plane[0].offset;
 
 	/* ILK+ do this automagically */
-	if (HAS_GMCH_DISPLAY(dev_priv) &&
+	if (HAS_GMCH(dev_priv) &&
 	    plane_state->base.rotation & DRM_MODE_ROTATE_180)
 		base += (plane_state->base.crtc_h *
 			 plane_state->base.crtc_w - 1) * fb->format->cpp[0];
@@ -9848,11 +10028,15 @@ i845_cursor_max_stride(struct intel_plane *plane,
 	return 2048;
 }
 
+static u32 i845_cursor_ctl_crtc(const struct intel_crtc_state *crtc_state)
+{
+	return CURSOR_GAMMA_ENABLE;
+}
+
 static u32 i845_cursor_ctl(const struct intel_crtc_state *crtc_state,
 			   const struct intel_plane_state *plane_state)
 {
 	return CURSOR_ENABLE |
-		CURSOR_GAMMA_ENABLE |
 		CURSOR_FORMAT_ARGB |
 		CURSOR_STRIDE(plane_state->color_plane[0].stride);
 }
@@ -9922,7 +10106,9 @@ static void i845_update_cursor(struct intel_plane *plane,
 		unsigned int width = plane_state->base.crtc_w;
 		unsigned int height = plane_state->base.crtc_h;
 
-		cntl = plane_state->ctl;
+		cntl = plane_state->ctl |
+			i845_cursor_ctl_crtc(crtc_state);
+
 		size = (height << 12) | width;
 
 		base = intel_cursor_base(plane_state);
@@ -9964,17 +10150,19 @@ static bool i845_cursor_get_hw_state(struct intel_plane *plane,
 {
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
 	enum intel_display_power_domain power_domain;
+	intel_wakeref_t wakeref;
 	bool ret;
 
 	power_domain = POWER_DOMAIN_PIPE(PIPE_A);
-	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv, power_domain);
+	if (!wakeref)
 		return false;
 
 	ret = I915_READ(CURCNTR(PIPE_A)) & CURSOR_ENABLE;
 
 	*pipe = PIPE_A;
 
-	intel_display_power_put(dev_priv, power_domain);
+	intel_display_power_put(dev_priv, power_domain, wakeref);
 
 	return ret;
 }
@@ -9987,27 +10175,36 @@ i9xx_cursor_max_stride(struct intel_plane *plane,
 	return plane->base.dev->mode_config.cursor_width * 4;
 }
 
-static u32 i9xx_cursor_ctl(const struct intel_crtc_state *crtc_state,
-			   const struct intel_plane_state *plane_state)
+static u32 i9xx_cursor_ctl_crtc(const struct intel_crtc_state *crtc_state)
 {
-	struct drm_i915_private *dev_priv =
-		to_i915(plane_state->base.plane->dev);
 	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 	u32 cntl = 0;
 
-	if (IS_GEN6(dev_priv) || IS_IVYBRIDGE(dev_priv))
-		cntl |= MCURSOR_TRICKLE_FEED_DISABLE;
+	if (INTEL_GEN(dev_priv) >= 11)
+		return cntl;
 
-	if (INTEL_GEN(dev_priv) <= 10) {
-		cntl |= MCURSOR_GAMMA_ENABLE;
+	cntl |= MCURSOR_GAMMA_ENABLE;
 
-		if (HAS_DDI(dev_priv))
-			cntl |= MCURSOR_PIPE_CSC_ENABLE;
-	}
+	if (HAS_DDI(dev_priv))
+		cntl |= MCURSOR_PIPE_CSC_ENABLE;
 
 	if (INTEL_GEN(dev_priv) < 5 && !IS_G4X(dev_priv))
 		cntl |= MCURSOR_PIPE_SELECT(crtc->pipe);
 
+	return cntl;
+}
+
+static u32 i9xx_cursor_ctl(const struct intel_crtc_state *crtc_state,
+			   const struct intel_plane_state *plane_state)
+{
+	struct drm_i915_private *dev_priv =
+		to_i915(plane_state->base.plane->dev);
+	u32 cntl = 0;
+
+	if (IS_GEN(dev_priv, 6) || IS_IVYBRIDGE(dev_priv))
+		cntl |= MCURSOR_TRICKLE_FEED_DISABLE;
+
 	switch (plane_state->base.crtc_w) {
 	case 64:
 		cntl |= MCURSOR_MODE_64_ARGB_AX;
@@ -10132,7 +10329,8 @@ static void i9xx_update_cursor(struct intel_plane *plane,
 	unsigned long irqflags;
 
 	if (plane_state && plane_state->base.visible) {
-		cntl = plane_state->ctl;
+		cntl = plane_state->ctl |
+			i9xx_cursor_ctl_crtc(crtc_state);
 
 		if (plane_state->base.crtc_h != plane_state->base.crtc_w)
 			fbc_ctl = CUR_FBC_CTL_EN | (plane_state->base.crtc_h - 1);
@@ -10197,6 +10395,7 @@ static bool i9xx_cursor_get_hw_state(struct intel_plane *plane,
 {
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
 	enum intel_display_power_domain power_domain;
+	intel_wakeref_t wakeref;
 	bool ret;
 	u32 val;
 
@@ -10206,7 +10405,8 @@ static bool i9xx_cursor_get_hw_state(struct intel_plane *plane,
 	 * display power wells.
 	 */
 	power_domain = POWER_DOMAIN_PIPE(plane->pipe);
-	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv, power_domain);
+	if (!wakeref)
 		return false;
 
 	val = I915_READ(CURCNTR(plane->pipe));
@@ -10219,7 +10419,7 @@ static bool i9xx_cursor_get_hw_state(struct intel_plane *plane,
 		*pipe = (val & MCURSOR_PIPE_SELECT_MASK) >>
 			MCURSOR_PIPE_SELECT_SHIFT;
 
-	intel_display_power_put(dev_priv, power_domain);
+	intel_display_power_put(dev_priv, power_domain, wakeref);
 
 	return ret;
 }
@@ -10468,7 +10668,7 @@ static int i9xx_pll_refclk(struct drm_device *dev,
 		return dev_priv->vbt.lvds_ssc_freq;
 	else if (HAS_PCH_SPLIT(dev_priv))
 		return 120000;
-	else if (!IS_GEN2(dev_priv))
+	else if (!IS_GEN(dev_priv, 2))
 		return 96000;
 	else
 		return 48000;
@@ -10501,7 +10701,7 @@ static void i9xx_crtc_clock_get(struct intel_crtc *crtc,
 		clock.m2 = (fp & FP_M2_DIV_MASK) >> FP_M2_DIV_SHIFT;
 	}
 
-	if (!IS_GEN2(dev_priv)) {
+	if (!IS_GEN(dev_priv, 2)) {
 		if (IS_PINEVIEW(dev_priv))
 			clock.p1 = ffs((dpll & DPLL_FPA01_P1_POST_DIV_MASK_PINEVIEW) >>
 				DPLL_FPA01_P1_POST_DIV_SHIFT_PINEVIEW);
@@ -10653,20 +10853,17 @@ static void intel_crtc_destroy(struct drm_crtc *crtc)
 
 /**
  * intel_wm_need_update - Check whether watermarks need updating
- * @plane: drm plane
- * @state: new plane state
+ * @cur: current plane state
+ * @new: new plane state
  *
  * Check current plane state versus the new one to determine whether
  * watermarks need to be recalculated.
  *
  * Returns true or false.
  */
-static bool intel_wm_need_update(struct drm_plane *plane,
-				 struct drm_plane_state *state)
+static bool intel_wm_need_update(struct intel_plane_state *cur,
+				 struct intel_plane_state *new)
 {
-	struct intel_plane_state *new = to_intel_plane_state(state);
-	struct intel_plane_state *cur = to_intel_plane_state(plane->state);
-
 	/* Update watermarks on tiling or size changes. */
 	if (new->base.visible != cur->base.visible)
 		return true;
@@ -10775,7 +10972,8 @@ int intel_plane_atomic_calc_changes(const struct intel_crtc_state *old_crtc_stat
 		/* must disable cxsr around plane enable/disable */
 		if (plane->id != PLANE_CURSOR)
 			pipe_config->disable_cxsr = true;
-	} else if (intel_wm_need_update(&plane->base, plane_state)) {
+	} else if (intel_wm_need_update(to_intel_plane_state(plane->base.state),
+					to_intel_plane_state(plane_state))) {
 		if (INTEL_GEN(dev_priv) < 5 && !IS_G4X(dev_priv)) {
 			/* FIXME bollocks */
 			pipe_config->update_wm_pre = true;
@@ -10815,9 +11013,12 @@ int intel_plane_atomic_calc_changes(const struct intel_crtc_state *old_crtc_stat
 	 * Despite the w/a only being listed for IVB we assume that
 	 * the ILK/SNB note has similar ramifications, hence we apply
 	 * the w/a on all three platforms.
+	 *
+	 * Experimental results suggest this is needed also for the primary
+	 * plane, not only the sprite plane.
 	 */
-	if (plane->id == PLANE_SPRITE0 &&
-	    (IS_GEN5(dev_priv) || IS_GEN6(dev_priv) ||
+	if (plane->id != PLANE_CURSOR &&
+	    (IS_GEN_RANGE(dev_priv, 5, 6) ||
 	     IS_IVYBRIDGE(dev_priv)) &&
 	    (turn_on || (!needs_scaling(old_plane_state) &&
 			 needs_scaling(to_intel_plane_state(plane_state)))))
@@ -10954,15 +11155,15 @@ static int icl_check_nv12_planes(struct intel_crtc_state *crtc_state)
 static int intel_crtc_atomic_check(struct drm_crtc *crtc,
 				   struct drm_crtc_state *crtc_state)
 {
-	struct drm_device *dev = crtc->dev;
-	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct drm_i915_private *dev_priv = to_i915(crtc->dev);
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	struct intel_crtc_state *pipe_config =
 		to_intel_crtc_state(crtc_state);
 	int ret;
 	bool mode_changed = needs_modeset(crtc_state);
 
-	if (mode_changed && !crtc_state->active)
+	if (INTEL_GEN(dev_priv) < 5 && !IS_G4X(dev_priv) &&
+	    mode_changed && !crtc_state->active)
 		pipe_config->update_wm_post = true;
 
 	if (mode_changed && crtc_state->enable &&
@@ -10974,8 +11175,8 @@ static int intel_crtc_atomic_check(struct drm_crtc *crtc,
 			return ret;
 	}
 
-	if (crtc_state->color_mgmt_changed) {
-		ret = intel_color_check(crtc, crtc_state);
+	if (mode_changed || crtc_state->color_mgmt_changed) {
+		ret = intel_color_check(pipe_config);
 		if (ret)
 			return ret;
 
@@ -11004,9 +11205,7 @@ static int intel_crtc_atomic_check(struct drm_crtc *crtc,
 		 * old state and the new state.  We can program these
 		 * immediately.
 		 */
-		ret = dev_priv->display.compute_intermediate_wm(dev,
-								intel_crtc,
-								pipe_config);
+		ret = dev_priv->display.compute_intermediate_wm(pipe_config);
 		if (ret) {
 			DRM_DEBUG_KMS("No valid intermediate pipe watermarks are possible\n");
 			return ret;
@@ -11014,7 +11213,7 @@ static int intel_crtc_atomic_check(struct drm_crtc *crtc,
 	}
 
 	if (INTEL_GEN(dev_priv) >= 9) {
-		if (mode_changed)
+		if (mode_changed || pipe_config->update_pipe)
 			ret = skl_update_scaler_crtc(pipe_config);
 
 		if (!ret)
@@ -11275,7 +11474,7 @@ static void intel_dump_pipe_config(struct intel_crtc *crtc,
 			      pipe_config->scaler_state.scaler_users,
 		              pipe_config->scaler_state.scaler_id);
 
-	if (HAS_GMCH_DISPLAY(dev_priv))
+	if (HAS_GMCH(dev_priv))
 		DRM_DEBUG_KMS("gmch pfit: control: 0x%08x, ratios: 0x%08x, lvds border: 0x%08x\n",
 			      pipe_config->gmch_pfit.control,
 			      pipe_config->gmch_pfit.pgm_ratios,
@@ -11387,44 +11586,38 @@ static bool check_digital_port_conflicts(struct drm_atomic_state *state)
 	return ret;
 }
 
-static void
+static int
 clear_intel_crtc_state(struct intel_crtc_state *crtc_state)
 {
 	struct drm_i915_private *dev_priv =
 		to_i915(crtc_state->base.crtc->dev);
-	struct intel_crtc_scaler_state scaler_state;
-	struct intel_dpll_hw_state dpll_hw_state;
-	struct intel_shared_dpll *shared_dpll;
-	struct intel_crtc_wm_state wm_state;
-	bool force_thru, ips_force_disable;
+	struct intel_crtc_state *saved_state;
+
+	saved_state = kzalloc(sizeof(*saved_state), GFP_KERNEL);
+	if (!saved_state)
+		return -ENOMEM;
 
 	/* FIXME: before the switch to atomic started, a new pipe_config was
 	 * kzalloc'd. Code that depends on any field being zero should be
 	 * fixed, so that the crtc_state can be safely duplicated. For now,
 	 * only fields that are known to not cause problems are preserved. */
 
-	scaler_state = crtc_state->scaler_state;
-	shared_dpll = crtc_state->shared_dpll;
-	dpll_hw_state = crtc_state->dpll_hw_state;
-	force_thru = crtc_state->pch_pfit.force_thru;
-	ips_force_disable = crtc_state->ips_force_disable;
+	saved_state->scaler_state = crtc_state->scaler_state;
+	saved_state->shared_dpll = crtc_state->shared_dpll;
+	saved_state->dpll_hw_state = crtc_state->dpll_hw_state;
+	saved_state->pch_pfit.force_thru = crtc_state->pch_pfit.force_thru;
+	saved_state->ips_force_disable = crtc_state->ips_force_disable;
 	if (IS_G4X(dev_priv) ||
 	    IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
-		wm_state = crtc_state->wm;
+		saved_state->wm = crtc_state->wm;
 
 	/* Keep base drm_crtc_state intact, only clear our extended struct */
 	BUILD_BUG_ON(offsetof(struct intel_crtc_state, base));
-	memset(&crtc_state->base + 1, 0,
+	memcpy(&crtc_state->base + 1, &saved_state->base + 1,
 	       sizeof(*crtc_state) - sizeof(crtc_state->base));
 
-	crtc_state->scaler_state = scaler_state;
-	crtc_state->shared_dpll = shared_dpll;
-	crtc_state->dpll_hw_state = dpll_hw_state;
-	crtc_state->pch_pfit.force_thru = force_thru;
-	crtc_state->ips_force_disable = ips_force_disable;
-	if (IS_G4X(dev_priv) ||
-	    IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
-		crtc_state->wm = wm_state;
+	kfree(saved_state);
+	return 0;
 }
 
 static int
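
clear_intel_crtc_state() now stages the fields that must survive in a zeroed heap scratch copy and memcpy's that back over everything past the base struct, replacing the old save-to-locals/memset/restore dance, and gains an -ENOMEM path. A small standalone illustration of the pattern with a simplified state struct:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct base { int id; };

struct state {
	struct base base;	/* must stay intact */
	int keep_me;		/* preserved across the clear */
	int scratch_a;		/* reset to zero */
	int scratch_b;		/* reset to zero */
};

static int clear_state(struct state *s)
{
	struct state *saved = calloc(1, sizeof(*saved));

	if (!saved)
		return -1;

	/* stash only the fields that must survive */
	saved->keep_me = s->keep_me;

	/* overwrite everything past base with the (mostly zero) scratch */
	memcpy(&s->base + 1, &saved->base + 1,
	       sizeof(*s) - sizeof(s->base));

	free(saved);
	return 0;
}

int main(void)
{
	struct state s = { .base = { 1 }, .keep_me = 42,
			   .scratch_a = 7, .scratch_b = 9 };

	clear_state(&s);
	printf("id=%d keep=%d a=%d b=%d\n",
	       s.base.id, s.keep_me, s.scratch_a, s.scratch_b);
	return 0;
}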
@@ -11439,7 +11632,9 @@ intel_modeset_pipe_config(struct drm_crtc *crtc,
 	int i;
 	bool retry = true;
 
-	clear_intel_crtc_state(pipe_config);
+	ret = clear_intel_crtc_state(pipe_config);
+	if (ret)
+		return ret;
 
 	pipe_config->cpu_transcoder =
 		(enum transcoder) to_intel_crtc(crtc)->pipe;
@@ -11517,10 +11712,13 @@ encoder_retry:
 			continue;
 
 		encoder = to_intel_encoder(connector_state->best_encoder);
-
-		if (!(encoder->compute_config(encoder, pipe_config, connector_state))) {
-			DRM_DEBUG_KMS("Encoder config failure\n");
-			return -EINVAL;
+		ret = encoder->compute_config(encoder, pipe_config,
+					      connector_state);
+		if (ret < 0) {
+			if (ret != -EDEADLK)
+				DRM_DEBUG_KMS("Encoder config failure: %d\n",
+					      ret);
+			return ret;
 		}
 	}
 
@@ -11645,6 +11843,23 @@ pipe_config_err(bool adjust, const char *name, const char *format, ...)
 	va_end(args);
 }
 
+static bool fastboot_enabled(struct drm_i915_private *dev_priv)
+{
+	if (i915_modparams.fastboot != -1)
+		return i915_modparams.fastboot;
+
+	/* Enable fastboot by default on Skylake and newer */
+	if (INTEL_GEN(dev_priv) >= 9)
+		return true;
+
+	/* Enable fastboot by default on VLV and CHV */
+	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
+		return true;
+
+	/* Disabled by default on all others */
+	return false;
+}
+
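
fastboot thus becomes a tri-state module parameter: -1 picks the per-platform default resolved above, while 0 and 1 force it off or on. A sketch of the same tri-state resolution, assuming only that the parameter defaults to -1:

#include <stdbool.h>

/* Tri-state knob: -1 = per-platform default, 0 = force off, 1 = force on. */
static int param_fastboot = -1;

static bool resolve_fastboot(bool platform_default)
{
	if (param_fastboot != -1)	/* an explicit user setting wins */
		return param_fastboot != 0;

	return platform_default;	/* e.g. true on SKL+ and VLV/CHV */
}
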
 static bool
 intel_pipe_config_compare(struct drm_i915_private *dev_priv,
 			  struct intel_crtc_state *current_config,
@@ -11656,6 +11871,11 @@ intel_pipe_config_compare(struct drm_i915_private *dev_priv,
 		(current_config->base.mode.private_flags & I915_MODE_FLAG_INHERITED) &&
 		!(pipe_config->base.mode.private_flags & I915_MODE_FLAG_INHERITED);
 
+	if (fixup_inherited && !fastboot_enabled(dev_priv)) {
+		DRM_DEBUG_KMS("initial modeset and fastboot not set\n");
+		ret = false;
+	}
+
 #define PIPE_CONF_CHECK_X(name) do { \
 	if (current_config->name != pipe_config->name) { \
 		pipe_config_err(adjust, __stringify(name), \
@@ -11964,7 +12184,7 @@ static void verify_wm_state(struct drm_crtc *crtc,
 	if (INTEL_GEN(dev_priv) < 9 || !new_state->active)
 		return;
 
-	skl_pipe_wm_get_hw_state(crtc, &hw_wm);
+	skl_pipe_wm_get_hw_state(intel_crtc, &hw_wm);
 	sw_wm = &to_intel_crtc_state(new_state)->wm.skl.optimal;
 
 	skl_pipe_ddb_get_hw_state(intel_crtc, hw_ddb_y, hw_ddb_uv);
@@ -12378,7 +12598,7 @@ static void update_scanline_offset(const struct intel_crtc_state *crtc_state)
 	 * However, if queried just before the start of vblank, we'll get an
 	 * answer that's slightly in the future.
 	 */
-	if (IS_GEN2(dev_priv)) {
+	if (IS_GEN(dev_priv, 2)) {
 		const struct drm_display_mode *adjusted_mode = &crtc_state->base.adjusted_mode;
 		int vtotal;
 
@@ -12619,9 +12839,9 @@ static int intel_modeset_checks(struct drm_atomic_state *state)
  * phase.  The code here should be run after the per-crtc and per-plane 'check'
  * handlers to ensure that all derived state has been updated.
  */
-static int calc_watermark_data(struct drm_atomic_state *state)
+static int calc_watermark_data(struct intel_atomic_state *state)
 {
-	struct drm_device *dev = state->dev;
+	struct drm_device *dev = state->base.dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 
 	/* Is there platform-specific watermark information to calculate? */
@@ -12679,8 +12899,7 @@ static int intel_atomic_check(struct drm_device *dev,
 			return ret;
 		}
 
-		if (i915_modparams.fastboot &&
-		    intel_pipe_config_compare(dev_priv,
+		if (intel_pipe_config_compare(dev_priv,
 					to_intel_crtc_state(old_crtc_state),
 					pipe_config, true)) {
 			crtc_state->mode_changed = false;
@@ -12695,6 +12914,10 @@ static int intel_atomic_check(struct drm_device *dev,
 				       "[modeset]" : "[fastset]");
 	}
 
+	ret = drm_dp_mst_atomic_check(state);
+	if (ret)
+		return ret;
+
 	if (any_ms) {
 		ret = intel_modeset_checks(state);
 
@@ -12713,7 +12936,7 @@ static int intel_atomic_check(struct drm_device *dev,
 		return ret;
 
 	intel_fbc_choose_crtc(dev_priv, intel_state);
-	return calc_watermark_data(state);
+	return calc_watermark_data(intel_state);
 }
 
 static int intel_atomic_prepare_commit(struct drm_device *dev,
@@ -12725,8 +12948,9 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
 u32 intel_crtc_get_vblank_counter(struct intel_crtc *crtc)
 {
 	struct drm_device *dev = crtc->base.dev;
+	struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(&crtc->base)];
 
-	if (!dev->max_vblank_count)
+	if (!vblank->max_vblank_count)
 		return (u32)drm_crtc_accurate_vblank_count(&crtc->base);
 
 	return dev->driver->get_vblank_counter(dev, crtc->pipe);
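
The hardware frame-counter width now lives in the per-CRTC drm_vblank_crtc rather than a single device-wide dev->max_vblank_count, so pipes without a usable counter can individually fall back to the software-interpolated count. A minimal sketch of that fallback, with hypothetical names:

#include <stdint.h>

struct vblank_crtc {
	uint32_t max_vblank_count;	/* 0: no usable hardware counter */
	uint32_t hw_count;
	uint32_t sw_count;		/* software-interpolated estimate */
};

/* Prefer the hardware frame counter; fall back per pipe when absent. */
static uint32_t crtc_vblank_counter(const struct vblank_crtc *vblank)
{
	if (!vblank->max_vblank_count)
		return vblank->sw_count;

	return vblank->hw_count;
}
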
@@ -12755,9 +12979,14 @@ static void intel_update_crtc(struct drm_crtc *crtc,
 	} else {
 		intel_pre_plane_update(to_intel_crtc_state(old_crtc_state),
 				       pipe_config);
+
+		if (pipe_config->update_pipe)
+			intel_encoders_update_pipe(crtc, pipe_config, state);
 	}
 
-	if (new_plane_state)
+	if (pipe_config->update_pipe && !pipe_config->enable_fbc)
+		intel_fbc_disable(intel_crtc);
+	else if (new_plane_state)
 		intel_fbc_enable(intel_crtc, pipe_config, new_plane_state);
 
 	intel_begin_crtc_commit(crtc, old_crtc_state);
@@ -12930,6 +13159,7 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 	struct drm_crtc *crtc;
 	struct intel_crtc *intel_crtc;
 	u64 put_domains[I915_MAX_PIPES] = {};
+	intel_wakeref_t wakeref = 0;
 	int i;
 
 	intel_atomic_commit_fence_wait(intel_state);
@@ -12937,7 +13167,7 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 	drm_atomic_helper_wait_for_dependencies(state);
 
 	if (intel_state->modeset)
-		intel_display_power_get(dev_priv, POWER_DOMAIN_MODESET);
+		wakeref = intel_display_power_get(dev_priv, POWER_DOMAIN_MODESET);
 
 	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) {
 		old_intel_crtc_state = to_intel_crtc_state(old_crtc_state);
@@ -12980,7 +13210,7 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 
 			/* FIXME unify this for all platforms */
 			if (!new_crtc_state->active &&
-			    !HAS_GMCH_DISPLAY(dev_priv) &&
+			    !HAS_GMCH(dev_priv) &&
 			    dev_priv->display.initial_watermarks)
 				dev_priv->display.initial_watermarks(intel_state,
 								     new_intel_crtc_state);
@@ -13034,6 +13264,16 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 	 */
 	drm_atomic_helper_wait_for_flip_done(dev, state);
 
+	for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) {
+		new_intel_crtc_state = to_intel_crtc_state(new_crtc_state);
+
+		if (new_crtc_state->active &&
+		    !needs_modeset(new_crtc_state) &&
+		    (new_intel_crtc_state->base.color_mgmt_changed ||
+		     new_intel_crtc_state->update_pipe))
+			intel_color_load_luts(new_intel_crtc_state);
+	}
+
 	/*
 	 * Now that the vblank has passed, we can go ahead and program the
 	 * optimal watermarks on platforms that need two-step watermark
@@ -13074,7 +13314,7 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 		 * the culprit.
 		 */
 		intel_uncore_arm_unclaimed_mmio_detection(dev_priv);
-		intel_display_power_put(dev_priv, POWER_DOMAIN_MODESET);
+		intel_display_power_put(dev_priv, POWER_DOMAIN_MODESET, wakeref);
 	}
 
 	/*
@@ -13549,19 +13789,16 @@ static void intel_begin_crtc_commit(struct drm_crtc *crtc,
 		intel_atomic_get_new_crtc_state(old_intel_state, intel_crtc);
 	bool modeset = needs_modeset(&intel_cstate->base);
 
-	if (!modeset &&
-	    (intel_cstate->base.color_mgmt_changed ||
-	     intel_cstate->update_pipe)) {
-		intel_color_set_csc(&intel_cstate->base);
-		intel_color_load_luts(&intel_cstate->base);
-	}
-
 	/* Perform vblank evasion around commit operation */
 	intel_pipe_update_start(intel_cstate);
 
 	if (modeset)
 		goto out;
 
+	if (intel_cstate->base.color_mgmt_changed ||
+	    intel_cstate->update_pipe)
+		intel_color_commit(intel_cstate);
+
 	if (intel_cstate->update_pipe)
 		intel_update_pipe_config(old_intel_cstate, intel_cstate);
 	else if (INTEL_GEN(dev_priv) >= 9)
@@ -13578,7 +13815,7 @@ void intel_crtc_arm_fifo_underrun(struct intel_crtc *crtc,
 {
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 
-	if (!IS_GEN2(dev_priv))
+	if (!IS_GEN(dev_priv, 2))
 		intel_set_cpu_fifo_underrun_reporting(dev_priv, crtc->pipe, true);
 
 	if (crtc_state->has_pch_encoder) {
@@ -13702,8 +13939,8 @@ intel_legacy_cursor_update(struct drm_plane *plane,
 			   struct drm_framebuffer *fb,
 			   int crtc_x, int crtc_y,
 			   unsigned int crtc_w, unsigned int crtc_h,
-			   uint32_t src_x, uint32_t src_y,
-			   uint32_t src_w, uint32_t src_h,
+			   u32 src_x, u32 src_y,
+			   u32 src_w, u32 src_h,
 			   struct drm_modeset_acquire_ctx *ctx)
 {
 	struct drm_i915_private *dev_priv = to_i915(crtc->dev);
@@ -14040,7 +14277,7 @@ static void intel_crtc_init_scalers(struct intel_crtc *crtc,
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 	int i;
 
-	crtc->num_scalers = dev_priv->info.num_scalers[crtc->pipe];
+	crtc->num_scalers = RUNTIME_INFO(dev_priv)->num_scalers[crtc->pipe];
 	if (!crtc->num_scalers)
 		return;
 
@@ -14126,7 +14363,7 @@ static int intel_crtc_init(struct drm_i915_private *dev_priv, enum pipe pipe)
 
 	drm_crtc_helper_add(&intel_crtc->base, &intel_helper_funcs);
 
-	intel_color_init(&intel_crtc->base);
+	intel_color_init(intel_crtc);
 
 	WARN_ON(drm_crtc_index(&intel_crtc->base) != intel_crtc->pipe);
 
@@ -14177,7 +14414,7 @@ static int intel_encoder_clones(struct intel_encoder *encoder)
 	return index_mask;
 }
 
-static bool has_edp_a(struct drm_i915_private *dev_priv)
+static bool ilk_has_edp_a(struct drm_i915_private *dev_priv)
 {
 	if (!IS_MOBILE(dev_priv))
 		return false;
@@ -14185,13 +14422,13 @@ static bool has_edp_a(struct drm_i915_private *dev_priv)
 	if ((I915_READ(DP_A) & DP_DETECTED) == 0)
 		return false;
 
-	if (IS_GEN5(dev_priv) && (I915_READ(FUSE_STRAP) & ILK_eDP_A_DISABLE))
+	if (IS_GEN(dev_priv, 5) && (I915_READ(FUSE_STRAP) & ILK_eDP_A_DISABLE))
 		return false;
 
 	return true;
 }
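
The IS_GEN5()-style per-generation macros are converted throughout to the parameterized IS_GEN(dev_priv, n), with IS_GEN_RANGE() for spans such as gen3-4 further below. A plausible bitmask-based implementation of such macros (names hypothetical; assumes 32-bit unsigned int):

/* One bit per generation lets a range test collapse to a single AND. */
struct device_info { unsigned int gen_mask; };

#define GEN_BITS(s, e) \
	(((~0u) << (s)) & ((~0u) >> (31 - (e))))

#define IS_GEN(info, n)		(((info)->gen_mask & (1u << (n))) != 0)
#define IS_GEN_RANGE(info, s, e) \
	(((info)->gen_mask & GEN_BITS((s), (e))) != 0)
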
 
-static bool intel_crt_present(struct drm_i915_private *dev_priv)
+static bool intel_ddi_crt_present(struct drm_i915_private *dev_priv)
 {
 	if (INTEL_GEN(dev_priv) >= 9)
 		return false;
@@ -14199,15 +14436,12 @@ static bool intel_crt_present(struct drm_i915_private *dev_priv)
 	if (IS_HSW_ULT(dev_priv) || IS_BDW_ULT(dev_priv))
 		return false;
 
-	if (IS_CHERRYVIEW(dev_priv))
-		return false;
-
 	if (HAS_PCH_LPT_H(dev_priv) &&
 	    I915_READ(SFUSE_STRAP) & SFUSE_STRAP_CRT_DISABLED)
 		return false;
 
 	/* DDI E can't be used if DDI A requires 4 lanes */
-	if (HAS_DDI(dev_priv) && I915_READ(DDI_BUF_CTL(PORT_A)) & DDI_A_4_LANES)
+	if (I915_READ(DDI_BUF_CTL(PORT_A)) & DDI_A_4_LANES)
 		return false;
 
 	if (!dev_priv->vbt.int_crt_support)
@@ -14262,23 +14496,21 @@ static void intel_setup_outputs(struct drm_i915_private *dev_priv)
 	if (!HAS_DISPLAY(dev_priv))
 		return;
 
-	/*
-	 * intel_edp_init_connector() depends on this completing first, to
-	 * prevent the registeration of both eDP and LVDS and the incorrect
-	 * sharing of the PPS.
-	 */
-	intel_lvds_init(dev_priv);
-
-	if (intel_crt_present(dev_priv))
-		intel_crt_init(dev_priv);
-
 	if (IS_ICELAKE(dev_priv)) {
 		intel_ddi_init(dev_priv, PORT_A);
 		intel_ddi_init(dev_priv, PORT_B);
 		intel_ddi_init(dev_priv, PORT_C);
 		intel_ddi_init(dev_priv, PORT_D);
 		intel_ddi_init(dev_priv, PORT_E);
-		intel_ddi_init(dev_priv, PORT_F);
+		/*
+		 * On some ICL SKUs port F is not present. No strap bits for
+		 * this, so rely on VBT.
+		 * Work around broken VBTs on SKUs known to have no port F.
+		 */
+		if (IS_ICL_WITH_PORT_F(dev_priv) &&
+		    intel_bios_is_port_present(dev_priv, PORT_F))
+			intel_ddi_init(dev_priv, PORT_F);
+
 		icl_dsi_init(dev_priv);
 	} else if (IS_GEN9_LP(dev_priv)) {
 		/*
@@ -14294,6 +14526,9 @@ static void intel_setup_outputs(struct drm_i915_private *dev_priv)
 	} else if (HAS_DDI(dev_priv)) {
 		int found;
 
+		if (intel_ddi_crt_present(dev_priv))
+			intel_crt_init(dev_priv);
+
 		/*
 		 * Haswell uses DDI functions to detect digital outputs.
 		 * On SKL pre-D0 the strap isn't connected, so we assume
@@ -14320,16 +14555,23 @@ static void intel_setup_outputs(struct drm_i915_private *dev_priv)
 		 * On SKL we don't have a way to detect DDI-E so we rely on VBT.
 		 */
 		if (IS_GEN9_BC(dev_priv) &&
-		    (dev_priv->vbt.ddi_port_info[PORT_E].supports_dp ||
-		     dev_priv->vbt.ddi_port_info[PORT_E].supports_dvi ||
-		     dev_priv->vbt.ddi_port_info[PORT_E].supports_hdmi))
+		    intel_bios_is_port_present(dev_priv, PORT_E))
 			intel_ddi_init(dev_priv, PORT_E);
 
 	} else if (HAS_PCH_SPLIT(dev_priv)) {
 		int found;
+
+		/*
+		 * intel_edp_init_connector() depends on this completing first,
+		 * to prevent the registration of both eDP and LVDS and the
+		 * incorrect sharing of the PPS.
+		 */
+		intel_lvds_init(dev_priv);
+		intel_crt_init(dev_priv);
+
 		dpd_is_edp = intel_dp_is_port_edp(dev_priv, PORT_D);
 
-		if (has_edp_a(dev_priv))
+		if (ilk_has_edp_a(dev_priv))
 			intel_dp_init(dev_priv, DP_A, PORT_A);
 
 		if (I915_READ(PCH_HDMIB) & SDVO_DETECTED) {
@@ -14355,6 +14597,9 @@ static void intel_setup_outputs(struct drm_i915_private *dev_priv)
 	} else if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
 		bool has_edp, has_port;
 
+		if (IS_VALLEYVIEW(dev_priv) && dev_priv->vbt.int_crt_support)
+			intel_crt_init(dev_priv);
+
 		/*
 		 * The DP_DETECTED bit is the latched state of the DDC
 		 * SDA pin at boot. However since eDP doesn't require DDC
@@ -14397,9 +14642,17 @@ static void intel_setup_outputs(struct drm_i915_private *dev_priv)
 		}
 
 		vlv_dsi_init(dev_priv);
-	} else if (!IS_GEN2(dev_priv) && !IS_PINEVIEW(dev_priv)) {
+	} else if (IS_PINEVIEW(dev_priv)) {
+		intel_lvds_init(dev_priv);
+		intel_crt_init(dev_priv);
+	} else if (IS_GEN_RANGE(dev_priv, 3, 4)) {
 		bool found = false;
 
+		if (IS_MOBILE(dev_priv))
+			intel_lvds_init(dev_priv);
+
+		intel_crt_init(dev_priv);
+
 		if (I915_READ(GEN3_SDVOB) & SDVO_DETECTED) {
 			DRM_DEBUG_KMS("probing SDVOB\n");
 			found = intel_sdvo_init(dev_priv, GEN3_SDVOB, PORT_B);
@@ -14431,11 +14684,16 @@ static void intel_setup_outputs(struct drm_i915_private *dev_priv)
 
 		if (IS_G4X(dev_priv) && (I915_READ(DP_D) & DP_DETECTED))
 			intel_dp_init(dev_priv, DP_D, PORT_D);
-	} else if (IS_GEN2(dev_priv))
-		intel_dvo_init(dev_priv);
 
-	if (SUPPORTS_TV(dev_priv))
-		intel_tv_init(dev_priv);
+		if (SUPPORTS_TV(dev_priv))
+			intel_tv_init(dev_priv);
+	} else if (IS_GEN(dev_priv, 2)) {
+		if (IS_I85X(dev_priv))
+			intel_lvds_init(dev_priv);
+
+		intel_crt_init(dev_priv);
+		intel_dvo_init(dev_priv);
+	}
 
 	intel_psr_init(dev_priv);
 
@@ -14602,14 +14860,6 @@ static int intel_framebuffer_init(struct intel_framebuffer *intel_fb,
 
 	drm_helper_mode_fill_fb_struct(&dev_priv->drm, fb, mode_cmd);
 
-	if (fb->format->format == DRM_FORMAT_NV12 &&
-	    (fb->width < SKL_MIN_YUV_420_SRC_W ||
-	     fb->height < SKL_MIN_YUV_420_SRC_H ||
-	     (fb->width % 4) != 0 || (fb->height % 4) != 0)) {
-		DRM_DEBUG_KMS("src dimensions not correct for NV12\n");
-		goto err;
-	}
-
 	for (i = 0; i < fb->format->num_planes; i++) {
 		u32 stride_alignment;
 
@@ -14629,7 +14879,7 @@ static int intel_framebuffer_init(struct intel_framebuffer *intel_fb,
 		 * require the entire fb to accommodate that to avoid
 		 * potential runtime errors at plane configuration time.
 		 */
-		if (IS_GEN9(dev_priv) && i == 0 && fb->width > 3840 &&
+		if (IS_GEN(dev_priv, 9) && i == 0 && fb->width > 3840 &&
 		    is_ccs_modifier(fb->modifier))
 			stride_alignment *= 4;
 
@@ -14834,7 +15084,7 @@ void intel_init_display_hooks(struct drm_i915_private *dev_priv)
 		dev_priv->display.crtc_compute_clock = pnv_crtc_compute_clock;
 		dev_priv->display.crtc_enable = i9xx_crtc_enable;
 		dev_priv->display.crtc_disable = i9xx_crtc_disable;
-	} else if (!IS_GEN2(dev_priv)) {
+	} else if (!IS_GEN(dev_priv, 2)) {
 		dev_priv->display.get_pipe_config = i9xx_get_pipe_config;
 		dev_priv->display.get_initial_plane_config =
 			i9xx_get_initial_plane_config;
@@ -14850,9 +15100,9 @@ void intel_init_display_hooks(struct drm_i915_private *dev_priv)
 		dev_priv->display.crtc_disable = i9xx_crtc_disable;
 	}
 
-	if (IS_GEN5(dev_priv)) {
+	if (IS_GEN(dev_priv, 5)) {
 		dev_priv->display.fdi_link_train = ironlake_fdi_link_train;
-	} else if (IS_GEN6(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 6)) {
 		dev_priv->display.fdi_link_train = gen6_fdi_link_train;
 	} else if (IS_IVYBRIDGE(dev_priv)) {
 		/* FIXME: detect B0+ stepping and use auto training */
@@ -14945,7 +15195,7 @@ retry:
 	 * intermediate watermarks (since we don't trust the current
 	 * watermarks).
 	 */
-	if (!HAS_GMCH_DISPLAY(dev_priv))
+	if (!HAS_GMCH(dev_priv))
 		intel_state->skip_intermediate_wm = true;
 
 	ret = intel_atomic_check(dev, state);
@@ -14984,12 +15234,12 @@ fail:
 
 static void intel_update_fdi_pll_freq(struct drm_i915_private *dev_priv)
 {
-	if (IS_GEN5(dev_priv)) {
+	if (IS_GEN(dev_priv, 5)) {
 		u32 fdi_pll_clk =
 			I915_READ(FDI_PLL_BIOS_0) & FDI_PLL_FB_CLOCK_MASK;
 
 		dev_priv->fdi_pll_freq = (fdi_pll_clk + 2) * 10000;
-	} else if (IS_GEN6(dev_priv) || IS_IVYBRIDGE(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 6) || IS_IVYBRIDGE(dev_priv)) {
 		dev_priv->fdi_pll_freq = 270000;
 	} else {
 		return;
@@ -15105,10 +15355,10 @@ int intel_modeset_init(struct drm_device *dev)
 	}
 
 	/* maximum framebuffer dimensions */
-	if (IS_GEN2(dev_priv)) {
+	if (IS_GEN(dev_priv, 2)) {
 		dev->mode_config.max_width = 2048;
 		dev->mode_config.max_height = 2048;
-	} else if (IS_GEN3(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 3)) {
 		dev->mode_config.max_width = 4096;
 		dev->mode_config.max_height = 4096;
 	} else {
@@ -15119,7 +15369,7 @@ int intel_modeset_init(struct drm_device *dev)
 	if (IS_I845G(dev_priv) || IS_I865G(dev_priv)) {
 		dev->mode_config.cursor_width = IS_I845G(dev_priv) ? 64 : 512;
 		dev->mode_config.cursor_height = 1023;
-	} else if (IS_GEN2(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 2)) {
 		dev->mode_config.cursor_width = 64;
 		dev->mode_config.cursor_height = 64;
 	} else {
@@ -15186,7 +15436,7 @@ int intel_modeset_init(struct drm_device *dev)
 	 * Note that we need to do this after reconstructing the BIOS fb's
 	 * since the watermark calculation done here will use pstate->fb.
 	 */
-	if (!HAS_GMCH_DISPLAY(dev_priv))
+	if (!HAS_GMCH(dev_priv))
 		sanitize_watermarks(dev);
 
 	/*
@@ -15379,6 +15629,15 @@ static void intel_sanitize_crtc(struct intel_crtc *crtc,
 			    plane->base.type != DRM_PLANE_TYPE_PRIMARY)
 				intel_plane_disable_noatomic(crtc, plane);
 		}
+
+		/*
+		 * Disable any background color set by the BIOS, but enable the
+		 * gamma and CSC to match how we program our planes.
+		 */
+		if (INTEL_GEN(dev_priv) >= 9)
+			I915_WRITE(SKL_BOTTOM_COLOR(crtc->pipe),
+				   SKL_BOTTOM_COLOR_GAMMA_ENABLE |
+				   SKL_BOTTOM_COLOR_CSC_ENABLE);
 	}
 
 	/* Adjust the state of the output pipe according to whether we
@@ -15386,7 +15645,7 @@ static void intel_sanitize_crtc(struct intel_crtc *crtc,
 	if (crtc_state->base.active && !intel_crtc_has_encoders(crtc))
 		intel_crtc_disable_noatomic(&crtc->base, ctx);
 
-	if (crtc_state->base.active || HAS_GMCH_DISPLAY(dev_priv)) {
+	if (crtc_state->base.active || HAS_GMCH(dev_priv)) {
 		/*
 		 * We start out with underrun reporting disabled to avoid races.
 		 * For correct bookkeeping mark this on active crtcs.
@@ -15429,7 +15688,7 @@ static bool has_bogus_dpll_config(const struct intel_crtc_state *crtc_state)
 	 * without several WARNs, but for now let's take the easy
 	 * road.
 	 */
-	return IS_GEN6(dev_priv) &&
+	return IS_GEN(dev_priv, 6) &&
 		crtc_state->base.active &&
 		crtc_state->shared_dpll &&
 		crtc_state->port_clock == 0;
@@ -15514,19 +15773,25 @@ void i915_redisable_vga_power_on(struct drm_i915_private *dev_priv)
 
 void i915_redisable_vga(struct drm_i915_private *dev_priv)
 {
-	/* This function can be called both from intel_modeset_setup_hw_state or
+	intel_wakeref_t wakeref;
+
+	/*
+	 * This function can be called both from intel_modeset_setup_hw_state or
 	 * at a very early point in our resume sequence, where the power well
 	 * structures are not yet restored. Since this function is at a very
 	 * paranoid "someone might have enabled VGA while we were not looking"
 	 * level, just check if the power well is enabled instead of trying to
 	 * follow the "don't touch the power well if we don't need it" policy
-	 * the rest of the driver uses. */
-	if (!intel_display_power_get_if_enabled(dev_priv, POWER_DOMAIN_VGA))
+	 * the rest of the driver uses.
+	 */
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     POWER_DOMAIN_VGA);
+	if (!wakeref)
 		return;
 
 	i915_redisable_vga_power_on(dev_priv);
 
-	intel_display_power_put(dev_priv, POWER_DOMAIN_VGA);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_VGA, wakeref);
 }
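
This is the wakeref-tracking pattern applied across the series: intel_display_power_get() hands back an intel_wakeref_t cookie that intel_display_power_put() must consume, and the _if_enabled() variant returns 0 when the well is off, so the cookie doubles as the success test. A standalone sketch of the cookie discipline (names hypothetical):

#include <stdbool.h>
#include <stdio.h>

typedef unsigned long wakeref_t;

static unsigned long next_cookie = 1;
static bool well_enabled = true;

/* Returns 0 when the power well is off, else a cookie to hand back. */
static wakeref_t power_get_if_enabled(void)
{
	return well_enabled ? next_cookie++ : 0;
}

static void power_put(wakeref_t wf)
{
	/* A real tracker matches 'wf' against outstanding cookies and
	 * warns on mismatch, pinpointing the call site that leaked. */
	printf("released cookie %lu\n", wf);
}

static void redisable_vga(void)
{
	wakeref_t wakeref = power_get_if_enabled();

	if (!wakeref)
		return;

	/* ... poke the VGA disable register here ... */

	power_put(wakeref);
}

int main(void)
{
	redisable_vga();
	return 0;
}
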
 
 /* FIXME read out full plane state for all planes */
@@ -15826,12 +16091,13 @@ intel_modeset_setup_hw_state(struct drm_device *dev,
 			     struct drm_modeset_acquire_ctx *ctx)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct intel_crtc *crtc;
 	struct intel_crtc_state *crtc_state;
 	struct intel_encoder *encoder;
+	struct intel_crtc *crtc;
+	intel_wakeref_t wakeref;
 	int i;
 
-	intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
+	wakeref = intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
 
 	intel_early_display_was(dev_priv);
 	intel_modeset_readout_hw_state(dev);
@@ -15847,10 +16113,12 @@ intel_modeset_setup_hw_state(struct drm_device *dev,
 	 * waits, so we need vblank interrupts restored beforehand.
 	 */
 	for_each_intel_crtc(&dev_priv->drm, crtc) {
+		crtc_state = to_intel_crtc_state(crtc->base.state);
+
 		drm_crtc_vblank_reset(&crtc->base);
 
-		if (crtc->base.state->active)
-			drm_crtc_vblank_on(&crtc->base);
+		if (crtc_state->base.active)
+			intel_crtc_vblank_on(crtc_state);
 	}
 
 	intel_sanitize_plane_mapping(dev_priv);
@@ -15881,15 +16149,15 @@ intel_modeset_setup_hw_state(struct drm_device *dev,
 	}
 
 	if (IS_G4X(dev_priv)) {
-		g4x_wm_get_hw_state(dev);
+		g4x_wm_get_hw_state(dev_priv);
 		g4x_wm_sanitize(dev_priv);
 	} else if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
-		vlv_wm_get_hw_state(dev);
+		vlv_wm_get_hw_state(dev_priv);
 		vlv_wm_sanitize(dev_priv);
 	} else if (INTEL_GEN(dev_priv) >= 9) {
-		skl_wm_get_hw_state(dev);
+		skl_wm_get_hw_state(dev_priv);
 	} else if (HAS_PCH_SPLIT(dev_priv)) {
-		ilk_wm_get_hw_state(dev);
+		ilk_wm_get_hw_state(dev_priv);
 	}
 
 	for_each_intel_crtc(dev, crtc) {
@@ -15901,7 +16169,7 @@ intel_modeset_setup_hw_state(struct drm_device *dev,
 			modeset_put_power_domains(dev_priv, put_domains);
 	}
 
-	intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_INIT, wakeref);
 
 	intel_fbc_init_pipe_state(dev_priv);
 }
@@ -16124,7 +16392,7 @@ intel_display_capture_error_state(struct drm_i915_private *dev_priv)
 
 		error->pipe[i].source = I915_READ(PIPESRC(i));
 
-		if (HAS_GMCH_DISPLAY(dev_priv))
+		if (HAS_GMCH(dev_priv))
 			error->pipe[i].stat = I915_READ(PIPESTAT(i));
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_display.h b/drivers/gpu/drm/i915/intel_display.h
index 79203666fc62..2220588e86ac 100644
--- a/drivers/gpu/drm/i915/intel_display.h
+++ b/drivers/gpu/drm/i915/intel_display.h
@@ -122,7 +122,7 @@ enum i9xx_plane_id {
 };
 
 #define plane_name(p) ((p) + 'A')
-#define sprite_name(p, s) ((p) * INTEL_INFO(dev_priv)->num_sprites[(p)] + (s) + 'A')
+#define sprite_name(p, s) ((p) * RUNTIME_INFO(dev_priv)->num_sprites[(p)] + (s) + 'A')
 
 /*
  * Per-pipe plane identifier.
@@ -297,12 +297,12 @@ struct intel_link_m_n {
 
 #define for_each_universal_plane(__dev_priv, __pipe, __p)		\
 	for ((__p) = 0;							\
-	     (__p) < INTEL_INFO(__dev_priv)->num_sprites[(__pipe)] + 1;	\
+	     (__p) < RUNTIME_INFO(__dev_priv)->num_sprites[(__pipe)] + 1;	\
 	     (__p)++)
 
 #define for_each_sprite(__dev_priv, __p, __s)				\
 	for ((__s) = 0;							\
-	     (__s) < INTEL_INFO(__dev_priv)->num_sprites[(__p)];	\
+	     (__s) < RUNTIME_INFO(__dev_priv)->num_sprites[(__p)];	\
 	     (__s)++)
 
 #define for_each_port_masked(__port, __ports_mask) \
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 22a74608c6e4..cf709835fb9a 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -32,13 +32,12 @@
 #include <linux/notifier.h>
 #include <linux/reboot.h>
 #include <asm/byteorder.h>
-#include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_dp_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_hdcp.h>
+#include <drm/drm_probe_helper.h>
 #include "intel_drv.h"
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
@@ -346,7 +345,7 @@ intel_dp_set_source_rates(struct intel_dp *intel_dp)
 	if (INTEL_GEN(dev_priv) >= 10) {
 		source_rates = cnl_rates;
 		size = ARRAY_SIZE(cnl_rates);
-		if (IS_GEN10(dev_priv))
+		if (IS_GEN(dev_priv, 10))
 			max_rate = cnl_max_source_rate(intel_dp);
 		else
 			max_rate = icl_max_source_rate(intel_dp);
@@ -430,7 +429,7 @@ static void intel_dp_set_common_rates(struct intel_dp *intel_dp)
 }
 
 static bool intel_dp_link_params_valid(struct intel_dp *intel_dp, int link_rate,
-				       uint8_t lane_count)
+				       u8 lane_count)
 {
 	/*
 	 * FIXME: we need to synchronize the current link parameters with
@@ -450,7 +449,7 @@ static bool intel_dp_link_params_valid(struct intel_dp *intel_dp, int link_rate,
 
 static bool intel_dp_can_link_train_fallback_for_edp(struct intel_dp *intel_dp,
 						     int link_rate,
-						     uint8_t lane_count)
+						     u8 lane_count)
 {
 	const struct drm_display_mode *fixed_mode =
 		intel_dp->attached_connector->panel.fixed_mode;
@@ -465,7 +464,7 @@ static bool intel_dp_can_link_train_fallback_for_edp(struct intel_dp *intel_dp,
 }
 
 int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
-					    int link_rate, uint8_t lane_count)
+					    int link_rate, u8 lane_count)
 {
 	int index;
 
@@ -573,19 +572,19 @@ intel_dp_mode_valid(struct drm_connector *connector,
 	return MODE_OK;
 }
 
-uint32_t intel_dp_pack_aux(const uint8_t *src, int src_bytes)
+u32 intel_dp_pack_aux(const u8 *src, int src_bytes)
 {
-	int	i;
-	uint32_t v = 0;
+	int i;
+	u32 v = 0;
 
 	if (src_bytes > 4)
 		src_bytes = 4;
 	for (i = 0; i < src_bytes; i++)
-		v |= ((uint32_t) src[i]) << ((3-i) * 8);
+		v |= ((u32)src[i]) << ((3 - i) * 8);
 	return v;
 }
 
-static void intel_dp_unpack_aux(uint32_t src, uint8_t *dst, int dst_bytes)
+static void intel_dp_unpack_aux(u32 src, u8 *dst, int dst_bytes)
 {
 	int i;
 	if (dst_bytes > 4)
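
intel_dp_pack_aux() loads up to four message bytes MSB-first into one 32-bit AUX data register, and the unpack helper reverses it. A standalone version plus a worked example, where {0x12, 0x34} packs to 0x12340000:

#include <stdint.h>
#include <stdio.h>

static uint32_t pack_aux(const uint8_t *src, int src_bytes)
{
	uint32_t v = 0;
	int i;

	if (src_bytes > 4)
		src_bytes = 4;
	for (i = 0; i < src_bytes; i++)
		v |= ((uint32_t)src[i]) << ((3 - i) * 8);	/* byte 0 -> bits 31:24 */

	return v;
}

int main(void)
{
	const uint8_t msg[] = { 0x12, 0x34 };

	printf("0x%08x\n", pack_aux(msg, 2));	/* prints 0x12340000 */
	return 0;
}
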
@@ -602,30 +601,39 @@ intel_dp_init_panel_power_sequencer_registers(struct intel_dp *intel_dp,
 static void
 intel_dp_pps_init(struct intel_dp *intel_dp);
 
-static void pps_lock(struct intel_dp *intel_dp)
+static intel_wakeref_t
+pps_lock(struct intel_dp *intel_dp)
 {
 	struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
+	intel_wakeref_t wakeref;
 
 	/*
 	 * See intel_power_sequencer_reset() for why we need
 	 * a power domain reference here.
 	 */
-	intel_display_power_get(dev_priv,
-				intel_aux_power_domain(dp_to_dig_port(intel_dp)));
+	wakeref = intel_display_power_get(dev_priv,
+					  intel_aux_power_domain(dp_to_dig_port(intel_dp)));
 
 	mutex_lock(&dev_priv->pps_mutex);
+
+	return wakeref;
 }
 
-static void pps_unlock(struct intel_dp *intel_dp)
+static intel_wakeref_t
+pps_unlock(struct intel_dp *intel_dp, intel_wakeref_t wakeref)
 {
 	struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
 
 	mutex_unlock(&dev_priv->pps_mutex);
-
 	intel_display_power_put(dev_priv,
-				intel_aux_power_domain(dp_to_dig_port(intel_dp)));
+				intel_aux_power_domain(dp_to_dig_port(intel_dp)),
+				wakeref);
+	return 0;
 }
 
+#define with_pps_lock(dp, wf) \
+	for ((wf) = pps_lock(dp); (wf); (wf) = pps_unlock((dp), (wf)))
+
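
pps_lock() now returns the wakeref it took, and pps_unlock() consumes it and returns 0, which is what lets with_pps_lock() drive a one-iteration for loop: acquire in the initializer, run the body while the cookie is nonzero, release in the increment. Note the body must not break or return, or the increment (and thus the unlock) is skipped. A self-contained sketch of the idiom with a plain mutex standing in for the PPS machinery (names hypothetical):

#include <pthread.h>
#include <stdio.h>

typedef unsigned long wakeref_t;

static pthread_mutex_t pps_mutex = PTHREAD_MUTEX_INITIALIZER;

static wakeref_t scoped_lock(void)
{
	pthread_mutex_lock(&pps_mutex);
	return 1;			/* nonzero cookie: "held" */
}

static wakeref_t scoped_unlock(wakeref_t wf)
{
	(void)wf;
	pthread_mutex_unlock(&pps_mutex);
	return 0;			/* zero terminates the loop */
}

#define with_scoped_lock(wf) \
	for ((wf) = scoped_lock(); (wf); (wf) = scoped_unlock(wf))

int main(void)
{
	wakeref_t wf;

	with_scoped_lock(wf)		/* body runs exactly once */
		printf("inside the critical section\n");

	return 0;
}
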
 static void
 vlv_power_sequencer_kick(struct intel_dp *intel_dp)
 {
@@ -635,7 +643,7 @@ vlv_power_sequencer_kick(struct intel_dp *intel_dp)
 	bool pll_enabled, release_cl_override = false;
 	enum dpio_phy phy = DPIO_PHY(pipe);
 	enum dpio_channel ch = vlv_pipe_to_channel(pipe);
-	uint32_t DP;
+	u32 DP;
 
 	if (WARN(I915_READ(intel_dp->output_reg) & DP_PORT_EN,
 		 "skipping pipe %c power sequencer kick due to port %c being active\n",
@@ -974,30 +982,29 @@ static int edp_notify_handler(struct notifier_block *this, unsigned long code,
 	struct intel_dp *intel_dp = container_of(this, typeof(* intel_dp),
 						 edp_notifier);
 	struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
+	intel_wakeref_t wakeref;
 
 	if (!intel_dp_is_edp(intel_dp) || code != SYS_RESTART)
 		return 0;
 
-	pps_lock(intel_dp);
-
-	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
-		enum pipe pipe = vlv_power_sequencer_pipe(intel_dp);
-		i915_reg_t pp_ctrl_reg, pp_div_reg;
-		u32 pp_div;
-
-		pp_ctrl_reg = PP_CONTROL(pipe);
-		pp_div_reg  = PP_DIVISOR(pipe);
-		pp_div = I915_READ(pp_div_reg);
-		pp_div &= PP_REFERENCE_DIVIDER_MASK;
-
-		/* 0x1F write to PP_DIV_REG sets max cycle delay */
-		I915_WRITE(pp_div_reg, pp_div | 0x1F);
-		I915_WRITE(pp_ctrl_reg, PANEL_UNLOCK_REGS | PANEL_POWER_OFF);
-		msleep(intel_dp->panel_power_cycle_delay);
+	with_pps_lock(intel_dp, wakeref) {
+		if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
+			enum pipe pipe = vlv_power_sequencer_pipe(intel_dp);
+			i915_reg_t pp_ctrl_reg, pp_div_reg;
+			u32 pp_div;
+
+			pp_ctrl_reg = PP_CONTROL(pipe);
+			pp_div_reg  = PP_DIVISOR(pipe);
+			pp_div = I915_READ(pp_div_reg);
+			pp_div &= PP_REFERENCE_DIVIDER_MASK;
+
+			/* 0x1F write to PP_DIV_REG sets max cycle delay */
+			I915_WRITE(pp_div_reg, pp_div | 0x1F);
+			I915_WRITE(pp_ctrl_reg, PANEL_UNLOCK_REGS);
+			msleep(intel_dp->panel_power_cycle_delay);
+		}
 	}
 
-	pps_unlock(intel_dp);
-
 	return 0;
 }
 
@@ -1043,17 +1050,21 @@ intel_dp_check_edp(struct intel_dp *intel_dp)
 	}
 }
 
-static uint32_t
+static u32
 intel_dp_aux_wait_done(struct intel_dp *intel_dp)
 {
 	struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
 	i915_reg_t ch_ctl = intel_dp->aux_ch_ctl_reg(intel_dp);
-	uint32_t status;
+	u32 status;
 	bool done;
 
 #define C (((status = I915_READ_NOTRACE(ch_ctl)) & DP_AUX_CH_CTL_SEND_BUSY) == 0)
 	done = wait_event_timeout(dev_priv->gmbus_wait_queue, C,
 				  msecs_to_jiffies_timeout(10));
+
+	/* just trace the final value */
+	trace_i915_reg_rw(false, ch_ctl, status, sizeof(status), true);
+
 	if (!done)
 		DRM_ERROR("dp aux hw did not signal timeout!\n");
 #undef C
@@ -1061,7 +1072,7 @@ intel_dp_aux_wait_done(struct intel_dp *intel_dp)
 	return status;
 }
 
-static uint32_t g4x_get_aux_clock_divider(struct intel_dp *intel_dp, int index)
+static u32 g4x_get_aux_clock_divider(struct intel_dp *intel_dp, int index)
 {
 	struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
 
@@ -1075,7 +1086,7 @@ static uint32_t g4x_get_aux_clock_divider(struct intel_dp *intel_dp, int index)
 	return DIV_ROUND_CLOSEST(dev_priv->rawclk_freq, 2000);
 }
 
-static uint32_t ilk_get_aux_clock_divider(struct intel_dp *intel_dp, int index)
+static u32 ilk_get_aux_clock_divider(struct intel_dp *intel_dp, int index)
 {
 	struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
 	struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
@@ -1094,7 +1105,7 @@ static uint32_t ilk_get_aux_clock_divider(struct intel_dp *intel_dp, int index)
 		return DIV_ROUND_CLOSEST(dev_priv->rawclk_freq, 2000);
 }
 
-static uint32_t hsw_get_aux_clock_divider(struct intel_dp *intel_dp, int index)
+static u32 hsw_get_aux_clock_divider(struct intel_dp *intel_dp, int index)
 {
 	struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
 	struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
@@ -1111,7 +1122,7 @@ static uint32_t hsw_get_aux_clock_divider(struct intel_dp *intel_dp, int index)
 	return ilk_get_aux_clock_divider(intel_dp, index);
 }
 
-static uint32_t skl_get_aux_clock_divider(struct intel_dp *intel_dp, int index)
+static u32 skl_get_aux_clock_divider(struct intel_dp *intel_dp, int index)
 {
 	/*
 	 * SKL doesn't need us to program the AUX clock divider (Hardware will
@@ -1121,16 +1132,16 @@ static uint32_t skl_get_aux_clock_divider(struct intel_dp *intel_dp, int index)
 	return index ? 0 : 1;
 }
 
-static uint32_t g4x_get_aux_send_ctl(struct intel_dp *intel_dp,
-				     int send_bytes,
-				     uint32_t aux_clock_divider)
+static u32 g4x_get_aux_send_ctl(struct intel_dp *intel_dp,
+				int send_bytes,
+				u32 aux_clock_divider)
 {
 	struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
 	struct drm_i915_private *dev_priv =
 			to_i915(intel_dig_port->base.base.dev);
-	uint32_t precharge, timeout;
+	u32 precharge, timeout;
 
-	if (IS_GEN6(dev_priv))
+	if (IS_GEN(dev_priv, 6))
 		precharge = 3;
 	else
 		precharge = 5;
@@ -1151,12 +1162,12 @@ static uint32_t g4x_get_aux_send_ctl(struct intel_dp *intel_dp,
 	       (aux_clock_divider << DP_AUX_CH_CTL_BIT_CLOCK_2X_SHIFT);
 }
 
-static uint32_t skl_get_aux_send_ctl(struct intel_dp *intel_dp,
-				      int send_bytes,
-				      uint32_t unused)
+static u32 skl_get_aux_send_ctl(struct intel_dp *intel_dp,
+				int send_bytes,
+				u32 unused)
 {
 	struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
-	uint32_t ret;
+	u32 ret;
 
 	ret = DP_AUX_CH_CTL_SEND_BUSY |
 	      DP_AUX_CH_CTL_DONE |
@@ -1176,25 +1187,26 @@ static uint32_t skl_get_aux_send_ctl(struct intel_dp *intel_dp,
 
 static int
 intel_dp_aux_xfer(struct intel_dp *intel_dp,
-		  const uint8_t *send, int send_bytes,
-		  uint8_t *recv, int recv_size,
+		  const u8 *send, int send_bytes,
+		  u8 *recv, int recv_size,
 		  u32 aux_send_ctl_flags)
 {
 	struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
 	struct drm_i915_private *dev_priv =
 			to_i915(intel_dig_port->base.base.dev);
 	i915_reg_t ch_ctl, ch_data[5];
-	uint32_t aux_clock_divider;
+	u32 aux_clock_divider;
+	intel_wakeref_t wakeref;
 	int i, ret, recv_bytes;
-	uint32_t status;
 	int try, clock = 0;
+	u32 status;
 	bool vdd;
 
 	ch_ctl = intel_dp->aux_ch_ctl_reg(intel_dp);
 	for (i = 0; i < ARRAY_SIZE(ch_data); i++)
 		ch_data[i] = intel_dp->aux_ch_data_reg(intel_dp, i);
 
-	pps_lock(intel_dp);
+	wakeref = pps_lock(intel_dp);
 
 	/*
 	 * We will be called with VDD already enabled for dpcd/edid/oui reads.
@@ -1219,6 +1231,8 @@ intel_dp_aux_xfer(struct intel_dp *intel_dp,
 			break;
 		msleep(1);
 	}
+	/* just trace the final value */
+	trace_i915_reg_rw(false, ch_ctl, status, sizeof(status), true);
 
 	if (try == 3) {
 		static u32 last_status = -1;
@@ -1338,7 +1352,7 @@ out:
 	if (vdd)
 		edp_panel_vdd_off(intel_dp, false);
 
-	pps_unlock(intel_dp);
+	pps_unlock(intel_dp, wakeref);
 
 	return ret;
 }
@@ -1360,7 +1374,7 @@ static ssize_t
 intel_dp_aux_transfer(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg)
 {
 	struct intel_dp *intel_dp = container_of(aux, struct intel_dp, aux);
-	uint8_t txbuf[20], rxbuf[20];
+	u8 txbuf[20], rxbuf[20];
 	size_t txsize, rxsize;
 	int ret;
 
@@ -1693,7 +1707,7 @@ int intel_dp_rate_select(struct intel_dp *intel_dp, int rate)
 }
 
 void intel_dp_compute_rate(struct intel_dp *intel_dp, int port_clock,
-			   uint8_t *link_bw, uint8_t *rate_select)
+			   u8 *link_bw, u8 *rate_select)
 {
 	/* eDP 1.4 rate select method. */
 	if (intel_dp->use_rate_select) {
@@ -1810,7 +1824,7 @@ intel_dp_adjust_compliance_config(struct intel_dp *intel_dp,
 }
 
 /* Optimize link config in order: max bpp, min clock, min lanes */
-static bool
+static int
 intel_dp_compute_link_config_wide(struct intel_dp *intel_dp,
 				  struct intel_crtc_state *pipe_config,
 				  const struct link_config_limits *limits)
@@ -1836,17 +1850,17 @@ intel_dp_compute_link_config_wide(struct intel_dp *intel_dp,
 					pipe_config->pipe_bpp = bpp;
 					pipe_config->port_clock = link_clock;
 
-					return true;
+					return 0;
 				}
 			}
 		}
 	}
 
-	return false;
+	return -EINVAL;
 }
 
 /* Optimize link config in order: max bpp, min lanes, min clock */
-static bool
+static int
 intel_dp_compute_link_config_fast(struct intel_dp *intel_dp,
 				  struct intel_crtc_state *pipe_config,
 				  const struct link_config_limits *limits)
@@ -1872,13 +1886,13 @@ intel_dp_compute_link_config_fast(struct intel_dp *intel_dp,
 					pipe_config->pipe_bpp = bpp;
 					pipe_config->port_clock = link_clock;
 
-					return true;
+					return 0;
 				}
 			}
 		}
 	}
 
-	return false;
+	return -EINVAL;
 }
 
 static int intel_dp_dsc_compute_bpp(struct intel_dp *intel_dp, u8 dsc_max_bpc)
@@ -1896,19 +1910,20 @@ static int intel_dp_dsc_compute_bpp(struct intel_dp *intel_dp, u8 dsc_max_bpc)
 	return 0;
 }
 
-static bool intel_dp_dsc_compute_config(struct intel_dp *intel_dp,
-					struct intel_crtc_state *pipe_config,
-					struct drm_connector_state *conn_state,
-					struct link_config_limits *limits)
+static int intel_dp_dsc_compute_config(struct intel_dp *intel_dp,
+				       struct intel_crtc_state *pipe_config,
+				       struct drm_connector_state *conn_state,
+				       struct link_config_limits *limits)
 {
 	struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
 	struct drm_i915_private *dev_priv = to_i915(dig_port->base.base.dev);
 	struct drm_display_mode *adjusted_mode = &pipe_config->base.adjusted_mode;
 	u8 dsc_max_bpc;
 	int pipe_bpp;
+	int ret;
 
 	if (!intel_dp_supports_dsc(intel_dp, pipe_config))
-		return false;
+		return -EINVAL;
 
 	dsc_max_bpc = min_t(u8, DP_DSC_MAX_SUPPORTED_BPC,
 			    conn_state->max_requested_bpc);
@@ -1916,7 +1931,7 @@ static bool intel_dp_dsc_compute_config(struct intel_dp *intel_dp,
 	pipe_bpp = intel_dp_dsc_compute_bpp(intel_dp, dsc_max_bpc);
 	if (pipe_bpp < DP_DSC_MIN_SUPPORTED_BPC * 3) {
 		DRM_DEBUG_KMS("No DSC support for less than 8bpc\n");
-		return false;
+		return -EINVAL;
 	}
 
 	/*
@@ -1950,7 +1965,7 @@ static bool intel_dp_dsc_compute_config(struct intel_dp *intel_dp,
 						     adjusted_mode->crtc_hdisplay);
 		if (!dsc_max_output_bpp || !dsc_dp_slice_count) {
 			DRM_DEBUG_KMS("Compressed BPP/Slice Count not supported\n");
-			return false;
+			return -EINVAL;
 		}
 		pipe_config->dsc_params.compressed_bpp = min_t(u16,
 							       dsc_max_output_bpp >> 4,
@@ -1967,16 +1982,19 @@ static bool intel_dp_dsc_compute_config(struct intel_dp *intel_dp,
 			pipe_config->dsc_params.dsc_split = true;
 		} else {
 			DRM_DEBUG_KMS("Cannot split stream to use 2 VDSC instances\n");
-			return false;
+			return -EINVAL;
 		}
 	}
-	if (intel_dp_compute_dsc_params(intel_dp, pipe_config) < 0) {
+
+	ret = intel_dp_compute_dsc_params(intel_dp, pipe_config);
+	if (ret < 0) {
 		DRM_DEBUG_KMS("Cannot compute valid DSC parameters for Input Bpp = %d "
 			      "Compressed BPP = %d\n",
 			      pipe_config->pipe_bpp,
 			      pipe_config->dsc_params.compressed_bpp);
-		return false;
+		return ret;
 	}
+
 	pipe_config->dsc_params.compression_enable = true;
 	DRM_DEBUG_KMS("DP DSC computed with Input Bpp = %d "
 		      "Compressed Bpp = %d Slice Count = %d\n",
@@ -1984,10 +2002,10 @@ static bool intel_dp_dsc_compute_config(struct intel_dp *intel_dp,
 		      pipe_config->dsc_params.compressed_bpp,
 		      pipe_config->dsc_params.slice_count);
 
-	return true;
+	return 0;
 }
 
-static bool
+static int
 intel_dp_compute_link_config(struct intel_encoder *encoder,
 			     struct intel_crtc_state *pipe_config,
 			     struct drm_connector_state *conn_state)
@@ -1996,7 +2014,7 @@ intel_dp_compute_link_config(struct intel_encoder *encoder,
 	struct intel_dp *intel_dp = enc_to_intel_dp(&encoder->base);
 	struct link_config_limits limits;
 	int common_len;
-	bool ret;
+	int ret;
 
 	common_len = intel_dp_common_len_rate_limit(intel_dp,
 						    intel_dp->max_link_rate);
@@ -2053,10 +2071,12 @@ intel_dp_compute_link_config(struct intel_encoder *encoder,
 							&limits);
 
 	/* enable compression if the mode doesn't fit available BW */
-	if (!ret) {
-		if (!intel_dp_dsc_compute_config(intel_dp, pipe_config,
-						 conn_state, &limits))
-			return false;
+	DRM_DEBUG_KMS("Force DSC en = %d\n", intel_dp->force_dsc_en);
+	if (ret || intel_dp->force_dsc_en) {
+		ret = intel_dp_dsc_compute_config(intel_dp, pipe_config,
+						  conn_state, &limits);
+		if (ret < 0)
+			return ret;
 	}
 
 	if (pipe_config->dsc_params.compression_enable) {
@@ -2081,10 +2101,10 @@ intel_dp_compute_link_config(struct intel_encoder *encoder,
 			      intel_dp_max_data_rate(pipe_config->port_clock,
 						     pipe_config->lane_count));
 	}
-	return true;
+	return 0;
 }
 
-bool
+int
 intel_dp_compute_config(struct intel_encoder *encoder,
 			struct intel_crtc_state *pipe_config,
 			struct drm_connector_state *conn_state)
@@ -2100,6 +2120,7 @@ intel_dp_compute_config(struct intel_encoder *encoder,
 		to_intel_digital_connector_state(conn_state);
 	bool constant_n = drm_dp_has_quirk(&intel_dp->desc,
 					   DP_DPCD_QUIRK_CONSTANT_N);
+	int ret;
 
 	if (HAS_PCH_SPLIT(dev_priv) && !HAS_DDI(dev_priv) && port != PORT_A)
 		pipe_config->has_pch_encoder = true;
@@ -2121,14 +2142,12 @@ intel_dp_compute_config(struct intel_encoder *encoder,
 				       adjusted_mode);
 
 		if (INTEL_GEN(dev_priv) >= 9) {
-			int ret;
-
 			ret = skl_update_scaler_crtc(pipe_config);
 			if (ret)
 				return ret;
 		}
 
-		if (HAS_GMCH_DISPLAY(dev_priv))
+		if (HAS_GMCH(dev_priv))
 			intel_gmch_panel_fitting(intel_crtc, pipe_config,
 						 conn_state->scaling_mode);
 		else
@@ -2137,20 +2156,21 @@ intel_dp_compute_config(struct intel_encoder *encoder,
 	}
 
 	if (adjusted_mode->flags & DRM_MODE_FLAG_DBLSCAN)
-		return false;
+		return -EINVAL;
 
-	if (HAS_GMCH_DISPLAY(dev_priv) &&
+	if (HAS_GMCH(dev_priv) &&
 	    adjusted_mode->flags & DRM_MODE_FLAG_INTERLACE)
-		return false;
+		return -EINVAL;
 
 	if (adjusted_mode->flags & DRM_MODE_FLAG_DBLCLK)
-		return false;
+		return -EINVAL;
 
 	pipe_config->fec_enable = !intel_dp_is_edp(intel_dp) &&
 				  intel_dp_supports_fec(intel_dp, pipe_config);
 
-	if (!intel_dp_compute_link_config(encoder, pipe_config, conn_state))
-		return false;
+	ret = intel_dp_compute_link_config(encoder, pipe_config, conn_state);
+	if (ret < 0)
+		return ret;
 
 	if (intel_conn_state->broadcast_rgb == INTEL_BROADCAST_RGB_AUTO) {
 		/*
@@ -2198,11 +2218,11 @@ intel_dp_compute_config(struct intel_encoder *encoder,
 
 	intel_psr_compute_config(intel_dp, pipe_config);
 
-	return true;
+	return 0;
 }
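
This hunk completes the bool-to-int conversion of the compute_config chain: each stage now returns 0 or a negative errno, so a specific code such as -EDEADLK can reach the atomic core for lock backoff instead of being flattened into -EINVAL. A compact sketch of the convention:

#include <errno.h>

/* Stand-in for a stage such as intel_dp_compute_link_config(). */
static int compute_link_config(int bpp)
{
	if (bpp < 18)
		return -EINVAL;	/* the concrete failure reason survives */

	return 0;
}

static int compute_config(int bpp)
{
	int ret = compute_link_config(bpp);

	if (ret < 0)
		return ret;	/* -EDEADLK would propagate for backoff */

	return 0;		/* success */
}
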
 
 void intel_dp_set_link_params(struct intel_dp *intel_dp,
-			      int link_rate, uint8_t lane_count,
+			      int link_rate, u8 lane_count,
 			      bool link_mst)
 {
 	intel_dp->link_trained = false;
@@ -2464,15 +2484,15 @@ static bool edp_panel_vdd_on(struct intel_dp *intel_dp)
  */
 void intel_edp_panel_vdd_on(struct intel_dp *intel_dp)
 {
+	intel_wakeref_t wakeref;
 	bool vdd;
 
 	if (!intel_dp_is_edp(intel_dp))
 		return;
 
-	pps_lock(intel_dp);
-	vdd = edp_panel_vdd_on(intel_dp);
-	pps_unlock(intel_dp);
-
+	vdd = false;
+	with_pps_lock(intel_dp, wakeref)
+		vdd = edp_panel_vdd_on(intel_dp);
 	I915_STATE_WARN(!vdd, "eDP port %c VDD already requested on\n",
 	     port_name(dp_to_dig_port(intel_dp)->base.port));
 }
@@ -2511,19 +2531,21 @@ static void edp_panel_vdd_off_sync(struct intel_dp *intel_dp)
 	if ((pp & PANEL_POWER_ON) == 0)
 		intel_dp->panel_power_off_time = ktime_get_boottime();
 
-	intel_display_power_put(dev_priv,
-				intel_aux_power_domain(intel_dig_port));
+	intel_display_power_put_unchecked(dev_priv,
+					  intel_aux_power_domain(intel_dig_port));
 }
 
 static void edp_panel_vdd_work(struct work_struct *__work)
 {
-	struct intel_dp *intel_dp = container_of(to_delayed_work(__work),
-						 struct intel_dp, panel_vdd_work);
+	struct intel_dp *intel_dp =
+		container_of(to_delayed_work(__work),
+			     struct intel_dp, panel_vdd_work);
+	intel_wakeref_t wakeref;
 
-	pps_lock(intel_dp);
-	if (!intel_dp->want_panel_vdd)
-		edp_panel_vdd_off_sync(intel_dp);
-	pps_unlock(intel_dp);
+	with_pps_lock(intel_dp, wakeref) {
+		if (!intel_dp->want_panel_vdd)
+			edp_panel_vdd_off_sync(intel_dp);
+	}
 }
 
 static void edp_panel_vdd_schedule_off(struct intel_dp *intel_dp)
@@ -2587,7 +2609,7 @@ static void edp_panel_on(struct intel_dp *intel_dp)
 
 	pp_ctrl_reg = _pp_ctrl_reg(intel_dp);
 	pp = ironlake_get_pp_control(intel_dp);
-	if (IS_GEN5(dev_priv)) {
+	if (IS_GEN(dev_priv, 5)) {
 		/* ILK workaround: disable reset around power sequence */
 		pp &= ~PANEL_POWER_RESET;
 		I915_WRITE(pp_ctrl_reg, pp);
@@ -2595,7 +2617,7 @@ static void edp_panel_on(struct intel_dp *intel_dp)
 	}
 
 	pp |= PANEL_POWER_ON;
-	if (!IS_GEN5(dev_priv))
+	if (!IS_GEN(dev_priv, 5))
 		pp |= PANEL_POWER_RESET;
 
 	I915_WRITE(pp_ctrl_reg, pp);
@@ -2604,7 +2626,7 @@ static void edp_panel_on(struct intel_dp *intel_dp)
 	wait_panel_on(intel_dp);
 	intel_dp->last_power_on = jiffies;
 
-	if (IS_GEN5(dev_priv)) {
+	if (IS_GEN(dev_priv, 5)) {
 		pp |= PANEL_POWER_RESET; /* restore panel reset bit */
 		I915_WRITE(pp_ctrl_reg, pp);
 		POSTING_READ(pp_ctrl_reg);
@@ -2613,12 +2635,13 @@ static void edp_panel_on(struct intel_dp *intel_dp)
 
 void intel_edp_panel_on(struct intel_dp *intel_dp)
 {
+	intel_wakeref_t wakeref;
+
 	if (!intel_dp_is_edp(intel_dp))
 		return;
 
-	pps_lock(intel_dp);
-	edp_panel_on(intel_dp);
-	pps_unlock(intel_dp);
+	with_pps_lock(intel_dp, wakeref)
+		edp_panel_on(intel_dp);
 }
 
 
@@ -2657,25 +2680,25 @@ static void edp_panel_off(struct intel_dp *intel_dp)
 	intel_dp->panel_power_off_time = ktime_get_boottime();
 
 	/* We got a reference when we enabled the VDD. */
-	intel_display_power_put(dev_priv, intel_aux_power_domain(dig_port));
+	intel_display_power_put_unchecked(dev_priv, intel_aux_power_domain(dig_port));
 }
 
 void intel_edp_panel_off(struct intel_dp *intel_dp)
 {
+	intel_wakeref_t wakeref;
+
 	if (!intel_dp_is_edp(intel_dp))
 		return;
 
-	pps_lock(intel_dp);
-	edp_panel_off(intel_dp);
-	pps_unlock(intel_dp);
+	with_pps_lock(intel_dp, wakeref)
+		edp_panel_off(intel_dp);
 }
 
 /* Enable backlight in the panel power control. */
 static void _intel_edp_backlight_on(struct intel_dp *intel_dp)
 {
 	struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
-	u32 pp;
-	i915_reg_t pp_ctrl_reg;
+	intel_wakeref_t wakeref;
 
 	/*
 	 * If we enable the backlight right away following a panel power
@@ -2685,17 +2708,16 @@ static void _intel_edp_backlight_on(struct intel_dp *intel_dp)
 	 */
 	wait_backlight_on(intel_dp);
 
-	pps_lock(intel_dp);
+	with_pps_lock(intel_dp, wakeref) {
+		i915_reg_t pp_ctrl_reg = _pp_ctrl_reg(intel_dp);
+		u32 pp;
 
-	pp = ironlake_get_pp_control(intel_dp);
-	pp |= EDP_BLC_ENABLE;
-
-	pp_ctrl_reg = _pp_ctrl_reg(intel_dp);
-
-	I915_WRITE(pp_ctrl_reg, pp);
-	POSTING_READ(pp_ctrl_reg);
+		pp = ironlake_get_pp_control(intel_dp);
+		pp |= EDP_BLC_ENABLE;
 
-	pps_unlock(intel_dp);
+		I915_WRITE(pp_ctrl_reg, pp);
+		POSTING_READ(pp_ctrl_reg);
+	}
 }
 
 /* Enable backlight PWM and backlight PP control. */
@@ -2717,23 +2739,21 @@ void intel_edp_backlight_on(const struct intel_crtc_state *crtc_state,
 static void _intel_edp_backlight_off(struct intel_dp *intel_dp)
 {
 	struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
-	u32 pp;
-	i915_reg_t pp_ctrl_reg;
+	intel_wakeref_t wakeref;
 
 	if (!intel_dp_is_edp(intel_dp))
 		return;
 
-	pps_lock(intel_dp);
+	with_pps_lock(intel_dp, wakeref) {
+		i915_reg_t pp_ctrl_reg = _pp_ctrl_reg(intel_dp);
+		u32 pp;
 
-	pp = ironlake_get_pp_control(intel_dp);
-	pp &= ~EDP_BLC_ENABLE;
-
-	pp_ctrl_reg = _pp_ctrl_reg(intel_dp);
-
-	I915_WRITE(pp_ctrl_reg, pp);
-	POSTING_READ(pp_ctrl_reg);
+		pp = ironlake_get_pp_control(intel_dp);
+		pp &= ~EDP_BLC_ENABLE;
 
-	pps_unlock(intel_dp);
+		I915_WRITE(pp_ctrl_reg, pp);
+		POSTING_READ(pp_ctrl_reg);
+	}
 
 	intel_dp->last_backlight_off = jiffies;
 	edp_wait_backlight_off(intel_dp);
@@ -2761,12 +2781,12 @@ static void intel_edp_backlight_power(struct intel_connector *connector,
 				      bool enable)
 {
 	struct intel_dp *intel_dp = intel_attached_dp(&connector->base);
+	intel_wakeref_t wakeref;
 	bool is_enabled;
 
-	pps_lock(intel_dp);
-	is_enabled = ironlake_get_pp_control(intel_dp) & EDP_BLC_ENABLE;
-	pps_unlock(intel_dp);
-
+	is_enabled = false;
+	with_pps_lock(intel_dp, wakeref)
+		is_enabled = ironlake_get_pp_control(intel_dp) & EDP_BLC_ENABLE;
 	if (is_enabled == enable)
 		return;
 
@@ -2833,7 +2853,7 @@ static void ironlake_edp_pll_on(struct intel_dp *intel_dp,
 	 * 1. Wait for the start of vertical blank on the enabled pipe going to FDI
 	 * 2. Program DP PLL enable
 	 */
-	if (IS_GEN5(dev_priv))
+	if (IS_GEN(dev_priv, 5))
 		intel_wait_for_vblank_if_active(dev_priv, !crtc->pipe);
 
 	intel_dp->DP |= DP_PLL_ENABLE;
@@ -2983,16 +3003,18 @@ static bool intel_dp_get_hw_state(struct intel_encoder *encoder,
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	struct intel_dp *intel_dp = enc_to_intel_dp(&encoder->base);
+	intel_wakeref_t wakeref;
 	bool ret;
 
-	if (!intel_display_power_get_if_enabled(dev_priv,
-						encoder->power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     encoder->power_domain);
+	if (!wakeref)
 		return false;
 
 	ret = intel_dp_port_enabled(dev_priv, intel_dp->output_reg,
 				    encoder->port, pipe);
 
-	intel_display_power_put(dev_priv, encoder->power_domain);
+	intel_display_power_put(dev_priv, encoder->power_domain, wakeref);
 
 	return ret;
 }
@@ -3160,20 +3182,20 @@ static void chv_post_disable_dp(struct intel_encoder *encoder,
 
 static void
 _intel_dp_set_link_train(struct intel_dp *intel_dp,
-			 uint32_t *DP,
-			 uint8_t dp_train_pat)
+			 u32 *DP,
+			 u8 dp_train_pat)
 {
 	struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
 	struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
 	enum port port = intel_dig_port->base.port;
-	uint8_t train_pat_mask = drm_dp_training_pattern_mask(intel_dp->dpcd);
+	u8 train_pat_mask = drm_dp_training_pattern_mask(intel_dp->dpcd);
 
 	if (dp_train_pat & train_pat_mask)
 		DRM_DEBUG_KMS("Using DP training pattern TPS%d\n",
 			      dp_train_pat & train_pat_mask);
 
 	if (HAS_DDI(dev_priv)) {
-		uint32_t temp = I915_READ(DP_TP_CTL(port));
+		u32 temp = I915_READ(DP_TP_CTL(port));
 
 		if (dp_train_pat & DP_LINK_SCRAMBLING_DISABLE)
 			temp |= DP_TP_CTL_SCRAMBLE_DISABLE;
@@ -3272,24 +3294,23 @@ static void intel_enable_dp(struct intel_encoder *encoder,
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	struct intel_dp *intel_dp = enc_to_intel_dp(&encoder->base);
 	struct intel_crtc *crtc = to_intel_crtc(pipe_config->base.crtc);
-	uint32_t dp_reg = I915_READ(intel_dp->output_reg);
+	u32 dp_reg = I915_READ(intel_dp->output_reg);
 	enum pipe pipe = crtc->pipe;
+	intel_wakeref_t wakeref;
 
 	if (WARN_ON(dp_reg & DP_PORT_EN))
 		return;
 
-	pps_lock(intel_dp);
+	with_pps_lock(intel_dp, wakeref) {
+		if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
+			vlv_init_panel_power_sequencer(encoder, pipe_config);
 
-	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
-		vlv_init_panel_power_sequencer(encoder, pipe_config);
+		intel_dp_enable_port(intel_dp, pipe_config);
 
-	intel_dp_enable_port(intel_dp, pipe_config);
-
-	edp_panel_vdd_on(intel_dp);
-	edp_panel_on(intel_dp);
-	edp_panel_vdd_off(intel_dp, true);
-
-	pps_unlock(intel_dp);
+		edp_panel_vdd_on(intel_dp);
+		edp_panel_on(intel_dp);
+		edp_panel_vdd_off(intel_dp, true);
+	}
 
 	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
 		unsigned int lane_mask = 0x0;
@@ -3492,14 +3513,14 @@ static void chv_dp_post_pll_disable(struct intel_encoder *encoder,
  * link status information
  */
 bool
-intel_dp_get_link_status(struct intel_dp *intel_dp, uint8_t link_status[DP_LINK_STATUS_SIZE])
+intel_dp_get_link_status(struct intel_dp *intel_dp, u8 link_status[DP_LINK_STATUS_SIZE])
 {
 	return drm_dp_dpcd_read(&intel_dp->aux, DP_LANE0_1_STATUS, link_status,
 				DP_LINK_STATUS_SIZE) == DP_LINK_STATUS_SIZE;
 }
 
 /* These are source-specific values. */
-uint8_t
+u8
 intel_dp_voltage_max(struct intel_dp *intel_dp)
 {
 	struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
@@ -3518,8 +3539,8 @@ intel_dp_voltage_max(struct intel_dp *intel_dp)
 		return DP_TRAIN_VOLTAGE_SWING_LEVEL_2;
 }
 
-uint8_t
-intel_dp_pre_emphasis_max(struct intel_dp *intel_dp, uint8_t voltage_swing)
+u8
+intel_dp_pre_emphasis_max(struct intel_dp *intel_dp, u8 voltage_swing)
 {
 	struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
 	struct intel_encoder *encoder = &dp_to_dig_port(intel_dp)->base;
@@ -3564,12 +3585,12 @@ intel_dp_pre_emphasis_max(struct intel_dp *intel_dp, uint8_t voltage_swing)
 	}
 }
 
-static uint32_t vlv_signal_levels(struct intel_dp *intel_dp)
+static u32 vlv_signal_levels(struct intel_dp *intel_dp)
 {
 	struct intel_encoder *encoder = &dp_to_dig_port(intel_dp)->base;
 	unsigned long demph_reg_value, preemph_reg_value,
 		uniqtranscale_reg_value;
-	uint8_t train_set = intel_dp->train_set[0];
+	u8 train_set = intel_dp->train_set[0];
 
 	switch (train_set & DP_TRAIN_PRE_EMPHASIS_MASK) {
 	case DP_TRAIN_PRE_EMPH_LEVEL_0:
@@ -3650,12 +3671,12 @@ static uint32_t vlv_signal_levels(struct intel_dp *intel_dp)
 	return 0;
 }
 
-static uint32_t chv_signal_levels(struct intel_dp *intel_dp)
+static u32 chv_signal_levels(struct intel_dp *intel_dp)
 {
 	struct intel_encoder *encoder = &dp_to_dig_port(intel_dp)->base;
 	u32 deemph_reg_value, margin_reg_value;
 	bool uniq_trans_scale = false;
-	uint8_t train_set = intel_dp->train_set[0];
+	u8 train_set = intel_dp->train_set[0];
 
 	switch (train_set & DP_TRAIN_PRE_EMPHASIS_MASK) {
 	case DP_TRAIN_PRE_EMPH_LEVEL_0:
@@ -3733,10 +3754,10 @@ static uint32_t chv_signal_levels(struct intel_dp *intel_dp)
 	return 0;
 }
 
-static uint32_t
-g4x_signal_levels(uint8_t train_set)
+static u32
+g4x_signal_levels(u8 train_set)
 {
-	uint32_t	signal_levels = 0;
+	u32 signal_levels = 0;
 
 	switch (train_set & DP_TRAIN_VOLTAGE_SWING_MASK) {
 	case DP_TRAIN_VOLTAGE_SWING_LEVEL_0:
@@ -3772,8 +3793,8 @@ g4x_signal_levels(uint8_t train_set)
 }
 
 /* SNB CPU eDP voltage swing and pre-emphasis control */
-static uint32_t
-snb_cpu_edp_signal_levels(uint8_t train_set)
+static u32
+snb_cpu_edp_signal_levels(u8 train_set)
 {
 	int signal_levels = train_set & (DP_TRAIN_VOLTAGE_SWING_MASK |
 					 DP_TRAIN_PRE_EMPHASIS_MASK);
@@ -3800,8 +3821,8 @@ snb_cpu_edp_signal_levels(uint8_t train_set)
 }
 
 /* IVB CPU eDP voltage swing and pre-emphasis control */
-static uint32_t
-ivb_cpu_edp_signal_levels(uint8_t train_set)
+static u32
+ivb_cpu_edp_signal_levels(u8 train_set)
 {
 	int signal_levels = train_set & (DP_TRAIN_VOLTAGE_SWING_MASK |
 					 DP_TRAIN_PRE_EMPHASIS_MASK);
@@ -3836,8 +3857,8 @@ intel_dp_set_signal_levels(struct intel_dp *intel_dp)
 	struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
 	struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
 	enum port port = intel_dig_port->base.port;
-	uint32_t signal_levels, mask = 0;
-	uint8_t train_set = intel_dp->train_set[0];
+	u32 signal_levels, mask = 0;
+	u8 train_set = intel_dp->train_set[0];
 
 	if (IS_GEN9_LP(dev_priv) || INTEL_GEN(dev_priv) >= 10) {
 		signal_levels = bxt_signal_levels(intel_dp);
@@ -3851,7 +3872,7 @@ intel_dp_set_signal_levels(struct intel_dp *intel_dp)
 	} else if (IS_IVYBRIDGE(dev_priv) && port == PORT_A) {
 		signal_levels = ivb_cpu_edp_signal_levels(train_set);
 		mask = EDP_LINK_TRAIN_VOL_EMP_MASK_IVB;
-	} else if (IS_GEN6(dev_priv) && port == PORT_A) {
+	} else if (IS_GEN(dev_priv, 6) && port == PORT_A) {
 		signal_levels = snb_cpu_edp_signal_levels(train_set);
 		mask = EDP_LINK_TRAIN_VOL_EMP_MASK_SNB;
 	} else {
@@ -3876,7 +3897,7 @@ intel_dp_set_signal_levels(struct intel_dp *intel_dp)
 
 void
 intel_dp_program_link_training_pattern(struct intel_dp *intel_dp,
-				       uint8_t dp_train_pat)
+				       u8 dp_train_pat)
 {
 	struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
 	struct drm_i915_private *dev_priv =
@@ -3893,7 +3914,7 @@ void intel_dp_set_idle_link_train(struct intel_dp *intel_dp)
 	struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
 	struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
 	enum port port = intel_dig_port->base.port;
-	uint32_t val;
+	u32 val;
 
 	if (!HAS_DDI(dev_priv))
 		return;
@@ -3928,7 +3949,7 @@ intel_dp_link_down(struct intel_encoder *encoder,
 	struct intel_dp *intel_dp = enc_to_intel_dp(&encoder->base);
 	struct intel_crtc *crtc = to_intel_crtc(old_crtc_state->base.crtc);
 	enum port port = encoder->port;
-	uint32_t DP = intel_dp->DP;
+	u32 DP = intel_dp->DP;
 
 	if (WARN_ON(HAS_DDI(dev_priv)))
 		return;
@@ -3987,12 +4008,49 @@ intel_dp_link_down(struct intel_encoder *encoder,
 	intel_dp->DP = DP;
 
 	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
-		pps_lock(intel_dp);
-		intel_dp->active_pipe = INVALID_PIPE;
-		pps_unlock(intel_dp);
+		intel_wakeref_t wakeref;
+
+		with_pps_lock(intel_dp, wakeref)
+			intel_dp->active_pipe = INVALID_PIPE;
 	}
 }
 
+static void
+intel_dp_extended_receiver_capabilities(struct intel_dp *intel_dp)
+{
+	u8 dpcd_ext[6];
+
+	/*
+	 * Prior to DP1.3 the bit represented by
+	 * DP_EXTENDED_RECEIVER_CAP_FIELD_PRESENT was reserved.
+	 * If it is set, DP_DPCD_REV at 0000h could be at a value less than
+	 * the true capability of the panel. The only way to check is to
+	 * compare 0000h and 2200h.
+	 */
+	if (!(intel_dp->dpcd[DP_TRAINING_AUX_RD_INTERVAL] &
+	      DP_EXTENDED_RECEIVER_CAP_FIELD_PRESENT))
+		return;
+
+	if (drm_dp_dpcd_read(&intel_dp->aux, DP_DP13_DPCD_REV,
+			     &dpcd_ext, sizeof(dpcd_ext)) != sizeof(dpcd_ext)) {
+		DRM_ERROR("DPCD failed read at extended capabilities\n");
+		return;
+	}
+
+	if (intel_dp->dpcd[DP_DPCD_REV] > dpcd_ext[DP_DPCD_REV]) {
+		DRM_DEBUG_KMS("DPCD extended DPCD rev less than base DPCD rev\n");
+		return;
+	}
+
+	if (!memcmp(intel_dp->dpcd, dpcd_ext, sizeof(dpcd_ext)))
+		return;
+
+	DRM_DEBUG_KMS("Base DPCD: %*ph\n",
+		      (int)sizeof(intel_dp->dpcd), intel_dp->dpcd);
+
+	memcpy(intel_dp->dpcd, dpcd_ext, sizeof(dpcd_ext));
+}
+
 bool
 intel_dp_read_dpcd(struct intel_dp *intel_dp)
 {
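
Several hunks in this file replace open-coded pps_lock()/pps_unlock() pairs with the with_pps_lock() guard, which also threads the new intel_wakeref_t cookie through the critical section. The macro body is outside this diff; a minimal stand-alone sketch of the for-loop guard idiom it presumably follows (demo_lock/demo_unlock and the int cookie are stand-ins, not i915 symbols):

#include <stdio.h>

/* Stand-ins for pps_lock()/pps_unlock(); the real helpers also acquire
 * and release a runtime-PM wakeref, modelled here as a plain int. */
static int demo_lock(void)   { puts("lock, wakeref acquired"); return 1; }
static int demo_unlock(void) { puts("unlock, wakeref released"); return 0; }

/* Scoped guard in the style of with_pps_lock(): the body runs exactly
 * once, and the unlock sits on the loop's back edge. */
#define with_demo_lock(wf) \
	for ((wf) = demo_lock(); (wf); (wf) = demo_unlock())

int main(void)
{
	int wakeref;

	with_demo_lock(wakeref)
		puts("critical section");

	return 0;
}

The usual caveat with this idiom applies: a bare break inside the guarded section would skip the unlock, so the body should exit normally.
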
@@ -4000,6 +4058,8 @@ intel_dp_read_dpcd(struct intel_dp *intel_dp)
 			     sizeof(intel_dp->dpcd)) < 0)
 		return false; /* aux transfer failed */
 
+	intel_dp_extended_receiver_capabilities(intel_dp);
+
 	DRM_DEBUG_KMS("DPCD: %*ph\n", (int) sizeof(intel_dp->dpcd), intel_dp->dpcd);
 
 	return intel_dp->dpcd[DP_DPCD_REV] != 0;
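
The new intel_dp_extended_receiver_capabilities() above implements one rule: when the sink sets DP_EXTENDED_RECEIVER_CAP_FIELD_PRESENT, read the extended field at 2200h and let it override the cached base DPCD at 0000h, unless the extended field reports an older revision. A compact stand-alone model of that decision (the 6-byte size mirrors the dpcd_ext buffer above; override_with_extended is a demo name, not an i915 function):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* dpcd[0] is the revision byte, as with DP_DPCD_REV at offset 0.
 * Returns 1 if the cached copy was replaced by the extended field. */
static int override_with_extended(uint8_t dpcd[6], const uint8_t ext[6])
{
	if (dpcd[0] > ext[0])		/* extended field is older: keep base */
		return 0;
	if (!memcmp(dpcd, ext, 6))	/* identical: nothing to do */
		return 0;
	memcpy(dpcd, ext, 6);		/* prefer the extended capabilities */
	return 1;
}

int main(void)
{
	uint8_t base[6] = { 0x12, 0x0a, 0x84, 0x41, 0x00, 0x01 };
	uint8_t ext[6]  = { 0x14, 0x1e, 0x84, 0x41, 0x00, 0x01 };

	printf("replaced: %d, rev now %#x\n",
	       override_with_extended(base, ext), base[0]);
	return 0;
}
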
@@ -4230,7 +4290,7 @@ intel_dp_get_sink_irq_esi(struct intel_dp *intel_dp, u8 *sink_irq_vector)
 		DP_DPRX_ESI_LEN;
 }
 
-u16 intel_dp_dsc_get_output_bpp(int link_clock, uint8_t lane_count,
+u16 intel_dp_dsc_get_output_bpp(int link_clock, u8 lane_count,
 				int mode_clock, int mode_hdisplay)
 {
 	u16 bits_per_pixel, max_bpp_small_joiner_ram;
@@ -4297,7 +4357,7 @@ u8 intel_dp_dsc_get_slice_count(struct intel_dp *intel_dp,
 		return 0;
 	}
 	/* Also take into account max slice width */
-	min_slice_count = min_t(uint8_t, min_slice_count,
+	min_slice_count = min_t(u8, min_slice_count,
 				DIV_ROUND_UP(mode_hdisplay,
 					     max_slice_width));
 
@@ -4315,11 +4375,11 @@ u8 intel_dp_dsc_get_slice_count(struct intel_dp *intel_dp,
 	return 0;
 }
 
-static uint8_t intel_dp_autotest_link_training(struct intel_dp *intel_dp)
+static u8 intel_dp_autotest_link_training(struct intel_dp *intel_dp)
 {
 	int status = 0;
 	int test_link_rate;
-	uint8_t test_lane_count, test_link_bw;
+	u8 test_lane_count, test_link_bw;
 	/* (DP CTS 1.2)
 	 * 4.3.1.11
 	 */
@@ -4352,10 +4412,10 @@ static uint8_t intel_dp_autotest_link_training(struct intel_dp *intel_dp)
 	return DP_TEST_ACK;
 }
 
-static uint8_t intel_dp_autotest_video_pattern(struct intel_dp *intel_dp)
+static u8 intel_dp_autotest_video_pattern(struct intel_dp *intel_dp)
 {
-	uint8_t test_pattern;
-	uint8_t test_misc;
+	u8 test_pattern;
+	u8 test_misc;
 	__be16 h_width, v_height;
 	int status = 0;
 
@@ -4413,9 +4473,9 @@ static uint8_t intel_dp_autotest_video_pattern(struct intel_dp *intel_dp)
 	return DP_TEST_ACK;
 }
 
-static uint8_t intel_dp_autotest_edid(struct intel_dp *intel_dp)
+static u8 intel_dp_autotest_edid(struct intel_dp *intel_dp)
 {
-	uint8_t test_result = DP_TEST_ACK;
+	u8 test_result = DP_TEST_ACK;
 	struct intel_connector *intel_connector = intel_dp->attached_connector;
 	struct drm_connector *connector = &intel_connector->base;
 
@@ -4457,16 +4517,16 @@ static uint8_t intel_dp_autotest_edid(struct intel_dp *intel_dp)
 	return test_result;
 }
 
-static uint8_t intel_dp_autotest_phy_pattern(struct intel_dp *intel_dp)
+static u8 intel_dp_autotest_phy_pattern(struct intel_dp *intel_dp)
 {
-	uint8_t test_result = DP_TEST_NAK;
+	u8 test_result = DP_TEST_NAK;
 	return test_result;
 }
 
 static void intel_dp_handle_test_request(struct intel_dp *intel_dp)
 {
-	uint8_t response = DP_TEST_NAK;
-	uint8_t request = 0;
+	u8 response = DP_TEST_NAK;
+	u8 request = 0;
 	int status;
 
 	status = drm_dp_dpcd_readb(&intel_dp->aux, DP_TEST_REQUEST, &request);
@@ -4554,12 +4614,10 @@ go_again:
 
 			return ret;
 		} else {
-			struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
 			DRM_DEBUG_KMS("failed to get ESI - device may have failed\n");
 			intel_dp->is_mst = false;
-			drm_dp_mst_topology_mgr_set_mst(&intel_dp->mst_mgr, intel_dp->is_mst);
-			/* send a hotplug event */
-			drm_kms_helper_hotplug_event(intel_dig_port->base.base.dev);
+			drm_dp_mst_topology_mgr_set_mst(&intel_dp->mst_mgr,
+							intel_dp->is_mst);
 		}
 	}
 	return -EINVAL;
@@ -4792,8 +4850,8 @@ static enum drm_connector_status
 intel_dp_detect_dpcd(struct intel_dp *intel_dp)
 {
 	struct intel_lspcon *lspcon = dp_to_lspcon(intel_dp);
-	uint8_t *dpcd = intel_dp->dpcd;
-	uint8_t type;
+	u8 *dpcd = intel_dp->dpcd;
+	u8 type;
 
 	if (lspcon->active)
 		lspcon_resume(lspcon);
@@ -5030,28 +5088,38 @@ static bool icl_combo_port_connected(struct drm_i915_private *dev_priv,
 	return I915_READ(SDEISR) & SDE_DDI_HOTPLUG_ICP(port);
 }
 
+static const char *tc_type_name(enum tc_port_type type)
+{
+	static const char * const names[] = {
+		[TC_PORT_UNKNOWN] = "unknown",
+		[TC_PORT_LEGACY] = "legacy",
+		[TC_PORT_TYPEC] = "typec",
+		[TC_PORT_TBT] = "tbt",
+	};
+
+	if (WARN_ON(type >= ARRAY_SIZE(names)))
+		type = TC_PORT_UNKNOWN;
+
+	return names[type];
+}
+
 static void icl_update_tc_port_type(struct drm_i915_private *dev_priv,
 				    struct intel_digital_port *intel_dig_port,
 				    bool is_legacy, bool is_typec, bool is_tbt)
 {
 	enum port port = intel_dig_port->base.port;
 	enum tc_port_type old_type = intel_dig_port->tc_type;
-	const char *type_str;
 
 	WARN_ON(is_legacy + is_typec + is_tbt != 1);
 
-	if (is_legacy) {
+	if (is_legacy)
 		intel_dig_port->tc_type = TC_PORT_LEGACY;
-		type_str = "legacy";
-	} else if (is_typec) {
+	else if (is_typec)
 		intel_dig_port->tc_type = TC_PORT_TYPEC;
-		type_str = "typec";
-	} else if (is_tbt) {
+	else if (is_tbt)
 		intel_dig_port->tc_type = TC_PORT_TBT;
-		type_str = "tbt";
-	} else {
+	else
 		return;
-	}
 
 	/* Types are not supposed to be changed at runtime. */
 	WARN_ON(old_type != TC_PORT_UNKNOWN &&
@@ -5059,12 +5127,9 @@ static void icl_update_tc_port_type(struct drm_i915_private *dev_priv,
 
 	if (old_type != intel_dig_port->tc_type)
 		DRM_DEBUG_KMS("Port %c has TC type %s\n", port_name(port),
-			      type_str);
+			      tc_type_name(intel_dig_port->tc_type));
 }
 
-static void icl_tc_phy_disconnect(struct drm_i915_private *dev_priv,
-				  struct intel_digital_port *dig_port);
-
 /*
  * This function implements the first part of the Connect Flow described by our
  * specification, Gen11 TypeC Programming chapter. The rest of the flow (reading
@@ -5099,6 +5164,7 @@ static bool icl_tc_phy_connect(struct drm_i915_private *dev_priv,
 	val = I915_READ(PORT_TX_DFLEXDPPMS);
 	if (!(val & DP_PHY_MODE_STATUS_COMPLETED(tc_port))) {
 		DRM_DEBUG_KMS("DP PHY for TC port %d not ready\n", tc_port);
+		WARN_ON(dig_port->tc_legacy_port);
 		return false;
 	}
 
@@ -5130,8 +5196,8 @@ static bool icl_tc_phy_connect(struct drm_i915_private *dev_priv,
  * See the comment at the connect function. This implements the Disconnect
  * Flow.
  */
-static void icl_tc_phy_disconnect(struct drm_i915_private *dev_priv,
-				  struct intel_digital_port *dig_port)
+void icl_tc_phy_disconnect(struct drm_i915_private *dev_priv,
+			   struct intel_digital_port *dig_port)
 {
 	enum tc_port tc_port = intel_port_to_tc(dev_priv, dig_port->base.port);
 
@@ -5151,6 +5217,10 @@ static void icl_tc_phy_disconnect(struct drm_i915_private *dev_priv,
 		I915_WRITE(PORT_TX_DFLEXDPCSSS, val);
 	}
 
+	DRM_DEBUG_KMS("Port %c TC type %s disconnected\n",
+		      port_name(dig_port->base.port),
+		      tc_type_name(dig_port->tc_type));
+
 	dig_port->tc_type = TC_PORT_UNKNOWN;
 }
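
tc_type_name() above trades the per-branch type_str assignments for a designated-initializer string table plus a bounds check, so an out-of-range enum value degrades to "unknown" instead of indexing past the table. The same shape outside the kernel (WARN_ON replaced by a plain guard):

#include <stdio.h>

enum tc_port_type { TC_PORT_UNKNOWN, TC_PORT_LEGACY, TC_PORT_TYPEC, TC_PORT_TBT };

#define ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))

static const char *tc_type_name(enum tc_port_type type)
{
	static const char * const names[] = {
		[TC_PORT_UNKNOWN] = "unknown",
		[TC_PORT_LEGACY]  = "legacy",
		[TC_PORT_TYPEC]   = "typec",
		[TC_PORT_TBT]     = "tbt",
	};

	/* The kernel version WARNs here as well; either way the result
	 * stays inside the table. */
	if ((unsigned int)type >= ARRAY_SIZE(names))
		type = TC_PORT_UNKNOWN;

	return names[type];
}

int main(void)
{
	printf("%s\n", tc_type_name(TC_PORT_TYPEC));
	return 0;
}
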
 
@@ -5172,7 +5242,14 @@ static bool icl_tc_port_connected(struct drm_i915_private *dev_priv,
 	bool is_legacy, is_typec, is_tbt;
 	u32 dpsp;
 
-	is_legacy = I915_READ(SDEISR) & SDE_TC_HOTPLUG_ICP(tc_port);
+	/*
+	 * WARN if we got a legacy port HPD, but VBT didn't mark the port as
+	 * legacy. Treat the port as legacy from now on.
+	 */
+	if (WARN_ON(!intel_dig_port->tc_legacy_port &&
+		    I915_READ(SDEISR) & SDE_TC_HOTPLUG_ICP(tc_port)))
+		intel_dig_port->tc_legacy_port = true;
+	is_legacy = intel_dig_port->tc_legacy_port;
 
 	/*
 	 * The spec says we shouldn't be using the ISR bits for detecting
@@ -5184,6 +5261,7 @@ static bool icl_tc_port_connected(struct drm_i915_private *dev_priv,
 
 	if (!is_legacy && !is_typec && !is_tbt) {
 		icl_tc_phy_disconnect(dev_priv, intel_dig_port);
+
 		return false;
 	}
 
@@ -5226,7 +5304,7 @@ bool intel_digital_port_connected(struct intel_encoder *encoder)
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 
-	if (HAS_GMCH_DISPLAY(dev_priv)) {
+	if (HAS_GMCH(dev_priv)) {
 		if (IS_GM45(dev_priv))
 			return gm45_digital_port_connected(encoder);
 		else
@@ -5235,17 +5313,17 @@ bool intel_digital_port_connected(struct intel_encoder *encoder)
 
 	if (INTEL_GEN(dev_priv) >= 11)
 		return icl_digital_port_connected(encoder);
-	else if (IS_GEN10(dev_priv) || IS_GEN9_BC(dev_priv))
+	else if (IS_GEN(dev_priv, 10) || IS_GEN9_BC(dev_priv))
 		return spt_digital_port_connected(encoder);
 	else if (IS_GEN9_LP(dev_priv))
 		return bxt_digital_port_connected(encoder);
-	else if (IS_GEN8(dev_priv))
+	else if (IS_GEN(dev_priv, 8))
 		return bdw_digital_port_connected(encoder);
-	else if (IS_GEN7(dev_priv))
+	else if (IS_GEN(dev_priv, 7))
 		return ivb_digital_port_connected(encoder);
-	else if (IS_GEN6(dev_priv))
+	else if (IS_GEN(dev_priv, 6))
 		return snb_digital_port_connected(encoder);
-	else if (IS_GEN5(dev_priv))
+	else if (IS_GEN(dev_priv, 5))
 		return ilk_digital_port_connected(encoder);
 
 	MISSING_CASE(INTEL_GEN(dev_priv));
@@ -5307,12 +5385,13 @@ intel_dp_detect(struct drm_connector *connector,
 	enum drm_connector_status status;
 	enum intel_display_power_domain aux_domain =
 		intel_aux_power_domain(dig_port);
+	intel_wakeref_t wakeref;
 
 	DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n",
 		      connector->base.id, connector->name);
 	WARN_ON(!drm_modeset_is_locked(&dev_priv->drm.mode_config.connection_mutex));
 
-	intel_display_power_get(dev_priv, aux_domain);
+	wakeref = intel_display_power_get(dev_priv, aux_domain);
 
 	/* Can't disconnect eDP */
 	if (intel_dp_is_edp(intel_dp))
@@ -5378,7 +5457,7 @@ intel_dp_detect(struct drm_connector *connector,
 
 		ret = intel_dp_retrain_link(encoder, ctx);
 		if (ret) {
-			intel_display_power_put(dev_priv, aux_domain);
+			intel_display_power_put(dev_priv, aux_domain, wakeref);
 			return ret;
 		}
 	}
@@ -5402,7 +5481,7 @@ out:
 	if (status != connector_status_connected && !intel_dp->is_mst)
 		intel_dp_unset_edid(intel_dp);
 
-	intel_display_power_put(dev_priv, aux_domain);
+	intel_display_power_put(dev_priv, aux_domain, wakeref);
 	return status;
 }
 
@@ -5415,6 +5494,7 @@ intel_dp_force(struct drm_connector *connector)
 	struct drm_i915_private *dev_priv = to_i915(intel_encoder->base.dev);
 	enum intel_display_power_domain aux_domain =
 		intel_aux_power_domain(dig_port);
+	intel_wakeref_t wakeref;
 
 	DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n",
 		      connector->base.id, connector->name);
@@ -5423,11 +5503,11 @@ intel_dp_force(struct drm_connector *connector)
 	if (connector->status != connector_status_connected)
 		return;
 
-	intel_display_power_get(dev_priv, aux_domain);
+	wakeref = intel_display_power_get(dev_priv, aux_domain);
 
 	intel_dp_set_edid(intel_dp);
 
-	intel_display_power_put(dev_priv, aux_domain);
+	intel_display_power_put(dev_priv, aux_domain, wakeref);
 }
 
 static int intel_dp_get_modes(struct drm_connector *connector)
@@ -5492,21 +5572,22 @@ intel_dp_connector_unregister(struct drm_connector *connector)
 	intel_connector_unregister(connector);
 }
 
-void intel_dp_encoder_destroy(struct drm_encoder *encoder)
+void intel_dp_encoder_flush_work(struct drm_encoder *encoder)
 {
 	struct intel_digital_port *intel_dig_port = enc_to_dig_port(encoder);
 	struct intel_dp *intel_dp = &intel_dig_port->dp;
 
 	intel_dp_mst_encoder_cleanup(intel_dig_port);
 	if (intel_dp_is_edp(intel_dp)) {
+		intel_wakeref_t wakeref;
+
 		cancel_delayed_work_sync(&intel_dp->panel_vdd_work);
 		/*
 		 * vdd might still be enabled due to the delayed vdd off.
 		 * Make sure vdd is actually turned off here.
 		 */
-		pps_lock(intel_dp);
-		edp_panel_vdd_off_sync(intel_dp);
-		pps_unlock(intel_dp);
+		with_pps_lock(intel_dp, wakeref)
+			edp_panel_vdd_off_sync(intel_dp);
 
 		if (intel_dp->edp_notifier.notifier_call) {
 			unregister_reboot_notifier(&intel_dp->edp_notifier);
@@ -5515,14 +5596,20 @@ void intel_dp_encoder_destroy(struct drm_encoder *encoder)
 	}
 
 	intel_dp_aux_fini(intel_dp);
+}
+
+static void intel_dp_encoder_destroy(struct drm_encoder *encoder)
+{
+	intel_dp_encoder_flush_work(encoder);
 
 	drm_encoder_cleanup(encoder);
-	kfree(intel_dig_port);
+	kfree(enc_to_dig_port(encoder));
 }
 
 void intel_dp_encoder_suspend(struct intel_encoder *intel_encoder)
 {
 	struct intel_dp *intel_dp = enc_to_intel_dp(&intel_encoder->base);
+	intel_wakeref_t wakeref;
 
 	if (!intel_dp_is_edp(intel_dp))
 		return;
@@ -5532,9 +5619,8 @@ void intel_dp_encoder_suspend(struct intel_encoder *intel_encoder)
 	 * Make sure vdd is actually turned off here.
 	 */
 	cancel_delayed_work_sync(&intel_dp->panel_vdd_work);
-	pps_lock(intel_dp);
-	edp_panel_vdd_off_sync(intel_dp);
-	pps_unlock(intel_dp);
+	with_pps_lock(intel_dp, wakeref)
+		edp_panel_vdd_off_sync(intel_dp);
 }
 
 static
@@ -5547,7 +5633,7 @@ int intel_dp_hdcp_write_an_aksv(struct intel_digital_port *intel_dig_port,
 		.address = DP_AUX_HDCP_AKSV,
 		.size = DRM_HDCP_KSV_LEN,
 	};
-	uint8_t txbuf[HEADER_SIZE + DRM_HDCP_KSV_LEN] = {}, rxbuf[2], reply = 0;
+	u8 txbuf[HEADER_SIZE + DRM_HDCP_KSV_LEN] = {}, rxbuf[2], reply = 0;
 	ssize_t dpcd_ret;
 	int ret;
 
@@ -5580,7 +5666,12 @@ int intel_dp_hdcp_write_an_aksv(struct intel_digital_port *intel_dig_port,
 	}
 
 	reply = (rxbuf[0] >> 4) & DP_AUX_NATIVE_REPLY_MASK;
-	return reply == DP_AUX_NATIVE_REPLY_ACK ? 0 : -EIO;
+	if (reply != DP_AUX_NATIVE_REPLY_ACK) {
+		DRM_DEBUG_KMS("Aksv write: no DP_AUX_NATIVE_REPLY_ACK %x\n",
+			      reply);
+		return -EIO;
+	}
+	return 0;
 }
 
 static int intel_dp_hdcp_read_bksv(struct intel_digital_port *intel_dig_port,
@@ -5810,6 +5901,7 @@ void intel_dp_encoder_reset(struct drm_encoder *encoder)
 	struct drm_i915_private *dev_priv = to_i915(encoder->dev);
 	struct intel_dp *intel_dp = enc_to_intel_dp(encoder);
 	struct intel_lspcon *lspcon = dp_to_lspcon(intel_dp);
+	intel_wakeref_t wakeref;
 
 	if (!HAS_DDI(dev_priv))
 		intel_dp->DP = I915_READ(intel_dp->output_reg);
@@ -5819,18 +5911,19 @@ void intel_dp_encoder_reset(struct drm_encoder *encoder)
 
 	intel_dp->reset_link_params = true;
 
-	pps_lock(intel_dp);
-
-	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
-		intel_dp->active_pipe = vlv_active_pipe(intel_dp);
+	with_pps_lock(intel_dp, wakeref) {
+		if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
+			intel_dp->active_pipe = vlv_active_pipe(intel_dp);
 
-	if (intel_dp_is_edp(intel_dp)) {
-		/* Reinit the power sequencer, in case BIOS did something with it. */
-		intel_dp_pps_init(intel_dp);
-		intel_edp_panel_vdd_sanitize(intel_dp);
+		if (intel_dp_is_edp(intel_dp)) {
+			/*
+			 * Reinit the power sequencer, in case BIOS did
+			 * something nasty with it.
+			 */
+			intel_dp_pps_init(intel_dp);
+			intel_edp_panel_vdd_sanitize(intel_dp);
+		}
 	}
-
-	pps_unlock(intel_dp);
 }
 
 static const struct drm_connector_funcs intel_dp_connector_funcs = {
@@ -5863,6 +5956,7 @@ intel_dp_hpd_pulse(struct intel_digital_port *intel_dig_port, bool long_hpd)
 	struct intel_dp *intel_dp = &intel_dig_port->dp;
 	struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
 	enum irqreturn ret = IRQ_NONE;
+	intel_wakeref_t wakeref;
 
 	if (long_hpd && intel_dig_port->base.type == INTEL_OUTPUT_EDP) {
 		/*
@@ -5885,8 +5979,8 @@ intel_dp_hpd_pulse(struct intel_digital_port *intel_dig_port, bool long_hpd)
 		return IRQ_NONE;
 	}
 
-	intel_display_power_get(dev_priv,
-				intel_aux_power_domain(intel_dig_port));
+	wakeref = intel_display_power_get(dev_priv,
+					  intel_aux_power_domain(intel_dig_port));
 
 	if (intel_dp->is_mst) {
 		if (intel_dp_check_mst_status(intel_dp) == -EINVAL) {
@@ -5916,7 +6010,8 @@ intel_dp_hpd_pulse(struct intel_digital_port *intel_dig_port, bool long_hpd)
 
 put_power:
 	intel_display_power_put(dev_priv,
-				intel_aux_power_domain(intel_dig_port));
+				intel_aux_power_domain(intel_dig_port),
+				wakeref);
 
 	return ret;
 }
@@ -5947,7 +6042,7 @@ intel_dp_add_properties(struct intel_dp *intel_dp, struct drm_connector *connect
 		intel_attach_force_audio_property(connector);
 
 	intel_attach_broadcast_rgb_property(connector);
-	if (HAS_GMCH_DISPLAY(dev_priv))
+	if (HAS_GMCH(dev_priv))
 		drm_connector_attach_max_bpc_property(connector, 6, 10);
 	else if (INTEL_GEN(dev_priv) >= 5)
 		drm_connector_attach_max_bpc_property(connector, 6, 12);
@@ -5956,7 +6051,7 @@ intel_dp_add_properties(struct intel_dp *intel_dp, struct drm_connector *connect
 		u32 allowed_scalers;
 
 		allowed_scalers = BIT(DRM_MODE_SCALE_ASPECT) | BIT(DRM_MODE_SCALE_FULLSCREEN);
-		if (!HAS_GMCH_DISPLAY(dev_priv))
+		if (!HAS_GMCH(dev_priv))
 			allowed_scalers |= BIT(DRM_MODE_SCALE_CENTER);
 
 		drm_connector_attach_scaling_mode_property(connector, allowed_scalers);
@@ -6363,8 +6458,8 @@ void intel_edp_drrs_enable(struct intel_dp *intel_dp,
 	}
 
 	mutex_lock(&dev_priv->drrs.mutex);
-	if (WARN_ON(dev_priv->drrs.dp)) {
-		DRM_ERROR("DRRS already enabled\n");
+	if (dev_priv->drrs.dp) {
+		DRM_DEBUG_KMS("DRRS already enabled\n");
 		goto unlock;
 	}
 
@@ -6624,8 +6719,9 @@ static bool intel_edp_init_connector(struct intel_dp *intel_dp,
 	struct drm_display_mode *downclock_mode = NULL;
 	bool has_dpcd;
 	struct drm_display_mode *scan;
-	struct edid *edid;
 	enum pipe pipe = INVALID_PIPE;
+	intel_wakeref_t wakeref;
+	struct edid *edid;
 
 	if (!intel_dp_is_edp(intel_dp))
 		return true;
@@ -6645,13 +6741,11 @@ static bool intel_edp_init_connector(struct intel_dp *intel_dp,
 		return false;
 	}
 
-	pps_lock(intel_dp);
-
-	intel_dp_init_panel_power_timestamps(intel_dp);
-	intel_dp_pps_init(intel_dp);
-	intel_edp_panel_vdd_sanitize(intel_dp);
-
-	pps_unlock(intel_dp);
+	with_pps_lock(intel_dp, wakeref) {
+		intel_dp_init_panel_power_timestamps(intel_dp);
+		intel_dp_pps_init(intel_dp);
+		intel_edp_panel_vdd_sanitize(intel_dp);
+	}
 
 	/* Cache DPCD and EDID for edp. */
 	has_dpcd = intel_edp_init_dpcd(intel_dp);
@@ -6736,9 +6830,8 @@ out_vdd_off:
 	 * vdd might still be enabled due to the delayed vdd off.
 	 * Make sure vdd is actually turned off here.
 	 */
-	pps_lock(intel_dp);
-	edp_panel_vdd_off_sync(intel_dp);
-	pps_unlock(intel_dp);
+	with_pps_lock(intel_dp, wakeref)
+		edp_panel_vdd_off_sync(intel_dp);
 
 	return false;
 }
@@ -6830,7 +6923,7 @@ intel_dp_init_connector(struct intel_digital_port *intel_dig_port,
 	drm_connector_init(dev, connector, &intel_dp_connector_funcs, type);
 	drm_connector_helper_add(connector, &intel_dp_connector_helper_funcs);
 
-	if (!HAS_GMCH_DISPLAY(dev_priv))
+	if (!HAS_GMCH(dev_priv))
 		connector->interlace_allowed = true;
 	connector->doublescan_allowed = 0;
 
@@ -6912,6 +7005,7 @@ bool intel_dp_init(struct drm_i915_private *dev_priv,
 	intel_encoder->compute_config = intel_dp_compute_config;
 	intel_encoder->get_hw_state = intel_dp_get_hw_state;
 	intel_encoder->get_config = intel_dp_get_config;
+	intel_encoder->update_pipe = intel_panel_update_backlight;
 	intel_encoder->suspend = intel_dp_encoder_suspend;
 	if (IS_CHERRYVIEW(dev_priv)) {
 		intel_encoder->pre_pll_enable = chv_dp_pre_pll_enable;
@@ -7006,7 +7100,10 @@ void intel_dp_mst_resume(struct drm_i915_private *dev_priv)
 			continue;
 
 		ret = drm_dp_mst_topology_mgr_resume(&intel_dp->mst_mgr);
-		if (ret)
-			intel_dp_check_mst_status(intel_dp);
+		if (ret) {
+			intel_dp->is_mst = false;
+			drm_dp_mst_topology_mgr_set_mst(&intel_dp->mst_mgr,
+							false);
+		}
 	}
 }
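
A pattern that repeats across the intel_dp.c hunks: intel_display_power_get() now returns an intel_wakeref_t cookie that must be handed back to intel_display_power_put(), making each acquire/release pair explicit and traceable (the intel_drv.h hunk below starts pulling in linux/stackdepot.h for this kind of tracking). A toy cookie-paired refcount, purely illustrative and not the i915 implementation:

#include <stdio.h>

/* get() bumps a refcount and hands out a cookie; put() takes the cookie
 * back and drops the count. A real tracker would record the outstanding
 * cookies (i915 logs the acquirer's stack) so a missing put can be
 * pinpointed; this sketch only catches obviously bogus puts. */
static unsigned int refcount, next_cookie = 1;

static unsigned int power_get(void)
{
	refcount++;
	return next_cookie++;
}

static void power_put(unsigned int cookie)
{
	if (!cookie || !refcount) {
		fprintf(stderr, "bogus put (cookie %u)\n", cookie);
		return;
	}
	refcount--;
}

int main(void)
{
	unsigned int wakeref = power_get();

	/* ... AUX transfers run while the power well is held ... */
	power_put(wakeref);
	printf("refcount back to %u\n", refcount);
	return 0;
}
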
diff --git a/drivers/gpu/drm/i915/intel_dp_link_training.c b/drivers/gpu/drm/i915/intel_dp_link_training.c
index 30be0e39bd5f..b59c87daa4f7 100644
--- a/drivers/gpu/drm/i915/intel_dp_link_training.c
+++ b/drivers/gpu/drm/i915/intel_dp_link_training.c
@@ -24,7 +24,7 @@
 #include "intel_drv.h"
 
 static void
-intel_dp_dump_link_status(const uint8_t link_status[DP_LINK_STATUS_SIZE])
+intel_dp_dump_link_status(const u8 link_status[DP_LINK_STATUS_SIZE])
 {
 
 	DRM_DEBUG_KMS("ln0_1:0x%x ln2_3:0x%x align:0x%x sink:0x%x adj_req0_1:0x%x adj_req2_3:0x%x",
@@ -34,17 +34,17 @@ intel_dp_dump_link_status(const uint8_t link_status[DP_LINK_STATUS_SIZE])
 
 static void
 intel_get_adjust_train(struct intel_dp *intel_dp,
-		       const uint8_t link_status[DP_LINK_STATUS_SIZE])
+		       const u8 link_status[DP_LINK_STATUS_SIZE])
 {
-	uint8_t v = 0;
-	uint8_t p = 0;
+	u8 v = 0;
+	u8 p = 0;
 	int lane;
-	uint8_t voltage_max;
-	uint8_t preemph_max;
+	u8 voltage_max;
+	u8 preemph_max;
 
 	for (lane = 0; lane < intel_dp->lane_count; lane++) {
-		uint8_t this_v = drm_dp_get_adjust_request_voltage(link_status, lane);
-		uint8_t this_p = drm_dp_get_adjust_request_pre_emphasis(link_status, lane);
+		u8 this_v = drm_dp_get_adjust_request_voltage(link_status, lane);
+		u8 this_p = drm_dp_get_adjust_request_pre_emphasis(link_status, lane);
 
 		if (this_v > v)
 			v = this_v;
@@ -66,9 +66,9 @@ intel_get_adjust_train(struct intel_dp *intel_dp,
 
 static bool
 intel_dp_set_link_train(struct intel_dp *intel_dp,
-			uint8_t dp_train_pat)
+			u8 dp_train_pat)
 {
-	uint8_t buf[sizeof(intel_dp->train_set) + 1];
+	u8 buf[sizeof(intel_dp->train_set) + 1];
 	int ret, len;
 
 	intel_dp_program_link_training_pattern(intel_dp, dp_train_pat);
@@ -92,7 +92,7 @@ intel_dp_set_link_train(struct intel_dp *intel_dp,
 
 static bool
 intel_dp_reset_link_train(struct intel_dp *intel_dp,
-			uint8_t dp_train_pat)
+			u8 dp_train_pat)
 {
 	memset(intel_dp->train_set, 0, sizeof(intel_dp->train_set));
 	intel_dp_set_signal_levels(intel_dp);
@@ -128,11 +128,11 @@ static bool intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
 static bool
 intel_dp_link_training_clock_recovery(struct intel_dp *intel_dp)
 {
-	uint8_t voltage;
+	u8 voltage;
 	int voltage_tries, cr_tries, max_cr_tries;
 	bool max_vswing_reached = false;
-	uint8_t link_config[2];
-	uint8_t link_bw, rate_select;
+	u8 link_config[2];
+	u8 link_bw, rate_select;
 
 	if (intel_dp->prepare_link_retrain)
 		intel_dp->prepare_link_retrain(intel_dp);
@@ -186,7 +186,7 @@ intel_dp_link_training_clock_recovery(struct intel_dp *intel_dp)
 
 	voltage_tries = 1;
 	for (cr_tries = 0; cr_tries < max_cr_tries; ++cr_tries) {
-		uint8_t link_status[DP_LINK_STATUS_SIZE];
+		u8 link_status[DP_LINK_STATUS_SIZE];
 
 		drm_dp_link_train_clock_recovery_delay(intel_dp->dpcd);
 
@@ -282,7 +282,7 @@ intel_dp_link_training_channel_equalization(struct intel_dp *intel_dp)
 {
 	int tries;
 	u32 training_pattern;
-	uint8_t link_status[DP_LINK_STATUS_SIZE];
+	u8 link_status[DP_LINK_STATUS_SIZE];
 	bool channel_eq = false;
 
 	training_pattern = intel_dp_training_pattern(intel_dp);
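
For reference, intel_get_adjust_train() above reduces the per-lane adjust requests to a single worst-case voltage swing and pre-emphasis, then clamps to what the source can drive. The reduction step as a stand-alone sketch (the DEMO_* limits are stand-ins for the DP_TRAIN_* maxima):

#include <stdint.h>
#include <stdio.h>

#define DEMO_VOLTAGE_MAX	3
#define DEMO_PREEMPH_MAX	3

static void max_adjust(const uint8_t *req_v, const uint8_t *req_p,
		       int lanes, uint8_t *v_out, uint8_t *p_out)
{
	uint8_t v = 0, p = 0;
	int lane;

	/* All lanes are driven identically, so take the worst-case request. */
	for (lane = 0; lane < lanes; lane++) {
		if (req_v[lane] > v)
			v = req_v[lane];
		if (req_p[lane] > p)
			p = req_p[lane];
	}

	/* Clamp to the source limits. */
	*v_out = v > DEMO_VOLTAGE_MAX ? DEMO_VOLTAGE_MAX : v;
	*p_out = p > DEMO_PREEMPH_MAX ? DEMO_PREEMPH_MAX : p;
}

int main(void)
{
	uint8_t rv[4] = { 1, 2, 0, 1 }, rp[4] = { 0, 1, 1, 0 }, v, p;

	max_adjust(rv, rp, 4, &v, &p);
	printf("vswing %u, pre-emphasis %u\n", v, p);
	return 0;
}
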
diff --git a/drivers/gpu/drm/i915/intel_dp_mst.c b/drivers/gpu/drm/i915/intel_dp_mst.c
index 4de247ddf05f..fb67cd931117 100644
--- a/drivers/gpu/drm/i915/intel_dp_mst.c
+++ b/drivers/gpu/drm/i915/intel_dp_mst.c
@@ -23,16 +23,15 @@
  *
  */
 
-#include <drm/drmP.h>
 #include "i915_drv.h"
 #include "intel_drv.h"
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
+#include <drm/drm_probe_helper.h>
 
-static bool intel_dp_mst_compute_config(struct intel_encoder *encoder,
-					struct intel_crtc_state *pipe_config,
-					struct drm_connector_state *conn_state)
+static int intel_dp_mst_compute_config(struct intel_encoder *encoder,
+				       struct intel_crtc_state *pipe_config,
+				       struct drm_connector_state *conn_state)
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	struct intel_dp_mst_encoder *intel_mst = enc_to_mst(&encoder->base);
@@ -41,15 +40,19 @@ static bool intel_dp_mst_compute_config(struct intel_encoder *encoder,
 	struct drm_connector *connector = conn_state->connector;
 	void *port = to_intel_connector(connector)->port;
 	struct drm_atomic_state *state = pipe_config->base.state;
+	struct drm_crtc *crtc = pipe_config->base.crtc;
+	struct drm_crtc_state *old_crtc_state =
+		drm_atomic_get_old_crtc_state(state, crtc);
 	int bpp;
-	int lane_count, slots = 0;
+	int lane_count, slots =
+		to_intel_crtc_state(old_crtc_state)->dp_m_n.tu;
 	const struct drm_display_mode *adjusted_mode = &pipe_config->base.adjusted_mode;
 	int mst_pbn;
 	bool constant_n = drm_dp_has_quirk(&intel_dp->desc,
 					   DP_DPCD_QUIRK_CONSTANT_N);
 
 	if (adjusted_mode->flags & DRM_MODE_FLAG_DBLSCAN)
-		return false;
+		return -EINVAL;
 
 	pipe_config->output_format = INTEL_OUTPUT_FORMAT_RGB;
 	pipe_config->has_pch_encoder = false;
@@ -77,17 +80,12 @@ static bool intel_dp_mst_compute_config(struct intel_encoder *encoder,
 	mst_pbn = drm_dp_calc_pbn_mode(adjusted_mode->crtc_clock, bpp);
 	pipe_config->pbn = mst_pbn;
 
-	/* Zombie connectors can't have VCPI slots */
-	if (!drm_connector_is_unregistered(connector)) {
-		slots = drm_dp_atomic_find_vcpi_slots(state,
-						      &intel_dp->mst_mgr,
-						      port,
-						      mst_pbn);
-		if (slots < 0) {
-			DRM_DEBUG_KMS("failed finding vcpi slots:%d\n",
-				      slots);
-			return false;
-		}
+	slots = drm_dp_atomic_find_vcpi_slots(state, &intel_dp->mst_mgr, port,
+					      mst_pbn);
+	if (slots < 0) {
+		DRM_DEBUG_KMS("failed finding vcpi slots:%d\n",
+			      slots);
+		return slots;
 	}
 
 	intel_link_compute_m_n(bpp, lane_count,
@@ -104,38 +102,42 @@ static bool intel_dp_mst_compute_config(struct intel_encoder *encoder,
 
 	intel_ddi_compute_min_voltage_level(dev_priv, pipe_config);
 
-	return true;
+	return 0;
 }
 
-static int intel_dp_mst_atomic_check(struct drm_connector *connector,
-		struct drm_connector_state *new_conn_state)
+static int
+intel_dp_mst_atomic_check(struct drm_connector *connector,
+			  struct drm_connector_state *new_conn_state)
 {
 	struct drm_atomic_state *state = new_conn_state->state;
-	struct drm_connector_state *old_conn_state;
-	struct drm_crtc *old_crtc;
+	struct drm_connector_state *old_conn_state =
+		drm_atomic_get_old_connector_state(state, connector);
+	struct intel_connector *intel_connector =
+		to_intel_connector(connector);
+	struct drm_crtc *new_crtc = new_conn_state->crtc;
 	struct drm_crtc_state *crtc_state;
-	int slots, ret = 0;
-
-	old_conn_state = drm_atomic_get_old_connector_state(state, connector);
-	old_crtc = old_conn_state->crtc;
-	if (!old_crtc)
-		return ret;
+	struct drm_dp_mst_topology_mgr *mgr;
+	int ret = 0;
 
-	crtc_state = drm_atomic_get_new_crtc_state(state, old_crtc);
-	slots = to_intel_crtc_state(crtc_state)->dp_m_n.tu;
-	if (drm_atomic_crtc_needs_modeset(crtc_state) && slots > 0) {
-		struct drm_dp_mst_topology_mgr *mgr;
-		struct drm_encoder *old_encoder;
+	if (!old_conn_state->crtc)
+		return 0;
 
-		old_encoder = old_conn_state->best_encoder;
-		mgr = &enc_to_mst(old_encoder)->primary->dp.mst_mgr;
+	/* We only want to free VCPI if this state disables the CRTC on this
+	 * connector.
+	 */
+	if (new_crtc) {
+		crtc_state = drm_atomic_get_new_crtc_state(state, new_crtc);
 
-		ret = drm_dp_atomic_release_vcpi_slots(state, mgr, slots);
-		if (ret)
-			DRM_DEBUG_KMS("failed releasing %d vcpi slots:%d\n", slots, ret);
-		else
-			to_intel_crtc_state(crtc_state)->dp_m_n.tu = 0;
+		if (!crtc_state ||
+		    !drm_atomic_crtc_needs_modeset(crtc_state) ||
+		    crtc_state->enable)
+			return 0;
 	}
+
+	mgr = &enc_to_mst(old_conn_state->best_encoder)->primary->dp.mst_mgr;
+	ret = drm_dp_atomic_release_vcpi_slots(state, mgr,
+					       intel_connector->port);
+
 	return ret;
 }
 
@@ -240,7 +242,7 @@ static void intel_mst_pre_enable_dp(struct intel_encoder *encoder,
 	struct intel_connector *connector =
 		to_intel_connector(conn_state->connector);
 	int ret;
-	uint32_t temp;
+	u32 temp;
 
 	/* MST encoders are bound to a crtc, not to a connector,
 	 * force the mapping here for get_hw_state.
@@ -457,6 +459,7 @@ static struct drm_connector *intel_dp_add_mst_connector(struct drm_dp_mst_topolo
 	intel_connector->get_hw_state = intel_dp_mst_get_hw_state;
 	intel_connector->mst_port = intel_dp;
 	intel_connector->port = port;
+	drm_dp_mst_get_port_malloc(port);
 
 	connector = &intel_connector->base;
 	ret = drm_connector_init(dev, connector, &intel_dp_mst_connector_funcs,
@@ -517,20 +520,10 @@ static void intel_dp_destroy_mst_connector(struct drm_dp_mst_topology_mgr *mgr,
 	drm_connector_put(connector);
 }
 
-static void intel_dp_mst_hotplug(struct drm_dp_mst_topology_mgr *mgr)
-{
-	struct intel_dp *intel_dp = container_of(mgr, struct intel_dp, mst_mgr);
-	struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
-	struct drm_device *dev = intel_dig_port->base.base.dev;
-
-	drm_kms_helper_hotplug_event(dev);
-}
-
 static const struct drm_dp_mst_topology_cbs mst_cbs = {
 	.add_connector = intel_dp_add_mst_connector,
 	.register_connector = intel_dp_register_mst_connector,
 	.destroy_connector = intel_dp_destroy_mst_connector,
-	.hotplug = intel_dp_mst_hotplug,
 };
 
 static struct intel_dp_mst_encoder *
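
intel_dp_mst_compute_config() switching from bool to int is part of a wider conversion of the ->compute_config hooks (see the intel_encoder change in intel_drv.h below), so callers receive a real errno, here -EINVAL or the negative return of drm_dp_atomic_find_vcpi_slots(), rather than a bare failure flag. A minimal illustration of why the int form composes better:

#include <errno.h>
#include <stdio.h>

/* bool style: the reason for failure is lost at the call site. */
static int compute_config_bool(int slots) { return slots >= 0; }

/* int style: 0 on success, a negative errno from a helper propagates
 * unchanged to the caller. */
static int compute_config_int(int slots)
{
	if (slots < 0)
		return slots;	/* e.g. -ENOSPC from the VCPI helper */
	return 0;
}

int main(void)
{
	printf("bool form: %d (why?)\n", compute_config_bool(-ENOSPC));
	printf("int form:  %d\n", compute_config_int(-ENOSPC));
	return 0;
}
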
diff --git a/drivers/gpu/drm/i915/intel_dpio_phy.c b/drivers/gpu/drm/i915/intel_dpio_phy.c
index 3c7f10d17658..95cb8b154f87 100644
--- a/drivers/gpu/drm/i915/intel_dpio_phy.c
+++ b/drivers/gpu/drm/i915/intel_dpio_phy.c
@@ -413,7 +413,7 @@ static void _bxt_ddi_phy_init(struct drm_i915_private *dev_priv,
 	}
 
 	if (phy_info->rcomp_phy != -1) {
-		uint32_t grc_code;
+		u32 grc_code;
 
 		bxt_phy_wait_grc_done(dev_priv, phy_info->rcomp_phy);
 
@@ -445,7 +445,7 @@ static void _bxt_ddi_phy_init(struct drm_i915_private *dev_priv,
 void bxt_ddi_phy_uninit(struct drm_i915_private *dev_priv, enum dpio_phy phy)
 {
 	const struct bxt_ddi_phy_info *phy_info;
-	uint32_t val;
+	u32 val;
 
 	phy_info = bxt_get_phy_info(dev_priv, phy);
 
@@ -515,7 +515,7 @@ bool bxt_ddi_phy_verify_state(struct drm_i915_private *dev_priv,
 			      enum dpio_phy phy)
 {
 	const struct bxt_ddi_phy_info *phy_info;
-	uint32_t mask;
+	u32 mask;
 	bool ok;
 
 	phy_info = bxt_get_phy_info(dev_priv, phy);
@@ -567,8 +567,8 @@ bool bxt_ddi_phy_verify_state(struct drm_i915_private *dev_priv,
 #undef _CHK
 }
 
-uint8_t
-bxt_ddi_phy_calc_lane_lat_optim_mask(uint8_t lane_count)
+u8
+bxt_ddi_phy_calc_lane_lat_optim_mask(u8 lane_count)
 {
 	switch (lane_count) {
 	case 1:
@@ -585,7 +585,7 @@ bxt_ddi_phy_calc_lane_lat_optim_mask(uint8_t lane_count)
 }
 
 void bxt_ddi_phy_set_lane_optim_mask(struct intel_encoder *encoder,
-				     uint8_t lane_lat_optim_mask)
+				     u8 lane_lat_optim_mask)
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	enum port port = encoder->port;
@@ -610,7 +610,7 @@ void bxt_ddi_phy_set_lane_optim_mask(struct intel_encoder *encoder,
 	}
 }
 
-uint8_t
+u8
 bxt_ddi_phy_get_lane_lat_optim_mask(struct intel_encoder *encoder)
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
@@ -618,7 +618,7 @@ bxt_ddi_phy_get_lane_lat_optim_mask(struct intel_encoder *encoder)
 	enum dpio_phy phy;
 	enum dpio_channel ch;
 	int lane;
-	uint8_t mask;
+	u8 mask;
 
 	bxt_port_to_phy_channel(dev_priv, port, &phy, &ch);
 
@@ -739,7 +739,7 @@ void chv_data_lane_soft_reset(struct intel_encoder *encoder,
 	enum dpio_channel ch = vlv_dport_to_channel(enc_to_dig_port(&encoder->base));
 	struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
 	enum pipe pipe = crtc->pipe;
-	uint32_t val;
+	u32 val;
 
 	val = vlv_dpio_read(dev_priv, pipe, VLV_PCS01_DW0(ch));
 	if (reset)
diff --git a/drivers/gpu/drm/i915/intel_dpll_mgr.c b/drivers/gpu/drm/i915/intel_dpll_mgr.c
index d513ca875c67..0a42d11c4c33 100644
--- a/drivers/gpu/drm/i915/intel_dpll_mgr.c
+++ b/drivers/gpu/drm/i915/intel_dpll_mgr.c
@@ -247,7 +247,7 @@ intel_find_shared_dpll(struct intel_crtc *crtc,
 		       enum intel_dpll_id range_max)
 {
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
-	struct intel_shared_dpll *pll;
+	struct intel_shared_dpll *pll, *unused_pll = NULL;
 	struct intel_shared_dpll_state *shared_dpll;
 	enum intel_dpll_id i;
 
@@ -257,8 +257,11 @@ intel_find_shared_dpll(struct intel_crtc *crtc,
 		pll = &dev_priv->shared_dplls[i];
 
 		/* Only want to check enabled timings first */
-		if (shared_dpll[i].crtc_mask == 0)
+		if (shared_dpll[i].crtc_mask == 0) {
+			if (!unused_pll)
+				unused_pll = pll;
 			continue;
+		}
 
 		if (memcmp(&crtc_state->dpll_hw_state,
 			   &shared_dpll[i].hw_state,
@@ -273,14 +276,11 @@ intel_find_shared_dpll(struct intel_crtc *crtc,
 	}
 
 	/* Ok no matching timings, maybe there's a free one? */
-	for (i = range_min; i <= range_max; i++) {
-		pll = &dev_priv->shared_dplls[i];
-		if (shared_dpll[i].crtc_mask == 0) {
-			DRM_DEBUG_KMS("[CRTC:%d:%s] allocated %s\n",
-				      crtc->base.base.id, crtc->base.name,
-				      pll->info->name);
-			return pll;
-		}
+	if (unused_pll) {
+		DRM_DEBUG_KMS("[CRTC:%d:%s] allocated %s\n",
+			      crtc->base.base.id, crtc->base.name,
+			      unused_pll->info->name);
+		return unused_pll;
 	}
 
 	return NULL;
@@ -345,9 +345,12 @@ static bool ibx_pch_dpll_get_hw_state(struct drm_i915_private *dev_priv,
 				      struct intel_dpll_hw_state *hw_state)
 {
 	const enum intel_dpll_id id = pll->info->id;
-	uint32_t val;
+	intel_wakeref_t wakeref;
+	u32 val;
 
-	if (!intel_display_power_get_if_enabled(dev_priv, POWER_DOMAIN_PLLS))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     POWER_DOMAIN_PLLS);
+	if (!wakeref)
 		return false;
 
 	val = I915_READ(PCH_DPLL(id));
@@ -355,7 +358,7 @@ static bool ibx_pch_dpll_get_hw_state(struct drm_i915_private *dev_priv,
 	hw_state->fp0 = I915_READ(PCH_FP0(id));
 	hw_state->fp1 = I915_READ(PCH_FP1(id));
 
-	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS, wakeref);
 
 	return val & DPLL_VCO_ENABLE;
 }
@@ -487,7 +490,7 @@ static void hsw_ddi_wrpll_disable(struct drm_i915_private *dev_priv,
 				  struct intel_shared_dpll *pll)
 {
 	const enum intel_dpll_id id = pll->info->id;
-	uint32_t val;
+	u32 val;
 
 	val = I915_READ(WRPLL_CTL(id));
 	I915_WRITE(WRPLL_CTL(id), val & ~WRPLL_PLL_ENABLE);
@@ -497,7 +500,7 @@ static void hsw_ddi_wrpll_disable(struct drm_i915_private *dev_priv,
 static void hsw_ddi_spll_disable(struct drm_i915_private *dev_priv,
 				 struct intel_shared_dpll *pll)
 {
-	uint32_t val;
+	u32 val;
 
 	val = I915_READ(SPLL_CTL);
 	I915_WRITE(SPLL_CTL, val & ~SPLL_PLL_ENABLE);
@@ -509,15 +512,18 @@ static bool hsw_ddi_wrpll_get_hw_state(struct drm_i915_private *dev_priv,
 				       struct intel_dpll_hw_state *hw_state)
 {
 	const enum intel_dpll_id id = pll->info->id;
-	uint32_t val;
+	intel_wakeref_t wakeref;
+	u32 val;
 
-	if (!intel_display_power_get_if_enabled(dev_priv, POWER_DOMAIN_PLLS))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     POWER_DOMAIN_PLLS);
+	if (!wakeref)
 		return false;
 
 	val = I915_READ(WRPLL_CTL(id));
 	hw_state->wrpll = val;
 
-	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS, wakeref);
 
 	return val & WRPLL_PLL_ENABLE;
 }
@@ -526,15 +532,18 @@ static bool hsw_ddi_spll_get_hw_state(struct drm_i915_private *dev_priv,
 				      struct intel_shared_dpll *pll,
 				      struct intel_dpll_hw_state *hw_state)
 {
-	uint32_t val;
+	intel_wakeref_t wakeref;
+	u32 val;
 
-	if (!intel_display_power_get_if_enabled(dev_priv, POWER_DOMAIN_PLLS))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     POWER_DOMAIN_PLLS);
+	if (!wakeref)
 		return false;
 
 	val = I915_READ(SPLL_CTL);
 	hw_state->spll = val;
 
-	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS, wakeref);
 
 	return val & SPLL_PLL_ENABLE;
 }
@@ -630,11 +639,12 @@ static unsigned hsw_wrpll_get_budget_for_freq(int clock)
 	return budget;
 }
 
-static void hsw_wrpll_update_rnp(uint64_t freq2k, unsigned budget,
-				 unsigned r2, unsigned n2, unsigned p,
+static void hsw_wrpll_update_rnp(u64 freq2k, unsigned int budget,
+				 unsigned int r2, unsigned int n2,
+				 unsigned int p,
 				 struct hsw_wrpll_rnp *best)
 {
-	uint64_t a, b, c, d, diff, diff_best;
+	u64 a, b, c, d, diff, diff_best;
 
 	/* No best (r,n,p) yet */
 	if (best->p == 0) {
@@ -693,7 +703,7 @@ static void
 hsw_ddi_calculate_wrpll(int clock /* in Hz */,
 			unsigned *r2_out, unsigned *n2_out, unsigned *p_out)
 {
-	uint64_t freq2k;
+	u64 freq2k;
 	unsigned p, n2, r2;
 	struct hsw_wrpll_rnp best = { 0, 0, 0 };
 	unsigned budget;
@@ -759,7 +769,7 @@ static struct intel_shared_dpll *hsw_ddi_hdmi_get_dpll(int clock,
 						       struct intel_crtc_state *crtc_state)
 {
 	struct intel_shared_dpll *pll;
-	uint32_t val;
+	u32 val;
 	unsigned int p, n2, r2;
 
 	hsw_ddi_calculate_wrpll(clock * 1000, &r2, &n2, &p);
@@ -921,7 +931,7 @@ static void skl_ddi_pll_write_ctrl1(struct drm_i915_private *dev_priv,
 				    struct intel_shared_dpll *pll)
 {
 	const enum intel_dpll_id id = pll->info->id;
-	uint32_t val;
+	u32 val;
 
 	val = I915_READ(DPLL_CTRL1);
 
@@ -986,12 +996,15 @@ static bool skl_ddi_pll_get_hw_state(struct drm_i915_private *dev_priv,
 				     struct intel_shared_dpll *pll,
 				     struct intel_dpll_hw_state *hw_state)
 {
-	uint32_t val;
+	u32 val;
 	const struct skl_dpll_regs *regs = skl_dpll_regs;
 	const enum intel_dpll_id id = pll->info->id;
+	intel_wakeref_t wakeref;
 	bool ret;
 
-	if (!intel_display_power_get_if_enabled(dev_priv, POWER_DOMAIN_PLLS))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     POWER_DOMAIN_PLLS);
+	if (!wakeref)
 		return false;
 
 	ret = false;
@@ -1011,7 +1024,7 @@ static bool skl_ddi_pll_get_hw_state(struct drm_i915_private *dev_priv,
 	ret = true;
 
 out:
-	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS, wakeref);
 
 	return ret;
 }
@@ -1020,12 +1033,15 @@ static bool skl_ddi_dpll0_get_hw_state(struct drm_i915_private *dev_priv,
 				       struct intel_shared_dpll *pll,
 				       struct intel_dpll_hw_state *hw_state)
 {
-	uint32_t val;
 	const struct skl_dpll_regs *regs = skl_dpll_regs;
 	const enum intel_dpll_id id = pll->info->id;
+	intel_wakeref_t wakeref;
+	u32 val;
 	bool ret;
 
-	if (!intel_display_power_get_if_enabled(dev_priv, POWER_DOMAIN_PLLS))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     POWER_DOMAIN_PLLS);
+	if (!wakeref)
 		return false;
 
 	ret = false;
@@ -1041,15 +1057,15 @@ static bool skl_ddi_dpll0_get_hw_state(struct drm_i915_private *dev_priv,
 	ret = true;
 
 out:
-	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS, wakeref);
 
 	return ret;
 }
 
 struct skl_wrpll_context {
-	uint64_t min_deviation;		/* current minimal deviation */
-	uint64_t central_freq;		/* chosen central freq */
-	uint64_t dco_freq;		/* chosen dco freq */
+	u64 min_deviation;		/* current minimal deviation */
+	u64 central_freq;		/* chosen central freq */
+	u64 dco_freq;			/* chosen dco freq */
 	unsigned int p;			/* chosen divider */
 };
 
@@ -1065,11 +1081,11 @@ static void skl_wrpll_context_init(struct skl_wrpll_context *ctx)
 #define SKL_DCO_MAX_NDEVIATION	600
 
 static void skl_wrpll_try_divider(struct skl_wrpll_context *ctx,
-				  uint64_t central_freq,
-				  uint64_t dco_freq,
+				  u64 central_freq,
+				  u64 dco_freq,
 				  unsigned int divider)
 {
-	uint64_t deviation;
+	u64 deviation;
 
 	deviation = div64_u64(10000 * abs_diff(dco_freq, central_freq),
 			      central_freq);
@@ -1143,21 +1159,21 @@ static void skl_wrpll_get_multipliers(unsigned int p,
 }
 
 struct skl_wrpll_params {
-	uint32_t        dco_fraction;
-	uint32_t        dco_integer;
-	uint32_t        qdiv_ratio;
-	uint32_t        qdiv_mode;
-	uint32_t        kdiv;
-	uint32_t        pdiv;
-	uint32_t        central_freq;
+	u32 dco_fraction;
+	u32 dco_integer;
+	u32 qdiv_ratio;
+	u32 qdiv_mode;
+	u32 kdiv;
+	u32 pdiv;
+	u32 central_freq;
 };
 
 static void skl_wrpll_params_populate(struct skl_wrpll_params *params,
-				      uint64_t afe_clock,
-				      uint64_t central_freq,
-				      uint32_t p0, uint32_t p1, uint32_t p2)
+				      u64 afe_clock,
+				      u64 central_freq,
+				      u32 p0, u32 p1, u32 p2)
 {
-	uint64_t dco_freq;
+	u64 dco_freq;
 
 	switch (central_freq) {
 	case 9600000000ULL:
@@ -1223,10 +1239,10 @@ static bool
 skl_ddi_calculate_wrpll(int clock /* in Hz */,
 			struct skl_wrpll_params *wrpll_params)
 {
-	uint64_t afe_clock = clock * 5; /* AFE Clock is 5x Pixel clock */
-	uint64_t dco_central_freq[3] = {8400000000ULL,
-					9000000000ULL,
-					9600000000ULL};
+	u64 afe_clock = clock * 5; /* AFE Clock is 5x Pixel clock */
+	u64 dco_central_freq[3] = { 8400000000ULL,
+				    9000000000ULL,
+				    9600000000ULL };
 	static const int even_dividers[] = {  4,  6,  8, 10, 12, 14, 16, 18, 20,
 					     24, 28, 30, 32, 36, 40, 42, 44,
 					     48, 52, 54, 56, 60, 64, 66, 68,
@@ -1250,7 +1266,7 @@ skl_ddi_calculate_wrpll(int clock /* in Hz */,
 		for (dco = 0; dco < ARRAY_SIZE(dco_central_freq); dco++) {
 			for (i = 0; i < dividers[d].n_dividers; i++) {
 				unsigned int p = dividers[d].list[i];
-				uint64_t dco_freq = p * afe_clock;
+				u64 dco_freq = p * afe_clock;
 
 				skl_wrpll_try_divider(&ctx,
 						      dco_central_freq[dco],
@@ -1296,7 +1312,7 @@ static bool skl_ddi_hdmi_pll_dividers(struct intel_crtc *crtc,
 				      struct intel_crtc_state *crtc_state,
 				      int clock)
 {
-	uint32_t ctrl1, cfgcr1, cfgcr2;
+	u32 ctrl1, cfgcr1, cfgcr2;
 	struct skl_wrpll_params wrpll_params = { 0, };
 
 	/*
@@ -1333,7 +1349,7 @@ static bool
 skl_ddi_dp_set_dpll_hw_state(int clock,
 			     struct intel_dpll_hw_state *dpll_hw_state)
 {
-	uint32_t ctrl1;
+	u32 ctrl1;
 
 	/*
 	 * See comment in intel_dpll_hw_state to understand why we always use 0
@@ -1435,7 +1451,7 @@ static const struct intel_shared_dpll_funcs skl_ddi_dpll0_funcs = {
 static void bxt_ddi_pll_enable(struct drm_i915_private *dev_priv,
 				struct intel_shared_dpll *pll)
 {
-	uint32_t temp;
+	u32 temp;
 	enum port port = (enum port)pll->info->id; /* 1:1 port->PLL mapping */
 	enum dpio_phy phy;
 	enum dpio_channel ch;
@@ -1556,7 +1572,7 @@ static void bxt_ddi_pll_disable(struct drm_i915_private *dev_priv,
 					struct intel_shared_dpll *pll)
 {
 	enum port port = (enum port)pll->info->id; /* 1:1 port->PLL mapping */
-	uint32_t temp;
+	u32 temp;
 
 	temp = I915_READ(BXT_PORT_PLL_ENABLE(port));
 	temp &= ~PORT_PLL_ENABLE;
@@ -1579,14 +1595,17 @@ static bool bxt_ddi_pll_get_hw_state(struct drm_i915_private *dev_priv,
 					struct intel_dpll_hw_state *hw_state)
 {
 	enum port port = (enum port)pll->info->id; /* 1:1 port->PLL mapping */
-	uint32_t val;
-	bool ret;
+	intel_wakeref_t wakeref;
 	enum dpio_phy phy;
 	enum dpio_channel ch;
+	u32 val;
+	bool ret;
 
 	bxt_port_to_phy_channel(dev_priv, port, &phy, &ch);
 
-	if (!intel_display_power_get_if_enabled(dev_priv, POWER_DOMAIN_PLLS))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     POWER_DOMAIN_PLLS);
+	if (!wakeref)
 		return false;
 
 	ret = false;
@@ -1643,7 +1662,7 @@ static bool bxt_ddi_pll_get_hw_state(struct drm_i915_private *dev_priv,
 	ret = true;
 
 out:
-	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS, wakeref);
 
 	return ret;
 }
@@ -1651,12 +1670,12 @@ out:
 /* bxt clock parameters */
 struct bxt_clk_div {
 	int clock;
-	uint32_t p1;
-	uint32_t p2;
-	uint32_t m2_int;
-	uint32_t m2_frac;
+	u32 p1;
+	u32 p2;
+	u32 m2_int;
+	u32 m2_frac;
 	bool m2_frac_en;
-	uint32_t n;
+	u32 n;
 
 	int vco;
 };
@@ -1723,8 +1742,8 @@ static bool bxt_ddi_set_dpll_hw_state(int clock,
 			  struct intel_dpll_hw_state *dpll_hw_state)
 {
 	int vco = clk_div->vco;
-	uint32_t prop_coef, int_coef, gain_ctl, targ_cnt;
-	uint32_t lanestagger;
+	u32 prop_coef, int_coef, gain_ctl, targ_cnt;
+	u32 lanestagger;
 
 	if (vco >= 6200000 && vco <= 6700000) {
 		prop_coef = 4;
@@ -1873,7 +1892,7 @@ static void intel_ddi_pll_init(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = to_i915(dev);
 
 	if (INTEL_GEN(dev_priv) < 9) {
-		uint32_t val = I915_READ(LCPLL_CTL);
+		u32 val = I915_READ(LCPLL_CTL);
 
 		/*
 		 * The LCPLL register should be turned on by the BIOS. For now
@@ -1959,7 +1978,7 @@ static void cnl_ddi_pll_enable(struct drm_i915_private *dev_priv,
 			       struct intel_shared_dpll *pll)
 {
 	const enum intel_dpll_id id = pll->info->id;
-	uint32_t val;
+	u32 val;
 
 	/* 1. Enable DPLL power in DPLL_ENABLE. */
 	val = I915_READ(CNL_DPLL_ENABLE(id));
@@ -2034,7 +2053,7 @@ static void cnl_ddi_pll_disable(struct drm_i915_private *dev_priv,
 				struct intel_shared_dpll *pll)
 {
 	const enum intel_dpll_id id = pll->info->id;
-	uint32_t val;
+	u32 val;
 
 	/*
 	 * 1. Configure DPCLKA_CFGCR0 to turn off the clock for the DDI.
@@ -2091,10 +2110,13 @@ static bool cnl_ddi_pll_get_hw_state(struct drm_i915_private *dev_priv,
 				     struct intel_dpll_hw_state *hw_state)
 {
 	const enum intel_dpll_id id = pll->info->id;
-	uint32_t val;
+	intel_wakeref_t wakeref;
+	u32 val;
 	bool ret;
 
-	if (!intel_display_power_get_if_enabled(dev_priv, POWER_DOMAIN_PLLS))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     POWER_DOMAIN_PLLS);
+	if (!wakeref)
 		return false;
 
 	ret = false;
@@ -2113,7 +2135,7 @@ static bool cnl_ddi_pll_get_hw_state(struct drm_i915_private *dev_priv,
 	ret = true;
 
 out:
-	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS, wakeref);
 
 	return ret;
 }
@@ -2225,7 +2247,7 @@ cnl_ddi_calculate_wrpll(int clock,
 			struct skl_wrpll_params *wrpll_params)
 {
 	u32 afe_clock = clock * 5;
-	uint32_t ref_clock;
+	u32 ref_clock;
 	u32 dco_min = 7998000;
 	u32 dco_max = 10000000;
 	u32 dco_mid = (dco_min + dco_max) / 2;
@@ -2271,7 +2293,7 @@ static bool cnl_ddi_hdmi_pll_dividers(struct intel_crtc *crtc,
 				      int clock)
 {
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
-	uint32_t cfgcr0, cfgcr1;
+	u32 cfgcr0, cfgcr1;
 	struct skl_wrpll_params wrpll_params = { 0, };
 
 	cfgcr0 = DPLL_CFGCR0_HDMI_MODE;
@@ -2300,7 +2322,7 @@ static bool
 cnl_ddi_dp_set_dpll_hw_state(int clock,
 			     struct intel_dpll_hw_state *dpll_hw_state)
 {
-	uint32_t cfgcr0;
+	u32 cfgcr0;
 
 	cfgcr0 = DPLL_CFGCR0_SSC_ENABLE;
 
@@ -2517,7 +2539,7 @@ static bool icl_calc_dpll_state(struct intel_crtc_state *crtc_state,
 				struct intel_dpll_hw_state *pll_state)
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
-	uint32_t cfgcr0, cfgcr1;
+	u32 cfgcr0, cfgcr1;
 	struct skl_wrpll_params pll_params = { 0 };
 	bool ret;
 
@@ -2547,10 +2569,10 @@ static bool icl_calc_dpll_state(struct intel_crtc_state *crtc_state,
 }
 
 int icl_calc_dp_combo_pll_link(struct drm_i915_private *dev_priv,
-			       uint32_t pll_id)
+			       u32 pll_id)
 {
-	uint32_t cfgcr0, cfgcr1;
-	uint32_t pdiv, kdiv, qdiv_mode, qdiv_ratio, dco_integer, dco_fraction;
+	u32 cfgcr0, cfgcr1;
+	u32 pdiv, kdiv, qdiv_mode, qdiv_ratio, dco_integer, dco_fraction;
 	const struct skl_wrpll_params *params;
 	int index, n_entries, link_clock;
 
@@ -2617,14 +2639,14 @@ int icl_calc_dp_combo_pll_link(struct drm_i915_private *dev_priv,
 	return link_clock;
 }
 
-static enum port icl_mg_pll_id_to_port(enum intel_dpll_id id)
+static enum tc_port icl_pll_id_to_tc_port(enum intel_dpll_id id)
 {
-	return id - DPLL_ID_ICL_MGPLL1 + PORT_C;
+	return id - DPLL_ID_ICL_MGPLL1;
 }
 
-enum intel_dpll_id icl_port_to_mg_pll_id(enum port port)
+enum intel_dpll_id icl_tc_port_to_pll_id(enum tc_port tc_port)
 {
-	return port - PORT_C + DPLL_ID_ICL_MGPLL1;
+	return tc_port + DPLL_ID_ICL_MGPLL1;
 }
 
 bool intel_dpll_is_combophy(enum intel_dpll_id id)
@@ -2633,10 +2655,10 @@ bool intel_dpll_is_combophy(enum intel_dpll_id id)
 }
 
 static bool icl_mg_pll_find_divisors(int clock_khz, bool is_dp, bool use_ssc,
-				     uint32_t *target_dco_khz,
+				     u32 *target_dco_khz,
 				     struct intel_dpll_hw_state *state)
 {
-	uint32_t dco_min_freq, dco_max_freq;
+	u32 dco_min_freq, dco_max_freq;
 	int div1_vals[] = {7, 5, 3, 2};
 	unsigned int i;
 	int div2;
@@ -2712,12 +2734,12 @@ static bool icl_calc_mg_pll_state(struct intel_crtc_state *crtc_state,
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	int refclk_khz = dev_priv->cdclk.hw.ref;
-	uint32_t dco_khz, m1div, m2div_int, m2div_rem, m2div_frac;
-	uint32_t iref_ndiv, iref_trim, iref_pulse_w;
-	uint32_t prop_coeff, int_coeff;
-	uint32_t tdc_targetcnt, feedfwgain;
-	uint64_t ssc_stepsize, ssc_steplen, ssc_steplog;
-	uint64_t tmp;
+	u32 dco_khz, m1div, m2div_int, m2div_rem, m2div_frac;
+	u32 iref_ndiv, iref_trim, iref_pulse_w;
+	u32 prop_coeff, int_coeff;
+	u32 tdc_targetcnt, feedfwgain;
+	u64 ssc_stepsize, ssc_steplen, ssc_steplog;
+	u64 tmp;
 	bool use_ssc = false;
 	bool is_dp = !intel_crtc_has_type(crtc_state, INTEL_OUTPUT_HDMI);
 
@@ -2740,7 +2762,7 @@ static bool icl_calc_mg_pll_state(struct intel_crtc_state *crtc_state,
 	}
 	m2div_rem = dco_khz % (refclk_khz * m1div);
 
-	tmp = (uint64_t)m2div_rem * (1 << 22);
+	tmp = (u64)m2div_rem * (1 << 22);
 	do_div(tmp, refclk_khz * m1div);
 	m2div_frac = tmp;
 
@@ -2799,11 +2821,11 @@ static bool icl_calc_mg_pll_state(struct intel_crtc_state *crtc_state,
 	}
 
 	if (use_ssc) {
-		tmp = (uint64_t)dco_khz * 47 * 32;
+		tmp = (u64)dco_khz * 47 * 32;
 		do_div(tmp, refclk_khz * m1div * 10000);
 		ssc_stepsize = tmp;
 
-		tmp = (uint64_t)dco_khz * 1000;
+		tmp = (u64)dco_khz * 1000;
 		ssc_steplen = DIV_ROUND_UP_ULL(tmp, 32 * 2 * 32);
 	} else {
 		ssc_stepsize = 0;
@@ -2903,7 +2925,10 @@ icl_get_dpll(struct intel_crtc *crtc, struct intel_crtc_state *crtc_state,
 			ret = icl_calc_dpll_state(crtc_state, encoder, clock,
 						  &pll_state);
 		} else {
-			min = icl_port_to_mg_pll_id(port);
+			enum tc_port tc_port;
+
+			tc_port = intel_port_to_tc(dev_priv, port);
+			min = icl_tc_port_to_pll_id(tc_port);
 			max = min;
 			ret = icl_calc_mg_pll_state(crtc_state, encoder, clock,
 						    &pll_state);
@@ -2937,12 +2962,8 @@ static i915_reg_t icl_pll_id_to_enable_reg(enum intel_dpll_id id)
 		return CNL_DPLL_ENABLE(id);
 	else if (id == DPLL_ID_ICL_TBTPLL)
 		return TBT_PLL_ENABLE;
-	else
-		/*
-		 * TODO: Make MG_PLL macros use
-		 * tc port id instead of port id
-		 */
-		return MG_PLL_ENABLE(icl_mg_pll_id_to_port(id));
+
+	return MG_PLL_ENABLE(icl_pll_id_to_tc_port(id));
 }
 
 static bool icl_pll_get_hw_state(struct drm_i915_private *dev_priv,
@@ -2950,11 +2971,13 @@ static bool icl_pll_get_hw_state(struct drm_i915_private *dev_priv,
 				 struct intel_dpll_hw_state *hw_state)
 {
 	const enum intel_dpll_id id = pll->info->id;
-	uint32_t val;
-	enum port port;
+	intel_wakeref_t wakeref;
 	bool ret = false;
+	u32 val;
 
-	if (!intel_display_power_get_if_enabled(dev_priv, POWER_DOMAIN_PLLS))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     POWER_DOMAIN_PLLS);
+	if (!wakeref)
 		return false;
 
 	val = I915_READ(icl_pll_id_to_enable_reg(id));
@@ -2966,32 +2989,33 @@ static bool icl_pll_get_hw_state(struct drm_i915_private *dev_priv,
 		hw_state->cfgcr0 = I915_READ(ICL_DPLL_CFGCR0(id));
 		hw_state->cfgcr1 = I915_READ(ICL_DPLL_CFGCR1(id));
 	} else {
-		port = icl_mg_pll_id_to_port(id);
-		hw_state->mg_refclkin_ctl = I915_READ(MG_REFCLKIN_CTL(port));
+		enum tc_port tc_port = icl_pll_id_to_tc_port(id);
+
+		hw_state->mg_refclkin_ctl = I915_READ(MG_REFCLKIN_CTL(tc_port));
 		hw_state->mg_refclkin_ctl &= MG_REFCLKIN_CTL_OD_2_MUX_MASK;
 
 		hw_state->mg_clktop2_coreclkctl1 =
-			I915_READ(MG_CLKTOP2_CORECLKCTL1(port));
+			I915_READ(MG_CLKTOP2_CORECLKCTL1(tc_port));
 		hw_state->mg_clktop2_coreclkctl1 &=
 			MG_CLKTOP2_CORECLKCTL1_A_DIVRATIO_MASK;
 
 		hw_state->mg_clktop2_hsclkctl =
-			I915_READ(MG_CLKTOP2_HSCLKCTL(port));
+			I915_READ(MG_CLKTOP2_HSCLKCTL(tc_port));
 		hw_state->mg_clktop2_hsclkctl &=
 			MG_CLKTOP2_HSCLKCTL_TLINEDRV_CLKSEL_MASK |
 			MG_CLKTOP2_HSCLKCTL_CORE_INPUTSEL_MASK |
 			MG_CLKTOP2_HSCLKCTL_HSDIV_RATIO_MASK |
 			MG_CLKTOP2_HSCLKCTL_DSDIV_RATIO_MASK;
 
-		hw_state->mg_pll_div0 = I915_READ(MG_PLL_DIV0(port));
-		hw_state->mg_pll_div1 = I915_READ(MG_PLL_DIV1(port));
-		hw_state->mg_pll_lf = I915_READ(MG_PLL_LF(port));
-		hw_state->mg_pll_frac_lock = I915_READ(MG_PLL_FRAC_LOCK(port));
-		hw_state->mg_pll_ssc = I915_READ(MG_PLL_SSC(port));
+		hw_state->mg_pll_div0 = I915_READ(MG_PLL_DIV0(tc_port));
+		hw_state->mg_pll_div1 = I915_READ(MG_PLL_DIV1(tc_port));
+		hw_state->mg_pll_lf = I915_READ(MG_PLL_LF(tc_port));
+		hw_state->mg_pll_frac_lock = I915_READ(MG_PLL_FRAC_LOCK(tc_port));
+		hw_state->mg_pll_ssc = I915_READ(MG_PLL_SSC(tc_port));
 
-		hw_state->mg_pll_bias = I915_READ(MG_PLL_BIAS(port));
+		hw_state->mg_pll_bias = I915_READ(MG_PLL_BIAS(tc_port));
 		hw_state->mg_pll_tdc_coldst_bias =
-			I915_READ(MG_PLL_TDC_COLDST_BIAS(port));
+			I915_READ(MG_PLL_TDC_COLDST_BIAS(tc_port));
 
 		if (dev_priv->cdclk.hw.ref == 38400) {
 			hw_state->mg_pll_tdc_coldst_bias_mask = MG_PLL_TDC_COLDST_COLDSTART;
@@ -3007,7 +3031,7 @@ static bool icl_pll_get_hw_state(struct drm_i915_private *dev_priv,
 
 	ret = true;
 out:
-	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS, wakeref);
 	return ret;
 }
 
@@ -3026,7 +3050,7 @@ static void icl_mg_pll_write(struct drm_i915_private *dev_priv,
 			     struct intel_shared_dpll *pll)
 {
 	struct intel_dpll_hw_state *hw_state = &pll->state.hw_state;
-	enum port port = icl_mg_pll_id_to_port(pll->info->id);
+	enum tc_port tc_port = icl_pll_id_to_tc_port(pll->info->id);
 	u32 val;
 
 	/*
@@ -3035,41 +3059,41 @@ static void icl_mg_pll_write(struct drm_i915_private *dev_priv,
 	 * during the calc/readout phase if the mask depends on some other HW
 	 * state like refclk, see icl_calc_mg_pll_state().
 	 */
-	val = I915_READ(MG_REFCLKIN_CTL(port));
+	val = I915_READ(MG_REFCLKIN_CTL(tc_port));
 	val &= ~MG_REFCLKIN_CTL_OD_2_MUX_MASK;
 	val |= hw_state->mg_refclkin_ctl;
-	I915_WRITE(MG_REFCLKIN_CTL(port), val);
+	I915_WRITE(MG_REFCLKIN_CTL(tc_port), val);
 
-	val = I915_READ(MG_CLKTOP2_CORECLKCTL1(port));
+	val = I915_READ(MG_CLKTOP2_CORECLKCTL1(tc_port));
 	val &= ~MG_CLKTOP2_CORECLKCTL1_A_DIVRATIO_MASK;
 	val |= hw_state->mg_clktop2_coreclkctl1;
-	I915_WRITE(MG_CLKTOP2_CORECLKCTL1(port), val);
+	I915_WRITE(MG_CLKTOP2_CORECLKCTL1(tc_port), val);
 
-	val = I915_READ(MG_CLKTOP2_HSCLKCTL(port));
+	val = I915_READ(MG_CLKTOP2_HSCLKCTL(tc_port));
 	val &= ~(MG_CLKTOP2_HSCLKCTL_TLINEDRV_CLKSEL_MASK |
 		 MG_CLKTOP2_HSCLKCTL_CORE_INPUTSEL_MASK |
 		 MG_CLKTOP2_HSCLKCTL_HSDIV_RATIO_MASK |
 		 MG_CLKTOP2_HSCLKCTL_DSDIV_RATIO_MASK);
 	val |= hw_state->mg_clktop2_hsclkctl;
-	I915_WRITE(MG_CLKTOP2_HSCLKCTL(port), val);
+	I915_WRITE(MG_CLKTOP2_HSCLKCTL(tc_port), val);
 
-	I915_WRITE(MG_PLL_DIV0(port), hw_state->mg_pll_div0);
-	I915_WRITE(MG_PLL_DIV1(port), hw_state->mg_pll_div1);
-	I915_WRITE(MG_PLL_LF(port), hw_state->mg_pll_lf);
-	I915_WRITE(MG_PLL_FRAC_LOCK(port), hw_state->mg_pll_frac_lock);
-	I915_WRITE(MG_PLL_SSC(port), hw_state->mg_pll_ssc);
+	I915_WRITE(MG_PLL_DIV0(tc_port), hw_state->mg_pll_div0);
+	I915_WRITE(MG_PLL_DIV1(tc_port), hw_state->mg_pll_div1);
+	I915_WRITE(MG_PLL_LF(tc_port), hw_state->mg_pll_lf);
+	I915_WRITE(MG_PLL_FRAC_LOCK(tc_port), hw_state->mg_pll_frac_lock);
+	I915_WRITE(MG_PLL_SSC(tc_port), hw_state->mg_pll_ssc);
 
-	val = I915_READ(MG_PLL_BIAS(port));
+	val = I915_READ(MG_PLL_BIAS(tc_port));
 	val &= ~hw_state->mg_pll_bias_mask;
 	val |= hw_state->mg_pll_bias;
-	I915_WRITE(MG_PLL_BIAS(port), val);
+	I915_WRITE(MG_PLL_BIAS(tc_port), val);
 
-	val = I915_READ(MG_PLL_TDC_COLDST_BIAS(port));
+	val = I915_READ(MG_PLL_TDC_COLDST_BIAS(tc_port));
 	val &= ~hw_state->mg_pll_tdc_coldst_bias_mask;
 	val |= hw_state->mg_pll_tdc_coldst_bias;
-	I915_WRITE(MG_PLL_TDC_COLDST_BIAS(port), val);
+	I915_WRITE(MG_PLL_TDC_COLDST_BIAS(tc_port), val);
 
-	POSTING_READ(MG_PLL_TDC_COLDST_BIAS(port));
+	POSTING_READ(MG_PLL_TDC_COLDST_BIAS(tc_port));
 }
 
 static void icl_pll_enable(struct drm_i915_private *dev_priv,
@@ -3077,7 +3101,7 @@ static void icl_pll_enable(struct drm_i915_private *dev_priv,
 {
 	const enum intel_dpll_id id = pll->info->id;
 	i915_reg_t enable_reg = icl_pll_id_to_enable_reg(id);
-	uint32_t val;
+	u32 val;
 
 	val = I915_READ(enable_reg);
 	val |= PLL_POWER_ENABLE;
@@ -3118,7 +3142,7 @@ static void icl_pll_disable(struct drm_i915_private *dev_priv,
 {
 	const enum intel_dpll_id id = pll->info->id;
 	i915_reg_t enable_reg = icl_pll_id_to_enable_reg(id);
-	uint32_t val;
+	u32 val;
 
 	/* The first steps are done by intel_ddi_post_disable(). */
 
diff --git a/drivers/gpu/drm/i915/intel_dpll_mgr.h b/drivers/gpu/drm/i915/intel_dpll_mgr.h
index a033d8f06d4a..40e8391a92f2 100644
--- a/drivers/gpu/drm/i915/intel_dpll_mgr.h
+++ b/drivers/gpu/drm/i915/intel_dpll_mgr.h
@@ -138,14 +138,14 @@ enum intel_dpll_id {
 
 struct intel_dpll_hw_state {
 	/* i9xx, pch plls */
-	uint32_t dpll;
-	uint32_t dpll_md;
-	uint32_t fp0;
-	uint32_t fp1;
+	u32 dpll;
+	u32 dpll_md;
+	u32 fp0;
+	u32 fp1;
 
 	/* hsw, bdw */
-	uint32_t wrpll;
-	uint32_t spll;
+	u32 wrpll;
+	u32 spll;
 
 	/* skl */
 	/*
@@ -154,34 +154,33 @@ struct intel_dpll_hw_state {
 	 * the register.  This allows us to easily compare the state to share
 	 * the DPLL.
 	 */
-	uint32_t ctrl1;
+	u32 ctrl1;
 	/* HDMI only, 0 when used for DP */
-	uint32_t cfgcr1, cfgcr2;
+	u32 cfgcr1, cfgcr2;
 
 	/* cnl */
-	uint32_t cfgcr0;
+	u32 cfgcr0;
 	/* CNL also uses cfgcr1 */
 
 	/* bxt */
-	uint32_t ebb0, ebb4, pll0, pll1, pll2, pll3, pll6, pll8, pll9, pll10,
-		 pcsdw12;
+	u32 ebb0, ebb4, pll0, pll1, pll2, pll3, pll6, pll8, pll9, pll10, pcsdw12;
 
 	/*
 	 * ICL uses the following, already defined:
-	 * uint32_t cfgcr0, cfgcr1;
-	 */
-	uint32_t mg_refclkin_ctl;
-	uint32_t mg_clktop2_coreclkctl1;
-	uint32_t mg_clktop2_hsclkctl;
-	uint32_t mg_pll_div0;
-	uint32_t mg_pll_div1;
-	uint32_t mg_pll_lf;
-	uint32_t mg_pll_frac_lock;
-	uint32_t mg_pll_ssc;
-	uint32_t mg_pll_bias;
-	uint32_t mg_pll_tdc_coldst_bias;
-	uint32_t mg_pll_bias_mask;
-	uint32_t mg_pll_tdc_coldst_bias_mask;
+	 * u32 cfgcr0, cfgcr1;
+	 */
+	u32 mg_refclkin_ctl;
+	u32 mg_clktop2_coreclkctl1;
+	u32 mg_clktop2_hsclkctl;
+	u32 mg_pll_div0;
+	u32 mg_pll_div1;
+	u32 mg_pll_lf;
+	u32 mg_pll_frac_lock;
+	u32 mg_pll_ssc;
+	u32 mg_pll_bias;
+	u32 mg_pll_tdc_coldst_bias;
+	u32 mg_pll_bias_mask;
+	u32 mg_pll_tdc_coldst_bias_mask;
 };
 
 /**
@@ -280,7 +279,7 @@ struct dpll_info {
 	 *     Inform the state checker that the DPLL is kept enabled even if
 	 *     not in use by any CRTC.
 	 */
-	uint32_t flags;
+	u32 flags;
 };
 
 /**
@@ -343,9 +342,9 @@ void intel_shared_dpll_init(struct drm_device *dev);
 void intel_dpll_dump_hw_state(struct drm_i915_private *dev_priv,
 			      struct intel_dpll_hw_state *hw_state);
 int icl_calc_dp_combo_pll_link(struct drm_i915_private *dev_priv,
-			       uint32_t pll_id);
+			       u32 pll_id);
 int cnl_hdmi_pll_ref_clock(struct drm_i915_private *dev_priv);
-enum intel_dpll_id icl_port_to_mg_pll_id(enum port port);
+enum intel_dpll_id icl_tc_port_to_pll_id(enum tc_port tc_port);
 bool intel_dpll_is_combophy(enum intel_dpll_id id);
 
 #endif /* _INTEL_DPLL_MGR_H_ */
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index e9ddeaf05a14..15db41394b9e 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -29,18 +29,22 @@
 #include <linux/i2c.h>
 #include <linux/hdmi.h>
 #include <linux/sched/clock.h>
+#include <linux/stackdepot.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_encoder.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_dp_dual_mode_helper.h>
 #include <drm/drm_dp_mst_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_rect.h>
+#include <drm/drm_vblank.h>
 #include <drm/drm_atomic.h>
 #include <media/cec-notifier.h>
 
+struct drm_printer;
+
 /**
  * __wait_for - magic wait macro
  *
@@ -232,9 +236,9 @@ struct intel_encoder {
 	enum intel_output_type (*compute_output_type)(struct intel_encoder *,
 						      struct intel_crtc_state *,
 						      struct drm_connector_state *);
-	bool (*compute_config)(struct intel_encoder *,
-			       struct intel_crtc_state *,
-			       struct drm_connector_state *);
+	int (*compute_config)(struct intel_encoder *,
+			      struct intel_crtc_state *,
+			      struct drm_connector_state *);
 	void (*pre_pll_enable)(struct intel_encoder *,
 			       const struct intel_crtc_state *,
 			       const struct drm_connector_state *);
@@ -253,6 +257,9 @@ struct intel_encoder {
 	void (*post_pll_disable)(struct intel_encoder *,
 				 const struct intel_crtc_state *,
 				 const struct drm_connector_state *);
+	void (*update_pipe)(struct intel_encoder *,
+			    const struct intel_crtc_state *,
+			    const struct drm_connector_state *);
 	/* Read out the current hw state of this connector, returning true if
 	 * the encoder is active. If the encoder is enabled it also sets the pipe
 	 * it is connected to in the pipe parameter. */
@@ -304,13 +311,12 @@ struct intel_panel {
 
 		/* Connector and platform specific backlight functions */
 		int (*setup)(struct intel_connector *connector, enum pipe pipe);
-		uint32_t (*get)(struct intel_connector *connector);
-		void (*set)(const struct drm_connector_state *conn_state, uint32_t level);
+		u32 (*get)(struct intel_connector *connector);
+		void (*set)(const struct drm_connector_state *conn_state, u32 level);
 		void (*disable)(const struct drm_connector_state *conn_state);
 		void (*enable)(const struct intel_crtc_state *crtc_state,
 			       const struct drm_connector_state *conn_state);
-		uint32_t (*hz_to_pwm)(struct intel_connector *connector,
-				      uint32_t hz);
+		u32 (*hz_to_pwm)(struct intel_connector *connector, u32 hz);
 		void (*power)(struct intel_connector *, bool enable);
 	} backlight;
 };
@@ -602,7 +608,7 @@ struct intel_initial_plane_config {
 
 struct intel_scaler {
 	int in_use;
-	uint32_t mode;
+	u32 mode;
 };
 
 struct intel_crtc_scaler_state {
@@ -634,13 +640,15 @@ struct intel_crtc_scaler_state {
 };
 
 /* drm_mode->private_flags */
-#define I915_MODE_FLAG_INHERITED 1
+#define I915_MODE_FLAG_INHERITED (1<<0)
 /* Flag to get scanline using frame time stamps */
 #define I915_MODE_FLAG_GET_SCANLINE_FROM_TIMESTAMP (1<<1)
+/* Flag to use the scanline counter instead of the pixel counter */
+#define I915_MODE_FLAG_USE_SCANLINE_COUNTER (1<<2)
 
 struct intel_pipe_wm {
 	struct intel_wm_level wm[5];
-	uint32_t linetime;
+	u32 linetime;
 	bool fbc_wm_enabled;
 	bool pipe_enabled;
 	bool sprites_enabled;
@@ -656,7 +664,7 @@ struct skl_plane_wm {
 
 struct skl_pipe_wm {
 	struct skl_plane_wm planes[I915_MAX_PLANES];
-	uint32_t linetime;
+	u32 linetime;
 };
 
 enum vlv_wm_level {
@@ -669,7 +677,7 @@ enum vlv_wm_level {
 struct vlv_wm_state {
 	struct g4x_pipe_wm wm[NUM_VLV_WM_LEVELS];
 	struct g4x_sr_wm sr[NUM_VLV_WM_LEVELS];
-	uint8_t num_levels;
+	u8 num_levels;
 	bool cxsr;
 };
 
@@ -882,13 +890,13 @@ struct intel_crtc_state {
 	/* Used by SDVO (and if we ever fix it, HDMI). */
 	unsigned pixel_multiplier;
 
-	uint8_t lane_count;
+	u8 lane_count;
 
 	/*
 	 * Used by platforms having DP/HDMI PHY with programmable lane
 	 * latency optimization.
 	 */
-	uint8_t lane_lat_optim_mask;
+	u8 lane_lat_optim_mask;
 
 	/* minimum acceptable voltage level */
 	u8 min_voltage_level;
@@ -932,7 +940,7 @@ struct intel_crtc_state {
 	struct intel_crtc_wm_state wm;
 
 	/* Gamma mode programmed on the pipe */
-	uint32_t gamma_mode;
+	u32 gamma_mode;
 
 	/* bitmask of visible planes (enum plane_id) */
 	u8 active_planes;
@@ -1018,7 +1026,7 @@ struct intel_plane {
 	enum pipe pipe;
 	bool has_fbc;
 	bool has_ccs;
-	uint32_t frontbuffer_bit;
+	u32 frontbuffer_bit;
 
 	struct {
 		u32 base, cntl, size;
@@ -1084,7 +1092,6 @@ struct intel_hdmi {
 	} dp_dual_mode;
 	bool has_hdmi_sink;
 	bool has_audio;
-	bool rgb_quant_range_selectable;
 	struct intel_connector *attached_connector;
 	struct cec_notifier *cec_notifier;
 };
@@ -1114,9 +1121,9 @@ enum link_m_n_set {
 
 struct intel_dp_compliance_data {
 	unsigned long edid;
-	uint8_t video_pattern;
-	uint16_t hdisplay, vdisplay;
-	uint8_t bpc;
+	u8 video_pattern;
+	u16 hdisplay, vdisplay;
+	u8 bpc;
 };
 
 struct intel_dp_compliance {
@@ -1129,18 +1136,18 @@ struct intel_dp_compliance {
 
 struct intel_dp {
 	i915_reg_t output_reg;
-	uint32_t DP;
+	u32 DP;
 	int link_rate;
-	uint8_t lane_count;
-	uint8_t sink_count;
+	u8 lane_count;
+	u8 sink_count;
 	bool link_mst;
 	bool link_trained;
 	bool has_audio;
 	bool reset_link_params;
-	uint8_t dpcd[DP_RECEIVER_CAP_SIZE];
-	uint8_t psr_dpcd[EDP_PSR_RECEIVER_CAP_SIZE];
-	uint8_t downstream_ports[DP_MAX_DOWNSTREAM_PORTS];
-	uint8_t edp_dpcd[EDP_DISPLAY_CTL_CAP_SIZE];
+	u8 dpcd[DP_RECEIVER_CAP_SIZE];
+	u8 psr_dpcd[EDP_PSR_RECEIVER_CAP_SIZE];
+	u8 downstream_ports[DP_MAX_DOWNSTREAM_PORTS];
+	u8 edp_dpcd[EDP_DISPLAY_CTL_CAP_SIZE];
 	u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE];
 	u8 fec_capable;
 	/* source rates */
@@ -1160,7 +1167,7 @@ struct intel_dp {
 	/* sink or branch descriptor */
 	struct drm_dp_desc desc;
 	struct drm_dp_aux aux;
-	uint8_t train_set[4];
+	u8 train_set[4];
 	int panel_power_up_delay;
 	int panel_power_down_delay;
 	int panel_power_cycle_delay;
@@ -1202,14 +1209,13 @@ struct intel_dp {
 	struct intel_dp_mst_encoder *mst_encoders[I915_MAX_PIPES];
 	struct drm_dp_mst_topology_mgr mst_mgr;
 
-	uint32_t (*get_aux_clock_divider)(struct intel_dp *dp, int index);
+	u32 (*get_aux_clock_divider)(struct intel_dp *dp, int index);
 	/*
 	 * This function returns the value we have to program the AUX_CTL
 	 * register with to kick off an AUX transaction.
 	 */
-	uint32_t (*get_aux_send_ctl)(struct intel_dp *dp,
-				     int send_bytes,
-				     uint32_t aux_clock_divider);
+	u32 (*get_aux_send_ctl)(struct intel_dp *dp, int send_bytes,
+				u32 aux_clock_divider);
 
 	i915_reg_t (*aux_ch_ctl_reg)(struct intel_dp *dp);
 	i915_reg_t (*aux_ch_data_reg)(struct intel_dp *dp, int index);
@@ -1219,6 +1225,9 @@ struct intel_dp {
 
 	/* Displayport compliance testing */
 	struct intel_dp_compliance compliance;
+
+	/* Display stream compression testing */
+	bool force_dsc_en;
 };
 
 enum lspcon_vendor {
@@ -1240,10 +1249,11 @@ struct intel_digital_port {
 	struct intel_lspcon lspcon;
 	enum irqreturn (*hpd_pulse)(struct intel_digital_port *, bool);
 	bool release_cl2_override;
-	uint8_t max_lanes;
+	u8 max_lanes;
 	/* Used for DP and ICL+ TypeC/DP and TypeC/HDMI ports. */
 	enum aux_ch aux_ch;
 	enum intel_display_power_domain ddi_io_power_domain;
+	bool tc_legacy_port:1;
 	enum tc_port_type tc_type;
 
 	void (*write_infoframe)(struct intel_encoder *encoder,
@@ -1474,8 +1484,8 @@ void intel_check_cpu_fifo_underruns(struct drm_i915_private *dev_priv);
 void intel_check_pch_fifo_underruns(struct drm_i915_private *dev_priv);
 
 /* i915_irq.c */
-void gen5_enable_gt_irq(struct drm_i915_private *dev_priv, uint32_t mask);
-void gen5_disable_gt_irq(struct drm_i915_private *dev_priv, uint32_t mask);
+void gen5_enable_gt_irq(struct drm_i915_private *dev_priv, u32 mask);
+void gen5_disable_gt_irq(struct drm_i915_private *dev_priv, u32 mask);
 void gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask);
 void gen6_unmask_pm_irq(struct drm_i915_private *dev_priv, u32 mask);
 void gen11_reset_rps_interrupts(struct drm_i915_private *dev_priv);
@@ -1538,7 +1548,7 @@ void intel_ddi_set_vc_payload_alloc(const struct intel_crtc_state *crtc_state,
 void intel_ddi_compute_min_voltage_level(struct drm_i915_private *dev_priv,
 					 struct intel_crtc_state *crtc_state);
 u32 bxt_signal_levels(struct intel_dp *intel_dp);
-uint32_t ddi_signal_levels(struct intel_dp *intel_dp);
+u32 ddi_signal_levels(struct intel_dp *intel_dp);
 u8 intel_ddi_dp_voltage_max(struct intel_encoder *encoder);
 u8 intel_ddi_dp_pre_emphasis_max(struct intel_encoder *encoder,
 				 u8 voltage_swing);
@@ -1678,11 +1688,11 @@ void intel_cleanup_plane_fb(struct drm_plane *plane,
 int intel_plane_atomic_get_property(struct drm_plane *plane,
 				    const struct drm_plane_state *state,
 				    struct drm_property *property,
-				    uint64_t *val);
+				    u64 *val);
 int intel_plane_atomic_set_property(struct drm_plane *plane,
 				    struct drm_plane_state *state,
 				    struct drm_property *property,
-				    uint64_t val);
+				    u64 val);
 int intel_plane_atomic_calc_changes(const struct intel_crtc_state *old_crtc_state,
 				    struct drm_crtc_state *crtc_state,
 				    const struct intel_plane_state *old_plane_state,
@@ -1756,9 +1766,10 @@ static inline u32 intel_plane_ggtt_offset(const struct intel_plane_state *state)
 
 u32 glk_plane_color_ctl(const struct intel_crtc_state *crtc_state,
 			const struct intel_plane_state *plane_state);
+u32 glk_plane_color_ctl_crtc(const struct intel_crtc_state *crtc_state);
 u32 skl_plane_ctl(const struct intel_crtc_state *crtc_state,
 		  const struct intel_plane_state *plane_state);
-u32 glk_color_ctl(const struct intel_plane_state *plane_state);
+u32 skl_plane_ctl_crtc(const struct intel_crtc_state *crtc_state);
 u32 skl_plane_stride(const struct intel_plane_state *plane_state,
 		     int plane);
 int skl_check_plane_surface(struct intel_plane_state *plane_state);
@@ -1802,10 +1813,10 @@ bool intel_dp_init(struct drm_i915_private *dev_priv, i915_reg_t output_reg,
 bool intel_dp_init_connector(struct intel_digital_port *intel_dig_port,
 			     struct intel_connector *intel_connector);
 void intel_dp_set_link_params(struct intel_dp *intel_dp,
-			      int link_rate, uint8_t lane_count,
+			      int link_rate, u8 lane_count,
 			      bool link_mst);
 int intel_dp_get_link_train_fallback_values(struct intel_dp *intel_dp,
-					    int link_rate, uint8_t lane_count);
+					    int link_rate, u8 lane_count);
 void intel_dp_start_link_train(struct intel_dp *intel_dp);
 void intel_dp_stop_link_train(struct intel_dp *intel_dp);
 int intel_dp_retrain_link(struct intel_encoder *encoder,
@@ -1816,10 +1827,10 @@ void intel_dp_sink_set_decompression_state(struct intel_dp *intel_dp,
 					   bool enable);
 void intel_dp_encoder_reset(struct drm_encoder *encoder);
 void intel_dp_encoder_suspend(struct intel_encoder *intel_encoder);
-void intel_dp_encoder_destroy(struct drm_encoder *encoder);
-bool intel_dp_compute_config(struct intel_encoder *encoder,
-			     struct intel_crtc_state *pipe_config,
-			     struct drm_connector_state *conn_state);
+void intel_dp_encoder_flush_work(struct drm_encoder *encoder);
+int intel_dp_compute_config(struct intel_encoder *encoder,
+			    struct intel_crtc_state *pipe_config,
+			    struct drm_connector_state *conn_state);
 bool intel_dp_is_edp(struct intel_dp *intel_dp);
 bool intel_dp_is_port_edp(struct drm_i915_private *dev_priv, enum port port);
 enum irqreturn intel_dp_hpd_pulse(struct intel_digital_port *intel_dig_port,
@@ -1837,7 +1848,7 @@ int intel_dp_max_lane_count(struct intel_dp *intel_dp);
 int intel_dp_rate_select(struct intel_dp *intel_dp, int rate);
 void intel_dp_hot_plug(struct intel_encoder *intel_encoder);
 void intel_power_sequencer_reset(struct drm_i915_private *dev_priv);
-uint32_t intel_dp_pack_aux(const uint8_t *src, int src_bytes);
+u32 intel_dp_pack_aux(const u8 *src, int src_bytes);
 void intel_plane_destroy(struct drm_plane *plane);
 void intel_edp_drrs_enable(struct intel_dp *intel_dp,
 			   const struct intel_crtc_state *crtc_state);
@@ -1850,24 +1861,24 @@ void intel_edp_drrs_flush(struct drm_i915_private *dev_priv,
 
 void
 intel_dp_program_link_training_pattern(struct intel_dp *intel_dp,
-				       uint8_t dp_train_pat);
+				       u8 dp_train_pat);
 void
 intel_dp_set_signal_levels(struct intel_dp *intel_dp);
 void intel_dp_set_idle_link_train(struct intel_dp *intel_dp);
-uint8_t
+u8
 intel_dp_voltage_max(struct intel_dp *intel_dp);
-uint8_t
-intel_dp_pre_emphasis_max(struct intel_dp *intel_dp, uint8_t voltage_swing);
+u8
+intel_dp_pre_emphasis_max(struct intel_dp *intel_dp, u8 voltage_swing);
 void intel_dp_compute_rate(struct intel_dp *intel_dp, int port_clock,
-			   uint8_t *link_bw, uint8_t *rate_select);
+			   u8 *link_bw, u8 *rate_select);
 bool intel_dp_source_supports_hbr2(struct intel_dp *intel_dp);
 bool intel_dp_source_supports_hbr3(struct intel_dp *intel_dp);
 bool
-intel_dp_get_link_status(struct intel_dp *intel_dp, uint8_t link_status[DP_LINK_STATUS_SIZE]);
-uint16_t intel_dp_dsc_get_output_bpp(int link_clock, uint8_t lane_count,
-				     int mode_clock, int mode_hdisplay);
-uint8_t intel_dp_dsc_get_slice_count(struct intel_dp *intel_dp, int mode_clock,
-				     int mode_hdisplay);
+intel_dp_get_link_status(struct intel_dp *intel_dp, u8 link_status[DP_LINK_STATUS_SIZE]);
+u16 intel_dp_dsc_get_output_bpp(int link_clock, u8 lane_count,
+				int mode_clock, int mode_hdisplay);
+u8 intel_dp_dsc_get_slice_count(struct intel_dp *intel_dp, int mode_clock,
+				int mode_hdisplay);
 
 /* intel_vdsc.c */
 int intel_dp_compute_dsc_params(struct intel_dp *intel_dp,
@@ -1884,6 +1895,8 @@ bool intel_dp_read_dpcd(struct intel_dp *intel_dp);
 int intel_dp_link_required(int pixel_clock, int bpp);
 int intel_dp_max_data_rate(int max_link_clock, int max_lanes);
 bool intel_digital_port_connected(struct intel_encoder *encoder);
+void icl_tc_phy_disconnect(struct drm_i915_private *dev_priv,
+			   struct intel_digital_port *dig_port);
 
 /* intel_dp_aux_backlight.c */
 int intel_dp_aux_init_backlight_funcs(struct intel_connector *intel_connector);
@@ -1977,9 +1990,9 @@ void intel_hdmi_init(struct drm_i915_private *dev_priv, i915_reg_t hdmi_reg,
 void intel_hdmi_init_connector(struct intel_digital_port *intel_dig_port,
 			       struct intel_connector *intel_connector);
 struct intel_hdmi *enc_to_intel_hdmi(struct drm_encoder *encoder);
-bool intel_hdmi_compute_config(struct intel_encoder *encoder,
-			       struct intel_crtc_state *pipe_config,
-			       struct drm_connector_state *conn_state);
+int intel_hdmi_compute_config(struct intel_encoder *encoder,
+			      struct intel_crtc_state *pipe_config,
+			      struct drm_connector_state *conn_state);
 bool intel_hdmi_handle_sink_scrambling(struct intel_encoder *encoder,
 				       struct drm_connector *connector,
 				       bool high_tmds_clock_ratio,
@@ -2024,6 +2037,9 @@ int intel_panel_setup_backlight(struct drm_connector *connector,
 				enum pipe pipe);
 void intel_panel_enable_backlight(const struct intel_crtc_state *crtc_state,
 				  const struct drm_connector_state *conn_state);
+void intel_panel_update_backlight(struct intel_encoder *encoder,
+				  const struct intel_crtc_state *crtc_state,
+				  const struct drm_connector_state *conn_state);
 void intel_panel_disable_backlight(const struct drm_connector_state *old_conn_state);
 extern struct drm_display_mode *intel_find_panel_downclock(
 				struct drm_i915_private *dev_priv,
@@ -2085,6 +2101,7 @@ bool intel_psr_enabled(struct intel_dp *intel_dp);
 void intel_init_quirks(struct drm_i915_private *dev_priv);
 
 /* intel_runtime_pm.c */
+void intel_runtime_pm_init_early(struct drm_i915_private *dev_priv);
 int intel_power_domains_init(struct drm_i915_private *);
 void intel_power_domains_cleanup(struct drm_i915_private *dev_priv);
 void intel_power_domains_init_hw(struct drm_i915_private *dev_priv, bool resume);
@@ -2107,6 +2124,7 @@ void bxt_display_core_init(struct drm_i915_private *dev_priv, bool resume);
 void bxt_display_core_uninit(struct drm_i915_private *dev_priv);
 void intel_runtime_pm_enable(struct drm_i915_private *dev_priv);
 void intel_runtime_pm_disable(struct drm_i915_private *dev_priv);
+void intel_runtime_pm_cleanup(struct drm_i915_private *dev_priv);
 const char *
 intel_display_power_domain_str(enum intel_display_power_domain domain);
 
@@ -2114,33 +2132,42 @@ bool intel_display_power_is_enabled(struct drm_i915_private *dev_priv,
 				    enum intel_display_power_domain domain);
 bool __intel_display_power_is_enabled(struct drm_i915_private *dev_priv,
 				      enum intel_display_power_domain domain);
-void intel_display_power_get(struct drm_i915_private *dev_priv,
-			     enum intel_display_power_domain domain);
-bool intel_display_power_get_if_enabled(struct drm_i915_private *dev_priv,
+intel_wakeref_t intel_display_power_get(struct drm_i915_private *dev_priv,
 					enum intel_display_power_domain domain);
+intel_wakeref_t
+intel_display_power_get_if_enabled(struct drm_i915_private *dev_priv,
+				   enum intel_display_power_domain domain);
+void intel_display_power_put_unchecked(struct drm_i915_private *dev_priv,
+				       enum intel_display_power_domain domain);
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_RUNTIME_PM)
 void intel_display_power_put(struct drm_i915_private *dev_priv,
-			     enum intel_display_power_domain domain);
+			     enum intel_display_power_domain domain,
+			     intel_wakeref_t wakeref);
+#else
+#define intel_display_power_put(i915, domain, wakeref) \
+	intel_display_power_put_unchecked(i915, domain)
+#endif
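
Display power references are now cookie-tracked: the getters return an intel_wakeref_t which must be handed back on put, so the CONFIG_DRM_I915_DEBUG_RUNTIME_PM build can attribute leaks. A sketch of the converted call pattern (the power domain is picked purely for illustration):

	intel_wakeref_t wakeref;

	wakeref = intel_display_power_get_if_enabled(dev_priv,
						     POWER_DOMAIN_PIPE_A);
	if (!wakeref)
		return false;	/* power well is off, registers unreadable */

	/* ... safe to access PIPE_A hardware here ... */

	intel_display_power_put(dev_priv, POWER_DOMAIN_PIPE_A, wakeref);
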
 void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
 			    u8 req_slices);
 
 static inline void
-assert_rpm_device_not_suspended(struct drm_i915_private *dev_priv)
+assert_rpm_device_not_suspended(struct drm_i915_private *i915)
 {
-	WARN_ONCE(dev_priv->runtime_pm.suspended,
+	WARN_ONCE(i915->runtime_pm.suspended,
 		  "Device suspended during HW access\n");
 }
 
 static inline void
-assert_rpm_wakelock_held(struct drm_i915_private *dev_priv)
+assert_rpm_wakelock_held(struct drm_i915_private *i915)
 {
-	assert_rpm_device_not_suspended(dev_priv);
-	WARN_ONCE(!atomic_read(&dev_priv->runtime_pm.wakeref_count),
+	assert_rpm_device_not_suspended(i915);
+	WARN_ONCE(!atomic_read(&i915->runtime_pm.wakeref_count),
 		  "RPM wakelock ref not held during HW access");
 }
 
 /**
  * disable_rpm_wakeref_asserts - disable the RPM assert checks
- * @dev_priv: i915 device instance
+ * @i915: i915 device instance
  *
  * This function disables asserts that check if we hold an RPM wakelock
  * reference, while keeping the device-not-suspended checks still enabled.
@@ -2157,14 +2184,14 @@ assert_rpm_wakelock_held(struct drm_i915_private *dev_priv)
  * enable_rpm_wakeref_asserts().
  */
 static inline void
-disable_rpm_wakeref_asserts(struct drm_i915_private *dev_priv)
+disable_rpm_wakeref_asserts(struct drm_i915_private *i915)
 {
-	atomic_inc(&dev_priv->runtime_pm.wakeref_count);
+	atomic_inc(&i915->runtime_pm.wakeref_count);
 }
 
 /**
  * enable_rpm_wakeref_asserts - re-enable the RPM assert checks
- * @dev_priv: i915 device instance
+ * @i915: i915 device instance
  *
  * This function re-enables the RPM assert checks after disabling them with
  * disable_rpm_wakeref_asserts. It's meant to be used only in special
@@ -2174,15 +2201,39 @@ disable_rpm_wakeref_asserts(struct drm_i915_private *dev_priv)
  * disable_rpm_wakeref_asserts().
  */
 static inline void
-enable_rpm_wakeref_asserts(struct drm_i915_private *dev_priv)
+enable_rpm_wakeref_asserts(struct drm_i915_private *i915)
 {
-	atomic_dec(&dev_priv->runtime_pm.wakeref_count);
+	atomic_dec(&i915->runtime_pm.wakeref_count);
 }
 
-void intel_runtime_pm_get(struct drm_i915_private *dev_priv);
-bool intel_runtime_pm_get_if_in_use(struct drm_i915_private *dev_priv);
-void intel_runtime_pm_get_noresume(struct drm_i915_private *dev_priv);
-void intel_runtime_pm_put(struct drm_i915_private *dev_priv);
+intel_wakeref_t intel_runtime_pm_get(struct drm_i915_private *i915);
+intel_wakeref_t intel_runtime_pm_get_if_in_use(struct drm_i915_private *i915);
+intel_wakeref_t intel_runtime_pm_get_noresume(struct drm_i915_private *i915);
+
+#define with_intel_runtime_pm(i915, wf) \
+	for ((wf) = intel_runtime_pm_get(i915); (wf); \
+	     intel_runtime_pm_put((i915), (wf)), (wf) = 0)
+
+#define with_intel_runtime_pm_if_in_use(i915, wf) \
+	for ((wf) = intel_runtime_pm_get_if_in_use(i915); (wf); \
+	     intel_runtime_pm_put((i915), (wf)), (wf) = 0)
+
+void intel_runtime_pm_put_unchecked(struct drm_i915_private *i915);
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_RUNTIME_PM)
+void intel_runtime_pm_put(struct drm_i915_private *i915, intel_wakeref_t wref);
+#else
+#define intel_runtime_pm_put(i915, wref) intel_runtime_pm_put_unchecked(i915)
+#endif
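
The with_intel_runtime_pm() helpers above fold the get/put pair into a for-loop scope so the cookie cannot escape; a usage sketch (the register write is illustrative only):

	intel_wakeref_t wakeref;

	with_intel_runtime_pm(dev_priv, wakeref)
		I915_WRITE(GEN6_RC_STATE, 0);	/* body runs once, put on exit */
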
+
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_RUNTIME_PM)
+void print_intel_runtime_pm_wakeref(struct drm_i915_private *i915,
+				    struct drm_printer *p);
+#else
+static inline void print_intel_runtime_pm_wakeref(struct drm_i915_private *i915,
+						  struct drm_printer *p)
+{
+}
+#endif
 
 void chv_phy_powergate_lanes(struct intel_encoder *encoder,
 			     bool override, unsigned int mask);
@@ -2210,16 +2261,16 @@ void gen6_rps_busy(struct drm_i915_private *dev_priv);
 void gen6_rps_reset_ei(struct drm_i915_private *dev_priv);
 void gen6_rps_idle(struct drm_i915_private *dev_priv);
 void gen6_rps_boost(struct i915_request *rq, struct intel_rps_client *rps);
-void g4x_wm_get_hw_state(struct drm_device *dev);
-void vlv_wm_get_hw_state(struct drm_device *dev);
-void ilk_wm_get_hw_state(struct drm_device *dev);
-void skl_wm_get_hw_state(struct drm_device *dev);
+void g4x_wm_get_hw_state(struct drm_i915_private *dev_priv);
+void vlv_wm_get_hw_state(struct drm_i915_private *dev_priv);
+void ilk_wm_get_hw_state(struct drm_i915_private *dev_priv);
+void skl_wm_get_hw_state(struct drm_i915_private *dev_priv);
 void skl_pipe_ddb_get_hw_state(struct intel_crtc *crtc,
 			       struct skl_ddb_entry *ddb_y,
 			       struct skl_ddb_entry *ddb_uv);
 void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
 			  struct skl_ddb_allocation *ddb /* out */);
-void skl_pipe_wm_get_hw_state(struct drm_crtc *crtc,
+void skl_pipe_wm_get_hw_state(struct intel_crtc *crtc,
 			      struct skl_pipe_wm *out);
 void g4x_wm_sanitize(struct drm_i915_private *dev_priv);
 void vlv_wm_sanitize(struct drm_i915_private *dev_priv);
@@ -2288,11 +2339,11 @@ void intel_tv_init(struct drm_i915_private *dev_priv);
 int intel_digital_connector_atomic_get_property(struct drm_connector *connector,
 						const struct drm_connector_state *state,
 						struct drm_property *property,
-						uint64_t *val);
+						u64 *val);
 int intel_digital_connector_atomic_set_property(struct drm_connector *connector,
 						struct drm_connector_state *state,
 						struct drm_property *property,
-						uint64_t val);
+						u64 val);
 int intel_digital_connector_atomic_check(struct drm_connector *conn,
 					 struct drm_connector_state *new_state);
 struct drm_connector_state *
@@ -2337,10 +2388,10 @@ int intel_plane_atomic_check_with_state(const struct intel_crtc_state *old_crtc_
 					struct intel_plane_state *intel_state);
 
 /* intel_color.c */
-void intel_color_init(struct drm_crtc *crtc);
-int intel_color_check(struct drm_crtc *crtc, struct drm_crtc_state *state);
-void intel_color_set_csc(struct drm_crtc_state *crtc_state);
-void intel_color_load_luts(struct drm_crtc_state *crtc_state);
+void intel_color_init(struct intel_crtc *crtc);
+int intel_color_check(struct intel_crtc_state *crtc_state);
+void intel_color_commit(const struct intel_crtc_state *crtc_state);
+void intel_color_load_luts(const struct intel_crtc_state *crtc_state);
 
 /* intel_lspcon.c */
 bool lspcon_init(struct intel_digital_port *intel_dig_port);
diff --git a/drivers/gpu/drm/i915/intel_dsi.h b/drivers/gpu/drm/i915/intel_dsi.h
index d968f1f13e09..a9a19778dc7f 100644
--- a/drivers/gpu/drm/i915/intel_dsi.h
+++ b/drivers/gpu/drm/i915/intel_dsi.h
@@ -24,7 +24,6 @@
 #ifndef _INTEL_DSI_H
 #define _INTEL_DSI_H
 
-#include <drm/drmP.h>
 #include <drm/drm_crtc.h>
 #include <drm/drm_mipi_dsi.h>
 #include "intel_drv.h"
@@ -40,6 +39,7 @@ struct intel_dsi {
 	struct intel_encoder base;
 
 	struct intel_dsi_host *dsi_hosts[I915_MAX_PORTS];
+	intel_wakeref_t io_wakeref[I915_MAX_PORTS];
 
 	/* GPIO Desc for CRC based Panel control */
 	struct gpio_desc *gpio_panel;
@@ -173,7 +173,7 @@ int vlv_dsi_pll_compute(struct intel_encoder *encoder,
 void vlv_dsi_pll_enable(struct intel_encoder *encoder,
 			const struct intel_crtc_state *config);
 void vlv_dsi_pll_disable(struct intel_encoder *encoder);
-u32 vlv_dsi_get_pclk(struct intel_encoder *encoder, int pipe_bpp,
+u32 vlv_dsi_get_pclk(struct intel_encoder *encoder,
 		     struct intel_crtc_state *config);
 void vlv_dsi_reset_clocks(struct intel_encoder *encoder, enum port port);
 
@@ -183,7 +183,7 @@ int bxt_dsi_pll_compute(struct intel_encoder *encoder,
 void bxt_dsi_pll_enable(struct intel_encoder *encoder,
 			const struct intel_crtc_state *config);
 void bxt_dsi_pll_disable(struct intel_encoder *encoder);
-u32 bxt_dsi_get_pclk(struct intel_encoder *encoder, int pipe_bpp,
+u32 bxt_dsi_get_pclk(struct intel_encoder *encoder,
 		     struct intel_crtc_state *config);
 void bxt_dsi_reset_clocks(struct intel_encoder *encoder, enum port port);
 
diff --git a/drivers/gpu/drm/i915/intel_dsi_vbt.c b/drivers/gpu/drm/i915/intel_dsi_vbt.c
index a1a8b3790e61..06a11c35a784 100644
--- a/drivers/gpu/drm/i915/intel_dsi_vbt.c
+++ b/drivers/gpu/drm/i915/intel_dsi_vbt.c
@@ -24,15 +24,15 @@
  *
  */
 
-#include <drm/drmP.h>
 #include <drm/drm_crtc.h>
 #include <drm/drm_edid.h>
 #include <drm/i915_drm.h>
 #include <linux/gpio/consumer.h>
+#include <linux/mfd/intel_soc_pmic.h>
 #include <linux/slab.h>
 #include <video/mipi_display.h>
 #include <asm/intel-mid.h>
-#include <video/mipi_display.h>
+#include <asm/unaligned.h>
 #include "i915_drv.h"
 #include "intel_drv.h"
 #include "intel_dsi.h"
@@ -393,7 +393,25 @@ static const u8 *mipi_exec_spi(struct intel_dsi *intel_dsi, const u8 *data)
 
 static const u8 *mipi_exec_pmic(struct intel_dsi *intel_dsi, const u8 *data)
 {
-	DRM_DEBUG_KMS("Skipping PMIC element execution\n");
+#ifdef CONFIG_PMIC_OPREGION
+	u32 value, mask, reg_address;
+	u16 i2c_address;
+	int ret;
+
+	/* byte 0 aka PMIC Flag is reserved */
+	i2c_address	= get_unaligned_le16(data + 1);
+	reg_address	= get_unaligned_le32(data + 3);
+	value		= get_unaligned_le32(data + 7);
+	mask		= get_unaligned_le32(data + 11);
+
+	ret = intel_soc_pmic_exec_mipi_pmic_seq_element(i2c_address,
+							reg_address,
+							value, mask);
+	if (ret)
+		DRM_ERROR("%s failed, error: %d\n", __func__, ret);
+#else
+	DRM_ERROR("Your hardware requires CONFIG_PMIC_OPREGION and it is not set\n");
+#endif
 
 	return data + 15;
 }
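
The fixed "return data + 15" encodes the element layout the new parser assumes; a hypothetical packed view matching the get_unaligned_* offsets above (no such struct exists in the driver):

struct mipi_pmic_element {
	u8     flags;		/* byte 0: reserved "PMIC Flag" */
	__le16 i2c_address;	/* bytes 1-2 */
	__le32 reg_address;	/* bytes 3-6 */
	__le32 value;		/* bytes 7-10 */
	__le32 mask;		/* bytes 11-14, 15 bytes total */
} __packed;
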
diff --git a/drivers/gpu/drm/i915/intel_dvo.c b/drivers/gpu/drm/i915/intel_dvo.c
index 0042a7f69387..a6c82482a841 100644
--- a/drivers/gpu/drm/i915/intel_dvo.c
+++ b/drivers/gpu/drm/i915/intel_dvo.c
@@ -26,7 +26,6 @@
  */
 #include <linux/i2c.h>
 #include <linux/slab.h>
-#include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
 #include "intel_drv.h"
@@ -235,9 +234,9 @@ intel_dvo_mode_valid(struct drm_connector *connector,
 	return intel_dvo->dev.dev_ops->mode_valid(&intel_dvo->dev, mode);
 }
 
-static bool intel_dvo_compute_config(struct intel_encoder *encoder,
-				     struct intel_crtc_state *pipe_config,
-				     struct drm_connector_state *conn_state)
+static int intel_dvo_compute_config(struct intel_encoder *encoder,
+				    struct intel_crtc_state *pipe_config,
+				    struct drm_connector_state *conn_state)
 {
 	struct intel_dvo *intel_dvo = enc_to_dvo(encoder);
 	const struct drm_display_mode *fixed_mode =
@@ -254,10 +253,11 @@ static bool intel_dvo_compute_config(struct intel_encoder *encoder,
 		intel_fixed_panel_mode(fixed_mode, adjusted_mode);
 
 	if (adjusted_mode->flags & DRM_MODE_FLAG_DBLSCAN)
-		return false;
+		return -EINVAL;
 
 	pipe_config->output_format = INTEL_OUTPUT_FORMAT_RGB;
-	return true;
+
+	return 0;
 }
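
With the bool-to-int conversion, compute_config hooks report failure as a negative errno rather than a bare false, so callers can distinguish hard failures from retryable ones. A hedged sketch of the caller side, which is outside this hunk:

	ret = encoder->compute_config(encoder, pipe_config, conn_state);
	if (ret < 0) {
		DRM_DEBUG_KMS("encoder config failure: %d\n", ret);
		return ret;	/* e.g. -EINVAL, or -EDEADLK to retry */
	}
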
 
 static void intel_dvo_pre_enable(struct intel_encoder *encoder,
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index af2873403009..49fa43ff02ba 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -25,6 +25,7 @@
 #include <drm/drm_print.h>
 
 #include "i915_drv.h"
+#include "i915_reset.h"
 #include "intel_ringbuffer.h"
 #include "intel_lrc.h"
 
@@ -261,6 +262,31 @@ static void __sprint_engine_name(char *name, const struct engine_info *info)
 			 info->instance) >= INTEL_ENGINE_CS_MAX_NAME);
 }
 
+void intel_engine_set_hwsp_writemask(struct intel_engine_cs *engine, u32 mask)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	i915_reg_t hwstam;
+
+	/*
+	 * Though they added more rings on g4x/ilk, they did not add
+	 * per-engine HWSTAM until gen6.
+	 */
+	if (INTEL_GEN(dev_priv) < 6 && engine->class != RENDER_CLASS)
+		return;
+
+	hwstam = RING_HWSTAM(engine->mmio_base);
+	if (INTEL_GEN(dev_priv) >= 3)
+		I915_WRITE(hwstam, mask);
+	else
+		I915_WRITE16(hwstam, mask);
+}
+
+static void intel_engine_sanitize_mmio(struct intel_engine_cs *engine)
+{
+	/* Mask off all writes into the unknown HWSP */
+	intel_engine_set_hwsp_writemask(engine, ~0u);
+}
+
 static int
 intel_engine_setup(struct drm_i915_private *dev_priv,
 		   enum intel_engine_id id)
@@ -312,6 +338,9 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
 
 	ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
 
+	/* Scrub mmio state on takeover */
+	intel_engine_sanitize_mmio(engine);
+
 	dev_priv->engine_class[info->class][info->instance] = engine;
 	dev_priv->engine[id] = engine;
 	return 0;
@@ -365,7 +394,7 @@ int intel_engines_init_mmio(struct drm_i915_private *dev_priv)
 		goto cleanup;
 	}
 
-	device_info->num_rings = hweight32(mask);
+	RUNTIME_INFO(dev_priv)->num_rings = hweight32(mask);
 
 	i915_check_and_clear_faults(dev_priv);
 
@@ -426,33 +455,9 @@ cleanup:
 	return err;
 }
 
-void intel_engine_init_global_seqno(struct intel_engine_cs *engine, u32 seqno)
+void intel_engine_write_global_seqno(struct intel_engine_cs *engine, u32 seqno)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
-
-	/* Our semaphore implementation is strictly monotonic (i.e. we proceed
-	 * so long as the semaphore value in the register/page is greater
-	 * than the sync value), so whenever we reset the seqno,
-	 * so long as we reset the tracking semaphore value to 0, it will
-	 * always be before the next request's seqno. If we don't reset
-	 * the semaphore value, then when the seqno moves backwards all
-	 * future waits will complete instantly (causing rendering corruption).
-	 */
-	if (IS_GEN6(dev_priv) || IS_GEN7(dev_priv)) {
-		I915_WRITE(RING_SYNC_0(engine->mmio_base), 0);
-		I915_WRITE(RING_SYNC_1(engine->mmio_base), 0);
-		if (HAS_VEBOX(dev_priv))
-			I915_WRITE(RING_SYNC_2(engine->mmio_base), 0);
-	}
-
 	intel_write_status_page(engine, I915_GEM_HWS_INDEX, seqno);
-	clear_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
-
-	/* After manually advancing the seqno, fake the interrupt in case
-	 * there are any waiters for that seqno.
-	 */
-	intel_engine_wakeup(engine);
-
 	GEM_BUG_ON(intel_engine_get_seqno(engine) != seqno);
 }
 
@@ -469,50 +474,67 @@ static void intel_engine_init_execlist(struct intel_engine_cs *engine)
 	GEM_BUG_ON(!is_power_of_2(execlists_num_ports(execlists)));
 	GEM_BUG_ON(execlists_num_ports(execlists) > EXECLIST_MAX_PORTS);
 
-	execlists->queue_priority = INT_MIN;
+	execlists->queue_priority_hint = INT_MIN;
 	execlists->queue = RB_ROOT_CACHED;
 }
 
-/**
- * intel_engines_setup_common - setup engine state not requiring hw access
- * @engine: Engine to setup.
- *
- * Initializes @engine@ structure members shared between legacy and execlists
- * submission modes which do not require hardware access.
- *
- * Typically done early in the submission mode specific engine setup stage.
- */
-void intel_engine_setup_common(struct intel_engine_cs *engine)
+static void cleanup_status_page(struct intel_engine_cs *engine)
 {
-	i915_timeline_init(engine->i915, &engine->timeline, engine->name);
-	i915_timeline_set_subclass(&engine->timeline, TIMELINE_ENGINE);
+	struct i915_vma *vma;
 
-	intel_engine_init_execlist(engine);
-	intel_engine_init_hangcheck(engine);
-	intel_engine_init_batch_pool(engine);
-	intel_engine_init_cmd_parser(engine);
+	/* Prevent writes into HWSP after returning the page to the system */
+	intel_engine_set_hwsp_writemask(engine, ~0u);
+
+	vma = fetch_and_zero(&engine->status_page.vma);
+	if (!vma)
+		return;
+
+	if (!HWS_NEEDS_PHYSICAL(engine->i915))
+		i915_vma_unpin(vma);
+
+	i915_gem_object_unpin_map(vma->obj);
+	__i915_gem_object_release_unless_active(vma->obj);
 }
 
-static void cleanup_status_page(struct intel_engine_cs *engine)
+static int pin_ggtt_status_page(struct intel_engine_cs *engine,
+				struct i915_vma *vma)
 {
-	if (HWS_NEEDS_PHYSICAL(engine->i915)) {
-		void *addr = fetch_and_zero(&engine->status_page.page_addr);
+	unsigned int flags;
 
-		__free_page(virt_to_page(addr));
-	}
+	flags = PIN_GLOBAL;
+	if (!HAS_LLC(engine->i915))
+		/*
+		 * On g33, we cannot place HWS above 256MiB, so
+		 * restrict its pinning to the low mappable arena.
+		 * Though this restriction is not documented for
+		 * gen4, gen5, or byt, they also behave similarly
+		 * and hang if the HWS is placed at the top of the
+		 * GTT. To generalise, it appears that all !llc
+		 * platforms have issues with us placing the HWS
+		 * above the mappable region (even though we never
+		 * actually map it).
+		 */
+		flags |= PIN_MAPPABLE;
+	else
+		flags |= PIN_HIGH;
 
-	i915_vma_unpin_and_release(&engine->status_page.vma,
-				   I915_VMA_RELEASE_MAP);
+	return i915_vma_pin(vma, 0, 0, flags);
 }
 
 static int init_status_page(struct intel_engine_cs *engine)
 {
 	struct drm_i915_gem_object *obj;
 	struct i915_vma *vma;
-	unsigned int flags;
 	void *vaddr;
 	int ret;
 
+	/*
+	 * Though the HWS register does support 36bit addresses, historically
+	 * we have had hangs and corruption reported due to wild writes if
+	 * the HWS is placed above 4G. We only allow objects to be allocated
+	 * in GFP_DMA32 for i965, and no earlier physical address users had
+	 * access to more than 4G.
+	 */
 	obj = i915_gem_object_create_internal(engine->i915, PAGE_SIZE);
 	if (IS_ERR(obj)) {
 		DRM_ERROR("Failed to allocate status page\n");
@@ -529,59 +551,67 @@ static int init_status_page(struct intel_engine_cs *engine)
 		goto err;
 	}
 
-	flags = PIN_GLOBAL;
-	if (!HAS_LLC(engine->i915))
-		/* On g33, we cannot place HWS above 256MiB, so
-		 * restrict its pinning to the low mappable arena.
-		 * Though this restriction is not documented for
-		 * gen4, gen5, or byt, they also behave similarly
-		 * and hang if the HWS is placed at the top of the
-		 * GTT. To generalise, it appears that all !llc
-		 * platforms have issues with us placing the HWS
-		 * above the mappable region (even though we never
-		 * actually map it).
-		 */
-		flags |= PIN_MAPPABLE;
-	else
-		flags |= PIN_HIGH;
-	ret = i915_vma_pin(vma, 0, 0, flags);
-	if (ret)
-		goto err;
-
 	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
 	if (IS_ERR(vaddr)) {
 		ret = PTR_ERR(vaddr);
-		goto err_unpin;
+		goto err;
 	}
 
+	engine->status_page.addr = memset(vaddr, 0, PAGE_SIZE);
 	engine->status_page.vma = vma;
-	engine->status_page.ggtt_offset = i915_ggtt_offset(vma);
-	engine->status_page.page_addr = memset(vaddr, 0, PAGE_SIZE);
+
+	if (!HWS_NEEDS_PHYSICAL(engine->i915)) {
+		ret = pin_ggtt_status_page(engine, vma);
+		if (ret)
+			goto err_unpin;
+	}
+
 	return 0;
 
 err_unpin:
-	i915_vma_unpin(vma);
+	i915_gem_object_unpin_map(obj);
 err:
 	i915_gem_object_put(obj);
 	return ret;
 }
 
-static int init_phys_status_page(struct intel_engine_cs *engine)
+/**
+ * intel_engine_setup_common - setup engine state not requiring hw access
+ * @engine: Engine to setup.
+ *
+ * Initializes @engine structure members shared between legacy and execlists
+ * submission modes which do not require hardware access.
+ *
+ * Typically done early in the submission mode specific engine setup stage.
+ */
+int intel_engine_setup_common(struct intel_engine_cs *engine)
 {
-	struct page *page;
+	int err;
 
-	/*
-	 * Though the HWS register does support 36bit addresses, historically
-	 * we have had hangs and corruption reported due to wild writes if
-	 * the HWS is placed above 4G.
-	 */
-	page = alloc_page(GFP_KERNEL | __GFP_DMA32 | __GFP_ZERO);
-	if (!page)
-		return -ENOMEM;
+	err = init_status_page(engine);
+	if (err)
+		return err;
+
+	err = i915_timeline_init(engine->i915,
+				 &engine->timeline,
+				 engine->name,
+				 engine->status_page.vma);
+	if (err)
+		goto err_hwsp;
 
-	engine->status_page.page_addr = page_address(page);
+	i915_timeline_set_subclass(&engine->timeline, TIMELINE_ENGINE);
+
+	intel_engine_init_breadcrumbs(engine);
+	intel_engine_init_execlist(engine);
+	intel_engine_init_hangcheck(engine);
+	intel_engine_init_batch_pool(engine);
+	intel_engine_init_cmd_parser(engine);
 
 	return 0;
+
+err_hwsp:
+	cleanup_status_page(engine);
+	return err;
 }
 
 static void __intel_context_unpin(struct i915_gem_context *ctx,
@@ -590,6 +620,56 @@ static void __intel_context_unpin(struct i915_gem_context *ctx,
 	intel_context_unpin(to_intel_context(ctx, engine));
 }
 
+struct measure_breadcrumb {
+	struct i915_request rq;
+	struct i915_timeline timeline;
+	struct intel_ring ring;
+	u32 cs[1024];
+};
+
+static int measure_breadcrumb_dw(struct intel_engine_cs *engine)
+{
+	struct measure_breadcrumb *frame;
+	int dw = -ENOMEM;
+
+	GEM_BUG_ON(!engine->i915->gt.scratch);
+
+	frame = kzalloc(sizeof(*frame), GFP_KERNEL);
+	if (!frame)
+		return -ENOMEM;
+
+	if (i915_timeline_init(engine->i915,
+			       &frame->timeline, "measure",
+			       engine->status_page.vma))
+		goto out_frame;
+
+	INIT_LIST_HEAD(&frame->ring.request_list);
+	frame->ring.timeline = &frame->timeline;
+	frame->ring.vaddr = frame->cs;
+	frame->ring.size = sizeof(frame->cs);
+	frame->ring.effective_size = frame->ring.size;
+	intel_ring_update_space(&frame->ring);
+
+	frame->rq.i915 = engine->i915;
+	frame->rq.engine = engine;
+	frame->rq.ring = &frame->ring;
+	frame->rq.timeline = &frame->timeline;
+
+	dw = i915_timeline_pin(&frame->timeline);
+	if (dw < 0)
+		goto out_timeline;
+
+	dw = engine->emit_fini_breadcrumb(&frame->rq, frame->cs) - frame->cs;
+
+	i915_timeline_unpin(&frame->timeline);
+
+out_timeline:
+	i915_timeline_fini(&frame->timeline);
+out_frame:
+	kfree(frame);
+	return dw;
+}
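
measure_breadcrumb_dw() dry-runs emit_fini_breadcrumb against a stack-built dummy request purely to count the dwords it emits; the count is stashed below as engine->emit_fini_breadcrumb_dw. One assumed consumer (not part of this hunk) is the ring-space reservation made when a request is allocated:

	/* Assumption: keep enough ring space in reserve that the final
	 * breadcrumb can always be emitted, even on a nearly full ring. */
	rq->reserved_space = engine->emit_fini_breadcrumb_dw * sizeof(u32);
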
+
 /**
  * intel_engine_init_common - initialize engine state which might require hw access
  * @engine: Engine to initialize.
@@ -632,21 +712,14 @@ int intel_engine_init_common(struct intel_engine_cs *engine)
 		}
 	}
 
-	ret = intel_engine_init_breadcrumbs(engine);
-	if (ret)
+	ret = measure_breadcrumb_dw(engine);
+	if (ret < 0)
 		goto err_unpin_preempt;
 
-	if (HWS_NEEDS_PHYSICAL(i915))
-		ret = init_phys_status_page(engine);
-	else
-		ret = init_status_page(engine);
-	if (ret)
-		goto err_breadcrumbs;
+	engine->emit_fini_breadcrumb_dw = ret;
 
 	return 0;
 
-err_breadcrumbs:
-	intel_engine_fini_breadcrumbs(engine);
 err_unpin_preempt:
 	if (i915->preempt_context)
 		__intel_context_unpin(i915->preempt_context, engine);
@@ -769,12 +842,12 @@ const char *i915_cache_level_str(struct drm_i915_private *i915, int type)
 
 u32 intel_calculate_mcr_s_ss_select(struct drm_i915_private *dev_priv)
 {
-	const struct sseu_dev_info *sseu = &(INTEL_INFO(dev_priv)->sseu);
+	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	u32 mcr_s_ss_select;
 	u32 slice = fls(sseu->slice_mask);
 	u32 subslice = fls(sseu->subslice_mask[slice]);
 
-	if (IS_GEN10(dev_priv))
+	if (IS_GEN(dev_priv, 10))
 		mcr_s_ss_select = GEN8_MCR_SLICE(slice) |
 				  GEN8_MCR_SUBSLICE(subslice);
 	else if (INTEL_GEN(dev_priv) >= 11)
@@ -786,15 +859,15 @@ u32 intel_calculate_mcr_s_ss_select(struct drm_i915_private *dev_priv)
 	return mcr_s_ss_select;
 }
 
-static inline uint32_t
+static inline u32
 read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
 		  int subslice, i915_reg_t reg)
 {
-	uint32_t mcr_slice_subslice_mask;
-	uint32_t mcr_slice_subslice_select;
-	uint32_t default_mcr_s_ss_select;
-	uint32_t mcr;
-	uint32_t ret;
+	u32 mcr_slice_subslice_mask;
+	u32 mcr_slice_subslice_select;
+	u32 default_mcr_s_ss_select;
+	u32 mcr;
+	u32 ret;
 	enum forcewake_domains fw_domains;
 
 	if (INTEL_GEN(dev_priv) >= 11) {
@@ -900,10 +973,15 @@ void intel_engine_get_instdone(struct intel_engine_cs *engine,
 static bool ring_is_idle(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
+	intel_wakeref_t wakeref;
 	bool idle = true;
 
+	if (I915_SELFTEST_ONLY(!engine->mmio_base))
+		return true;
+
 	/* If the whole device is asleep, the engine must be idle */
-	if (!intel_runtime_pm_get_if_in_use(dev_priv))
+	wakeref = intel_runtime_pm_get_if_in_use(dev_priv);
+	if (!wakeref)
 		return true;
 
 	/* First check that no commands are left in the ring */
@@ -915,7 +993,7 @@ static bool ring_is_idle(struct intel_engine_cs *engine)
 	if (INTEL_GEN(dev_priv) > 2 && !(I915_READ_MODE(engine) & MODE_IDLE))
 		idle = false;
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 
 	return idle;
 }
@@ -939,9 +1017,6 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 	if (!intel_engine_signaled(engine, intel_engine_last_submit(engine)))
 		return false;
 
-	if (I915_SELFTEST_ONLY(engine->breadcrumbs.mock))
-		return true;
-
 	/* Waiting to drain ELSP? */
 	if (READ_ONCE(engine->execlists.active)) {
 		struct tasklet_struct *t = &engine->execlists.tasklet;
@@ -967,10 +1042,7 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 		return false;
 
 	/* Ring stopped? */
-	if (!ring_is_idle(engine))
-		return false;
-
-	return true;
+	return ring_is_idle(engine);
 }
 
 bool intel_engines_are_idle(struct drm_i915_private *dev_priv)
@@ -1014,7 +1086,7 @@ bool intel_engine_has_kernel_context(const struct intel_engine_cs *engine)
 	 * the last request that remains in the timeline. When idle, it is
 	 * the last executed context as tracked by retirement.
 	 */
-	rq = __i915_gem_active_peek(&engine->timeline.last_request);
+	rq = __i915_active_request_peek(&engine->timeline.last_request);
 	if (rq)
 		return rq->hw_context == kernel_context;
 	else
@@ -1030,26 +1102,36 @@ void intel_engines_reset_default_submission(struct drm_i915_private *i915)
 		engine->set_default_submission(engine);
 }
 
+static bool reset_engines(struct drm_i915_private *i915)
+{
+	if (INTEL_INFO(i915)->gpu_reset_clobbers_display)
+		return false;
+
+	return intel_gpu_reset(i915, ALL_ENGINES) == 0;
+}
+
 /**
  * intel_engines_sanitize: called after the GPU has lost power
  * @i915: the i915 device
+ * @force: ignore a failed reset and sanitize engine state anyway
  *
  * Anytime we reset the GPU, either with an explicit GPU reset or through a
  * PCI power cycle, the GPU loses state and we must reset our state tracking
  * to match. Note that calling intel_engines_sanitize() if the GPU has not
  * been reset results in much confusion!
  */
-void intel_engines_sanitize(struct drm_i915_private *i915)
+void intel_engines_sanitize(struct drm_i915_private *i915, bool force)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 
 	GEM_TRACE("\n");
 
-	for_each_engine(engine, i915, id) {
-		if (engine->reset.reset)
-			engine->reset.reset(engine, NULL);
-	}
+	if (!reset_engines(i915) && !force)
+		return;
+
+	for_each_engine(engine, i915, id)
+		intel_engine_reset(engine, false);
 }
 
 /**
@@ -1085,7 +1167,7 @@ void intel_engines_park(struct drm_i915_private *i915)
 		}
 
 		/* Must be reset upon idling, or we may miss the busy wakeup. */
-		GEM_BUG_ON(engine->execlists.queue_priority != INT_MIN);
+		GEM_BUG_ON(engine->execlists.queue_priority_hint != INT_MIN);
 
 		if (engine->park)
 			engine->park(engine);
@@ -1201,10 +1283,14 @@ static void print_request(struct drm_printer *m,
 
 	x = print_sched_attr(rq->i915, &rq->sched.attr, buf, x, sizeof(buf));
 
-	drm_printf(m, "%s%x%s [%llx:%x]%s @ %dms: %s\n",
+	drm_printf(m, "%s%x%s%s [%llx:%llx]%s @ %dms: %s\n",
 		   prefix,
 		   rq->global_seqno,
-		   i915_request_completed(rq) ? "!" : "",
+		   i915_request_completed(rq) ? "!" :
+		   i915_request_started(rq) ? "*" :
+		   "",
+		   test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+			    &rq->fence.flags) ? "+" : "",
 		   rq->fence.context, rq->fence.seqno,
 		   buf,
 		   jiffies_to_msecs(jiffies - rq->emitted_jiffies),
@@ -1248,7 +1334,7 @@ static void intel_engine_print_registers(const struct intel_engine_cs *engine,
 		&engine->execlists;
 	u64 addr;
 
-	if (engine->id == RCS && IS_GEN(dev_priv, 4, 7))
+	if (engine->id == RCS && IS_GEN_RANGE(dev_priv, 4, 7))
 		drm_printf(m, "\tCCID: 0x%08x\n", I915_READ(CCID));
 	drm_printf(m, "\tRING_START: 0x%08x\n",
 		   I915_READ(RING_START(engine->mmio_base)));
@@ -1269,16 +1355,6 @@ static void intel_engine_print_registers(const struct intel_engine_cs *engine,
 		drm_printf(m, "\tRING_IMR: %08x\n", I915_READ_IMR(engine));
 	}
 
-	if (HAS_LEGACY_SEMAPHORES(dev_priv)) {
-		drm_printf(m, "\tSYNC_0: 0x%08x\n",
-			   I915_READ(RING_SYNC_0(engine->mmio_base)));
-		drm_printf(m, "\tSYNC_1: 0x%08x\n",
-			   I915_READ(RING_SYNC_1(engine->mmio_base)));
-		if (HAS_VEBOX(dev_priv))
-			drm_printf(m, "\tSYNC_2: 0x%08x\n",
-				   I915_READ(RING_SYNC_2(engine->mmio_base)));
-	}
-
 	addr = intel_engine_get_active_head(engine);
 	drm_printf(m, "\tACTHD:  0x%08x_%08x\n",
 		   upper_32_bits(addr), lower_32_bits(addr));
@@ -1305,7 +1381,8 @@ static void intel_engine_print_registers(const struct intel_engine_cs *engine,
 	}
 
 	if (HAS_EXECLISTS(dev_priv)) {
-		const u32 *hws = &engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
+		const u32 *hws =
+			&engine->status_page.addr[I915_HWS_CSB_BUF0_INDEX];
 		unsigned int idx;
 		u8 read, write;
 
@@ -1348,9 +1425,10 @@ static void intel_engine_print_registers(const struct intel_engine_cs *engine,
 				char hdr[80];
 
 				snprintf(hdr, sizeof(hdr),
-					 "\t\tELSP[%d] count=%d, ring->start=%08x, rq: ",
+					 "\t\tELSP[%d] count=%d, ring:{start:%08x, hwsp:%08x}, rq: ",
 					 idx, count,
-					 i915_ggtt_offset(rq->ring->vma));
+					 i915_ggtt_offset(rq->ring->vma),
+					 rq->timeline->hwsp_offset);
 				print_request(m, rq, hdr);
 			} else {
 				drm_printf(m, "\t\tELSP[%d] idle\n", idx);
@@ -1405,14 +1483,9 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 		       struct drm_printer *m,
 		       const char *header, ...)
 {
-	const int MAX_REQUESTS_TO_SHOW = 8;
-	struct intel_breadcrumbs * const b = &engine->breadcrumbs;
-	const struct intel_engine_execlists * const execlists = &engine->execlists;
 	struct i915_gpu_error * const error = &engine->i915->gpu_error;
-	struct i915_request *rq, *last;
-	unsigned long flags;
-	struct rb_node *rb;
-	int count;
+	struct i915_request *rq;
+	intel_wakeref_t wakeref;
 
 	if (header) {
 		va_list ap;
@@ -1462,85 +1535,30 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 			   rq->ring->emit);
 		drm_printf(m, "\t\tring->space:  0x%08x\n",
 			   rq->ring->space);
+		drm_printf(m, "\t\tring->hwsp:   0x%08x\n",
+			   rq->timeline->hwsp_offset);
 
 		print_request_ring(m, rq);
 	}
 
 	rcu_read_unlock();
 
-	if (intel_runtime_pm_get_if_in_use(engine->i915)) {
+	wakeref = intel_runtime_pm_get_if_in_use(engine->i915);
+	if (wakeref) {
 		intel_engine_print_registers(engine, m);
-		intel_runtime_pm_put(engine->i915);
+		intel_runtime_pm_put(engine->i915, wakeref);
 	} else {
 		drm_printf(m, "\tDevice is asleep; skipping register dump\n");
 	}
 
-	local_irq_save(flags);
-	spin_lock(&engine->timeline.lock);
-
-	last = NULL;
-	count = 0;
-	list_for_each_entry(rq, &engine->timeline.requests, link) {
-		if (count++ < MAX_REQUESTS_TO_SHOW - 1)
-			print_request(m, rq, "\t\tE ");
-		else
-			last = rq;
-	}
-	if (last) {
-		if (count > MAX_REQUESTS_TO_SHOW) {
-			drm_printf(m,
-				   "\t\t...skipping %d executing requests...\n",
-				   count - MAX_REQUESTS_TO_SHOW);
-		}
-		print_request(m, last, "\t\tE ");
-	}
-
-	last = NULL;
-	count = 0;
-	drm_printf(m, "\t\tQueue priority: %d\n", execlists->queue_priority);
-	for (rb = rb_first_cached(&execlists->queue); rb; rb = rb_next(rb)) {
-		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
-		int i;
-
-		priolist_for_each_request(rq, p, i) {
-			if (count++ < MAX_REQUESTS_TO_SHOW - 1)
-				print_request(m, rq, "\t\tQ ");
-			else
-				last = rq;
-		}
-	}
-	if (last) {
-		if (count > MAX_REQUESTS_TO_SHOW) {
-			drm_printf(m,
-				   "\t\t...skipping %d queued requests...\n",
-				   count - MAX_REQUESTS_TO_SHOW);
-		}
-		print_request(m, last, "\t\tQ ");
-	}
-
-	spin_unlock(&engine->timeline.lock);
-
-	spin_lock(&b->rb_lock);
-	for (rb = rb_first(&b->waiters); rb; rb = rb_next(rb)) {
-		struct intel_wait *w = rb_entry(rb, typeof(*w), node);
-
-		drm_printf(m, "\t%s [%d:%c] waiting for %x\n",
-			   w->tsk->comm, w->tsk->pid,
-			   task_state_to_char(w->tsk),
-			   w->seqno);
-	}
-	spin_unlock(&b->rb_lock);
-	local_irq_restore(flags);
-
-	drm_printf(m, "IRQ? 0x%lx (breadcrumbs? %s)\n",
-		   engine->irq_posted,
-		   yesno(test_bit(ENGINE_IRQ_BREADCRUMB,
-				  &engine->irq_posted)));
+	intel_execlists_show_requests(engine, m, print_request, 8);
 
 	drm_printf(m, "HWSP:\n");
-	hexdump(m, engine->status_page.page_addr, PAGE_SIZE);
+	hexdump(m, engine->status_page.addr, PAGE_SIZE);
 
 	drm_printf(m, "Idle? %s\n", yesno(intel_engine_is_idle(engine)));
+
+	intel_engine_print_breadcrumbs(engine, m);
 }
 
 static u8 user_class_map[] = {
diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
index f23570c44323..656e684e7c9a 100644
--- a/drivers/gpu/drm/i915/intel_fbc.c
+++ b/drivers/gpu/drm/i915/intel_fbc.c
@@ -38,6 +38,8 @@
  * forcibly disable it to allow proper screen updates.
  */
 
+#include <drm/drm_fourcc.h>
+
 #include "intel_drv.h"
 #include "i915_drv.h"
 
@@ -84,7 +86,7 @@ static int intel_fbc_calculate_cfb_size(struct drm_i915_private *dev_priv,
 	int lines;
 
 	intel_fbc_get_plane_source_size(cache, NULL, &lines);
-	if (IS_GEN7(dev_priv))
+	if (IS_GEN(dev_priv, 7))
 		lines = min(lines, 2048);
 	else if (INTEL_GEN(dev_priv) >= 8)
 		lines = min(lines, 2560);
@@ -127,7 +129,7 @@ static void i8xx_fbc_activate(struct drm_i915_private *dev_priv)
 		cfb_pitch = params->fb.stride;
 
 	/* FBC_CTL wants 32B or 64B units */
-	if (IS_GEN2(dev_priv))
+	if (IS_GEN(dev_priv, 2))
 		cfb_pitch = (cfb_pitch / 32) - 1;
 	else
 		cfb_pitch = (cfb_pitch / 64) - 1;
@@ -136,7 +138,7 @@ static void i8xx_fbc_activate(struct drm_i915_private *dev_priv)
 	for (i = 0; i < (FBC_LL_SIZE / 32) + 1; i++)
 		I915_WRITE(FBC_TAG(i), 0);
 
-	if (IS_GEN4(dev_priv)) {
+	if (IS_GEN(dev_priv, 4)) {
 		u32 fbc_ctl2;
 
 		/* Set it up... */
@@ -233,9 +235,9 @@ static void ilk_fbc_activate(struct drm_i915_private *dev_priv)
 
 	if (params->flags & PLANE_HAS_FENCE) {
 		dpfc_ctl |= DPFC_CTL_FENCE_EN;
-		if (IS_GEN5(dev_priv))
+		if (IS_GEN(dev_priv, 5))
 			dpfc_ctl |= params->vma->fence->id;
-		if (IS_GEN6(dev_priv)) {
+		if (IS_GEN(dev_priv, 6)) {
 			I915_WRITE(SNB_DPFC_CTL_SA,
 				   SNB_CPU_FENCE_ENABLE |
 				   params->vma->fence->id);
@@ -243,7 +245,7 @@ static void ilk_fbc_activate(struct drm_i915_private *dev_priv)
 				   params->crtc.fence_y_offset);
 		}
 	} else {
-		if (IS_GEN6(dev_priv)) {
+		if (IS_GEN(dev_priv, 6)) {
 			I915_WRITE(SNB_DPFC_CTL_SA, 0);
 			I915_WRITE(DPFC_CPU_FENCE_OFFSET, 0);
 		}
@@ -282,7 +284,7 @@ static void gen7_fbc_activate(struct drm_i915_private *dev_priv)
 	int threshold = dev_priv->fbc.threshold;
 
 	/* Display WA #0529: skl, kbl, bxt. */
-	if (IS_GEN9(dev_priv) && !IS_GEMINILAKE(dev_priv)) {
+	if (IS_GEN(dev_priv, 9) && !IS_GEMINILAKE(dev_priv)) {
 		u32 val = I915_READ(CHICKEN_MISC_4);
 
 		val &= ~(FBC_STRIDE_OVERRIDE | FBC_STRIDE_MASK);
@@ -581,10 +583,10 @@ static bool stride_is_valid(struct drm_i915_private *dev_priv,
 	if (stride < 512)
 		return false;
 
-	if (IS_GEN2(dev_priv) || IS_GEN3(dev_priv))
+	if (IS_GEN(dev_priv, 2) || IS_GEN(dev_priv, 3))
 		return stride == 4096 || stride == 8192;
 
-	if (IS_GEN4(dev_priv) && !IS_G4X(dev_priv) && stride < 2048)
+	if (IS_GEN(dev_priv, 4) && !IS_G4X(dev_priv) && stride < 2048)
 		return false;
 
 	if (stride > 16384)
@@ -594,7 +596,7 @@ static bool stride_is_valid(struct drm_i915_private *dev_priv,
 }
 
 static bool pixel_format_is_valid(struct drm_i915_private *dev_priv,
-				  uint32_t pixel_format)
+				  u32 pixel_format)
 {
 	switch (pixel_format) {
 	case DRM_FORMAT_XRGB8888:
@@ -603,7 +605,7 @@ static bool pixel_format_is_valid(struct drm_i915_private *dev_priv,
 	case DRM_FORMAT_XRGB1555:
 	case DRM_FORMAT_RGB565:
 		/* 16bpp not supported on gen2 */
-		if (IS_GEN2(dev_priv))
+		if (IS_GEN(dev_priv, 2))
 			return false;
 		/* WaFbcOnly1to1Ratio:ctg */
 		if (IS_G4X(dev_priv))
@@ -626,7 +628,10 @@ static bool intel_fbc_hw_tracking_covers_screen(struct intel_crtc *crtc)
 	struct intel_fbc *fbc = &dev_priv->fbc;
 	unsigned int effective_w, effective_h, max_w, max_h;
 
-	if (INTEL_GEN(dev_priv) >= 8 || IS_HASWELL(dev_priv)) {
+	if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv)) {
+		max_w = 5120;
+		max_h = 4096;
+	} else if (INTEL_GEN(dev_priv) >= 8 || IS_HASWELL(dev_priv)) {
 		max_w = 4096;
 		max_h = 4096;
 	} else if (IS_G4X(dev_priv) || INTEL_GEN(dev_priv) >= 5) {
@@ -784,7 +789,7 @@ static bool intel_fbc_can_activate(struct intel_crtc *crtc)
 	 * having a Y offset that isn't divisible by 4 causes FIFO underrun
 	 * and screen flicker.
 	 */
-	if (IS_GEN(dev_priv, 9, 10) &&
+	if (IS_GEN_RANGE(dev_priv, 9, 10) &&
 	    (fbc->state_cache.plane.adjusted_y & 3)) {
 		fbc->no_fbc_reason = "plane Y offset is misaligned";
 		return false;
@@ -839,7 +844,7 @@ static void intel_fbc_get_reg_params(struct intel_crtc *crtc,
 
 	params->cfb_size = intel_fbc_calculate_cfb_size(dev_priv, cache);
 
-	if (IS_GEN9(dev_priv) && !IS_GEMINILAKE(dev_priv))
+	if (IS_GEN(dev_priv, 9) && !IS_GEMINILAKE(dev_priv))
 		params->gen9_wa_cfb_stride = DIV_ROUND_UP(cache->plane.src_w,
 						32 * fbc->threshold) * 8;
 }
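
The recurring edit in this file swaps the per-generation IS_GENx() helpers for a parameterised IS_GEN(dev_priv, n), plus IS_GEN_RANGE() for inclusive spans. A minimal sketch of how such checks can be answered from a bitmask precomputed at probe time is below; the type and macro names are illustrative stand-ins, not the actual i915 definitions.

    #include <stdbool.h>
    #include <stdint.h>

    struct example_device {
            uint32_t gen_mask;      /* BIT(gen - 1) set once at probe time */
    };

    #define EX_GEN_MASK(n)  ((uint32_t)1 << ((n) - 1))

    static inline bool example_is_gen(const struct example_device *dev, int gen)
    {
            /* one AND instead of a comparison chain per generation */
            return dev->gen_mask & EX_GEN_MASK(gen);
    }

    static inline bool example_is_gen_range(const struct example_device *dev,
                                            int from, int until)
    {
            /* contiguous run of bits covering [from, until] inclusive */
            uint32_t mask = (EX_GEN_MASK(until) << 1) - EX_GEN_MASK(from);

            return dev->gen_mask & mask;
    }

With that shape, an IS_GEN(dev_priv, 5) || IS_GEN(dev_priv, 6) pair collapses naturally into the IS_GEN_RANGE(dev_priv, 5, 6) form used elsewhere in this series.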
@@ -1126,8 +1131,6 @@ void intel_fbc_disable(struct intel_crtc *crtc)
 	if (!fbc_supported(dev_priv))
 		return;
 
-	WARN_ON(crtc->active);
-
 	mutex_lock(&fbc->lock);
 	if (fbc->crtc == crtc)
 		__intel_fbc_disable(dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index 4ee16b264dbe..e8f694b57b8a 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -37,9 +37,10 @@
 #include <linux/init.h>
 #include <linux/vga_switcheroo.h>
 
-#include <drm/drmP.h>
 #include <drm/drm_crtc.h>
 #include <drm/drm_fb_helper.h>
+#include <drm/drm_fourcc.h>
+
 #include "intel_drv.h"
 #include "intel_frontbuffer.h"
 #include <drm/i915_drm.h>
@@ -178,8 +179,9 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	const struct i915_ggtt_view view = {
 		.type = I915_GGTT_VIEW_NORMAL,
 	};
-	struct fb_info *info;
 	struct drm_framebuffer *fb;
+	intel_wakeref_t wakeref;
+	struct fb_info *info;
 	struct i915_vma *vma;
 	unsigned long flags = 0;
 	bool prealloc = false;
@@ -210,7 +212,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	}
 
 	mutex_lock(&dev->struct_mutex);
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	/* Pin the GGTT vma for our access via info->screen_base.
 	 * This also validates that any existing fb inherited from the
@@ -277,7 +279,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	ifbdev->vma = vma;
 	ifbdev->vma_flags = flags;
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 	mutex_unlock(&dev->struct_mutex);
 	vga_switcheroo_client_fb_set(pdev, info);
 	return 0;
@@ -285,7 +287,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
 out_unpin:
 	intel_unpin_fb_vma(vma, flags);
 out_unlock:
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 	mutex_unlock(&dev->struct_mutex);
 	return ret;
 }
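
intelfb_create() is one of many call sites in this series converted from bare intel_runtime_pm_get()/intel_runtime_pm_put() to a cookie-returning variant, so an unbalanced wakeref can be traced back to its acquirer. A minimal sketch of the pattern, assuming the tracker is nothing more than a counter plus an opaque token (all names here are hypothetical):

    #include <stdio.h>

    typedef unsigned long ex_wakeref_t;

    struct ex_device {
            int wakeref_count;              /* device stays awake while > 0 */
            unsigned long next_cookie;      /* source of opaque tokens */
    };

    static ex_wakeref_t ex_pm_get(struct ex_device *dev)
    {
            dev->wakeref_count++;
            return ++dev->next_cookie;      /* caller must hand this back */
    }

    static void ex_pm_put(struct ex_device *dev, ex_wakeref_t cookie)
    {
            (void)cookie;   /* a debug build would pair it with its get */
            dev->wakeref_count--;
    }

    int main(void)
    {
            struct ex_device dev = { 0, 0 };
            ex_wakeref_t wakeref = ex_pm_get(&dev);

            /* ... touch the hardware, as intelfb_create() does ... */

            ex_pm_put(&dev, wakeref);
            printf("outstanding wakerefs: %d\n", dev.wakeref_count);
            return 0;
    }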
diff --git a/drivers/gpu/drm/i915/intel_fifo_underrun.c b/drivers/gpu/drm/i915/intel_fifo_underrun.c
index 77c123cc8817..f33de4be4b89 100644
--- a/drivers/gpu/drm/i915/intel_fifo_underrun.c
+++ b/drivers/gpu/drm/i915/intel_fifo_underrun.c
@@ -127,8 +127,8 @@ static void ironlake_set_fifo_underrun_reporting(struct drm_device *dev,
 						 enum pipe pipe, bool enable)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
-	uint32_t bit = (pipe == PIPE_A) ? DE_PIPEA_FIFO_UNDERRUN :
-					  DE_PIPEB_FIFO_UNDERRUN;
+	u32 bit = (pipe == PIPE_A) ?
+		DE_PIPEA_FIFO_UNDERRUN : DE_PIPEB_FIFO_UNDERRUN;
 
 	if (enable)
 		ilk_enable_display_irq(dev_priv, bit);
@@ -140,7 +140,7 @@ static void ivybridge_check_fifo_underruns(struct intel_crtc *crtc)
 {
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 	enum pipe pipe = crtc->pipe;
-	uint32_t err_int = I915_READ(GEN7_ERR_INT);
+	u32 err_int = I915_READ(GEN7_ERR_INT);
 
 	lockdep_assert_held(&dev_priv->irq_lock);
 
@@ -193,8 +193,8 @@ static void ibx_set_fifo_underrun_reporting(struct drm_device *dev,
 					    bool enable)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
-	uint32_t bit = (pch_transcoder == PIPE_A) ?
-		       SDE_TRANSA_FIFO_UNDER : SDE_TRANSB_FIFO_UNDER;
+	u32 bit = (pch_transcoder == PIPE_A) ?
+		SDE_TRANSA_FIFO_UNDER : SDE_TRANSB_FIFO_UNDER;
 
 	if (enable)
 		ibx_enable_display_interrupt(dev_priv, bit);
@@ -206,7 +206,7 @@ static void cpt_check_pch_fifo_underruns(struct intel_crtc *crtc)
 {
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 	enum pipe pch_transcoder = crtc->pipe;
-	uint32_t serr_int = I915_READ(SERR_INT);
+	u32 serr_int = I915_READ(SERR_INT);
 
 	lockdep_assert_held(&dev_priv->irq_lock);
 
@@ -258,11 +258,11 @@ static bool __intel_set_cpu_fifo_underrun_reporting(struct drm_device *dev,
 	old = !crtc->cpu_fifo_underrun_disabled;
 	crtc->cpu_fifo_underrun_disabled = !enable;
 
-	if (HAS_GMCH_DISPLAY(dev_priv))
+	if (HAS_GMCH(dev_priv))
 		i9xx_set_fifo_underrun_reporting(dev, pipe, enable, old);
-	else if (IS_GEN5(dev_priv) || IS_GEN6(dev_priv))
+	else if (IS_GEN_RANGE(dev_priv, 5, 6))
 		ironlake_set_fifo_underrun_reporting(dev, pipe, enable);
-	else if (IS_GEN7(dev_priv))
+	else if (IS_GEN(dev_priv, 7))
 		ivybridge_set_fifo_underrun_reporting(dev, pipe, enable, old);
 	else if (INTEL_GEN(dev_priv) >= 8)
 		broadwell_set_fifo_underrun_reporting(dev, pipe, enable);
@@ -369,7 +369,7 @@ void intel_cpu_fifo_underrun_irq_handler(struct drm_i915_private *dev_priv,
 		return;
 
 	/* GMCH can't disable fifo underruns, filter them. */
-	if (HAS_GMCH_DISPLAY(dev_priv) &&
+	if (HAS_GMCH(dev_priv) &&
 	    crtc->cpu_fifo_underrun_disabled)
 		return;
 
@@ -421,9 +421,9 @@ void intel_check_cpu_fifo_underruns(struct drm_i915_private *dev_priv)
 		if (crtc->cpu_fifo_underrun_disabled)
 			continue;
 
-		if (HAS_GMCH_DISPLAY(dev_priv))
+		if (HAS_GMCH(dev_priv))
 			i9xx_check_fifo_underruns(crtc);
-		else if (IS_GEN7(dev_priv))
+		else if (IS_GEN(dev_priv, 7))
 			ivybridge_check_fifo_underruns(crtc);
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_frontbuffer.c b/drivers/gpu/drm/i915/intel_frontbuffer.c
index c3379bde266f..16f253deaf8d 100644
--- a/drivers/gpu/drm/i915/intel_frontbuffer.c
+++ b/drivers/gpu/drm/i915/intel_frontbuffer.c
@@ -60,7 +60,6 @@
  * functions is deprecated and should be avoided.
  */
 
-#include <drm/drmP.h>
 
 #include "intel_drv.h"
 #include "intel_frontbuffer.h"
diff --git a/drivers/gpu/drm/i915/intel_gpu_commands.h b/drivers/gpu/drm/i915/intel_gpu_commands.h
index 105e2a9e874a..b96a31bc1080 100644
--- a/drivers/gpu/drm/i915/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/intel_gpu_commands.h
@@ -112,7 +112,6 @@
 #define   MI_MEM_VIRTUAL	(1 << 22) /* 945,g33,965 */
 #define   MI_USE_GGTT		(1 << 22) /* g4x+ */
 #define MI_STORE_DWORD_INDEX	MI_INSTR(0x21, 1)
-#define   MI_STORE_DWORD_INDEX_SHIFT 2
 /*
  * Official intel docs are somewhat sloppy concerning MI_LOAD_REGISTER_IMM:
  * - Always issue a MI_NOOP _before_ the MI_LOAD_REGISTER_IMM - otherwise hw
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 0f1c4f9ebfd8..744220296653 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -192,4 +192,7 @@ static inline void intel_guc_disable_msg(struct intel_guc *guc, u32 mask)
 	spin_unlock_irq(&guc->irq_lock);
 }
 
+int intel_guc_reset_engine(struct intel_guc *guc,
+			   struct intel_engine_cs *engine);
+
 #endif
diff --git a/drivers/gpu/drm/i915/intel_guc_fw.c b/drivers/gpu/drm/i915/intel_guc_fw.c
index a67144ee5ceb..13ff7003c6be 100644
--- a/drivers/gpu/drm/i915/intel_guc_fw.c
+++ b/drivers/gpu/drm/i915/intel_guc_fw.c
@@ -77,10 +77,6 @@ static void guc_fw_select(struct intel_uc_fw *guc_fw)
 		guc_fw->path = I915_KBL_GUC_UCODE;
 		guc_fw->major_ver_wanted = KBL_FW_MAJOR;
 		guc_fw->minor_ver_wanted = KBL_FW_MINOR;
-	} else {
-		dev_info(dev_priv->drm.dev,
-			 "%s: No firmware known for this platform!\n",
-			 intel_uc_fw_type_repr(guc_fw->type));
 	}
 }
 
@@ -115,7 +111,7 @@ static void guc_prepare_xfer(struct intel_guc *guc)
 	else
 		I915_WRITE(GEN9_GT_PM_CONFIG, GT_DOORBELL_ENABLE);
 
-	if (IS_GEN9(dev_priv)) {
+	if (IS_GEN(dev_priv, 9)) {
 		/* DOP Clock Gating Enable for GuC clocks */
 		I915_WRITE(GEN7_MISCCPCTL, (GEN8_DOP_CLOCK_GATE_GUC_ENABLE |
 					    I915_READ(GEN7_MISCCPCTL)));
diff --git a/drivers/gpu/drm/i915/intel_guc_log.c b/drivers/gpu/drm/i915/intel_guc_log.c
index d3ebdbc0182e..806fdfd7c78a 100644
--- a/drivers/gpu/drm/i915/intel_guc_log.c
+++ b/drivers/gpu/drm/i915/intel_guc_log.c
@@ -140,6 +140,9 @@ static struct dentry *create_buf_file_callback(const char *filename,
 
 	buf_file = debugfs_create_file(filename, mode,
 				       parent, buf, &relay_file_operations);
+	if (IS_ERR(buf_file))
+		return NULL;
+
 	return buf_file;
 }
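
The added check above deals with a calling-convention mismatch: debugfs_create_file() can return an errno encoded as an ERR_PTR, while relay only understands NULL as "carry on without a file". A stand-alone sketch of the pointer-encoded errno idiom, reimplementing the kernel's IS_ERR() test for illustration:

    #include <stdio.h>

    #define EX_MAX_ERRNO    4095

    static inline int ex_is_err(const void *ptr)
    {
            /* the top 4095 addresses are reserved to carry -errno values */
            return (unsigned long)ptr >= (unsigned long)-EX_MAX_ERRNO;
    }

    static void *ex_create_file(int fail)
    {
            return fail ? (void *)(long)-12     /* -ENOMEM in a pointer */
                        : (void *)0x1000;       /* stand-in valid dentry */
    }

    int main(void)
    {
            void *file = ex_create_file(1);

            if (ex_is_err(file))
                    file = NULL;    /* what the hunk above returns to relay */

            printf("file: %p\n", file);
            return 0;
    }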
 
@@ -436,6 +439,7 @@ static void guc_log_capture_logs(struct intel_guc_log *log)
 {
 	struct intel_guc *guc = log_to_guc(log);
 	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	intel_wakeref_t wakeref;
 
 	guc_read_update_log_buffer(log);
 
@@ -443,9 +447,8 @@ static void guc_log_capture_logs(struct intel_guc_log *log)
 	 * Generally device is expected to be active only at this
 	 * time, so get/put should be really quick.
 	 */
-	intel_runtime_pm_get(dev_priv);
-	guc_action_flush_log_complete(guc);
-	intel_runtime_pm_put(dev_priv);
+	with_intel_runtime_pm(dev_priv, wakeref)
+		guc_action_flush_log_complete(guc);
 }
 
 int intel_guc_log_create(struct intel_guc_log *log)
@@ -505,7 +508,8 @@ int intel_guc_log_set_level(struct intel_guc_log *log, u32 level)
 {
 	struct intel_guc *guc = log_to_guc(log);
 	struct drm_i915_private *dev_priv = guc_to_i915(guc);
-	int ret;
+	intel_wakeref_t wakeref;
+	int ret = 0;
 
 	BUILD_BUG_ON(GUC_LOG_VERBOSITY_MIN != 0);
 	GEM_BUG_ON(!log->vma);
@@ -519,16 +523,14 @@ int intel_guc_log_set_level(struct intel_guc_log *log, u32 level)
 
 	mutex_lock(&dev_priv->drm.struct_mutex);
 
-	if (log->level == level) {
-		ret = 0;
+	if (log->level == level)
 		goto out_unlock;
-	}
 
-	intel_runtime_pm_get(dev_priv);
-	ret = guc_action_control_log(guc, GUC_LOG_LEVEL_IS_VERBOSE(level),
-				     GUC_LOG_LEVEL_IS_ENABLED(level),
-				     GUC_LOG_LEVEL_TO_VERBOSITY(level));
-	intel_runtime_pm_put(dev_priv);
+	with_intel_runtime_pm(dev_priv, wakeref)
+		ret = guc_action_control_log(guc,
+					     GUC_LOG_LEVEL_IS_VERBOSE(level),
+					     GUC_LOG_LEVEL_IS_ENABLED(level),
+					     GUC_LOG_LEVEL_TO_VERBOSITY(level));
 	if (ret) {
 		DRM_DEBUG_DRIVER("guc_log_control action failed %d\n", ret);
 		goto out_unlock;
@@ -601,6 +603,7 @@ void intel_guc_log_relay_flush(struct intel_guc_log *log)
 {
 	struct intel_guc *guc = log_to_guc(log);
 	struct drm_i915_private *i915 = guc_to_i915(guc);
+	intel_wakeref_t wakeref;
 
 	/*
 	 * Before initiating the forceful flush, wait for any pending/ongoing
@@ -608,9 +611,8 @@ void intel_guc_log_relay_flush(struct intel_guc_log *log)
 	 */
 	flush_work(&log->relay.flush_work);
 
-	intel_runtime_pm_get(i915);
-	guc_action_flush_log(guc);
-	intel_runtime_pm_put(i915);
+	with_intel_runtime_pm(i915, wakeref)
+		guc_action_flush_log(guc);
 
 	/* GuC would have updated log buffer by now, so capture it */
 	guc_log_capture_logs(log);
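
Several of the hunks above fold a get/call/put triple into with_intel_runtime_pm(). A plausible sketch of such a scoped-acquire macro, built on a one-iteration for-loop (names hypothetical; as with any such macro, a break out of the body would skip the release):

    typedef unsigned long ex_wakeref_t;

    static ex_wakeref_t ex_pm_get(void *dev) { (void)dev; return 1; }
    static void ex_pm_put(void *dev, ex_wakeref_t wf) { (void)dev; (void)wf; }

    /* acquire, run the body exactly once, release on the way out */
    #define with_ex_pm(dev, wf) \
            for ((wf) = ex_pm_get(dev); (wf); ex_pm_put((dev), (wf)), (wf) = 0)

    static void ex_flush(void *dev)
    {
            ex_wakeref_t wakeref;

            with_ex_pm(dev, wakeref)
                    ; /* body runs while the device is held awake */
    }

The win over open-coded get/put pairs is that the wakeref's scope is syntactically tied to the statement, so acquire and release stay together even as the body grows.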
diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 1570dcbe249c..8bc8aa54aa35 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -81,6 +81,12 @@
  *
  */
 
+static inline u32 intel_hws_preempt_done_address(struct intel_engine_cs *engine)
+{
+	return (i915_ggtt_offset(engine->status_page.vma) +
+		I915_GEM_HWS_PREEMPT_ADDR);
+}
+
 static inline struct i915_priolist *to_priolist(struct rb_node *rb)
 {
 	return rb_entry(rb, struct i915_priolist, node);
@@ -572,7 +578,8 @@ static void inject_preempt_context(struct work_struct *work)
 		if (engine->id == RCS) {
 			cs = gen8_emit_ggtt_write_rcs(cs,
 						      GUC_PREEMPT_FINISHED,
-						      addr);
+						      addr,
+						      PIPE_CONTROL_CS_STALL);
 		} else {
 			cs = gen8_emit_ggtt_write(cs,
 						  GUC_PREEMPT_FINISHED,
@@ -622,6 +629,8 @@ static void inject_preempt_context(struct work_struct *work)
 				       EXECLISTS_ACTIVE_PREEMPT);
 		tasklet_schedule(&engine->execlists.tasklet);
 	}
+
+	(void)I915_SELFTEST_ONLY(engine->execlists.preempt_hang.count++);
 }
 
 /*
@@ -665,7 +674,7 @@ static void complete_preempt_context(struct intel_engine_cs *engine)
 	execlists_unwind_incomplete_requests(execlists);
 
 	wait_for_guc_preempt_report(engine);
-	intel_write_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX, 0);
+	intel_write_status_page(engine, I915_GEM_HWS_PREEMPT, 0);
 }
 
 /**
@@ -730,7 +739,7 @@ static bool __guc_dequeue(struct intel_engine_cs *engine)
 		if (intel_engine_has_preemption(engine)) {
 			struct guc_preempt_work *preempt_work =
 				&engine->i915->guc.preempt_work[engine->id];
-			int prio = execlists->queue_priority;
+			int prio = execlists->queue_priority_hint;
 
 			if (__execlists_need_preempt(prio, port_prio(port))) {
 				execlists_set_active(execlists,
@@ -776,7 +785,8 @@ static bool __guc_dequeue(struct intel_engine_cs *engine)
 			kmem_cache_free(engine->i915->priorities, p);
 	}
 done:
-	execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
+	execlists->queue_priority_hint =
+		rb ? to_priolist(rb)->priority : INT_MIN;
 	if (submit)
 		port_assign(port, last);
 	if (last)
@@ -823,7 +833,7 @@ static void guc_submission_tasklet(unsigned long data)
 	}
 
 	if (execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT) &&
-	    intel_read_status_page(engine, I915_GEM_HWS_PREEMPT_INDEX) ==
+	    intel_read_status_page(engine, I915_GEM_HWS_PREEMPT) ==
 	    GUC_PREEMPT_FINISHED)
 		complete_preempt_context(engine);
 
@@ -833,8 +843,7 @@ static void guc_submission_tasklet(unsigned long data)
 	spin_unlock_irqrestore(&engine->timeline.lock, flags);
 }
 
-static struct i915_request *
-guc_reset_prepare(struct intel_engine_cs *engine)
+static void guc_reset_prepare(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
 
@@ -860,8 +869,6 @@ guc_reset_prepare(struct intel_engine_cs *engine)
 	 */
 	if (engine->i915->guc.preempt_wq)
 		flush_workqueue(engine->i915->guc.preempt_wq);
-
-	return i915_gem_find_active_request(engine);
 }
 
 /*
diff --git a/drivers/gpu/drm/i915/intel_gvt.c b/drivers/gpu/drm/i915/intel_gvt.c
index c22b3e18a0f5..1d7d26e4cf14 100644
--- a/drivers/gpu/drm/i915/intel_gvt.c
+++ b/drivers/gpu/drm/i915/intel_gvt.c
@@ -49,6 +49,9 @@ static bool is_supported_device(struct drm_i915_private *dev_priv)
 		return true;
 	if (IS_BROXTON(dev_priv))
 		return true;
+	if (IS_COFFEELAKE(dev_priv))
+		return true;
+
 	return false;
 }
 
@@ -105,15 +108,6 @@ int intel_gvt_init(struct drm_i915_private *dev_priv)
 		return -EIO;
 	}
 
-	/*
-	 * We're not in host or fail to find a MPT module, disable GVT-g
-	 */
-	ret = intel_gvt_init_host();
-	if (ret) {
-		DRM_DEBUG_DRIVER("Not in host or MPT modules not found\n");
-		goto bail;
-	}
-
 	ret = intel_gvt_init_device(dev_priv);
 	if (ret) {
 		DRM_DEBUG_DRIVER("Fail to init GVT device\n");
diff --git a/drivers/gpu/drm/i915/intel_hangcheck.c b/drivers/gpu/drm/i915/intel_hangcheck.c
index e26d05a46451..a219c796e56d 100644
--- a/drivers/gpu/drm/i915/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/intel_hangcheck.c
@@ -23,144 +23,18 @@
  */
 
 #include "i915_drv.h"
+#include "i915_reset.h"
 
-static bool
-ipehr_is_semaphore_wait(struct intel_engine_cs *engine, u32 ipehr)
-{
-	ipehr &= ~MI_SEMAPHORE_SYNC_MASK;
-	return ipehr == (MI_SEMAPHORE_MBOX | MI_SEMAPHORE_COMPARE |
-			 MI_SEMAPHORE_REGISTER);
-}
-
-static struct intel_engine_cs *
-semaphore_wait_to_signaller_ring(struct intel_engine_cs *engine, u32 ipehr,
-				 u64 offset)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	u32 sync_bits = ipehr & MI_SEMAPHORE_SYNC_MASK;
-	struct intel_engine_cs *signaller;
-	enum intel_engine_id id;
-
-	for_each_engine(signaller, dev_priv, id) {
-		if (engine == signaller)
-			continue;
-
-		if (sync_bits == signaller->semaphore.mbox.wait[engine->hw_id])
-			return signaller;
-	}
-
-	DRM_DEBUG_DRIVER("No signaller ring found for %s, ipehr 0x%08x\n",
-			 engine->name, ipehr);
-
-	return ERR_PTR(-ENODEV);
-}
-
-static struct intel_engine_cs *
-semaphore_waits_for(struct intel_engine_cs *engine, u32 *seqno)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	void __iomem *vaddr;
-	u32 cmd, ipehr, head;
-	u64 offset = 0;
-	int i, backwards;
-
-	/*
-	 * This function does not support execlist mode - any attempt to
-	 * proceed further into this function will result in a kernel panic
-	 * when dereferencing ring->buffer, which is not set up in execlist
-	 * mode.
-	 *
-	 * The correct way of doing it would be to derive the currently
-	 * executing ring buffer from the current context, which is derived
-	 * from the currently running request. Unfortunately, to get the
-	 * current request we would have to grab the struct_mutex before doing
-	 * anything else, which would be ill-advised since some other thread
-	 * might have grabbed it already and managed to hang itself, causing
-	 * the hang checker to deadlock.
-	 *
-	 * Therefore, this function does not support execlist mode in its
-	 * current form. Just return NULL and move on.
-	 */
-	if (engine->buffer == NULL)
-		return NULL;
-
-	ipehr = I915_READ(RING_IPEHR(engine->mmio_base));
-	if (!ipehr_is_semaphore_wait(engine, ipehr))
-		return NULL;
-
-	/*
-	 * HEAD is likely pointing to the dword after the actual command,
-	 * so scan backwards until we find the MBOX. But limit it to just 3
-	 * or 4 dwords depending on the semaphore wait command size.
-	 * Note that we don't care about ACTHD here since that might
-	 * point at at batch, and semaphores are always emitted into the
-	 * ringbuffer itself.
-	 */
-	head = I915_READ_HEAD(engine) & HEAD_ADDR;
-	backwards = (INTEL_GEN(dev_priv) >= 8) ? 5 : 4;
-	vaddr = (void __iomem *)engine->buffer->vaddr;
-
-	for (i = backwards; i; --i) {
-		/*
-		 * Be paranoid and presume the hw has gone off into the wild -
-		 * our ring is smaller than what the hardware (and hence
-		 * HEAD_ADDR) allows. Also handles wrap-around.
-		 */
-		head &= engine->buffer->size - 1;
-
-		/* This here seems to blow up */
-		cmd = ioread32(vaddr + head);
-		if (cmd == ipehr)
-			break;
-
-		head -= 4;
-	}
-
-	if (!i)
-		return NULL;
-
-	*seqno = ioread32(vaddr + head + 4) + 1;
-	return semaphore_wait_to_signaller_ring(engine, ipehr, offset);
-}
-
-static int semaphore_passed(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	struct intel_engine_cs *signaller;
+struct hangcheck {
+	u64 acthd;
 	u32 seqno;
-
-	engine->hangcheck.deadlock++;
-
-	signaller = semaphore_waits_for(engine, &seqno);
-	if (signaller == NULL)
-		return -1;
-
-	if (IS_ERR(signaller))
-		return 0;
-
-	/* Prevent pathological recursion due to driver bugs */
-	if (signaller->hangcheck.deadlock >= I915_NUM_ENGINES)
-		return -1;
-
-	if (intel_engine_signaled(signaller, seqno))
-		return 1;
-
-	/* cursory check for an unkickable deadlock */
-	if (I915_READ_CTL(signaller) & RING_WAIT_SEMAPHORE &&
-	    semaphore_passed(signaller) < 0)
-		return -1;
-
-	return 0;
-}
-
-static void semaphore_clear_deadlocks(struct drm_i915_private *dev_priv)
-{
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-
-	for_each_engine(engine, dev_priv, id)
-		engine->hangcheck.deadlock = 0;
-}
+	enum intel_engine_hangcheck_action action;
+	unsigned long action_timestamp;
+	int deadlock;
+	struct intel_instdone instdone;
+	bool wedged:1;
+	bool stalled:1;
+};
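
The rewrite pulls all of the per-tick scratch state out of engine->hangcheck into this stack-local struct, so only fields that must survive between ticks get written back. A simplified sketch of the resulting load/accumulate/store split (types and the stall heuristic are stand-ins, not the driver's):

    struct ex_sample {
            unsigned long acthd;
            unsigned int seqno;
            int stalled;
    };

    struct ex_engine {              /* only the durable state lives here */
            unsigned long acthd;
            unsigned int seqno;
    };

    static void ex_load(struct ex_sample *hc, unsigned long hw_acthd,
                        unsigned int hw_seqno)
    {
            hc->acthd = hw_acthd;   /* the real code samples the hardware */
            hc->seqno = hw_seqno;
    }

    static void ex_accumulate(const struct ex_engine *e, struct ex_sample *hc)
    {
            /* no seqno movement since the previous tick: suspect a stall */
            hc->stalled = (hc->seqno == e->seqno);
    }

    static void ex_store(struct ex_engine *e, const struct ex_sample *hc)
    {
            e->acthd = hc->acthd;
            e->seqno = hc->seqno;
    }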
 
 static bool instdone_unchanged(u32 current_instdone, u32 *old_instdone)
 {
@@ -236,7 +110,7 @@ engine_stuck(struct intel_engine_cs *engine, u64 acthd)
 	if (ha != ENGINE_DEAD)
 		return ha;
 
-	if (IS_GEN2(dev_priv))
+	if (IS_GEN(dev_priv, 2))
 		return ENGINE_DEAD;
 
 	/* Is the chip hanging on a WAIT_FOR_EVENT?
@@ -252,54 +126,26 @@ engine_stuck(struct intel_engine_cs *engine, u64 acthd)
 		return ENGINE_WAIT_KICK;
 	}
 
-	if (IS_GEN(dev_priv, 6, 7) && tmp & RING_WAIT_SEMAPHORE) {
-		switch (semaphore_passed(engine)) {
-		default:
-			return ENGINE_DEAD;
-		case 1:
-			i915_handle_error(dev_priv, ALL_ENGINES, 0,
-					  "stuck semaphore on %s",
-					  engine->name);
-			I915_WRITE_CTL(engine, tmp);
-			return ENGINE_WAIT_KICK;
-		case 0:
-			return ENGINE_WAIT;
-		}
-	}
-
 	return ENGINE_DEAD;
 }
 
 static void hangcheck_load_sample(struct intel_engine_cs *engine,
-				  struct intel_engine_hangcheck *hc)
+				  struct hangcheck *hc)
 {
-	/* We don't strictly need an irq-barrier here, as we are not
-	 * serving an interrupt request, be paranoid in case the
-	 * barrier has side-effects (such as preventing a broken
-	 * cacheline snoop) and so be sure that we can see the seqno
-	 * advance. If the seqno should stick, due to a stale
-	 * cacheline, we would erroneously declare the GPU hung.
-	 */
-	if (engine->irq_seqno_barrier)
-		engine->irq_seqno_barrier(engine);
-
 	hc->acthd = intel_engine_get_active_head(engine);
 	hc->seqno = intel_engine_get_seqno(engine);
 }
 
 static void hangcheck_store_sample(struct intel_engine_cs *engine,
-				   const struct intel_engine_hangcheck *hc)
+				   const struct hangcheck *hc)
 {
 	engine->hangcheck.acthd = hc->acthd;
 	engine->hangcheck.seqno = hc->seqno;
-	engine->hangcheck.action = hc->action;
-	engine->hangcheck.stalled = hc->stalled;
-	engine->hangcheck.wedged = hc->wedged;
 }
 
 static enum intel_engine_hangcheck_action
 hangcheck_get_action(struct intel_engine_cs *engine,
-		     const struct intel_engine_hangcheck *hc)
+		     const struct hangcheck *hc)
 {
 	if (engine->hangcheck.seqno != hc->seqno)
 		return ENGINE_ACTIVE_SEQNO;
@@ -311,7 +157,7 @@ hangcheck_get_action(struct intel_engine_cs *engine,
 }
 
 static void hangcheck_accumulate_sample(struct intel_engine_cs *engine,
-					struct intel_engine_hangcheck *hc)
+					struct hangcheck *hc)
 {
 	unsigned long timeout = I915_ENGINE_DEAD_TIMEOUT;
 
@@ -357,10 +203,6 @@ static void hangcheck_accumulate_sample(struct intel_engine_cs *engine,
 		break;
 
 	case ENGINE_DEAD:
-		if (GEM_SHOW_DEBUG()) {
-			struct drm_printer p = drm_debug_printer("hangcheck");
-			intel_engine_dump(engine, &p, "%s\n", engine->name);
-		}
 		break;
 
 	default:
@@ -431,24 +273,35 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 	intel_uncore_arm_unclaimed_mmio_detection(dev_priv);
 
 	for_each_engine(engine, dev_priv, id) {
-		struct intel_engine_hangcheck hc;
+		struct hangcheck hc;
 
-		semaphore_clear_deadlocks(dev_priv);
+		intel_engine_signal_breadcrumbs(engine);
 
 		hangcheck_load_sample(engine, &hc);
 		hangcheck_accumulate_sample(engine, &hc);
 		hangcheck_store_sample(engine, &hc);
 
-		if (engine->hangcheck.stalled) {
+		if (hc.stalled) {
 			hung |= intel_engine_flag(engine);
 			if (hc.action != ENGINE_DEAD)
 				stuck |= intel_engine_flag(engine);
 		}
 
-		if (engine->hangcheck.wedged)
+		if (hc.wedged)
 			wedged |= intel_engine_flag(engine);
 	}
 
+	if (GEM_SHOW_DEBUG() && (hung | stuck)) {
+		struct drm_printer p = drm_debug_printer("hangcheck");
+
+		for_each_engine(engine, dev_priv, id) {
+			if (intel_engine_is_idle(engine))
+				continue;
+
+			intel_engine_dump(engine, &p, "%s\n", engine->name);
+		}
+	}
+
 	if (wedged) {
 		dev_err(dev_priv->drm.dev,
 			"GPU recovery timed out,"
diff --git a/drivers/gpu/drm/i915/intel_hdcp.c b/drivers/gpu/drm/i915/intel_hdcp.c
index 1bf487f94254..ce7ba3a9c000 100644
--- a/drivers/gpu/drm/i915/intel_hdcp.c
+++ b/drivers/gpu/drm/i915/intel_hdcp.c
@@ -6,7 +6,6 @@
  * Sean Paul <seanpaul@chromium.org>
  */
 
-#include <drm/drmP.h>
 #include <drm/drm_hdcp.h>
 #include <linux/i2c.h>
 #include <linux/random.h>
@@ -15,6 +14,7 @@
 #include "i915_reg.h"
 
 #define KEY_LOAD_TRIES	5
+#define ENCRYPT_STATUS_CHANGE_TIMEOUT_MS	50
 
 static
 bool intel_hdcp_is_ksv_valid(u8 *ksv)
@@ -157,10 +157,11 @@ static int intel_hdcp_load_keys(struct drm_i915_private *dev_priv)
 	/*
 	 * Initiate loading the HDCP key from fuses.
 	 *
-	 * BXT+ platforms, HDCP key needs to be loaded by SW. Only SKL and KBL
-	 * differ in the key load trigger process from other platforms.
+	 * On BXT+ platforms the HDCP key needs to be loaded by SW. Only
+	 * Gen 9 platforms other than BXT and GLK differ in the key load
+	 * trigger process, so GEN9_BC uses the GT Driver Mailbox i/f.
 	 */
-	if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv)) {
+	if (IS_GEN9_BC(dev_priv)) {
 		mutex_lock(&dev_priv->pcu_lock);
 		ret = sandybridge_pcode_write(dev_priv,
 					      SKL_PCODE_LOAD_HDCP_KEYS, 1);
@@ -636,7 +637,8 @@ static int intel_hdcp_auth(struct intel_digital_port *intel_dig_port,
 
 	/* Wait for encryption confirmation */
 	if (intel_wait_for_register(dev_priv, PORT_HDCP_STATUS(port),
-				    HDCP_STATUS_ENC, HDCP_STATUS_ENC, 20)) {
+				    HDCP_STATUS_ENC, HDCP_STATUS_ENC,
+				    ENCRYPT_STATUS_CHANGE_TIMEOUT_MS)) {
 		DRM_ERROR("Timed out waiting for encryption\n");
 		return -ETIMEDOUT;
 	}
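
Both hunks in this file replace the magic "20" with the named ENCRYPT_STATUS_CHANGE_TIMEOUT_MS. The underlying intel_wait_for_register() idiom is poll-until-masked-match-or-timeout; a hedged stand-alone sketch (the real helper sleeps between reads and accounts for forcewake):

    #include <stdbool.h>

    #define EX_ENCRYPT_TIMEOUT_MS   50      /* named, not a bare literal */

    static bool ex_wait_for(unsigned int (*read_reg)(void), unsigned int mask,
                            unsigned int value, unsigned int timeout_ms)
    {
            unsigned int elapsed;

            for (elapsed = 0; elapsed <= timeout_ms; elapsed++) {
                    if ((read_reg() & mask) == value)
                            return true;
                    /* the kernel helper would msleep(1) here */
            }

            return false;   /* caller maps this to -ETIMEDOUT */
    }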
@@ -666,7 +668,7 @@ static int _intel_hdcp_disable(struct intel_connector *connector)
 
 	I915_WRITE(PORT_HDCP_CONF(port), 0);
 	if (intel_wait_for_register(dev_priv, PORT_HDCP_STATUS(port), ~0, 0,
-				    20)) {
+				    ENCRYPT_STATUS_CHANGE_TIMEOUT_MS)) {
 		DRM_ERROR("Failed to disable HDCP, timeout clearing status\n");
 		return -ETIMEDOUT;
 	}
@@ -768,8 +770,7 @@ static void intel_hdcp_prop_work(struct work_struct *work)
 bool is_hdcp_supported(struct drm_i915_private *dev_priv, enum port port)
 {
 	/* PORT E doesn't have HDCP, and PORT F is disabled */
-	return ((INTEL_GEN(dev_priv) >= 8 || IS_HASWELL(dev_priv)) &&
-		!IS_CHERRYVIEW(dev_priv) && port < PORT_E);
+	return INTEL_GEN(dev_priv) >= 9 && port < PORT_E;
 }
 
 int intel_hdcp_init(struct intel_connector *connector,
@@ -837,8 +838,8 @@ void intel_hdcp_atomic_check(struct drm_connector *connector,
 			     struct drm_connector_state *old_state,
 			     struct drm_connector_state *new_state)
 {
-	uint64_t old_cp = old_state->content_protection;
-	uint64_t new_cp = new_state->content_protection;
+	u64 old_cp = old_state->content_protection;
+	u64 new_cp = new_state->content_protection;
 	struct drm_crtc_state *crtc_state;
 
 	if (!new_state->crtc) {
diff --git a/drivers/gpu/drm/i915/intel_hdmi.c b/drivers/gpu/drm/i915/intel_hdmi.c
index 07e803a604bd..f125a62eba8c 100644
--- a/drivers/gpu/drm/i915/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/intel_hdmi.c
@@ -30,7 +30,6 @@
 #include <linux/slab.h>
 #include <linux/delay.h>
 #include <linux/hdmi.h>
-#include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
 #include <drm/drm_edid.h>
@@ -479,18 +478,14 @@ static void intel_hdmi_set_avi_infoframe(struct intel_encoder *encoder,
 					 const struct intel_crtc_state *crtc_state,
 					 const struct drm_connector_state *conn_state)
 {
-	struct intel_hdmi *intel_hdmi = enc_to_intel_hdmi(&encoder->base);
 	const struct drm_display_mode *adjusted_mode =
 		&crtc_state->base.adjusted_mode;
-	struct drm_connector *connector = &intel_hdmi->attached_connector->base;
-	bool is_hdmi2_sink = connector->display_info.hdmi.scdc.supported ||
-	   connector->display_info.color_formats & DRM_COLOR_FORMAT_YCRCB420;
 	union hdmi_infoframe frame;
 	int ret;
 
 	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi,
-						       adjusted_mode,
-						       is_hdmi2_sink);
+						       conn_state->connector,
+						       adjusted_mode);
 	if (ret < 0) {
 		DRM_ERROR("couldn't fill AVI infoframe\n");
 		return;
@@ -503,12 +498,12 @@ static void intel_hdmi_set_avi_infoframe(struct intel_encoder *encoder,
 	else
 		frame.avi.colorspace = HDMI_COLORSPACE_RGB;
 
-	drm_hdmi_avi_infoframe_quant_range(&frame.avi, adjusted_mode,
+	drm_hdmi_avi_infoframe_quant_range(&frame.avi,
+					   conn_state->connector,
+					   adjusted_mode,
 					   crtc_state->limited_color_range ?
 					   HDMI_QUANTIZATION_RANGE_LIMITED :
-					   HDMI_QUANTIZATION_RANGE_FULL,
-					   intel_hdmi->rgb_quant_range_selectable,
-					   is_hdmi2_sink);
+					   HDMI_QUANTIZATION_RANGE_FULL);
 
 	drm_hdmi_avi_infoframe_content_type(&frame.avi,
 					    conn_state);
@@ -1191,15 +1186,17 @@ static bool intel_hdmi_get_hw_state(struct intel_encoder *encoder,
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	struct intel_hdmi *intel_hdmi = enc_to_intel_hdmi(&encoder->base);
+	intel_wakeref_t wakeref;
 	bool ret;
 
-	if (!intel_display_power_get_if_enabled(dev_priv,
-						encoder->power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     encoder->power_domain);
+	if (!wakeref)
 		return false;
 
 	ret = intel_sdvo_port_enabled(dev_priv, intel_hdmi->hdmi_reg, pipe);
 
-	intel_display_power_put(dev_priv, encoder->power_domain);
+	intel_display_power_put(dev_priv, encoder->power_domain, wakeref);
 
 	return ret;
 }
@@ -1591,7 +1588,7 @@ intel_hdmi_mode_valid(struct drm_connector *connector,
 
 	if (hdmi->has_hdmi_sink && !force_dvi) {
 		/* if we can't do 8bpc we may still be able to do 12bpc */
-		if (status != MODE_OK && !HAS_GMCH_DISPLAY(dev_priv))
+		if (status != MODE_OK && !HAS_GMCH(dev_priv))
 			status = hdmi_port_clock_valid(hdmi, clock * 3 / 2,
 						       true, force_dvi);
 
@@ -1616,7 +1613,7 @@ static bool hdmi_deep_color_possible(const struct intel_crtc_state *crtc_state,
 		&crtc_state->base.adjusted_mode;
 	int i;
 
-	if (HAS_GMCH_DISPLAY(dev_priv))
+	if (HAS_GMCH(dev_priv))
 		return false;
 
 	if (bpc == 10 && INTEL_GEN(dev_priv) < 11)
@@ -1707,9 +1704,9 @@ intel_hdmi_ycbcr420_config(struct drm_connector *connector,
 	return true;
 }
 
-bool intel_hdmi_compute_config(struct intel_encoder *encoder,
-			       struct intel_crtc_state *pipe_config,
-			       struct drm_connector_state *conn_state)
+int intel_hdmi_compute_config(struct intel_encoder *encoder,
+			      struct intel_crtc_state *pipe_config,
+			      struct drm_connector_state *conn_state)
 {
 	struct intel_hdmi *intel_hdmi = enc_to_intel_hdmi(&encoder->base);
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
@@ -1725,7 +1722,7 @@ bool intel_hdmi_compute_config(struct intel_encoder *encoder,
 	bool force_dvi = intel_conn_state->force_audio == HDMI_AUDIO_OFF_DVI;
 
 	if (adjusted_mode->flags & DRM_MODE_FLAG_DBLSCAN)
-		return false;
+		return -EINVAL;
 
 	pipe_config->output_format = INTEL_OUTPUT_FORMAT_RGB;
 	pipe_config->has_hdmi_sink = !force_dvi && intel_hdmi->has_hdmi_sink;
@@ -1756,7 +1753,7 @@ bool intel_hdmi_compute_config(struct intel_encoder *encoder,
 						&clock_12bpc, &clock_10bpc,
 						&clock_8bpc)) {
 			DRM_ERROR("Can't support YCBCR420 output\n");
-			return false;
+			return -EINVAL;
 		}
 	}
 
@@ -1806,7 +1803,7 @@ bool intel_hdmi_compute_config(struct intel_encoder *encoder,
 	if (hdmi_port_clock_valid(intel_hdmi, pipe_config->port_clock,
 				  false, force_dvi) != MODE_OK) {
 		DRM_DEBUG_KMS("unsupported HDMI clock, rejecting mode\n");
-		return false;
+		return -EINVAL;
 	}
 
 	/* Set user selected PAR to incoming mode's member */
@@ -1825,7 +1822,7 @@ bool intel_hdmi_compute_config(struct intel_encoder *encoder,
 		}
 	}
 
-	return true;
+	return 0;
 }
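
This hunk is part of a wider conversion of the encoder ->compute_config() hooks from bool to int, so callers can propagate a precise errno instead of collapsing every failure into "atomic check failed". The mechanical shape of the change, sketched with a hypothetical function:

    #include <errno.h>

    static int ex_compute_config(int mode_ok)
    {
            if (!mode_ok)
                    return -EINVAL;         /* was: return false */

            return 0;                       /* was: return true */
    }

Note the inverted polarity: success flips from true to 0, which is why every return site in the hunk has to change, not just the failure paths.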
 
 static void
@@ -1835,7 +1832,6 @@ intel_hdmi_unset_edid(struct drm_connector *connector)
 
 	intel_hdmi->has_hdmi_sink = false;
 	intel_hdmi->has_audio = false;
-	intel_hdmi->rgb_quant_range_selectable = false;
 
 	intel_hdmi->dp_dual_mode.type = DRM_DP_DUAL_MODE_NONE;
 	intel_hdmi->dp_dual_mode.max_tmds_clock = 0;
@@ -1896,11 +1892,12 @@ intel_hdmi_set_edid(struct drm_connector *connector)
 {
 	struct drm_i915_private *dev_priv = to_i915(connector->dev);
 	struct intel_hdmi *intel_hdmi = intel_attached_hdmi(connector);
+	intel_wakeref_t wakeref;
 	struct edid *edid;
 	bool connected = false;
 	struct i2c_adapter *i2c;
 
-	intel_display_power_get(dev_priv, POWER_DOMAIN_GMBUS);
+	wakeref = intel_display_power_get(dev_priv, POWER_DOMAIN_GMBUS);
 
 	i2c = intel_gmbus_get_adapter(dev_priv, intel_hdmi->ddc_bus);
 
@@ -1915,13 +1912,10 @@ intel_hdmi_set_edid(struct drm_connector *connector)
 
 	intel_hdmi_dp_dual_mode_detect(connector, edid != NULL);
 
-	intel_display_power_put(dev_priv, POWER_DOMAIN_GMBUS);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_GMBUS, wakeref);
 
 	to_intel_connector(connector)->detect_edid = edid;
 	if (edid && edid->input & DRM_EDID_INPUT_DIGITAL) {
-		intel_hdmi->rgb_quant_range_selectable =
-			drm_rgb_quant_range_selectable(edid);
-
 		intel_hdmi->has_audio = drm_detect_monitor_audio(edid);
 		intel_hdmi->has_hdmi_sink = drm_detect_hdmi_monitor(edid);
 
@@ -1940,11 +1934,12 @@ intel_hdmi_detect(struct drm_connector *connector, bool force)
 	struct drm_i915_private *dev_priv = to_i915(connector->dev);
 	struct intel_hdmi *intel_hdmi = intel_attached_hdmi(connector);
 	struct intel_encoder *encoder = &hdmi_to_dig_port(intel_hdmi)->base;
+	intel_wakeref_t wakeref;
 
 	DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n",
 		      connector->base.id, connector->name);
 
-	intel_display_power_get(dev_priv, POWER_DOMAIN_GMBUS);
+	wakeref = intel_display_power_get(dev_priv, POWER_DOMAIN_GMBUS);
 
 	if (IS_ICELAKE(dev_priv) &&
 	    !intel_digital_port_connected(encoder))
@@ -1956,7 +1951,7 @@ intel_hdmi_detect(struct drm_connector *connector, bool force)
 		status = connector_status_connected;
 
 out:
-	intel_display_power_put(dev_priv, POWER_DOMAIN_GMBUS);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_GMBUS, wakeref);
 
 	if (status != connector_status_connected)
 		cec_notifier_phys_addr_invalidate(intel_hdmi->cec_notifier);
@@ -2155,7 +2150,7 @@ intel_hdmi_add_properties(struct intel_hdmi *intel_hdmi, struct drm_connector *c
 	drm_connector_attach_content_type_property(connector);
 	connector->state->picture_aspect_ratio = HDMI_PICTURE_ASPECT_NONE;
 
-	if (!HAS_GMCH_DISPLAY(dev_priv))
+	if (!HAS_GMCH(dev_priv))
 		drm_connector_attach_max_bpc_property(connector, 8, 12);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_hotplug.c b/drivers/gpu/drm/i915/intel_hotplug.c
index e24174d08fed..b8937c788f03 100644
--- a/drivers/gpu/drm/i915/intel_hotplug.c
+++ b/drivers/gpu/drm/i915/intel_hotplug.c
@@ -23,7 +23,6 @@
 
 #include <linux/kernel.h>
 
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
 
 #include "i915_drv.h"
@@ -227,9 +226,10 @@ static void intel_hpd_irq_storm_reenable_work(struct work_struct *work)
 		container_of(work, typeof(*dev_priv),
 			     hotplug.reenable_work.work);
 	struct drm_device *dev = &dev_priv->drm;
+	intel_wakeref_t wakeref;
 	enum hpd_pin pin;
 
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	spin_lock_irq(&dev_priv->irq_lock);
 	for_each_hpd_pin(pin) {
@@ -262,7 +262,7 @@ static void intel_hpd_irq_storm_reenable_work(struct work_struct *work)
 		dev_priv->display.hpd_irq_setup(dev_priv);
 	spin_unlock_irq(&dev_priv->irq_lock);
 
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 }
 
 bool intel_encoder_hotplug(struct intel_encoder *encoder,
@@ -470,7 +470,7 @@ void intel_hpd_irq_handler(struct drm_i915_private *dev_priv,
 			 * hotplug bits itself. So only WARN about unexpected
 			 * interrupts on saner platforms.
 			 */
-			WARN_ONCE(!HAS_GMCH_DISPLAY(dev_priv),
+			WARN_ONCE(!HAS_GMCH(dev_priv),
 				  "Received HPD interrupt on pin %d although disabled\n", pin);
 			continue;
 		}
diff --git a/drivers/gpu/drm/i915/intel_huc.c b/drivers/gpu/drm/i915/intel_huc.c
index bc27b691d824..9bd1c9002c2a 100644
--- a/drivers/gpu/drm/i915/intel_huc.c
+++ b/drivers/gpu/drm/i915/intel_huc.c
@@ -115,14 +115,14 @@ fail:
 int intel_huc_check_status(struct intel_huc *huc)
 {
 	struct drm_i915_private *dev_priv = huc_to_i915(huc);
-	bool status;
+	intel_wakeref_t wakeref;
+	bool status = false;
 
 	if (!HAS_HUC(dev_priv))
 		return -ENODEV;
 
-	intel_runtime_pm_get(dev_priv);
-	status = I915_READ(HUC_STATUS2) & HUC_FW_VERIFIED;
-	intel_runtime_pm_put(dev_priv);
+	with_intel_runtime_pm(dev_priv, wakeref)
+		status = I915_READ(HUC_STATUS2) & HUC_FW_VERIFIED;
 
 	return status;
 }
diff --git a/drivers/gpu/drm/i915/intel_huc_fw.c b/drivers/gpu/drm/i915/intel_huc_fw.c
index f93d2384d482..7d7bfc7f7ca7 100644
--- a/drivers/gpu/drm/i915/intel_huc_fw.c
+++ b/drivers/gpu/drm/i915/intel_huc_fw.c
@@ -23,8 +23,8 @@
  */
 
 #define BXT_HUC_FW_MAJOR 01
-#define BXT_HUC_FW_MINOR 07
-#define BXT_BLD_NUM 1398
+#define BXT_HUC_FW_MINOR 8
+#define BXT_BLD_NUM 2893
 
 #define SKL_HUC_FW_MAJOR 01
 #define SKL_HUC_FW_MINOR 07
@@ -76,9 +76,6 @@ static void huc_fw_select(struct intel_uc_fw *huc_fw)
 		huc_fw->path = I915_KBL_HUC_UCODE;
 		huc_fw->major_ver_wanted = KBL_HUC_FW_MAJOR;
 		huc_fw->minor_ver_wanted = KBL_HUC_FW_MINOR;
-	} else {
-		DRM_WARN("%s: No firmware known for this platform!\n",
-			 intel_uc_fw_type_repr(huc_fw->type));
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/intel_i2c.c b/drivers/gpu/drm/i915/intel_i2c.c
index 802d0394ccc4..5a733e711355 100644
--- a/drivers/gpu/drm/i915/intel_i2c.c
+++ b/drivers/gpu/drm/i915/intel_i2c.c
@@ -29,7 +29,6 @@
 #include <linux/i2c.h>
 #include <linux/i2c-algo-bit.h>
 #include <linux/export.h>
-#include <drm/drmP.h>
 #include <drm/drm_hdcp.h>
 #include "intel_drv.h"
 #include <drm/i915_drm.h>
@@ -698,12 +697,13 @@ out:
 static int
 gmbus_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs, int num)
 {
-	struct intel_gmbus *bus = container_of(adapter, struct intel_gmbus,
-					       adapter);
+	struct intel_gmbus *bus =
+		container_of(adapter, struct intel_gmbus, adapter);
 	struct drm_i915_private *dev_priv = bus->dev_priv;
+	intel_wakeref_t wakeref;
 	int ret;
 
-	intel_display_power_get(dev_priv, POWER_DOMAIN_GMBUS);
+	wakeref = intel_display_power_get(dev_priv, POWER_DOMAIN_GMBUS);
 
 	if (bus->force_bit) {
 		ret = i2c_bit_algo.master_xfer(adapter, msgs, num);
@@ -715,17 +715,16 @@ gmbus_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs, int num)
 			bus->force_bit |= GMBUS_FORCE_BIT_RETRY;
 	}
 
-	intel_display_power_put(dev_priv, POWER_DOMAIN_GMBUS);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_GMBUS, wakeref);
 
 	return ret;
 }
 
 int intel_gmbus_output_aksv(struct i2c_adapter *adapter)
 {
-	struct intel_gmbus *bus = container_of(adapter, struct intel_gmbus,
-					       adapter);
+	struct intel_gmbus *bus =
+		container_of(adapter, struct intel_gmbus, adapter);
 	struct drm_i915_private *dev_priv = bus->dev_priv;
-	int ret;
 	u8 cmd = DRM_HDCP_DDC_AKSV;
 	u8 buf[DRM_HDCP_KSV_LEN] = { 0 };
 	struct i2c_msg msgs[] = {
@@ -742,8 +741,10 @@ int intel_gmbus_output_aksv(struct i2c_adapter *adapter)
 			.buf = buf,
 		}
 	};
+	intel_wakeref_t wakeref;
+	int ret;
 
-	intel_display_power_get(dev_priv, POWER_DOMAIN_GMBUS);
+	wakeref = intel_display_power_get(dev_priv, POWER_DOMAIN_GMBUS);
 	mutex_lock(&dev_priv->gmbus_mutex);
 
 	/*
@@ -754,7 +755,7 @@ int intel_gmbus_output_aksv(struct i2c_adapter *adapter)
 	ret = do_gmbus_xfer(adapter, msgs, ARRAY_SIZE(msgs), GMBUS_AKSV_SELECT);
 
 	mutex_unlock(&dev_priv->gmbus_mutex);
-	intel_display_power_put(dev_priv, POWER_DOMAIN_GMBUS);
+	intel_display_power_put(dev_priv, POWER_DOMAIN_GMBUS, wakeref);
 
 	return ret;
 }
@@ -822,7 +823,7 @@ int intel_setup_gmbus(struct drm_i915_private *dev_priv)
 
 	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
 		dev_priv->gpio_mmio_base = VLV_DISPLAY_BASE;
-	else if (!HAS_GMCH_DISPLAY(dev_priv))
+	else if (!HAS_GMCH(dev_priv))
 		/*
 		 * Broxton uses the same PCH offsets for South Display Engine,
 		 * even though it doesn't have a PCH.
diff --git a/drivers/gpu/drm/i915/intel_lpe_audio.c b/drivers/gpu/drm/i915/intel_lpe_audio.c
index 5d5336fbe7b0..f8239bca3820 100644
--- a/drivers/gpu/drm/i915/intel_lpe_audio.c
+++ b/drivers/gpu/drm/i915/intel_lpe_audio.c
@@ -65,6 +65,7 @@
 #include <linux/irq.h>
 #include <linux/pci.h>
 #include <linux/pm_runtime.h>
+#include <linux/platform_device.h>
 
 #include "i915_drv.h"
 #include <linux/delay.h>
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index eab9341a5152..5e98fd79bd9d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -133,10 +133,10 @@
  */
 #include <linux/interrupt.h>
 
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 #include "i915_gem_render_state.h"
+#include "i915_reset.h"
 #include "i915_vgpu.h"
 #include "intel_lrc_reg.h"
 #include "intel_mocs.h"
@@ -172,6 +172,12 @@ static void execlists_init_reg_state(u32 *reg_state,
 				     struct intel_engine_cs *engine,
 				     struct intel_ring *ring);
 
+static inline u32 intel_hws_seqno_address(struct intel_engine_cs *engine)
+{
+	return (i915_ggtt_offset(engine->status_page.vma) +
+		I915_GEM_HWS_INDEX_ADDR);
+}
+
 static inline struct i915_priolist *to_priolist(struct rb_node *rb)
 {
 	return rb_entry(rb, struct i915_priolist, node);
@@ -182,13 +188,90 @@ static inline int rq_prio(const struct i915_request *rq)
 	return rq->sched.attr.priority;
 }
 
+static int queue_prio(const struct intel_engine_execlists *execlists)
+{
+	struct i915_priolist *p;
+	struct rb_node *rb;
+
+	rb = rb_first_cached(&execlists->queue);
+	if (!rb)
+		return INT_MIN;
+
+	/*
+	 * As the priolist[] is inverted, with the highest priority in [0],
+	 * we have to flip the index value back into a priority.
+	 */
+	p = to_priolist(rb);
+	return ((p->priority + 1) << I915_USER_PRIORITY_SHIFT) - ffs(p->used);
+}
+
 static inline bool need_preempt(const struct intel_engine_cs *engine,
-				const struct i915_request *last,
-				int prio)
+				const struct i915_request *rq)
 {
-	return (intel_engine_has_preemption(engine) &&
-		__execlists_need_preempt(prio, rq_prio(last)) &&
-		!i915_request_completed(last));
+	const int last_prio = rq_prio(rq);
+
+	if (!intel_engine_has_preemption(engine))
+		return false;
+
+	if (i915_request_completed(rq))
+		return false;
+
+	/*
+	 * Check if the current priority hint merits a preemption attempt.
+	 *
+	 * We record the highest priority value we saw during rescheduling
+	 * prior to this dequeue, therefore we know that if it is strictly
+	 * less than the current tail of ELSP[0], we do not need to force
+	 * a preempt-to-idle cycle.
+	 *
+	 * However, the priority hint is a mere hint that we may need to
+	 * preempt. If that hint is stale or we may be trying to preempt
+	 * ourselves, ignore the request.
+	 */
+	if (!__execlists_need_preempt(engine->execlists.queue_priority_hint,
+				      last_prio))
+		return false;
+
+	/*
+	 * Check against the first request in ELSP[1], it will, thanks to the
+	 * power of PI, be the highest priority of that context.
+	 */
+	if (!list_is_last(&rq->link, &engine->timeline.requests) &&
+	    rq_prio(list_next_entry(rq, link)) > last_prio)
+		return true;
+
+	/*
+	 * If the inflight context did not trigger the preemption, then maybe
+	 * it was the set of queued requests? Pick the highest priority in
+	 * the queue (the first active priolist) and see if it deserves to be
+	 * running instead of ELSP[0].
+	 *
+	 * The highest priority request in the queue cannot be either
+	 * ELSP[0] or ELSP[1] as, thanks again to PI, if it was the same
+	 * context, its priority would not exceed ELSP[0] aka last_prio.
+	 */
+	return queue_prio(&engine->execlists) > last_prio;
+}
+
+__maybe_unused static inline bool
+assert_priority_queue(const struct intel_engine_execlists *execlists,
+		      const struct i915_request *prev,
+		      const struct i915_request *next)
+{
+	if (!prev)
+		return true;
+
+	/*
+	 * Without preemption, the prev may refer to the still active element
+	 * which we refuse to let go.
+	 *
+	 * Even with preemption, there are times when we think it is better not
+	 * to preempt and leave an ostensibly lower priority request in flight.
+	 */
+	if (port_request(execlists->port) == prev)
+		return true;
+
+	return rq_prio(prev) >= rq_prio(next);
 }
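
A worked example may help with the bit trick in queue_prio() above: each base priority carries a small bitmap of used sub-levels, inverted so the highest sub-priority sits at bit 0, and ffs() recovers the first used index. Stand-alone illustration, assuming four sub-levels per priority (the shift width is a stand-in for I915_USER_PRIORITY_SHIFT):

    #include <stdio.h>
    #include <strings.h>    /* ffs() */

    #define EX_PRIORITY_SHIFT 2     /* assumed: 4 sub-levels per priority */

    static int ex_queue_prio(int base_priority, unsigned int used)
    {
            return ((base_priority + 1) << EX_PRIORITY_SHIFT) - ffs(used);
    }

    int main(void)
    {
            /* base 0, sub-level 0 used: ffs(0x1) == 1 -> (1 << 2) - 1 = 3 */
            printf("%d\n", ex_queue_prio(0, 0x1));
            /* base 0, sub-level 2 used: ffs(0x4) == 3 -> 4 - 3 = 1, lower */
            printf("%d\n", ex_queue_prio(0, 0x4));
            return 0;
    }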
 
 /*
@@ -265,7 +348,8 @@ static void unwind_wa_tail(struct i915_request *rq)
 	assert_ring_tail_valid(rq->ring, rq->tail);
 }
 
-static void __unwind_incomplete_requests(struct intel_engine_cs *engine)
+static struct i915_request *
+__unwind_incomplete_requests(struct intel_engine_cs *engine)
 {
 	struct i915_request *rq, *rn, *active = NULL;
 	struct list_head *uninitialized_var(pl);
@@ -307,6 +391,8 @@ static void __unwind_incomplete_requests(struct intel_engine_cs *engine)
 		list_move_tail(&active->sched.link,
 			       i915_sched_lookup_priolist(engine, prio));
 	}
+
+	return active;
 }
 
 void
@@ -364,31 +450,12 @@ execlists_context_schedule_out(struct i915_request *rq, unsigned long status)
 	trace_i915_request_out(rq);
 }
 
-static void
-execlists_update_context_pdps(struct i915_hw_ppgtt *ppgtt, u32 *reg_state)
-{
-	ASSIGN_CTX_PDP(ppgtt, reg_state, 3);
-	ASSIGN_CTX_PDP(ppgtt, reg_state, 2);
-	ASSIGN_CTX_PDP(ppgtt, reg_state, 1);
-	ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
-}
-
 static u64 execlists_update_context(struct i915_request *rq)
 {
-	struct i915_hw_ppgtt *ppgtt = rq->gem_context->ppgtt;
 	struct intel_context *ce = rq->hw_context;
-	u32 *reg_state = ce->lrc_reg_state;
-
-	reg_state[CTX_RING_TAIL+1] = intel_ring_set_tail(rq->ring, rq->tail);
 
-	/*
-	 * True 32b PPGTT with dynamic page allocation: update PDP
-	 * registers and point the unallocated PDPs to scratch page.
-	 * PML4 is allocated during ppgtt init, so this is not needed
-	 * in 48-bit mode.
-	 */
-	if (!i915_vm_is_48bit(&ppgtt->vm))
-		execlists_update_context_pdps(ppgtt, reg_state);
+	ce->lrc_reg_state[CTX_RING_TAIL + 1] =
+		intel_ring_set_tail(rq->ring, rq->tail);
 
 	/*
 	 * Make sure the context image is complete before we submit it to HW.
@@ -456,11 +523,12 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 			desc = execlists_update_context(rq);
 			GEM_DEBUG_EXEC(port[n].context_id = upper_32_bits(desc));
 
-			GEM_TRACE("%s in[%d]:  ctx=%d.%d, global=%d (fence %llx:%d) (current %d), prio=%d\n",
+			GEM_TRACE("%s in[%d]:  ctx=%d.%d, global=%d (fence %llx:%lld) (current %d:%d), prio=%d\n",
 				  engine->name, n,
 				  port[n].context_id, count,
 				  rq->global_seqno,
 				  rq->fence.context, rq->fence.seqno,
+				  hwsp_seqno(rq),
 				  intel_engine_get_seqno(engine),
 				  rq_prio(rq));
 		} else {
@@ -532,6 +600,8 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
 
 	execlists_clear_active(execlists, EXECLISTS_ACTIVE_HWACK);
 	execlists_set_active(execlists, EXECLISTS_ACTIVE_PREEMPT);
+
+	(void)I915_SELFTEST_ONLY(execlists->preempt_hang.count++);
 }
 
 static void complete_preempt_context(struct intel_engine_execlists *execlists)
@@ -600,7 +670,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_HWACK))
 			return;
 
-		if (need_preempt(engine, last, execlists->queue_priority)) {
+		if (need_preempt(engine, last)) {
 			inject_preempt_context(engine);
 			return;
 		}
@@ -633,7 +703,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		 * WaIdleLiteRestore:bdw,skl
 		 * Apply the wa NOOPs to prevent
 		 * ring:HEAD == rq:TAIL as we resubmit the
-		 * request. See gen8_emit_breadcrumb() for
+		 * request. See gen8_emit_fini_breadcrumb() for
 		 * where we prepare the padding after the
 		 * end of the request.
 		 */
@@ -646,8 +716,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		int i;
 
 		priolist_for_each_request_consume(rq, rn, p, i) {
-			GEM_BUG_ON(last &&
-				   need_preempt(engine, last, rq_prio(rq)));
+			GEM_BUG_ON(!assert_priority_queue(execlists, last, rq));
 
 			/*
 			 * Can we combine this request with the current port?
@@ -708,20 +777,20 @@ done:
 	/*
 	 * Here be a bit of magic! Or sleight-of-hand, whichever you prefer.
 	 *
-	 * We choose queue_priority such that if we add a request of greater
+	 * We choose the priority hint such that if we add a request of greater
 	 * priority than this, we kick the submission tasklet to decide on
 	 * the right order of submitting the requests to hardware. We must
 	 * also be prepared to reorder requests as they are in-flight on the
-	 * HW. We derive the queue_priority then as the first "hole" in
+	 * HW. We derive the priority hint then as the first "hole" in
 	 * the HW submission ports and if there are no available slots,
 	 * the priority of the lowest executing request, i.e. last.
 	 *
 	 * When we do receive a higher priority request ready to run from the
-	 * user, see queue_request(), the queue_priority is bumped to that
+	 * user, see queue_request(), the priority hint is bumped to that
 	 * request triggering preemption on the next dequeue (or subsequent
 	 * interrupt for secondary ports).
 	 */
-	execlists->queue_priority =
+	execlists->queue_priority_hint =
 		port != execlists->port ? rq_prio(last) : INT_MIN;
 
 	if (submit) {
@@ -752,11 +821,12 @@ execlists_cancel_port_requests(struct intel_engine_execlists * const execlists)
 	while (num_ports-- && port_isset(port)) {
 		struct i915_request *rq = port_request(port);
 
-		GEM_TRACE("%s:port%u global=%d (fence %llx:%d), (current %d)\n",
+		GEM_TRACE("%s:port%u global=%d (fence %llx:%lld), (current %d:%d)\n",
 			  rq->engine->name,
 			  (unsigned int)(port - execlists->port),
 			  rq->global_seqno,
 			  rq->fence.context, rq->fence.seqno,
+			  hwsp_seqno(rq),
 			  intel_engine_get_seqno(rq->engine));
 
 		GEM_BUG_ON(!execlists->active);
@@ -774,6 +844,13 @@ execlists_cancel_port_requests(struct intel_engine_execlists * const execlists)
 	execlists_clear_all_active(execlists);
 }
 
+static inline void
+invalidate_csb_entries(const u32 *first, const u32 *last)
+{
+	clflush((void *)first);
+	clflush((void *)last);
+}
+
 static void reset_csb_pointers(struct intel_engine_execlists *execlists)
 {
 	const unsigned int reset_value = GEN8_CSB_ENTRIES - 1;
@@ -789,6 +866,9 @@ static void reset_csb_pointers(struct intel_engine_execlists *execlists)
 	 */
 	execlists->csb_head = reset_value;
 	WRITE_ONCE(*execlists->csb_write, reset_value);
+
+	invalidate_csb_entries(&execlists->csb_status[0],
+			       &execlists->csb_status[GEN8_CSB_ENTRIES - 1]);
 }
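
invalidate_csb_entries() above evicts the cachelines holding the first and last status-buffer entries, so the next read is forced out to the memory the GPU just wrote; the buffer is small enough that its two ends pin down every cacheline it can span. A hedged x86-only sketch using the compiler builtin (the kernel goes through its own clflush() wrapper):

    /* requires an x86 target; illustrative, not the kernel's helper */
    static inline void ex_invalidate_range(const void *first, const void *last)
    {
            __builtin_ia32_clflush(first);  /* line holding entry 0 */
            __builtin_ia32_clflush(last);   /* line holding the tail entry */
    }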
 
 static void nop_submission_tasklet(unsigned long data)
@@ -830,10 +910,10 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 	list_for_each_entry(rq, &engine->timeline.requests, link) {
 		GEM_BUG_ON(!rq->global_seqno);
 
-		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags))
-			continue;
+		if (!i915_request_signaled(rq))
+			dma_fence_set_error(&rq->fence, -EIO);
 
-		dma_fence_set_error(&rq->fence, -EIO);
+		i915_request_mark_complete(rq);
 	}
 
 	/* Flush the queued requests to the timeline list (for retiring). */
@@ -843,9 +923,9 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 
 		priolist_for_each_request_consume(rq, rn, p, i) {
 			list_del_init(&rq->sched.link);
-
-			dma_fence_set_error(&rq->fence, -EIO);
 			__i915_request_submit(rq);
+			dma_fence_set_error(&rq->fence, -EIO);
+			i915_request_mark_complete(rq);
 		}
 
 		rb_erase_cached(&p->node, &execlists->queue);
@@ -859,7 +939,7 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 
 	/* Remaining _unready_ requests will be nop'ed when submitted */
 
-	execlists->queue_priority = INT_MIN;
+	execlists->queue_priority_hint = INT_MIN;
 	execlists->queue = RB_ROOT_CACHED;
 	GEM_BUG_ON(port_isset(execlists->port));
 
@@ -882,6 +962,8 @@ static void process_csb(struct intel_engine_cs *engine)
 	const u32 * const buf = execlists->csb_status;
 	u8 head, tail;
 
+	lockdep_assert_held(&engine->timeline.lock);
+
 	/*
 	 * Note that csb_write, csb_status may be either in HWSP or mmio.
 	 * When reading from the csb_write mmio register, we have to be
@@ -970,12 +1052,13 @@ static void process_csb(struct intel_engine_cs *engine)
 						EXECLISTS_ACTIVE_USER));
 
 		rq = port_unpack(port, &count);
-		GEM_TRACE("%s out[0]: ctx=%d.%d, global=%d (fence %llx:%d) (current %d), prio=%d\n",
+		GEM_TRACE("%s out[0]: ctx=%d.%d, global=%d (fence %llx:%lld) (current %d:%d), prio=%d\n",
 			  engine->name,
 			  port->context_id, count,
 			  rq ? rq->global_seqno : 0,
 			  rq ? rq->fence.context : 0,
 			  rq ? rq->fence.seqno : 0,
+			  rq ? hwsp_seqno(rq) : 0,
 			  intel_engine_get_seqno(engine),
 			  rq ? rq_prio(rq) : 0);
 
@@ -1024,6 +1107,19 @@ static void process_csb(struct intel_engine_cs *engine)
 	} while (head != tail);
 
 	execlists->csb_head = head;
+
+	/*
+	 * Gen11 has proven to fail wrt the global observation point
+	 * between entry and tail update, failing on the ordering and
+	 * thus we see an old entry in the context status buffer.
+	 *
+	 * Forcibly evict the entries before the next GPU CSB update,
+	 * to increase the odds that we get fresh entries from the
+	 * non-working hardware. The cost of doing so comes out mostly
+	 * in the wash, as hardware, working or not, will need to do
+	 * the invalidation beforehand.
+	 */
+	invalidate_csb_entries(&buf[0], &buf[GEN8_CSB_ENTRIES - 1]);
 }
 
 static void __execlists_submission_tasklet(struct intel_engine_cs *const engine)
@@ -1046,7 +1142,7 @@ static void execlists_submission_tasklet(unsigned long data)
 
 	GEM_TRACE("%s awake?=%d, active=%x\n",
 		  engine->name,
-		  engine->i915->gt.awake,
+		  !!engine->i915->gt.awake,
 		  engine->execlists.active);
 
 	spin_lock_irqsave(&engine->timeline.lock, flags);
@@ -1076,8 +1172,8 @@ static void __submit_queue_imm(struct intel_engine_cs *engine)
 
 static void submit_queue(struct intel_engine_cs *engine, int prio)
 {
-	if (prio > engine->execlists.queue_priority) {
-		engine->execlists.queue_priority = prio;
+	if (prio > engine->execlists.queue_priority_hint) {
+		engine->execlists.queue_priority_hint = prio;
 		__submit_queue_imm(engine);
 	}
 }
@@ -1170,6 +1266,23 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma)
 	return i915_vma_pin(vma, 0, 0, flags);
 }
 
+static void
+__execlists_update_reg_state(struct intel_engine_cs *engine,
+			     struct intel_context *ce)
+{
+	u32 *regs = ce->lrc_reg_state;
+	struct intel_ring *ring = ce->ring;
+
+	regs[CTX_RING_BUFFER_START + 1] = i915_ggtt_offset(ring->vma);
+	regs[CTX_RING_HEAD + 1] = ring->head;
+	regs[CTX_RING_TAIL + 1] = ring->tail;
+
+	/* RPCS */
+	if (engine->class == RENDER_CLASS)
+		regs[CTX_R_PWR_CLK_STATE + 1] = gen8_make_rpcs(engine->i915,
+							       &ce->sseu);
+}
+
 static struct intel_context *
 __execlists_context_pin(struct intel_engine_cs *engine,
 			struct i915_gem_context *ctx,
@@ -1208,10 +1321,8 @@ __execlists_context_pin(struct intel_engine_cs *engine,
 	GEM_BUG_ON(!intel_ring_offset_valid(ce->ring, ce->ring->head));
 
 	ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
-	ce->lrc_reg_state[CTX_RING_BUFFER_START+1] =
-		i915_ggtt_offset(ce->ring->vma);
-	ce->lrc_reg_state[CTX_RING_HEAD + 1] = ce->ring->head;
-	ce->lrc_reg_state[CTX_RING_TAIL + 1] = ce->ring->tail;
+
+	__execlists_update_reg_state(engine, ce);
 
 	ce->state->obj->pin_global++;
 	i915_gem_context_get(ctx);
@@ -1251,29 +1362,116 @@ execlists_context_pin(struct intel_engine_cs *engine,
 	return __execlists_context_pin(engine, ctx, ce);
 }
 
+static int gen8_emit_init_breadcrumb(struct i915_request *rq)
+{
+	u32 *cs;
+
+	GEM_BUG_ON(!rq->timeline->has_initial_breadcrumb);
+
+	cs = intel_ring_begin(rq, 6);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	/*
+	 * Check if we have been preempted before we even get started.
+	 *
+	 * After this point i915_request_started() reports true, even if
+	 * we get preempted and so are no longer running.
+	 */
+	*cs++ = MI_ARB_CHECK;
+	*cs++ = MI_NOOP;
+
+	*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
+	*cs++ = rq->timeline->hwsp_offset;
+	*cs++ = 0;
+	*cs++ = rq->fence.seqno - 1;
+
+	intel_ring_advance(rq, cs);
+	return 0;
+}
+
+static int emit_pdps(struct i915_request *rq)
+{
+	const struct intel_engine_cs * const engine = rq->engine;
+	struct i915_hw_ppgtt * const ppgtt = rq->gem_context->ppgtt;
+	int err, i;
+	u32 *cs;
+
+	GEM_BUG_ON(intel_vgpu_active(rq->i915));
+
+	/*
+	 * Beware ye of the dragons, this sequence is magic!
+	 *
+	 * Small changes to this sequence can cause anything from
+	 * GPU hangs to forcewake errors and machine lockups!
+	 */
+
+	/* Flush any residual operations from the context load */
+	err = engine->emit_flush(rq, EMIT_FLUSH);
+	if (err)
+		return err;
+
+	/* Magic required to prevent forcewake errors! */
+	err = engine->emit_flush(rq, EMIT_INVALIDATE);
+	if (err)
+		return err;
+
+	cs = intel_ring_begin(rq, 4 * GEN8_3LVL_PDPES + 2);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	/* Ensure the LRI have landed before we invalidate & continue */
+	*cs++ = MI_LOAD_REGISTER_IMM(2 * GEN8_3LVL_PDPES) | MI_LRI_FORCE_POSTED;
+	for (i = GEN8_3LVL_PDPES; i--; ) {
+		const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
+
+		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, i));
+		*cs++ = upper_32_bits(pd_daddr);
+		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, i));
+		*cs++ = lower_32_bits(pd_daddr);
+	}
+	*cs++ = MI_NOOP;
+
+	intel_ring_advance(rq, cs);
+
+	/* Be doubly sure the LRI have landed before proceeding */
+	err = engine->emit_flush(rq, EMIT_FLUSH);
+	if (err)
+		return err;
+
+	/* Re-invalidate the TLB for luck */
+	return engine->emit_flush(rq, EMIT_INVALIDATE);
+}
+
 static int execlists_request_alloc(struct i915_request *request)
 {
 	int ret;
 
 	GEM_BUG_ON(!request->hw_context->pin_count);
 
-	/* Flush enough space to reduce the likelihood of waiting after
+	/*
+	 * Flush enough space to reduce the likelihood of waiting after
 	 * we start building the request - in which case we will just
 	 * have to repeat work.
 	 */
 	request->reserved_space += EXECLISTS_REQUEST_SIZE;
 
-	ret = intel_ring_wait_for_space(request->ring, request->reserved_space);
-	if (ret)
-		return ret;
-
-	/* Note that after this point, we have committed to using
+	/*
+	 * Note that after this point, we have committed to using
 	 * this request as it is being used to both track the
 	 * state of engine initialisation and liveness of the
 	 * golden renderstate above. Think twice before you try
 	 * to cancel/unwind this request now.
 	 */
 
+	/* Unconditionally invalidate GPU caches and TLBs. */
+	if (i915_vm_is_48bit(&request->gem_context->ppgtt->vm))
+		ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
+	else
+		ret = emit_pdps(request);
+	if (ret)
+		return ret;
+
 	request->reserved_space -= EXECLISTS_REQUEST_SIZE;
 	return 0;
 }
@@ -1596,7 +1794,7 @@ static void enable_execlists(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
 
-	I915_WRITE(RING_HWSTAM(engine->mmio_base), 0xffffffff);
+	intel_engine_set_hwsp_writemask(engine, ~0u); /* HWSTAM */
 
 	/*
 	 * Make sure we're not enabling the new 12-deep CSB
@@ -1617,7 +1815,7 @@ static void enable_execlists(struct intel_engine_cs *engine)
 		   _MASKED_BIT_DISABLE(STOP_RING));
 
 	I915_WRITE(RING_HWS_PGA(engine->mmio_base),
-		   engine->status_page.ggtt_offset);
+		   i915_ggtt_offset(engine->status_page.vma));
 	POSTING_READ(RING_HWS_PGA(engine->mmio_base));
 }
 
@@ -1637,6 +1835,7 @@ static bool unexpected_starting_state(struct intel_engine_cs *engine)
 static int gen8_init_common_ring(struct intel_engine_cs *engine)
 {
 	intel_engine_apply_workarounds(engine);
+	intel_engine_apply_whitelist(engine);
 
 	intel_mocs_init_engine(engine);
 
@@ -1653,48 +1852,9 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
 	return 0;
 }
 
-static int gen8_init_render_ring(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	ret = gen8_init_common_ring(engine);
-	if (ret)
-		return ret;
-
-	intel_engine_apply_whitelist(engine);
-
-	/* We need to disable the AsyncFlip performance optimisations in order
-	 * to use MI_WAIT_FOR_EVENT within the CS. It should already be
-	 * programmed to '1' on all products.
-	 *
-	 * WaDisableAsyncFlipPerfMode:snb,ivb,hsw,vlv,bdw,chv
-	 */
-	I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(ASYNC_FLIP_PERF_DISABLE));
-
-	I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
-
-	return 0;
-}
-
-static int gen9_init_render_ring(struct intel_engine_cs *engine)
-{
-	int ret;
-
-	ret = gen8_init_common_ring(engine);
-	if (ret)
-		return ret;
-
-	intel_engine_apply_whitelist(engine);
-
-	return 0;
-}
-
-static struct i915_request *
-execlists_reset_prepare(struct intel_engine_cs *engine)
+static void execlists_reset_prepare(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
-	struct i915_request *request, *active;
 	unsigned long flags;
 
 	GEM_TRACE("%s: depth<-%d\n", engine->name,
@@ -1710,59 +1870,21 @@ execlists_reset_prepare(struct intel_engine_cs *engine)
 	 * prevents the race.
 	 */
 	__tasklet_disable_sync_once(&execlists->tasklet);
+	GEM_BUG_ON(!reset_in_progress(execlists));
 
+	/* And flush any current direct submission. */
 	spin_lock_irqsave(&engine->timeline.lock, flags);
-
-	/*
-	 * We want to flush the pending context switches, having disabled
-	 * the tasklet above, we can assume exclusive access to the execlists.
-	 * For this allows us to catch up with an inflight preemption event,
-	 * and avoid blaming an innocent request if the stall was due to the
-	 * preemption itself.
-	 */
-	process_csb(engine);
-
-	/*
-	 * The last active request can then be no later than the last request
-	 * now in ELSP[0]. So search backwards from there, so that if the GPU
-	 * has advanced beyond the last CSB update, it will be pardoned.
-	 */
-	active = NULL;
-	request = port_request(execlists->port);
-	if (request) {
-		/*
-		 * Prevent the breadcrumb from advancing before we decide
-		 * which request is currently active.
-		 */
-		intel_engine_stop_cs(engine);
-
-		list_for_each_entry_from_reverse(request,
-						 &engine->timeline.requests,
-						 link) {
-			if (__i915_request_completed(request,
-						     request->global_seqno))
-				break;
-
-			active = request;
-		}
-	}
-
+	process_csb(engine); /* drain preemption events */
 	spin_unlock_irqrestore(&engine->timeline.lock, flags);
-
-	return active;
 }
 
-static void execlists_reset(struct intel_engine_cs *engine,
-			    struct i915_request *request)
+static void execlists_reset(struct intel_engine_cs *engine, bool stalled)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
+	struct i915_request *rq;
 	unsigned long flags;
 	u32 *regs;
 
-	GEM_TRACE("%s request global=%d, current=%d\n",
-		  engine->name, request ? request->global_seqno : 0,
-		  intel_engine_get_seqno(engine));
-
 	spin_lock_irqsave(&engine->timeline.lock, flags);
 
 	/*
@@ -1777,12 +1899,18 @@ static void execlists_reset(struct intel_engine_cs *engine,
 	execlists_cancel_port_requests(execlists);
 
 	/* Push back any incomplete requests for replay after the reset. */
-	__unwind_incomplete_requests(engine);
+	rq = __unwind_incomplete_requests(engine);
 
 	/* Following the reset, we need to reload the CSB read/write pointers */
 	reset_csb_pointers(&engine->execlists);
 
-	spin_unlock_irqrestore(&engine->timeline.lock, flags);
+	GEM_TRACE("%s seqno=%d, current=%d, stalled? %s\n",
+		  engine->name,
+		  rq ? rq->global_seqno : 0,
+		  intel_engine_get_seqno(engine),
+		  yesno(stalled));
+	if (!rq)
+		goto out_unlock;
 
 	/*
 	 * If the request was innocent, we leave the request in the ELSP
@@ -1795,8 +1923,9 @@ static void execlists_reset(struct intel_engine_cs *engine,
 	 * and have to at least restore the RING register in the context
 	 * image back to the expected values to skip over the guilty request.
 	 */
-	if (!request || request->fence.error != -EIO)
-		return;
+	i915_reset_request(rq, stalled);
+	if (!stalled)
+		goto out_unlock;
 
 	/*
 	 * We want a simple context + ring to execute the breadcrumb update.
@@ -1806,25 +1935,22 @@ static void execlists_reset(struct intel_engine_cs *engine,
 	 * future request will be after userspace has had the opportunity
 	 * to recreate its own state.
 	 */
-	regs = request->hw_context->lrc_reg_state;
+	regs = rq->hw_context->lrc_reg_state;
 	if (engine->pinned_default_state) {
 		memcpy(regs, /* skip restoring the vanilla PPHWSP */
 		       engine->pinned_default_state + LRC_STATE_PN * PAGE_SIZE,
 		       engine->context_size - PAGE_SIZE);
 	}
-	execlists_init_reg_state(regs,
-				 request->gem_context, engine, request->ring);
 
 	/* Move the RING_HEAD onto the breadcrumb, past the hanging batch */
-	regs[CTX_RING_BUFFER_START + 1] = i915_ggtt_offset(request->ring->vma);
+	rq->ring->head = intel_ring_wrap(rq->ring, rq->postfix);
+	intel_ring_update_space(rq->ring);
 
-	request->ring->head = intel_ring_wrap(request->ring, request->postfix);
-	regs[CTX_RING_HEAD + 1] = request->ring->head;
+	execlists_init_reg_state(regs, rq->gem_context, engine, rq->ring);
+	__execlists_update_reg_state(engine, rq->hw_context);
 
-	intel_ring_update_space(request->ring);
-
-	/* Reset WaIdleLiteRestore:bdw,skl as well */
-	unwind_wa_tail(request);
+out_unlock:
+	spin_unlock_irqrestore(&engine->timeline.lock, flags);
 }
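
Rewinding the RING_HEAD onto rq->postfix depends on wrapping the position back into the ring. A sketch of the wrap itself, assuming (as ring buffers here do) a power-of-two size so it reduces to a mask; this is the idea behind intel_ring_wrap(), not its verbatim code:

#include <assert.h>
#include <stdint.h>

static uint32_t ring_wrap(uint32_t size, uint32_t pos)
{
	return pos & (size - 1);	/* power-of-two sizes only */
}

int main(void)
{
	uint32_t size = 4096;

	assert(ring_wrap(size, 100) == 100);	  /* in range: unchanged */
	assert(ring_wrap(size, size + 52) == 52); /* past the end: wraps */
	return 0;
}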
 
 static void execlists_reset_finish(struct intel_engine_cs *engine)
@@ -1837,6 +1963,7 @@ static void execlists_reset_finish(struct intel_engine_cs *engine)
 	 * to sleep before we restart and reload a context.
 	 *
 	 */
+	GEM_BUG_ON(!reset_in_progress(execlists));
 	if (!RB_EMPTY_ROOT(&execlists->queue.rb_root))
 		execlists->tasklet.func(execlists->tasklet.data);
 
@@ -1845,56 +1972,11 @@ static void execlists_reset_finish(struct intel_engine_cs *engine)
 		  atomic_read(&execlists->tasklet.count));
 }
 
-static int intel_logical_ring_emit_pdps(struct i915_request *rq)
-{
-	struct i915_hw_ppgtt *ppgtt = rq->gem_context->ppgtt;
-	struct intel_engine_cs *engine = rq->engine;
-	const int num_lri_cmds = GEN8_3LVL_PDPES * 2;
-	u32 *cs;
-	int i;
-
-	cs = intel_ring_begin(rq, num_lri_cmds * 2 + 2);
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	*cs++ = MI_LOAD_REGISTER_IMM(num_lri_cmds);
-	for (i = GEN8_3LVL_PDPES - 1; i >= 0; i--) {
-		const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
-
-		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, i));
-		*cs++ = upper_32_bits(pd_daddr);
-		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, i));
-		*cs++ = lower_32_bits(pd_daddr);
-	}
-
-	*cs++ = MI_NOOP;
-	intel_ring_advance(rq, cs);
-
-	return 0;
-}
-
 static int gen8_emit_bb_start(struct i915_request *rq,
 			      u64 offset, u32 len,
 			      const unsigned int flags)
 {
 	u32 *cs;
-	int ret;
-
-	/* Don't rely in hw updating PDPs, specially in lite-restore.
-	 * Ideally, we should set Force PD Restore in ctx descriptor,
-	 * but we can't. Force Restore would be a second option, but
-	 * it is unsafe in case of lite-restore (because the ctx is
-	 * not idle). PML4 is allocated during ppgtt init so this is
-	 * not needed in 48-bit.*/
-	if ((intel_engine_flag(rq->engine) & rq->gem_context->ppgtt->pd_dirty_rings) &&
-	    !i915_vm_is_48bit(&rq->gem_context->ppgtt->vm) &&
-	    !intel_vgpu_active(rq->i915)) {
-		ret = intel_logical_ring_emit_pdps(rq);
-		if (ret)
-			return ret;
-
-		rq->gem_context->ppgtt->pd_dirty_rings &= ~intel_engine_flag(rq->engine);
-	}
 
 	cs = intel_ring_begin(rq, 6);
 	if (IS_ERR(cs))
@@ -1927,6 +2009,7 @@ static int gen8_emit_bb_start(struct i915_request *rq,
 
 	*cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
 	*cs++ = MI_NOOP;
+
 	intel_ring_advance(rq, cs);
 
 	return 0;
@@ -2011,7 +2094,7 @@ static int gen8_emit_flush_render(struct i915_request *request,
 		 * On GEN9: before VF_CACHE_INVALIDATE we need to emit a NULL
 		 * pipe control.
 		 */
-		if (IS_GEN9(request->i915))
+		if (IS_GEN(request->i915, 9))
 			vf_flush_wa = true;
 
 		/* WaForGAMHang:kbl */
@@ -2053,45 +2136,62 @@ static int gen8_emit_flush_render(struct i915_request *request,
  * used as a workaround for not being allowed to do lite
  * restore with HEAD==TAIL (WaIdleLiteRestore).
  */
-static void gen8_emit_wa_tail(struct i915_request *request, u32 *cs)
+static u32 *gen8_emit_wa_tail(struct i915_request *request, u32 *cs)
 {
 	/* Ensure there's always at least one preemption point per-request. */
 	*cs++ = MI_ARB_CHECK;
 	*cs++ = MI_NOOP;
 	request->wa_tail = intel_ring_offset(request, cs);
+
+	return cs;
 }
 
-static void gen8_emit_breadcrumb(struct i915_request *request, u32 *cs)
+static u32 *gen8_emit_fini_breadcrumb(struct i915_request *request, u32 *cs)
 {
 	/* w/a: bit 5 needs to be zero for MI_FLUSH_DW address. */
 	BUILD_BUG_ON(I915_GEM_HWS_INDEX_ADDR & (1 << 5));
 
-	cs = gen8_emit_ggtt_write(cs, request->global_seqno,
+	cs = gen8_emit_ggtt_write(cs,
+				  request->fence.seqno,
+				  request->timeline->hwsp_offset);
+
+	cs = gen8_emit_ggtt_write(cs,
+				  request->global_seqno,
 				  intel_hws_seqno_address(request->engine));
+
 	*cs++ = MI_USER_INTERRUPT;
 	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+
 	request->tail = intel_ring_offset(request, cs);
 	assert_ring_tail_valid(request->ring, request->tail);
 
-	gen8_emit_wa_tail(request, cs);
+	return gen8_emit_wa_tail(request, cs);
 }
-static const int gen8_emit_breadcrumb_sz = 6 + WA_TAIL_DWORDS;
 
-static void gen8_emit_breadcrumb_rcs(struct i915_request *request, u32 *cs)
+static u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *request, u32 *cs)
 {
-	/* We're using qword write, seqno should be aligned to 8 bytes. */
-	BUILD_BUG_ON(I915_GEM_HWS_INDEX & 1);
+	cs = gen8_emit_ggtt_write_rcs(cs,
+				      request->fence.seqno,
+				      request->timeline->hwsp_offset,
+				      PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH |
+				      PIPE_CONTROL_DEPTH_CACHE_FLUSH |
+				      PIPE_CONTROL_DC_FLUSH_ENABLE |
+				      PIPE_CONTROL_FLUSH_ENABLE |
+				      PIPE_CONTROL_CS_STALL);
+
+	cs = gen8_emit_ggtt_write_rcs(cs,
+				      request->global_seqno,
+				      intel_hws_seqno_address(request->engine),
+				      PIPE_CONTROL_CS_STALL);
 
-	cs = gen8_emit_ggtt_write_rcs(cs, request->global_seqno,
-				      intel_hws_seqno_address(request->engine));
 	*cs++ = MI_USER_INTERRUPT;
 	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+
 	request->tail = intel_ring_offset(request, cs);
 	assert_ring_tail_valid(request->ring, request->tail);
 
-	gen8_emit_wa_tail(request, cs);
+	return gen8_emit_wa_tail(request, cs);
 }
-static const int gen8_emit_breadcrumb_rcs_sz = 8 + WA_TAIL_DWORDS;
 
 static int gen8_init_rcs_context(struct i915_request *rq)
 {
@@ -2183,8 +2283,8 @@ logical_ring_default_vfuncs(struct intel_engine_cs *engine)
 	engine->request_alloc = execlists_request_alloc;
 
 	engine->emit_flush = gen8_emit_flush;
-	engine->emit_breadcrumb = gen8_emit_breadcrumb;
-	engine->emit_breadcrumb_sz = gen8_emit_breadcrumb_sz;
+	engine->emit_init_breadcrumb = gen8_emit_init_breadcrumb;
+	engine->emit_fini_breadcrumb = gen8_emit_fini_breadcrumb;
 
 	engine->set_default_submission = intel_execlists_set_default_submission;
 
@@ -2223,10 +2323,14 @@ logical_ring_default_irqs(struct intel_engine_cs *engine)
 	engine->irq_keep_mask = GT_CONTEXT_SWITCH_INTERRUPT << shift;
 }
 
-static void
+static int
 logical_ring_setup(struct intel_engine_cs *engine)
 {
-	intel_engine_setup_common(engine);
+	int err;
+
+	err = intel_engine_setup_common(engine);
+	if (err)
+		return err;
 
 	/* Intentionally left blank. */
 	engine->buffer = NULL;
@@ -2236,6 +2340,8 @@ logical_ring_setup(struct intel_engine_cs *engine)
 
 	logical_ring_default_vfuncs(engine);
 	logical_ring_default_irqs(engine);
+
+	return 0;
 }
 
 static int logical_ring_init(struct intel_engine_cs *engine)
@@ -2270,10 +2376,10 @@ static int logical_ring_init(struct intel_engine_cs *engine)
 	}
 
 	execlists->csb_status =
-		&engine->status_page.page_addr[I915_HWS_CSB_BUF0_INDEX];
+		&engine->status_page.addr[I915_HWS_CSB_BUF0_INDEX];
 
 	execlists->csb_write =
-		&engine->status_page.page_addr[intel_hws_csb_write_index(i915)];
+		&engine->status_page.addr[intel_hws_csb_write_index(i915)];
 
 	reset_csb_pointers(execlists);
 
@@ -2282,23 +2388,16 @@ static int logical_ring_init(struct intel_engine_cs *engine)
 
 int logical_render_ring_init(struct intel_engine_cs *engine)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
 	int ret;
 
-	logical_ring_setup(engine);
-
-	if (HAS_L3_DPF(dev_priv))
-		engine->irq_keep_mask |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
+	ret = logical_ring_setup(engine);
+	if (ret)
+		return ret;
 
 	/* Override some for render ring. */
-	if (INTEL_GEN(dev_priv) >= 9)
-		engine->init_hw = gen9_init_render_ring;
-	else
-		engine->init_hw = gen8_init_render_ring;
 	engine->init_context = gen8_init_rcs_context;
 	engine->emit_flush = gen8_emit_flush_render;
-	engine->emit_breadcrumb = gen8_emit_breadcrumb_rcs;
-	engine->emit_breadcrumb_sz = gen8_emit_breadcrumb_rcs_sz;
+	engine->emit_fini_breadcrumb = gen8_emit_fini_breadcrumb_rcs;
 
 	ret = logical_ring_init(engine);
 	if (ret)
@@ -2322,27 +2421,59 @@ int logical_render_ring_init(struct intel_engine_cs *engine)
 
 int logical_xcs_ring_init(struct intel_engine_cs *engine)
 {
-	logical_ring_setup(engine);
+	int err;
+
+	err = logical_ring_setup(engine);
+	if (err)
+		return err;
 
 	return logical_ring_init(engine);
 }
 
-static u32
-make_rpcs(struct drm_i915_private *dev_priv)
+u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *req_sseu)
 {
-	bool subslice_pg = INTEL_INFO(dev_priv)->sseu.has_subslice_pg;
-	u8 slices = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask);
-	u8 subslices = hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask[0]);
+	const struct sseu_dev_info *sseu = &RUNTIME_INFO(i915)->sseu;
+	bool subslice_pg = sseu->has_subslice_pg;
+	struct intel_sseu ctx_sseu;
+	u8 slices, subslices;
 	u32 rpcs = 0;
 
 	/*
 	 * No explicit RPCS request is needed to ensure full
 	 * slice/subslice/EU enablement prior to Gen9.
 	*/
-	if (INTEL_GEN(dev_priv) < 9)
+	if (INTEL_GEN(i915) < 9)
 		return 0;
 
 	/*
+	 * If i915/perf is active, we want a stable powergating configuration
+	 * on the system.
+	 *
+	 * We could choose full enablement, but on ICL we know there are use
+	 * cases which disable slices for functional reasons, apart from
+	 * performance ones. So in this case we select a known stable subset.
+	 */
+	if (!i915->perf.oa.exclusive_stream) {
+		ctx_sseu = *req_sseu;
+	} else {
+		ctx_sseu = intel_device_default_sseu(i915);
+
+		if (IS_GEN(i915, 11)) {
+			/*
+			 * We only need the subslice count, so it doesn't
+			 * matter which ones we select - just turn off half
+			 * of the available subslices per slice, starting
+			 * from the low bits.
+			 */
+			ctx_sseu.subslice_mask =
+				~(~0 << (hweight8(ctx_sseu.subslice_mask) / 2));
+			ctx_sseu.slice_mask = 0x1;
+		}
+	}
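
The expression above builds a mask of the lowest k bits via ~(~0 << k), with k being half the popcount of the available subslices. A standalone version using the compiler's popcount in place of hweight8(), and unsigned arithmetic to sidestep the signed-shift question:

#include <assert.h>
#include <stdint.h>

static uint8_t low_half_mask(uint8_t subslice_mask)
{
	unsigned int half = __builtin_popcount(subslice_mask) / 2;

	return (uint8_t)~(~0u << half);
}

int main(void)
{
	assert(low_half_mask(0xff) == 0x0f);	/* 8 available -> keep 4 */
	assert(low_half_mask(0x0f) == 0x03);	/* 4 available -> keep 2 */
	return 0;
}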
+
+	slices = hweight8(ctx_sseu.slice_mask);
+	subslices = hweight8(ctx_sseu.subslice_mask);
+
+	/*
 	 * Since the SScount bitfield in GEN8_R_PWR_CLK_STATE is only three bits
 	 * wide and Icelake has up to eight subslices, special programming is
 	 * needed in order to correctly enable all subslices.
@@ -2367,7 +2498,9 @@ make_rpcs(struct drm_i915_private *dev_priv)
 	 * subslices are enabled, or a count between one and four on the first
 	 * slice.
 	 */
-	if (IS_GEN11(dev_priv) && slices == 1 && subslices >= 4) {
+	if (IS_GEN(i915, 11) &&
+	    slices == 1 &&
+	    subslices > min_t(u8, 4, hweight8(sseu->subslice_mask[0]) / 2)) {
 		GEM_BUG_ON(subslices & 1);
 
 		subslice_pg = false;
@@ -2380,10 +2513,10 @@ make_rpcs(struct drm_i915_private *dev_priv)
 	 * must make an explicit request through RPCS for full
 	 * enablement.
 	*/
-	if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) {
+	if (sseu->has_slice_pg) {
 		u32 mask, val = slices;
 
-		if (INTEL_GEN(dev_priv) >= 11) {
+		if (INTEL_GEN(i915) >= 11) {
 			mask = GEN11_RPCS_S_CNT_MASK;
 			val <<= GEN11_RPCS_S_CNT_SHIFT;
 		} else {
@@ -2408,18 +2541,16 @@ make_rpcs(struct drm_i915_private *dev_priv)
 		rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_SS_CNT_ENABLE | val;
 	}
 
-	if (INTEL_INFO(dev_priv)->sseu.has_eu_pg) {
+	if (sseu->has_eu_pg) {
 		u32 val;
 
-		val = INTEL_INFO(dev_priv)->sseu.eu_per_subslice <<
-		      GEN8_RPCS_EU_MIN_SHIFT;
+		val = ctx_sseu.min_eus_per_subslice << GEN8_RPCS_EU_MIN_SHIFT;
 		GEM_BUG_ON(val & ~GEN8_RPCS_EU_MIN_MASK);
 		val &= GEN8_RPCS_EU_MIN_MASK;
 
 		rpcs |= val;
 
-		val = INTEL_INFO(dev_priv)->sseu.eu_per_subslice <<
-		      GEN8_RPCS_EU_MAX_SHIFT;
+		val = ctx_sseu.max_eus_per_subslice << GEN8_RPCS_EU_MAX_SHIFT;
 		GEM_BUG_ON(val & ~GEN8_RPCS_EU_MAX_MASK);
 		val &= GEN8_RPCS_EU_MAX_MASK;
 
@@ -2543,12 +2674,16 @@ static void execlists_init_reg_state(u32 *regs,
 		 * other PDP Descriptors are ignored.
 		 */
 		ASSIGN_CTX_PML4(ctx->ppgtt, regs);
+	} else {
+		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 3);
+		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 2);
+		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 1);
+		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 0);
 	}
 
 	if (rcs) {
 		regs[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
-		CTX_REG(regs, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
-			make_rpcs(dev_priv));
+		CTX_REG(regs, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE, 0);
 
 		i915_oa_init_reg_state(engine, ctx, regs);
 	}
@@ -2625,7 +2760,7 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
 {
 	struct drm_i915_gem_object *ctx_obj;
 	struct i915_vma *vma;
-	uint32_t context_size;
+	u32 context_size;
 	struct intel_ring *ring;
 	struct i915_timeline *timeline;
 	int ret;
@@ -2651,7 +2786,7 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
 		goto error_deref_obj;
 	}
 
-	timeline = i915_timeline_create(ctx->i915, ctx->name);
+	timeline = i915_timeline_create(ctx->i915, ctx->name, NULL);
 	if (IS_ERR(timeline)) {
 		ret = PTR_ERR(timeline);
 		goto error_deref_obj;
@@ -2709,14 +2844,70 @@ void intel_lr_context_resume(struct drm_i915_private *i915)
 
 			intel_ring_reset(ce->ring, 0);
 
-			if (ce->pin_count) { /* otherwise done in context_pin */
-				u32 *regs = ce->lrc_reg_state;
+			if (ce->pin_count) /* otherwise done in context_pin */
+				__execlists_update_reg_state(engine, ce);
+		}
+	}
+}
+
+void intel_execlists_show_requests(struct intel_engine_cs *engine,
+				   struct drm_printer *m,
+				   void (*show_request)(struct drm_printer *m,
+							struct i915_request *rq,
+							const char *prefix),
+				   unsigned int max)
+{
+	const struct intel_engine_execlists *execlists = &engine->execlists;
+	struct i915_request *rq, *last;
+	unsigned long flags;
+	unsigned int count;
+	struct rb_node *rb;
 
-				regs[CTX_RING_HEAD + 1] = ce->ring->head;
-				regs[CTX_RING_TAIL + 1] = ce->ring->tail;
-			}
+	spin_lock_irqsave(&engine->timeline.lock, flags);
+
+	last = NULL;
+	count = 0;
+	list_for_each_entry(rq, &engine->timeline.requests, link) {
+		if (count++ < max - 1)
+			show_request(m, rq, "\t\tE ");
+		else
+			last = rq;
+	}
+	if (last) {
+		if (count > max) {
+			drm_printf(m,
+				   "\t\t...skipping %d executing requests...\n",
+				   count - max);
+		}
+		show_request(m, last, "\t\tE ");
+	}
+
+	last = NULL;
+	count = 0;
+	if (execlists->queue_priority_hint != INT_MIN)
+		drm_printf(m, "\t\tQueue priority hint: %d\n",
+			   execlists->queue_priority_hint);
+	for (rb = rb_first_cached(&execlists->queue); rb; rb = rb_next(rb)) {
+		struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
+		int i;
+
+		priolist_for_each_request(rq, p, i) {
+			if (count++ < max - 1)
+				show_request(m, rq, "\t\tQ ");
+			else
+				last = rq;
+		}
+	}
+	if (last) {
+		if (count > max) {
+			drm_printf(m,
+				   "\t\t...skipping %d queued requests...\n",
+				   count - max);
 		}
+		show_request(m, last, "\t\tQ ");
 	}
+
+	spin_unlock_irqrestore(&engine->timeline.lock, flags);
 }
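
The listing above bounds its output with a simple pattern: print the first max - 1 entries, remember only the last, and report how many fell in between. The same pattern over a plain array, as a standalone sketch:

#include <stdio.h>

static void show(const int *vals, int n, unsigned int max)
{
	const int *last = NULL;
	unsigned int count = 0;
	int i;

	for (i = 0; i < n; i++) {
		if (count++ < max - 1)
			printf("E %d\n", vals[i]);
		else
			last = &vals[i];
	}
	if (last) {
		if (count > max)
			printf("...skipping %u entries...\n", count - max);
		printf("E %d\n", *last);
	}
}

int main(void)
{
	int vals[] = { 1, 2, 3, 4, 5, 6 };

	show(vals, 6, 4); /* 1, 2, 3, "...skipping 2...", 6 */
	return 0;
}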
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index f5a5502ecf70..f1aec8a6986f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -97,11 +97,21 @@ int logical_xcs_ring_init(struct intel_engine_cs *engine);
  */
 #define LRC_HEADER_PAGES LRC_PPHWSP_PN
 
+struct drm_printer;
+
 struct drm_i915_private;
 struct i915_gem_context;
 
 void intel_lr_context_resume(struct drm_i915_private *dev_priv);
-
 void intel_execlists_set_default_submission(struct intel_engine_cs *engine);
 
+void intel_execlists_show_requests(struct intel_engine_cs *engine,
+				   struct drm_printer *m,
+				   void (*show_request)(struct drm_printer *m,
+							struct i915_request *rq,
+							const char *prefix),
+				   unsigned int max);
+
+u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu);
+
 #endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/intel_lspcon.c b/drivers/gpu/drm/i915/intel_lspcon.c
index 96a8d9524b0c..322bdddda164 100644
--- a/drivers/gpu/drm/i915/intel_lspcon.c
+++ b/drivers/gpu/drm/i915/intel_lspcon.c
@@ -288,12 +288,12 @@ static bool lspcon_parade_fw_ready(struct drm_dp_aux *aux)
 }
 
 static bool _lspcon_parade_write_infoframe_blocks(struct drm_dp_aux *aux,
-						  uint8_t *avi_buf)
+						  u8 *avi_buf)
 {
 	u8 avi_if_ctrl;
 	u8 block_count = 0;
 	u8 *data;
-	uint16_t reg;
+	u16 reg;
 	ssize_t ret;
 
 	while (block_count < 4) {
@@ -335,10 +335,10 @@ static bool _lspcon_parade_write_infoframe_blocks(struct drm_dp_aux *aux,
 }
 
 static bool _lspcon_write_avi_infoframe_parade(struct drm_dp_aux *aux,
-					       const uint8_t *frame,
+					       const u8 *frame,
 					       ssize_t len)
 {
-	uint8_t avi_if[LSPCON_PARADE_AVI_IF_DATA_SIZE] = {1, };
+	u8 avi_if[LSPCON_PARADE_AVI_IF_DATA_SIZE] = {1, };
 
 	/*
 	 * Parade's frames contain 32 bytes of data, divided
@@ -367,13 +367,13 @@ static bool _lspcon_write_avi_infoframe_parade(struct drm_dp_aux *aux,
 }
 
 static bool _lspcon_write_avi_infoframe_mca(struct drm_dp_aux *aux,
-					    const uint8_t *buffer, ssize_t len)
+					    const u8 *buffer, ssize_t len)
 {
 	int ret;
-	uint32_t val = 0;
-	uint32_t retry;
-	uint16_t reg;
-	const uint8_t *data = buffer;
+	u32 val = 0;
+	u32 retry;
+	u16 reg;
+	const u8 *data = buffer;
 
 	reg = LSPCON_MCA_AVI_IF_WRITE_OFFSET;
 	while (val < len) {
@@ -459,13 +459,11 @@ void lspcon_set_infoframes(struct intel_encoder *encoder,
 {
 	ssize_t ret;
 	union hdmi_infoframe frame;
-	uint8_t buf[VIDEO_DIP_DATA_SIZE];
+	u8 buf[VIDEO_DIP_DATA_SIZE];
 	struct intel_digital_port *dig_port = enc_to_dig_port(&encoder->base);
 	struct intel_lspcon *lspcon = &dig_port->lspcon;
-	struct intel_dp *intel_dp = &dig_port->dp;
-	struct drm_connector *connector = &intel_dp->attached_connector->base;
-	const struct drm_display_mode *mode = &crtc_state->base.adjusted_mode;
-	bool is_hdmi2_sink = connector->display_info.hdmi.scdc.supported;
+	const struct drm_display_mode *adjusted_mode =
+		&crtc_state->base.adjusted_mode;
 
 	if (!lspcon->active) {
 		DRM_ERROR("Writing infoframes while LSPCON disabled ?\n");
@@ -473,7 +471,8 @@ void lspcon_set_infoframes(struct intel_encoder *encoder,
 	}
 
 	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi,
-						       mode, is_hdmi2_sink);
+						       conn_state->connector,
+						       adjusted_mode);
 	if (ret < 0) {
 		DRM_ERROR("couldn't fill AVI infoframe\n");
 		return;
@@ -488,11 +487,12 @@ void lspcon_set_infoframes(struct intel_encoder *encoder,
 		frame.avi.colorspace = HDMI_COLORSPACE_RGB;
 	}
 
-	drm_hdmi_avi_infoframe_quant_range(&frame.avi, mode,
+	drm_hdmi_avi_infoframe_quant_range(&frame.avi,
+					   conn_state->connector,
+					   adjusted_mode,
 					   crtc_state->limited_color_range ?
 					   HDMI_QUANTIZATION_RANGE_LIMITED :
-					   HDMI_QUANTIZATION_RANGE_FULL,
-					   false, is_hdmi2_sink);
+					   HDMI_QUANTIZATION_RANGE_FULL);
 
 	ret = hdmi_infoframe_pack(&frame, buf, sizeof(buf));
 	if (ret < 0) {
diff --git a/drivers/gpu/drm/i915/intel_lvds.c b/drivers/gpu/drm/i915/intel_lvds.c
index e6c5d985ea0a..b4aa49768e90 100644
--- a/drivers/gpu/drm/i915/intel_lvds.c
+++ b/drivers/gpu/drm/i915/intel_lvds.c
@@ -32,7 +32,6 @@
 #include <linux/i2c.h>
 #include <linux/slab.h>
 #include <linux/vga_switcheroo.h>
-#include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
 #include <drm/drm_edid.h>
@@ -95,15 +94,17 @@ static bool intel_lvds_get_hw_state(struct intel_encoder *encoder,
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	struct intel_lvds_encoder *lvds_encoder = to_lvds_encoder(&encoder->base);
+	intel_wakeref_t wakeref;
 	bool ret;
 
-	if (!intel_display_power_get_if_enabled(dev_priv,
-						encoder->power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     encoder->power_domain);
+	if (!wakeref)
 		return false;
 
 	ret = intel_lvds_port_enabled(dev_priv, lvds_encoder->reg, pipe);
 
-	intel_display_power_put(dev_priv, encoder->power_domain);
+	intel_display_power_put(dev_priv, encoder->power_domain, wakeref);
 
 	return ret;
 }
@@ -279,7 +280,7 @@ static void intel_pre_enable_lvds(struct intel_encoder *encoder,
 	 * special lvds dither control bit on pch-split platforms, dithering is
 	 * only controlled through the PIPECONF reg.
 	 */
-	if (IS_GEN4(dev_priv)) {
+	if (IS_GEN(dev_priv, 4)) {
 		/*
 		 * Bspec wording suggests that LVDS port dithering only exists
 		 * for 18bpp panels.
@@ -379,9 +380,9 @@ intel_lvds_mode_valid(struct drm_connector *connector,
 	return MODE_OK;
 }
 
-static bool intel_lvds_compute_config(struct intel_encoder *intel_encoder,
-				      struct intel_crtc_state *pipe_config,
-				      struct drm_connector_state *conn_state)
+static int intel_lvds_compute_config(struct intel_encoder *intel_encoder,
+				     struct intel_crtc_state *pipe_config,
+				     struct drm_connector_state *conn_state)
 {
 	struct drm_i915_private *dev_priv = to_i915(intel_encoder->base.dev);
 	struct intel_lvds_encoder *lvds_encoder =
@@ -395,7 +396,7 @@ static bool intel_lvds_compute_config(struct intel_encoder *intel_encoder,
 	/* Should never happen!! */
 	if (INTEL_GEN(dev_priv) < 4 && intel_crtc->pipe == 0) {
 		DRM_ERROR("Can't support LVDS on pipe A\n");
-		return false;
+		return -EINVAL;
 	}
 
 	if (lvds_encoder->a3_power == LVDS_A3_POWER_UP)
@@ -421,7 +422,7 @@ static bool intel_lvds_compute_config(struct intel_encoder *intel_encoder,
 			       adjusted_mode);
 
 	if (adjusted_mode->flags & DRM_MODE_FLAG_DBLSCAN)
-		return false;
+		return -EINVAL;
 
 	if (HAS_PCH_SPLIT(dev_priv)) {
 		pipe_config->has_pch_encoder = true;
@@ -440,7 +441,7 @@ static bool intel_lvds_compute_config(struct intel_encoder *intel_encoder,
 	 * user's requested refresh rate.
 	 */
 
-	return true;
+	return 0;
 }
 
 static enum drm_connector_status
@@ -797,26 +798,6 @@ static bool compute_is_dual_link_lvds(struct intel_lvds_encoder *lvds_encoder)
 	return (val & LVDS_CLKB_POWER_MASK) == LVDS_CLKB_POWER_UP;
 }
 
-static bool intel_lvds_supported(struct drm_i915_private *dev_priv)
-{
-	/*
-	 * With the introduction of the PCH we gained a dedicated
-	 * LVDS presence pin, use it.
-	 */
-	if (HAS_PCH_IBX(dev_priv) || HAS_PCH_CPT(dev_priv))
-		return true;
-
-	/*
-	 * Otherwise LVDS was only attached to mobile products,
-	 * except for the inglorious 830gm
-	 */
-	if (INTEL_GEN(dev_priv) <= 4 &&
-	    IS_MOBILE(dev_priv) && !IS_I830(dev_priv))
-		return true;
-
-	return false;
-}
-
 /**
  * intel_lvds_init - setup LVDS connectors on this device
  * @dev_priv: i915 device
@@ -841,9 +822,6 @@ void intel_lvds_init(struct drm_i915_private *dev_priv)
 	u8 pin;
 	u32 allowed_scalers;
 
-	if (!intel_lvds_supported(dev_priv))
-		return;
-
 	/* Skip init on machines we know falsely report LVDS */
 	if (dmi_check_system(intel_no_lvds)) {
 		WARN(!dev_priv->vbt.int_lvds_support,
@@ -909,6 +887,7 @@ void intel_lvds_init(struct drm_i915_private *dev_priv)
 	}
 	intel_encoder->get_hw_state = intel_lvds_get_hw_state;
 	intel_encoder->get_config = intel_lvds_get_config;
+	intel_encoder->update_pipe = intel_panel_update_backlight;
 	intel_connector->get_hw_state = intel_connector_get_hw_state;
 
 	intel_connector_attach_encoder(intel_connector, intel_encoder);
@@ -919,7 +898,7 @@ void intel_lvds_init(struct drm_i915_private *dev_priv)
 	intel_encoder->cloneable = 0;
 	if (HAS_PCH_SPLIT(dev_priv))
 		intel_encoder->crtc_mask = (1 << 0) | (1 << 1) | (1 << 2);
-	else if (IS_GEN4(dev_priv))
+	else if (IS_GEN(dev_priv, 4))
 		intel_encoder->crtc_mask = (1 << 0) | (1 << 1);
 	else
 		intel_encoder->crtc_mask = (1 << 1);
diff --git a/drivers/gpu/drm/i915/intel_mocs.c b/drivers/gpu/drm/i915/intel_mocs.c
index 77e9871a8c9a..331e7a678fb7 100644
--- a/drivers/gpu/drm/i915/intel_mocs.c
+++ b/drivers/gpu/drm/i915/intel_mocs.c
@@ -28,48 +28,60 @@
 struct drm_i915_mocs_entry {
 	u32 control_value;
 	u16 l3cc_value;
+	u16 used;
 };
 
 struct drm_i915_mocs_table {
-	u32 size;
+	unsigned int size;
+	unsigned int n_entries;
 	const struct drm_i915_mocs_entry *table;
 };
 
 /* Defines for the tables (XXX_MOCS_0 - XXX_MOCS_63) */
-#define LE_CACHEABILITY(value)	((value) << 0)
-#define LE_TGT_CACHE(value)	((value) << 2)
+#define _LE_CACHEABILITY(value)	((value) << 0)
+#define _LE_TGT_CACHE(value)	((value) << 2)
 #define LE_LRUM(value)		((value) << 4)
 #define LE_AOM(value)		((value) << 6)
 #define LE_RSC(value)		((value) << 7)
 #define LE_SCC(value)		((value) << 8)
 #define LE_PFM(value)		((value) << 11)
 #define LE_SCF(value)		((value) << 14)
+#define LE_COS(value)		((value) << 15)
+#define LE_SSE(value)		((value) << 17)
 
 /* Defines for the tables (LNCFMOCS0 - LNCFMOCS31) - two entries per word */
 #define L3_ESC(value)		((value) << 0)
 #define L3_SCC(value)		((value) << 1)
-#define L3_CACHEABILITY(value)	((value) << 4)
+#define _L3_CACHEABILITY(value)	((value) << 4)
 
 /* Helper defines */
 #define GEN9_NUM_MOCS_ENTRIES	62  /* 62 out of 64 - 63 & 64 are reserved. */
+#define GEN11_NUM_MOCS_ENTRIES	64  /* 63-64 are reserved, but configured. */
 
 /* (e)LLC caching options */
-#define LE_PAGETABLE		0
-#define LE_UC			1
-#define LE_WT			2
-#define LE_WB			3
-
-/* L3 caching options */
-#define L3_DIRECT		0
-#define L3_UC			1
-#define L3_RESERVED		2
-#define L3_WB			3
+#define LE_0_PAGETABLE		_LE_CACHEABILITY(0)
+#define LE_1_UC			_LE_CACHEABILITY(1)
+#define LE_2_WT			_LE_CACHEABILITY(2)
+#define LE_3_WB			_LE_CACHEABILITY(3)
 
 /* Target cache */
-#define LE_TC_PAGETABLE		0
-#define LE_TC_LLC		1
-#define LE_TC_LLC_ELLC		2
-#define LE_TC_LLC_ELLC_ALT	3
+#define LE_TC_0_PAGETABLE	_LE_TGT_CACHE(0)
+#define LE_TC_1_LLC		_LE_TGT_CACHE(1)
+#define LE_TC_2_LLC_ELLC	_LE_TGT_CACHE(2)
+#define LE_TC_3_LLC_ELLC_ALT	_LE_TGT_CACHE(3)
+
+/* L3 caching options */
+#define L3_0_DIRECT		_L3_CACHEABILITY(0)
+#define L3_1_UC			_L3_CACHEABILITY(1)
+#define L3_2_RESERVED		_L3_CACHEABILITY(2)
+#define L3_3_WB			_L3_CACHEABILITY(3)
+
+#define MOCS_ENTRY(__idx, __control_value, __l3cc_value) \
+	[__idx] = { \
+		.control_value = __control_value, \
+		.l3cc_value = __l3cc_value, \
+		.used = 1, \
+	}
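
MOCS_ENTRY() leans on C99 designated array initializers: entries can be listed sparsely by index, gaps default to zero, and the explicit .used flag makes those gaps detectable. A standalone miniature of the same pattern:

#include <assert.h>
#include <stdint.h>

struct mocs_entry {
	uint32_t control_value;
	uint16_t l3cc_value;
	uint16_t used;
};

#define MOCS_ENTRY(idx, control, l3cc) \
	[idx] = { .control_value = (control), .l3cc_value = (l3cc), .used = 1 }

static const struct mocs_entry table[64] = {
	MOCS_ENTRY(0, 0x9, 0x10),
	MOCS_ENTRY(2, 0x3b, 0x30),	/* index 1 deliberately left out */
};

int main(void)
{
	assert(table[0].used && table[2].used);
	assert(!table[1].used);		/* gaps read back as zeroed */
	return 0;
}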
 
 /*
  * MOCS tables
@@ -80,85 +92,147 @@ struct drm_i915_mocs_table {
  * LNCFCMOCS0 - LNCFCMOCS32 registers.
  *
  * These tables are intended to be kept reasonably consistent across
- * platforms. However some of the fields are not applicable to all of
- * them.
+ * HW platforms, and for ICL+, to be identical across OSes. To achieve
+ * that, for Icelake and above, the list of entries is published as part
+ * of bspec.
  *
  * Entries not part of the following tables are undefined as far as
  * userspace is concerned and shouldn't be relied upon.  For the time
- * being they will be implicitly initialized to the strictest caching
- * configuration (uncached) to guarantee forwards compatibility with
- * userspace programs written against more recent kernels providing
- * additional MOCS entries.
+ * being they will be initialized to PTE.
  *
- * NOTE: These tables MUST start with being uncached and the length
- *       MUST be less than 63 as the last two registers are reserved
- *       by the hardware.  These tables are part of the kernel ABI and
- *       may only be updated incrementally by adding entries at the
- *       end.
+ * The last two entries are reserved by the hardware. For ICL+ they
+ * should be initialized according to bspec and never used, for older
+ * platforms they should never be written to.
+ *
+ * NOTE: These tables are part of bspec and defined as part of hardware
+ *       interface for ICL+. For older platforms, they are part of kernel
+ *       ABI. It is expected that, for specific hardware platform, existing
+ *       entries will remain constant and the table will only be updated by
+ *       adding new entries, filling unused positions.
  */
+#define GEN9_MOCS_ENTRIES \
+	MOCS_ENTRY(I915_MOCS_UNCACHED, \
+		   LE_1_UC | LE_TC_2_LLC_ELLC, \
+		   L3_1_UC), \
+	MOCS_ENTRY(I915_MOCS_PTE, \
+		   LE_0_PAGETABLE | LE_TC_2_LLC_ELLC | LE_LRUM(3), \
+		   L3_3_WB)
+
 static const struct drm_i915_mocs_entry skylake_mocs_table[] = {
-	[I915_MOCS_UNCACHED] = {
-	  /* 0x00000009 */
-	  .control_value = LE_CACHEABILITY(LE_UC) |
-			   LE_TGT_CACHE(LE_TC_LLC_ELLC) |
-			   LE_LRUM(0) | LE_AOM(0) | LE_RSC(0) | LE_SCC(0) |
-			   LE_PFM(0) | LE_SCF(0),
-
-	  /* 0x0010 */
-	  .l3cc_value =    L3_ESC(0) | L3_SCC(0) | L3_CACHEABILITY(L3_UC),
-	},
-	[I915_MOCS_PTE] = {
-	  /* 0x00000038 */
-	  .control_value = LE_CACHEABILITY(LE_PAGETABLE) |
-			   LE_TGT_CACHE(LE_TC_LLC_ELLC) |
-			   LE_LRUM(3) | LE_AOM(0) | LE_RSC(0) | LE_SCC(0) |
-			   LE_PFM(0) | LE_SCF(0),
-	  /* 0x0030 */
-	  .l3cc_value =    L3_ESC(0) | L3_SCC(0) | L3_CACHEABILITY(L3_WB),
-	},
-	[I915_MOCS_CACHED] = {
-	  /* 0x0000003b */
-	  .control_value = LE_CACHEABILITY(LE_WB) |
-			   LE_TGT_CACHE(LE_TC_LLC_ELLC) |
-			   LE_LRUM(3) | LE_AOM(0) | LE_RSC(0) | LE_SCC(0) |
-			   LE_PFM(0) | LE_SCF(0),
-	  /* 0x0030 */
-	  .l3cc_value =   L3_ESC(0) | L3_SCC(0) | L3_CACHEABILITY(L3_WB),
-	},
+	GEN9_MOCS_ENTRIES,
+	MOCS_ENTRY(I915_MOCS_CACHED,
+		   LE_3_WB | LE_TC_2_LLC_ELLC | LE_LRUM(3),
+		   L3_3_WB)
 };
 
 /* NOTE: the LE_TGT_CACHE is not used on Broxton */
 static const struct drm_i915_mocs_entry broxton_mocs_table[] = {
-	[I915_MOCS_UNCACHED] = {
-	  /* 0x00000009 */
-	  .control_value = LE_CACHEABILITY(LE_UC) |
-			   LE_TGT_CACHE(LE_TC_LLC_ELLC) |
-			   LE_LRUM(0) | LE_AOM(0) | LE_RSC(0) | LE_SCC(0) |
-			   LE_PFM(0) | LE_SCF(0),
-
-	  /* 0x0010 */
-	  .l3cc_value =    L3_ESC(0) | L3_SCC(0) | L3_CACHEABILITY(L3_UC),
-	},
-	[I915_MOCS_PTE] = {
-	  /* 0x00000038 */
-	  .control_value = LE_CACHEABILITY(LE_PAGETABLE) |
-			   LE_TGT_CACHE(LE_TC_LLC_ELLC) |
-			   LE_LRUM(3) | LE_AOM(0) | LE_RSC(0) | LE_SCC(0) |
-			   LE_PFM(0) | LE_SCF(0),
-
-	  /* 0x0030 */
-	  .l3cc_value =    L3_ESC(0) | L3_SCC(0) | L3_CACHEABILITY(L3_WB),
-	},
-	[I915_MOCS_CACHED] = {
-	  /* 0x00000039 */
-	  .control_value = LE_CACHEABILITY(LE_UC) |
-			   LE_TGT_CACHE(LE_TC_LLC_ELLC) |
-			   LE_LRUM(3) | LE_AOM(0) | LE_RSC(0) | LE_SCC(0) |
-			   LE_PFM(0) | LE_SCF(0),
-
-	  /* 0x0030 */
-	  .l3cc_value =    L3_ESC(0) | L3_SCC(0) | L3_CACHEABILITY(L3_WB),
-	},
+	GEN9_MOCS_ENTRIES,
+	MOCS_ENTRY(I915_MOCS_CACHED,
+		   LE_1_UC | LE_TC_2_LLC_ELLC | LE_LRUM(3),
+		   L3_3_WB)
+};
+
+#define GEN11_MOCS_ENTRIES \
+	/* Base - Uncached (Deprecated) */ \
+	MOCS_ENTRY(I915_MOCS_UNCACHED, \
+		   LE_1_UC | LE_TC_1_LLC, \
+		   L3_1_UC), \
+	/* Base - L3 + LeCC:PAT (Deprecated) */ \
+	MOCS_ENTRY(I915_MOCS_PTE, \
+		   LE_0_PAGETABLE | LE_TC_1_LLC, \
+		   L3_3_WB), \
+	/* Base - L3 + LLC */ \
+	MOCS_ENTRY(2, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(3), \
+		   L3_3_WB), \
+	/* Base - Uncached */ \
+	MOCS_ENTRY(3, \
+		   LE_1_UC | LE_TC_1_LLC, \
+		   L3_1_UC), \
+	/* Base - L3 */ \
+	MOCS_ENTRY(4, \
+		   LE_1_UC | LE_TC_1_LLC, \
+		   L3_3_WB), \
+	/* Base - LLC */ \
+	MOCS_ENTRY(5, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(3), \
+		   L3_1_UC), \
+	/* Age 0 - LLC */ \
+	MOCS_ENTRY(6, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(1), \
+		   L3_1_UC), \
+	/* Age 0 - L3 + LLC */ \
+	MOCS_ENTRY(7, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(1), \
+		   L3_3_WB), \
+	/* Age: Don't Chg. - LLC */ \
+	MOCS_ENTRY(8, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(2), \
+		   L3_1_UC), \
+	/* Age: Don't Chg. - L3 + LLC */ \
+	MOCS_ENTRY(9, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(2), \
+		   L3_3_WB), \
+	/* No AOM - LLC */ \
+	MOCS_ENTRY(10, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(3) | LE_AOM(1), \
+		   L3_1_UC), \
+	/* No AOM - L3 + LLC */ \
+	MOCS_ENTRY(11, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(3) | LE_AOM(1), \
+		   L3_3_WB), \
+	/* No AOM; Age 0 - LLC */ \
+	MOCS_ENTRY(12, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(1) | LE_AOM(1), \
+		   L3_1_UC), \
+	/* No AOM; Age 0 - L3 + LLC */ \
+	MOCS_ENTRY(13, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(1) | LE_AOM(1), \
+		   L3_3_WB), \
+	/* No AOM; Age:DC - LLC */ \
+	MOCS_ENTRY(14, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(2) | LE_AOM(1), \
+		   L3_1_UC), \
+	/* No AOM; Age:DC - L3 + LLC */ \
+	MOCS_ENTRY(15, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(2) | LE_AOM(1), \
+		   L3_3_WB), \
+	/* Self-Snoop - L3 + LLC */ \
+	MOCS_ENTRY(18, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(3) | LE_SSE(3), \
+		   L3_3_WB), \
+	/* Skip Caching - L3 + LLC(12.5%) */ \
+	MOCS_ENTRY(19, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(3) | LE_SCC(7), \
+		   L3_3_WB), \
+	/* Skip Caching - L3 + LLC(25%) */ \
+	MOCS_ENTRY(20, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(3) | LE_SCC(3), \
+		   L3_3_WB), \
+	/* Skip Caching - L3 + LLC(50%) */ \
+	MOCS_ENTRY(21, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(3) | LE_SCC(1), \
+		   L3_3_WB), \
+	/* Skip Caching - L3 + LLC(75%) */ \
+	MOCS_ENTRY(22, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(3) | LE_RSC(1) | LE_SCC(3), \
+		   L3_3_WB), \
+	/* Skip Caching - L3 + LLC(87.5%) */ \
+	MOCS_ENTRY(23, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(3) | LE_RSC(1) | LE_SCC(7), \
+		   L3_3_WB), \
+	/* HW Reserved - SW program but never use */ \
+	MOCS_ENTRY(62, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(3), \
+		   L3_1_UC), \
+	/* HW Reserved - SW program but never use */ \
+	MOCS_ENTRY(63, \
+		   LE_3_WB | LE_TC_1_LLC | LE_LRUM(3), \
+		   L3_1_UC)
+
+static const struct drm_i915_mocs_entry icelake_mocs_table[] = {
+	GEN11_MOCS_ENTRIES
 };
 
 /**
@@ -178,13 +252,19 @@ static bool get_mocs_settings(struct drm_i915_private *dev_priv,
 {
 	bool result = false;
 
-	if (IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv) ||
-	    IS_ICELAKE(dev_priv)) {
+	if (IS_ICELAKE(dev_priv)) {
+		table->size  = ARRAY_SIZE(icelake_mocs_table);
+		table->table = icelake_mocs_table;
+		table->n_entries = GEN11_NUM_MOCS_ENTRIES;
+		result = true;
+	} else if (IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv)) {
 		table->size  = ARRAY_SIZE(skylake_mocs_table);
+		table->n_entries = GEN9_NUM_MOCS_ENTRIES;
 		table->table = skylake_mocs_table;
 		result = true;
 	} else if (IS_GEN9_LP(dev_priv)) {
 		table->size  = ARRAY_SIZE(broxton_mocs_table);
+		table->n_entries = GEN9_NUM_MOCS_ENTRIES;
 		table->table = broxton_mocs_table;
 		result = true;
 	} else {
@@ -193,7 +273,7 @@ static bool get_mocs_settings(struct drm_i915_private *dev_priv,
 	}
 
 	/* WaDisableSkipCaching:skl,bxt,kbl,glk */
-	if (IS_GEN9(dev_priv)) {
+	if (IS_GEN(dev_priv, 9)) {
 		int i;
 
 		for (i = 0; i < table->size; i++)
@@ -226,6 +306,19 @@ static i915_reg_t mocs_register(enum intel_engine_id engine_id, int index)
 	}
 }
 
+/*
+ * Get control_value from the MOCS entry. If the entry is not used,
+ * the I915_MOCS_PTE entry's value is returned instead.
+ */
+static u32 get_entry_control(const struct drm_i915_mocs_table *table,
+			     unsigned int index)
+{
+	if (table->table[index].used)
+		return table->table[index].control_value;
+
+	return table->table[I915_MOCS_PTE].control_value;
+}
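
So any index whose entry was never defined silently resolves to the PTE entry's value. A standalone sketch of that fallback, reusing the toy table shape from above (I915_MOCS_PTE is index 1 in the uapi enum):

#include <assert.h>
#include <stdint.h>

struct mocs_entry {
	uint32_t control_value;
	uint16_t used;
};

#define MOCS_PTE 1	/* mirrors I915_MOCS_PTE */

static uint32_t get_control(const struct mocs_entry *t, unsigned int i)
{
	return t[i].used ? t[i].control_value : t[MOCS_PTE].control_value;
}

int main(void)
{
	struct mocs_entry t[4] = {
		[0]	   = { .control_value = 0x09, .used = 1 },
		[MOCS_PTE] = { .control_value = 0x38, .used = 1 },
	};

	assert(get_control(t, 0) == 0x09);	/* defined entry */
	assert(get_control(t, 3) == 0x38);	/* gap falls back to PTE */
	return 0;
}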
+
 /**
  * intel_mocs_init_engine() - emit the mocs control table
  * @engine:	The engine for whom to emit the registers.
@@ -238,27 +331,23 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine)
 	struct drm_i915_private *dev_priv = engine->i915;
 	struct drm_i915_mocs_table table;
 	unsigned int index;
+	u32 unused_value;
 
 	if (!get_mocs_settings(dev_priv, &table))
 		return;
 
-	GEM_BUG_ON(table.size > GEN9_NUM_MOCS_ENTRIES);
-
-	for (index = 0; index < table.size; index++)
-		I915_WRITE(mocs_register(engine->id, index),
-			   table.table[index].control_value);
-
-	/*
-	 * Ok, now set the unused entries to uncached. These entries
-	 * are officially undefined and no contract for the contents
-	 * and settings is given for these entries.
-	 *
-	 * Entry 0 in the table is uncached - so we are just writing
-	 * that value to all the used entries.
-	 */
-	for (; index < GEN9_NUM_MOCS_ENTRIES; index++)
-		I915_WRITE(mocs_register(engine->id, index),
-			   table.table[0].control_value);
+	/* Set unused values to PTE */
+	unused_value = table.table[I915_MOCS_PTE].control_value;
+
+	for (index = 0; index < table.size; index++) {
+		u32 value = get_entry_control(&table, index);
+
+		I915_WRITE(mocs_register(engine->id, index), value);
+	}
+
+	/* All remaining entries are also unused */
+	for (; index < table.n_entries; index++)
+		I915_WRITE(mocs_register(engine->id, index), unused_value);
 }
 
 /**
@@ -276,33 +365,32 @@ static int emit_mocs_control_table(struct i915_request *rq,
 {
 	enum intel_engine_id engine = rq->engine->id;
 	unsigned int index;
+	u32 unused_value;
 	u32 *cs;
 
-	if (WARN_ON(table->size > GEN9_NUM_MOCS_ENTRIES))
+	if (GEM_WARN_ON(table->size > table->n_entries))
 		return -ENODEV;
 
-	cs = intel_ring_begin(rq, 2 + 2 * GEN9_NUM_MOCS_ENTRIES);
+	/* Set unused values to PTE */
+	unused_value = table->table[I915_MOCS_PTE].control_value;
+
+	cs = intel_ring_begin(rq, 2 + 2 * table->n_entries);
 	if (IS_ERR(cs))
 		return PTR_ERR(cs);
 
-	*cs++ = MI_LOAD_REGISTER_IMM(GEN9_NUM_MOCS_ENTRIES);
+	*cs++ = MI_LOAD_REGISTER_IMM(table->n_entries);
 
 	for (index = 0; index < table->size; index++) {
+		u32 value = get_entry_control(table, index);
+
 		*cs++ = i915_mmio_reg_offset(mocs_register(engine, index));
-		*cs++ = table->table[index].control_value;
+		*cs++ = value;
 	}
 
-	/*
-	 * Ok, now set the unused entries to uncached. These entries
-	 * are officially undefined and no contract for the contents
-	 * and settings is given for these entries.
-	 *
-	 * Entry 0 in the table is uncached - so we are just writing
-	 * that value to all the used entries.
-	 */
-	for (; index < GEN9_NUM_MOCS_ENTRIES; index++) {
+	/* All remaining entries are also unused */
+	for (; index < table->n_entries; index++) {
 		*cs++ = i915_mmio_reg_offset(mocs_register(engine, index));
-		*cs++ = table->table[0].control_value;
+		*cs++ = unused_value;
 	}
 
 	*cs++ = MI_NOOP;
@@ -311,12 +399,24 @@ static int emit_mocs_control_table(struct i915_request *rq,
 	return 0;
 }
 
+/*
+ * Get l3cc_value from the MOCS entry. If the entry is not used,
+ * the I915_MOCS_PTE entry's value is returned instead.
+ */
+static u16 get_entry_l3cc(const struct drm_i915_mocs_table *table,
+			  unsigned int index)
+{
+	if (table->table[index].used)
+		return table->table[index].l3cc_value;
+
+	return table->table[I915_MOCS_PTE].l3cc_value;
+}
+
 static inline u32 l3cc_combine(const struct drm_i915_mocs_table *table,
 			       u16 low,
 			       u16 high)
 {
-	return table->table[low].l3cc_value |
-	       table->table[high].l3cc_value << 16;
+	return low | high << 16;
 }
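
With the table lookups hoisted into get_entry_l3cc(), l3cc_combine() becomes pure bit packing: each 32-bit LNCFCMOCS register carries two 16-bit l3cc values, low word first. The same packing standalone:

#include <assert.h>
#include <stdint.h>

static uint32_t l3cc_combine(uint16_t low, uint16_t high)
{
	return (uint32_t)low | ((uint32_t)high << 16);
}

int main(void)
{
	assert(l3cc_combine(0x0010, 0x0030) == 0x00300010);
	return 0;
}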
 
 /**
@@ -333,38 +433,43 @@ static inline u32 l3cc_combine(const struct drm_i915_mocs_table *table,
 static int emit_mocs_l3cc_table(struct i915_request *rq,
 				const struct drm_i915_mocs_table *table)
 {
+	u16 unused_value;
 	unsigned int i;
 	u32 *cs;
 
-	if (WARN_ON(table->size > GEN9_NUM_MOCS_ENTRIES))
+	if (GEM_WARN_ON(table->size > table->n_entries))
 		return -ENODEV;
 
-	cs = intel_ring_begin(rq, 2 + GEN9_NUM_MOCS_ENTRIES);
+	/* Set unused values to PTE */
+	unused_value = table->table[I915_MOCS_PTE].l3cc_value;
+
+	cs = intel_ring_begin(rq, 2 + table->n_entries);
 	if (IS_ERR(cs))
 		return PTR_ERR(cs);
 
-	*cs++ = MI_LOAD_REGISTER_IMM(GEN9_NUM_MOCS_ENTRIES / 2);
+	*cs++ = MI_LOAD_REGISTER_IMM(table->n_entries / 2);
+
+	for (i = 0; i < table->size / 2; i++) {
+		u16 low = get_entry_l3cc(table, 2 * i);
+		u16 high = get_entry_l3cc(table, 2 * i + 1);
 
-	for (i = 0; i < table->size/2; i++) {
 		*cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(i));
-		*cs++ = l3cc_combine(table, 2 * i, 2 * i + 1);
+		*cs++ = l3cc_combine(table, low, high);
 	}
 
+	/* Odd table size - 1 left over */
 	if (table->size & 0x01) {
-		/* Odd table size - 1 left over */
+		u16 low = get_entry_l3cc(table, 2 * i);
+
 		*cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(i));
-		*cs++ = l3cc_combine(table, 2 * i, 0);
+		*cs++ = l3cc_combine(table, low, unused_value);
 		i++;
 	}
 
-	/*
-	 * Now set the rest of the table to uncached - use entry 0 as
-	 * this will be uncached. Leave the last pair uninitialised as
-	 * they are reserved by the hardware.
-	 */
-	for (; i < GEN9_NUM_MOCS_ENTRIES / 2; i++) {
+	/* All remaining entries are also unused */
+	for (; i < table->n_entries / 2; i++) {
 		*cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(i));
-		*cs++ = l3cc_combine(table, 0, 0);
+		*cs++ = l3cc_combine(table, unused_value, unused_value);
 	}
 
 	*cs++ = MI_NOOP;
@@ -391,26 +496,35 @@ void intel_mocs_init_l3cc_table(struct drm_i915_private *dev_priv)
 {
 	struct drm_i915_mocs_table table;
 	unsigned int i;
+	u16 unused_value;
 
 	if (!get_mocs_settings(dev_priv, &table))
 		return;
 
-	for (i = 0; i < table.size/2; i++)
-		I915_WRITE(GEN9_LNCFCMOCS(i), l3cc_combine(&table, 2*i, 2*i+1));
+	/* Set unused values to PTE */
+	unused_value = table.table[I915_MOCS_PTE].l3cc_value;
+
+	for (i = 0; i < table.size / 2; i++) {
+		u16 low = get_entry_l3cc(&table, 2 * i);
+		u16 high = get_entry_l3cc(&table, 2 * i + 1);
+
+		I915_WRITE(GEN9_LNCFCMOCS(i),
+			   l3cc_combine(&table, low, high));
+	}
 
 	/* Odd table size - 1 left over */
 	if (table.size & 0x01) {
-		I915_WRITE(GEN9_LNCFCMOCS(i), l3cc_combine(&table, 2*i, 0));
+		u16 low = get_entry_l3cc(&table, 2 * i);
+
+		I915_WRITE(GEN9_LNCFCMOCS(i),
+			   l3cc_combine(&table, low, unused_value));
 		i++;
 	}
 
-	/*
-	 * Now set the rest of the table to uncached - use entry 0 as
-	 * this will be uncached. Leave the last pair as initialised as
-	 * they are reserved by the hardware.
-	 */
-	for (; i < (GEN9_NUM_MOCS_ENTRIES / 2); i++)
-		I915_WRITE(GEN9_LNCFCMOCS(i), l3cc_combine(&table, 0, 0));
+	/* All remaining entries are also unused */
+	for (; i < table.n_entries / 2; i++)
+		I915_WRITE(GEN9_LNCFCMOCS(i),
+			   l3cc_combine(&table, unused_value, unused_value));
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_mocs.h b/drivers/gpu/drm/i915/intel_mocs.h
index d89080d75b80..3d99d1271b2b 100644
--- a/drivers/gpu/drm/i915/intel_mocs.h
+++ b/drivers/gpu/drm/i915/intel_mocs.h
@@ -49,7 +49,6 @@
  * context handling keep the MOCS in step.
  */
 
-#include <drm/drmP.h>
 #include "i915_drv.h"
 
 int intel_rcs_context_init_mocs(struct i915_request *rq);
diff --git a/drivers/gpu/drm/i915/intel_opregion.c b/drivers/gpu/drm/i915/intel_opregion.c
index 3ac20153705a..5e00ee9270b5 100644
--- a/drivers/gpu/drm/i915/intel_opregion.c
+++ b/drivers/gpu/drm/i915/intel_opregion.c
@@ -30,7 +30,6 @@
 #include <linux/firmware.h>
 #include <acpi/video.h>
 
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
 
 #include "intel_opregion.h"
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 20ea7c99d13a..c0df1dbb0069 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -25,8 +25,9 @@
  *
  * Derived from Xorg ddx, xf86-video-intel, src/i830_video.c
  */
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
+#include <drm/drm_fourcc.h>
+
 #include "i915_drv.h"
 #include "i915_reg.h"
 #include "intel_drv.h"
@@ -185,7 +186,7 @@ struct intel_overlay {
 	struct overlay_registers __iomem *regs;
 	u32 flip_addr;
 	/* flip handling */
-	struct i915_gem_active last_flip;
+	struct i915_active_request last_flip;
 };
 
 static void i830_overlay_clock_gating(struct drm_i915_private *dev_priv,
@@ -213,23 +214,23 @@ static void i830_overlay_clock_gating(struct drm_i915_private *dev_priv,
 
 static void intel_overlay_submit_request(struct intel_overlay *overlay,
 					 struct i915_request *rq,
-					 i915_gem_retire_fn retire)
+					 i915_active_retire_fn retire)
 {
-	GEM_BUG_ON(i915_gem_active_peek(&overlay->last_flip,
-					&overlay->i915->drm.struct_mutex));
-	i915_gem_active_set_retire_fn(&overlay->last_flip, retire,
-				      &overlay->i915->drm.struct_mutex);
-	i915_gem_active_set(&overlay->last_flip, rq);
+	GEM_BUG_ON(i915_active_request_peek(&overlay->last_flip,
+					    &overlay->i915->drm.struct_mutex));
+	i915_active_request_set_retire_fn(&overlay->last_flip, retire,
+					  &overlay->i915->drm.struct_mutex);
+	__i915_active_request_set(&overlay->last_flip, rq);
 	i915_request_add(rq);
 }
 
 static int intel_overlay_do_wait_request(struct intel_overlay *overlay,
 					 struct i915_request *rq,
-					 i915_gem_retire_fn retire)
+					 i915_active_retire_fn retire)
 {
 	intel_overlay_submit_request(overlay, rq, retire);
-	return i915_gem_active_retire(&overlay->last_flip,
-				      &overlay->i915->drm.struct_mutex);
+	return i915_active_request_retire(&overlay->last_flip,
+					  &overlay->i915->drm.struct_mutex);
 }
 
 static struct i915_request *alloc_request(struct intel_overlay *overlay)
@@ -350,8 +351,9 @@ static void intel_overlay_release_old_vma(struct intel_overlay *overlay)
 	i915_vma_put(vma);
 }
 
-static void intel_overlay_release_old_vid_tail(struct i915_gem_active *active,
-					       struct i915_request *rq)
+static void
+intel_overlay_release_old_vid_tail(struct i915_active_request *active,
+				   struct i915_request *rq)
 {
 	struct intel_overlay *overlay =
 		container_of(active, typeof(*overlay), last_flip);
@@ -359,7 +361,7 @@ static void intel_overlay_release_old_vid_tail(struct i915_gem_active *active,
 	intel_overlay_release_old_vma(overlay);
 }
 
-static void intel_overlay_off_tail(struct i915_gem_active *active,
+static void intel_overlay_off_tail(struct i915_active_request *active,
 				   struct i915_request *rq)
 {
 	struct intel_overlay *overlay =
@@ -422,8 +424,8 @@ static int intel_overlay_off(struct intel_overlay *overlay)
 * We have to be careful not to repeat work forever and make forward progress. */
 static int intel_overlay_recover_from_interrupt(struct intel_overlay *overlay)
 {
-	return i915_gem_active_retire(&overlay->last_flip,
-				      &overlay->i915->drm.struct_mutex);
+	return i915_active_request_retire(&overlay->last_flip,
+					  &overlay->i915->drm.struct_mutex);
 }
 
 /* Wait for pending overlay flip and release old frame.
@@ -479,8 +481,6 @@ void intel_overlay_reset(struct drm_i915_private *dev_priv)
 	if (!overlay)
 		return;
 
-	intel_overlay_release_old_vid(overlay);
-
 	overlay->old_xscale = 0;
 	overlay->old_yscale = 0;
 	overlay->crtc = NULL;
@@ -541,7 +541,7 @@ static u32 calc_swidthsw(struct drm_i915_private *dev_priv, u32 offset, u32 widt
 {
 	u32 sw;
 
-	if (IS_GEN2(dev_priv))
+	if (IS_GEN(dev_priv, 2))
 		sw = ALIGN((offset & 31) + width, 32);
 	else
 		sw = ALIGN((offset & 63) + width, 64);
@@ -778,7 +778,7 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
 		u32 oconfig;
 
 		oconfig = OCONF_CC_OUT_8BIT;
-		if (IS_GEN4(dev_priv))
+		if (IS_GEN(dev_priv, 4))
 			oconfig |= OCONF_CSC_MODE_BT709;
 		oconfig |= pipe == 0 ?
 			OCONF_PIPE_A : OCONF_PIPE_B;
@@ -1012,7 +1012,7 @@ static int check_overlay_src(struct drm_i915_private *dev_priv,
 
 	if (rec->stride_Y & stride_mask || rec->stride_UV & stride_mask)
 		return -EINVAL;
-	if (IS_GEN4(dev_priv) && rec->stride_Y < 512)
+	if (IS_GEN(dev_priv, 4) && rec->stride_Y < 512)
 		return -EINVAL;
 
 	tmp = (rec->flags & I915_OVERLAY_TYPE_MASK) == I915_OVERLAY_YUV_PLANAR ?
@@ -1246,7 +1246,7 @@ int intel_overlay_attrs_ioctl(struct drm_device *dev, void *data,
 		attrs->contrast   = overlay->contrast;
 		attrs->saturation = overlay->saturation;
 
-		if (!IS_GEN2(dev_priv)) {
+		if (!IS_GEN(dev_priv, 2)) {
 			attrs->gamma0 = I915_READ(OGAMC0);
 			attrs->gamma1 = I915_READ(OGAMC1);
 			attrs->gamma2 = I915_READ(OGAMC2);
@@ -1270,7 +1270,7 @@ int intel_overlay_attrs_ioctl(struct drm_device *dev, void *data,
 		update_reg_attrs(overlay, overlay->regs);
 
 		if (attrs->flags & I915_OVERLAY_UPDATE_GAMMA) {
-			if (IS_GEN2(dev_priv))
+			if (IS_GEN(dev_priv, 2))
 				goto out_unlock;
 
 			if (overlay->active) {
@@ -1358,7 +1358,7 @@ void intel_overlay_setup(struct drm_i915_private *dev_priv)
 	overlay->contrast = 75;
 	overlay->saturation = 146;
 
-	init_request_active(&overlay->last_flip, NULL);
+	INIT_ACTIVE_REQUEST(&overlay->last_flip);
 
 	mutex_lock(&dev_priv->drm.struct_mutex);
 
diff --git a/drivers/gpu/drm/i915/intel_panel.c b/drivers/gpu/drm/i915/intel_panel.c
index e6cd7b55c018..beca98d2b035 100644
--- a/drivers/gpu/drm/i915/intel_panel.c
+++ b/drivers/gpu/drm/i915/intel_panel.c
@@ -563,7 +563,7 @@ static void i9xx_set_backlight(const struct drm_connector_state *conn_state, u32
 		pci_write_config_byte(dev_priv->drm.pdev, LBPC, lbpc);
 	}
 
-	if (IS_GEN4(dev_priv)) {
+	if (IS_GEN(dev_priv, 4)) {
 		mask = BACKLIGHT_DUTY_CYCLE_MASK;
 	} else {
 		level <<= 1;
@@ -929,7 +929,7 @@ static void i9xx_enable_backlight(const struct intel_crtc_state *crtc_state,
 	 * 855gm only, but checking for gen2 is safe, as 855gm is the only gen2
 	 * that has backlight.
 	 */
-	if (IS_GEN2(dev_priv))
+	if (IS_GEN(dev_priv, 2))
 		I915_WRITE(BLC_HIST_CTL, BLM_HISTOGRAM_ENABLE);
 }
 
@@ -1087,20 +1087,11 @@ static void pwm_enable_backlight(const struct intel_crtc_state *crtc_state,
 	intel_panel_actually_set_backlight(conn_state, panel->backlight.level);
 }
 
-void intel_panel_enable_backlight(const struct intel_crtc_state *crtc_state,
-				  const struct drm_connector_state *conn_state)
+static void __intel_panel_enable_backlight(const struct intel_crtc_state *crtc_state,
+					   const struct drm_connector_state *conn_state)
 {
 	struct intel_connector *connector = to_intel_connector(conn_state->connector);
-	struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
 	struct intel_panel *panel = &connector->panel;
-	enum pipe pipe = to_intel_crtc(crtc_state->base.crtc)->pipe;
-
-	if (!panel->backlight.present)
-		return;
-
-	DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe));
-
-	mutex_lock(&dev_priv->backlight_lock);
 
 	WARN_ON(panel->backlight.max == 0);
 
@@ -1117,6 +1108,24 @@ void intel_panel_enable_backlight(const struct intel_crtc_state *crtc_state,
 	panel->backlight.enabled = true;
 	if (panel->backlight.device)
 		panel->backlight.device->props.power = FB_BLANK_UNBLANK;
+}
+
+void intel_panel_enable_backlight(const struct intel_crtc_state *crtc_state,
+				  const struct drm_connector_state *conn_state)
+{
+	struct intel_connector *connector = to_intel_connector(conn_state->connector);
+	struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
+	struct intel_panel *panel = &connector->panel;
+	enum pipe pipe = to_intel_crtc(crtc_state->base.crtc)->pipe;
+
+	if (!panel->backlight.present)
+		return;
+
+	DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe));
+
+	mutex_lock(&dev_priv->backlight_lock);
+
+	__intel_panel_enable_backlight(crtc_state, conn_state);
 
 	mutex_unlock(&dev_priv->backlight_lock);
 }
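
The refactor above is the classic kernel locking split: a double-underscore worker that assumes the lock is already held, plus a thin public wrapper that acquires it. That is what lets the new intel_panel_update_backlight() later in this patch reuse the worker under the same lock. A minimal userspace sketch of the pattern (hypothetical names, pthreads standing in for the kernel mutex):

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t backlight_lock = PTHREAD_MUTEX_INITIALIZER;
static int backlight_level;

/* Worker: caller must hold backlight_lock. */
static void __enable_backlight(int level)
{
	backlight_level = level;
	printf("backlight enabled at %d\n", level);
}

/* Public entry point: take the lock, then delegate. */
static void enable_backlight(int level)
{
	pthread_mutex_lock(&backlight_lock);
	__enable_backlight(level);
	pthread_mutex_unlock(&backlight_lock);
}

/* A second entry point reuses the worker without duplicating the
 * locking -- the same shape as intel_panel_update_backlight(). */
static void update_backlight(int level)
{
	pthread_mutex_lock(&backlight_lock);
	if (backlight_level == 0)	/* only enable if not already on */
		__enable_backlight(level);
	pthread_mutex_unlock(&backlight_lock);
}

int main(void)
{
	enable_backlight(100);
	update_backlight(50);
	return 0;
}
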
@@ -1203,17 +1212,20 @@ static int intel_backlight_device_get_brightness(struct backlight_device *bd)
 	struct intel_connector *connector = bl_get_data(bd);
 	struct drm_device *dev = connector->base.dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
-	u32 hw_level;
-	int ret;
+	intel_wakeref_t wakeref;
+	int ret = 0;
 
-	intel_runtime_pm_get(dev_priv);
-	drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
+	with_intel_runtime_pm(dev_priv, wakeref) {
+		u32 hw_level;
 
-	hw_level = intel_panel_get_backlight(connector);
-	ret = scale_hw_to_user(connector, hw_level, bd->props.max_brightness);
+		drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
 
-	drm_modeset_unlock(&dev->mode_config.connection_mutex);
-	intel_runtime_pm_put(dev_priv);
+		hw_level = intel_panel_get_backlight(connector);
+		ret = scale_hw_to_user(connector,
+				       hw_level, bd->props.max_brightness);
+
+		drm_modeset_unlock(&dev->mode_config.connection_mutex);
+	}
 
 	return ret;
 }
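
The hunk above replaces a manual get/put pair with with_intel_runtime_pm(), a scoped construct that ties the release to the end of the block so an early exit inside the block cannot leak the reference. A generic sketch of how such a macro can be built from a for loop (assumed names; a simplified shape, not the driver's exact definition):

#include <stdio.h>

typedef unsigned long wakeref_t;

static wakeref_t resource_get(void)
{
	puts("get");
	return 1;	/* non-zero cookie */
}

static void resource_put(wakeref_t wf)
{
	(void)wf;
	puts("put");
}

/*
 * Run the body exactly once, releasing the reference in the loop's
 * step expression. A bare `break` inside the body would skip the
 * release, so exits should fall through the end of the block.
 */
#define with_resource(wf) \
	for ((wf) = resource_get(); (wf); resource_put(wf), (wf) = 0)

int main(void)
{
	wakeref_t wf;

	with_resource(wf) {
		printf("doing work with wakeref %lu\n", wf);
	}
	return 0;
}
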
@@ -1484,8 +1496,8 @@ static int lpt_setup_backlight(struct intel_connector *connector, enum pipe unus
 {
 	struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
 	struct intel_panel *panel = &connector->panel;
-	u32 pch_ctl1, pch_ctl2, val;
-	bool alt;
+	u32 cpu_ctl2, pch_ctl1, pch_ctl2, val;
+	bool alt, cpu_mode;
 
 	if (HAS_PCH_LPT(dev_priv))
 		alt = I915_READ(SOUTH_CHICKEN2) & LPT_PWM_GRANULARITY;
@@ -1499,6 +1511,8 @@ static int lpt_setup_backlight(struct intel_connector *connector, enum pipe unus
 	pch_ctl2 = I915_READ(BLC_PWM_PCH_CTL2);
 	panel->backlight.max = pch_ctl2 >> 16;
 
+	cpu_ctl2 = I915_READ(BLC_PWM_CPU_CTL2);
+
 	if (!panel->backlight.max)
 		panel->backlight.max = get_backlight_max_vbt(connector);
 
@@ -1507,12 +1521,28 @@ static int lpt_setup_backlight(struct intel_connector *connector, enum pipe unus
 
 	panel->backlight.min = get_backlight_min_vbt(connector);
 
-	val = lpt_get_backlight(connector);
+	panel->backlight.enabled = pch_ctl1 & BLM_PCH_PWM_ENABLE;
+
+	cpu_mode = panel->backlight.enabled && HAS_PCH_LPT(dev_priv) &&
+		   !(pch_ctl1 & BLM_PCH_OVERRIDE_ENABLE) &&
+		   (cpu_ctl2 & BLM_PWM_ENABLE);
+	if (cpu_mode)
+		val = pch_get_backlight(connector);
+	else
+		val = lpt_get_backlight(connector);
 	val = intel_panel_compute_brightness(connector, val);
 	panel->backlight.level = clamp(val, panel->backlight.min,
 				       panel->backlight.max);
 
-	panel->backlight.enabled = pch_ctl1 & BLM_PCH_PWM_ENABLE;
+	if (cpu_mode) {
+		DRM_DEBUG_KMS("CPU backlight register was enabled, switching to PCH override\n");
+
+		/* Write converted CPU PWM value to PCH override register */
+		lpt_set_backlight(connector->base.state, panel->backlight.level);
+		I915_WRITE(BLC_PWM_PCH_CTL1, pch_ctl1 | BLM_PCH_OVERRIDE_ENABLE);
+
+		I915_WRITE(BLC_PWM_CPU_CTL2, cpu_ctl2 & ~BLM_PWM_ENABLE);
+	}
 
 	return 0;
 }
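
The cpu_mode branch converts a BIOS-configured CPU PWM into the PCH override without a visible glitch, and the ordering is the point: program the override duty cycle first, then flip the override bit, and only then disable the CPU PWM. A toy model of that sequence with mock registers (bit positions and names are stand-ins, not the real layout):

#include <stdint.h>
#include <stdio.h>

#define PCH_OVERRIDE_ENABLE (1u << 30)
#define CPU_PWM_ENABLE      (1u << 31)

static uint32_t pch_ctl1;                   /* mock BLC_PWM_PCH_CTL1 */
static uint32_t cpu_ctl2 = CPU_PWM_ENABLE;  /* mock BLC_PWM_CPU_CTL2 */
static uint32_t pch_duty;                   /* mock override duty cycle */

static void handoff_to_pch(uint32_t level)
{
	/* 1) Program the PCH override duty cycle before enabling it. */
	pch_duty = level;

	/* 2) Route backlight control to the PCH override PWM. */
	pch_ctl1 |= PCH_OVERRIDE_ENABLE;

	/* 3) Only now is it safe to turn the CPU PWM off. */
	cpu_ctl2 &= ~CPU_PWM_ENABLE;
}

int main(void)
{
	handoff_to_pch(146);
	printf("pch_ctl1=%#x cpu_ctl2=%#x duty=%u\n",
	       pch_ctl1, cpu_ctl2, pch_duty);
	return 0;
}
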
@@ -1557,7 +1587,7 @@ static int i9xx_setup_backlight(struct intel_connector *connector, enum pipe unu
 
 	ctl = I915_READ(BLC_PWM_CTL);
 
-	if (IS_GEN2(dev_priv) || IS_I915GM(dev_priv) || IS_I945GM(dev_priv))
+	if (IS_GEN(dev_priv, 2) || IS_I915GM(dev_priv) || IS_I945GM(dev_priv))
 		panel->backlight.combination_mode = ctl & BLM_LEGACY_MODE;
 
 	if (IS_PINEVIEW(dev_priv))
@@ -1773,6 +1803,24 @@ static int pwm_setup_backlight(struct intel_connector *connector,
 	return 0;
 }
 
+void intel_panel_update_backlight(struct intel_encoder *encoder,
+				  const struct intel_crtc_state *crtc_state,
+				  const struct drm_connector_state *conn_state)
+{
+	struct intel_connector *connector = to_intel_connector(conn_state->connector);
+	struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
+	struct intel_panel *panel = &connector->panel;
+
+	if (!panel->backlight.present)
+		return;
+
+	mutex_lock(&dev_priv->backlight_lock);
+	if (!panel->backlight.enabled)
+		__intel_panel_enable_backlight(crtc_state, conn_state);
+
+	mutex_unlock(&dev_priv->backlight_lock);
+}
+
 int intel_panel_setup_backlight(struct drm_connector *connector, enum pipe pipe)
 {
 	struct drm_i915_private *dev_priv = to_i915(connector->dev);
@@ -1886,7 +1934,7 @@ intel_panel_init_backlight_funcs(struct intel_panel *panel)
 			panel->backlight.get = vlv_get_backlight;
 			panel->backlight.hz_to_pwm = vlv_hz_to_pwm;
 		}
-	} else if (IS_GEN4(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 4)) {
 		panel->backlight.setup = i965_setup_backlight;
 		panel->backlight.enable = i965_enable_backlight;
 		panel->backlight.disable = i965_disable_backlight;
diff --git a/drivers/gpu/drm/i915/intel_pipe_crc.c b/drivers/gpu/drm/i915/intel_pipe_crc.c
index f3c9010e332a..a8554dc4f196 100644
--- a/drivers/gpu/drm/i915/intel_pipe_crc.c
+++ b/drivers/gpu/drm/i915/intel_pipe_crc.c
@@ -44,7 +44,7 @@ static const char * const pipe_crc_sources[] = {
 };
 
 static int i8xx_pipe_crc_ctl_reg(enum intel_pipe_crc_source *source,
-				 uint32_t *val)
+				 u32 *val)
 {
 	if (*source == INTEL_PIPE_CRC_SOURCE_AUTO)
 		*source = INTEL_PIPE_CRC_SOURCE_PIPE;
@@ -120,7 +120,7 @@ static int i9xx_pipe_crc_auto_source(struct drm_i915_private *dev_priv,
 static int vlv_pipe_crc_ctl_reg(struct drm_i915_private *dev_priv,
 				enum pipe pipe,
 				enum intel_pipe_crc_source *source,
-				uint32_t *val)
+				u32 *val)
 {
 	bool need_stable_symbols = false;
 
@@ -165,7 +165,7 @@ static int vlv_pipe_crc_ctl_reg(struct drm_i915_private *dev_priv,
 	 *   - DisplayPort scrambling: used for EMI reduction
 	 */
 	if (need_stable_symbols) {
-		uint32_t tmp = I915_READ(PORT_DFT2_G4X);
+		u32 tmp = I915_READ(PORT_DFT2_G4X);
 
 		tmp |= DC_BALANCE_RESET_VLV;
 		switch (pipe) {
@@ -190,7 +190,7 @@ static int vlv_pipe_crc_ctl_reg(struct drm_i915_private *dev_priv,
 static int i9xx_pipe_crc_ctl_reg(struct drm_i915_private *dev_priv,
 				 enum pipe pipe,
 				 enum intel_pipe_crc_source *source,
-				 uint32_t *val)
+				 u32 *val)
 {
 	bool need_stable_symbols = false;
 
@@ -244,7 +244,7 @@ static int i9xx_pipe_crc_ctl_reg(struct drm_i915_private *dev_priv,
 	 *   - DisplayPort scrambling: used for EMI reduction
 	 */
 	if (need_stable_symbols) {
-		uint32_t tmp = I915_READ(PORT_DFT2_G4X);
+		u32 tmp = I915_READ(PORT_DFT2_G4X);
 
 		WARN_ON(!IS_G4X(dev_priv));
 
@@ -265,7 +265,7 @@ static int i9xx_pipe_crc_ctl_reg(struct drm_i915_private *dev_priv,
 static void vlv_undo_pipe_scramble_reset(struct drm_i915_private *dev_priv,
 					 enum pipe pipe)
 {
-	uint32_t tmp = I915_READ(PORT_DFT2_G4X);
+	u32 tmp = I915_READ(PORT_DFT2_G4X);
 
 	switch (pipe) {
 	case PIPE_A:
@@ -289,7 +289,7 @@ static void vlv_undo_pipe_scramble_reset(struct drm_i915_private *dev_priv,
 static void g4x_undo_pipe_scramble_reset(struct drm_i915_private *dev_priv,
 					 enum pipe pipe)
 {
-	uint32_t tmp = I915_READ(PORT_DFT2_G4X);
+	u32 tmp = I915_READ(PORT_DFT2_G4X);
 
 	if (pipe == PIPE_A)
 		tmp &= ~PIPE_A_SCRAMBLE_RESET;
@@ -304,7 +304,7 @@ static void g4x_undo_pipe_scramble_reset(struct drm_i915_private *dev_priv,
 }
 
 static int ilk_pipe_crc_ctl_reg(enum intel_pipe_crc_source *source,
-				uint32_t *val)
+				u32 *val)
 {
 	if (*source == INTEL_PIPE_CRC_SOURCE_AUTO)
 		*source = INTEL_PIPE_CRC_SOURCE_PIPE;
@@ -392,7 +392,7 @@ unlock:
 static int ivb_pipe_crc_ctl_reg(struct drm_i915_private *dev_priv,
 				enum pipe pipe,
 				enum intel_pipe_crc_source *source,
-				uint32_t *val,
+				u32 *val,
 				bool set_wa)
 {
 	if (*source == INTEL_PIPE_CRC_SOURCE_AUTO)
@@ -427,13 +427,13 @@ static int get_new_crc_ctl_reg(struct drm_i915_private *dev_priv,
 			       enum intel_pipe_crc_source *source, u32 *val,
 			       bool set_wa)
 {
-	if (IS_GEN2(dev_priv))
+	if (IS_GEN(dev_priv, 2))
 		return i8xx_pipe_crc_ctl_reg(source, val);
 	else if (INTEL_GEN(dev_priv) < 5)
 		return i9xx_pipe_crc_ctl_reg(dev_priv, pipe, source, val);
 	else if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
 		return vlv_pipe_crc_ctl_reg(dev_priv, pipe, source, val);
-	else if (IS_GEN5(dev_priv) || IS_GEN6(dev_priv))
+	else if (IS_GEN_RANGE(dev_priv, 5, 6))
 		return ilk_pipe_crc_ctl_reg(source, val);
 	else
 		return ivb_pipe_crc_ctl_reg(dev_priv, pipe, source, val, set_wa);
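
The IS_GEN5 || IS_GEN6 chains collapse into IS_GEN_RANGE() because both forms can reduce to one mask test against a per-device generation bitmask. A standalone sketch of that idea (simplified and hypothetical; the driver keeps such a mask in its static device info):

#include <stdbool.h>
#include <stdio.h>

#define BIT(n)        (1u << (n))
#define GENMASK(h, l) (((~0u) >> (31 - (h))) & ~((1u << (l)) - 1))

struct dev_info {
	unsigned int gen_mask;	/* BIT(gen - 1) set for this device */
};

static bool is_gen(const struct dev_info *i, int gen)
{
	return i->gen_mask & BIT(gen - 1);
}

static bool is_gen_range(const struct dev_info *i, int since, int until)
{
	/* One AND covers the whole range -- no chained comparisons. */
	return i->gen_mask & GENMASK(until - 1, since - 1);
}

int main(void)
{
	struct dev_info ilk = { .gen_mask = BIT(5 - 1) }; /* a gen5 device */

	printf("gen5? %d  gen5-6? %d  gen7-8? %d\n",
	       is_gen(&ilk, 5),
	       is_gen_range(&ilk, 5, 6),
	       is_gen_range(&ilk, 7, 8));
	return 0;
}
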
@@ -544,13 +544,13 @@ static int
 intel_is_valid_crc_source(struct drm_i915_private *dev_priv,
 			  const enum intel_pipe_crc_source source)
 {
-	if (IS_GEN2(dev_priv))
+	if (IS_GEN(dev_priv, 2))
 		return i8xx_crc_source_valid(dev_priv, source);
 	else if (INTEL_GEN(dev_priv) < 5)
 		return i9xx_crc_source_valid(dev_priv, source);
 	else if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
 		return vlv_crc_source_valid(dev_priv, source);
-	else if (IS_GEN5(dev_priv) || IS_GEN6(dev_priv))
+	else if (IS_GEN_RANGE(dev_priv, 5, 6))
 		return ilk_crc_source_valid(dev_priv, source);
 	else
 		return ivb_crc_source_valid(dev_priv, source);
@@ -589,6 +589,7 @@ int intel_crtc_set_crc_source(struct drm_crtc *crtc, const char *source_name)
 	struct intel_pipe_crc *pipe_crc = &dev_priv->pipe_crc[crtc->index];
 	enum intel_display_power_domain power_domain;
 	enum intel_pipe_crc_source source;
+	intel_wakeref_t wakeref;
 	u32 val = 0; /* shut up gcc */
 	int ret = 0;
 
@@ -598,7 +599,8 @@ int intel_crtc_set_crc_source(struct drm_crtc *crtc, const char *source_name)
 	}
 
 	power_domain = POWER_DOMAIN_PIPE(crtc->index);
-	if (!intel_display_power_get_if_enabled(dev_priv, power_domain)) {
+	wakeref = intel_display_power_get_if_enabled(dev_priv, power_domain);
+	if (!wakeref) {
 		DRM_DEBUG_KMS("Trying to capture CRC while pipe is off\n");
 		return -EIO;
 	}
@@ -624,7 +626,7 @@ int intel_crtc_set_crc_source(struct drm_crtc *crtc, const char *source_name)
 	pipe_crc->skipped = 0;
 
 out:
-	intel_display_power_put(dev_priv, power_domain);
+	intel_display_power_put(dev_priv, power_domain, wakeref);
 
 	return ret;
 }
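
intel_display_power_get_if_enabled() now hands back an intel_wakeref_t cookie that must be passed to the matching put, so each acquisition is individually accounted for instead of disappearing into an anonymous refcount. A sketch of the shape of such an API (hypothetical names and bookkeeping):

#include <stdio.h>

typedef unsigned long wakeref_t;

static int power_refcount;
static unsigned long next_cookie = 1;

/* Returns a non-zero cookie on success, 0 if the domain is off. */
static wakeref_t power_get_if_enabled(int domain_enabled)
{
	if (!domain_enabled)
		return 0;
	power_refcount++;
	return next_cookie++;	/* identifies this acquisition */
}

static void power_put(wakeref_t wf)
{
	printf("releasing wakeref %lu\n", wf);
	power_refcount--;
}

int main(void)
{
	wakeref_t wf = power_get_if_enabled(1);

	if (!wf)
		return 1;	/* pipe off: same early-out as the CRC code */

	/* ... program the CRC source ... */
	power_put(wf);	/* the cookie proves which get this put pairs with */
	return 0;
}
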
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index a26b4eddda25..54307f1df6cf 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -26,13 +26,16 @@
  */
 
 #include <linux/cpufreq.h>
+#include <linux/module.h>
 #include <linux/pm_runtime.h>
+
+#include <drm/drm_atomic_helper.h>
+#include <drm/drm_fourcc.h>
 #include <drm/drm_plane_helper.h>
+
 #include "i915_drv.h"
 #include "intel_drv.h"
 #include "../../../platform/x86/intel_ips.h"
-#include <linux/module.h>
-#include <drm/drm_atomic_helper.h>
 
 /**
  * DOC: RC6
@@ -480,7 +483,7 @@ static void vlv_get_fifo_size(struct intel_crtc_state *crtc_state)
 	int sprite0_start, sprite1_start;
 
 	switch (pipe) {
-		uint32_t dsparb, dsparb2, dsparb3;
+		u32 dsparb, dsparb2, dsparb3;
 	case PIPE_A:
 		dsparb = I915_READ(DSPARB);
 		dsparb2 = I915_READ(DSPARB2);
@@ -513,7 +516,7 @@ static void vlv_get_fifo_size(struct intel_crtc_state *crtc_state)
 static int i9xx_get_fifo_size(struct drm_i915_private *dev_priv,
 			      enum i9xx_plane_id i9xx_plane)
 {
-	uint32_t dsparb = I915_READ(DSPARB);
+	u32 dsparb = I915_READ(DSPARB);
 	int size;
 
 	size = dsparb & 0x7f;
@@ -529,7 +532,7 @@ static int i9xx_get_fifo_size(struct drm_i915_private *dev_priv,
 static int i830_get_fifo_size(struct drm_i915_private *dev_priv,
 			      enum i9xx_plane_id i9xx_plane)
 {
-	uint32_t dsparb = I915_READ(DSPARB);
+	u32 dsparb = I915_READ(DSPARB);
 	int size;
 
 	size = dsparb & 0x1ff;
@@ -546,7 +549,7 @@ static int i830_get_fifo_size(struct drm_i915_private *dev_priv,
 static int i845_get_fifo_size(struct drm_i915_private *dev_priv,
 			      enum i9xx_plane_id i9xx_plane)
 {
-	uint32_t dsparb = I915_READ(DSPARB);
+	u32 dsparb = I915_READ(DSPARB);
 	int size;
 
 	size = dsparb & 0x7f;
@@ -667,9 +670,9 @@ static unsigned int intel_wm_method1(unsigned int pixel_rate,
 				     unsigned int cpp,
 				     unsigned int latency)
 {
-	uint64_t ret;
+	u64 ret;
 
-	ret = (uint64_t) pixel_rate * cpp * latency;
+	ret = (u64)pixel_rate * cpp * latency;
 	ret = DIV_ROUND_UP_ULL(ret, 10000);
 
 	return ret;
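
The (u64) cast in intel_wm_method1() is load-bearing: pixel_rate * cpp * latency can overflow 32 bits well before the divide by 10000 shrinks it again, so the first operand is widened and the whole product is computed in 64 bits. A small demonstration:

#include <stdint.h>
#include <stdio.h>

#define DIV_ROUND_UP_ULL(x, d) (((x) + (d) - 1) / (d))

int main(void)
{
	uint32_t pixel_rate = 600000;      /* kHz-scale pixel rate */
	uint32_t cpp = 4, latency = 3000;  /* latency in 0.1us units */

	/* Wrong: the product is computed in 32 bits and wraps. */
	uint32_t bad = pixel_rate * cpp * latency;

	/* Right: widen first, then multiply, then divide. */
	uint64_t good = DIV_ROUND_UP_ULL((uint64_t)pixel_rate * cpp * latency,
					 10000);

	printf("32-bit product: %u (wrapped)\n", bad);
	printf("64-bit result : %llu\n", (unsigned long long)good);
	return 0;
}
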
@@ -1089,9 +1092,9 @@ static int g4x_fbc_fifo_size(int level)
 	}
 }
 
-static uint16_t g4x_compute_wm(const struct intel_crtc_state *crtc_state,
-			       const struct intel_plane_state *plane_state,
-			       int level)
+static u16 g4x_compute_wm(const struct intel_crtc_state *crtc_state,
+			  const struct intel_plane_state *plane_state,
+			  int level)
 {
 	struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
@@ -1188,9 +1191,9 @@ static bool g4x_raw_fbc_wm_set(struct intel_crtc_state *crtc_state,
 	return dirty;
 }
 
-static uint32_t ilk_compute_fbc_wm(const struct intel_crtc_state *cstate,
-				   const struct intel_plane_state *pstate,
-				   uint32_t pri_val);
+static u32 ilk_compute_fbc_wm(const struct intel_crtc_state *cstate,
+			      const struct intel_plane_state *pstate,
+			      u32 pri_val);
 
 static bool g4x_raw_plane_wm_compute(struct intel_crtc_state *crtc_state,
 				     const struct intel_plane_state *plane_state)
@@ -1399,10 +1402,9 @@ static int g4x_compute_pipe_wm(struct intel_crtc_state *crtc_state)
 	return 0;
 }
 
-static int g4x_compute_intermediate_wm(struct drm_device *dev,
-				       struct intel_crtc *crtc,
-				       struct intel_crtc_state *new_crtc_state)
+static int g4x_compute_intermediate_wm(struct intel_crtc_state *new_crtc_state)
 {
+	struct intel_crtc *crtc = to_intel_crtc(new_crtc_state->base.crtc);
 	struct g4x_wm_state *intermediate = &new_crtc_state->wm.g4x.intermediate;
 	const struct g4x_wm_state *optimal = &new_crtc_state->wm.g4x.optimal;
 	struct intel_atomic_state *intel_state =
@@ -1599,9 +1601,9 @@ static void vlv_setup_wm_latency(struct drm_i915_private *dev_priv)
 	}
 }
 
-static uint16_t vlv_compute_wm_level(const struct intel_crtc_state *crtc_state,
-				     const struct intel_plane_state *plane_state,
-				     int level)
+static u16 vlv_compute_wm_level(const struct intel_crtc_state *crtc_state,
+				const struct intel_plane_state *plane_state,
+				int level)
 {
 	struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
@@ -1969,7 +1971,7 @@ static void vlv_atomic_update_fifo(struct intel_atomic_state *state,
 	spin_lock(&dev_priv->uncore.lock);
 
 	switch (crtc->pipe) {
-		uint32_t dsparb, dsparb2, dsparb3;
+		u32 dsparb, dsparb2, dsparb3;
 	case PIPE_A:
 		dsparb = I915_READ_FW(DSPARB);
 		dsparb2 = I915_READ_FW(DSPARB2);
@@ -2032,10 +2034,9 @@ static void vlv_atomic_update_fifo(struct intel_atomic_state *state,
 
 #undef VLV_FIFO
 
-static int vlv_compute_intermediate_wm(struct drm_device *dev,
-				       struct intel_crtc *crtc,
-				       struct intel_crtc_state *new_crtc_state)
+static int vlv_compute_intermediate_wm(struct intel_crtc_state *new_crtc_state)
 {
+	struct intel_crtc *crtc = to_intel_crtc(new_crtc_state->base.crtc);
 	struct vlv_wm_state *intermediate = &new_crtc_state->wm.vlv.intermediate;
 	const struct vlv_wm_state *optimal = &new_crtc_state->wm.vlv.optimal;
 	struct intel_atomic_state *intel_state =
@@ -2264,8 +2265,8 @@ static void i9xx_update_wm(struct intel_crtc *unused_crtc)
 {
 	struct drm_i915_private *dev_priv = to_i915(unused_crtc->base.dev);
 	const struct intel_watermark_params *wm_info;
-	uint32_t fwater_lo;
-	uint32_t fwater_hi;
+	u32 fwater_lo;
+	u32 fwater_hi;
 	int cwm, srwm = 1;
 	int fifo_size;
 	int planea_wm, planeb_wm;
@@ -2273,7 +2274,7 @@ static void i9xx_update_wm(struct intel_crtc *unused_crtc)
 
 	if (IS_I945GM(dev_priv))
 		wm_info = &i945_wm_info;
-	else if (!IS_GEN2(dev_priv))
+	else if (!IS_GEN(dev_priv, 2))
 		wm_info = &i915_wm_info;
 	else
 		wm_info = &i830_a_wm_info;
@@ -2287,7 +2288,7 @@ static void i9xx_update_wm(struct intel_crtc *unused_crtc)
 			crtc->base.primary->state->fb;
 		int cpp;
 
-		if (IS_GEN2(dev_priv))
+		if (IS_GEN(dev_priv, 2))
 			cpp = 4;
 		else
 			cpp = fb->format->cpp[0];
@@ -2302,7 +2303,7 @@ static void i9xx_update_wm(struct intel_crtc *unused_crtc)
 			planea_wm = wm_info->max_wm;
 	}
 
-	if (IS_GEN2(dev_priv))
+	if (IS_GEN(dev_priv, 2))
 		wm_info = &i830_bc_wm_info;
 
 	fifo_size = dev_priv->display.get_fifo_size(dev_priv, PLANE_B);
@@ -2314,7 +2315,7 @@ static void i9xx_update_wm(struct intel_crtc *unused_crtc)
 			crtc->base.primary->state->fb;
 		int cpp;
 
-		if (IS_GEN2(dev_priv))
+		if (IS_GEN(dev_priv, 2))
 			cpp = 4;
 		else
 			cpp = fb->format->cpp[0];
@@ -2408,7 +2409,7 @@ static void i845_update_wm(struct intel_crtc *unused_crtc)
 	struct drm_i915_private *dev_priv = to_i915(unused_crtc->base.dev);
 	struct intel_crtc *crtc;
 	const struct drm_display_mode *adjusted_mode;
-	uint32_t fwater_lo;
+	u32 fwater_lo;
 	int planea_wm;
 
 	crtc = single_enabled_crtc(dev_priv);
@@ -2457,8 +2458,7 @@ static unsigned int ilk_wm_method2(unsigned int pixel_rate,
 	return ret;
 }
 
-static uint32_t ilk_wm_fbc(uint32_t pri_val, uint32_t horiz_pixels,
-			   uint8_t cpp)
+static u32 ilk_wm_fbc(u32 pri_val, u32 horiz_pixels, u8 cpp)
 {
 	/*
 	 * Neither of these should be possible since this function shouldn't be
@@ -2475,22 +2475,21 @@ static uint32_t ilk_wm_fbc(uint32_t pri_val, uint32_t horiz_pixels,
 }
 
 struct ilk_wm_maximums {
-	uint16_t pri;
-	uint16_t spr;
-	uint16_t cur;
-	uint16_t fbc;
+	u16 pri;
+	u16 spr;
+	u16 cur;
+	u16 fbc;
 };
 
 /*
  * For both WM_PIPE and WM_LP.
  * mem_value must be in 0.1us units.
  */
-static uint32_t ilk_compute_pri_wm(const struct intel_crtc_state *cstate,
-				   const struct intel_plane_state *pstate,
-				   uint32_t mem_value,
-				   bool is_lp)
+static u32 ilk_compute_pri_wm(const struct intel_crtc_state *cstate,
+			      const struct intel_plane_state *pstate,
+			      u32 mem_value, bool is_lp)
 {
-	uint32_t method1, method2;
+	u32 method1, method2;
 	int cpp;
 
 	if (mem_value == 0)
@@ -2518,11 +2517,11 @@ static uint32_t ilk_compute_pri_wm(const struct intel_crtc_state *cstate,
  * For both WM_PIPE and WM_LP.
  * mem_value must be in 0.1us units.
  */
-static uint32_t ilk_compute_spr_wm(const struct intel_crtc_state *cstate,
-				   const struct intel_plane_state *pstate,
-				   uint32_t mem_value)
+static u32 ilk_compute_spr_wm(const struct intel_crtc_state *cstate,
+			      const struct intel_plane_state *pstate,
+			      u32 mem_value)
 {
-	uint32_t method1, method2;
+	u32 method1, method2;
 	int cpp;
 
 	if (mem_value == 0)
@@ -2545,9 +2544,9 @@ static uint32_t ilk_compute_spr_wm(const struct intel_crtc_state *cstate,
  * For both WM_PIPE and WM_LP.
  * mem_value must be in 0.1us units.
  */
-static uint32_t ilk_compute_cur_wm(const struct intel_crtc_state *cstate,
-				   const struct intel_plane_state *pstate,
-				   uint32_t mem_value)
+static u32 ilk_compute_cur_wm(const struct intel_crtc_state *cstate,
+			      const struct intel_plane_state *pstate,
+			      u32 mem_value)
 {
 	int cpp;
 
@@ -2565,9 +2564,9 @@ static uint32_t ilk_compute_cur_wm(const struct intel_crtc_state *cstate,
 }
 
 /* Only for WM_LP. */
-static uint32_t ilk_compute_fbc_wm(const struct intel_crtc_state *cstate,
-				   const struct intel_plane_state *pstate,
-				   uint32_t pri_val)
+static u32 ilk_compute_fbc_wm(const struct intel_crtc_state *cstate,
+			      const struct intel_plane_state *pstate,
+			      u32 pri_val)
 {
 	int cpp;
 
@@ -2626,13 +2625,12 @@ static unsigned int ilk_fbc_wm_reg_max(const struct drm_i915_private *dev_priv)
 }
 
 /* Calculate the maximum primary/sprite plane watermark */
-static unsigned int ilk_plane_wm_max(const struct drm_device *dev,
+static unsigned int ilk_plane_wm_max(const struct drm_i915_private *dev_priv,
 				     int level,
 				     const struct intel_wm_config *config,
 				     enum intel_ddb_partitioning ddb_partitioning,
 				     bool is_sprite)
 {
-	struct drm_i915_private *dev_priv = to_i915(dev);
 	unsigned int fifo_size = ilk_display_fifo_size(dev_priv);
 
 	/* if sprites aren't enabled, sprites get nothing */
@@ -2668,7 +2666,7 @@ static unsigned int ilk_plane_wm_max(const struct drm_device *dev,
 }
 
 /* Calculate the maximum cursor plane watermark */
-static unsigned int ilk_cursor_wm_max(const struct drm_device *dev,
+static unsigned int ilk_cursor_wm_max(const struct drm_i915_private *dev_priv,
 				      int level,
 				      const struct intel_wm_config *config)
 {
@@ -2677,19 +2675,19 @@ static unsigned int ilk_cursor_wm_max(const struct drm_device *dev,
 		return 64;
 
 	/* otherwise just report max that registers can hold */
-	return ilk_cursor_wm_reg_max(to_i915(dev), level);
+	return ilk_cursor_wm_reg_max(dev_priv, level);
 }
 
-static void ilk_compute_wm_maximums(const struct drm_device *dev,
+static void ilk_compute_wm_maximums(const struct drm_i915_private *dev_priv,
 				    int level,
 				    const struct intel_wm_config *config,
 				    enum intel_ddb_partitioning ddb_partitioning,
 				    struct ilk_wm_maximums *max)
 {
-	max->pri = ilk_plane_wm_max(dev, level, config, ddb_partitioning, false);
-	max->spr = ilk_plane_wm_max(dev, level, config, ddb_partitioning, true);
-	max->cur = ilk_cursor_wm_max(dev, level, config);
-	max->fbc = ilk_fbc_wm_reg_max(to_i915(dev));
+	max->pri = ilk_plane_wm_max(dev_priv, level, config, ddb_partitioning, false);
+	max->spr = ilk_plane_wm_max(dev_priv, level, config, ddb_partitioning, true);
+	max->cur = ilk_cursor_wm_max(dev_priv, level, config);
+	max->fbc = ilk_fbc_wm_reg_max(dev_priv);
 }
 
 static void ilk_compute_wm_reg_maximums(const struct drm_i915_private *dev_priv,
@@ -2734,9 +2732,9 @@ static bool ilk_validate_wm_level(int level,
 			DRM_DEBUG_KMS("Cursor WM%d too large %u (max %u)\n",
 				      level, result->cur_val, max->cur);
 
-		result->pri_val = min_t(uint32_t, result->pri_val, max->pri);
-		result->spr_val = min_t(uint32_t, result->spr_val, max->spr);
-		result->cur_val = min_t(uint32_t, result->cur_val, max->cur);
+		result->pri_val = min_t(u32, result->pri_val, max->pri);
+		result->spr_val = min_t(u32, result->spr_val, max->spr);
+		result->cur_val = min_t(u32, result->cur_val, max->cur);
 		result->enable = true;
 	}
 
@@ -2752,9 +2750,9 @@ static void ilk_compute_wm_level(const struct drm_i915_private *dev_priv,
 				 const struct intel_plane_state *curstate,
 				 struct intel_wm_level *result)
 {
-	uint16_t pri_latency = dev_priv->wm.pri_latency[level];
-	uint16_t spr_latency = dev_priv->wm.spr_latency[level];
-	uint16_t cur_latency = dev_priv->wm.cur_latency[level];
+	u16 pri_latency = dev_priv->wm.pri_latency[level];
+	u16 spr_latency = dev_priv->wm.spr_latency[level];
+	u16 cur_latency = dev_priv->wm.cur_latency[level];
 
 	/* WM1+ latency values stored in 0.5us units */
 	if (level > 0) {
@@ -2778,7 +2776,7 @@ static void ilk_compute_wm_level(const struct drm_i915_private *dev_priv,
 	result->enable = true;
 }
 
-static uint32_t
+static u32
 hsw_compute_linetime_wm(const struct intel_crtc_state *cstate)
 {
 	const struct intel_atomic_state *intel_state =
@@ -2807,10 +2805,10 @@ hsw_compute_linetime_wm(const struct intel_crtc_state *cstate)
 }
 
 static void intel_read_wm_latency(struct drm_i915_private *dev_priv,
-				  uint16_t wm[8])
+				  u16 wm[8])
 {
 	if (INTEL_GEN(dev_priv) >= 9) {
-		uint32_t val;
+		u32 val;
 		int ret, i;
 		int level, max_level = ilk_wm_max_level(dev_priv);
 
@@ -2894,7 +2892,7 @@ static void intel_read_wm_latency(struct drm_i915_private *dev_priv,
 			wm[0] += 1;
 
 	} else if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv)) {
-		uint64_t sskpd = I915_READ64(MCH_SSKPD);
+		u64 sskpd = I915_READ64(MCH_SSKPD);
 
 		wm[0] = (sskpd >> 56) & 0xFF;
 		if (wm[0] == 0)
@@ -2904,14 +2902,14 @@ static void intel_read_wm_latency(struct drm_i915_private *dev_priv,
 		wm[3] = (sskpd >> 20) & 0x1FF;
 		wm[4] = (sskpd >> 32) & 0x1FF;
 	} else if (INTEL_GEN(dev_priv) >= 6) {
-		uint32_t sskpd = I915_READ(MCH_SSKPD);
+		u32 sskpd = I915_READ(MCH_SSKPD);
 
 		wm[0] = (sskpd >> SSKPD_WM0_SHIFT) & SSKPD_WM_MASK;
 		wm[1] = (sskpd >> SSKPD_WM1_SHIFT) & SSKPD_WM_MASK;
 		wm[2] = (sskpd >> SSKPD_WM2_SHIFT) & SSKPD_WM_MASK;
 		wm[3] = (sskpd >> SSKPD_WM3_SHIFT) & SSKPD_WM_MASK;
 	} else if (INTEL_GEN(dev_priv) >= 5) {
-		uint32_t mltr = I915_READ(MLTR_ILK);
+		u32 mltr = I915_READ(MLTR_ILK);
 
 		/* ILK primary LP0 latency is 700 ns */
 		wm[0] = 7;
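
On HSW/BDW all five latency values are packed into the single 64-bit MCH_SSKPD register and simply shifted and masked out. A standalone sketch of the unpacking (the wm[1]/wm[2] shifts follow the surrounding driver code, which this hunk does not show):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* Mock MCH_SSKPD value with distinct latency fields packed in. */
	uint64_t sskpd = (0x12ull << 56) | (0x34ull << 4) |
			 (0x56ull << 12) | (0x78ull << 20) |
			 (0x9aull << 32);
	uint16_t wm[5];

	wm[0] = (sskpd >> 56) & 0xFF;	/* WM0 is an 8-bit field */
	wm[1] = (sskpd >> 4)  & 0xFF;
	wm[2] = (sskpd >> 12) & 0xFF;
	wm[3] = (sskpd >> 20) & 0x1FF;	/* WM3/WM4 are 9-bit fields */
	wm[4] = (sskpd >> 32) & 0x1FF;

	for (int i = 0; i < 5; i++)
		printf("wm[%d] = 0x%x\n", i, wm[i]);
	return 0;
}
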
@@ -2923,18 +2921,18 @@ static void intel_read_wm_latency(struct drm_i915_private *dev_priv,
 }
 
 static void intel_fixup_spr_wm_latency(struct drm_i915_private *dev_priv,
-				       uint16_t wm[5])
+				       u16 wm[5])
 {
 	/* ILK sprite LP0 latency is 1300 ns */
-	if (IS_GEN5(dev_priv))
+	if (IS_GEN(dev_priv, 5))
 		wm[0] = 13;
 }
 
 static void intel_fixup_cur_wm_latency(struct drm_i915_private *dev_priv,
-				       uint16_t wm[5])
+				       u16 wm[5])
 {
 	/* ILK cursor LP0 latency is 1300 ns */
-	if (IS_GEN5(dev_priv))
+	if (IS_GEN(dev_priv, 5))
 		wm[0] = 13;
 }
 
@@ -2953,7 +2951,7 @@ int ilk_wm_max_level(const struct drm_i915_private *dev_priv)
 
 static void intel_print_wm_latency(struct drm_i915_private *dev_priv,
 				   const char *name,
-				   const uint16_t wm[8])
+				   const u16 wm[8])
 {
 	int level, max_level = ilk_wm_max_level(dev_priv);
 
@@ -2982,7 +2980,7 @@ static void intel_print_wm_latency(struct drm_i915_private *dev_priv,
 }
 
 static bool ilk_increase_wm_latency(struct drm_i915_private *dev_priv,
-				    uint16_t wm[5], uint16_t min)
+				    u16 wm[5], u16 min)
 {
 	int level, max_level = ilk_wm_max_level(dev_priv);
 
@@ -2991,7 +2989,7 @@ static bool ilk_increase_wm_latency(struct drm_i915_private *dev_priv,
 
 	wm[0] = max(wm[0], min);
 	for (level = 1; level <= max_level; level++)
-		wm[level] = max_t(uint16_t, wm[level], DIV_ROUND_UP(min, 5));
+		wm[level] = max_t(u16, wm[level], DIV_ROUND_UP(min, 5));
 
 	return true;
 }
@@ -3061,7 +3059,7 @@ static void ilk_setup_wm_latency(struct drm_i915_private *dev_priv)
 	intel_print_wm_latency(dev_priv, "Sprite", dev_priv->wm.spr_latency);
 	intel_print_wm_latency(dev_priv, "Cursor", dev_priv->wm.cur_latency);
 
-	if (IS_GEN6(dev_priv)) {
+	if (IS_GEN(dev_priv, 6)) {
 		snb_wm_latency_quirk(dev_priv);
 		snb_wm_lp3_irq_quirk(dev_priv);
 	}
@@ -3073,7 +3071,7 @@ static void skl_setup_wm_latency(struct drm_i915_private *dev_priv)
 	intel_print_wm_latency(dev_priv, "Gen9 Plane", dev_priv->wm.skl_latency);
 }
 
-static bool ilk_validate_pipe_wm(struct drm_device *dev,
+static bool ilk_validate_pipe_wm(const struct drm_i915_private *dev_priv,
 				 struct intel_pipe_wm *pipe_wm)
 {
 	/* LP0 watermark maximums depend on this pipe alone */
@@ -3085,7 +3083,7 @@ static bool ilk_validate_pipe_wm(struct drm_device *dev,
 	struct ilk_wm_maximums max;
 
 	/* LP0 watermarks always use 1/2 DDB partitioning */
-	ilk_compute_wm_maximums(dev, 0, &config, INTEL_DDB_PART_1_2, &max);
+	ilk_compute_wm_maximums(dev_priv, 0, &config, INTEL_DDB_PART_1_2, &max);
 
 	/* At least LP0 must be valid */
 	if (!ilk_validate_wm_level(0, &max, &pipe_wm->wm[0])) {
@@ -3150,7 +3148,7 @@ static int ilk_compute_pipe_wm(struct intel_crtc_state *cstate)
 	if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
 		pipe_wm->linetime = hsw_compute_linetime_wm(cstate);
 
-	if (!ilk_validate_pipe_wm(dev, pipe_wm))
+	if (!ilk_validate_pipe_wm(dev_priv, pipe_wm))
 		return -EINVAL;
 
 	ilk_compute_wm_reg_maximums(dev_priv, 1, &max);
@@ -3180,17 +3178,17 @@ static int ilk_compute_pipe_wm(struct intel_crtc_state *cstate)
  * state and the new state.  These can be programmed to the hardware
  * immediately.
  */
-static int ilk_compute_intermediate_wm(struct drm_device *dev,
-				       struct intel_crtc *intel_crtc,
-				       struct intel_crtc_state *newstate)
+static int ilk_compute_intermediate_wm(struct intel_crtc_state *newstate)
 {
+	struct intel_crtc *intel_crtc = to_intel_crtc(newstate->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(intel_crtc->base.dev);
 	struct intel_pipe_wm *a = &newstate->wm.ilk.intermediate;
 	struct intel_atomic_state *intel_state =
 		to_intel_atomic_state(newstate->base.state);
 	const struct intel_crtc_state *oldstate =
 		intel_atomic_get_old_crtc_state(intel_state, intel_crtc);
 	const struct intel_pipe_wm *b = &oldstate->wm.ilk.optimal;
-	int level, max_level = ilk_wm_max_level(to_i915(dev));
+	int level, max_level = ilk_wm_max_level(dev_priv);
 
 	/*
 	 * Start with the final, target watermarks, then combine with the
@@ -3223,7 +3221,7 @@ static int ilk_compute_intermediate_wm(struct drm_device *dev,
 	 * there's no safe way to transition from the old state to
 	 * the new state, so we need to fail the atomic transaction.
 	 */
-	if (!ilk_validate_pipe_wm(dev, a))
+	if (!ilk_validate_pipe_wm(dev_priv, a))
 		return -EINVAL;
 
 	/*
@@ -3239,7 +3237,7 @@ static int ilk_compute_intermediate_wm(struct drm_device *dev,
 /*
  * Merge the watermarks from all active pipes for a specific level.
  */
-static void ilk_merge_wm_level(struct drm_device *dev,
+static void ilk_merge_wm_level(struct drm_i915_private *dev_priv,
 			       int level,
 			       struct intel_wm_level *ret_wm)
 {
@@ -3247,7 +3245,7 @@ static void ilk_merge_wm_level(struct drm_device *dev,
 
 	ret_wm->enable = true;
 
-	for_each_intel_crtc(dev, intel_crtc) {
+	for_each_intel_crtc(&dev_priv->drm, intel_crtc) {
 		const struct intel_pipe_wm *active = &intel_crtc->wm.active.ilk;
 		const struct intel_wm_level *wm = &active->wm[level];
 
@@ -3272,12 +3270,11 @@ static void ilk_merge_wm_level(struct drm_device *dev,
 /*
  * Merge all low power watermarks for all active pipes.
  */
-static void ilk_wm_merge(struct drm_device *dev,
+static void ilk_wm_merge(struct drm_i915_private *dev_priv,
 			 const struct intel_wm_config *config,
 			 const struct ilk_wm_maximums *max,
 			 struct intel_pipe_wm *merged)
 {
-	struct drm_i915_private *dev_priv = to_i915(dev);
 	int level, max_level = ilk_wm_max_level(dev_priv);
 	int last_enabled_level = max_level;
 
@@ -3293,7 +3290,7 @@ static void ilk_wm_merge(struct drm_device *dev,
 	for (level = 1; level <= max_level; level++) {
 		struct intel_wm_level *wm = &merged->wm[level];
 
-		ilk_merge_wm_level(dev, level, wm);
+		ilk_merge_wm_level(dev_priv, level, wm);
 
 		if (level > last_enabled_level)
 			wm->enable = false;
@@ -3318,7 +3315,7 @@ static void ilk_wm_merge(struct drm_device *dev,
 	 * What we should check here is whether FBC can be
 	 * enabled sometime later.
 	 */
-	if (IS_GEN5(dev_priv) && !merged->fbc_wm_enabled &&
+	if (IS_GEN(dev_priv, 5) && !merged->fbc_wm_enabled &&
 	    intel_fbc_is_active(dev_priv)) {
 		for (level = 2; level <= max_level; level++) {
 			struct intel_wm_level *wm = &merged->wm[level];
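
ilk_merge_wm_level() has to merge conservatively: an LP level is usable only if every active pipe can use it, and each merged value must cover the hungriest pipe. A reduced model of that rule (two pipes, one value per level, made-up numbers):

#include <stdbool.h>
#include <stdio.h>

struct wm_level {
	bool enable;
	unsigned int pri_val;
};

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

static struct wm_level merge_level(const struct wm_level *pipes, int n)
{
	struct wm_level ret = { .enable = true, .pri_val = 0 };

	for (int i = 0; i < n; i++) {
		/* One pipe that can't do this level disables it for all. */
		if (!pipes[i].enable)
			ret.enable = false;
		/* The merged value must satisfy the hungriest pipe. */
		ret.pri_val = max_u(ret.pri_val, pipes[i].pri_val);
	}
	return ret;
}

int main(void)
{
	struct wm_level pipes[2] = {
		{ .enable = true,  .pri_val = 24 },
		{ .enable = false, .pri_val = 40 },
	};
	struct wm_level m = merge_level(pipes, 2);

	printf("enable=%d pri_val=%u\n", m.enable, m.pri_val);
	return 0;
}
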
@@ -3335,22 +3332,20 @@ static int ilk_wm_lp_to_level(int wm_lp, const struct intel_pipe_wm *pipe_wm)
 }
 
 /* The value we need to program into the WM_LPx latency field */
-static unsigned int ilk_wm_lp_latency(struct drm_device *dev, int level)
+static unsigned int ilk_wm_lp_latency(struct drm_i915_private *dev_priv,
+				      int level)
 {
-	struct drm_i915_private *dev_priv = to_i915(dev);
-
 	if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
 		return 2 * level;
 	else
 		return dev_priv->wm.pri_latency[level];
 }
 
-static void ilk_compute_wm_results(struct drm_device *dev,
+static void ilk_compute_wm_results(struct drm_i915_private *dev_priv,
 				   const struct intel_pipe_wm *merged,
 				   enum intel_ddb_partitioning partitioning,
 				   struct ilk_wm_values *results)
 {
-	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_crtc *intel_crtc;
 	int level, wm_lp;
 
@@ -3370,7 +3365,7 @@ static void ilk_compute_wm_results(struct drm_device *dev,
 		 * disabled. Doing otherwise could cause underruns.
 		 */
 		results->wm_lp[wm_lp - 1] =
-			(ilk_wm_lp_latency(dev, level) << WM1_LP_LATENCY_SHIFT) |
+			(ilk_wm_lp_latency(dev_priv, level) << WM1_LP_LATENCY_SHIFT) |
 			(r->pri_val << WM1_LP_SR_SHIFT) |
 			r->cur_val;
 
@@ -3396,7 +3391,7 @@ static void ilk_compute_wm_results(struct drm_device *dev,
 	}
 
 	/* LP0 register values */
-	for_each_intel_crtc(dev, intel_crtc) {
+	for_each_intel_crtc(&dev_priv->drm, intel_crtc) {
 		enum pipe pipe = intel_crtc->pipe;
 		const struct intel_wm_level *r =
 			&intel_crtc->wm.active.ilk.wm[0];
@@ -3415,11 +3410,12 @@ static void ilk_compute_wm_results(struct drm_device *dev,
 
 /* Find the result with the highest level enabled. Check for enable_fbc_wm in
  * case both are at the same level. Prefer r1 in case they're the same. */
-static struct intel_pipe_wm *ilk_find_best_result(struct drm_device *dev,
-						  struct intel_pipe_wm *r1,
-						  struct intel_pipe_wm *r2)
+static struct intel_pipe_wm *
+ilk_find_best_result(struct drm_i915_private *dev_priv,
+		     struct intel_pipe_wm *r1,
+		     struct intel_pipe_wm *r2)
 {
-	int level, max_level = ilk_wm_max_level(to_i915(dev));
+	int level, max_level = ilk_wm_max_level(dev_priv);
 	int level1 = 0, level2 = 0;
 
 	for (level = 1; level <= max_level; level++) {
@@ -3540,7 +3536,7 @@ static void ilk_write_wm_values(struct drm_i915_private *dev_priv,
 {
 	struct ilk_wm_values *previous = &dev_priv->wm.hw;
 	unsigned int dirty;
-	uint32_t val;
+	u32 val;
 
 	dirty = ilk_compute_wm_dirty(dev_priv, previous, results);
 	if (!dirty)
@@ -3638,14 +3634,9 @@ static u8 intel_enabled_dbuf_slices_num(struct drm_i915_private *dev_priv)
  * FIXME: We still don't have the proper code detect if we need to apply the WA,
  * so assume we'll always need it in order to avoid underruns.
  */
-static bool skl_needs_memory_bw_wa(struct intel_atomic_state *state)
+static bool skl_needs_memory_bw_wa(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
-
-	if (IS_GEN9_BC(dev_priv) || IS_BROXTON(dev_priv))
-		return true;
-
-	return false;
+	return IS_GEN9_BC(dev_priv) || IS_BROXTON(dev_priv);
 }
 
 static bool
@@ -3677,25 +3668,25 @@ intel_enable_sagv(struct drm_i915_private *dev_priv)
 	if (dev_priv->sagv_status == I915_SAGV_ENABLED)
 		return 0;
 
-	DRM_DEBUG_KMS("Enabling the SAGV\n");
+	DRM_DEBUG_KMS("Enabling SAGV\n");
 	mutex_lock(&dev_priv->pcu_lock);
 
 	ret = sandybridge_pcode_write(dev_priv, GEN9_PCODE_SAGV_CONTROL,
 				      GEN9_SAGV_ENABLE);
 
-	/* We don't need to wait for the SAGV when enabling */
+	/* We don't need to wait for SAGV when enabling */
 	mutex_unlock(&dev_priv->pcu_lock);
 
 	/*
 	 * Some skl systems, pre-release machines in particular,
-	 * don't actually have an SAGV.
+	 * don't actually have SAGV.
 	 */
 	if (IS_SKYLAKE(dev_priv) && ret == -ENXIO) {
 		DRM_DEBUG_DRIVER("No SAGV found on system, ignoring\n");
 		dev_priv->sagv_status = I915_SAGV_NOT_CONTROLLED;
 		return 0;
 	} else if (ret < 0) {
-		DRM_ERROR("Failed to enable the SAGV\n");
+		DRM_ERROR("Failed to enable SAGV\n");
 		return ret;
 	}
 
@@ -3714,7 +3705,7 @@ intel_disable_sagv(struct drm_i915_private *dev_priv)
 	if (dev_priv->sagv_status == I915_SAGV_DISABLED)
 		return 0;
 
-	DRM_DEBUG_KMS("Disabling the SAGV\n");
+	DRM_DEBUG_KMS("Disabling SAGV\n");
 	mutex_lock(&dev_priv->pcu_lock);
 
 	/* bspec says to keep retrying for at least 1 ms */
@@ -3726,14 +3717,14 @@ intel_disable_sagv(struct drm_i915_private *dev_priv)
 
 	/*
 	 * Some skl systems, pre-release machines in particular,
-	 * don't actually have an SAGV.
+	 * don't actually have SAGV.
 	 */
 	if (IS_SKYLAKE(dev_priv) && ret == -ENXIO) {
 		DRM_DEBUG_DRIVER("No SAGV found on system, ignoring\n");
 		dev_priv->sagv_status = I915_SAGV_NOT_CONTROLLED;
 		return 0;
 	} else if (ret < 0) {
-		DRM_ERROR("Failed to disable the SAGV (%d)\n", ret);
+		DRM_ERROR("Failed to disable SAGV (%d)\n", ret);
 		return ret;
 	}
 
@@ -3756,15 +3747,15 @@ bool intel_can_enable_sagv(struct drm_atomic_state *state)
 	if (!intel_has_sagv(dev_priv))
 		return false;
 
-	if (IS_GEN9(dev_priv))
+	if (IS_GEN(dev_priv, 9))
 		sagv_block_time_us = 30;
-	else if (IS_GEN10(dev_priv))
+	else if (IS_GEN(dev_priv, 10))
 		sagv_block_time_us = 20;
 	else
 		sagv_block_time_us = 10;
 
 	/*
-	 * SKL+ workaround: bspec recommends we disable the SAGV when we have
+	 * SKL+ workaround: bspec recommends we disable SAGV when we have
 	 * more than one pipe enabled
 	 *
 	 * If there are no active CRTCs, no additional checks need be performed
@@ -3797,7 +3788,7 @@ bool intel_can_enable_sagv(struct drm_atomic_state *state)
 
 		latency = dev_priv->wm.skl_latency[level];
 
-		if (skl_needs_memory_bw_wa(intel_state) &&
+		if (skl_needs_memory_bw_wa(dev_priv) &&
 		    plane->base.state->fb->modifier ==
 		    I915_FORMAT_MOD_X_TILED)
 			latency += 15;
@@ -3805,7 +3796,7 @@ bool intel_can_enable_sagv(struct drm_atomic_state *state)
 		/*
 		 * If any of the planes on this pipe don't enable wm levels that
 		 * incur memory latencies higher than sagv_block_time_us we
-		 * can't enable the SAGV.
+		 * can't enable SAGV.
 		 */
 		if (latency < sagv_block_time_us)
 			return false;
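
intel_can_enable_sagv() reduces to a per-plane test: the latency of each plane's highest enabled watermark level, plus 15us for the X-tiled bandwidth workaround, must be at least sagv_block_time_us. A condensed sketch of that test (made-up plane data):

#include <stdbool.h>
#include <stdio.h>

struct plane {
	bool x_tiled;
	unsigned int max_level_latency_us; /* latency of highest enabled WM */
};

static bool can_enable_sagv(const struct plane *planes, int n,
			    unsigned int sagv_block_time_us,
			    bool needs_memory_bw_wa)
{
	for (int i = 0; i < n; i++) {
		unsigned int latency = planes[i].max_level_latency_us;

		if (needs_memory_bw_wa && planes[i].x_tiled)
			latency += 15;

		/* This plane can't absorb a SAGV block: give up. */
		if (latency < sagv_block_time_us)
			return false;
	}
	return true;
}

int main(void)
{
	struct plane planes[2] = {
		{ .x_tiled = true,  .max_level_latency_us = 20 },
		{ .x_tiled = false, .max_level_latency_us = 35 },
	};

	/* Gen9 blocks memory for up to 30us during a SAGV transition. */
	printf("SAGV ok: %d\n", can_enable_sagv(planes, 2, 30, true));
	return 0;
}
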
@@ -3834,8 +3825,13 @@ static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv,
 
 	/*
 	 * 12GB/s is maximum BW supported by single DBuf slice.
+	 *
+	 * FIXME dbuf slice code is broken:
+	 * - must wait for planes to stop using the slice before powering it off
+	 * - plane straddling both slices is illegal in multi-pipe scenarios
+	 * - should validate we stay within the hw bandwidth limits
 	 */
-	if (num_active > 1 || total_data_bw >= GBps(12)) {
+	if (0 && (num_active > 1 || total_data_bw >= GBps(12))) {
 		ddb->enabled_slices = 2;
 	} else {
 		ddb->enabled_slices = 1;
@@ -3934,14 +3930,9 @@ static unsigned int skl_cursor_allocation(int num_active)
 static void skl_ddb_entry_init_from_hw(struct drm_i915_private *dev_priv,
 				       struct skl_ddb_entry *entry, u32 reg)
 {
-	u16 mask;
 
-	if (INTEL_GEN(dev_priv) >= 11)
-		mask = ICL_DDB_ENTRY_MASK;
-	else
-		mask = SKL_DDB_ENTRY_MASK;
-	entry->start = reg & mask;
-	entry->end = (reg >> DDB_ENTRY_END_SHIFT) & mask;
+	entry->start = reg & DDB_ENTRY_MASK;
+	entry->end = (reg >> DDB_ENTRY_END_SHIFT) & DDB_ENTRY_MASK;
 
 	if (entry->end)
 		entry->end += 1;
@@ -3994,10 +3985,12 @@ void skl_pipe_ddb_get_hw_state(struct intel_crtc *crtc,
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 	enum intel_display_power_domain power_domain;
 	enum pipe pipe = crtc->pipe;
+	intel_wakeref_t wakeref;
 	enum plane_id plane_id;
 
 	power_domain = POWER_DOMAIN_PIPE(pipe);
-	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv, power_domain);
+	if (!wakeref)
 		return;
 
 	for_each_plane_id_on_crtc(crtc, plane_id)
@@ -4006,7 +3999,7 @@ void skl_pipe_ddb_get_hw_state(struct intel_crtc *crtc,
 					   &ddb_y[plane_id],
 					   &ddb_uv[plane_id]);
 
-	intel_display_power_put(dev_priv, power_domain);
+	intel_display_power_put(dev_priv, power_domain, wakeref);
 }
 
 void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
@@ -4036,7 +4029,7 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 			   const struct intel_plane_state *pstate)
 {
 	struct intel_plane *plane = to_intel_plane(pstate->base.plane);
-	uint32_t src_w, src_h, dst_w, dst_h;
+	u32 src_w, src_h, dst_w, dst_h;
 	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
 	uint_fixed_16_16_t downscale_h, downscale_w;
 
@@ -4082,8 +4075,8 @@ skl_pipe_downscale_amount(const struct intel_crtc_state *crtc_state)
 		return pipe_downscale;
 
 	if (crtc_state->pch_pfit.enabled) {
-		uint32_t src_w, src_h, dst_w, dst_h;
-		uint32_t pfit_size = crtc_state->pch_pfit.size;
+		u32 src_w, src_h, dst_w, dst_h;
+		u32 pfit_size = crtc_state->pch_pfit.size;
 		uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
 		uint_fixed_16_16_t downscale_h, downscale_w;
 
@@ -4116,7 +4109,7 @@ int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
 	const struct drm_plane_state *pstate;
 	struct intel_plane_state *intel_pstate;
 	int crtc_clock, dotclk;
-	uint32_t pipe_max_pixel_rate;
+	u32 pipe_max_pixel_rate;
 	uint_fixed_16_16_t pipe_downscale;
 	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
 
@@ -4172,8 +4165,8 @@ skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
 {
 	struct intel_plane *intel_plane =
 		to_intel_plane(intel_pstate->base.plane);
-	uint32_t data_rate;
-	uint32_t width = 0, height = 0;
+	u32 data_rate;
+	u32 width = 0, height = 0;
 	struct drm_framebuffer *fb;
 	u32 format;
 	uint_fixed_16_16_t down_scale_amount;
@@ -4306,102 +4299,6 @@ icl_get_total_relative_data_rate(struct intel_crtc_state *intel_cstate,
 	return total_data_rate;
 }
 
-static uint16_t
-skl_ddb_min_alloc(const struct drm_plane_state *pstate, const int plane)
-{
-	struct drm_framebuffer *fb = pstate->fb;
-	struct intel_plane_state *intel_pstate = to_intel_plane_state(pstate);
-	uint32_t src_w, src_h;
-	uint32_t min_scanlines = 8;
-	uint8_t plane_bpp;
-
-	if (WARN_ON(!fb))
-		return 0;
-
-	/* For packed formats, and uv-plane, return 0 */
-	if (plane == 1 && fb->format->format != DRM_FORMAT_NV12)
-		return 0;
-
-	/* For Non Y-tile return 8-blocks */
-	if (fb->modifier != I915_FORMAT_MOD_Y_TILED &&
-	    fb->modifier != I915_FORMAT_MOD_Yf_TILED &&
-	    fb->modifier != I915_FORMAT_MOD_Y_TILED_CCS &&
-	    fb->modifier != I915_FORMAT_MOD_Yf_TILED_CCS)
-		return 8;
-
-	/*
-	 * Src coordinates are already rotated by 270 degrees for
-	 * the 90/270 degree plane rotation cases (to match the
-	 * GTT mapping), hence no need to account for rotation here.
-	 */
-	src_w = drm_rect_width(&intel_pstate->base.src) >> 16;
-	src_h = drm_rect_height(&intel_pstate->base.src) >> 16;
-
-	/* Halve UV plane width and height for NV12 */
-	if (plane == 1) {
-		src_w /= 2;
-		src_h /= 2;
-	}
-
-	plane_bpp = fb->format->cpp[plane];
-
-	if (drm_rotation_90_or_270(pstate->rotation)) {
-		switch (plane_bpp) {
-		case 1:
-			min_scanlines = 32;
-			break;
-		case 2:
-			min_scanlines = 16;
-			break;
-		case 4:
-			min_scanlines = 8;
-			break;
-		case 8:
-			min_scanlines = 4;
-			break;
-		default:
-			WARN(1, "Unsupported pixel depth %u for rotation",
-			     plane_bpp);
-			min_scanlines = 32;
-		}
-	}
-
-	return DIV_ROUND_UP((4 * src_w * plane_bpp), 512) * min_scanlines/4 + 3;
-}
-
-static void
-skl_ddb_calc_min(const struct intel_crtc_state *cstate, int num_active,
-		 uint16_t *minimum, uint16_t *uv_minimum)
-{
-	const struct drm_plane_state *pstate;
-	struct drm_plane *plane;
-
-	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, &cstate->base) {
-		enum plane_id plane_id = to_intel_plane(plane)->id;
-		struct intel_plane_state *plane_state = to_intel_plane_state(pstate);
-
-		if (plane_id == PLANE_CURSOR)
-			continue;
-
-		/* slave plane must be invisible and calculated from master */
-		if (!pstate->visible || WARN_ON(plane_state->slave))
-			continue;
-
-		if (!plane_state->linked_plane) {
-			minimum[plane_id] = skl_ddb_min_alloc(pstate, 0);
-			uv_minimum[plane_id] = skl_ddb_min_alloc(pstate, 1);
-		} else {
-			enum plane_id y_plane_id =
-				plane_state->linked_plane->id;
-
-			minimum[y_plane_id] = skl_ddb_min_alloc(pstate, 0);
-			minimum[plane_id] = skl_ddb_min_alloc(pstate, 1);
-		}
-	}
-
-	minimum[PLANE_CURSOR] = skl_cursor_allocation(num_active);
-}
-
 static int
 skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		      struct skl_ddb_allocation *ddb /* out */)
@@ -4411,15 +4308,17 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	struct drm_i915_private *dev_priv = to_i915(crtc->dev);
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	struct skl_ddb_entry *alloc = &cstate->wm.skl.ddb;
-	uint16_t alloc_size, start;
-	uint16_t minimum[I915_MAX_PLANES] = {};
-	uint16_t uv_minimum[I915_MAX_PLANES] = {};
+	struct skl_plane_wm *wm;
+	u16 alloc_size, start = 0;
+	u16 total[I915_MAX_PLANES] = {};
+	u16 uv_total[I915_MAX_PLANES] = {};
 	u64 total_data_rate;
 	enum plane_id plane_id;
 	int num_active;
 	u64 plane_data_rate[I915_MAX_PLANES] = {};
 	u64 uv_plane_data_rate[I915_MAX_PLANES] = {};
-	uint16_t total_min_blocks = 0;
+	u32 blocks;
+	int level;
 
 	/* Clear the partitioning for disabled planes. */
 	memset(cstate->wm.skl.plane_ddb_y, 0, sizeof(cstate->wm.skl.plane_ddb_y));
@@ -4449,81 +4348,135 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	if (alloc_size == 0)
 		return 0;
 
-	skl_ddb_calc_min(cstate, num_active, minimum, uv_minimum);
+	/* Allocate fixed number of blocks for cursor. */
+	total[PLANE_CURSOR] = skl_cursor_allocation(num_active);
+	alloc_size -= total[PLANE_CURSOR];
+	cstate->wm.skl.plane_ddb_y[PLANE_CURSOR].start =
+		alloc->end - total[PLANE_CURSOR];
+	cstate->wm.skl.plane_ddb_y[PLANE_CURSOR].end = alloc->end;
+
+	if (total_data_rate == 0)
+		return 0;
 
 	/*
-	 * 1. Allocate the mininum required blocks for each active plane
-	 * and allocate the cursor, it doesn't require extra allocation
-	 * proportional to the data rate.
+	 * Find the highest watermark level for which we can satisfy the block
+	 * requirement of active planes.
 	 */
+	for (level = ilk_wm_max_level(dev_priv); level >= 0; level--) {
+		blocks = 0;
+		for_each_plane_id_on_crtc(intel_crtc, plane_id) {
+			if (plane_id == PLANE_CURSOR)
+				continue;
 
-	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
-		total_min_blocks += minimum[plane_id];
-		total_min_blocks += uv_minimum[plane_id];
+			wm = &cstate->wm.skl.optimal.planes[plane_id];
+			blocks += wm->wm[level].min_ddb_alloc;
+			blocks += wm->uv_wm[level].min_ddb_alloc;
+		}
+
+		if (blocks < alloc_size) {
+			alloc_size -= blocks;
+			break;
+		}
 	}
 
-	if (total_min_blocks > alloc_size) {
+	if (level < 0) {
 		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
-		DRM_DEBUG_KMS("minimum required %d/%d\n", total_min_blocks,
-							alloc_size);
+		DRM_DEBUG_KMS("minimum required %d/%d\n", blocks,
+			      alloc_size);
 		return -EINVAL;
 	}
 
-	alloc_size -= total_min_blocks;
-	cstate->wm.skl.plane_ddb_y[PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
-	cstate->wm.skl.plane_ddb_y[PLANE_CURSOR].end = alloc->end;
-
 	/*
-	 * 2. Distribute the remaining space in proportion to the amount of
-	 * data each plane needs to fetch from memory.
-	 *
-	 * FIXME: we may not allocate every single block here.
+	 * Grant each plane the blocks it requires at the highest achievable
+	 * watermark level, plus an extra share of the leftover blocks
+	 * proportional to its relative data rate.
 	 */
-	if (total_data_rate == 0)
-		return 0;
-
-	start = alloc->start;
 	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
-		u64 data_rate, uv_data_rate;
-		uint16_t plane_blocks, uv_plane_blocks;
+		u64 rate;
+		u16 extra;
 
 		if (plane_id == PLANE_CURSOR)
 			continue;
 
-		data_rate = plane_data_rate[plane_id];
-
 		/*
-		 * allocation for (packed formats) or (uv-plane part of planar format):
-		 * promote the expression to 64 bits to avoid overflowing, the
-		 * result is < available as data_rate / total_data_rate < 1
+		 * We've accounted for all active planes; remaining planes are
+		 * all disabled.
 		 */
-		plane_blocks = minimum[plane_id];
-		plane_blocks += div64_u64(alloc_size * data_rate, total_data_rate);
+		if (total_data_rate == 0)
+			break;
 
-		/* Leave disabled planes at (0,0) */
-		if (data_rate) {
-			cstate->wm.skl.plane_ddb_y[plane_id].start = start;
-			cstate->wm.skl.plane_ddb_y[plane_id].end = start + plane_blocks;
-		}
+		wm = &cstate->wm.skl.optimal.planes[plane_id];
+
+		rate = plane_data_rate[plane_id];
+		extra = min_t(u16, alloc_size,
+			      DIV64_U64_ROUND_UP(alloc_size * rate,
+						 total_data_rate));
+		total[plane_id] = wm->wm[level].min_ddb_alloc + extra;
+		alloc_size -= extra;
+		total_data_rate -= rate;
 
-		start += plane_blocks;
+		if (total_data_rate == 0)
+			break;
 
-		/* Allocate DDB for UV plane for planar format/NV12 */
-		uv_data_rate = uv_plane_data_rate[plane_id];
+		rate = uv_plane_data_rate[plane_id];
+		extra = min_t(u16, alloc_size,
+			      DIV64_U64_ROUND_UP(alloc_size * rate,
+						 total_data_rate));
+		uv_total[plane_id] = wm->uv_wm[level].min_ddb_alloc + extra;
+		alloc_size -= extra;
+		total_data_rate -= rate;
+	}
+	WARN_ON(alloc_size != 0 || total_data_rate != 0);
 
-		uv_plane_blocks = uv_minimum[plane_id];
-		uv_plane_blocks += div64_u64(alloc_size * uv_data_rate, total_data_rate);
+	/* Set the actual DDB start/end points for each plane */
+	start = alloc->start;
+	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
+		struct skl_ddb_entry *plane_alloc, *uv_plane_alloc;
+
+		if (plane_id == PLANE_CURSOR)
+			continue;
+
+		plane_alloc = &cstate->wm.skl.plane_ddb_y[plane_id];
+		uv_plane_alloc = &cstate->wm.skl.plane_ddb_uv[plane_id];
 
 		/* Gen11+ uses a separate plane for UV watermarks */
-		WARN_ON(INTEL_GEN(dev_priv) >= 11 && uv_plane_blocks);
+		WARN_ON(INTEL_GEN(dev_priv) >= 11 && uv_total[plane_id]);
+
+		/* Leave disabled planes at (0,0) */
+		if (total[plane_id]) {
+			plane_alloc->start = start;
+			start += total[plane_id];
+			plane_alloc->end = start;
+		}
+
+		if (uv_total[plane_id]) {
+			uv_plane_alloc->start = start;
+			start += uv_total[plane_id];
+			uv_plane_alloc->end = start;
+		}
+	}
 
-		if (uv_data_rate) {
-			cstate->wm.skl.plane_ddb_uv[plane_id].start = start;
-			cstate->wm.skl.plane_ddb_uv[plane_id].end =
-				start + uv_plane_blocks;
+	/*
+	 * When we calculated watermark values we didn't know how high
+	 * of a level we'd actually be able to hit, so we just marked
+	 * all levels as "enabled."  Go back now and disable the ones
+	 * that aren't actually possible.
+	 */
+	for (level++; level <= ilk_wm_max_level(dev_priv); level++) {
+		for_each_plane_id_on_crtc(intel_crtc, plane_id) {
+			wm = &cstate->wm.skl.optimal.planes[plane_id];
+			memset(&wm->wm[level], 0, sizeof(wm->wm[level]));
 		}
+	}
 
-		start += uv_plane_blocks;
+	/*
+	 * Go back and disable the transition watermark if it turns out we
+	 * don't have enough DDB blocks for it.
+	 */
+	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
+		wm = &cstate->wm.skl.optimal.planes[plane_id];
+		if (wm->trans_wm.plane_res_b >= total[plane_id])
+			memset(&wm->trans_wm, 0, sizeof(wm->trans_wm));
 	}
 
 	return 0;
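
The rewritten skl_allocate_pipe_ddb() works in two phases: walk the watermark levels from highest to lowest until the sum of every plane's min_ddb_alloc at that level fits in the pipe's allocation, then hand each plane its minimum for that level plus a share of the leftover blocks proportional to its data rate. A compact standalone model of those two phases (toy numbers, Y planes only, no UV):

#include <stdint.h>
#include <stdio.h>

#define NUM_PLANES 3
#define NUM_LEVELS 8

#define DIV64_U64_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

int main(void)
{
	/* min_alloc[plane][level]: blocks needed to enable that level. */
	uint16_t min_alloc[NUM_PLANES][NUM_LEVELS] = {
		{ 40, 60, 90, 140, 200, 300, 500, 800 },
		{ 20, 30, 45,  70, 100, 150, 250, 400 },
		{ 10, 15, 25,  40,  60,  90, 150, 240 },
	};
	uint64_t data_rate[NUM_PLANES] = { 600, 300, 100 };
	uint16_t total[NUM_PLANES] = { 0 };
	uint16_t alloc_size = 512;	/* blocks left after the cursor */
	uint64_t total_rate = 0;
	uint32_t blocks = 0;
	int level, p;

	for (p = 0; p < NUM_PLANES; p++)
		total_rate += data_rate[p];

	/* Phase 1: highest level whose combined minimums still fit. */
	for (level = NUM_LEVELS - 1; level >= 0; level--) {
		blocks = 0;
		for (p = 0; p < NUM_PLANES; p++)
			blocks += min_alloc[p][level];
		if (blocks < alloc_size) {
			alloc_size -= blocks;
			break;
		}
	}
	if (level < 0) {
		fprintf(stderr, "config exceeds DDB: need %u blocks\n", blocks);
		return 1;
	}

	/* Phase 2: minimum for that level + proportional leftovers. */
	for (p = 0; p < NUM_PLANES; p++) {
		uint16_t extra;

		if (total_rate == 0)
			break;
		extra = (uint16_t)DIV64_U64_ROUND_UP(
				(uint64_t)alloc_size * data_rate[p],
				total_rate);
		if (extra > alloc_size)
			extra = alloc_size;
		total[p] = min_alloc[p][level] + extra;
		alloc_size -= extra;
		total_rate -= data_rate[p];
	}
	/* Like the driver, every block and every rate is consumed here. */

	printf("achieved level %d\n", level);
	for (p = 0; p < NUM_PLANES; p++)
		printf("plane %d: %u blocks\n", p, total[p]);
	return 0;
}
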
@@ -4536,10 +4489,10 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
  * 2xcdclk is 1350 MHz and the pixel rate should never exceed that.
 */
 static uint_fixed_16_16_t
-skl_wm_method1(const struct drm_i915_private *dev_priv, uint32_t pixel_rate,
-	       uint8_t cpp, uint32_t latency, uint32_t dbuf_block_size)
+skl_wm_method1(const struct drm_i915_private *dev_priv, u32 pixel_rate,
+	       u8 cpp, u32 latency, u32 dbuf_block_size)
 {
-	uint32_t wm_intermediate_val;
+	u32 wm_intermediate_val;
 	uint_fixed_16_16_t ret;
 
 	if (latency == 0)
@@ -4554,12 +4507,11 @@ skl_wm_method1(const struct drm_i915_private *dev_priv, uint32_t pixel_rate,
 	return ret;
 }
 
-static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
-			uint32_t pipe_htotal,
-			uint32_t latency,
-			uint_fixed_16_16_t plane_blocks_per_line)
+static uint_fixed_16_16_t
+skl_wm_method2(u32 pixel_rate, u32 pipe_htotal, u32 latency,
+	       uint_fixed_16_16_t plane_blocks_per_line)
 {
-	uint32_t wm_intermediate_val;
+	u32 wm_intermediate_val;
 	uint_fixed_16_16_t ret;
 
 	if (latency == 0)
@@ -4575,8 +4527,8 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
 static uint_fixed_16_16_t
 intel_get_linetime_us(const struct intel_crtc_state *cstate)
 {
-	uint32_t pixel_rate;
-	uint32_t crtc_htotal;
+	u32 pixel_rate;
+	u32 crtc_htotal;
 	uint_fixed_16_16_t linetime_us;
 
 	if (!cstate->base.active)
@@ -4593,11 +4545,11 @@ intel_get_linetime_us(const struct intel_crtc_state *cstate)
 	return linetime_us;
 }
 
-static uint32_t
+static u32
 skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
 			      const struct intel_plane_state *pstate)
 {
-	uint64_t adjusted_pixel_rate;
+	u64 adjusted_pixel_rate;
 	uint_fixed_16_16_t downscale_amount;
 
 	/* Shouldn't reach here on disabled planes... */
@@ -4624,10 +4576,7 @@ skl_compute_plane_wm_params(const struct intel_crtc_state *cstate,
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
 	const struct drm_plane_state *pstate = &intel_pstate->base;
 	const struct drm_framebuffer *fb = pstate->fb;
-	uint32_t interm_pbpl;
-	struct intel_atomic_state *state =
-		to_intel_atomic_state(cstate->base.state);
-	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
+	u32 interm_pbpl;
 
 	/* only NV12 format has two planes */
 	if (color_plane == 1 && fb->format->format != DRM_FORMAT_NV12) {
@@ -4663,7 +4612,7 @@ skl_compute_plane_wm_params(const struct intel_crtc_state *cstate,
 							     intel_pstate);
 
 	if (INTEL_GEN(dev_priv) >= 11 &&
-	    fb->modifier == I915_FORMAT_MOD_Yf_TILED && wp->cpp == 8)
+	    fb->modifier == I915_FORMAT_MOD_Yf_TILED && wp->cpp == 1)
 		wp->dbuf_block_size = 256;
 	else
 		wp->dbuf_block_size = 512;
@@ -4688,7 +4637,7 @@ skl_compute_plane_wm_params(const struct intel_crtc_state *cstate,
 		wp->y_min_scanlines = 4;
 	}
 
-	if (apply_memory_bw_wa)
+	if (skl_needs_memory_bw_wa(dev_priv))
 		wp->y_min_scanlines *= 2;
 
 	wp->plane_bytes_per_line = wp->width * wp->cpp;
@@ -4702,7 +4651,7 @@ skl_compute_plane_wm_params(const struct intel_crtc_state *cstate,
 
 		wp->plane_blocks_per_line = div_fixed16(interm_pbpl,
 							wp->y_min_scanlines);
-	} else if (wp->x_tiled && IS_GEN9(dev_priv)) {
+	} else if (wp->x_tiled && IS_GEN(dev_priv, 9)) {
 		interm_pbpl = DIV_ROUND_UP(wp->plane_bytes_per_line,
 					   wp->dbuf_block_size);
 		wp->plane_blocks_per_line = u32_to_fixed16(interm_pbpl);
@@ -4720,28 +4669,34 @@ skl_compute_plane_wm_params(const struct intel_crtc_state *cstate,
 	return 0;
 }
 
-static int skl_compute_plane_wm(const struct intel_crtc_state *cstate,
-				const struct intel_plane_state *intel_pstate,
-				uint16_t ddb_allocation,
-				int level,
-				const struct skl_wm_params *wp,
-				const struct skl_wm_level *result_prev,
-				struct skl_wm_level *result /* out */)
+static bool skl_wm_has_lines(struct drm_i915_private *dev_priv, int level)
+{
+	if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv))
+		return true;
+
+	/* The number of lines is ignored for the level 0 watermark. */
+	return level > 0;
+}
+
+static void skl_compute_plane_wm(const struct intel_crtc_state *cstate,
+				 const struct intel_plane_state *intel_pstate,
+				 int level,
+				 const struct skl_wm_params *wp,
+				 const struct skl_wm_level *result_prev,
+				 struct skl_wm_level *result /* out */)
 {
 	struct drm_i915_private *dev_priv =
 		to_i915(intel_pstate->base.plane->dev);
-	const struct drm_plane_state *pstate = &intel_pstate->base;
-	uint32_t latency = dev_priv->wm.skl_latency[level];
+	u32 latency = dev_priv->wm.skl_latency[level];
 	uint_fixed_16_16_t method1, method2;
 	uint_fixed_16_16_t selected_result;
-	uint32_t res_blocks, res_lines;
-	struct intel_atomic_state *state =
-		to_intel_atomic_state(cstate->base.state);
-	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
-	uint32_t min_disp_buf_needed;
+	u32 res_blocks, res_lines, min_ddb_alloc = 0;
 
-	if (latency == 0)
-		return level == 0 ? -EINVAL : 0;
+	if (latency == 0) {
+		/* reject it */
+		result->min_ddb_alloc = U16_MAX;
+		return;
+	}
 
 	/* Display WA #1141: kbl,cfl */
 	if ((IS_KABYLAKE(dev_priv) || IS_COFFEELAKE(dev_priv) ||
@@ -4749,7 +4704,7 @@ static int skl_compute_plane_wm(const struct intel_crtc_state *cstate,
 	    dev_priv->ipc_enabled)
 		latency += 4;
 
-	if (apply_memory_bw_wa && wp->x_tiled)
+	if (skl_needs_memory_bw_wa(dev_priv) && wp->x_tiled)
 		latency += 15;
 
 	method1 = skl_wm_method1(dev_priv, wp->plane_pixel_rate,
@@ -4766,15 +4721,8 @@ static int skl_compute_plane_wm(const struct intel_crtc_state *cstate,
 		     wp->dbuf_block_size < 1) &&
 		     (wp->plane_bytes_per_line / wp->dbuf_block_size < 1)) {
 			selected_result = method2;
-		} else if (ddb_allocation >=
-			 fixed16_to_u32_round_up(wp->plane_blocks_per_line)) {
-			if (IS_GEN9(dev_priv) &&
-			    !IS_GEMINILAKE(dev_priv))
-				selected_result = min_fixed16(method1, method2);
-			else
-				selected_result = method2;
 		} else if (latency >= wp->linetime_us) {
-			if (IS_GEN9(dev_priv) &&
+			if (IS_GEN(dev_priv, 9) &&
 			    !IS_GEMINILAKE(dev_priv))
 				selected_result = min_fixed16(method1, method2);
 			else
@@ -4788,85 +4736,76 @@ static int skl_compute_plane_wm(const struct intel_crtc_state *cstate,
 	res_lines = div_round_up_fixed16(selected_result,
 					 wp->plane_blocks_per_line);
 
-	/* Display WA #1125: skl,bxt,kbl,glk */
-	if (level == 0 && wp->rc_surface)
-		res_blocks += fixed16_to_u32_round_up(wp->y_tile_minimum);
+	if (IS_GEN9_BC(dev_priv) || IS_BROXTON(dev_priv)) {
+		/* Display WA #1125: skl,bxt,kbl */
+		if (level == 0 && wp->rc_surface)
+			res_blocks +=
+				fixed16_to_u32_round_up(wp->y_tile_minimum);
+
+		/* Display WA #1126: skl,bxt,kbl */
+		if (level >= 1 && level <= 7) {
+			if (wp->y_tiled) {
+				res_blocks +=
+				    fixed16_to_u32_round_up(wp->y_tile_minimum);
+				res_lines += wp->y_min_scanlines;
+			} else {
+				res_blocks++;
+			}
 
-	/* Display WA #1126: skl,bxt,kbl,glk */
-	if (level >= 1 && level <= 7) {
-		if (wp->y_tiled) {
-			res_blocks += fixed16_to_u32_round_up(
-							wp->y_tile_minimum);
-			res_lines += wp->y_min_scanlines;
-		} else {
-			res_blocks++;
+			/*
+			 * Make sure result blocks for higher latency levels are
+			 * at least as high as the level below the current level.
+			 * Assumption in DDB algorithm optimization for special
+			 * cases. Also covers Display WA #1125 for RC.
+			 */
+			if (result_prev->plane_res_b > res_blocks)
+				res_blocks = result_prev->plane_res_b;
 		}
-
-		/*
-		 * Make sure result blocks for higher latency levels are atleast
-		 * as high as level below the current level.
-		 * Assumption in DDB algorithm optimization for special cases.
-		 * Also covers Display WA #1125 for RC.
-		 */
-		if (result_prev->plane_res_b > res_blocks)
-			res_blocks = result_prev->plane_res_b;
 	}
 
 	if (INTEL_GEN(dev_priv) >= 11) {
 		if (wp->y_tiled) {
-			uint32_t extra_lines;
-			uint_fixed_16_16_t fp_min_disp_buf_needed;
+			int extra_lines;
 
 			if (res_lines % wp->y_min_scanlines == 0)
 				extra_lines = wp->y_min_scanlines;
 			else
 				extra_lines = wp->y_min_scanlines * 2 -
-					      res_lines % wp->y_min_scanlines;
+					res_lines % wp->y_min_scanlines;
 
-			fp_min_disp_buf_needed = mul_u32_fixed16(res_lines +
-						extra_lines,
-						wp->plane_blocks_per_line);
-			min_disp_buf_needed = fixed16_to_u32_round_up(
-						fp_min_disp_buf_needed);
+			min_ddb_alloc = mul_round_up_u32_fixed16(res_lines + extra_lines,
+								 wp->plane_blocks_per_line);
 		} else {
-			min_disp_buf_needed = DIV_ROUND_UP(res_blocks * 11, 10);
+			min_ddb_alloc = res_blocks +
+				DIV_ROUND_UP(res_blocks, 10);
 		}
-	} else {
-		min_disp_buf_needed = res_blocks;
 	}
 
-	if ((level > 0 && res_lines > 31) ||
-	    res_blocks >= ddb_allocation ||
-	    min_disp_buf_needed >= ddb_allocation) {
-		/*
-		 * If there are no valid level 0 watermarks, then we can't
-		 * support this display configuration.
-		 */
-		if (level) {
-			return 0;
-		} else {
-			struct drm_plane *plane = pstate->plane;
+	if (!skl_wm_has_lines(dev_priv, level))
+		res_lines = 0;
 
-			DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
-			DRM_DEBUG_KMS("[PLANE:%d:%s] blocks required = %u/%u, lines required = %u/31\n",
-				      plane->base.id, plane->name,
-				      res_blocks, ddb_allocation, res_lines);
-			return -EINVAL;
-		}
+	if (res_lines > 31) {
+		/* reject it */
+		result->min_ddb_alloc = U16_MAX;
+		return;
 	}
 
-	/* The number of lines are ignored for the level 0 watermark. */
+	/*
+	 * If res_lines is valid, assume we can use this watermark level
+	 * for now.  We'll come back and disable it after we calculate the
+	 * DDB allocation if it turns out we don't actually have enough
+	 * blocks to satisfy it.
+	 */
 	result->plane_res_b = res_blocks;
 	result->plane_res_l = res_lines;
+	/* Bspec says: value >= plane ddb allocation -> invalid, hence the +1 here */
+	result->min_ddb_alloc = max(min_ddb_alloc, res_blocks) + 1;
 	result->plane_en = true;
-
-	return 0;
 }
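
The rework above inverts the old contract: rather than failing with -EINVAL against a DDB allocation that was already decided, skl_compute_plane_wm() now records the smallest allocation that would satisfy the level in result->min_ddb_alloc (U16_MAX meaning the level can never be satisfied) and leaves the accept/reject decision to the later DDB pass. A minimal userspace sketch of the gen11 y-tiled rounding, with invented numbers and ceil() standing in for the fixed16 round-up helper:

	#include <math.h>
	#include <stdint.h>
	#include <stdio.h>

	/* Sketch only, not kernel code: round res_lines up to the next
	 * y-tile boundary and add one more tile, as the extra_lines
	 * computation above does, then convert lines to blocks. */
	static uint32_t min_ddb_blocks(uint32_t res_lines,
				       uint32_t y_min_scanlines,
				       double plane_blocks_per_line)
	{
		uint32_t extra_lines;

		if (res_lines % y_min_scanlines == 0)
			extra_lines = y_min_scanlines;
		else
			extra_lines = 2 * y_min_scanlines -
				      res_lines % y_min_scanlines;

		return (uint32_t)ceil((res_lines + extra_lines) *
				      plane_blocks_per_line);
	}

	int main(void)
	{
		/* 10 lines, 4 scanlines per y-tile, 2.5 blocks per line:
		 * extra_lines = 8 - 2 = 6, (10 + 6) * 2.5 = 40 blocks. */
		printf("%u\n", min_ddb_blocks(10, 4, 2.5));
		return 0;
	}
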
 
-static int
+static void
 skl_compute_wm_levels(const struct intel_crtc_state *cstate,
 		      const struct intel_plane_state *intel_pstate,
-		      uint16_t ddb_blocks,
 		      const struct skl_wm_params *wm_params,
 		      struct skl_wm_level *levels)
 {
@@ -4874,45 +4813,30 @@ skl_compute_wm_levels(const struct intel_crtc_state *cstate,
 		to_i915(intel_pstate->base.plane->dev);
 	int level, max_level = ilk_wm_max_level(dev_priv);
 	struct skl_wm_level *result_prev = &levels[0];
-	int ret;
 
 	for (level = 0; level <= max_level; level++) {
 		struct skl_wm_level *result = &levels[level];
 
-		ret = skl_compute_plane_wm(cstate,
-					   intel_pstate,
-					   ddb_blocks,
-					   level,
-					   wm_params,
-					   result_prev,
-					   result);
-		if (ret)
-			return ret;
+		skl_compute_plane_wm(cstate, intel_pstate, level, wm_params,
+				     result_prev, result);
 
 		result_prev = result;
 	}
-
-	return 0;
 }
 
-static uint32_t
+static u32
 skl_compute_linetime_wm(const struct intel_crtc_state *cstate)
 {
 	struct drm_atomic_state *state = cstate->base.state;
 	struct drm_i915_private *dev_priv = to_i915(state->dev);
 	uint_fixed_16_16_t linetime_us;
-	uint32_t linetime_wm;
+	u32 linetime_wm;
 
 	linetime_us = intel_get_linetime_us(cstate);
-
-	if (is_fixed16_zero(linetime_us))
-		return 0;
-
 	linetime_wm = fixed16_to_u32_round_up(mul_u32_fixed16(8, linetime_us));
 
-	/* Display WA #1135: bxt:ALL GLK:ALL */
-	if ((IS_BROXTON(dev_priv) || IS_GEMINILAKE(dev_priv)) &&
-	    dev_priv->ipc_enabled)
+	/* Display WA #1135: BXT:ALL GLK:ALL */
+	if (IS_GEN9_LP(dev_priv) && dev_priv->ipc_enabled)
 		linetime_wm /= 2;
 
 	return linetime_wm;
@@ -4920,14 +4844,13 @@ skl_compute_linetime_wm(const struct intel_crtc_state *cstate)
 
 static void skl_compute_transition_wm(const struct intel_crtc_state *cstate,
 				      const struct skl_wm_params *wp,
-				      struct skl_plane_wm *wm,
-				      uint16_t ddb_allocation)
+				      struct skl_plane_wm *wm)
 {
 	struct drm_device *dev = cstate->base.crtc->dev;
 	const struct drm_i915_private *dev_priv = to_i915(dev);
-	uint16_t trans_min, trans_y_tile_min;
-	const uint16_t trans_amount = 10; /* This is configurable amount */
-	uint16_t wm0_sel_res_b, trans_offset_b, res_blocks;
+	u16 trans_min, trans_y_tile_min;
+	const u16 trans_amount = 10; /* This is a configurable amount */
+	u16 wm0_sel_res_b, trans_offset_b, res_blocks;
 
 	/* Transition WM are not recommended by HW team for GEN9 */
 	if (INTEL_GEN(dev_priv) <= 9)
@@ -4956,8 +4879,8 @@ static void skl_compute_transition_wm(const struct intel_crtc_state *cstate,
 	wm0_sel_res_b = wm->wm[0].plane_res_b - 1;
 
 	if (wp->y_tiled) {
-		trans_y_tile_min = (uint16_t) mul_round_up_u32_fixed16(2,
-							wp->y_tile_minimum);
+		trans_y_tile_min =
+			(u16)mul_round_up_u32_fixed16(2, wp->y_tile_minimum);
 		res_blocks = max(wm0_sel_res_b, trans_y_tile_min) +
 				trans_offset_b;
 	} else {
@@ -4969,12 +4892,13 @@ static void skl_compute_transition_wm(const struct intel_crtc_state *cstate,
 
 	}
 
-	res_blocks += 1;
-
-	if (res_blocks < ddb_allocation) {
-		wm->trans_wm.plane_res_b = res_blocks;
-		wm->trans_wm.plane_en = true;
-	}
+	/*
+	 * Just assume we can enable the transition watermark.  After
+	 * computing the DDB we'll come back and disable it if that
+	 * assumption turns out to be false.
+	 */
+	wm->trans_wm.plane_res_b = res_blocks + 1;
+	wm->trans_wm.plane_en = true;
 }
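
With the ddb_allocation check gone, the transition watermark is likewise computed optimistically and disabled later if the final DDB allocation cannot cover it. As a worked example (all values invented, and assuming trans_offset_b = trans_min + trans_amount as in the code elided above): with wm[0].plane_res_b = 60, trans_min = 14 and trans_amount = 10, trans_offset_b is 24; on a y-tiled surface with an 8-block y_tile_minimum, trans_y_tile_min is 16, so res_blocks = max(59, 16) + 24 = 83 and trans_wm.plane_res_b is programmed as 84.
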
 
 static int skl_build_plane_wm_single(struct intel_crtc_state *crtc_state,
@@ -4982,7 +4906,6 @@ static int skl_build_plane_wm_single(struct intel_crtc_state *crtc_state,
 				     enum plane_id plane_id, int color_plane)
 {
 	struct skl_plane_wm *wm = &crtc_state->wm.skl.optimal.planes[plane_id];
-	u16 ddb_blocks = skl_ddb_entry_size(&crtc_state->wm.skl.plane_ddb_y[plane_id]);
 	struct skl_wm_params wm_params;
 	int ret;
 
@@ -4991,12 +4914,8 @@ static int skl_build_plane_wm_single(struct intel_crtc_state *crtc_state,
 	if (ret)
 		return ret;
 
-	ret = skl_compute_wm_levels(crtc_state, plane_state,
-				    ddb_blocks, &wm_params, wm->wm);
-	if (ret)
-		return ret;
-
-	skl_compute_transition_wm(crtc_state, &wm_params, wm, ddb_blocks);
+	skl_compute_wm_levels(crtc_state, plane_state, &wm_params, wm->wm);
+	skl_compute_transition_wm(crtc_state, &wm_params, wm);
 
 	return 0;
 }
@@ -5006,7 +4925,6 @@ static int skl_build_plane_wm_uv(struct intel_crtc_state *crtc_state,
 				 enum plane_id plane_id)
 {
 	struct skl_plane_wm *wm = &crtc_state->wm.skl.optimal.planes[plane_id];
-	u16 ddb_blocks = skl_ddb_entry_size(&crtc_state->wm.skl.plane_ddb_uv[plane_id]);
 	struct skl_wm_params wm_params;
 	int ret;
 
@@ -5018,10 +4936,7 @@ static int skl_build_plane_wm_uv(struct intel_crtc_state *crtc_state,
 	if (ret)
 		return ret;
 
-	ret = skl_compute_wm_levels(crtc_state, plane_state,
-				    ddb_blocks, &wm_params, wm->uv_wm);
-	if (ret)
-		return ret;
+	skl_compute_wm_levels(crtc_state, plane_state, &wm_params, wm->uv_wm);
 
 	return 0;
 }
@@ -5139,7 +5054,7 @@ static void skl_write_wm_level(struct drm_i915_private *dev_priv,
 			       i915_reg_t reg,
 			       const struct skl_wm_level *level)
 {
-	uint32_t val = 0;
+	u32 val = 0;
 
 	if (level->plane_en) {
 		val |= PLANE_WM_EN;
@@ -5230,6 +5145,23 @@ static bool skl_plane_wm_equals(struct drm_i915_private *dev_priv,
 	return skl_wm_level_equals(&wm1->trans_wm, &wm2->trans_wm);
 }
 
+static bool skl_pipe_wm_equals(struct intel_crtc *crtc,
+			       const struct skl_pipe_wm *wm1,
+			       const struct skl_pipe_wm *wm2)
+{
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	enum plane_id plane_id;
+
+	for_each_plane_id_on_crtc(crtc, plane_id) {
+		if (!skl_plane_wm_equals(dev_priv,
+					 &wm1->planes[plane_id],
+					 &wm2->planes[plane_id]))
+			return false;
+	}
+
+	return wm1->linetime == wm2->linetime;
+}
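
skl_pipe_wm_equals() replaces the raw memcmp() that skl_update_pipe_wm() used below: it compares only the planes that actually exist on the CRTC, and a field-wise comparison also sidesteps the classic pitfall that memcmp() sees struct padding. A small standalone illustration of that pitfall (toy types, not the kernel structs):

	#include <stdio.h>
	#include <string.h>
	#include <stdint.h>

	struct level {
		uint8_t enable;		/* a padding byte typically follows */
		uint16_t blocks;
	};

	static int padded_equal(const struct level *a, const struct level *b)
	{
		/* Compares the padding too, so logically equal values
		 * can still be reported as different. */
		return memcmp(a, b, sizeof(*a)) == 0;
	}

	int main(void)
	{
		struct level a, b;

		memset(&a, 0x00, sizeof(a));
		memset(&b, 0xff, sizeof(b));
		a.enable = b.enable = 1;
		a.blocks = b.blocks = 2;

		/* Field-wise equal, but this prints 0 ("different"). */
		printf("%d\n", padded_equal(&a, &b));
		return 0;
	}
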
+
 static inline bool skl_ddb_entries_overlap(const struct skl_ddb_entry *a,
 					   const struct skl_ddb_entry *b)
 {
@@ -5251,35 +5183,32 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry *ddb,
 	return false;
 }
 
-static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
+static int skl_update_pipe_wm(struct intel_crtc_state *cstate,
 			      const struct skl_pipe_wm *old_pipe_wm,
 			      struct skl_pipe_wm *pipe_wm, /* out */
 			      bool *changed /* out */)
 {
-	struct intel_crtc_state *intel_cstate = to_intel_crtc_state(cstate);
+	struct intel_crtc *crtc = to_intel_crtc(cstate->base.crtc);
 	int ret;
 
-	ret = skl_build_pipe_wm(intel_cstate, pipe_wm);
+	ret = skl_build_pipe_wm(cstate, pipe_wm);
 	if (ret)
 		return ret;
 
-	if (!memcmp(old_pipe_wm, pipe_wm, sizeof(*pipe_wm)))
-		*changed = false;
-	else
-		*changed = true;
+	*changed = !skl_pipe_wm_equals(crtc, old_pipe_wm, pipe_wm);
 
 	return 0;
 }
 
-static uint32_t
-pipes_modified(struct drm_atomic_state *state)
+static u32
+pipes_modified(struct intel_atomic_state *state)
 {
-	struct drm_crtc *crtc;
-	struct drm_crtc_state *cstate;
-	uint32_t i, ret = 0;
+	struct intel_crtc *crtc;
+	struct intel_crtc_state *cstate;
+	u32 i, ret = 0;
 
-	for_each_new_crtc_in_state(state, crtc, cstate, i)
-		ret |= drm_crtc_mask(crtc);
+	for_each_new_intel_crtc_in_state(state, crtc, cstate, i)
+		ret |= drm_crtc_mask(&crtc->base);
 
 	return ret;
 }
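
pipes_modified() just folds every CRTC in the commit into a bitmask: for example, a state touching the CRTCs with indices 0 and 2 yields 0b101, which skl_ddb_add_affected_pipes() below then widens to ~0 whenever the allocation of untouched pipes may also move.
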
@@ -5314,11 +5243,10 @@ skl_ddb_add_affected_planes(const struct intel_crtc_state *old_crtc_state,
 }
 
 static int
-skl_compute_ddb(struct drm_atomic_state *state)
+skl_compute_ddb(struct intel_atomic_state *state)
 {
-	const struct drm_i915_private *dev_priv = to_i915(state->dev);
-	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
-	struct skl_ddb_allocation *ddb = &intel_state->wm_results.ddb;
+	const struct drm_i915_private *dev_priv = to_i915(state->base.dev);
+	struct skl_ddb_allocation *ddb = &state->wm_results.ddb;
 	struct intel_crtc_state *old_crtc_state;
 	struct intel_crtc_state *new_crtc_state;
 	struct intel_crtc *crtc;
@@ -5326,7 +5254,7 @@ skl_compute_ddb(struct drm_atomic_state *state)
 
 	memcpy(ddb, &dev_priv->wm.skl_hw.ddb, sizeof(*ddb));
 
-	for_each_oldnew_intel_crtc_in_state(intel_state, crtc, old_crtc_state,
+	for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state,
 					    new_crtc_state, i) {
 		ret = skl_allocate_pipe_ddb(new_crtc_state, ddb);
 		if (ret)
@@ -5372,15 +5300,13 @@ skl_print_wm_changes(struct intel_atomic_state *state)
 }
 
 static int
-skl_ddb_add_affected_pipes(struct drm_atomic_state *state, bool *changed)
+skl_ddb_add_affected_pipes(struct intel_atomic_state *state, bool *changed)
 {
-	struct drm_device *dev = state->dev;
+	struct drm_device *dev = state->base.dev;
 	const struct drm_i915_private *dev_priv = to_i915(dev);
-	const struct drm_crtc *crtc;
-	const struct drm_crtc_state *cstate;
-	struct intel_crtc *intel_crtc;
-	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
-	uint32_t realloc_pipes = pipes_modified(state);
+	struct intel_crtc *crtc;
+	struct intel_crtc_state *crtc_state;
+	u32 realloc_pipes = pipes_modified(state);
 	int ret, i;
 
 	/*
@@ -5398,7 +5324,7 @@ skl_ddb_add_affected_pipes(struct drm_atomic_state *state, bool *changed)
 	 * since any racing commits that want to update them would need to
 	 * hold _all_ CRTC state mutexes.
 	 */
-	for_each_new_crtc_in_state(state, crtc, cstate, i)
+	for_each_new_intel_crtc_in_state(state, crtc, crtc_state, i)
 		(*changed) = true;
 
 	if (!*changed)
@@ -5412,20 +5338,20 @@ skl_ddb_add_affected_pipes(struct drm_atomic_state *state, bool *changed)
 	 */
 	if (dev_priv->wm.distrust_bios_wm) {
 		ret = drm_modeset_lock(&dev->mode_config.connection_mutex,
-				       state->acquire_ctx);
+				       state->base.acquire_ctx);
 		if (ret)
 			return ret;
 
-		intel_state->active_pipe_changes = ~0;
+		state->active_pipe_changes = ~0;
 
 		/*
-		 * We usually only initialize intel_state->active_crtcs if we
+		 * We usually only initialize state->active_crtcs if
 		 * we're doing a modeset; make sure this field is always
 		 * initialized during the sanitization process that happens
 		 * on the first commit too.
 		 */
-		if (!intel_state->modeset)
-			intel_state->active_crtcs = dev_priv->active_crtcs;
+		if (!state->modeset)
+			state->active_crtcs = dev_priv->active_crtcs;
 	}
 
 	/*
@@ -5441,21 +5367,19 @@ skl_ddb_add_affected_pipes(struct drm_atomic_state *state, bool *changed)
 	 * any other display updates race with this transaction, so we need
 	 * to grab the lock on *all* CRTC's.
 	 */
-	if (intel_state->active_pipe_changes || intel_state->modeset) {
+	if (state->active_pipe_changes || state->modeset) {
 		realloc_pipes = ~0;
-		intel_state->wm_results.dirty_pipes = ~0;
+		state->wm_results.dirty_pipes = ~0;
 	}
 
 	/*
 	 * We're not recomputing for the pipes not included in the commit, so
 	 * make sure we start with the current state.
 	 */
-	for_each_intel_crtc_mask(dev, intel_crtc, realloc_pipes) {
-		struct intel_crtc_state *cstate;
-
-		cstate = intel_atomic_get_crtc_state(state, intel_crtc);
-		if (IS_ERR(cstate))
-			return PTR_ERR(cstate);
+	for_each_intel_crtc_mask(dev, crtc, realloc_pipes) {
+		crtc_state = intel_atomic_get_crtc_state(&state->base, crtc);
+		if (IS_ERR(crtc_state))
+			return PTR_ERR(crtc_state);
 	}
 
 	return 0;
@@ -5522,12 +5446,12 @@ static int skl_wm_add_affected_planes(struct intel_atomic_state *state,
 }
 
 static int
-skl_compute_wm(struct drm_atomic_state *state)
+skl_compute_wm(struct intel_atomic_state *state)
 {
-	struct drm_crtc *crtc;
-	struct drm_crtc_state *cstate;
-	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
-	struct skl_ddb_values *results = &intel_state->wm_results;
+	struct intel_crtc *crtc;
+	struct intel_crtc_state *cstate;
+	struct intel_crtc_state *old_crtc_state;
+	struct skl_ddb_values *results = &state->wm_results;
 	struct skl_pipe_wm *pipe_wm;
 	bool changed = false;
 	int ret, i;
@@ -5539,47 +5463,35 @@ skl_compute_wm(struct drm_atomic_state *state)
 	if (ret || !changed)
 		return ret;
 
-	ret = skl_compute_ddb(state);
-	if (ret)
-		return ret;
-
 	/*
 	 * Calculate WM's for all pipes that are part of this transaction.
-	 * Note that the DDB allocation above may have added more CRTC's that
+	 * Note that skl_ddb_add_affected_pipes may have added more CRTCs that
 	 * weren't otherwise being modified (and set bits in dirty_pipes) if
 	 * pipe allocations had to change.
-	 *
-	 * FIXME:  Now that we're doing this in the atomic check phase, we
-	 * should allow skl_update_pipe_wm() to return failure in cases where
-	 * no suitable watermark values can be found.
 	 */
-	for_each_new_crtc_in_state(state, crtc, cstate, i) {
-		struct intel_crtc_state *intel_cstate =
-			to_intel_crtc_state(cstate);
+	for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state,
+					    cstate, i) {
 		const struct skl_pipe_wm *old_pipe_wm =
-			&to_intel_crtc_state(crtc->state)->wm.skl.optimal;
+			&old_crtc_state->wm.skl.optimal;
 
-		pipe_wm = &intel_cstate->wm.skl.optimal;
+		pipe_wm = &cstate->wm.skl.optimal;
 		ret = skl_update_pipe_wm(cstate, old_pipe_wm, pipe_wm, &changed);
 		if (ret)
 			return ret;
 
-		ret = skl_wm_add_affected_planes(intel_state,
-						 to_intel_crtc(crtc));
+		ret = skl_wm_add_affected_planes(state, crtc);
 		if (ret)
 			return ret;
 
 		if (changed)
-			results->dirty_pipes |= drm_crtc_mask(crtc);
-
-		if ((results->dirty_pipes & drm_crtc_mask(crtc)) == 0)
-			/* This pipe's WM's did not change */
-			continue;
-
-		intel_cstate->update_wm_pre = true;
+			results->dirty_pipes |= drm_crtc_mask(&crtc->base);
 	}
 
-	skl_print_wm_changes(intel_state);
+	ret = skl_compute_ddb(state);
+	if (ret)
+		return ret;
+
+	skl_print_wm_changes(state);
 
 	return 0;
 }
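
Note the reordering at the tail of skl_compute_wm(): watermark levels (and with them each level's min_ddb_alloc) are now computed for every CRTC in the state before skl_compute_ddb() runs, so the DDB allocator can size each plane's slice from the watermark requirements instead of validating watermarks against an allocation chosen in advance; any level the final allocation still cannot satisfy is disabled afterwards rather than failing the whole commit.
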
@@ -5617,13 +5529,13 @@ static void skl_initial_wm(struct intel_atomic_state *state,
 	mutex_unlock(&dev_priv->wm.wm_mutex);
 }
 
-static void ilk_compute_wm_config(struct drm_device *dev,
+static void ilk_compute_wm_config(struct drm_i915_private *dev_priv,
 				  struct intel_wm_config *config)
 {
 	struct intel_crtc *crtc;
 
 	/* Compute the currently _active_ config */
-	for_each_intel_crtc(dev, crtc) {
+	for_each_intel_crtc(&dev_priv->drm, crtc) {
 		const struct intel_pipe_wm *wm = &crtc->wm.active.ilk;
 
 		if (!wm->pipe_enabled)
@@ -5637,25 +5549,24 @@ static void ilk_compute_wm_config(struct drm_device *dev,
 
 static void ilk_program_watermarks(struct drm_i915_private *dev_priv)
 {
-	struct drm_device *dev = &dev_priv->drm;
 	struct intel_pipe_wm lp_wm_1_2 = {}, lp_wm_5_6 = {}, *best_lp_wm;
 	struct ilk_wm_maximums max;
 	struct intel_wm_config config = {};
 	struct ilk_wm_values results = {};
 	enum intel_ddb_partitioning partitioning;
 
-	ilk_compute_wm_config(dev, &config);
+	ilk_compute_wm_config(dev_priv, &config);
 
-	ilk_compute_wm_maximums(dev, 1, &config, INTEL_DDB_PART_1_2, &max);
-	ilk_wm_merge(dev, &config, &max, &lp_wm_1_2);
+	ilk_compute_wm_maximums(dev_priv, 1, &config, INTEL_DDB_PART_1_2, &max);
+	ilk_wm_merge(dev_priv, &config, &max, &lp_wm_1_2);
 
 	/* 5/6 split only in single pipe config on IVB+ */
 	if (INTEL_GEN(dev_priv) >= 7 &&
 	    config.num_pipes_active == 1 && config.sprites_enabled) {
-		ilk_compute_wm_maximums(dev, 1, &config, INTEL_DDB_PART_5_6, &max);
-		ilk_wm_merge(dev, &config, &max, &lp_wm_5_6);
+		ilk_compute_wm_maximums(dev_priv, 1, &config, INTEL_DDB_PART_5_6, &max);
+		ilk_wm_merge(dev_priv, &config, &max, &lp_wm_5_6);
 
-		best_lp_wm = ilk_find_best_result(dev, &lp_wm_1_2, &lp_wm_5_6);
+		best_lp_wm = ilk_find_best_result(dev_priv, &lp_wm_1_2, &lp_wm_5_6);
 	} else {
 		best_lp_wm = &lp_wm_1_2;
 	}
@@ -5663,7 +5574,7 @@ static void ilk_program_watermarks(struct drm_i915_private *dev_priv)
 	partitioning = (best_lp_wm == &lp_wm_1_2) ?
 		       INTEL_DDB_PART_1_2 : INTEL_DDB_PART_5_6;
 
-	ilk_compute_wm_results(dev, best_lp_wm, partitioning, &results);
+	ilk_compute_wm_results(dev_priv, best_lp_wm, partitioning, &results);
 
 	ilk_write_wm_values(dev_priv, &results);
 }
@@ -5694,7 +5605,7 @@ static void ilk_optimize_watermarks(struct intel_atomic_state *state,
 	mutex_unlock(&dev_priv->wm.wm_mutex);
 }
 
-static inline void skl_wm_level_from_reg_val(uint32_t val,
+static inline void skl_wm_level_from_reg_val(u32 val,
 					     struct skl_wm_level *level)
 {
 	level->plane_en = val & PLANE_WM_EN;
@@ -5703,19 +5614,18 @@ static inline void skl_wm_level_from_reg_val(uint32_t val,
 		PLANE_WM_LINES_MASK;
 }
 
-void skl_pipe_wm_get_hw_state(struct drm_crtc *crtc,
+void skl_pipe_wm_get_hw_state(struct intel_crtc *crtc,
 			      struct skl_pipe_wm *out)
 {
-	struct drm_i915_private *dev_priv = to_i915(crtc->dev);
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	enum pipe pipe = intel_crtc->pipe;
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	enum pipe pipe = crtc->pipe;
 	int level, max_level;
 	enum plane_id plane_id;
-	uint32_t val;
+	u32 val;
 
 	max_level = ilk_wm_max_level(dev_priv);
 
-	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
+	for_each_plane_id_on_crtc(crtc, plane_id) {
 		struct skl_plane_wm *wm = &out->planes[plane_id];
 
 		for (level = 0; level <= max_level; level++) {
@@ -5735,30 +5645,27 @@ void skl_pipe_wm_get_hw_state(struct drm_crtc *crtc,
 		skl_wm_level_from_reg_val(val, &wm->trans_wm);
 	}
 
-	if (!intel_crtc->active)
+	if (!crtc->active)
 		return;
 
 	out->linetime = I915_READ(PIPE_WM_LINETIME(pipe));
 }
 
-void skl_wm_get_hw_state(struct drm_device *dev)
+void skl_wm_get_hw_state(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct skl_ddb_values *hw = &dev_priv->wm.skl_hw;
 	struct skl_ddb_allocation *ddb = &dev_priv->wm.skl_hw.ddb;
-	struct drm_crtc *crtc;
-	struct intel_crtc *intel_crtc;
+	struct intel_crtc *crtc;
 	struct intel_crtc_state *cstate;
 
 	skl_ddb_get_hw_state(dev_priv, ddb);
-	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
-		intel_crtc = to_intel_crtc(crtc);
-		cstate = to_intel_crtc_state(crtc->state);
+	for_each_intel_crtc(&dev_priv->drm, crtc) {
+		cstate = to_intel_crtc_state(crtc->base.state);
 
 		skl_pipe_wm_get_hw_state(crtc, &cstate->wm.skl.optimal);
 
-		if (intel_crtc->active)
-			hw->dirty_pipes |= drm_crtc_mask(crtc);
+		if (crtc->active)
+			hw->dirty_pipes |= drm_crtc_mask(&crtc->base);
 	}
 
 	if (dev_priv->active_crtcs) {
@@ -5767,15 +5674,14 @@ void skl_wm_get_hw_state(struct drm_device *dev)
 	}
 }
 
-static void ilk_pipe_wm_get_hw_state(struct drm_crtc *crtc)
+static void ilk_pipe_wm_get_hw_state(struct intel_crtc *crtc)
 {
-	struct drm_device *dev = crtc->dev;
+	struct drm_device *dev = crtc->base.dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct ilk_wm_values *hw = &dev_priv->wm.hw;
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	struct intel_crtc_state *cstate = to_intel_crtc_state(crtc->state);
+	struct intel_crtc_state *cstate = to_intel_crtc_state(crtc->base.state);
 	struct intel_pipe_wm *active = &cstate->wm.ilk.optimal;
-	enum pipe pipe = intel_crtc->pipe;
+	enum pipe pipe = crtc->pipe;
 	static const i915_reg_t wm0_pipe_reg[] = {
 		[PIPE_A] = WM0_PIPEA_ILK,
 		[PIPE_B] = WM0_PIPEB_ILK,
@@ -5788,7 +5694,7 @@ static void ilk_pipe_wm_get_hw_state(struct drm_crtc *crtc)
 
 	memset(active, 0, sizeof(*active));
 
-	active->pipe_enabled = intel_crtc->active;
+	active->pipe_enabled = crtc->active;
 
 	if (active->pipe_enabled) {
 		u32 tmp = hw->wm_pipe[pipe];
@@ -5816,7 +5722,7 @@ static void ilk_pipe_wm_get_hw_state(struct drm_crtc *crtc)
 			active->wm[level].enable = true;
 	}
 
-	intel_crtc->wm.active.ilk = *active;
+	crtc->wm.active.ilk = *active;
 }
 
 #define _FW_WM(value, plane) \
@@ -5827,7 +5733,7 @@ static void ilk_pipe_wm_get_hw_state(struct drm_crtc *crtc)
 static void g4x_read_wm_values(struct drm_i915_private *dev_priv,
 			       struct g4x_wm_values *wm)
 {
-	uint32_t tmp;
+	u32 tmp;
 
 	tmp = I915_READ(DSPFW1);
 	wm->sr.plane = _FW_WM(tmp, SR);
@@ -5854,7 +5760,7 @@ static void vlv_read_wm_values(struct drm_i915_private *dev_priv,
 			       struct vlv_wm_values *wm)
 {
 	enum pipe pipe;
-	uint32_t tmp;
+	u32 tmp;
 
 	for_each_pipe(dev_priv, pipe) {
 		tmp = I915_READ(VLV_DDL(pipe));
@@ -5926,9 +5832,8 @@ static void vlv_read_wm_values(struct drm_i915_private *dev_priv,
 #undef _FW_WM
 #undef _FW_WM_VLV
 
-void g4x_wm_get_hw_state(struct drm_device *dev)
+void g4x_wm_get_hw_state(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct g4x_wm_values *wm = &dev_priv->wm.g4x;
 	struct intel_crtc *crtc;
 
@@ -5936,7 +5841,7 @@ void g4x_wm_get_hw_state(struct drm_device *dev)
 
 	wm->cxsr = I915_READ(FW_BLC_SELF) & FW_BLC_SELF_EN;
 
-	for_each_intel_crtc(dev, crtc) {
+	for_each_intel_crtc(&dev_priv->drm, crtc) {
 		struct intel_crtc_state *crtc_state =
 			to_intel_crtc_state(crtc->base.state);
 		struct g4x_wm_state *active = &crtc->wm.active.g4x;
@@ -6067,9 +5972,8 @@ void g4x_wm_sanitize(struct drm_i915_private *dev_priv)
 	mutex_unlock(&dev_priv->wm.wm_mutex);
 }
 
-void vlv_wm_get_hw_state(struct drm_device *dev)
+void vlv_wm_get_hw_state(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct vlv_wm_values *wm = &dev_priv->wm.vlv;
 	struct intel_crtc *crtc;
 	u32 val;
@@ -6113,7 +6017,7 @@ void vlv_wm_get_hw_state(struct drm_device *dev)
 		mutex_unlock(&dev_priv->pcu_lock);
 	}
 
-	for_each_intel_crtc(dev, crtc) {
+	for_each_intel_crtc(&dev_priv->drm, crtc) {
 		struct intel_crtc_state *crtc_state =
 			to_intel_crtc_state(crtc->base.state);
 		struct vlv_wm_state *active = &crtc->wm.active.vlv;
@@ -6230,15 +6134,14 @@ static void ilk_init_lp_watermarks(struct drm_i915_private *dev_priv)
 	 */
 }
 
-void ilk_wm_get_hw_state(struct drm_device *dev)
+void ilk_wm_get_hw_state(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct ilk_wm_values *hw = &dev_priv->wm.hw;
-	struct drm_crtc *crtc;
+	struct intel_crtc *crtc;
 
 	ilk_init_lp_watermarks(dev_priv);
 
-	for_each_crtc(dev, crtc)
+	for_each_intel_crtc(&dev_priv->drm, crtc)
 		ilk_pipe_wm_get_hw_state(crtc);
 
 	hw->wm_lp[0] = I915_READ(WM1_LP_ILK);
@@ -6339,10 +6242,6 @@ void intel_init_ipc(struct drm_i915_private *dev_priv)
  */
 DEFINE_SPINLOCK(mchdev_lock);
 
-/* Global for IPS driver to get at the current i915 device. Protected by
- * mchdev_lock. */
-static struct drm_i915_private *i915_mch_dev;
-
 bool ironlake_set_drps(struct drm_i915_private *dev_priv, u8 val)
 {
 	u16 rgvswctl;
@@ -6805,7 +6704,7 @@ void gen6_rps_boost(struct i915_request *rq,
 	if (!rps->enabled)
 		return;
 
-	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags))
+	if (i915_request_signaled(rq))
 		return;
 
 	/* Serializes with i915_request_retire() */
@@ -7049,7 +6948,7 @@ static void gen9_enable_rps(struct drm_i915_private *dev_priv)
 	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
 
 	/* Program defaults and thresholds for RPS */
-	if (IS_GEN9(dev_priv))
+	if (IS_GEN(dev_priv, 9))
 		I915_WRITE(GEN6_RC_VIDEO_FREQ,
 			GEN9_FREQUENCY(dev_priv->gt_pm.rps.rp1_freq));
 
@@ -7285,9 +7184,9 @@ static void gen6_enable_rc6(struct drm_i915_private *dev_priv)
 
 	rc6vids = 0;
 	ret = sandybridge_pcode_read(dev_priv, GEN6_PCODE_READ_RC6VIDS, &rc6vids);
-	if (IS_GEN6(dev_priv) && ret) {
+	if (IS_GEN(dev_priv, 6) && ret) {
 		DRM_DEBUG_DRIVER("Couldn't check for BIOS workaround\n");
-	} else if (IS_GEN6(dev_priv) && (GEN6_DECODE_RC6_VID(rc6vids & 0xff) < 450)) {
+	} else if (IS_GEN(dev_priv, 6) && (GEN6_DECODE_RC6_VID(rc6vids & 0xff) < 450)) {
 		DRM_DEBUG_DRIVER("You should update your BIOS. Correcting minimum rc6 voltage (%dmV->%dmV)\n",
 			  GEN6_DECODE_RC6_VID(rc6vids & 0xff), 450);
 		rc6vids &= 0xffff00;
@@ -7412,7 +7311,7 @@ static int cherryview_rps_max_freq(struct drm_i915_private *dev_priv)
 
 	val = vlv_punit_read(dev_priv, FB_GFX_FMAX_AT_VMAX_FUSE);
 
-	switch (INTEL_INFO(dev_priv)->sseu.eu_total) {
+	switch (RUNTIME_INFO(dev_priv)->sseu.eu_total) {
 	case 8:
 		/* (2 * 4) config */
 		rp0 = (val >> FB_GFX_FMAX_AT_VMAX_2SS4EU_FUSE_SHIFT);
@@ -7985,16 +7884,17 @@ static unsigned long __i915_chipset_val(struct drm_i915_private *dev_priv)
 
 unsigned long i915_chipset_val(struct drm_i915_private *dev_priv)
 {
-	unsigned long val;
+	intel_wakeref_t wakeref;
+	unsigned long val = 0;
 
-	if (!IS_GEN5(dev_priv))
+	if (!IS_GEN(dev_priv, 5))
 		return 0;
 
-	spin_lock_irq(&mchdev_lock);
-
-	val = __i915_chipset_val(dev_priv);
-
-	spin_unlock_irq(&mchdev_lock);
+	with_intel_runtime_pm(dev_priv, wakeref) {
+		spin_lock_irq(&mchdev_lock);
+		val = __i915_chipset_val(dev_priv);
+		spin_unlock_irq(&mchdev_lock);
+	}
 
 	return val;
 }
@@ -8071,14 +7971,16 @@ static void __i915_update_gfx_val(struct drm_i915_private *dev_priv)
 
 void i915_update_gfx_val(struct drm_i915_private *dev_priv)
 {
-	if (!IS_GEN5(dev_priv))
-		return;
+	intel_wakeref_t wakeref;
 
-	spin_lock_irq(&mchdev_lock);
-
-	__i915_update_gfx_val(dev_priv);
+	if (!IS_GEN(dev_priv, 5))
+		return;
 
-	spin_unlock_irq(&mchdev_lock);
+	with_intel_runtime_pm(dev_priv, wakeref) {
+		spin_lock_irq(&mchdev_lock);
+		__i915_update_gfx_val(dev_priv);
+		spin_unlock_irq(&mchdev_lock);
+	}
 }
 
 static unsigned long __i915_gfx_val(struct drm_i915_private *dev_priv)
@@ -8120,18 +8022,34 @@ static unsigned long __i915_gfx_val(struct drm_i915_private *dev_priv)
 
 unsigned long i915_gfx_val(struct drm_i915_private *dev_priv)
 {
-	unsigned long val;
+	intel_wakeref_t wakeref;
+	unsigned long val = 0;
 
-	if (!IS_GEN5(dev_priv))
+	if (!IS_GEN(dev_priv, 5))
 		return 0;
 
-	spin_lock_irq(&mchdev_lock);
+	with_intel_runtime_pm(dev_priv, wakeref) {
+		spin_lock_irq(&mchdev_lock);
+		val = __i915_gfx_val(dev_priv);
+		spin_unlock_irq(&mchdev_lock);
+	}
 
-	val = __i915_gfx_val(dev_priv);
+	return val;
+}
 
-	spin_unlock_irq(&mchdev_lock);
+static struct drm_i915_private __rcu *i915_mch_dev;
 
-	return val;
+static struct drm_i915_private *mchdev_get(void)
+{
+	struct drm_i915_private *i915;
+
+	rcu_read_lock();
+	i915 = rcu_dereference(i915_mch_dev);
+	if (i915 && !kref_get_unless_zero(&i915->drm.ref))
+		i915 = NULL;
+	rcu_read_unlock();
+
+	return i915;
 }
 
 /**
@@ -8142,23 +8060,24 @@ unsigned long i915_gfx_val(struct drm_i915_private *dev_priv)
  */
 unsigned long i915_read_mch_val(void)
 {
-	struct drm_i915_private *dev_priv;
-	unsigned long chipset_val, graphics_val, ret = 0;
-
-	spin_lock_irq(&mchdev_lock);
-	if (!i915_mch_dev)
-		goto out_unlock;
-	dev_priv = i915_mch_dev;
-
-	chipset_val = __i915_chipset_val(dev_priv);
-	graphics_val = __i915_gfx_val(dev_priv);
+	struct drm_i915_private *i915;
+	unsigned long chipset_val = 0;
+	unsigned long graphics_val = 0;
+	intel_wakeref_t wakeref;
 
-	ret = chipset_val + graphics_val;
+	i915 = mchdev_get();
+	if (!i915)
+		return 0;
 
-out_unlock:
-	spin_unlock_irq(&mchdev_lock);
+	with_intel_runtime_pm(i915, wakeref) {
+		spin_lock_irq(&mchdev_lock);
+		chipset_val = __i915_chipset_val(i915);
+		graphics_val = __i915_gfx_val(i915);
+		spin_unlock_irq(&mchdev_lock);
+	}
 
-	return ret;
+	drm_dev_put(&i915->drm);
+	return chipset_val + graphics_val;
 }
 EXPORT_SYMBOL_GPL(i915_read_mch_val);
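
The with_intel_runtime_pm() blocks introduced above replace bare locked sections because these exported IPS hooks can be called while the device is runtime-suspended. The macro is roughly a scoped for-loop (a sketch, simplified from the i915 headers of this era):

	for (wakeref = intel_runtime_pm_get(i915);
	     wakeref;
	     intel_runtime_pm_put(i915, wakeref), wakeref = 0)
		/* body runs once with the device awake */;

which is why the bodies pre-initialize their result to 0 and fall out of the block normally: a return or break inside the body would skip the put and leak the tracked wakeref.
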
 
@@ -8169,23 +8088,19 @@ EXPORT_SYMBOL_GPL(i915_read_mch_val);
  */
 bool i915_gpu_raise(void)
 {
-	struct drm_i915_private *dev_priv;
-	bool ret = true;
+	struct drm_i915_private *i915;
 
-	spin_lock_irq(&mchdev_lock);
-	if (!i915_mch_dev) {
-		ret = false;
-		goto out_unlock;
-	}
-	dev_priv = i915_mch_dev;
-
-	if (dev_priv->ips.max_delay > dev_priv->ips.fmax)
-		dev_priv->ips.max_delay--;
+	i915 = mchdev_get();
+	if (!i915)
+		return false;
 
-out_unlock:
+	spin_lock_irq(&mchdev_lock);
+	if (i915->ips.max_delay > i915->ips.fmax)
+		i915->ips.max_delay--;
 	spin_unlock_irq(&mchdev_lock);
 
-	return ret;
+	drm_dev_put(&i915->drm);
+	return true;
 }
 EXPORT_SYMBOL_GPL(i915_gpu_raise);
 
@@ -8197,23 +8112,19 @@ EXPORT_SYMBOL_GPL(i915_gpu_raise);
  */
 bool i915_gpu_lower(void)
 {
-	struct drm_i915_private *dev_priv;
-	bool ret = true;
+	struct drm_i915_private *i915;
 
-	spin_lock_irq(&mchdev_lock);
-	if (!i915_mch_dev) {
-		ret = false;
-		goto out_unlock;
-	}
-	dev_priv = i915_mch_dev;
-
-	if (dev_priv->ips.max_delay < dev_priv->ips.min_delay)
-		dev_priv->ips.max_delay++;
+	i915 = mchdev_get();
+	if (!i915)
+		return false;
 
-out_unlock:
+	spin_lock_irq(&mchdev_lock);
+	if (i915->ips.max_delay < i915->ips.min_delay)
+		i915->ips.max_delay++;
 	spin_unlock_irq(&mchdev_lock);
 
-	return ret;
+	drm_dev_put(&i915->drm);
+	return true;
 }
 EXPORT_SYMBOL_GPL(i915_gpu_lower);
 
@@ -8224,13 +8135,16 @@ EXPORT_SYMBOL_GPL(i915_gpu_lower);
  */
 bool i915_gpu_busy(void)
 {
-	bool ret = false;
+	struct drm_i915_private *i915;
+	bool ret;
 
-	spin_lock_irq(&mchdev_lock);
-	if (i915_mch_dev)
-		ret = i915_mch_dev->gt.awake;
-	spin_unlock_irq(&mchdev_lock);
+	i915 = mchdev_get();
+	if (!i915)
+		return false;
 
+	ret = i915->gt.awake;
+
+	drm_dev_put(&i915->drm);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(i915_gpu_busy);
@@ -8243,24 +8157,19 @@ EXPORT_SYMBOL_GPL(i915_gpu_busy);
  */
 bool i915_gpu_turbo_disable(void)
 {
-	struct drm_i915_private *dev_priv;
-	bool ret = true;
-
-	spin_lock_irq(&mchdev_lock);
-	if (!i915_mch_dev) {
-		ret = false;
-		goto out_unlock;
-	}
-	dev_priv = i915_mch_dev;
-
-	dev_priv->ips.max_delay = dev_priv->ips.fstart;
+	struct drm_i915_private *i915;
+	bool ret;
 
-	if (!ironlake_set_drps(dev_priv, dev_priv->ips.fstart))
-		ret = false;
+	i915 = mchdev_get();
+	if (!i915)
+		return false;
 
-out_unlock:
+	spin_lock_irq(&mchdev_lock);
+	i915->ips.max_delay = i915->ips.fstart;
+	ret = ironlake_set_drps(i915, i915->ips.fstart);
 	spin_unlock_irq(&mchdev_lock);
 
+	drm_dev_put(&i915->drm);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(i915_gpu_turbo_disable);
@@ -8289,18 +8198,14 @@ void intel_gpu_ips_init(struct drm_i915_private *dev_priv)
 {
 	/* We only register the i915 ips part with intel-ips once everything is
 	 * set up, to avoid intel-ips sneaking in and reading bogus values. */
-	spin_lock_irq(&mchdev_lock);
-	i915_mch_dev = dev_priv;
-	spin_unlock_irq(&mchdev_lock);
+	rcu_assign_pointer(i915_mch_dev, dev_priv);
 
 	ips_ping_for_i915_load();
 }
 
 void intel_gpu_ips_teardown(void)
 {
-	spin_lock_irq(&mchdev_lock);
-	i915_mch_dev = NULL;
-	spin_unlock_irq(&mchdev_lock);
+	rcu_assign_pointer(i915_mch_dev, NULL);
 }
 
 static void intel_init_emon(struct drm_i915_private *dev_priv)
@@ -8410,7 +8315,7 @@ void intel_init_gt_powersave(struct drm_i915_private *dev_priv)
 			      intel_freq_opcode(dev_priv, 450));
 
 	/* After setting max-softlimit, find the overclock max freq */
-	if (IS_GEN6(dev_priv) ||
+	if (IS_GEN(dev_priv, 6) ||
 	    IS_IVYBRIDGE(dev_priv) || IS_HASWELL(dev_priv)) {
 		u32 params = 0;
 
@@ -8639,7 +8544,7 @@ static void g4x_disable_trickle_feed(struct drm_i915_private *dev_priv)
 
 static void ilk_init_clock_gating(struct drm_i915_private *dev_priv)
 {
-	uint32_t dspclk_gate = ILK_VRHUNIT_CLOCK_GATE_DISABLE;
+	u32 dspclk_gate = ILK_VRHUNIT_CLOCK_GATE_DISABLE;
 
 	/*
 	 * Required for FBC
@@ -8711,7 +8616,7 @@ static void ilk_init_clock_gating(struct drm_i915_private *dev_priv)
 static void cpt_init_clock_gating(struct drm_i915_private *dev_priv)
 {
 	int pipe;
-	uint32_t val;
+	u32 val;
 
 	/*
 	 * On Ibex Peak and Cougar Point, we need to disable clock
@@ -8746,7 +8651,7 @@ static void cpt_init_clock_gating(struct drm_i915_private *dev_priv)
 
 static void gen6_check_mch_setup(struct drm_i915_private *dev_priv)
 {
-	uint32_t tmp;
+	u32 tmp;
 
 	tmp = I915_READ(MCH_SSKPD);
 	if ((tmp & MCH_SSKPD_WM0_MASK) != MCH_SSKPD_WM0_VAL)
@@ -8756,7 +8661,7 @@ static void gen6_check_mch_setup(struct drm_i915_private *dev_priv)
 
 static void gen6_init_clock_gating(struct drm_i915_private *dev_priv)
 {
-	uint32_t dspclk_gate = ILK_VRHUNIT_CLOCK_GATE_DISABLE;
+	u32 dspclk_gate = ILK_VRHUNIT_CLOCK_GATE_DISABLE;
 
 	I915_WRITE(ILK_DSPCLK_GATE_D, dspclk_gate);
 
@@ -8850,7 +8755,7 @@ static void gen6_init_clock_gating(struct drm_i915_private *dev_priv)
 
 static void gen7_setup_fixed_func_scheduler(struct drm_i915_private *dev_priv)
 {
-	uint32_t reg = I915_READ(GEN7_FF_THREAD_MODE);
+	u32 reg = I915_READ(GEN7_FF_THREAD_MODE);
 
 	/*
 	 * WaVSThreadDispatchOverride:ivb,vlv
@@ -8886,7 +8791,7 @@ static void lpt_init_clock_gating(struct drm_i915_private *dev_priv)
 static void lpt_suspend_hw(struct drm_i915_private *dev_priv)
 {
 	if (HAS_PCH_LPT_LP(dev_priv)) {
-		uint32_t val = I915_READ(SOUTH_DSPCLK_GATE_D);
+		u32 val = I915_READ(SOUTH_DSPCLK_GATE_D);
 
 		val &= ~PCH_LP_PARTITION_LEVEL_DISABLE;
 		I915_WRITE(SOUTH_DSPCLK_GATE_D, val);
@@ -9124,7 +9029,7 @@ static void hsw_init_clock_gating(struct drm_i915_private *dev_priv)
 
 static void ivb_init_clock_gating(struct drm_i915_private *dev_priv)
 {
-	uint32_t snpcr;
+	u32 snpcr;
 
 	I915_WRITE(ILK_DSPCLK_GATE_D, ILK_VRHUNIT_CLOCK_GATE_DISABLE);
 
@@ -9333,7 +9238,7 @@ static void chv_init_clock_gating(struct drm_i915_private *dev_priv)
 
 static void g4x_init_clock_gating(struct drm_i915_private *dev_priv)
 {
-	uint32_t dspclk_gate;
+	u32 dspclk_gate;
 
 	I915_WRITE(RENCLK_GATE_D1, 0);
 	I915_WRITE(RENCLK_GATE_D2, VF_UNIT_CLOCK_GATE_DISABLE |
@@ -9480,9 +9385,9 @@ void intel_init_clock_gating_hooks(struct drm_i915_private *dev_priv)
 		dev_priv->display.init_clock_gating = ivb_init_clock_gating;
 	else if (IS_VALLEYVIEW(dev_priv))
 		dev_priv->display.init_clock_gating = vlv_init_clock_gating;
-	else if (IS_GEN6(dev_priv))
+	else if (IS_GEN(dev_priv, 6))
 		dev_priv->display.init_clock_gating = gen6_init_clock_gating;
-	else if (IS_GEN5(dev_priv))
+	else if (IS_GEN(dev_priv, 5))
 		dev_priv->display.init_clock_gating = ilk_init_clock_gating;
 	else if (IS_G4X(dev_priv))
 		dev_priv->display.init_clock_gating = g4x_init_clock_gating;
@@ -9490,11 +9395,11 @@ void intel_init_clock_gating_hooks(struct drm_i915_private *dev_priv)
 		dev_priv->display.init_clock_gating = i965gm_init_clock_gating;
 	else if (IS_I965G(dev_priv))
 		dev_priv->display.init_clock_gating = i965g_init_clock_gating;
-	else if (IS_GEN3(dev_priv))
+	else if (IS_GEN(dev_priv, 3))
 		dev_priv->display.init_clock_gating = gen3_init_clock_gating;
 	else if (IS_I85X(dev_priv) || IS_I865G(dev_priv))
 		dev_priv->display.init_clock_gating = i85x_init_clock_gating;
-	else if (IS_GEN2(dev_priv))
+	else if (IS_GEN(dev_priv, 2))
 		dev_priv->display.init_clock_gating = i830_init_clock_gating;
 	else {
 		MISSING_CASE(INTEL_DEVID(dev_priv));
@@ -9508,7 +9413,7 @@ void intel_init_pm(struct drm_i915_private *dev_priv)
 	/* For cxsr */
 	if (IS_PINEVIEW(dev_priv))
 		i915_pineview_get_mem_freq(dev_priv);
-	else if (IS_GEN5(dev_priv))
+	else if (IS_GEN(dev_priv, 5))
 		i915_ironlake_get_mem_freq(dev_priv);
 
 	/* For FIFO watermark updates */
@@ -9520,9 +9425,9 @@ void intel_init_pm(struct drm_i915_private *dev_priv)
 	} else if (HAS_PCH_SPLIT(dev_priv)) {
 		ilk_setup_wm_latency(dev_priv);
 
-		if ((IS_GEN5(dev_priv) && dev_priv->wm.pri_latency[1] &&
+		if ((IS_GEN(dev_priv, 5) && dev_priv->wm.pri_latency[1] &&
 		     dev_priv->wm.spr_latency[1] && dev_priv->wm.cur_latency[1]) ||
-		    (!IS_GEN5(dev_priv) && dev_priv->wm.pri_latency[0] &&
+		    (!IS_GEN(dev_priv, 5) && dev_priv->wm.pri_latency[0] &&
 		     dev_priv->wm.spr_latency[0] && dev_priv->wm.cur_latency[0])) {
 			dev_priv->display.compute_pipe_wm = ilk_compute_pipe_wm;
 			dev_priv->display.compute_intermediate_wm =
@@ -9563,12 +9468,12 @@ void intel_init_pm(struct drm_i915_private *dev_priv)
 			dev_priv->display.update_wm = NULL;
 		} else
 			dev_priv->display.update_wm = pineview_update_wm;
-	} else if (IS_GEN4(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 4)) {
 		dev_priv->display.update_wm = i965_update_wm;
-	} else if (IS_GEN3(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 3)) {
 		dev_priv->display.update_wm = i9xx_update_wm;
 		dev_priv->display.get_fifo_size = i9xx_get_fifo_size;
-	} else if (IS_GEN2(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 2)) {
 		if (INTEL_INFO(dev_priv)->num_pipes == 1) {
 			dev_priv->display.update_wm = i845_update_wm;
 			dev_priv->display.get_fifo_size = i845_get_fifo_size;
@@ -9583,7 +9488,7 @@ void intel_init_pm(struct drm_i915_private *dev_priv)
 
 static inline int gen6_check_mailbox_status(struct drm_i915_private *dev_priv)
 {
-	uint32_t flags =
+	u32 flags =
 		I915_READ_FW(GEN6_PCODE_MAILBOX) & GEN6_PCODE_ERROR_MASK;
 
 	switch (flags) {
@@ -9606,7 +9511,7 @@ static inline int gen6_check_mailbox_status(struct drm_i915_private *dev_priv)
 
 static inline int gen7_check_mailbox_status(struct drm_i915_private *dev_priv)
 {
-	uint32_t flags =
+	u32 flags =
 		I915_READ_FW(GEN6_PCODE_MAILBOX) & GEN6_PCODE_ERROR_MASK;
 
 	switch (flags) {
diff --git a/drivers/gpu/drm/i915/intel_psr.c b/drivers/gpu/drm/i915/intel_psr.c
index f71970df9936..84a0fb981561 100644
--- a/drivers/gpu/drm/i915/intel_psr.c
+++ b/drivers/gpu/drm/i915/intel_psr.c
@@ -51,7 +51,6 @@
  * must be correctly synchronized/cancelled when shutting down the pipe."
  */
 
-#include <drm/drmP.h>
 
 #include "intel_drv.h"
 #include "i915_drv.h"
@@ -71,17 +70,17 @@ static bool psr_global_enabled(u32 debug)
 static bool intel_psr2_enabled(struct drm_i915_private *dev_priv,
 			       const struct intel_crtc_state *crtc_state)
 {
-	/* Disable PSR2 by default for all platforms */
-	if (i915_modparams.enable_psr == -1)
-		return false;
-
 	/* Cannot enable DSC and PSR2 simultaneously */
 	WARN_ON(crtc_state->dsc_params.compression_enable &&
 		crtc_state->has_psr2);
 
 	switch (dev_priv->psr.debug & I915_PSR_DEBUG_MODE_MASK) {
+	case I915_PSR_DEBUG_DISABLE:
 	case I915_PSR_DEBUG_FORCE_PSR1:
 		return false;
+	case I915_PSR_DEBUG_DEFAULT:
+		if (i915_modparams.enable_psr <= 0)
+			return false;
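+		/* fall through */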
 	default:
 		return crtc_state->has_psr2;
 	}
@@ -231,7 +230,7 @@ void intel_psr_irq_handler(struct drm_i915_private *dev_priv, u32 psr_iir)
 
 static bool intel_dp_get_colorimetry_status(struct intel_dp *intel_dp)
 {
-	uint8_t dprx = 0;
+	u8 dprx = 0;
 
 	if (drm_dp_dpcd_readb(&intel_dp->aux, DP_DPRX_FEATURE_ENUMERATION_LIST,
 			      &dprx) != 1)
@@ -241,7 +240,7 @@ static bool intel_dp_get_colorimetry_status(struct intel_dp *intel_dp)
 
 static bool intel_dp_get_alpm_status(struct intel_dp *intel_dp)
 {
-	uint8_t alpm_caps = 0;
+	u8 alpm_caps = 0;
 
 	if (drm_dp_dpcd_readb(&intel_dp->aux, DP_RECEIVER_ALPM_CAP,
 			      &alpm_caps) != 1)
@@ -261,6 +260,32 @@ static u8 intel_dp_get_sink_sync_latency(struct intel_dp *intel_dp)
 	return val;
 }
 
+static u16 intel_dp_get_su_x_granularity(struct intel_dp *intel_dp)
+{
+	u16 val;
+	ssize_t r;
+
+	/*
+	 * Return the default X granularity if granularity is not required or
+	 * if the DPCD read fails.
+	 */
+	if (!(intel_dp->psr_dpcd[1] & DP_PSR2_SU_GRANULARITY_REQUIRED))
+		return 4;
+
+	r = drm_dp_dpcd_read(&intel_dp->aux, DP_PSR2_SU_X_GRANULARITY, &val, 2);
+	if (r != 2)
+		DRM_DEBUG_KMS("Unable to read DP_PSR2_SU_X_GRANULARITY\n");
+
+	/*
+	 * Spec says that if the value read is 0 the default granularity should
+	 * be used instead.
+	 */
+	if (r != 2 || val == 0)
+		val = 4;
+
+	return val;
+}
+
 void intel_psr_init_dpcd(struct intel_dp *intel_dp)
 {
 	struct drm_i915_private *dev_priv =
@@ -315,6 +340,8 @@ void intel_psr_init_dpcd(struct intel_dp *intel_dp)
 		if (dev_priv->psr.sink_psr2_support) {
 			dev_priv->psr.colorimetry_support =
 				intel_dp_get_colorimetry_status(intel_dp);
+			dev_priv->psr.su_x_granularity =
+				intel_dp_get_su_x_granularity(intel_dp);
 		}
 	}
 }
@@ -357,7 +384,7 @@ static void hsw_psr_setup_aux(struct intel_dp *intel_dp)
 	struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
 	u32 aux_clock_divider, aux_ctl;
 	int i;
-	static const uint8_t aux_msg[] = {
+	static const u8 aux_msg[] = {
 		[0] = DP_AUX_NATIVE_WRITE << 4,
 		[1] = DP_SET_POWER >> 8,
 		[2] = DP_SET_POWER & 0xff,
@@ -394,13 +421,15 @@ static void intel_psr_enable_sink(struct intel_dp *intel_dp)
 	if (dev_priv->psr.psr2_enabled) {
 		drm_dp_dpcd_writeb(&intel_dp->aux, DP_RECEIVER_ALPM_CONFIG,
 				   DP_ALPM_ENABLE);
-		dpcd_val |= DP_PSR_ENABLE_PSR2;
+		dpcd_val |= DP_PSR_ENABLE_PSR2 | DP_PSR_IRQ_HPD_WITH_CRC_ERRORS;
+	} else {
+		if (dev_priv->psr.link_standby)
+			dpcd_val |= DP_PSR_MAIN_LINK_ACTIVE;
+
+		if (INTEL_GEN(dev_priv) >= 8)
+			dpcd_val |= DP_PSR_CRC_VERIFICATION;
 	}
 
-	if (dev_priv->psr.link_standby)
-		dpcd_val |= DP_PSR_MAIN_LINK_ACTIVE;
-	if (!dev_priv->psr.psr2_enabled && INTEL_GEN(dev_priv) >= 8)
-		dpcd_val |= DP_PSR_CRC_VERIFICATION;
 	drm_dp_dpcd_writeb(&intel_dp->aux, DP_PSR_EN_CFG, dpcd_val);
 
 	drm_dp_dpcd_writeb(&intel_dp->aux, DP_SET_POWER, DP_SET_POWER_D0);
@@ -474,9 +503,6 @@ static void hsw_activate_psr2(struct intel_dp *intel_dp)
 	idle_frames = max(idle_frames, dev_priv->psr.sink_sync_latency + 1);
 	val = idle_frames << EDP_PSR2_IDLE_FRAME_SHIFT;
 
-	/* FIXME: selective update is probably totally broken because it doesn't
-	 * mesh at all with our frontbuffer tracking. And the hw alone isn't
-	 * good enough. */
 	val |= EDP_PSR2_ENABLE | EDP_SU_TRACK_ENABLE;
 	if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv))
 		val |= EDP_Y_COORDINATE_ENABLE;
@@ -525,7 +551,7 @@ static bool intel_psr2_config_valid(struct intel_dp *intel_dp,
 	if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv)) {
 		psr_max_h = 4096;
 		psr_max_v = 2304;
-	} else if (IS_GEN9(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 9)) {
 		psr_max_h = 3640;
 		psr_max_v = 2304;
 	}
@@ -537,6 +563,18 @@ static bool intel_psr2_config_valid(struct intel_dp *intel_dp,
 		return false;
 	}
 
+	/*
+	 * HW sends SU blocks of size four scan lines, which means the starting
+	 * X coordinate and Y granularity requirements will always be met. We
+	 * only need to validate that the SU block width is a multiple of
+	 * the X granularity.
+	 */
+	if (crtc_hdisplay % dev_priv->psr.su_x_granularity) {
+		DRM_DEBUG_KMS("PSR2 not enabled, hdisplay(%d) not multiple of %d\n",
+			      crtc_hdisplay, dev_priv->psr.su_x_granularity);
+		return false;
+	}
+
 	return true;
 }
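
As a worked example of the new granularity check (values invented): a sink that sets DP_PSR2_SU_GRANULARITY_REQUIRED and reports an X granularity of 6 rejects a 1366-wide mode (1366 % 6 == 4) but accepts 1368, while with the default granularity of 4 every common width, including the 3640-pixel gen9 maximum set just above, passes.
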
 
@@ -647,17 +685,14 @@ static void intel_psr_enable_source(struct intel_dp *intel_dp,
 	if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
 		hsw_psr_setup_aux(intel_dp);
 
-	if (dev_priv->psr.psr2_enabled) {
+	if (dev_priv->psr.psr2_enabled && (IS_GEN(dev_priv, 9) &&
+					   !IS_GEMINILAKE(dev_priv))) {
 		i915_reg_t reg = gen9_chicken_trans_reg(dev_priv,
 							cpu_transcoder);
 		u32 chicken = I915_READ(reg);
 
-		if (IS_GEN9(dev_priv) && !IS_GEMINILAKE(dev_priv))
-			chicken |= (PSR2_VSC_ENABLE_PROG_HEADER
-				   | PSR2_ADD_VERTICAL_LINE_COUNT);
-
-		else
-			chicken &= ~VSC_DATA_SEL_SOFTWARE_CONTROL;
+		chicken |= PSR2_VSC_ENABLE_PROG_HEADER |
+			   PSR2_ADD_VERTICAL_LINE_COUNT;
 		I915_WRITE(reg, chicken);
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index fbeaec3994e7..7f841dba87b3 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -29,11 +29,11 @@
 
 #include <linux/log2.h>
 
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
 
 #include "i915_drv.h"
 #include "i915_gem_render_state.h"
+#include "i915_reset.h"
 #include "i915_trace.h"
 #include "intel_drv.h"
 #include "intel_workarounds.h"
@@ -43,17 +43,10 @@
  */
 #define LEGACY_REQUEST_SIZE 200
 
-static unsigned int __intel_ring_space(unsigned int head,
-				       unsigned int tail,
-				       unsigned int size)
+static inline u32 intel_hws_seqno_address(struct intel_engine_cs *engine)
 {
-	/*
-	 * "If the Ring Buffer Head Pointer and the Tail Pointer are on the
-	 * same cacheline, the Head Pointer must not be greater than the Tail
-	 * Pointer."
-	 */
-	GEM_BUG_ON(!is_power_of_2(size));
-	return (head - tail - CACHELINE_BYTES) & (size - 1);
+	return (i915_ggtt_offset(engine->status_page.vma) +
+		I915_GEM_HWS_INDEX_ADDR);
 }
 
 unsigned int intel_ring_update_space(struct intel_ring *ring)
@@ -133,7 +126,7 @@ gen4_render_ring_flush(struct i915_request *rq, u32 mode)
 	cmd = MI_FLUSH;
 	if (mode & EMIT_INVALIDATE) {
 		cmd |= MI_EXE_FLUSH;
-		if (IS_G4X(rq->i915) || IS_GEN5(rq->i915))
+		if (IS_G4X(rq->i915) || IS_GEN(rq->i915, 5))
 			cmd |= MI_INVALIDATE_ISP;
 	}
 
@@ -217,7 +210,7 @@ gen4_render_ring_flush(struct i915_request *rq, u32 mode)
  * really our business.  That leaves only stall at scoreboard.
  */
 static int
-intel_emit_post_sync_nonzero_flush(struct i915_request *rq)
+gen6_emit_post_sync_nonzero_flush(struct i915_request *rq)
 {
 	u32 scratch_addr = i915_scratch_offset(rq->i915) + 2 * CACHELINE_BYTES;
 	u32 *cs;
@@ -257,7 +250,7 @@ gen6_render_ring_flush(struct i915_request *rq, u32 mode)
 	int ret;
 
 	/* Force SNB workarounds for PIPE_CONTROL flushes */
-	ret = intel_emit_post_sync_nonzero_flush(rq);
+	ret = gen6_emit_post_sync_nonzero_flush(rq);
 	if (ret)
 		return ret;
 
@@ -300,6 +293,43 @@ gen6_render_ring_flush(struct i915_request *rq, u32 mode)
 	return 0;
 }
 
+static u32 *gen6_rcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
+{
+	/* First we do the gen6_emit_post_sync_nonzero_flush w/a */
+	*cs++ = GFX_OP_PIPE_CONTROL(4);
+	*cs++ = PIPE_CONTROL_CS_STALL | PIPE_CONTROL_STALL_AT_SCOREBOARD;
+	*cs++ = 0;
+	*cs++ = 0;
+
+	*cs++ = GFX_OP_PIPE_CONTROL(4);
+	*cs++ = PIPE_CONTROL_QW_WRITE;
+	*cs++ = i915_scratch_offset(rq->i915) | PIPE_CONTROL_GLOBAL_GTT;
+	*cs++ = 0;
+
+	/* Finally we can flush and with it emit the breadcrumb */
+	*cs++ = GFX_OP_PIPE_CONTROL(4);
+	*cs++ = (PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH |
+		 PIPE_CONTROL_DEPTH_CACHE_FLUSH |
+		 PIPE_CONTROL_DC_FLUSH_ENABLE |
+		 PIPE_CONTROL_QW_WRITE |
+		 PIPE_CONTROL_CS_STALL);
+	*cs++ = rq->timeline->hwsp_offset | PIPE_CONTROL_GLOBAL_GTT;
+	*cs++ = rq->fence.seqno;
+
+	*cs++ = GFX_OP_PIPE_CONTROL(4);
+	*cs++ = PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_CS_STALL;
+	*cs++ = intel_hws_seqno_address(rq->engine) | PIPE_CONTROL_GLOBAL_GTT;
+	*cs++ = rq->global_seqno;
+
+	*cs++ = MI_USER_INTERRUPT;
+	*cs++ = MI_NOOP;
+
+	rq->tail = intel_ring_offset(rq, cs);
+	assert_ring_tail_valid(rq->ring, rq->tail);
+
+	return cs;
+}
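
The new gen6 RCS breadcrumb folds the post-sync-nonzero workaround, the final flush, and both seqno writes (the per-timeline value at hwsp_offset and the legacy global seqno at the status-page index) into one fixed sequence. Counting what is emitted: four 4-dword PIPE_CONTROLs plus MI_USER_INTERRUPT and MI_NOOP is 18 dwords, an even number, so the ring tail stays qword-aligned, which is the usual job of a trailing MI_NOOP.
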
+
 static int
 gen7_render_ring_cs_stall_wa(struct i915_request *rq)
 {
@@ -379,11 +409,111 @@ gen7_render_ring_flush(struct i915_request *rq, u32 mode)
 	return 0;
 }
 
-static void ring_setup_phys_status_page(struct intel_engine_cs *engine)
+static u32 *gen7_rcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
+{
+	*cs++ = GFX_OP_PIPE_CONTROL(4);
+	*cs++ = (PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH |
+		 PIPE_CONTROL_DEPTH_CACHE_FLUSH |
+		 PIPE_CONTROL_DC_FLUSH_ENABLE |
+		 PIPE_CONTROL_FLUSH_ENABLE |
+		 PIPE_CONTROL_QW_WRITE |
+		 PIPE_CONTROL_GLOBAL_GTT_IVB |
+		 PIPE_CONTROL_CS_STALL);
+	*cs++ = rq->timeline->hwsp_offset;
+	*cs++ = rq->fence.seqno;
+
+	*cs++ = GFX_OP_PIPE_CONTROL(4);
+	*cs++ = (PIPE_CONTROL_QW_WRITE |
+		 PIPE_CONTROL_GLOBAL_GTT_IVB |
+		 PIPE_CONTROL_CS_STALL);
+	*cs++ = intel_hws_seqno_address(rq->engine);
+	*cs++ = rq->global_seqno;
+
+	*cs++ = MI_USER_INTERRUPT;
+	*cs++ = MI_NOOP;
+
+	rq->tail = intel_ring_offset(rq, cs);
+	assert_ring_tail_valid(rq->ring, rq->tail);
+
+	return cs;
+}
+
+static u32 *gen6_xcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
+{
+	GEM_BUG_ON(rq->timeline->hwsp_ggtt != rq->engine->status_page.vma);
+	GEM_BUG_ON(offset_in_page(rq->timeline->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
+
+	*cs++ = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
+	*cs++ = I915_GEM_HWS_SEQNO_ADDR | MI_FLUSH_DW_USE_GTT;
+	*cs++ = rq->fence.seqno;
+
+	*cs++ = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
+	*cs++ = I915_GEM_HWS_INDEX_ADDR | MI_FLUSH_DW_USE_GTT;
+	*cs++ = rq->global_seqno;
+
+	*cs++ = MI_USER_INTERRUPT;
+	*cs++ = MI_NOOP;
+
+	rq->tail = intel_ring_offset(rq, cs);
+	assert_ring_tail_valid(rq->ring, rq->tail);
+
+	return cs;
+}
+
+#define GEN7_XCS_WA 32
+static u32 *gen7_xcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
+{
+	int i;
+
+	GEM_BUG_ON(rq->timeline->hwsp_ggtt != rq->engine->status_page.vma);
+	GEM_BUG_ON(offset_in_page(rq->timeline->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
+
+	*cs++ = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
+	*cs++ = I915_GEM_HWS_SEQNO_ADDR | MI_FLUSH_DW_USE_GTT;
+	*cs++ = rq->fence.seqno;
+
+	*cs++ = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX;
+	*cs++ = I915_GEM_HWS_INDEX_ADDR | MI_FLUSH_DW_USE_GTT;
+	*cs++ = rq->global_seqno;
+
+	for (i = 0; i < GEN7_XCS_WA; i++) {
+		*cs++ = MI_STORE_DWORD_INDEX;
+		*cs++ = I915_GEM_HWS_SEQNO_ADDR;
+		*cs++ = rq->fence.seqno;
+	}
+
+	*cs++ = MI_FLUSH_DW;
+	*cs++ = 0;
+	*cs++ = 0;
+
+	*cs++ = MI_USER_INTERRUPT;
+
+	rq->tail = intel_ring_offset(rq, cs);
+	assert_ring_tail_valid(rq->ring, rq->tail);
+
+	return cs;
+}
+#undef GEN7_XCS_WA
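
The gen7 variant pads the same idea with 32 extra MI_STORE_DWORD_INDEX writes of the seqno (the GEN7_XCS_WA loop). The dword arithmetic also explains the differing tails: gen6_xcs emits 3 + 3 + 1 = 7 dwords and needs the MI_NOOP to reach an even 8, while gen7_xcs emits 3 + 3 + 32*3 + 3 + 1 = 106, already even, so it ends directly on MI_USER_INTERRUPT.
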
+
+static void set_hwstam(struct intel_engine_cs *engine, u32 mask)
+{
+	/*
+	 * Keep the render interrupt unmasked as this papers over
+	 * lost interrupts following a reset.
+	 */
+	if (engine->class == RENDER_CLASS) {
+		if (INTEL_GEN(engine->i915) >= 6)
+			mask &= ~BIT(0);
+		else
+			mask &= ~I915_USER_INTERRUPT;
+	}
+
+	intel_engine_set_hwsp_writemask(engine, mask);
+}
+
+static void set_hws_pga(struct intel_engine_cs *engine, phys_addr_t phys)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
-	struct page *page = virt_to_page(engine->status_page.page_addr);
-	phys_addr_t phys = PFN_PHYS(page_to_pfn(page));
 	u32 addr;
 
 	addr = lower_32_bits(phys);
@@ -393,15 +523,30 @@ static void ring_setup_phys_status_page(struct intel_engine_cs *engine)
 	I915_WRITE(HWS_PGA, addr);
 }
 
-static void intel_ring_setup_status_page(struct intel_engine_cs *engine)
+static struct page *status_page(struct intel_engine_cs *engine)
+{
+	struct drm_i915_gem_object *obj = engine->status_page.vma->obj;
+
+	GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
+	return sg_page(obj->mm.pages->sgl);
+}
+
+static void ring_setup_phys_status_page(struct intel_engine_cs *engine)
+{
+	set_hws_pga(engine, PFN_PHYS(page_to_pfn(status_page(engine))));
+	set_hwstam(engine, ~0u);
+}
+
+static void set_hwsp(struct intel_engine_cs *engine, u32 offset)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
-	i915_reg_t mmio;
+	i915_reg_t hwsp;
 
-	/* The ring status page addresses are no longer next to the rest of
+	/*
+	 * The ring status page addresses are no longer next to the rest of
 	 * the ring registers as of gen7.
 	 */
-	if (IS_GEN7(dev_priv)) {
+	if (IS_GEN(dev_priv, 7)) {
 		switch (engine->id) {
 		/*
 		 * No more rings exist on Gen7. Default case is only to shut up
@@ -410,56 +555,55 @@ static void intel_ring_setup_status_page(struct intel_engine_cs *engine)
 		default:
 			GEM_BUG_ON(engine->id);
 		case RCS:
-			mmio = RENDER_HWS_PGA_GEN7;
+			hwsp = RENDER_HWS_PGA_GEN7;
 			break;
 		case BCS:
-			mmio = BLT_HWS_PGA_GEN7;
+			hwsp = BLT_HWS_PGA_GEN7;
 			break;
 		case VCS:
-			mmio = BSD_HWS_PGA_GEN7;
+			hwsp = BSD_HWS_PGA_GEN7;
 			break;
 		case VECS:
-			mmio = VEBOX_HWS_PGA_GEN7;
+			hwsp = VEBOX_HWS_PGA_GEN7;
 			break;
 		}
-	} else if (IS_GEN6(dev_priv)) {
-		mmio = RING_HWS_PGA_GEN6(engine->mmio_base);
+	} else if (IS_GEN(dev_priv, 6)) {
+		hwsp = RING_HWS_PGA_GEN6(engine->mmio_base);
 	} else {
-		mmio = RING_HWS_PGA(engine->mmio_base);
+		hwsp = RING_HWS_PGA(engine->mmio_base);
 	}
 
-	if (INTEL_GEN(dev_priv) >= 6) {
-		u32 mask = ~0u;
+	I915_WRITE(hwsp, offset);
+	POSTING_READ(hwsp);
+}
 
-		/*
-		 * Keep the render interrupt unmasked as this papers over
-		 * lost interrupts following a reset.
-		 */
-		if (engine->id == RCS)
-			mask &= ~BIT(0);
+static void flush_cs_tlb(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->i915;
+	i915_reg_t instpm = RING_INSTPM(engine->mmio_base);
 
-		I915_WRITE(RING_HWSTAM(engine->mmio_base), mask);
-	}
+	if (!IS_GEN_RANGE(dev_priv, 6, 7))
+		return;
 
-	I915_WRITE(mmio, engine->status_page.ggtt_offset);
-	POSTING_READ(mmio);
+	/* The ring should be idle before issuing a sync flush */
+	WARN_ON((I915_READ_MODE(engine) & MODE_IDLE) == 0);
 
-	/* Flush the TLB for this page */
-	if (IS_GEN(dev_priv, 6, 7)) {
-		i915_reg_t reg = RING_INSTPM(engine->mmio_base);
+	I915_WRITE(instpm,
+		   _MASKED_BIT_ENABLE(INSTPM_TLB_INVALIDATE |
+				      INSTPM_SYNC_FLUSH));
+	if (intel_wait_for_register(dev_priv,
+				    instpm, INSTPM_SYNC_FLUSH, 0,
+				    1000))
+		DRM_ERROR("%s: wait for SyncFlush to complete for TLB invalidation timed out\n",
+			  engine->name);
+}
 
-		/* ring should be idle before issuing a sync flush*/
-		WARN_ON((I915_READ_MODE(engine) & MODE_IDLE) == 0);
+static void ring_setup_status_page(struct intel_engine_cs *engine)
+{
+	set_hwsp(engine, i915_ggtt_offset(engine->status_page.vma));
+	set_hwstam(engine, ~0u);
 
-		I915_WRITE(reg,
-			   _MASKED_BIT_ENABLE(INSTPM_TLB_INVALIDATE |
-					      INSTPM_SYNC_FLUSH));
-		if (intel_wait_for_register(dev_priv,
-					    reg, INSTPM_SYNC_FLUSH, 0,
-					    1000))
-			DRM_ERROR("%s: wait for SyncFlush to complete for TLB invalidation timed out\n",
-				  engine->name);
-	}
+	flush_cs_tlb(engine);
 }
 
 static bool stop_ring(struct intel_engine_cs *engine)
@@ -529,17 +673,10 @@ static int init_ring_common(struct intel_engine_cs *engine)
 	if (HWS_NEEDS_PHYSICAL(dev_priv))
 		ring_setup_phys_status_page(engine);
 	else
-		intel_ring_setup_status_page(engine);
+		ring_setup_status_page(engine);
 
 	intel_engine_reset_breadcrumbs(engine);
 
-	if (HAS_LEGACY_SEMAPHORES(engine->i915)) {
-		I915_WRITE(RING_SYNC_0(engine->mmio_base), 0);
-		I915_WRITE(RING_SYNC_1(engine->mmio_base), 0);
-		if (HAS_VEBOX(dev_priv))
-			I915_WRITE(RING_SYNC_2(engine->mmio_base), 0);
-	}
-
 	/* Enforce ordering by reading HEAD register back */
 	I915_READ_HEAD(engine);
 
@@ -593,63 +730,87 @@ static int init_ring_common(struct intel_engine_cs *engine)
 	}
 
 	/* Papering over lost _interrupts_ immediately following the restart */
-	intel_engine_wakeup(engine);
+	intel_engine_queue_breadcrumbs(engine);
 out:
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 
 	return ret;
 }
 
-static struct i915_request *reset_prepare(struct intel_engine_cs *engine)
+static void reset_prepare(struct intel_engine_cs *engine)
 {
 	intel_engine_stop_cs(engine);
-
-	if (engine->irq_seqno_barrier)
-		engine->irq_seqno_barrier(engine);
-
-	return i915_gem_find_active_request(engine);
 }
 
-static void skip_request(struct i915_request *rq)
+static void reset_ring(struct intel_engine_cs *engine, bool stalled)
 {
-	void *vaddr = rq->ring->vaddr;
+	struct i915_timeline *tl = &engine->timeline;
+	struct i915_request *pos, *rq;
+	unsigned long flags;
 	u32 head;
 
-	head = rq->infix;
-	if (rq->postfix < head) {
-		memset32(vaddr + head, MI_NOOP,
-			 (rq->ring->size - head) / sizeof(u32));
-		head = 0;
+	rq = NULL;
+	spin_lock_irqsave(&tl->lock, flags);
+	list_for_each_entry(pos, &tl->requests, link) {
+		if (!i915_request_completed(pos)) {
+			rq = pos;
+			break;
+		}
 	}
-	memset32(vaddr + head, MI_NOOP, (rq->postfix - head) / sizeof(u32));
-}
-
-static void reset_ring(struct intel_engine_cs *engine, struct i915_request *rq)
-{
-	GEM_TRACE("%s request global=%d, current=%d\n",
-		  engine->name, rq ? rq->global_seqno : 0,
-		  intel_engine_get_seqno(engine));
 
+	GEM_TRACE("%s seqno=%d, current=%d, stalled? %s\n",
+		  engine->name,
+		  rq ? rq->global_seqno : 0,
+		  intel_engine_get_seqno(engine),
+		  yesno(stalled));
 	/*
-	 * Try to restore the logical GPU state to match the continuation
-	 * of the request queue. If we skip the context/PD restore, then
-	 * the next request may try to execute assuming that its context
-	 * is valid and loaded on the GPU and so may try to access invalid
-	 * memory, prompting repeated GPU hangs.
+	 * The guilty request will get skipped on a hung engine.
 	 *
-	 * If the request was guilty, we still restore the logical state
-	 * in case the next request requires it (e.g. the aliasing ppgtt),
-	 * but skip over the hung batch.
+	 * Users of client default contexts do not rely on logical
+	 * state preserved between batches, so it is safe to execute
+	 * queued requests following the hang. Non-default contexts do
+	 * rely on preserved state, so skipping a batch loses the
+	 * evolution of the state and it must be considered corrupted.
+	 * Executing more queued batches on top of corrupted state is
+	 * risky, but we take that risk and try to advance through the
+	 * queued requests in order to make client behaviour more
+	 * predictable around resets, by not throwing away a random
+	 * number of batches it has prepared for execution. Sophisticated
+	 * clients can use gem_reset_stats_ioctl and dma fence status
+	 * (exported via the sync_file info ioctl on explicit fences) to
+	 * observe when they lose the context state and rebuild accordingly.
 	 *
-	 * If the request was innocent, we try to replay the request with
-	 * the restored context.
+	 * The context ban, and ultimately the client ban, mechanisms are
+	 * safety valves if client submission ends up resulting in nothing
+	 * more than subsequent hangs.
 	 */
+
 	if (rq) {
-		/* If the rq hung, jump to its breadcrumb and skip the batch */
-		rq->ring->head = intel_ring_wrap(rq->ring, rq->head);
-		if (rq->fence.error == -EIO)
-			skip_request(rq);
+		/*
+		 * Try to restore the logical GPU state to match the
+		 * continuation of the request queue. If we skip the
+		 * context/PD restore, then the next request may try to execute
+		 * assuming that its context is valid and loaded on the GPU and
+		 * so may try to access invalid memory, prompting repeated GPU
+		 * hangs.
+		 *
+		 * If the request was guilty, we still restore the logical
+		 * state in case the next request requires it (e.g. the
+		 * aliasing ppgtt), but skip over the hung batch.
+		 *
+		 * If the request was innocent, we try to replay the request
+		 * with the restored context.
+		 */
+		i915_reset_request(rq, stalled);
+
+		GEM_BUG_ON(rq->ring != engine->buffer);
+		head = rq->head;
+	} else {
+		head = engine->buffer->tail;
 	}
+	engine->buffer->head = intel_ring_wrap(engine->buffer, head);
+
+	spin_unlock_irqrestore(&tl->lock, flags);
 }
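
A minimal userspace sketch of the reset-stats query mentioned in the comment
above, assuming the drm_i915_reset_stats uapi and libdrm's drmIoctl()
(illustrative; rebuild_context_state() is a hypothetical helper):

	struct drm_i915_reset_stats stats = { .ctx_id = ctx_id };

	if (drmIoctl(fd, DRM_IOCTL_I915_GET_RESET_STATS, &stats) == 0 &&
	    (stats.batch_active || stats.batch_pending))
		rebuild_context_state(); /* context lost state in a reset */

Here batch_active counts hangs the context caused and batch_pending counts
innocent losses, mirroring the guilty/innocent handling above.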
 
 static void reset_finish(struct intel_engine_cs *engine)
@@ -679,7 +840,7 @@ static int init_render_ring(struct intel_engine_cs *engine)
 		return ret;
 
 	/* WaTimedSingleVertexDispatch:cl,bw,ctg,elk,ilk,snb */
-	if (IS_GEN(dev_priv, 4, 6))
+	if (IS_GEN_RANGE(dev_priv, 4, 6))
 		I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));
 
 	/* We need to disable the AsyncFlip performance optimisations in order
@@ -688,22 +849,22 @@ static int init_render_ring(struct intel_engine_cs *engine)
 	 *
 	 * WaDisableAsyncFlipPerfMode:snb,ivb,hsw,vlv
 	 */
-	if (IS_GEN(dev_priv, 6, 7))
+	if (IS_GEN_RANGE(dev_priv, 6, 7))
 		I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(ASYNC_FLIP_PERF_DISABLE));
 
 	/* Required for the hardware to program scanline values for waiting */
 	/* WaEnableFlushTlbInvalidationMode:snb */
-	if (IS_GEN6(dev_priv))
+	if (IS_GEN(dev_priv, 6))
 		I915_WRITE(GFX_MODE,
 			   _MASKED_BIT_ENABLE(GFX_TLB_INVALIDATE_EXPLICIT));
 
 	/* WaBCSVCSTlbInvalidationMode:ivb,vlv,hsw */
-	if (IS_GEN7(dev_priv))
+	if (IS_GEN(dev_priv, 7))
 		I915_WRITE(GFX_MODE_GEN7,
 			   _MASKED_BIT_ENABLE(GFX_TLB_INVALIDATE_EXPLICIT) |
 			   _MASKED_BIT_ENABLE(GFX_REPLAY_MODE));
 
-	if (IS_GEN6(dev_priv)) {
+	if (IS_GEN(dev_priv, 6)) {
 		/* From the Sandybridge PRM, volume 1 part 3, page 24:
 		 * "If this bit is set, STCunit will have LRA as replacement
 		 *  policy. [...] This bit must be reset.  LRA replacement
@@ -713,7 +874,7 @@ static int init_render_ring(struct intel_engine_cs *engine)
 			   _MASKED_BIT_DISABLE(CM0_STC_EVICT_DISABLE_LRA_SNB));
 	}
 
-	if (IS_GEN(dev_priv, 6, 7))
+	if (IS_GEN_RANGE(dev_priv, 6, 7))
 		I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING));
 
 	if (INTEL_GEN(dev_priv) >= 6)
@@ -722,33 +883,6 @@ static int init_render_ring(struct intel_engine_cs *engine)
 	return 0;
 }
 
-static u32 *gen6_signal(struct i915_request *rq, u32 *cs)
-{
-	struct drm_i915_private *dev_priv = rq->i915;
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	int num_rings = 0;
-
-	for_each_engine(engine, dev_priv, id) {
-		i915_reg_t mbox_reg;
-
-		if (!(BIT(engine->hw_id) & GEN6_SEMAPHORES_MASK))
-			continue;
-
-		mbox_reg = rq->engine->semaphore.mbox.signal[engine->hw_id];
-		if (i915_mmio_reg_valid(mbox_reg)) {
-			*cs++ = MI_LOAD_REGISTER_IMM(1);
-			*cs++ = i915_mmio_reg_offset(mbox_reg);
-			*cs++ = rq->global_seqno;
-			num_rings++;
-		}
-	}
-	if (num_rings & 1)
-		*cs++ = MI_NOOP;
-
-	return cs;
-}
-
 static void cancel_requests(struct intel_engine_cs *engine)
 {
 	struct i915_request *request;
@@ -760,11 +894,10 @@ static void cancel_requests(struct intel_engine_cs *engine)
 	list_for_each_entry(request, &engine->timeline.requests, link) {
 		GEM_BUG_ON(!request->global_seqno);
 
-		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
-			     &request->fence.flags))
-			continue;
+		if (!i915_request_signaled(request))
+			dma_fence_set_error(&request->fence, -EIO);
 
-		dma_fence_set_error(&request->fence, -EIO);
+		i915_request_mark_complete(request);
 	}
 
 	intel_write_status_page(engine,
@@ -786,94 +919,59 @@ static void i9xx_submit_request(struct i915_request *request)
 			intel_ring_set_tail(request->ring, request->tail));
 }
 
-static void i9xx_emit_breadcrumb(struct i915_request *rq, u32 *cs)
+static u32 *i9xx_emit_breadcrumb(struct i915_request *rq, u32 *cs)
 {
+	GEM_BUG_ON(rq->timeline->hwsp_ggtt != rq->engine->status_page.vma);
+	GEM_BUG_ON(offset_in_page(rq->timeline->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
+
+	*cs++ = MI_FLUSH;
+
+	*cs++ = MI_STORE_DWORD_INDEX;
+	*cs++ = I915_GEM_HWS_SEQNO_ADDR;
+	*cs++ = rq->fence.seqno;
+
 	*cs++ = MI_STORE_DWORD_INDEX;
-	*cs++ = I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT;
+	*cs++ = I915_GEM_HWS_INDEX_ADDR;
 	*cs++ = rq->global_seqno;
+
 	*cs++ = MI_USER_INTERRUPT;
 
 	rq->tail = intel_ring_offset(rq, cs);
 	assert_ring_tail_valid(rq->ring, rq->tail);
-}
 
-static const int i9xx_emit_breadcrumb_sz = 4;
-
-static void gen6_sema_emit_breadcrumb(struct i915_request *rq, u32 *cs)
-{
-	return i9xx_emit_breadcrumb(rq, rq->engine->semaphore.signal(rq, cs));
+	return cs;
 }
 
-static int
-gen6_ring_sync_to(struct i915_request *rq, struct i915_request *signal)
+#define GEN5_WA_STORES 8 /* must be at least 1! */
+static u32 *gen5_emit_breadcrumb(struct i915_request *rq, u32 *cs)
 {
-	u32 dw1 = MI_SEMAPHORE_MBOX |
-		  MI_SEMAPHORE_COMPARE |
-		  MI_SEMAPHORE_REGISTER;
-	u32 wait_mbox = signal->engine->semaphore.mbox.wait[rq->engine->hw_id];
-	u32 *cs;
-
-	WARN_ON(wait_mbox == MI_SEMAPHORE_SYNC_INVALID);
+	int i;
 
-	cs = intel_ring_begin(rq, 4);
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
+	GEM_BUG_ON(rq->timeline->hwsp_ggtt != rq->engine->status_page.vma);
+	GEM_BUG_ON(offset_in_page(rq->timeline->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR);
 
-	*cs++ = dw1 | wait_mbox;
-	/* Throughout all of the GEM code, seqno passed implies our current
-	 * seqno is >= the last seqno executed. However for hardware the
-	 * comparison is strictly greater than.
-	 */
-	*cs++ = signal->global_seqno - 1;
-	*cs++ = 0;
-	*cs++ = MI_NOOP;
-	intel_ring_advance(rq, cs);
+	*cs++ = MI_FLUSH;
 
-	return 0;
-}
+	*cs++ = MI_STORE_DWORD_INDEX;
+	*cs++ = I915_GEM_HWS_SEQNO_ADDR;
+	*cs++ = rq->fence.seqno;
+
+	BUILD_BUG_ON(GEN5_WA_STORES < 1);
+	for (i = 0; i < GEN5_WA_STORES; i++) {
+		*cs++ = MI_STORE_DWORD_INDEX;
+		*cs++ = I915_GEM_HWS_INDEX_ADDR;
+		*cs++ = rq->global_seqno;
+	}
 
-static void
-gen5_seqno_barrier(struct intel_engine_cs *engine)
-{
-	/* MI_STORE are internally buffered by the GPU and not flushed
-	 * either by MI_FLUSH or SyncFlush or any other combination of
-	 * MI commands.
-	 *
-	 * "Only the submission of the store operation is guaranteed.
-	 * The write result will be complete (coherent) some time later
-	 * (this is practically a finite period but there is no guaranteed
-	 * latency)."
-	 *
-	 * Empirically, we observe that we need a delay of at least 75us to
-	 * be sure that the seqno write is visible by the CPU.
-	 */
-	usleep_range(125, 250);
-}
+	*cs++ = MI_USER_INTERRUPT;
+	*cs++ = MI_NOOP;
 
-static void
-gen6_seqno_barrier(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
+	rq->tail = intel_ring_offset(rq, cs);
+	assert_ring_tail_valid(rq->ring, rq->tail);
 
-	/* Workaround to force correct ordering between irq and seqno writes on
-	 * ivb (and maybe also on snb) by reading from a CS register (like
-	 * ACTHD) before reading the status page.
-	 *
-	 * Note that this effectively stalls the read by the time it takes to
-	 * do a memory transaction, which more or less ensures that the write
-	 * from the GPU has sufficient time to invalidate the CPU cacheline.
-	 * Alternatively we could delay the interrupt from the CS ring to give
-	 * the write time to land, but that would incur a delay after every
-	 * batch i.e. much more frequent than a delay when waiting for the
-	 * interrupt (with the same net latency).
-	 *
-	 * Also note that to prevent whole machine hangs on gen7, we have to
-	 * take the spinlock to guard against concurrent cacheline access.
-	 */
-	spin_lock_irq(&dev_priv->uncore.lock);
-	POSTING_READ_FW(RING_ACTHD(engine->mmio_base));
-	spin_unlock_irq(&dev_priv->uncore.lock);
+	return cs;
 }
+#undef GEN5_WA_STORES
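
For sizing, the sequence emitted above works out, with the default
GEN5_WA_STORES of 8, to:

	1 (MI_FLUSH) + 3 (seqno store) + 8 * 3 (WA stores)
		+ 1 (MI_USER_INTERRUPT) + 1 (MI_NOOP) = 30 dwords

which is the space the engine's emit_fini_breadcrumb_dw accounting (the field
is declared later in this patch, in intel_ringbuffer.h) has to reserve at the
end of every request.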
 
 static void
 gen5_irq_enable(struct intel_engine_cs *engine)
@@ -948,6 +1046,10 @@ gen6_irq_enable(struct intel_engine_cs *engine)
 	I915_WRITE_IMR(engine,
 		       ~(engine->irq_enable_mask |
 			 engine->irq_keep_mask));
+
+	/* Flush/delay to ensure the RING_IMR is active before the GT IMR */
+	POSTING_READ_FW(RING_IMR(engine->mmio_base));
+
 	gen5_enable_gt_irq(dev_priv, engine->irq_enable_mask);
 }
 
@@ -966,6 +1068,10 @@ hsw_vebox_irq_enable(struct intel_engine_cs *engine)
 	struct drm_i915_private *dev_priv = engine->i915;
 
 	I915_WRITE_IMR(engine, ~engine->irq_enable_mask);
+
+	/* Flush/delay to ensure the RING_IMR is active before the GT IMR */
+	POSTING_READ_FW(RING_IMR(engine->mmio_base));
+
 	gen6_unmask_pm_irq(dev_priv, engine->irq_enable_mask);
 }
 
@@ -1091,6 +1197,10 @@ int intel_ring_pin(struct intel_ring *ring)
 
 	GEM_BUG_ON(ring->vaddr);
 
+	ret = i915_timeline_pin(ring->timeline);
+	if (ret)
+		return ret;
+
 	flags = PIN_GLOBAL;
 
 	/* Ring wraparound at offset 0 sometimes hangs. No idea why. */
@@ -1107,28 +1217,32 @@ int intel_ring_pin(struct intel_ring *ring)
 		else
 			ret = i915_gem_object_set_to_cpu_domain(vma->obj, true);
 		if (unlikely(ret))
-			return ret;
+			goto unpin_timeline;
 	}
 
 	ret = i915_vma_pin(vma, 0, 0, flags);
 	if (unlikely(ret))
-		return ret;
+		goto unpin_timeline;
 
 	if (i915_vma_is_map_and_fenceable(vma))
 		addr = (void __force *)i915_vma_pin_iomap(vma);
 	else
 		addr = i915_gem_object_pin_map(vma->obj, map);
-	if (IS_ERR(addr))
-		goto err;
+	if (IS_ERR(addr)) {
+		ret = PTR_ERR(addr);
+		goto unpin_ring;
+	}
 
 	vma->obj->pin_global++;
 
 	ring->vaddr = addr;
 	return 0;
 
-err:
+unpin_ring:
 	i915_vma_unpin(vma);
-	return PTR_ERR(addr);
+unpin_timeline:
+	i915_timeline_unpin(ring->timeline);
+	return ret;
 }
 
 void intel_ring_reset(struct intel_ring *ring, u32 tail)
@@ -1157,6 +1271,8 @@ void intel_ring_unpin(struct intel_ring *ring)
 
 	ring->vma->obj->pin_global--;
 	i915_vma_unpin(ring->vma);
+
+	i915_timeline_unpin(ring->timeline);
 }
 
 static struct i915_vma *
@@ -1467,13 +1583,18 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
 	struct intel_ring *ring;
 	int err;
 
-	intel_engine_setup_common(engine);
+	err = intel_engine_setup_common(engine);
+	if (err)
+		return err;
 
-	timeline = i915_timeline_create(engine->i915, engine->name);
+	timeline = i915_timeline_create(engine->i915,
+					engine->name,
+					engine->status_page.vma);
 	if (IS_ERR(timeline)) {
 		err = PTR_ERR(timeline);
 		goto err;
 	}
+	GEM_BUG_ON(timeline->has_initial_breadcrumb);
 
 	ring = intel_engine_create_ring(engine, timeline, 32 * PAGE_SIZE);
 	i915_timeline_put(timeline);
@@ -1493,6 +1614,8 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
 	if (err)
 		goto err_unpin;
 
+	GEM_BUG_ON(ring->timeline->hwsp_ggtt != engine->status_page.vma);
+
 	return 0;
 
 err_unpin:
@@ -1581,10 +1704,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 	struct intel_engine_cs *engine = rq->engine;
 	enum intel_engine_id id;
 	const int num_rings =
-		/* Use an extended w/a on gen7 if signalling from other rings */
-		(HAS_LEGACY_SEMAPHORES(i915) && IS_GEN7(i915)) ?
-		INTEL_INFO(i915)->num_rings - 1 :
-		0;
+		IS_HSW_GT1(i915) ? RUNTIME_INFO(i915)->num_rings - 1 : 0;
 	bool force_restore = false;
 	int len;
 	u32 *cs;
@@ -1597,7 +1717,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 		flags |= MI_SAVE_EXT_STATE_EN | MI_RESTORE_EXT_STATE_EN;
 
 	len = 4;
-	if (IS_GEN7(i915))
+	if (IS_GEN(i915, 7))
 		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
 	if (flags & MI_FORCE_RESTORE) {
 		GEM_BUG_ON(flags & MI_RESTORE_INHIBIT);
@@ -1611,7 +1731,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 		return PTR_ERR(cs);
 
 	/* WaProgramMiArbOnOffAroundMiSetContext:ivb,vlv,hsw,bdw,chv */
-	if (IS_GEN7(i915)) {
+	if (IS_GEN(i915, 7)) {
 		*cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
 		if (num_rings) {
 			struct intel_engine_cs *signaller;
@@ -1658,7 +1778,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 	 */
 	*cs++ = MI_NOOP;
 
-	if (IS_GEN7(i915)) {
+	if (IS_GEN(i915, 7)) {
 		if (num_rings) {
 			struct intel_engine_cs *signaller;
 			i915_reg_t last_reg = {}; /* keep gcc quiet */
@@ -1828,18 +1948,21 @@ static int ring_request_alloc(struct i915_request *request)
 	int ret;
 
 	GEM_BUG_ON(!request->hw_context->pin_count);
+	GEM_BUG_ON(request->timeline->has_initial_breadcrumb);
 
-	/* Flush enough space to reduce the likelihood of waiting after
+	/*
+	 * Flush enough space to reduce the likelihood of waiting after
 	 * we start building the request - in which case we will just
 	 * have to repeat work.
 	 */
 	request->reserved_space += LEGACY_REQUEST_SIZE;
 
-	ret = intel_ring_wait_for_space(request->ring, request->reserved_space);
+	ret = switch_context(request);
 	if (ret)
 		return ret;
 
-	ret = switch_context(request);
+	/* Unconditionally invalidate GPU caches and TLBs. */
+	ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
 	if (ret)
 		return ret;
 
@@ -1881,22 +2004,6 @@ static noinline int wait_for_space(struct intel_ring *ring, unsigned int bytes)
 	return 0;
 }
 
-int intel_ring_wait_for_space(struct intel_ring *ring, unsigned int bytes)
-{
-	GEM_BUG_ON(bytes > ring->effective_size);
-	if (unlikely(bytes > ring->effective_size - ring->emit))
-		bytes += ring->size - ring->emit;
-
-	if (unlikely(bytes > ring->space)) {
-		int ret = wait_for_space(ring, bytes);
-		if (unlikely(ret))
-			return ret;
-	}
-
-	GEM_BUG_ON(ring->space < bytes);
-	return 0;
-}
-
 u32 *intel_ring_begin(struct i915_request *rq, unsigned int num_dwords)
 {
 	struct intel_ring *ring = rq->ring;
@@ -2129,77 +2236,15 @@ static int gen6_ring_flush(struct i915_request *rq, u32 mode)
 	return gen6_flush_dw(rq, mode, MI_INVALIDATE_TLB);
 }
 
-static void intel_ring_init_semaphores(struct drm_i915_private *dev_priv,
-				       struct intel_engine_cs *engine)
-{
-	int i;
-
-	if (!HAS_LEGACY_SEMAPHORES(dev_priv))
-		return;
-
-	GEM_BUG_ON(INTEL_GEN(dev_priv) < 6);
-	engine->semaphore.sync_to = gen6_ring_sync_to;
-	engine->semaphore.signal = gen6_signal;
-
-	/*
-	 * The current semaphore is only applied on pre-gen8
-	 * platform.  And there is no VCS2 ring on the pre-gen8
-	 * platform. So the semaphore between RCS and VCS2 is
-	 * initialized as INVALID.
-	 */
-	for (i = 0; i < GEN6_NUM_SEMAPHORES; i++) {
-		static const struct {
-			u32 wait_mbox;
-			i915_reg_t mbox_reg;
-		} sem_data[GEN6_NUM_SEMAPHORES][GEN6_NUM_SEMAPHORES] = {
-			[RCS_HW] = {
-				[VCS_HW] =  { .wait_mbox = MI_SEMAPHORE_SYNC_RV,  .mbox_reg = GEN6_VRSYNC },
-				[BCS_HW] =  { .wait_mbox = MI_SEMAPHORE_SYNC_RB,  .mbox_reg = GEN6_BRSYNC },
-				[VECS_HW] = { .wait_mbox = MI_SEMAPHORE_SYNC_RVE, .mbox_reg = GEN6_VERSYNC },
-			},
-			[VCS_HW] = {
-				[RCS_HW] =  { .wait_mbox = MI_SEMAPHORE_SYNC_VR,  .mbox_reg = GEN6_RVSYNC },
-				[BCS_HW] =  { .wait_mbox = MI_SEMAPHORE_SYNC_VB,  .mbox_reg = GEN6_BVSYNC },
-				[VECS_HW] = { .wait_mbox = MI_SEMAPHORE_SYNC_VVE, .mbox_reg = GEN6_VEVSYNC },
-			},
-			[BCS_HW] = {
-				[RCS_HW] =  { .wait_mbox = MI_SEMAPHORE_SYNC_BR,  .mbox_reg = GEN6_RBSYNC },
-				[VCS_HW] =  { .wait_mbox = MI_SEMAPHORE_SYNC_BV,  .mbox_reg = GEN6_VBSYNC },
-				[VECS_HW] = { .wait_mbox = MI_SEMAPHORE_SYNC_BVE, .mbox_reg = GEN6_VEBSYNC },
-			},
-			[VECS_HW] = {
-				[RCS_HW] =  { .wait_mbox = MI_SEMAPHORE_SYNC_VER, .mbox_reg = GEN6_RVESYNC },
-				[VCS_HW] =  { .wait_mbox = MI_SEMAPHORE_SYNC_VEV, .mbox_reg = GEN6_VVESYNC },
-				[BCS_HW] =  { .wait_mbox = MI_SEMAPHORE_SYNC_VEB, .mbox_reg = GEN6_BVESYNC },
-			},
-		};
-		u32 wait_mbox;
-		i915_reg_t mbox_reg;
-
-		if (i == engine->hw_id) {
-			wait_mbox = MI_SEMAPHORE_SYNC_INVALID;
-			mbox_reg = GEN6_NOSYNC;
-		} else {
-			wait_mbox = sem_data[engine->hw_id][i].wait_mbox;
-			mbox_reg = sem_data[engine->hw_id][i].mbox_reg;
-		}
-
-		engine->semaphore.mbox.wait[i] = wait_mbox;
-		engine->semaphore.mbox.signal[i] = mbox_reg;
-	}
-}
-
 static void intel_ring_init_irq(struct drm_i915_private *dev_priv,
 				struct intel_engine_cs *engine)
 {
 	if (INTEL_GEN(dev_priv) >= 6) {
 		engine->irq_enable = gen6_irq_enable;
 		engine->irq_disable = gen6_irq_disable;
-		engine->irq_seqno_barrier = gen6_seqno_barrier;
 	} else if (INTEL_GEN(dev_priv) >= 5) {
 		engine->irq_enable = gen5_irq_enable;
 		engine->irq_disable = gen5_irq_disable;
-		engine->irq_seqno_barrier = gen5_seqno_barrier;
 	} else if (INTEL_GEN(dev_priv) >= 3) {
 		engine->irq_enable = i9xx_irq_enable;
 		engine->irq_disable = i9xx_irq_disable;
@@ -2231,7 +2276,6 @@ static void intel_ring_default_vfuncs(struct drm_i915_private *dev_priv,
 	GEM_BUG_ON(INTEL_GEN(dev_priv) >= 8);
 
 	intel_ring_init_irq(dev_priv, engine);
-	intel_ring_init_semaphores(dev_priv, engine);
 
 	engine->init_hw = init_ring_common;
 	engine->reset.prepare = reset_prepare;
@@ -2241,18 +2285,14 @@ static void intel_ring_default_vfuncs(struct drm_i915_private *dev_priv,
 	engine->context_pin = intel_ring_context_pin;
 	engine->request_alloc = ring_request_alloc;
 
-	engine->emit_breadcrumb = i9xx_emit_breadcrumb;
-	engine->emit_breadcrumb_sz = i9xx_emit_breadcrumb_sz;
-	if (HAS_LEGACY_SEMAPHORES(dev_priv)) {
-		int num_rings;
-
-		engine->emit_breadcrumb = gen6_sema_emit_breadcrumb;
-
-		num_rings = INTEL_INFO(dev_priv)->num_rings - 1;
-		engine->emit_breadcrumb_sz += num_rings * 3;
-		if (num_rings & 1)
-			engine->emit_breadcrumb_sz++;
-	}
+	/*
+	 * Using a global execution timeline; the previous final breadcrumb is
+	 * equivalent to our next initial breadcrumb, so we can elide
+	 * engine->emit_init_breadcrumb().
+	 */
+	engine->emit_fini_breadcrumb = i9xx_emit_breadcrumb;
+	if (IS_GEN(dev_priv, 5))
+		engine->emit_fini_breadcrumb = gen5_emit_breadcrumb;
 
 	engine->set_default_submission = i9xx_set_default_submission;
 
@@ -2278,12 +2318,15 @@ int intel_init_render_ring_buffer(struct intel_engine_cs *engine)
 
 	engine->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
 
-	if (INTEL_GEN(dev_priv) >= 6) {
+	if (INTEL_GEN(dev_priv) >= 7) {
 		engine->init_context = intel_rcs_ctx_init;
 		engine->emit_flush = gen7_render_ring_flush;
-		if (IS_GEN6(dev_priv))
-			engine->emit_flush = gen6_render_ring_flush;
-	} else if (IS_GEN5(dev_priv)) {
+		engine->emit_fini_breadcrumb = gen7_rcs_emit_breadcrumb;
+	} else if (IS_GEN(dev_priv, 6)) {
+		engine->init_context = intel_rcs_ctx_init;
+		engine->emit_flush = gen6_render_ring_flush;
+		engine->emit_fini_breadcrumb = gen6_rcs_emit_breadcrumb;
+	} else if (IS_GEN(dev_priv, 5)) {
 		engine->emit_flush = gen4_render_ring_flush;
 	} else {
 		if (INTEL_GEN(dev_priv) < 4)
@@ -2313,13 +2356,18 @@ int intel_init_bsd_ring_buffer(struct intel_engine_cs *engine)
 
 	if (INTEL_GEN(dev_priv) >= 6) {
 		/* gen6 bsd needs a special wa for tail updates */
-		if (IS_GEN6(dev_priv))
+		if (IS_GEN(dev_priv, 6))
 			engine->set_default_submission = gen6_bsd_set_default_submission;
 		engine->emit_flush = gen6_bsd_ring_flush;
 		engine->irq_enable_mask = GT_BSD_USER_INTERRUPT;
+
+		if (IS_GEN(dev_priv, 6))
+			engine->emit_fini_breadcrumb = gen6_xcs_emit_breadcrumb;
+		else
+			engine->emit_fini_breadcrumb = gen7_xcs_emit_breadcrumb;
 	} else {
 		engine->emit_flush = bsd_ring_flush;
-		if (IS_GEN5(dev_priv))
+		if (IS_GEN(dev_priv, 5))
 			engine->irq_enable_mask = ILK_BSD_USER_INTERRUPT;
 		else
 			engine->irq_enable_mask = I915_BSD_USER_INTERRUPT;
@@ -2332,11 +2380,18 @@ int intel_init_blt_ring_buffer(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
 
+	GEM_BUG_ON(INTEL_GEN(dev_priv) < 6);
+
 	intel_ring_default_vfuncs(dev_priv, engine);
 
 	engine->emit_flush = gen6_ring_flush;
 	engine->irq_enable_mask = GT_BLT_USER_INTERRUPT;
 
+	if (IS_GEN(dev_priv, 6))
+		engine->emit_fini_breadcrumb = gen6_xcs_emit_breadcrumb;
+	else
+		engine->emit_fini_breadcrumb = gen7_xcs_emit_breadcrumb;
+
 	return intel_init_ring_buffer(engine);
 }
 
@@ -2344,6 +2399,8 @@ int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
 
+	GEM_BUG_ON(INTEL_GEN(dev_priv) < 7);
+
 	intel_ring_default_vfuncs(dev_priv, engine);
 
 	engine->emit_flush = gen6_ring_flush;
@@ -2351,5 +2408,7 @@ int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
 	engine->irq_enable = hsw_vebox_irq_enable;
 	engine->irq_disable = hsw_vebox_irq_disable;
 
+	engine->emit_fini_breadcrumb = gen7_xcs_emit_breadcrumb;
+
 	return intel_init_ring_buffer(engine);
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index a1a7cc29fdd1..710ffb221775 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -5,6 +5,7 @@
 #include <drm/drm_util.h>
 
 #include <linux/hashtable.h>
+#include <linux/irq_work.h>
 #include <linux/seqlock.h>
 
 #include "i915_gem_batch_pool.h"
@@ -28,12 +29,11 @@ struct i915_sched_attr;
  * workarounds!
  */
 #define CACHELINE_BYTES 64
-#define CACHELINE_DWORDS (CACHELINE_BYTES / sizeof(uint32_t))
+#define CACHELINE_DWORDS (CACHELINE_BYTES / sizeof(u32))
 
 struct intel_hw_status_page {
 	struct i915_vma *vma;
-	u32 *page_addr;
-	u32 ggtt_offset;
+	u32 *addr;
 };
 
 #define I915_READ_TAIL(engine) I915_READ(RING_TAIL((engine)->mmio_base))
@@ -94,12 +94,12 @@ hangcheck_action_to_str(const enum intel_engine_hangcheck_action a)
 #define I915_MAX_SUBSLICES 8
 
 #define instdone_slice_mask(dev_priv__) \
-	(IS_GEN7(dev_priv__) ? \
-	 1 : INTEL_INFO(dev_priv__)->sseu.slice_mask)
+	(IS_GEN(dev_priv__, 7) ? \
+	 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask)
 
 #define instdone_subslice_mask(dev_priv__) \
-	(IS_GEN7(dev_priv__) ? \
-	 1 : INTEL_INFO(dev_priv__)->sseu.subslice_mask[0])
+	(IS_GEN(dev_priv__, 7) ? \
+	 1 : RUNTIME_INFO(dev_priv__)->sseu.subslice_mask[0])
 
 #define for_each_instdone_slice_subslice(dev_priv__, slice__, subslice__) \
 	for ((slice__) = 0, (subslice__) = 0; \
@@ -120,13 +120,8 @@ struct intel_instdone {
 struct intel_engine_hangcheck {
 	u64 acthd;
 	u32 seqno;
-	enum intel_engine_hangcheck_action action;
 	unsigned long action_timestamp;
-	int deadlock;
 	struct intel_instdone instdone;
-	struct i915_request *active_request;
-	bool stalled:1;
-	bool wedged:1;
 };
 
 struct intel_ring {
@@ -209,6 +204,7 @@ struct i915_priolist {
 
 struct st_preempt_hang {
 	struct completion completion;
+	unsigned int count;
 	bool inject_hang;
 };
 
@@ -299,14 +295,18 @@ struct intel_engine_execlists {
 	unsigned int port_mask;
 
 	/**
-	 * @queue_priority: Highest pending priority.
+	 * @queue_priority_hint: Highest pending priority.
 	 *
 	 * When we add requests into the queue, or adjust the priority of
 	 * executing requests, we compute the maximum priority of those
 	 * pending requests. We can then use this value to determine if
 	 * we need to preempt the executing requests to service the queue.
+	 * However, since we may have recorded the priority of an inflight
+	 * request we wanted to preempt that has since completed, at the
+	 * time of dequeuing the priority hint may no longer match the
+	 * highest available request priority.
 	 */
-	int queue_priority;
+	int queue_priority_hint;
 
 	/**
 	 * @queue: queue of requests, in priority lists
@@ -365,9 +365,6 @@ struct intel_engine_cs {
 	struct drm_i915_gem_object *default_state;
 	void *pinned_default_state;
 
-	unsigned long irq_posted;
-#define ENGINE_IRQ_BREADCRUMB 0
-
 	/* Rather than have every client wait upon all user interrupts,
 	 * with the herd waking after every interrupt and each doing the
 	 * heavyweight seqno dance, we delegate the task (of being the
@@ -385,23 +382,14 @@ struct intel_engine_cs {
 	 * the overhead of waking that client is much preferred.
 	 */
 	struct intel_breadcrumbs {
-		spinlock_t irq_lock; /* protects irq_*; irqsafe */
-		struct intel_wait *irq_wait; /* oldest waiter by retirement */
-
-		spinlock_t rb_lock; /* protects the rb and wraps irq_lock */
-		struct rb_root waiters; /* sorted by retirement, priority */
-		struct list_head signals; /* sorted by retirement */
-		struct task_struct *signaler; /* used for fence signalling */
+		spinlock_t irq_lock;
+		struct list_head signalers;
 
-		struct timer_list fake_irq; /* used after a missed interrupt */
-		struct timer_list hangcheck; /* detect missed interrupts */
+		struct irq_work irq_work; /* for use from inside irq_lock */
 
-		unsigned int hangcheck_interrupts;
 		unsigned int irq_enabled;
-		unsigned int irq_count;
 
-		bool irq_armed : 1;
-		I915_SELFTEST_DECLARE(bool mock : 1);
+		bool irq_armed;
 	} breadcrumbs;
 
 	struct {
@@ -449,9 +437,8 @@ struct intel_engine_cs {
 	int		(*init_hw)(struct intel_engine_cs *engine);
 
 	struct {
-		struct i915_request *(*prepare)(struct intel_engine_cs *engine);
-		void (*reset)(struct intel_engine_cs *engine,
-			      struct i915_request *rq);
+		void (*prepare)(struct intel_engine_cs *engine);
+		void (*reset)(struct intel_engine_cs *engine, bool stalled);
 		void (*finish)(struct intel_engine_cs *engine);
 	} reset;
 
@@ -475,8 +462,10 @@ struct intel_engine_cs {
 					 unsigned int dispatch_flags);
 #define I915_DISPATCH_SECURE BIT(0)
 #define I915_DISPATCH_PINNED BIT(1)
-	void		(*emit_breadcrumb)(struct i915_request *rq, u32 *cs);
-	int		emit_breadcrumb_sz;
+	int		 (*emit_init_breadcrumb)(struct i915_request *rq);
+	u32		*(*emit_fini_breadcrumb)(struct i915_request *rq,
+						 u32 *cs);
+	unsigned int	emit_fini_breadcrumb_dw;
 
 	/* Pass the request to the hardware queue (e.g. directly into
 	 * the legacy ringbuffer or to the end of an execlist).
@@ -502,69 +491,8 @@ struct intel_engine_cs {
 	 */
 	void		(*cancel_requests)(struct intel_engine_cs *engine);
 
-	/* Some chipsets are not quite as coherent as advertised and need
-	 * an expensive kick to force a true read of the up-to-date seqno.
-	 * However, the up-to-date seqno is not always required and the last
-	 * seen value is good enough. Note that the seqno will always be
-	 * monotonic, even if not coherent.
-	 */
-	void		(*irq_seqno_barrier)(struct intel_engine_cs *engine);
 	void		(*cleanup)(struct intel_engine_cs *engine);
 
-	/* GEN8 signal/wait table - never trust comments!
-	 *	  signal to	signal to    signal to   signal to      signal to
-	 *	    RCS		   VCS          BCS        VECS		 VCS2
-	 *      --------------------------------------------------------------------
-	 *  RCS | NOP (0x00) | VCS (0x08) | BCS (0x10) | VECS (0x18) | VCS2 (0x20) |
-	 *	|-------------------------------------------------------------------
-	 *  VCS | RCS (0x28) | NOP (0x30) | BCS (0x38) | VECS (0x40) | VCS2 (0x48) |
-	 *	|-------------------------------------------------------------------
-	 *  BCS | RCS (0x50) | VCS (0x58) | NOP (0x60) | VECS (0x68) | VCS2 (0x70) |
-	 *	|-------------------------------------------------------------------
-	 * VECS | RCS (0x78) | VCS (0x80) | BCS (0x88) |  NOP (0x90) | VCS2 (0x98) |
-	 *	|-------------------------------------------------------------------
-	 * VCS2 | RCS (0xa0) | VCS (0xa8) | BCS (0xb0) | VECS (0xb8) | NOP  (0xc0) |
-	 *	|-------------------------------------------------------------------
-	 *
-	 * Generalization:
-	 *  f(x, y) := (x->id * NUM_RINGS * seqno_size) + (seqno_size * y->id)
-	 *  ie. transpose of g(x, y)
-	 *
-	 *	 sync from	sync from    sync from    sync from	sync from
-	 *	    RCS		   VCS          BCS        VECS		 VCS2
-	 *      --------------------------------------------------------------------
-	 *  RCS | NOP (0x00) | VCS (0x28) | BCS (0x50) | VECS (0x78) | VCS2 (0xa0) |
-	 *	|-------------------------------------------------------------------
-	 *  VCS | RCS (0x08) | NOP (0x30) | BCS (0x58) | VECS (0x80) | VCS2 (0xa8) |
-	 *	|-------------------------------------------------------------------
-	 *  BCS | RCS (0x10) | VCS (0x38) | NOP (0x60) | VECS (0x88) | VCS2 (0xb0) |
-	 *	|-------------------------------------------------------------------
-	 * VECS | RCS (0x18) | VCS (0x40) | BCS (0x68) |  NOP (0x90) | VCS2 (0xb8) |
-	 *	|-------------------------------------------------------------------
-	 * VCS2 | RCS (0x20) | VCS (0x48) | BCS (0x70) | VECS (0x98) |  NOP (0xc0) |
-	 *	|-------------------------------------------------------------------
-	 *
-	 * Generalization:
-	 *  g(x, y) := (y->id * NUM_RINGS * seqno_size) + (seqno_size * x->id)
-	 *  ie. transpose of f(x, y)
-	 */
-	struct {
-#define GEN6_SEMAPHORE_LAST	VECS_HW
-#define GEN6_NUM_SEMAPHORES	(GEN6_SEMAPHORE_LAST + 1)
-#define GEN6_SEMAPHORES_MASK	GENMASK(GEN6_SEMAPHORE_LAST, 0)
-		struct {
-			/* our mbox written by others */
-			u32		wait[GEN6_NUM_SEMAPHORES];
-			/* mboxes this ring signals to */
-			i915_reg_t	signal[GEN6_NUM_SEMAPHORES];
-		} mbox;
-
-		/* AKA wait() */
-		int	(*sync_to)(struct i915_request *rq,
-				   struct i915_request *signal);
-		u32	*(*signal)(struct i915_request *rq, u32 *cs);
-	} semaphore;
-
 	struct intel_engine_execlists execlists;
 
 	/* Contexts are pinned whilst they are active on the GPU. The last
@@ -665,7 +593,20 @@ intel_engine_has_preemption(const struct intel_engine_cs *engine)
 
 static inline bool __execlists_need_preempt(int prio, int last)
 {
-	return prio > max(0, last);
+	/*
+	 * Allow preemption of low -> normal -> high, but we do
+	 * not allow low priority tasks to preempt other low priority
+	 * tasks under the impression that latency for low priority
+	 * tasks does not matter (as much as background throughput),
+	 * so KISS (keep it simple).
+	 *
+	 * More naturally we would write
+	 *	prio >= max(0, last);
+	 * except that we wish to prevent triggering preemption at the same
+	 * priority level: the task that is running should remain running
+	 * to preserve FIFO ordering of dependencies.
+	 */
+	return prio > max(I915_PRIORITY_NORMAL - 1, last);
 }
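
Worked examples of the rule above, assuming I915_PRIORITY_NORMAL is the
default priority of 0:

	last = 0,    prio = 0   -> false (no preemption within a priority level)
	last = 0,    prio = 1   -> true  (higher priority preempts normal)
	last = -100, prio = -50 -> false (low never preempts low)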
 
 static inline void
@@ -743,7 +684,7 @@ static inline u32
 intel_read_status_page(const struct intel_engine_cs *engine, int reg)
 {
 	/* Ensure that the compiler doesn't optimize away the load. */
-	return READ_ONCE(engine->status_page.page_addr[reg]);
+	return READ_ONCE(engine->status_page.addr[reg]);
 }
 
 static inline void
@@ -756,12 +697,12 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
 	 */
 	if (static_cpu_has(X86_FEATURE_CLFLUSH)) {
 		mb();
-		clflush(&engine->status_page.page_addr[reg]);
-		engine->status_page.page_addr[reg] = value;
-		clflush(&engine->status_page.page_addr[reg]);
+		clflush(&engine->status_page.addr[reg]);
+		engine->status_page.addr[reg] = value;
+		clflush(&engine->status_page.addr[reg]);
 		mb();
 	} else {
-		WRITE_ONCE(engine->status_page.page_addr[reg], value);
+		WRITE_ONCE(engine->status_page.addr[reg], value);
 	}
 }
 
@@ -782,11 +723,13 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
  * The area from dword 0x30 to 0x3ff is available for driver usage.
  */
 #define I915_GEM_HWS_INDEX		0x30
-#define I915_GEM_HWS_INDEX_ADDR (I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
-#define I915_GEM_HWS_PREEMPT_INDEX	0x32
-#define I915_GEM_HWS_PREEMPT_ADDR (I915_GEM_HWS_PREEMPT_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
-#define I915_GEM_HWS_SCRATCH_INDEX	0x40
-#define I915_GEM_HWS_SCRATCH_ADDR (I915_GEM_HWS_SCRATCH_INDEX << MI_STORE_DWORD_INDEX_SHIFT)
+#define I915_GEM_HWS_INDEX_ADDR		(I915_GEM_HWS_INDEX * sizeof(u32))
+#define I915_GEM_HWS_PREEMPT		0x32
+#define I915_GEM_HWS_PREEMPT_ADDR	(I915_GEM_HWS_PREEMPT * sizeof(u32))
+#define I915_GEM_HWS_SEQNO		0x40
+#define I915_GEM_HWS_SEQNO_ADDR		(I915_GEM_HWS_SEQNO * sizeof(u32))
+#define I915_GEM_HWS_SCRATCH		0x80
+#define I915_GEM_HWS_SCRATCH_ADDR	(I915_GEM_HWS_SCRATCH * sizeof(u32))
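
In byte terms (dwords are 4 bytes), these resolve to 0xc0
(I915_GEM_HWS_INDEX_ADDR), 0xc8 (PREEMPT), 0x100 (SEQNO) and 0x200 (SCRATCH);
the *_ADDR macros are now plain byte offsets into the status page rather than
the old MI_STORE_DWORD_INDEX_SHIFT encoding.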
 
 #define I915_HWS_CSB_BUF0_INDEX		0x10
 #define I915_HWS_CSB_WRITE_INDEX	0x1f
@@ -809,7 +752,6 @@ void intel_legacy_submission_resume(struct drm_i915_private *dev_priv);
 
 int __must_check intel_ring_cacheline_align(struct i915_request *rq);
 
-int intel_ring_wait_for_space(struct intel_ring *ring, unsigned int bytes);
 u32 __must_check *intel_ring_begin(struct i915_request *rq, unsigned int n);
 
 static inline void intel_ring_advance(struct i915_request *rq, u32 *cs)
@@ -890,9 +832,21 @@ intel_ring_set_tail(struct intel_ring *ring, unsigned int tail)
 	return tail;
 }
 
-void intel_engine_init_global_seqno(struct intel_engine_cs *engine, u32 seqno);
+static inline unsigned int
+__intel_ring_space(unsigned int head, unsigned int tail, unsigned int size)
+{
+	/*
+	 * "If the Ring Buffer Head Pointer and the Tail Pointer are on the
+	 * same cacheline, the Head Pointer must not be greater than the Tail
+	 * Pointer."
+	 */
+	GEM_BUG_ON(!is_power_of_2(size));
+	return (head - tail - CACHELINE_BYTES) & (size - 1);
+}
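
A quick check of the arithmetic with illustrative values: for head = 0,
tail = 192 and size = 4096 (a power of two, as asserted),

	(0 - 192 - CACHELINE_BYTES) & (4096 - 1) == 3840

so the reported space deliberately stops one cacheline (64 bytes) short of
the head, honouring the quoted restriction.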
+
+void intel_engine_write_global_seqno(struct intel_engine_cs *engine, u32 seqno);
 
-void intel_engine_setup_common(struct intel_engine_cs *engine);
+int intel_engine_setup_common(struct intel_engine_cs *engine);
 int intel_engine_init_common(struct intel_engine_cs *engine);
 void intel_engine_cleanup_common(struct intel_engine_cs *engine);
 
@@ -904,6 +858,8 @@ int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine);
 int intel_engine_stop_cs(struct intel_engine_cs *engine);
 void intel_engine_cancel_stop_cs(struct intel_engine_cs *engine);
 
+void intel_engine_set_hwsp_writemask(struct intel_engine_cs *engine, u32 mask);
+
 u64 intel_engine_get_active_head(const struct intel_engine_cs *engine);
 u64 intel_engine_get_last_batch_head(const struct intel_engine_cs *engine);
 
@@ -948,102 +904,29 @@ static inline bool intel_engine_has_started(struct intel_engine_cs *engine,
 void intel_engine_get_instdone(struct intel_engine_cs *engine,
 			       struct intel_instdone *instdone);
 
-/*
- * Arbitrary size for largest possible 'add request' sequence. The code paths
- * are complex and variable. Empirical measurement shows that the worst case
- * is BDW at 192 bytes (6 + 6 + 36 dwords), then ILK at 136 bytes. However,
- * we need to allocate double the largest single packet within that emission
- * to account for tail wraparound (so 6 + 6 + 72 dwords for BDW).
- */
-#define MIN_SPACE_FOR_ADD_REQUEST 336
-
-static inline u32 intel_hws_seqno_address(struct intel_engine_cs *engine)
-{
-	return engine->status_page.ggtt_offset + I915_GEM_HWS_INDEX_ADDR;
-}
-
-static inline u32 intel_hws_preempt_done_address(struct intel_engine_cs *engine)
-{
-	return engine->status_page.ggtt_offset + I915_GEM_HWS_PREEMPT_ADDR;
-}
-
-/* intel_breadcrumbs.c -- user interrupt bottom-half for waiters */
-int intel_engine_init_breadcrumbs(struct intel_engine_cs *engine);
-
-static inline void intel_wait_init(struct intel_wait *wait)
-{
-	wait->tsk = current;
-	wait->request = NULL;
-}
-
-static inline void intel_wait_init_for_seqno(struct intel_wait *wait, u32 seqno)
-{
-	wait->tsk = current;
-	wait->seqno = seqno;
-}
-
-static inline bool intel_wait_has_seqno(const struct intel_wait *wait)
-{
-	return wait->seqno;
-}
-
-static inline bool
-intel_wait_update_seqno(struct intel_wait *wait, u32 seqno)
-{
-	wait->seqno = seqno;
-	return intel_wait_has_seqno(wait);
-}
-
-static inline bool
-intel_wait_update_request(struct intel_wait *wait,
-			  const struct i915_request *rq)
-{
-	return intel_wait_update_seqno(wait, i915_request_global_seqno(rq));
-}
-
-static inline bool
-intel_wait_check_seqno(const struct intel_wait *wait, u32 seqno)
-{
-	return wait->seqno == seqno;
-}
-
-static inline bool
-intel_wait_check_request(const struct intel_wait *wait,
-			 const struct i915_request *rq)
-{
-	return intel_wait_check_seqno(wait, i915_request_global_seqno(rq));
-}
+void intel_engine_init_breadcrumbs(struct intel_engine_cs *engine);
+void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine);
 
-static inline bool intel_wait_complete(const struct intel_wait *wait)
-{
-	return RB_EMPTY_NODE(&wait->node);
-}
+void intel_engine_pin_breadcrumbs_irq(struct intel_engine_cs *engine);
+void intel_engine_unpin_breadcrumbs_irq(struct intel_engine_cs *engine);
 
-bool intel_engine_add_wait(struct intel_engine_cs *engine,
-			   struct intel_wait *wait);
-void intel_engine_remove_wait(struct intel_engine_cs *engine,
-			      struct intel_wait *wait);
-bool intel_engine_enable_signaling(struct i915_request *request, bool wakeup);
-void intel_engine_cancel_signaling(struct i915_request *request);
+bool intel_engine_signal_breadcrumbs(struct intel_engine_cs *engine);
+void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine);
 
-static inline bool intel_engine_has_waiter(const struct intel_engine_cs *engine)
+static inline void
+intel_engine_queue_breadcrumbs(struct intel_engine_cs *engine)
 {
-	return READ_ONCE(engine->breadcrumbs.irq_wait);
+	irq_work_queue(&engine->breadcrumbs.irq_work);
 }
 
-unsigned int intel_engine_wakeup(struct intel_engine_cs *engine);
-#define ENGINE_WAKEUP_WAITER BIT(0)
-#define ENGINE_WAKEUP_ASLEEP BIT(1)
-
-void intel_engine_pin_breadcrumbs_irq(struct intel_engine_cs *engine);
-void intel_engine_unpin_breadcrumbs_irq(struct intel_engine_cs *engine);
-
-void __intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine);
-void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine);
+bool intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine);
 
 void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine);
 void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine);
 
+void intel_engine_print_breadcrumbs(struct intel_engine_cs *engine,
+				    struct drm_printer *p);
+
 static inline u32 *gen8_emit_pipe_control(u32 *batch, u32 flags, u32 offset)
 {
 	memset(batch, 0, 6 * sizeof(u32));
@@ -1056,7 +939,7 @@ static inline u32 *gen8_emit_pipe_control(u32 *batch, u32 flags, u32 offset)
 }
 
 static inline u32 *
-gen8_emit_ggtt_write_rcs(u32 *cs, u32 value, u32 gtt_offset)
+gen8_emit_ggtt_write_rcs(u32 *cs, u32 value, u32 gtt_offset, u32 flags)
 {
 	/* We're using qword write, offset should be aligned to 8 bytes. */
 	GEM_BUG_ON(!IS_ALIGNED(gtt_offset, 8));
@@ -1066,8 +949,7 @@ gen8_emit_ggtt_write_rcs(u32 *cs, u32 value, u32 gtt_offset)
 	 * following the batch.
 	 */
 	*cs++ = GFX_OP_PIPE_CONTROL(6);
-	*cs++ = PIPE_CONTROL_GLOBAL_GTT_IVB | PIPE_CONTROL_CS_STALL |
-		PIPE_CONTROL_QW_WRITE;
+	*cs++ = flags | PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_GLOBAL_GTT_IVB;
 	*cs++ = gtt_offset;
 	*cs++ = 0;
 	*cs++ = value;
@@ -1093,7 +975,14 @@ gen8_emit_ggtt_write(u32 *cs, u32 value, u32 gtt_offset)
 	return cs;
 }
 
-void intel_engines_sanitize(struct drm_i915_private *i915);
+static inline void intel_engine_reset(struct intel_engine_cs *engine,
+				      bool stalled)
+{
+	if (engine->reset.reset)
+		engine->reset.reset(engine, stalled);
+}
+
+void intel_engines_sanitize(struct drm_i915_private *i915, bool force);
 
 bool intel_engine_is_idle(struct intel_engine_cs *engine);
 bool intel_engines_are_idle(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 4350a5270423..a017a4232c0f 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -29,6 +29,8 @@
 #include <linux/pm_runtime.h>
 #include <linux/vgaarb.h>
 
+#include <drm/drm_print.h>
+
 #include "i915_drv.h"
 #include "intel_drv.h"
 
@@ -49,6 +51,268 @@
  * present for a given platform.
  */
 
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_RUNTIME_PM)
+
+#include <linux/sort.h>
+
+#define STACKDEPTH 8
+
+static noinline depot_stack_handle_t __save_depot_stack(void)
+{
+	unsigned long entries[STACKDEPTH];
+	struct stack_trace trace = {
+		.entries = entries,
+		.max_entries = ARRAY_SIZE(entries),
+		.skip = 1,
+	};
+
+	save_stack_trace(&trace);
+	if (trace.nr_entries &&
+	    trace.entries[trace.nr_entries - 1] == ULONG_MAX)
+		trace.nr_entries--;
+
+	return depot_save_stack(&trace, GFP_NOWAIT | __GFP_NOWARN);
+}
+
+static void __print_depot_stack(depot_stack_handle_t stack,
+				char *buf, int sz, int indent)
+{
+	unsigned long entries[STACKDEPTH];
+	struct stack_trace trace = {
+		.entries = entries,
+		.max_entries = ARRAY_SIZE(entries),
+	};
+
+	depot_fetch_stack(stack, &trace);
+	snprint_stack_trace(buf, sz, &trace, indent);
+}
+
+static void init_intel_runtime_pm_wakeref(struct drm_i915_private *i915)
+{
+	struct i915_runtime_pm *rpm = &i915->runtime_pm;
+
+	spin_lock_init(&rpm->debug.lock);
+}
+
+static noinline depot_stack_handle_t
+track_intel_runtime_pm_wakeref(struct drm_i915_private *i915)
+{
+	struct i915_runtime_pm *rpm = &i915->runtime_pm;
+	depot_stack_handle_t stack, *stacks;
+	unsigned long flags;
+
+	atomic_inc(&rpm->wakeref_count);
+	assert_rpm_wakelock_held(i915);
+
+	if (!HAS_RUNTIME_PM(i915))
+		return -1;
+
+	stack = __save_depot_stack();
+	if (!stack)
+		return -1;
+
+	spin_lock_irqsave(&rpm->debug.lock, flags);
+
+	if (!rpm->debug.count)
+		rpm->debug.last_acquire = stack;
+
+	stacks = krealloc(rpm->debug.owners,
+			  (rpm->debug.count + 1) * sizeof(*stacks),
+			  GFP_NOWAIT | __GFP_NOWARN);
+	if (stacks) {
+		stacks[rpm->debug.count++] = stack;
+		rpm->debug.owners = stacks;
+	} else {
+		stack = -1;
+	}
+
+	spin_unlock_irqrestore(&rpm->debug.lock, flags);
+
+	return stack;
+}
+
+static void cancel_intel_runtime_pm_wakeref(struct drm_i915_private *i915,
+					    depot_stack_handle_t stack)
+{
+	struct i915_runtime_pm *rpm = &i915->runtime_pm;
+	unsigned long flags, n;
+	bool found = false;
+
+	if (unlikely(stack == -1))
+		return;
+
+	spin_lock_irqsave(&rpm->debug.lock, flags);
+	for (n = rpm->debug.count; n--; ) {
+		if (rpm->debug.owners[n] == stack) {
+			memmove(rpm->debug.owners + n,
+				rpm->debug.owners + n + 1,
+				(--rpm->debug.count - n) * sizeof(stack));
+			found = true;
+			break;
+		}
+	}
+	spin_unlock_irqrestore(&rpm->debug.lock, flags);
+
+	if (WARN(!found,
+		 "Unmatched wakeref (tracking %lu), count %u\n",
+		 rpm->debug.count, atomic_read(&rpm->wakeref_count))) {
+		char *buf;
+
+		buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
+		if (!buf)
+			return;
+
+		__print_depot_stack(stack, buf, PAGE_SIZE, 2);
+		DRM_DEBUG_DRIVER("wakeref %x from\n%s", stack, buf);
+
+		stack = READ_ONCE(rpm->debug.last_release);
+		if (stack) {
+			__print_depot_stack(stack, buf, PAGE_SIZE, 2);
+			DRM_DEBUG_DRIVER("wakeref last released at\n%s", buf);
+		}
+
+		kfree(buf);
+	}
+}
+
+static int cmphandle(const void *_a, const void *_b)
+{
+	const depot_stack_handle_t * const a = _a, * const b = _b;
+
+	if (*a < *b)
+		return -1;
+	else if (*a > *b)
+		return 1;
+	else
+		return 0;
+}
+
+static void
+__print_intel_runtime_pm_wakeref(struct drm_printer *p,
+				 const struct intel_runtime_pm_debug *dbg)
+{
+	unsigned long i;
+	char *buf;
+
+	buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (!buf)
+		return;
+
+	if (dbg->last_acquire) {
+		__print_depot_stack(dbg->last_acquire, buf, PAGE_SIZE, 2);
+		drm_printf(p, "Wakeref last acquired:\n%s", buf);
+	}
+
+	if (dbg->last_release) {
+		__print_depot_stack(dbg->last_release, buf, PAGE_SIZE, 2);
+		drm_printf(p, "Wakeref last released:\n%s", buf);
+	}
+
+	drm_printf(p, "Wakeref count: %lu\n", dbg->count);
+
+	sort(dbg->owners, dbg->count, sizeof(*dbg->owners), cmphandle, NULL);
+
+	for (i = 0; i < dbg->count; i++) {
+		depot_stack_handle_t stack = dbg->owners[i];
+		unsigned long rep;
+
+		rep = 1;
+		while (i + 1 < dbg->count && dbg->owners[i + 1] == stack)
+			rep++, i++;
+		__print_depot_stack(stack, buf, PAGE_SIZE, 2);
+		drm_printf(p, "Wakeref x%lu taken at:\n%s", rep, buf);
+	}
+
+	kfree(buf);
+}
+
+static noinline void
+untrack_intel_runtime_pm_wakeref(struct drm_i915_private *i915)
+{
+	struct i915_runtime_pm *rpm = &i915->runtime_pm;
+	struct intel_runtime_pm_debug dbg = {};
+	struct drm_printer p;
+	unsigned long flags;
+
+	assert_rpm_wakelock_held(i915);
+	if (atomic_dec_and_lock_irqsave(&rpm->wakeref_count,
+					&rpm->debug.lock,
+					flags)) {
+		dbg = rpm->debug;
+
+		rpm->debug.owners = NULL;
+		rpm->debug.count = 0;
+		rpm->debug.last_release = __save_depot_stack();
+
+		spin_unlock_irqrestore(&rpm->debug.lock, flags);
+	}
+	if (!dbg.count)
+		return;
+
+	p = drm_debug_printer("i915");
+	__print_intel_runtime_pm_wakeref(&p, &dbg);
+
+	kfree(dbg.owners);
+}
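
Note the atomic_dec_and_lock_irqsave() idiom above: the spinlock is taken,
and the debug state snapshotted and reported, only when this put drops the
final wakeref; any stacks still on the owners list at that point were
acquired but never released with their cookie, and are dumped through the
debug printer.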
+
+void print_intel_runtime_pm_wakeref(struct drm_i915_private *i915,
+				    struct drm_printer *p)
+{
+	struct intel_runtime_pm_debug dbg = {};
+
+	do {
+		struct i915_runtime_pm *rpm = &i915->runtime_pm;
+		unsigned long alloc = dbg.count;
+		depot_stack_handle_t *s;
+
+		spin_lock_irq(&rpm->debug.lock);
+		dbg.count = rpm->debug.count;
+		if (dbg.count <= alloc) {
+			memcpy(dbg.owners,
+			       rpm->debug.owners,
+			       dbg.count * sizeof(*s));
+		}
+		dbg.last_acquire = rpm->debug.last_acquire;
+		dbg.last_release = rpm->debug.last_release;
+		spin_unlock_irq(&rpm->debug.lock);
+		if (dbg.count <= alloc)
+			break;
+
+		s = krealloc(dbg.owners, dbg.count * sizeof(*s), GFP_KERNEL);
+		if (!s)
+			goto out;
+
+		dbg.owners = s;
+	} while (1);
+
+	__print_intel_runtime_pm_wakeref(p, &dbg);
+
+out:
+	kfree(dbg.owners);
+}
+
+#else
+
+static void init_intel_runtime_pm_wakeref(struct drm_i915_private *i915)
+{
+}
+
+static depot_stack_handle_t
+track_intel_runtime_pm_wakeref(struct drm_i915_private *i915)
+{
+	atomic_inc(&i915->runtime_pm.wakeref_count);
+	assert_rpm_wakelock_held(i915);
+	return -1;
+}
+
+static void untrack_intel_runtime_pm_wakeref(struct drm_i915_private *i915)
+{
+	assert_rpm_wakelock_held(i915);
+	atomic_dec(&i915->runtime_pm.wakeref_count);
+}
+
+#endif
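
The calling convention these helpers enable, sketched for an arbitrary user
(frobnicate_hw() is illustrative; the get/put entry points themselves are
introduced elsewhere in this patch):

	static void frobnicate_hw(struct drm_i915_private *i915)
	{
		intel_wakeref_t wakeref;

		wakeref = intel_runtime_pm_get(i915);
		/* ... MMIO access that needs the device awake ... */
		intel_runtime_pm_put(i915, wakeref);
	}

With CONFIG_DRM_I915_DEBUG_RUNTIME_PM enabled, the cookie ties each put back
to the stack that took the reference, so an unbalanced put trips the
"Unmatched wakeref" warning above.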
+
 bool intel_display_power_well_is_enabled(struct drm_i915_private *dev_priv,
 					 enum i915_power_well_id power_well_id);
 
@@ -509,7 +773,7 @@ static bool hsw_power_well_enabled(struct drm_i915_private *dev_priv,
 	 * BIOS's own request bits, which are forced-on for these power wells
 	 * when exiting DC5/6.
 	 */
-	if (IS_GEN9(dev_priv) && !IS_GEN9_LP(dev_priv) &&
+	if (IS_GEN(dev_priv, 9) && !IS_GEN9_LP(dev_priv) &&
 	    (id == SKL_DISP_PW_1 || id == SKL_DISP_PW_MISC_IO))
 		val |= I915_READ(regs->bios);
 
@@ -639,10 +903,10 @@ void gen9_sanitize_dc_state(struct drm_i915_private *dev_priv)
  * back on and register state is restored. This is guaranteed by the MMIO write
  * to DC_STATE_EN blocking until the state is restored.
  */
-static void gen9_set_dc_state(struct drm_i915_private *dev_priv, uint32_t state)
+static void gen9_set_dc_state(struct drm_i915_private *dev_priv, u32 state)
 {
-	uint32_t val;
-	uint32_t mask;
+	u32 val;
+	u32 mask;
 
 	if (WARN_ON_ONCE(state & ~dev_priv->csr.allowed_dc_mask))
 		state &= dev_priv->csr.allowed_dc_mask;
@@ -1274,7 +1538,7 @@ static void chv_dpio_cmn_power_well_enable(struct drm_i915_private *dev_priv,
 {
 	enum dpio_phy phy;
 	enum pipe pipe;
-	uint32_t tmp;
+	u32 tmp;
 
 	WARN_ON_ONCE(power_well->desc->id != VLV_DISP_PW_DPIO_CMN_BC &&
 		     power_well->desc->id != CHV_DISP_PW_DPIO_CMN_D);
@@ -1591,18 +1855,19 @@ __intel_display_power_get_domain(struct drm_i915_private *dev_priv,
  * Any power domain reference obtained by this function must have a symmetric
  * call to intel_display_power_put() to release the reference again.
  */
-void intel_display_power_get(struct drm_i915_private *dev_priv,
-			     enum intel_display_power_domain domain)
+intel_wakeref_t intel_display_power_get(struct drm_i915_private *dev_priv,
+					enum intel_display_power_domain domain)
 {
 	struct i915_power_domains *power_domains = &dev_priv->power_domains;
-
-	intel_runtime_pm_get(dev_priv);
+	intel_wakeref_t wakeref = intel_runtime_pm_get(dev_priv);
 
 	mutex_lock(&power_domains->lock);
 
 	__intel_display_power_get_domain(dev_priv, domain);
 
 	mutex_unlock(&power_domains->lock);
+
+	return wakeref;
 }
 
 /**
@@ -1617,13 +1882,16 @@ void intel_display_power_get(struct drm_i915_private *dev_priv,
  * Any power domain reference obtained by this function must have a symmetric
  * call to intel_display_power_put() to release the reference again.
  */
-bool intel_display_power_get_if_enabled(struct drm_i915_private *dev_priv,
-					enum intel_display_power_domain domain)
+intel_wakeref_t
+intel_display_power_get_if_enabled(struct drm_i915_private *dev_priv,
+				   enum intel_display_power_domain domain)
 {
 	struct i915_power_domains *power_domains = &dev_priv->power_domains;
+	intel_wakeref_t wakeref;
 	bool is_enabled;
 
-	if (!intel_runtime_pm_get_if_in_use(dev_priv))
+	wakeref = intel_runtime_pm_get_if_in_use(dev_priv);
+	if (!wakeref)
 		return false;
 
 	mutex_lock(&power_domains->lock);
@@ -1637,23 +1905,16 @@ bool intel_display_power_get_if_enabled(struct drm_i915_private *dev_priv,
 
 	mutex_unlock(&power_domains->lock);
 
-	if (!is_enabled)
-		intel_runtime_pm_put(dev_priv);
+	if (!is_enabled) {
+		intel_runtime_pm_put(dev_priv, wakeref);
+		wakeref = 0;
+	}
 
-	return is_enabled;
+	return wakeref;
 }
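
Callers are expected to treat the returned cookie as both the reference and
the enabled test, along the lines of (hypothetical caller):

	wakeref = intel_display_power_get_if_enabled(dev_priv, domain);
	if (!wakeref)
		return; /* power well is down; nothing to read out */
	/* ... access hardware behind the power well ... */
	intel_display_power_put(dev_priv, domain, wakeref);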
 
-/**
- * intel_display_power_put - release a power domain reference
- * @dev_priv: i915 device instance
- * @domain: power domain to reference
- *
- * This function drops the power domain reference obtained by
- * intel_display_power_get() and might power down the corresponding hardware
- * block right away if this is the last reference.
- */
-void intel_display_power_put(struct drm_i915_private *dev_priv,
-			     enum intel_display_power_domain domain)
+static void __intel_display_power_put(struct drm_i915_private *dev_priv,
+				      enum intel_display_power_domain domain)
 {
 	struct i915_power_domains *power_domains;
 	struct i915_power_well *power_well;
@@ -1671,9 +1932,33 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
 		intel_power_well_put(dev_priv, power_well);
 
 	mutex_unlock(&power_domains->lock);
+}
+
+/**
+ * intel_display_power_put - release a power domain reference
+ * @dev_priv: i915 device instance
+ * @domain: power domain to reference
+ *
+ * This function drops the power domain reference obtained by
+ * intel_display_power_get() and might power down the corresponding hardware
+ * block right away if this is the last reference.
+ */
+void intel_display_power_put_unchecked(struct drm_i915_private *dev_priv,
+				       enum intel_display_power_domain domain)
+{
+	__intel_display_power_put(dev_priv, domain);
+	intel_runtime_pm_put_unchecked(dev_priv);
+}
 
-	intel_runtime_pm_put(dev_priv);
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_RUNTIME_PM)
+void intel_display_power_put(struct drm_i915_private *dev_priv,
+			     enum intel_display_power_domain domain,
+			     intel_wakeref_t wakeref)
+{
+	__intel_display_power_put(dev_priv, domain);
+	intel_runtime_pm_put(dev_priv, wakeref);
 }
+#endif
 
 #define I830_PIPES_POWER_DOMAINS (		\
 	BIT_ULL(POWER_DOMAIN_PIPE_A) |		\
@@ -3043,10 +3328,10 @@ sanitize_disable_power_well_option(const struct drm_i915_private *dev_priv,
 	return 1;
 }
 
-static uint32_t get_allowed_dc_mask(const struct drm_i915_private *dev_priv,
-				    int enable_dc)
+static u32 get_allowed_dc_mask(const struct drm_i915_private *dev_priv,
+			       int enable_dc)
 {
-	uint32_t mask;
+	u32 mask;
 	int requested_dc;
 	int max_dc;
 
@@ -3058,7 +3343,7 @@ static uint32_t get_allowed_dc_mask(const struct drm_i915_private *dev_priv,
 		 * suspend/resume, so allow it unconditionally.
 		 */
 		mask = DC_STATE_EN_DC9;
-	} else if (IS_GEN10(dev_priv) || IS_GEN9_BC(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 10) || IS_GEN9_BC(dev_priv)) {
 		max_dc = 2;
 		mask = 0;
 	} else if (IS_GEN9_LP(dev_priv)) {
@@ -3311,7 +3596,7 @@ static void icl_dbuf_disable(struct drm_i915_private *dev_priv)
 
 static void icl_mbus_init(struct drm_i915_private *dev_priv)
 {
-	uint32_t val;
+	u32 val;
 
 	val = MBUS_ABOX_BT_CREDIT_POOL1(16) |
 	      MBUS_ABOX_BT_CREDIT_POOL2(16) |
@@ -3622,7 +3907,7 @@ static void chv_phy_control_init(struct drm_i915_private *dev_priv)
 	 * current lane status.
 	 */
 	if (cmn_bc->desc->ops->is_enabled(dev_priv, cmn_bc)) {
-		uint32_t status = I915_READ(DPLL(PIPE_A));
+		u32 status = I915_READ(DPLL(PIPE_A));
 		unsigned int mask;
 
 		mask = status & DPLL_PORTB_READY_MASK;
@@ -3653,7 +3938,7 @@ static void chv_phy_control_init(struct drm_i915_private *dev_priv)
 	}
 
 	if (cmn_d->desc->ops->is_enabled(dev_priv, cmn_d)) {
-		uint32_t status = I915_READ(DPIO_PHY_STATUS);
+		u32 status = I915_READ(DPIO_PHY_STATUS);
 		unsigned int mask;
 
 		mask = status & DPLL_PORTD_READY_MASK;
@@ -3712,7 +3997,7 @@ static void intel_power_domains_verify_state(struct drm_i915_private *dev_priv);
 
 /**
  * intel_power_domains_init_hw - initialize hardware power domain state
- * @dev_priv: i915 device instance
+ * @i915: i915 device instance
  * @resume: Called from resume code paths or not
  *
  * This function initializes the hardware power domain state and enables all
@@ -3726,30 +4011,31 @@ static void intel_power_domains_verify_state(struct drm_i915_private *dev_priv);
  * intel_power_domains_enable()) and must be paired with
  * intel_power_domains_fini_hw().
  */
-void intel_power_domains_init_hw(struct drm_i915_private *dev_priv, bool resume)
+void intel_power_domains_init_hw(struct drm_i915_private *i915, bool resume)
 {
-	struct i915_power_domains *power_domains = &dev_priv->power_domains;
+	struct i915_power_domains *power_domains = &i915->power_domains;
 
 	power_domains->initializing = true;
 
-	if (IS_ICELAKE(dev_priv)) {
-		icl_display_core_init(dev_priv, resume);
-	} else if (IS_CANNONLAKE(dev_priv)) {
-		cnl_display_core_init(dev_priv, resume);
-	} else if (IS_GEN9_BC(dev_priv)) {
-		skl_display_core_init(dev_priv, resume);
-	} else if (IS_GEN9_LP(dev_priv)) {
-		bxt_display_core_init(dev_priv, resume);
-	} else if (IS_CHERRYVIEW(dev_priv)) {
+	if (IS_ICELAKE(i915)) {
+		icl_display_core_init(i915, resume);
+	} else if (IS_CANNONLAKE(i915)) {
+		cnl_display_core_init(i915, resume);
+	} else if (IS_GEN9_BC(i915)) {
+		skl_display_core_init(i915, resume);
+	} else if (IS_GEN9_LP(i915)) {
+		bxt_display_core_init(i915, resume);
+	} else if (IS_CHERRYVIEW(i915)) {
 		mutex_lock(&power_domains->lock);
-		chv_phy_control_init(dev_priv);
+		chv_phy_control_init(i915);
 		mutex_unlock(&power_domains->lock);
-	} else if (IS_VALLEYVIEW(dev_priv)) {
+	} else if (IS_VALLEYVIEW(i915)) {
 		mutex_lock(&power_domains->lock);
-		vlv_cmnlane_wa(dev_priv);
+		vlv_cmnlane_wa(i915);
 		mutex_unlock(&power_domains->lock);
-	} else if (IS_IVYBRIDGE(dev_priv) || INTEL_GEN(dev_priv) >= 7)
-		intel_pch_reset_handshake(dev_priv, !HAS_PCH_NOP(dev_priv));
+	} else if (IS_IVYBRIDGE(i915) || INTEL_GEN(i915) >= 7) {
+		intel_pch_reset_handshake(i915, !HAS_PCH_NOP(i915));
+	}
 
 	/*
 	 * Keep all power wells enabled for any dependent HW access during
@@ -3757,18 +4043,20 @@ void intel_power_domains_init_hw(struct drm_i915_private *dev_priv, bool resume)
 	 * resources powered until display HW readout is complete. We drop
 	 * this reference in intel_power_domains_enable().
 	 */
-	intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
+	power_domains->wakeref =
+		intel_display_power_get(i915, POWER_DOMAIN_INIT);
+
 	/* Disable power support if the user asked so. */
 	if (!i915_modparams.disable_power_well)
-		intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
-	intel_power_domains_sync_hw(dev_priv);
+		intel_display_power_get(i915, POWER_DOMAIN_INIT);
+	intel_power_domains_sync_hw(i915);
 
 	power_domains->initializing = false;
 }
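
/*
 * Lifecycle sketch (editor's addition, hypothetical caller): the
 * POWER_DOMAIN_INIT wakeref now travels explicitly through
 * power_domains->wakeref instead of being held implicitly, so a leaked or
 * double-dropped init reference trips a WARN.
 */
static void example_driver_lifecycle(struct drm_i915_private *i915)
{
	intel_power_domains_init_hw(i915, false); /* takes ->wakeref */
	intel_power_domains_enable(i915);         /* drops ->wakeref */

	/* ... normal operation, power wells toggle on demand ... */

	intel_power_domains_disable(i915);        /* re-takes ->wakeref */
	intel_power_domains_fini_hw(i915);        /* cancels its rpm half,
						   * leaving the well on */
}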
 
 /**
  * intel_power_domains_fini_hw - deinitialize hw power domain state
- * @dev_priv: i915 device instance
+ * @i915: i915 device instance
  *
  * De-initializes the display power domain HW state. It also ensures that the
  * device stays powered up so that the driver can be reloaded.
@@ -3777,21 +4065,24 @@ void intel_power_domains_init_hw(struct drm_i915_private *dev_priv, bool resume)
  * intel_power_domains_disable()) and must be paired with
  * intel_power_domains_init_hw().
  */
-void intel_power_domains_fini_hw(struct drm_i915_private *dev_priv)
+void intel_power_domains_fini_hw(struct drm_i915_private *i915)
 {
-	/* Keep the power well enabled, but cancel its rpm wakeref. */
-	intel_runtime_pm_put(dev_priv);
+	intel_wakeref_t wakeref __maybe_unused =
+		fetch_and_zero(&i915->power_domains.wakeref);
 
 	/* Remove the refcount we took to keep power well support disabled. */
 	if (!i915_modparams.disable_power_well)
-		intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
+		intel_display_power_put_unchecked(i915, POWER_DOMAIN_INIT);
+
+	intel_power_domains_verify_state(i915);
 
-	intel_power_domains_verify_state(dev_priv);
+	/* Keep the power well enabled, but cancel its rpm wakeref. */
+	intel_runtime_pm_put(i915, wakeref);
 }
 
 /**
  * intel_power_domains_enable - enable toggling of display power wells
- * @dev_priv: i915 device instance
+ * @i915: i915 device instance
  *
  * Enable the on-demand enabling/disabling of the display power wells. Note that
  * power wells not belonging to POWER_DOMAIN_INIT are allowed to be toggled
@@ -3801,30 +4092,36 @@ void intel_power_domains_fini_hw(struct drm_i915_private *dev_priv)
  * of display HW readout (which will acquire the power references reflecting
  * the current HW state).
  */
-void intel_power_domains_enable(struct drm_i915_private *dev_priv)
+void intel_power_domains_enable(struct drm_i915_private *i915)
 {
-	intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
+	intel_wakeref_t wakeref __maybe_unused =
+		fetch_and_zero(&i915->power_domains.wakeref);
 
-	intel_power_domains_verify_state(dev_priv);
+	intel_display_power_put(i915, POWER_DOMAIN_INIT, wakeref);
+	intel_power_domains_verify_state(i915);
 }
 
 /**
  * intel_power_domains_disable - disable toggling of display power wells
- * @dev_priv: i915 device instance
+ * @i915: i915 device instance
  *
  * Disable the on-demand enabling/disabling of the display power wells. See
  * intel_power_domains_enable() for which power wells this call controls.
  */
-void intel_power_domains_disable(struct drm_i915_private *dev_priv)
+void intel_power_domains_disable(struct drm_i915_private *i915)
 {
-	intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
+	struct i915_power_domains *power_domains = &i915->power_domains;
 
-	intel_power_domains_verify_state(dev_priv);
+	WARN_ON(power_domains->wakeref);
+	power_domains->wakeref =
+		intel_display_power_get(i915, POWER_DOMAIN_INIT);
+
+	intel_power_domains_verify_state(i915);
 }
 
 /**
  * intel_power_domains_suspend - suspend power domain state
- * @dev_priv: i915 device instance
+ * @i915: i915 device instance
  * @suspend_mode: specifies the target suspend state (idle, mem, hibernation)
  *
  * This function prepares the hardware power domain state before entering
@@ -3833,12 +4130,14 @@ void intel_power_domains_disable(struct drm_i915_private *dev_priv)
  * It must be called with power domains already disabled (after a call to
  * intel_power_domains_disable()) and paired with intel_power_domains_resume().
  */
-void intel_power_domains_suspend(struct drm_i915_private *dev_priv,
+void intel_power_domains_suspend(struct drm_i915_private *i915,
 				 enum i915_drm_suspend_mode suspend_mode)
 {
-	struct i915_power_domains *power_domains = &dev_priv->power_domains;
+	struct i915_power_domains *power_domains = &i915->power_domains;
+	intel_wakeref_t wakeref __maybe_unused =
+		fetch_and_zero(&power_domains->wakeref);
 
-	intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
+	intel_display_power_put(i915, POWER_DOMAIN_INIT, wakeref);
 
 	/*
 	 * In case of suspend-to-idle (aka S0ix) on a DMC platform without DC9
@@ -3847,10 +4146,10 @@ void intel_power_domains_suspend(struct drm_i915_private *dev_priv,
 	 * resources as required and also enable deeper system power states
 	 * that would be blocked if the firmware was inactive.
 	 */
-	if (!(dev_priv->csr.allowed_dc_mask & DC_STATE_EN_DC9) &&
+	if (!(i915->csr.allowed_dc_mask & DC_STATE_EN_DC9) &&
 	    suspend_mode == I915_DRM_SUSPEND_IDLE &&
-	    dev_priv->csr.dmc_payload != NULL) {
-		intel_power_domains_verify_state(dev_priv);
+	    i915->csr.dmc_payload) {
+		intel_power_domains_verify_state(i915);
 		return;
 	}
 
@@ -3859,25 +4158,25 @@ void intel_power_domains_suspend(struct drm_i915_private *dev_priv,
 	 * power wells if power domains must be deinitialized for suspend.
 	 */
 	if (!i915_modparams.disable_power_well) {
-		intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
-		intel_power_domains_verify_state(dev_priv);
+		intel_display_power_put_unchecked(i915, POWER_DOMAIN_INIT);
+		intel_power_domains_verify_state(i915);
 	}
 
-	if (IS_ICELAKE(dev_priv))
-		icl_display_core_uninit(dev_priv);
-	else if (IS_CANNONLAKE(dev_priv))
-		cnl_display_core_uninit(dev_priv);
-	else if (IS_GEN9_BC(dev_priv))
-		skl_display_core_uninit(dev_priv);
-	else if (IS_GEN9_LP(dev_priv))
-		bxt_display_core_uninit(dev_priv);
+	if (IS_ICELAKE(i915))
+		icl_display_core_uninit(i915);
+	else if (IS_CANNONLAKE(i915))
+		cnl_display_core_uninit(i915);
+	else if (IS_GEN9_BC(i915))
+		skl_display_core_uninit(i915);
+	else if (IS_GEN9_LP(i915))
+		bxt_display_core_uninit(i915);
 
 	power_domains->display_core_suspended = true;
 }
 
 /**
  * intel_power_domains_resume - resume power domain state
- * @dev_priv: i915 device instance
+ * @i915: i915 device instance
  *
  * This function resumes the hardware power domain state during system resume.
  *
@@ -3885,28 +4184,30 @@ void intel_power_domains_suspend(struct drm_i915_private *dev_priv,
  * intel_power_domains_enable()) and must be paired with
  * intel_power_domains_suspend().
  */
-void intel_power_domains_resume(struct drm_i915_private *dev_priv)
+void intel_power_domains_resume(struct drm_i915_private *i915)
 {
-	struct i915_power_domains *power_domains = &dev_priv->power_domains;
+	struct i915_power_domains *power_domains = &i915->power_domains;
 
 	if (power_domains->display_core_suspended) {
-		intel_power_domains_init_hw(dev_priv, true);
+		intel_power_domains_init_hw(i915, true);
 		power_domains->display_core_suspended = false;
 	} else {
-		intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
+		WARN_ON(power_domains->wakeref);
+		power_domains->wakeref =
+			intel_display_power_get(i915, POWER_DOMAIN_INIT);
 	}
 
-	intel_power_domains_verify_state(dev_priv);
+	intel_power_domains_verify_state(i915);
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_RUNTIME_PM)
 
-static void intel_power_domains_dump_info(struct drm_i915_private *dev_priv)
+static void intel_power_domains_dump_info(struct drm_i915_private *i915)
 {
-	struct i915_power_domains *power_domains = &dev_priv->power_domains;
+	struct i915_power_domains *power_domains = &i915->power_domains;
 	struct i915_power_well *power_well;
 
-	for_each_power_well(dev_priv, power_well) {
+	for_each_power_well(i915, power_well) {
 		enum intel_display_power_domain domain;
 
 		DRM_DEBUG_DRIVER("%-25s %d\n",
@@ -3921,7 +4222,7 @@ static void intel_power_domains_dump_info(struct drm_i915_private *dev_priv)
 
 /**
  * intel_power_domains_verify_state - verify the HW/SW state for all power wells
- * @dev_priv: i915 device instance
+ * @i915: i915 device instance
  *
  * Verify if the reference count of each power well matches its HW enabled
  * state and the total refcount of the domains it belongs to. This must be
@@ -3929,22 +4230,21 @@ static void intel_power_domains_dump_info(struct drm_i915_private *dev_priv)
  * acquiring reference counts for any power wells in use and disabling the
  * ones left on by BIOS but not required by any active output.
  */
-static void intel_power_domains_verify_state(struct drm_i915_private *dev_priv)
+static void intel_power_domains_verify_state(struct drm_i915_private *i915)
 {
-	struct i915_power_domains *power_domains = &dev_priv->power_domains;
+	struct i915_power_domains *power_domains = &i915->power_domains;
 	struct i915_power_well *power_well;
 	bool dump_domain_info;
 
 	mutex_lock(&power_domains->lock);
 
 	dump_domain_info = false;
-	for_each_power_well(dev_priv, power_well) {
+	for_each_power_well(i915, power_well) {
 		enum intel_display_power_domain domain;
 		int domains_count;
 		bool enabled;
 
-		enabled = power_well->desc->ops->is_enabled(dev_priv,
-							    power_well);
+		enabled = power_well->desc->ops->is_enabled(i915, power_well);
 		if ((power_well->count || power_well->desc->always_on) !=
 		    enabled)
 			DRM_ERROR("power well %s state mismatch (refcount %d/enabled %d)",
@@ -3968,7 +4268,7 @@ static void intel_power_domains_verify_state(struct drm_i915_private *dev_priv)
 		static bool dumped;
 
 		if (!dumped) {
-			intel_power_domains_dump_info(dev_priv);
+			intel_power_domains_dump_info(i915);
 			dumped = true;
 		}
 	}
@@ -3978,7 +4278,7 @@ static void intel_power_domains_verify_state(struct drm_i915_private *dev_priv)
 
 #else
 
-static void intel_power_domains_verify_state(struct drm_i915_private *dev_priv)
+static void intel_power_domains_verify_state(struct drm_i915_private *i915)
 {
 }
 
@@ -3986,30 +4286,31 @@ static void intel_power_domains_verify_state(struct drm_i915_private *dev_priv)
 
 /**
  * intel_runtime_pm_get - grab a runtime pm reference
- * @dev_priv: i915 device instance
+ * @i915: i915 device instance
  *
  * This function grabs a device-level runtime pm reference (mostly used for GEM
  * code to ensure the GTT or GT is on) and ensures that it is powered up.
  *
  * Any runtime pm reference obtained by this function must have a symmetric
  * call to intel_runtime_pm_put() to release the reference again.
+ *
+ * Returns: the wakeref cookie to pass to intel_runtime_pm_put()
  */
-void intel_runtime_pm_get(struct drm_i915_private *dev_priv)
+intel_wakeref_t intel_runtime_pm_get(struct drm_i915_private *i915)
 {
-	struct pci_dev *pdev = dev_priv->drm.pdev;
+	struct pci_dev *pdev = i915->drm.pdev;
 	struct device *kdev = &pdev->dev;
 	int ret;
 
 	ret = pm_runtime_get_sync(kdev);
 	WARN_ONCE(ret < 0, "pm_runtime_get_sync() failed: %d\n", ret);
 
-	atomic_inc(&dev_priv->runtime_pm.wakeref_count);
-	assert_rpm_wakelock_held(dev_priv);
+	return track_intel_runtime_pm_wakeref(i915);
 }
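
/*
 * Minimal sketch (editor's addition, hypothetical helper): callers carry
 * the returned cookie to the matching put, which is what lets the debug
 * tracker attribute any wakeref that is never released.
 */
static void example_runtime_pm_user(struct drm_i915_private *i915)
{
	intel_wakeref_t wakeref;

	wakeref = intel_runtime_pm_get(i915);
	/* ... GT/GTT access that requires the device to be awake ... */
	intel_runtime_pm_put(i915, wakeref);
}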
 
 /**
  * intel_runtime_pm_get_if_in_use - grab a runtime pm reference if device in use
- * @dev_priv: i915 device instance
+ * @i915: i915 device instance
  *
  * This function grabs a device-level runtime pm reference if the device is
  * already in use and ensures that it is powered up. It is illegal to try
@@ -4018,12 +4319,13 @@ void intel_runtime_pm_get(struct drm_i915_private *dev_priv)
  * Any runtime pm reference obtained by this function must have a symmetric
  * call to intel_runtime_pm_put() to release the reference again.
  *
- * Returns: True if the wakeref was acquired, or False otherwise.
+ * Returns: the wakeref cookie to pass to intel_runtime_pm_put(); it evaluates
+ * as true if the wakeref was acquired, or as false (0) otherwise.
  */
-bool intel_runtime_pm_get_if_in_use(struct drm_i915_private *dev_priv)
+intel_wakeref_t intel_runtime_pm_get_if_in_use(struct drm_i915_private *i915)
 {
 	if (IS_ENABLED(CONFIG_PM)) {
-		struct pci_dev *pdev = dev_priv->drm.pdev;
+		struct pci_dev *pdev = i915->drm.pdev;
 		struct device *kdev = &pdev->dev;
 
 		/*
@@ -4033,18 +4335,15 @@ bool intel_runtime_pm_get_if_in_use(struct drm_i915_private *dev_priv)
 		 * atm to the late/early system suspend/resume handlers.
 		 */
 		if (pm_runtime_get_if_in_use(kdev) <= 0)
-			return false;
+			return 0;
 	}
 
-	atomic_inc(&dev_priv->runtime_pm.wakeref_count);
-	assert_rpm_wakelock_held(dev_priv);
-
-	return true;
+	return track_intel_runtime_pm_wakeref(i915);
 }
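
/*
 * Sketch of the conditional variant (editor's addition, hypothetical
 * helper): a zero cookie means the device was idle and no reference was
 * taken, so there is nothing to release.
 */
static void example_opportunistic_access(struct drm_i915_private *i915)
{
	intel_wakeref_t wakeref;

	wakeref = intel_runtime_pm_get_if_in_use(i915);
	if (!wakeref)
		return; /* device suspended; skip the optional work */

	/* ... lightweight hardware access ... */
	intel_runtime_pm_put(i915, wakeref);
}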
 
 /**
  * intel_runtime_pm_get_noresume - grab a runtime pm reference
- * @dev_priv: i915 device instance
+ * @i915: i915 device instance
  *
  * This function grabs a device-level runtime pm reference (mostly used for GEM
  * code to ensure the GTT or GT is on).
@@ -4058,41 +4357,50 @@ bool intel_runtime_pm_get_if_in_use(struct drm_i915_private *dev_priv)
  *
  * Any runtime pm reference obtained by this function must have a symmetric
  * call to intel_runtime_pm_put() to release the reference again.
+ *
+ * Returns: the wakeref cookie to pass to intel_runtime_pm_put()
  */
-void intel_runtime_pm_get_noresume(struct drm_i915_private *dev_priv)
+intel_wakeref_t intel_runtime_pm_get_noresume(struct drm_i915_private *i915)
 {
-	struct pci_dev *pdev = dev_priv->drm.pdev;
+	struct pci_dev *pdev = i915->drm.pdev;
 	struct device *kdev = &pdev->dev;
 
-	assert_rpm_wakelock_held(dev_priv);
+	assert_rpm_wakelock_held(i915);
 	pm_runtime_get_noresume(kdev);
 
-	atomic_inc(&dev_priv->runtime_pm.wakeref_count);
+	return track_intel_runtime_pm_wakeref(i915);
 }
 
 /**
  * intel_runtime_pm_put - release a runtime pm reference
- * @dev_priv: i915 device instance
+ * @i915: i915 device instance
  *
  * This function drops the device-level runtime pm reference obtained by
  * intel_runtime_pm_get() and might power down the corresponding
  * hardware block right away if this is the last reference.
  */
-void intel_runtime_pm_put(struct drm_i915_private *dev_priv)
+void intel_runtime_pm_put_unchecked(struct drm_i915_private *i915)
 {
-	struct pci_dev *pdev = dev_priv->drm.pdev;
+	struct pci_dev *pdev = i915->drm.pdev;
 	struct device *kdev = &pdev->dev;
 
-	assert_rpm_wakelock_held(dev_priv);
-	atomic_dec(&dev_priv->runtime_pm.wakeref_count);
+	untrack_intel_runtime_pm_wakeref(i915);
 
 	pm_runtime_mark_last_busy(kdev);
 	pm_runtime_put_autosuspend(kdev);
 }
 
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_RUNTIME_PM)
+void intel_runtime_pm_put(struct drm_i915_private *i915, intel_wakeref_t wref)
+{
+	cancel_intel_runtime_pm_wakeref(i915, wref);
+	intel_runtime_pm_put_unchecked(i915);
+}
+#endif
+
 /**
  * intel_runtime_pm_enable - enable runtime pm
- * @dev_priv: i915 device instance
+ * @i915: i915 device instance
  *
  * This function enables runtime pm at the end of the driver load sequence.
  *
@@ -4100,9 +4408,9 @@ void intel_runtime_pm_put(struct drm_i915_private *dev_priv)
  * subordinate display power domains. That is done by
  * intel_power_domains_enable().
  */
-void intel_runtime_pm_enable(struct drm_i915_private *dev_priv)
+void intel_runtime_pm_enable(struct drm_i915_private *i915)
 {
-	struct pci_dev *pdev = dev_priv->drm.pdev;
+	struct pci_dev *pdev = i915->drm.pdev;
 	struct device *kdev = &pdev->dev;
 
 	/*
@@ -4124,7 +4432,7 @@ void intel_runtime_pm_enable(struct drm_i915_private *dev_priv)
 	 * so the driver's own RPM reference tracking asserts also work on
 	 * platforms without RPM support.
 	 */
-	if (!HAS_RUNTIME_PM(dev_priv)) {
+	if (!HAS_RUNTIME_PM(i915)) {
 		int ret;
 
 		pm_runtime_dont_use_autosuspend(kdev);
@@ -4142,17 +4450,35 @@ void intel_runtime_pm_enable(struct drm_i915_private *dev_priv)
 	pm_runtime_put_autosuspend(kdev);
 }
 
-void intel_runtime_pm_disable(struct drm_i915_private *dev_priv)
+void intel_runtime_pm_disable(struct drm_i915_private *i915)
 {
-	struct pci_dev *pdev = dev_priv->drm.pdev;
+	struct pci_dev *pdev = i915->drm.pdev;
 	struct device *kdev = &pdev->dev;
 
 	/* Transfer rpm ownership back to core */
-	WARN(pm_runtime_get_sync(&dev_priv->drm.pdev->dev) < 0,
+	WARN(pm_runtime_get_sync(kdev) < 0,
 	     "Failed to pass rpm ownership back to core\n");
 
 	pm_runtime_dont_use_autosuspend(kdev);
 
-	if (!HAS_RUNTIME_PM(dev_priv))
+	if (!HAS_RUNTIME_PM(i915))
 		pm_runtime_put(kdev);
 }
+
+void intel_runtime_pm_cleanup(struct drm_i915_private *i915)
+{
+	struct i915_runtime_pm *rpm = &i915->runtime_pm;
+	int count;
+
+	count = atomic_fetch_inc(&rpm->wakeref_count); /* balance untrack */
+	WARN(count,
+	     "i915->runtime_pm.wakeref_count=%d on cleanup\n",
+	     count);
+
+	untrack_intel_runtime_pm_wakeref(i915);
+}
+
+void intel_runtime_pm_init_early(struct drm_i915_private *i915)
+{
+	init_intel_runtime_pm_wakeref(i915);
+}
diff --git a/drivers/gpu/drm/i915/intel_sdvo.c b/drivers/gpu/drm/i915/intel_sdvo.c
index 5805ec1aba12..e7b0884ba5a5 100644
--- a/drivers/gpu/drm/i915/intel_sdvo.c
+++ b/drivers/gpu/drm/i915/intel_sdvo.c
@@ -29,7 +29,6 @@
 #include <linux/slab.h>
 #include <linux/delay.h>
 #include <linux/export.h>
-#include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
 #include <drm/drm_edid.h>
@@ -77,7 +76,7 @@ struct intel_sdvo {
 	i915_reg_t sdvo_reg;
 
 	/* Active outputs controlled by this SDVO output */
-	uint16_t controlled_output;
+	u16 controlled_output;
 
 	/*
 	 * Capabilities of the SDVO device returned by
@@ -92,33 +91,32 @@ struct intel_sdvo {
 	* For multiple function SDVO device,
 	* this is for current attached outputs.
 	*/
-	uint16_t attached_output;
+	u16 attached_output;
 
 	/*
 	 * Hotplug activation bits for this device
 	 */
-	uint16_t hotplug_active;
+	u16 hotplug_active;
 
 	enum port port;
 
 	bool has_hdmi_monitor;
 	bool has_hdmi_audio;
-	bool rgb_quant_range_selectable;
 
 	/* DDC bus used by this SDVO encoder */
-	uint8_t ddc_bus;
+	u8 ddc_bus;
 
 	/*
 	 * the sdvo flag gets lost in round trip: dtd->adjusted_mode->dtd
 	 */
-	uint8_t dtd_sdvo_flags;
+	u8 dtd_sdvo_flags;
 };
 
 struct intel_sdvo_connector {
 	struct intel_connector base;
 
 	/* Mark the type of connector */
-	uint16_t output_flag;
+	u16 output_flag;
 
 	/* This contains all current supported TV format */
 	u8 tv_format_supported[TV_FORMAT_NUM];
@@ -186,7 +184,7 @@ to_intel_sdvo_connector(struct drm_connector *connector)
 	container_of((conn_state), struct intel_sdvo_connector_state, base.base)
 
 static bool
-intel_sdvo_output_setup(struct intel_sdvo *intel_sdvo, uint16_t flags);
+intel_sdvo_output_setup(struct intel_sdvo *intel_sdvo, u16 flags);
 static bool
 intel_sdvo_tv_create_property(struct intel_sdvo *intel_sdvo,
 			      struct intel_sdvo_connector *intel_sdvo_connector,
@@ -748,9 +746,9 @@ static bool intel_sdvo_get_input_timing(struct intel_sdvo *intel_sdvo,
 static bool
 intel_sdvo_create_preferred_input_timing(struct intel_sdvo *intel_sdvo,
 					 struct intel_sdvo_connector *intel_sdvo_connector,
-					 uint16_t clock,
-					 uint16_t width,
-					 uint16_t height)
+					 u16 clock,
+					 u16 width,
+					 u16 height)
 {
 	struct intel_sdvo_preferred_input_timing_args args;
 
@@ -793,9 +791,9 @@ static bool intel_sdvo_set_clock_rate_mult(struct intel_sdvo *intel_sdvo, u8 val
 static void intel_sdvo_get_dtd_from_mode(struct intel_sdvo_dtd *dtd,
 					 const struct drm_display_mode *mode)
 {
-	uint16_t width, height;
-	uint16_t h_blank_len, h_sync_len, v_blank_len, v_sync_len;
-	uint16_t h_sync_offset, v_sync_offset;
+	u16 width, height;
+	u16 h_blank_len, h_sync_len, v_blank_len, v_sync_len;
+	u16 h_sync_offset, v_sync_offset;
 	int mode_clock;
 
 	memset(dtd, 0, sizeof(*dtd));
@@ -900,13 +898,13 @@ static bool intel_sdvo_check_supp_encode(struct intel_sdvo *intel_sdvo)
 }
 
 static bool intel_sdvo_set_encode(struct intel_sdvo *intel_sdvo,
-				  uint8_t mode)
+				  u8 mode)
 {
 	return intel_sdvo_set_value(intel_sdvo, SDVO_CMD_SET_ENCODE, &mode, 1);
 }
 
 static bool intel_sdvo_set_colorimetry(struct intel_sdvo *intel_sdvo,
-				       uint8_t mode)
+				       u8 mode)
 {
 	return intel_sdvo_set_value(intel_sdvo, SDVO_CMD_SET_COLORIMETRY, &mode, 1);
 }
@@ -915,11 +913,11 @@ static bool intel_sdvo_set_colorimetry(struct intel_sdvo *intel_sdvo,
 static void intel_sdvo_dump_hdmi_buf(struct intel_sdvo *intel_sdvo)
 {
 	int i, j;
-	uint8_t set_buf_index[2];
-	uint8_t av_split;
-	uint8_t buf_size;
-	uint8_t buf[48];
-	uint8_t *pos;
+	u8 set_buf_index[2];
+	u8 av_split;
+	u8 buf_size;
+	u8 buf[48];
+	u8 *pos;
 
 	intel_sdvo_get_value(intel_sdvo, SDVO_CMD_GET_HBUF_AV_SPLIT, &av_split, 1);
 
@@ -942,11 +940,11 @@ static void intel_sdvo_dump_hdmi_buf(struct intel_sdvo *intel_sdvo)
 #endif
 
 static bool intel_sdvo_write_infoframe(struct intel_sdvo *intel_sdvo,
-				       unsigned if_index, uint8_t tx_rate,
-				       const uint8_t *data, unsigned length)
+				       unsigned int if_index, u8 tx_rate,
+				       const u8 *data, unsigned int length)
 {
-	uint8_t set_buf_index[2] = { if_index, 0 };
-	uint8_t hbuf_size, tmp[8];
+	u8 set_buf_index[2] = { if_index, 0 };
+	u8 hbuf_size, tmp[8];
 	int i;
 
 	if (!intel_sdvo_set_value(intel_sdvo,
@@ -981,29 +979,30 @@ static bool intel_sdvo_write_infoframe(struct intel_sdvo *intel_sdvo,
 }
 
 static bool intel_sdvo_set_avi_infoframe(struct intel_sdvo *intel_sdvo,
-					 const struct intel_crtc_state *pipe_config)
+					 const struct intel_crtc_state *pipe_config,
+					 const struct drm_connector_state *conn_state)
 {
-	uint8_t sdvo_data[HDMI_INFOFRAME_SIZE(AVI)];
+	const struct drm_display_mode *adjusted_mode =
+		&pipe_config->base.adjusted_mode;
+	u8 sdvo_data[HDMI_INFOFRAME_SIZE(AVI)];
 	union hdmi_infoframe frame;
 	int ret;
 	ssize_t len;
 
 	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi,
-						       &pipe_config->base.adjusted_mode,
-						       false);
+						       conn_state->connector,
+						       adjusted_mode);
 	if (ret < 0) {
 		DRM_ERROR("couldn't fill AVI infoframe\n");
 		return false;
 	}
 
-	if (intel_sdvo->rgb_quant_range_selectable) {
-		if (pipe_config->limited_color_range)
-			frame.avi.quantization_range =
-				HDMI_QUANTIZATION_RANGE_LIMITED;
-		else
-			frame.avi.quantization_range =
-				HDMI_QUANTIZATION_RANGE_FULL;
-	}
+	drm_hdmi_avi_infoframe_quant_range(&frame.avi,
+					   conn_state->connector,
+					   adjusted_mode,
+					   pipe_config->limited_color_range ?
+					   HDMI_QUANTIZATION_RANGE_LIMITED :
+					   HDMI_QUANTIZATION_RANGE_FULL);
 
 	len = hdmi_infoframe_pack(&frame, sdvo_data, sizeof(sdvo_data));
 	if (len < 0)
@@ -1018,7 +1017,7 @@ static bool intel_sdvo_set_tv_format(struct intel_sdvo *intel_sdvo,
 				     const struct drm_connector_state *conn_state)
 {
 	struct intel_sdvo_tv_format format;
-	uint32_t format_map;
+	u32 format_map;
 
 	format_map = 1 << conn_state->tv.mode;
 	memset(&format, 0, sizeof(format));
@@ -1108,9 +1107,9 @@ static void i9xx_adjust_sdvo_tv_clock(struct intel_crtc_state *pipe_config)
 	pipe_config->clock_set = true;
 }
 
-static bool intel_sdvo_compute_config(struct intel_encoder *encoder,
-				      struct intel_crtc_state *pipe_config,
-				      struct drm_connector_state *conn_state)
+static int intel_sdvo_compute_config(struct intel_encoder *encoder,
+				     struct intel_crtc_state *pipe_config,
+				     struct drm_connector_state *conn_state)
 {
 	struct intel_sdvo *intel_sdvo = to_sdvo(encoder);
 	struct intel_sdvo_connector_state *intel_sdvo_state =
@@ -1135,7 +1134,7 @@ static bool intel_sdvo_compute_config(struct intel_encoder *encoder,
 	 */
 	if (IS_TV(intel_sdvo_connector)) {
 		if (!intel_sdvo_set_output_timings_from_mode(intel_sdvo, mode))
-			return false;
+			return -EINVAL;
 
 		(void) intel_sdvo_get_preferred_input_mode(intel_sdvo,
 							   intel_sdvo_connector,
@@ -1145,7 +1144,7 @@ static bool intel_sdvo_compute_config(struct intel_encoder *encoder,
 	} else if (IS_LVDS(intel_sdvo_connector)) {
 		if (!intel_sdvo_set_output_timings_from_mode(intel_sdvo,
 							     intel_sdvo_connector->base.panel.fixed_mode))
-			return false;
+			return -EINVAL;
 
 		(void) intel_sdvo_get_preferred_input_mode(intel_sdvo,
 							   intel_sdvo_connector,
@@ -1154,7 +1153,7 @@ static bool intel_sdvo_compute_config(struct intel_encoder *encoder,
 	}
 
 	if (adjusted_mode->flags & DRM_MODE_FLAG_DBLSCAN)
-		return false;
+		return -EINVAL;
 
 	/*
 	 * Make the CRTC code factor in the SDVO pixel multiplier.  The
@@ -1194,7 +1193,7 @@ static bool intel_sdvo_compute_config(struct intel_encoder *encoder,
 	if (intel_sdvo_connector->is_hdmi)
 		adjusted_mode->picture_aspect_ratio = conn_state->picture_aspect_ratio;
 
-	return true;
+	return 0;
 }
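
/*
 * Illustrative sketch (editor's addition, hypothetical caller):
 * ->compute_config() now returns a negative errno instead of a bool, so
 * the atomic check path can propagate the precise failure reason.
 */
static int example_check_config(struct intel_encoder *encoder,
				struct intel_crtc_state *pipe_config,
				struct drm_connector_state *conn_state)
{
	int ret;

	ret = intel_sdvo_compute_config(encoder, pipe_config, conn_state);
	if (ret)
		return ret; /* e.g. -EINVAL for an unusable mode */

	return 0;
}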
 
 #define UPDATE_PROPERTY(input, NAME) \
@@ -1209,7 +1208,7 @@ static void intel_sdvo_update_props(struct intel_sdvo *intel_sdvo,
 	const struct drm_connector_state *conn_state = &sdvo_state->base.base;
 	struct intel_sdvo_connector *intel_sdvo_conn =
 		to_intel_sdvo_connector(conn_state->connector);
-	uint16_t val;
+	u16 val;
 
 	if (intel_sdvo_conn->left)
 		UPDATE_PROPERTY(sdvo_state->tv.overscan_h, OVERSCAN_H);
@@ -1316,7 +1315,8 @@ static void intel_sdvo_pre_enable(struct intel_encoder *intel_encoder,
 		intel_sdvo_set_encode(intel_sdvo, SDVO_ENCODE_HDMI);
 		intel_sdvo_set_colorimetry(intel_sdvo,
 					   SDVO_COLORIMETRY_RGB256);
-		intel_sdvo_set_avi_infoframe(intel_sdvo, crtc_state);
+		intel_sdvo_set_avi_infoframe(intel_sdvo,
+					     crtc_state, conn_state);
 	} else
 		intel_sdvo_set_encode(intel_sdvo, SDVO_ENCODE_DVI);
 
@@ -1692,10 +1692,10 @@ static bool intel_sdvo_get_capabilities(struct intel_sdvo *intel_sdvo, struct in
 	return true;
 }
 
-static uint16_t intel_sdvo_get_hotplug_support(struct intel_sdvo *intel_sdvo)
+static u16 intel_sdvo_get_hotplug_support(struct intel_sdvo *intel_sdvo)
 {
 	struct drm_i915_private *dev_priv = to_i915(intel_sdvo->base.base.dev);
-	uint16_t hotplug;
+	u16 hotplug;
 
 	if (!I915_HAS_HOTPLUG(dev_priv))
 		return 0;
@@ -1802,8 +1802,6 @@ intel_sdvo_tmds_sink_detect(struct drm_connector *connector)
 			if (intel_sdvo_connector->is_hdmi) {
 				intel_sdvo->has_hdmi_monitor = drm_detect_hdmi_monitor(edid);
 				intel_sdvo->has_hdmi_audio = drm_detect_monitor_audio(edid);
-				intel_sdvo->rgb_quant_range_selectable =
-					drm_rgb_quant_range_selectable(edid);
 			}
 		} else
 			status = connector_status_disconnected;
@@ -1828,7 +1826,7 @@ intel_sdvo_connector_matches_edid(struct intel_sdvo_connector *sdvo,
 static enum drm_connector_status
 intel_sdvo_detect(struct drm_connector *connector, bool force)
 {
-	uint16_t response;
+	u16 response;
 	struct intel_sdvo *intel_sdvo = intel_attached_sdvo(connector);
 	struct intel_sdvo_connector *intel_sdvo_connector = to_intel_sdvo_connector(connector);
 	enum drm_connector_status ret;
@@ -1852,7 +1850,6 @@ intel_sdvo_detect(struct drm_connector *connector, bool force)
 
 	intel_sdvo->has_hdmi_monitor = false;
 	intel_sdvo->has_hdmi_audio = false;
-	intel_sdvo->rgb_quant_range_selectable = false;
 
 	if ((intel_sdvo_connector->output_flag & response) == 0)
 		ret = connector_status_disconnected;
@@ -1980,7 +1977,7 @@ static void intel_sdvo_get_tv_modes(struct drm_connector *connector)
 	struct intel_sdvo *intel_sdvo = intel_attached_sdvo(connector);
 	const struct drm_connector_state *conn_state = connector->state;
 	struct intel_sdvo_sdtv_resolution_request tv_res;
-	uint32_t reply = 0, format_map = 0;
+	u32 reply = 0, format_map = 0;
 	int i;
 
 	DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n",
@@ -2065,7 +2062,7 @@ static int
 intel_sdvo_connector_atomic_get_property(struct drm_connector *connector,
 					 const struct drm_connector_state *state,
 					 struct drm_property *property,
-					 uint64_t *val)
+					 u64 *val)
 {
 	struct intel_sdvo_connector *intel_sdvo_connector = to_intel_sdvo_connector(connector);
 	const struct intel_sdvo_connector_state *sdvo_state = to_intel_sdvo_connector_state((void *)state);
@@ -2124,7 +2121,7 @@ static int
 intel_sdvo_connector_atomic_set_property(struct drm_connector *connector,
 					 struct drm_connector_state *state,
 					 struct drm_property *property,
-					 uint64_t val)
+					 u64 val)
 {
 	struct intel_sdvo_connector *intel_sdvo_connector = to_intel_sdvo_connector(connector);
 	struct intel_sdvo_connector_state *sdvo_state = to_intel_sdvo_connector_state(state);
@@ -2273,7 +2270,7 @@ static const struct drm_encoder_funcs intel_sdvo_enc_funcs = {
 static void
 intel_sdvo_guess_ddc_bus(struct intel_sdvo *sdvo)
 {
-	uint16_t mask = 0;
+	u16 mask = 0;
 	unsigned int num_bits;
 
 	/*
@@ -2674,7 +2671,7 @@ err:
 }
 
 static bool
-intel_sdvo_output_setup(struct intel_sdvo *intel_sdvo, uint16_t flags)
+intel_sdvo_output_setup(struct intel_sdvo *intel_sdvo, u16 flags)
 {
 	/*
 	 * An SDVO device may only expose an XXX1 output if it also has
 	 * the matching XXX0 output.
 	 */
 
@@ -2750,7 +2747,7 @@ static bool intel_sdvo_tv_create_property(struct intel_sdvo *intel_sdvo,
 {
 	struct drm_device *dev = intel_sdvo->base.base.dev;
 	struct intel_sdvo_tv_format format;
-	uint32_t format_map, i;
+	u32 format_map, i;
 
 	if (!intel_sdvo_set_target_output(intel_sdvo, type))
 		return false;
@@ -2817,7 +2814,7 @@ intel_sdvo_create_enhance_property_tv(struct intel_sdvo *intel_sdvo,
 	struct drm_connector_state *conn_state = connector->state;
 	struct intel_sdvo_connector_state *sdvo_state =
 		to_intel_sdvo_connector_state(conn_state);
-	uint16_t response, data_value[2];
+	u16 response, data_value[2];
 
 	/* when horizontal overscan is supported, Add the left/right property */
 	if (enhancements.overscan_h) {
@@ -2928,7 +2925,7 @@ intel_sdvo_create_enhance_property_lvds(struct intel_sdvo *intel_sdvo,
 {
 	struct drm_device *dev = intel_sdvo->base.base.dev;
 	struct drm_connector *connector = &intel_sdvo_connector->base.base;
-	uint16_t response, data_value[2];
+	u16 response, data_value[2];
 
 	ENHANCEMENT(&connector->state->tv, brightness, BRIGHTNESS);
 
@@ -2942,7 +2939,7 @@ static bool intel_sdvo_create_enhance_property(struct intel_sdvo *intel_sdvo,
 {
 	union {
 		struct intel_sdvo_enhancements_reply reply;
-		uint16_t response;
+		u16 response;
 	} enhancements;
 
 	BUILD_BUG_ON(sizeof(enhancements) != 2);
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index 5170a0f5fe7b..b56a1a9ad01d 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -29,7 +29,6 @@
  * registers; newer ones are much simpler and we can use the new DRM plane
  * support.
  */
-#include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
 #include <drm/drm_fourcc.h>
@@ -322,8 +321,8 @@ skl_program_scaler(struct intel_plane *plane,
 		&crtc_state->scaler_state.scalers[scaler_id];
 	int crtc_x = plane_state->base.dst.x1;
 	int crtc_y = plane_state->base.dst.y1;
-	uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
-	uint32_t crtc_h = drm_rect_height(&plane_state->base.dst);
+	u32 crtc_w = drm_rect_width(&plane_state->base.dst);
+	u32 crtc_h = drm_rect_height(&plane_state->base.dst);
 	u16 y_hphase, uv_rgb_hphase;
 	u16 y_vphase, uv_rgb_vphase;
 	int hscale, vscale;
@@ -478,16 +477,23 @@ skl_program_plane(struct intel_plane *plane,
 	u32 aux_stride = skl_plane_stride(plane_state, 1);
 	int crtc_x = plane_state->base.dst.x1;
 	int crtc_y = plane_state->base.dst.y1;
-	uint32_t x = plane_state->color_plane[color_plane].x;
-	uint32_t y = plane_state->color_plane[color_plane].y;
-	uint32_t src_w = drm_rect_width(&plane_state->base.src) >> 16;
-	uint32_t src_h = drm_rect_height(&plane_state->base.src) >> 16;
+	u32 x = plane_state->color_plane[color_plane].x;
+	u32 y = plane_state->color_plane[color_plane].y;
+	u32 src_w = drm_rect_width(&plane_state->base.src) >> 16;
+	u32 src_h = drm_rect_height(&plane_state->base.src) >> 16;
 	struct intel_plane *linked = plane_state->linked_plane;
 	const struct drm_framebuffer *fb = plane_state->base.fb;
 	u8 alpha = plane_state->base.alpha >> 8;
+	u32 plane_color_ctl = 0;
 	unsigned long irqflags;
 	u32 keymsk, keymax;
 
+	plane_ctl |= skl_plane_ctl_crtc(crtc_state);
+
+	if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv))
+		plane_color_ctl = plane_state->color_ctl |
+			glk_plane_color_ctl_crtc(crtc_state);
+
 	/* Sizes are 0 based */
 	src_w--;
 	src_h--;
@@ -534,8 +540,7 @@ skl_program_plane(struct intel_plane *plane,
 	}
 
 	if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv))
-		I915_WRITE_FW(PLANE_COLOR_CTL(pipe, plane_id),
-			      plane_state->color_ctl);
+		I915_WRITE_FW(PLANE_COLOR_CTL(pipe, plane_id), plane_color_ctl);
 
 	if (fb->format->is_yuv && icl_is_hdr_plane(plane))
 		icl_program_input_csc(plane, crtc_state, plane_state);
@@ -619,17 +624,19 @@ skl_plane_get_hw_state(struct intel_plane *plane,
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
 	enum intel_display_power_domain power_domain;
 	enum plane_id plane_id = plane->id;
+	intel_wakeref_t wakeref;
 	bool ret;
 
 	power_domain = POWER_DOMAIN_PIPE(plane->pipe);
-	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv, power_domain);
+	if (!wakeref)
 		return false;
 
 	ret = I915_READ(PLANE_CTL(plane->pipe, plane_id)) & PLANE_CTL_ENABLE;
 
 	*pipe = plane->pipe;
 
-	intel_display_power_put(dev_priv, power_domain);
+	intel_display_power_put(dev_priv, power_domain, wakeref);
 
 	return ret;
 }
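
/*
 * The same conversion repeats in the vlv/ivb/g4x readout helpers below.
 * Simplified sketch of the pattern (editor's addition, hypothetical
 * helper name): a zero cookie from the conditional get means the power
 * domain is off, so readout is skipped entirely.
 */
static bool example_plane_enabled(struct intel_plane *plane)
{
	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
	enum intel_display_power_domain power_domain =
		POWER_DOMAIN_PIPE(plane->pipe);
	intel_wakeref_t wakeref;
	bool ret;

	wakeref = intel_display_power_get_if_enabled(dev_priv, power_domain);
	if (!wakeref)
		return false;

	ret = I915_READ(PLANE_CTL(plane->pipe, plane->id)) & PLANE_CTL_ENABLE;

	intel_display_power_put(dev_priv, power_domain, wakeref);

	return ret;
}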
@@ -732,6 +739,11 @@ vlv_update_clrc(const struct intel_plane_state *plane_state)
 		      SP_SH_SIN(sh_sin) | SP_SH_COS(sh_cos));
 }
 
+static u32 vlv_sprite_ctl_crtc(const struct intel_crtc_state *crtc_state)
+{
+	return SP_GAMMA_ENABLE;
+}
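
/*
 * Illustrative note (editor's addition): the *_ctl_crtc() helpers split
 * out control-register bits that depend only on crtc state (gamma enable,
 * and pipe CSC on hsw/bdw) from the bits precomputed into
 * plane_state->ctl, so a crtc-level change no longer forces the plane
 * state to be recomputed; the update paths combine them at write time:
 *
 *	sprctl = plane_state->ctl | vlv_sprite_ctl_crtc(crtc_state);
 */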
+
 static u32 vlv_sprite_ctl(const struct intel_crtc_state *crtc_state,
 			  const struct intel_plane_state *plane_state)
 {
@@ -740,7 +752,7 @@ static u32 vlv_sprite_ctl(const struct intel_crtc_state *crtc_state,
 	const struct drm_intel_sprite_colorkey *key = &plane_state->ckey;
 	u32 sprctl;
 
-	sprctl = SP_ENABLE | SP_GAMMA_ENABLE;
+	sprctl = SP_ENABLE;
 
 	switch (fb->format->format) {
 	case DRM_FORMAT_YUYV:
@@ -807,17 +819,19 @@ vlv_update_plane(struct intel_plane *plane,
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
 	enum pipe pipe = plane->pipe;
 	enum plane_id plane_id = plane->id;
-	u32 sprctl = plane_state->ctl;
 	u32 sprsurf_offset = plane_state->color_plane[0].offset;
 	u32 linear_offset;
 	const struct drm_intel_sprite_colorkey *key = &plane_state->ckey;
 	int crtc_x = plane_state->base.dst.x1;
 	int crtc_y = plane_state->base.dst.y1;
-	uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
-	uint32_t crtc_h = drm_rect_height(&plane_state->base.dst);
-	uint32_t x = plane_state->color_plane[0].x;
-	uint32_t y = plane_state->color_plane[0].y;
+	u32 crtc_w = drm_rect_width(&plane_state->base.dst);
+	u32 crtc_h = drm_rect_height(&plane_state->base.dst);
+	u32 x = plane_state->color_plane[0].x;
+	u32 y = plane_state->color_plane[0].y;
 	unsigned long irqflags;
+	u32 sprctl;
+
+	sprctl = plane_state->ctl | vlv_sprite_ctl_crtc(crtc_state);
 
 	/* Sizes are 0 based */
 	crtc_w--;
@@ -883,21 +897,36 @@ vlv_plane_get_hw_state(struct intel_plane *plane,
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
 	enum intel_display_power_domain power_domain;
 	enum plane_id plane_id = plane->id;
+	intel_wakeref_t wakeref;
 	bool ret;
 
 	power_domain = POWER_DOMAIN_PIPE(plane->pipe);
-	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv, power_domain);
+	if (!wakeref)
 		return false;
 
 	ret = I915_READ(SPCNTR(plane->pipe, plane_id)) & SP_ENABLE;
 
 	*pipe = plane->pipe;
 
-	intel_display_power_put(dev_priv, power_domain);
+	intel_display_power_put(dev_priv, power_domain, wakeref);
 
 	return ret;
 }
 
+static u32 ivb_sprite_ctl_crtc(const struct intel_crtc_state *crtc_state)
+{
+	struct drm_i915_private *dev_priv = to_i915(crtc_state->base.crtc->dev);
+	u32 sprctl = 0;
+
+	sprctl |= SPRITE_GAMMA_ENABLE;
+
+	if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
+		sprctl |= SPRITE_PIPE_CSC_ENABLE;
+
+	return sprctl;
+}
+
 static u32 ivb_sprite_ctl(const struct intel_crtc_state *crtc_state,
 			  const struct intel_plane_state *plane_state)
 {
@@ -908,14 +937,11 @@ static u32 ivb_sprite_ctl(const struct intel_crtc_state *crtc_state,
 	const struct drm_intel_sprite_colorkey *key = &plane_state->ckey;
 	u32 sprctl;
 
-	sprctl = SPRITE_ENABLE | SPRITE_GAMMA_ENABLE;
+	sprctl = SPRITE_ENABLE;
 
 	if (IS_IVYBRIDGE(dev_priv))
 		sprctl |= SPRITE_TRICKLE_FEED_DISABLE;
 
-	if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
-		sprctl |= SPRITE_PIPE_CSC_ENABLE;
-
 	switch (fb->format->format) {
 	case DRM_FORMAT_XBGR8888:
 		sprctl |= SPRITE_FORMAT_RGBX888 | SPRITE_RGB_ORDER_RGBX;
@@ -967,20 +993,22 @@ ivb_update_plane(struct intel_plane *plane,
 {
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
 	enum pipe pipe = plane->pipe;
-	u32 sprctl = plane_state->ctl, sprscale = 0;
 	u32 sprsurf_offset = plane_state->color_plane[0].offset;
 	u32 linear_offset;
 	const struct drm_intel_sprite_colorkey *key = &plane_state->ckey;
 	int crtc_x = plane_state->base.dst.x1;
 	int crtc_y = plane_state->base.dst.y1;
-	uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
-	uint32_t crtc_h = drm_rect_height(&plane_state->base.dst);
-	uint32_t x = plane_state->color_plane[0].x;
-	uint32_t y = plane_state->color_plane[0].y;
-	uint32_t src_w = drm_rect_width(&plane_state->base.src) >> 16;
-	uint32_t src_h = drm_rect_height(&plane_state->base.src) >> 16;
+	u32 crtc_w = drm_rect_width(&plane_state->base.dst);
+	u32 crtc_h = drm_rect_height(&plane_state->base.dst);
+	u32 x = plane_state->color_plane[0].x;
+	u32 y = plane_state->color_plane[0].y;
+	u32 src_w = drm_rect_width(&plane_state->base.src) >> 16;
+	u32 src_h = drm_rect_height(&plane_state->base.src) >> 16;
+	u32 sprctl, sprscale = 0;
 	unsigned long irqflags;
 
+	sprctl = plane_state->ctl | ivb_sprite_ctl_crtc(crtc_state);
+
 	/* Sizes are 0 based */
 	src_w--;
 	src_h--;
@@ -1052,17 +1080,19 @@ ivb_plane_get_hw_state(struct intel_plane *plane,
 {
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
 	enum intel_display_power_domain power_domain;
+	intel_wakeref_t wakeref;
 	bool ret;
 
 	power_domain = POWER_DOMAIN_PIPE(plane->pipe);
-	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv, power_domain);
+	if (!wakeref)
 		return false;
 
 	ret =  I915_READ(SPRCTL(plane->pipe)) & SPRITE_ENABLE;
 
 	*pipe = plane->pipe;
 
-	intel_display_power_put(dev_priv, power_domain);
+	intel_display_power_put(dev_priv, power_domain, wakeref);
 
 	return ret;
 }
@@ -1075,6 +1105,11 @@ g4x_sprite_max_stride(struct intel_plane *plane,
 	return 16384;
 }
 
+static u32 g4x_sprite_ctl_crtc(const struct intel_crtc_state *crtc_state)
+{
+	return DVS_GAMMA_ENABLE;
+}
+
 static u32 g4x_sprite_ctl(const struct intel_crtc_state *crtc_state,
 			  const struct intel_plane_state *plane_state)
 {
@@ -1085,9 +1120,9 @@ static u32 g4x_sprite_ctl(const struct intel_crtc_state *crtc_state,
 	const struct drm_intel_sprite_colorkey *key = &plane_state->ckey;
 	u32 dvscntr;
 
-	dvscntr = DVS_ENABLE | DVS_GAMMA_ENABLE;
+	dvscntr = DVS_ENABLE;
 
-	if (IS_GEN6(dev_priv))
+	if (IS_GEN(dev_priv, 6))
 		dvscntr |= DVS_TRICKLE_FEED_DISABLE;
 
 	switch (fb->format->format) {
@@ -1141,20 +1176,22 @@ g4x_update_plane(struct intel_plane *plane,
 {
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
 	enum pipe pipe = plane->pipe;
-	u32 dvscntr = plane_state->ctl, dvsscale = 0;
 	u32 dvssurf_offset = plane_state->color_plane[0].offset;
 	u32 linear_offset;
 	const struct drm_intel_sprite_colorkey *key = &plane_state->ckey;
 	int crtc_x = plane_state->base.dst.x1;
 	int crtc_y = plane_state->base.dst.y1;
-	uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
-	uint32_t crtc_h = drm_rect_height(&plane_state->base.dst);
-	uint32_t x = plane_state->color_plane[0].x;
-	uint32_t y = plane_state->color_plane[0].y;
-	uint32_t src_w = drm_rect_width(&plane_state->base.src) >> 16;
-	uint32_t src_h = drm_rect_height(&plane_state->base.src) >> 16;
+	u32 crtc_w = drm_rect_width(&plane_state->base.dst);
+	u32 crtc_h = drm_rect_height(&plane_state->base.dst);
+	u32 x = plane_state->color_plane[0].x;
+	u32 y = plane_state->color_plane[0].y;
+	u32 src_w = drm_rect_width(&plane_state->base.src) >> 16;
+	u32 src_h = drm_rect_height(&plane_state->base.src) >> 16;
+	u32 dvscntr, dvsscale = 0;
 	unsigned long irqflags;
 
+	dvscntr = plane_state->ctl | g4x_sprite_ctl_crtc(crtc_state);
+
 	/* Sizes are 0 based */
 	src_w--;
 	src_h--;
@@ -1218,17 +1255,19 @@ g4x_plane_get_hw_state(struct intel_plane *plane,
 {
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
 	enum intel_display_power_domain power_domain;
+	intel_wakeref_t wakeref;
 	bool ret;
 
 	power_domain = POWER_DOMAIN_PIPE(plane->pipe);
-	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv, power_domain);
+	if (!wakeref)
 		return false;
 
 	ret = I915_READ(DVSCNTR(plane->pipe)) & DVS_ENABLE;
 
 	*pipe = plane->pipe;
 
-	intel_display_power_put(dev_priv, power_domain);
+	intel_display_power_put(dev_priv, power_domain, wakeref);
 
 	return ret;
 }
@@ -1699,7 +1738,7 @@ out:
 	return ret;
 }
 
-static const uint32_t g4x_plane_formats[] = {
+static const u32 g4x_plane_formats[] = {
 	DRM_FORMAT_XRGB8888,
 	DRM_FORMAT_YUYV,
 	DRM_FORMAT_YVYU,
@@ -1707,13 +1746,13 @@ static const uint32_t g4x_plane_formats[] = {
 	DRM_FORMAT_VYUY,
 };
 
-static const uint64_t i9xx_plane_format_modifiers[] = {
+static const u64 i9xx_plane_format_modifiers[] = {
 	I915_FORMAT_MOD_X_TILED,
 	DRM_FORMAT_MOD_LINEAR,
 	DRM_FORMAT_MOD_INVALID
 };
 
-static const uint32_t snb_plane_formats[] = {
+static const u32 snb_plane_formats[] = {
 	DRM_FORMAT_XBGR8888,
 	DRM_FORMAT_XRGB8888,
 	DRM_FORMAT_YUYV,
@@ -1722,7 +1761,7 @@ static const uint32_t snb_plane_formats[] = {
 	DRM_FORMAT_VYUY,
 };
 
-static const uint32_t vlv_plane_formats[] = {
+static const u32 vlv_plane_formats[] = {
 	DRM_FORMAT_RGB565,
 	DRM_FORMAT_ABGR8888,
 	DRM_FORMAT_ARGB8888,
@@ -1736,7 +1775,7 @@ static const uint32_t vlv_plane_formats[] = {
 	DRM_FORMAT_VYUY,
 };
 
-static const uint32_t skl_plane_formats[] = {
+static const u32 skl_plane_formats[] = {
 	DRM_FORMAT_C8,
 	DRM_FORMAT_RGB565,
 	DRM_FORMAT_XRGB8888,
@@ -1751,7 +1790,7 @@ static const uint32_t skl_plane_formats[] = {
 	DRM_FORMAT_VYUY,
 };
 
-static const uint32_t skl_planar_formats[] = {
+static const u32 skl_planar_formats[] = {
 	DRM_FORMAT_C8,
 	DRM_FORMAT_RGB565,
 	DRM_FORMAT_XRGB8888,
@@ -1767,7 +1806,7 @@ static const uint32_t skl_planar_formats[] = {
 	DRM_FORMAT_NV12,
 };
 
-static const uint64_t skl_plane_format_modifiers_noccs[] = {
+static const u64 skl_plane_format_modifiers_noccs[] = {
 	I915_FORMAT_MOD_Yf_TILED,
 	I915_FORMAT_MOD_Y_TILED,
 	I915_FORMAT_MOD_X_TILED,
@@ -1775,7 +1814,7 @@ static const uint64_t skl_plane_format_modifiers_noccs[] = {
 	DRM_FORMAT_MOD_INVALID
 };
 
-static const uint64_t skl_plane_format_modifiers_ccs[] = {
+static const u64 skl_plane_format_modifiers_ccs[] = {
 	I915_FORMAT_MOD_Yf_TILED_CCS,
 	I915_FORMAT_MOD_Y_TILED_CCS,
 	I915_FORMAT_MOD_Yf_TILED,
@@ -1983,7 +2022,7 @@ static bool skl_plane_has_planar(struct drm_i915_private *dev_priv,
 	if (IS_SKYLAKE(dev_priv) || IS_BROXTON(dev_priv))
 		return false;
 
-	if (IS_GEN9(dev_priv) && !IS_GEMINILAKE(dev_priv) && pipe == PIPE_C)
+	if (IS_GEN(dev_priv, 9) && !IS_GEMINILAKE(dev_priv) && pipe == PIPE_C)
 		return false;
 
 	if (plane_id != PLANE_PRIMARY && plane_id != PLANE_SPRITE0)
@@ -2163,7 +2202,7 @@ intel_sprite_plane_create(struct drm_i915_private *dev_priv,
 		plane->check_plane = g4x_sprite_check;
 
 		modifiers = i9xx_plane_format_modifiers;
-		if (IS_GEN6(dev_priv)) {
+		if (IS_GEN(dev_priv, 6)) {
 			formats = snb_plane_formats;
 			num_formats = ARRAY_SIZE(snb_plane_formats);
 
diff --git a/drivers/gpu/drm/i915/intel_tv.c b/drivers/gpu/drm/i915/intel_tv.c
index 860f306a23ba..3924c4944e1f 100644
--- a/drivers/gpu/drm/i915/intel_tv.c
+++ b/drivers/gpu/drm/i915/intel_tv.c
@@ -30,7 +30,6 @@
  * Integrated TV-out support for the 915GM and 945GM.
  */
 
-#include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
 #include <drm/drm_edid.h>
@@ -307,7 +306,7 @@ struct tv_mode {
 
 	u32 clock;
 	u16 refresh; /* in millihertz (for precision) */
-	u32 oversample;
+	u8 oversample;
 	u8 hsync_end;
 	u16 hblank_start, hblank_end, htotal;
 	bool progressive : 1, trilevel_sync : 1, component_only : 1;
@@ -340,7 +339,6 @@ struct tv_mode {
 	const struct video_levels *composite_levels, *svideo_levels;
 	const struct color_conversion *composite_color, *svideo_color;
 	const u32 *filter_table;
-	u16 max_srcw;
 };
 
 
@@ -379,8 +377,8 @@ static const struct tv_mode tv_modes[] = {
 		.name		= "NTSC-M",
 		.clock		= 108000,
 		.refresh	= 59940,
-		.oversample	= TV_OVERSAMPLE_8X,
-		.component_only = 0,
+		.oversample	= 8,
+		.component_only = false,
 		/* 525 Lines, 60 Fields, 15.734KHz line, Sub-Carrier 3.580MHz */
 
 		.hsync_end	= 64,		    .hblank_end		= 124,
@@ -422,8 +420,8 @@ static const struct tv_mode tv_modes[] = {
 		.name		= "NTSC-443",
 		.clock		= 108000,
 		.refresh	= 59940,
-		.oversample	= TV_OVERSAMPLE_8X,
-		.component_only = 0,
+		.oversample	= 8,
+		.component_only = false,
 		/* 525 Lines, 60 Fields, 15.734KHz line, Sub-Carrier 4.43MHz */
 		.hsync_end	= 64,		    .hblank_end		= 124,
 		.hblank_start	= 836,		    .htotal		= 857,
@@ -464,8 +462,8 @@ static const struct tv_mode tv_modes[] = {
 		.name		= "NTSC-J",
 		.clock		= 108000,
 		.refresh	= 59940,
-		.oversample	= TV_OVERSAMPLE_8X,
-		.component_only = 0,
+		.oversample	= 8,
+		.component_only = false,
 
 		/* 525 Lines, 60 Fields, 15.734KHz line, Sub-Carrier 3.580MHz */
 		.hsync_end	= 64,		    .hblank_end		= 124,
@@ -507,8 +505,8 @@ static const struct tv_mode tv_modes[] = {
 		.name		= "PAL-M",
 		.clock		= 108000,
 		.refresh	= 59940,
-		.oversample	= TV_OVERSAMPLE_8X,
-		.component_only = 0,
+		.oversample	= 8,
+		.component_only = false,
 
 		/* 525 Lines, 60 Fields, 15.734KHz line, Sub-Carrier 3.580MHz */
 		.hsync_end	= 64,		  .hblank_end		= 124,
@@ -551,8 +549,8 @@ static const struct tv_mode tv_modes[] = {
 		.name	    = "PAL-N",
 		.clock		= 108000,
 		.refresh	= 50000,
-		.oversample	= TV_OVERSAMPLE_8X,
-		.component_only = 0,
+		.oversample	= 8,
+		.component_only = false,
 
 		.hsync_end	= 64,		    .hblank_end		= 128,
 		.hblank_start = 844,	    .htotal		= 863,
@@ -596,8 +594,8 @@ static const struct tv_mode tv_modes[] = {
 		.name	    = "PAL",
 		.clock		= 108000,
 		.refresh	= 50000,
-		.oversample	= TV_OVERSAMPLE_8X,
-		.component_only = 0,
+		.oversample	= 8,
+		.component_only = false,
 
 		.hsync_end	= 64,		    .hblank_end		= 142,
 		.hblank_start	= 844,	    .htotal		= 863,
@@ -636,10 +634,10 @@ static const struct tv_mode tv_modes[] = {
 	},
 	{
 		.name       = "480p",
-		.clock		= 107520,
+		.clock		= 108000,
 		.refresh	= 59940,
-		.oversample     = TV_OVERSAMPLE_4X,
-		.component_only = 1,
+		.oversample     = 4,
+		.component_only = true,
 
 		.hsync_end      = 64,               .hblank_end         = 122,
 		.hblank_start   = 842,              .htotal             = 857,
@@ -660,10 +658,10 @@ static const struct tv_mode tv_modes[] = {
 	},
 	{
 		.name       = "576p",
-		.clock		= 107520,
+		.clock		= 108000,
 		.refresh	= 50000,
-		.oversample     = TV_OVERSAMPLE_4X,
-		.component_only = 1,
+		.oversample     = 4,
+		.component_only = true,
 
 		.hsync_end      = 64,               .hblank_end         = 139,
 		.hblank_start   = 859,              .htotal             = 863,
@@ -684,10 +682,10 @@ static const struct tv_mode tv_modes[] = {
 	},
 	{
 		.name       = "720p@60Hz",
-		.clock		= 148800,
+		.clock		= 148500,
 		.refresh	= 60000,
-		.oversample     = TV_OVERSAMPLE_2X,
-		.component_only = 1,
+		.oversample     = 2,
+		.component_only = true,
 
 		.hsync_end      = 80,               .hblank_end         = 300,
 		.hblank_start   = 1580,             .htotal             = 1649,
@@ -708,10 +706,10 @@ static const struct tv_mode tv_modes[] = {
 	},
 	{
 		.name       = "720p@50Hz",
-		.clock		= 148800,
+		.clock		= 148500,
 		.refresh	= 50000,
-		.oversample     = TV_OVERSAMPLE_2X,
-		.component_only = 1,
+		.oversample     = 2,
+		.component_only = true,
 
 		.hsync_end      = 80,               .hblank_end         = 300,
 		.hblank_start   = 1580,             .htotal             = 1979,
@@ -729,14 +727,13 @@ static const struct tv_mode tv_modes[] = {
 		.burst_ena      = false,
 
 		.filter_table = filter_table,
-		.max_srcw = 800
 	},
 	{
 		.name       = "1080i@50Hz",
-		.clock		= 148800,
+		.clock		= 148500,
 		.refresh	= 50000,
-		.oversample     = TV_OVERSAMPLE_2X,
-		.component_only = 1,
+		.oversample     = 2,
+		.component_only = true,
 
 		.hsync_end      = 88,               .hblank_end         = 235,
 		.hblank_start   = 2155,             .htotal             = 2639,
@@ -759,10 +756,10 @@ static const struct tv_mode tv_modes[] = {
 	},
 	{
 		.name       = "1080i@60Hz",
-		.clock		= 148800,
+		.clock		= 148500,
 		.refresh	= 60000,
-		.oversample     = TV_OVERSAMPLE_2X,
-		.component_only = 1,
+		.oversample     = 2,
+		.component_only = true,
 
 		.hsync_end      = 88,               .hblank_end         = 235,
 		.hblank_start   = 2155,             .htotal             = 2199,
@@ -783,8 +780,115 @@ static const struct tv_mode tv_modes[] = {
 
 		.filter_table = filter_table,
 	},
+
+	{
+		.name       = "1080p@30Hz",
+		.clock		= 148500,
+		.refresh	= 30000,
+		.oversample     = 2,
+		.component_only = true,
+
+		.hsync_end      = 88,               .hblank_end         = 235,
+		.hblank_start   = 2155,             .htotal             = 2199,
+
+		.progressive	= true,		    .trilevel_sync = true,
+
+		.vsync_start_f1 = 8,               .vsync_start_f2     = 8,
+		.vsync_len      = 10,
+
+		.veq_ena	= false,	.veq_start_f1	= 0,
+		.veq_start_f2	= 0,		    .veq_len		= 0,
+
+		.vi_end_f1      = 44,               .vi_end_f2          = 44,
+		.nbr_end        = 1079,
+
+		.burst_ena      = false,
+
+		.filter_table = filter_table,
+	},
+
+	{
+		.name       = "1080p@50Hz",
+		.clock		= 148500,
+		.refresh	= 50000,
+		.oversample     = 1,
+		.component_only = true,
+
+		.hsync_end      = 88,               .hblank_end         = 235,
+		.hblank_start   = 2155,             .htotal             = 2639,
+
+		.progressive	= true,		    .trilevel_sync = true,
+
+		.vsync_start_f1 = 8,               .vsync_start_f2     = 8,
+		.vsync_len      = 10,
+
+		.veq_ena	= false,	.veq_start_f1	= 0,
+		.veq_start_f2	= 0,		    .veq_len		= 0,
+
+		.vi_end_f1      = 44,               .vi_end_f2          = 44,
+		.nbr_end        = 1079,
+
+		.burst_ena      = false,
+
+		.filter_table = filter_table,
+	},
+
+	{
+		.name       = "1080p@60Hz",
+		.clock		= 148500,
+		.refresh	= 60000,
+		.oversample     = 1,
+		.component_only = true,
+
+		.hsync_end      = 88,               .hblank_end         = 235,
+		.hblank_start   = 2155,             .htotal             = 2199,
+
+		.progressive	= true,		    .trilevel_sync = true,
+
+		.vsync_start_f1 = 8,               .vsync_start_f2     = 8,
+		.vsync_len      = 10,
+
+		.veq_ena	= false,	    .veq_start_f1	= 0,
+		.veq_start_f2	= 0,		    .veq_len		= 0,
+
+		.vi_end_f1      = 44,               .vi_end_f2          = 44,
+		.nbr_end        = 1079,
+
+		.burst_ena      = false,
+
+		.filter_table = filter_table,
+	},
 };
 
+struct intel_tv_connector_state {
+	struct drm_connector_state base;
+
+	/*
+	 * May need to override the user margins for
+	 * gen3 >1024 wide source vertical centering.
+	 */
+	struct {
+		u16 top, bottom;
+	} margins;
+
+	bool bypass_vfilter;
+};
+
+#define to_intel_tv_connector_state(x) container_of(x, struct intel_tv_connector_state, base)
+
+static struct drm_connector_state *
+intel_tv_connector_duplicate_state(struct drm_connector *connector)
+{
+	struct intel_tv_connector_state *state;
+
+	state = kmemdup(connector->state, sizeof(*state), GFP_KERNEL);
+	if (!state)
+		return NULL;
+
+	__drm_atomic_helper_connector_duplicate_state(connector, &state->base);
+	return &state->base;
+}
+
 static struct intel_tv *enc_to_tv(struct intel_encoder *encoder)
 {
 	return container_of(encoder, struct intel_tv, base);
@@ -860,45 +964,370 @@ intel_tv_mode_valid(struct drm_connector *connector,
 	return MODE_CLOCK_RANGE;
 }
 
+static int
+intel_tv_mode_vdisplay(const struct tv_mode *tv_mode)
+{
+	if (tv_mode->progressive)
+		return tv_mode->nbr_end + 1;
+	else
+		return 2 * (tv_mode->nbr_end + 1);
+}
+
+static void
+intel_tv_mode_to_mode(struct drm_display_mode *mode,
+		      const struct tv_mode *tv_mode)
+{
+	mode->clock = tv_mode->clock /
+		(tv_mode->oversample >> !tv_mode->progressive);
+
+	/*
+	 * tv_mode horizontal timings:
+	 *
+	 * hsync_end
+	 *    | hblank_end
+	 *    |    | hblank_start
+	 *    |    |       | htotal
+	 *    |     _______    |
+	 *     ____/       \___
+	 * \__/                \
+	 */
+	mode->hdisplay =
+		tv_mode->hblank_start - tv_mode->hblank_end;
+	mode->hsync_start = mode->hdisplay +
+		tv_mode->htotal - tv_mode->hblank_start;
+	mode->hsync_end = mode->hsync_start +
+		tv_mode->hsync_end;
+	mode->htotal = tv_mode->htotal + 1;
+
+	/*
+	 * tv_mode vertical timings:
+	 *
+	 * vsync_start
+	 *    | vsync_end
+	 *    |  | vi_end nbr_end
+	 *    |  |    |       |
+	 *    |  |     _______
+	 * \__    ____/       \
+	 *    \__/
+	 */
+	mode->vdisplay = intel_tv_mode_vdisplay(tv_mode);
+	if (tv_mode->progressive) {
+		mode->vsync_start = mode->vdisplay +
+			tv_mode->vsync_start_f1 + 1;
+		mode->vsync_end = mode->vsync_start +
+			tv_mode->vsync_len;
+		mode->vtotal = mode->vdisplay +
+			tv_mode->vi_end_f1 + 1;
+	} else {
+		mode->vsync_start = mode->vdisplay +
+			tv_mode->vsync_start_f1 + 1 +
+			tv_mode->vsync_start_f2 + 1;
+		mode->vsync_end = mode->vsync_start +
+			2 * tv_mode->vsync_len;
+		mode->vtotal = mode->vdisplay +
+			tv_mode->vi_end_f1 + 1 +
+			tv_mode->vi_end_f2 + 1;
+	}
+
+	/* TV has its own notion of sync and other mode flags, so clear them. */
+	mode->flags = 0;
+
+	/* drm_mode_vrefresh() keeps an already-set vrefresh, so clear it first */
+	mode->vrefresh = 0;
+	mode->vrefresh = drm_mode_vrefresh(mode);
+
+	snprintf(mode->name, sizeof(mode->name),
+		 "%dx%d%c (%s)",
+		 mode->hdisplay, mode->vdisplay,
+		 tv_mode->progressive ? 'p' : 'i',
+		 tv_mode->name);
+}
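
/*
 * Worked example (editor's addition) using the NTSC-443 entry above:
 * interlaced with oversample 8, so the divisor is 8 >> 1 = 4, giving
 *
 *	mode->clock       = 108000 / 4        = 27000 kHz
 *	mode->hdisplay    = 836 - 124         = 712
 *	mode->hsync_start = 712 + (857 - 836) = 733
 *	mode->hsync_end   = 733 + 64          = 797
 *	mode->htotal      = 857 + 1           = 858
 */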
+
+static void intel_tv_scale_mode_horiz(struct drm_display_mode *mode,
+				      int hdisplay, int left_margin,
+				      int right_margin)
+{
+	int hsync_start = mode->hsync_start - mode->hdisplay + right_margin;
+	int hsync_end = mode->hsync_end - mode->hdisplay + right_margin;
+	int new_htotal = mode->htotal * hdisplay /
+		(mode->hdisplay - left_margin - right_margin);
+
+	mode->clock = mode->clock * new_htotal / mode->htotal;
+
+	mode->hdisplay = hdisplay;
+	mode->hsync_start = hdisplay + hsync_start * new_htotal / mode->htotal;
+	mode->hsync_end = hdisplay + hsync_end * new_htotal / mode->htotal;
+	mode->htotal = new_htotal;
+}
+
+static void intel_tv_scale_mode_vert(struct drm_display_mode *mode,
+				     int vdisplay, int top_margin,
+				     int bottom_margin)
+{
+	int vsync_start = mode->vsync_start - mode->vdisplay + bottom_margin;
+	int vsync_end = mode->vsync_end - mode->vdisplay + bottom_margin;
+	int new_vtotal = mode->vtotal * vdisplay /
+		(mode->vdisplay - top_margin - bottom_margin);
+
+	mode->clock = mode->clock * new_vtotal / mode->vtotal;
+
+	mode->vdisplay = vdisplay;
+	mode->vsync_start = vdisplay + vsync_start * new_vtotal / mode->vtotal;
+	mode->vsync_end = vdisplay + vsync_end * new_vtotal / mode->vtotal;
+	mode->vtotal = new_vtotal;
+}
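Both scaling helpers stretch the active region while preserving the total frame (or line) time: the total and the clock grow by the same factor. With illustrative numbers for the vertical helper (vtotal = 525, vdisplay = 480, target vdisplay = 480, margins top = bottom = 24):

/*
 * new_vtotal  = 525 * 480 / (480 - 24 - 24) = 583
 * mode->clock = clock * 583 / 525
 *
 * i.e. the mode is sped up just enough that the original 525 lines
 * still fit in the same field period once the margins are removed.
 */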
 
 static void
 intel_tv_get_config(struct intel_encoder *encoder,
 		    struct intel_crtc_state *pipe_config)
 {
+	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
+	struct drm_display_mode *adjusted_mode =
+		&pipe_config->base.adjusted_mode;
+	struct drm_display_mode mode = {};
+	u32 tv_ctl, hctl1, hctl3, vctl1, vctl2, tmp;
+	struct tv_mode tv_mode = {};
+	int hdisplay = adjusted_mode->crtc_hdisplay;
+	int vdisplay = adjusted_mode->crtc_vdisplay;
+	int xsize, ysize, xpos, ypos;
+
 	pipe_config->output_types |= BIT(INTEL_OUTPUT_TVOUT);
 
-	pipe_config->base.adjusted_mode.crtc_clock = pipe_config->port_clock;
+	tv_ctl = I915_READ(TV_CTL);
+	hctl1 = I915_READ(TV_H_CTL_1);
+	hctl3 = I915_READ(TV_H_CTL_3);
+	vctl1 = I915_READ(TV_V_CTL_1);
+	vctl2 = I915_READ(TV_V_CTL_2);
+
+	tv_mode.htotal = (hctl1 & TV_HTOTAL_MASK) >> TV_HTOTAL_SHIFT;
+	tv_mode.hsync_end = (hctl1 & TV_HSYNC_END_MASK) >> TV_HSYNC_END_SHIFT;
+
+	tv_mode.hblank_start = (hctl3 & TV_HBLANK_START_MASK) >> TV_HBLANK_START_SHIFT;
+	tv_mode.hblank_end = (hctl3 & TV_HSYNC_END_MASK) >> TV_HBLANK_END_SHIFT;
+
+	tv_mode.nbr_end = (vctl1 & TV_NBR_END_MASK) >> TV_NBR_END_SHIFT;
+	tv_mode.vi_end_f1 = (vctl1 & TV_VI_END_F1_MASK) >> TV_VI_END_F1_SHIFT;
+	tv_mode.vi_end_f2 = (vctl1 & TV_VI_END_F2_MASK) >> TV_VI_END_F2_SHIFT;
+
+	tv_mode.vsync_len = (vctl2 & TV_VSYNC_LEN_MASK) >> TV_VSYNC_LEN_SHIFT;
+	tv_mode.vsync_start_f1 = (vctl2 & TV_VSYNC_START_F1_MASK) >> TV_VSYNC_START_F1_SHIFT;
+	tv_mode.vsync_start_f2 = (vctl2 & TV_VSYNC_START_F2_MASK) >> TV_VSYNC_START_F2_SHIFT;
+
+	tv_mode.clock = pipe_config->port_clock;
+
+	tv_mode.progressive = tv_ctl & TV_PROGRESSIVE;
+
+	switch (tv_ctl & TV_OVERSAMPLE_MASK) {
+	case TV_OVERSAMPLE_8X:
+		tv_mode.oversample = 8;
+		break;
+	case TV_OVERSAMPLE_4X:
+		tv_mode.oversample = 4;
+		break;
+	case TV_OVERSAMPLE_2X:
+		tv_mode.oversample = 2;
+		break;
+	default:
+		tv_mode.oversample = 1;
+		break;
+	}
+
+	tmp = I915_READ(TV_WIN_POS);
+	xpos = tmp >> 16;
+	ypos = tmp & 0xffff;
+
+	tmp = I915_READ(TV_WIN_SIZE);
+	xsize = tmp >> 16;
+	ysize = tmp & 0xffff;
+
+	intel_tv_mode_to_mode(&mode, &tv_mode);
+
+	DRM_DEBUG_KMS("TV mode:\n");
+	drm_mode_debug_printmodeline(&mode);
+
+	intel_tv_scale_mode_horiz(&mode, hdisplay,
+				  xpos, mode.hdisplay - xsize - xpos);
+	intel_tv_scale_mode_vert(&mode, vdisplay,
+				 ypos, mode.vdisplay - ysize - ypos);
+
+	adjusted_mode->crtc_clock = mode.clock;
+	if (adjusted_mode->flags & DRM_MODE_FLAG_INTERLACE)
+		adjusted_mode->crtc_clock /= 2;
+
+	/* pixel counter doesn't work on i965gm TV output */
+	if (IS_I965GM(dev_priv))
+		adjusted_mode->private_flags |=
+			I915_MODE_FLAG_USE_SCANLINE_COUNTER;
 }
 
-static bool
+static bool intel_tv_source_too_wide(struct drm_i915_private *dev_priv,
+				     int hdisplay)
+{
+	return IS_GEN(dev_priv, 3) && hdisplay > 1024;
+}
+
+static bool intel_tv_vert_scaling(const struct drm_display_mode *tv_mode,
+				  const struct drm_connector_state *conn_state,
+				  int vdisplay)
+{
+	return tv_mode->crtc_vdisplay -
+		conn_state->tv.margins.top -
+		conn_state->tv.margins.bottom !=
+		vdisplay;
+}
+
+static int
 intel_tv_compute_config(struct intel_encoder *encoder,
 			struct intel_crtc_state *pipe_config,
 			struct drm_connector_state *conn_state)
 {
+	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
+	struct intel_tv_connector_state *tv_conn_state =
+		to_intel_tv_connector_state(conn_state);
 	const struct tv_mode *tv_mode = intel_tv_mode_find(conn_state);
 	struct drm_display_mode *adjusted_mode =
 		&pipe_config->base.adjusted_mode;
+	int hdisplay = adjusted_mode->crtc_hdisplay;
+	int vdisplay = adjusted_mode->crtc_vdisplay;
 
 	if (!tv_mode)
-		return false;
+		return -EINVAL;
 
 	if (adjusted_mode->flags & DRM_MODE_FLAG_DBLSCAN)
-		return false;
+		return -EINVAL;
 
 	pipe_config->output_format = INTEL_OUTPUT_FORMAT_RGB;
-	adjusted_mode->crtc_clock = tv_mode->clock;
+
 	DRM_DEBUG_KMS("forcing bpc to 8 for TV\n");
 	pipe_config->pipe_bpp = 8*3;
 
-	/* TV has it's own notion of sync and other mode flags, so clear them. */
-	adjusted_mode->flags = 0;
+	pipe_config->port_clock = tv_mode->clock;
+
+	intel_tv_mode_to_mode(adjusted_mode, tv_mode);
+	drm_mode_set_crtcinfo(adjusted_mode, 0);
+
+	if (intel_tv_source_too_wide(dev_priv, hdisplay) ||
+	    !intel_tv_vert_scaling(adjusted_mode, conn_state, vdisplay)) {
+		int extra, top, bottom;
+
+		extra = adjusted_mode->crtc_vdisplay - vdisplay;
+
+		if (extra < 0) {
+			DRM_DEBUG_KMS("No vertical scaling for >1024 pixel wide modes\n");
+			return -EINVAL;
+		}
+
+		/* Need to turn off the vertical filter and center the image */
+
+		/* Attempt to maintain the relative sizes of the margins */
+		top = conn_state->tv.margins.top;
+		bottom = conn_state->tv.margins.bottom;
+
+		if (top + bottom)
+			top = extra * top / (top + bottom);
+		else
+			top = extra / 2;
+		bottom = extra - top;
+
+		tv_conn_state->margins.top = top;
+		tv_conn_state->margins.bottom = bottom;
+
+		tv_conn_state->bypass_vfilter = true;
+
+		if (!tv_mode->progressive) {
+			adjusted_mode->clock /= 2;
+			adjusted_mode->crtc_clock /= 2;
+			adjusted_mode->flags |= DRM_MODE_FLAG_INTERLACE;
+		}
+	} else {
+		tv_conn_state->margins.top = conn_state->tv.margins.top;
+		tv_conn_state->margins.bottom = conn_state->tv.margins.bottom;
+
+		tv_conn_state->bypass_vfilter = false;
+	}
+
+	DRM_DEBUG_KMS("TV mode:\n");
+	drm_mode_debug_printmodeline(adjusted_mode);
 
 	/*
-	 * FIXME: We don't check whether the input mode is actually what we want
-	 * or whether userspace is doing something stupid.
+	 * The pipe scanline counter behaviour looks as follows when
+	 * using the TV encoder:
+	 *
+	 * time ->
+	 *
+	 * dsl=vtotal-1       |             |
+	 *                   ||            ||
+	 *               ___| |        ___| |
+	 *              /     |       /     |
+	 *             /      |      /      |
+	 * dsl=0   ___/       |_____/       |
+	 *        | | |  |  | |
+	 *         ^ ^ ^   ^ ^
+	 *         | | |   | pipe vblank/first part of tv vblank
+	 *         | | |   bottom margin
+	 *         | | active
+	 *         | top margin
+	 *         remainder of tv vblank
+	 *
+	 * When the TV encoder is used the pipe wants to run faster
+	 * than the expected rate. During the active portion the TV
+	 * encoder stalls the pipe every few lines to keep it in
+	 * check. When the TV encoder reaches the bottom margin the
+	 * pipe simply stops. Once we reach the TV vblank the pipe is
+	 * no longer stalled and it runs at the max rate (apparently
+	 * oversample clock on gen3, cdclk on gen4). Once the pipe
+	 * reaches the pipe vtotal the pipe stops for the remainder
+	 * of the TV vblank/top margin. The pipe starts up again when
+	 * the TV encoder exits the top margin.
+	 *
+	 * To avoid huge hassles for vblank timestamping we scale
+	 * the pipe timings as if the pipe always runs at the average
+	 * rate it maintains during the active period. This also
+	 * gives us a reasonable guesstimate as to the pixel rate.
+	 * Due to the variation in the actual pipe speed the scanline
+	 * counter will give us slightly erroneous results during the
+	 * TV vblank/margins. But since vtotal was selected such that
+	 * it matches the average rate of the pipe during the active
+	 * portion the error shouldn't cause any serious grief to
+	 * vblank timestamps.
+	 *
+	 * For posterity here is the empirically derived formula
+	 * that gives us the maximum length of the pipe vblank
+	 * we can use without causing display corruption. Following
+	 * this would allow us to have a ticking scanline counter
+	 * everywhere except during the bottom margin (there the
+	 * pipe always stops). I.e. this would eliminate the second
+	 * flat portion of the above graph. However, this would also
+	 * complicate vblank timestamping as the pipe vtotal would
+	 * no longer match the average rate the pipe runs at during
+	 * the active portion. Hence following this formula seems
+	 * more trouble than it's worth.
+	 *
+	 * if (IS_GEN(dev_priv, 4)) {
+	 *	num = cdclk * (tv_mode->oversample >> !tv_mode->progressive);
+	 *	den = tv_mode->clock;
+	 * } else {
+	 *	num = tv_mode->oversample >> !tv_mode->progressive;
+	 *	den = 1;
+	 * }
+	 * max_pipe_vblank_len ~=
+	 *	(num * tv_htotal * (tv_vblank_len + top_margin)) /
+	 *	(den * pipe_htotal);
 	 */
+	intel_tv_scale_mode_horiz(adjusted_mode, hdisplay,
+				  conn_state->tv.margins.left,
+				  conn_state->tv.margins.right);
+	intel_tv_scale_mode_vert(adjusted_mode, vdisplay,
+				 tv_conn_state->margins.top,
+				 tv_conn_state->margins.bottom);
+	drm_mode_set_crtcinfo(adjusted_mode, 0);
+	adjusted_mode->name[0] = '\0';
+
+	/* pixel counter doesn't work on i965gm TV output */
+	if (IS_I965GM(dev_priv))
+		adjusted_mode->private_flags |=
+			I915_MODE_FLAG_USE_SCANLINE_COUNTER;
 
-	return true;
+	return 0;
 }
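For the gen3 centering path above, the surplus lines are split between the top and bottom margins in proportion to what the user asked for. With illustrative numbers (crtc_vdisplay = 520, vdisplay = 480, user margins top = 20, bottom = 60):

/*
 * extra  = 520 - 480            = 40
 * top    = 40 * 20 / (20 + 60)  = 10
 * bottom = 40 - 10              = 30
 */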
 
 static void
@@ -987,14 +1416,16 @@ static void intel_tv_pre_enable(struct intel_encoder *encoder,
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	struct intel_crtc *intel_crtc = to_intel_crtc(pipe_config->base.crtc);
 	struct intel_tv *intel_tv = enc_to_tv(encoder);
+	const struct intel_tv_connector_state *tv_conn_state =
+		to_intel_tv_connector_state(conn_state);
 	const struct tv_mode *tv_mode = intel_tv_mode_find(conn_state);
-	u32 tv_ctl;
+	u32 tv_ctl, tv_filter_ctl;
 	u32 scctl1, scctl2, scctl3;
 	int i, j;
 	const struct video_levels *video_levels;
 	const struct color_conversion *color_conversion;
 	bool burst_ena;
-	int xpos = 0x0, ypos = 0x0;
+	int xpos, ypos;
 	unsigned int xsize, ysize;
 
 	if (!tv_mode)
@@ -1030,7 +1461,21 @@ static void intel_tv_pre_enable(struct intel_encoder *encoder,
 	}
 
 	tv_ctl |= TV_ENC_PIPE_SEL(intel_crtc->pipe);
-	tv_ctl |= tv_mode->oversample;
+
+	switch (tv_mode->oversample) {
+	case 8:
+		tv_ctl |= TV_OVERSAMPLE_8X;
+		break;
+	case 4:
+		tv_ctl |= TV_OVERSAMPLE_4X;
+		break;
+	case 2:
+		tv_ctl |= TV_OVERSAMPLE_2X;
+		break;
+	default:
+		tv_ctl |= TV_OVERSAMPLE_NONE;
+		break;
+	}
 
 	if (tv_mode->progressive)
 		tv_ctl |= TV_PROGRESSIVE;
@@ -1082,19 +1527,20 @@ static void intel_tv_pre_enable(struct intel_encoder *encoder,
 	assert_pipe_disabled(dev_priv, intel_crtc->pipe);
 
 	/* Filter ctl must be set before TV_WIN_SIZE */
-	I915_WRITE(TV_FILTER_CTL_1, TV_AUTO_SCALE);
+	tv_filter_ctl = TV_AUTO_SCALE;
+	if (tv_conn_state->bypass_vfilter)
+		tv_filter_ctl |= TV_V_FILTER_BYPASS;
+	I915_WRITE(TV_FILTER_CTL_1, tv_filter_ctl);
+
 	xsize = tv_mode->hblank_start - tv_mode->hblank_end;
-	if (tv_mode->progressive)
-		ysize = tv_mode->nbr_end + 1;
-	else
-		ysize = 2*tv_mode->nbr_end + 1;
+	ysize = intel_tv_mode_vdisplay(tv_mode);
 
-	xpos += conn_state->tv.margins.left;
-	ypos += conn_state->tv.margins.top;
+	xpos = conn_state->tv.margins.left;
+	ypos = tv_conn_state->margins.top;
 	xsize -= (conn_state->tv.margins.left +
 		  conn_state->tv.margins.right);
-	ysize -= (conn_state->tv.margins.top +
-		  conn_state->tv.margins.bottom);
+	ysize -= (tv_conn_state->margins.top +
+		  tv_conn_state->margins.bottom);
 	I915_WRITE(TV_WIN_POS, (xpos<<16)|ypos);
 	I915_WRITE(TV_WIN_SIZE, (xsize<<16)|ysize);
 
@@ -1111,23 +1557,6 @@ static void intel_tv_pre_enable(struct intel_encoder *encoder,
 	I915_WRITE(TV_CTL, tv_ctl);
 }
 
-static const struct drm_display_mode reported_modes[] = {
-	{
-		.name = "NTSC 480i",
-		.clock = 107520,
-		.hdisplay = 1280,
-		.hsync_start = 1368,
-		.hsync_end = 1496,
-		.htotal = 1712,
-
-		.vdisplay = 1024,
-		.vsync_start = 1027,
-		.vsync_end = 1034,
-		.vtotal = 1104,
-		.type = DRM_MODE_TYPE_DRIVER,
-	},
-};
-
 static int
 intel_tv_detect_type(struct intel_tv *intel_tv,
 		      struct drm_connector *connector)
@@ -1234,16 +1663,18 @@ static void intel_tv_find_better_format(struct drm_connector *connector)
 	const struct tv_mode *tv_mode = intel_tv_mode_find(connector->state);
 	int i;
 
-	if ((intel_tv->type == DRM_MODE_CONNECTOR_Component) ==
-		tv_mode->component_only)
+	/* Component supports everything so we can keep the current mode */
+	if (intel_tv->type == DRM_MODE_CONNECTOR_Component)
 		return;
 
+	/* If the current mode is fine don't change it */
+	if (!tv_mode->component_only)
+		return;
 
 	for (i = 0; i < ARRAY_SIZE(tv_modes); i++) {
-		tv_mode = tv_modes + i;
+		tv_mode = &tv_modes[i];
 
-		if ((intel_tv->type == DRM_MODE_CONNECTOR_Component) ==
-			tv_mode->component_only)
+		if (!tv_mode->component_only)
 			break;
 	}
 
@@ -1255,7 +1686,6 @@ intel_tv_detect(struct drm_connector *connector,
 		struct drm_modeset_acquire_ctx *ctx,
 		bool force)
 {
-	struct drm_display_mode mode;
 	struct intel_tv *intel_tv = intel_attached_tv(connector);
 	enum drm_connector_status status;
 	int type;
@@ -1264,13 +1694,11 @@ intel_tv_detect(struct drm_connector *connector,
 		      connector->base.id, connector->name,
 		      force);
 
-	mode = reported_modes[0];
-
 	if (force) {
 		struct intel_load_detect_pipe tmp;
 		int ret;
 
-		ret = intel_get_load_detect_pipe(connector, &mode, &tmp, ctx);
+		ret = intel_get_load_detect_pipe(connector, NULL, &tmp, ctx);
 		if (ret < 0)
 			return ret;
 
@@ -1294,84 +1722,85 @@ intel_tv_detect(struct drm_connector *connector,
 }
 
 static const struct input_res {
-	const char *name;
-	int w, h;
+	u16 w, h;
 } input_res_table[] = {
-	{"640x480", 640, 480},
-	{"800x600", 800, 600},
-	{"1024x768", 1024, 768},
-	{"1280x1024", 1280, 1024},
-	{"848x480", 848, 480},
-	{"1280x720", 1280, 720},
-	{"1920x1080", 1920, 1080},
+	{ 640, 480 },
+	{ 800, 600 },
+	{ 1024, 768 },
+	{ 1280, 1024 },
+	{ 848, 480 },
+	{ 1280, 720 },
+	{ 1920, 1080 },
 };
 
-/*
- * Chose preferred mode  according to line number of TV format
- */
+/* Choose preferred mode according to line number of TV format */
+static bool
+intel_tv_is_preferred_mode(const struct drm_display_mode *mode,
+			   const struct tv_mode *tv_mode)
+{
+	int vdisplay = intel_tv_mode_vdisplay(tv_mode);
+
+	/* prefer 480 line modes for all SD TV modes */
+	if (vdisplay <= 576)
+		vdisplay = 480;
+
+	return vdisplay == mode->vdisplay;
+}
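Concretely, the clamp above produces these pairings (line counts per the tv_mode tables):

/*
 * 480i/576i SD modes -> clamped to 480 -> prefer  640x480
 * 720p               -> 720            -> prefers 1280x720
 * 1080i/1080p        -> 1080           -> prefers 1920x1080
 */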
+
 static void
-intel_tv_choose_preferred_modes(const struct tv_mode *tv_mode,
-			       struct drm_display_mode *mode_ptr)
+intel_tv_set_mode_type(struct drm_display_mode *mode,
+		       const struct tv_mode *tv_mode)
 {
-	if (tv_mode->nbr_end < 480 && mode_ptr->vdisplay == 480)
-		mode_ptr->type |= DRM_MODE_TYPE_PREFERRED;
-	else if (tv_mode->nbr_end > 480) {
-		if (tv_mode->progressive == true && tv_mode->nbr_end < 720) {
-			if (mode_ptr->vdisplay == 720)
-				mode_ptr->type |= DRM_MODE_TYPE_PREFERRED;
-		} else if (mode_ptr->vdisplay == 1080)
-				mode_ptr->type |= DRM_MODE_TYPE_PREFERRED;
-	}
+	mode->type = DRM_MODE_TYPE_DRIVER;
+
+	if (intel_tv_is_preferred_mode(mode, tv_mode))
+		mode->type |= DRM_MODE_TYPE_PREFERRED;
 }
 
 static int
 intel_tv_get_modes(struct drm_connector *connector)
 {
-	struct drm_display_mode *mode_ptr;
+	struct drm_i915_private *dev_priv = to_i915(connector->dev);
 	const struct tv_mode *tv_mode = intel_tv_mode_find(connector->state);
-	int j, count = 0;
-	u64 tmp;
+	int i, count = 0;
 
-	for (j = 0; j < ARRAY_SIZE(input_res_table);
-	     j++) {
-		const struct input_res *input = &input_res_table[j];
-		unsigned int hactive_s = input->w;
-		unsigned int vactive_s = input->h;
+	for (i = 0; i < ARRAY_SIZE(input_res_table); i++) {
+		const struct input_res *input = &input_res_table[i];
+		struct drm_display_mode *mode;
 
-		if (tv_mode->max_srcw && input->w > tv_mode->max_srcw)
+		if (input->w > 1024 &&
+		    !tv_mode->progressive &&
+		    !tv_mode->component_only)
 			continue;
 
-		if (input->w > 1024 && (!tv_mode->progressive
-					&& !tv_mode->component_only))
+		/* no vertical scaling with wide sources on gen3 */
+		if (IS_GEN(dev_priv, 3) && input->w > 1024 &&
+		    input->h > intel_tv_mode_vdisplay(tv_mode))
 			continue;
 
-		mode_ptr = drm_mode_create(connector->dev);
-		if (!mode_ptr)
+		mode = drm_mode_create(connector->dev);
+		if (!mode)
 			continue;
-		strlcpy(mode_ptr->name, input->name, DRM_DISPLAY_MODE_LEN);
-
-		mode_ptr->hdisplay = hactive_s;
-		mode_ptr->hsync_start = hactive_s + 1;
-		mode_ptr->hsync_end = hactive_s + 64;
-		if (mode_ptr->hsync_end <= mode_ptr->hsync_start)
-			mode_ptr->hsync_end = mode_ptr->hsync_start + 1;
-		mode_ptr->htotal = hactive_s + 96;
-
-		mode_ptr->vdisplay = vactive_s;
-		mode_ptr->vsync_start = vactive_s + 1;
-		mode_ptr->vsync_end = vactive_s + 32;
-		if (mode_ptr->vsync_end <= mode_ptr->vsync_start)
-			mode_ptr->vsync_end = mode_ptr->vsync_start  + 1;
-		mode_ptr->vtotal = vactive_s + 33;
-
-		tmp = mul_u32_u32(tv_mode->refresh, mode_ptr->vtotal);
-		tmp *= mode_ptr->htotal;
-		tmp = div_u64(tmp, 1000000);
-		mode_ptr->clock = (int) tmp;
-
-		mode_ptr->type = DRM_MODE_TYPE_DRIVER;
-		intel_tv_choose_preferred_modes(tv_mode, mode_ptr);
-		drm_mode_probed_add(connector, mode_ptr);
+
+		/*
+		 * We take the TV mode and scale it to look
+		 * like it had the expected h/vdisplay. This
+		 * provides the most information to userspace
+		 * about the actual timings of the mode. We
+		 * do ignore the margins though.
+		 */
+		intel_tv_mode_to_mode(mode, tv_mode);
+		if (count == 0) {
+			DRM_DEBUG_KMS("TV mode:\n");
+			drm_mode_debug_printmodeline(mode);
+		}
+		intel_tv_scale_mode_horiz(mode, input->w, 0, 0);
+		intel_tv_scale_mode_vert(mode, input->h, 0, 0);
+		intel_tv_set_mode_type(mode, tv_mode);
+
+		drm_mode_set_name(mode);
+
+		drm_mode_probed_add(connector, mode);
 		count++;
 	}
 
@@ -1384,7 +1813,7 @@ static const struct drm_connector_funcs intel_tv_connector_funcs = {
 	.destroy = intel_connector_destroy,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
-	.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
+	.atomic_duplicate_state = intel_tv_connector_duplicate_state,
 };
 
 static int intel_tv_atomic_check(struct drm_connector *connector,
@@ -1531,11 +1960,15 @@ intel_tv_init(struct drm_i915_private *dev_priv)
 	connector->doublescan_allowed = false;
 
 	/* Create TV properties then attach current values */
-	for (i = 0; i < ARRAY_SIZE(tv_modes); i++)
+	for (i = 0; i < ARRAY_SIZE(tv_modes); i++) {
+		/* 1080p50/1080p60 not supported on gen3 */
+		if (IS_GEN(dev_priv, 3) &&
+		    tv_modes[i].oversample == 1)
+			break;
+
 		tv_format_names[i] = tv_modes[i].name;
-	drm_mode_create_tv_properties(dev,
-				      ARRAY_SIZE(tv_modes),
-				      tv_format_names);
+	}
+	drm_mode_create_tv_properties(dev, i, tv_format_names);
 
 	drm_object_attach_property(&connector->base, dev->mode_config.tv_mode_property,
 				   state->tv.mode);
diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
index b34c318b238d..e711eb3268bc 100644
--- a/drivers/gpu/drm/i915/intel_uc.c
+++ b/drivers/gpu/drm/i915/intel_uc.c
@@ -26,6 +26,7 @@
 #include "intel_guc_submission.h"
 #include "intel_guc.h"
 #include "i915_drv.h"
+#include "i915_reset.h"
 
 static void guc_free_load_err_log(struct intel_guc *guc);
 
@@ -71,7 +72,7 @@ static int __get_default_guc_log_level(struct drm_i915_private *i915)
 {
 	int guc_log_level;
 
-	if (!HAS_GUC(i915) || !intel_uc_is_using_guc())
+	if (!HAS_GUC(i915) || !intel_uc_is_using_guc(i915))
 		guc_log_level = GUC_LOG_LEVEL_DISABLED;
 	else if (IS_ENABLED(CONFIG_DRM_I915_DEBUG) ||
 		 IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
@@ -112,11 +113,11 @@ static void sanitize_options_early(struct drm_i915_private *i915)
 
 	DRM_DEBUG_DRIVER("enable_guc=%d (submission:%s huc:%s)\n",
 			 i915_modparams.enable_guc,
-			 yesno(intel_uc_is_using_guc_submission()),
-			 yesno(intel_uc_is_using_huc()));
+			 yesno(intel_uc_is_using_guc_submission(i915)),
+			 yesno(intel_uc_is_using_huc(i915)));
 
 	/* Verify GuC firmware availability */
-	if (intel_uc_is_using_guc() && !intel_uc_fw_is_selected(guc_fw)) {
+	if (intel_uc_is_using_guc(i915) && !intel_uc_fw_is_selected(guc_fw)) {
 		DRM_WARN("Incompatible option detected: %s=%d, %s!\n",
 			 "enable_guc", i915_modparams.enable_guc,
 			 !HAS_GUC(i915) ? "no GuC hardware" :
@@ -124,7 +125,7 @@ static void sanitize_options_early(struct drm_i915_private *i915)
 	}
 
 	/* Verify HuC firmware availability */
-	if (intel_uc_is_using_huc() && !intel_uc_fw_is_selected(huc_fw)) {
+	if (intel_uc_is_using_huc(i915) && !intel_uc_fw_is_selected(huc_fw)) {
 		DRM_WARN("Incompatible option detected: %s=%d, %s!\n",
 			 "enable_guc", i915_modparams.enable_guc,
 			 !HAS_HUC(i915) ? "no HuC hardware" :
@@ -136,7 +137,7 @@ static void sanitize_options_early(struct drm_i915_private *i915)
 		i915_modparams.guc_log_level =
 			__get_default_guc_log_level(i915);
 
-	if (i915_modparams.guc_log_level > 0 && !intel_uc_is_using_guc()) {
+	if (i915_modparams.guc_log_level > 0 && !intel_uc_is_using_guc(i915)) {
 		DRM_WARN("Incompatible option detected: %s=%d, %s!\n",
 			 "guc_log_level", i915_modparams.guc_log_level,
 			 !HAS_GUC(i915) ? "no GuC hardware" :
@@ -354,7 +355,7 @@ int intel_uc_init_hw(struct drm_i915_private *i915)
 
 	/* WaEnableuKernelHeaderValidFix:skl */
 	/* WaEnableGuCBootHashCheckNotSet:skl,bxt,kbl */
-	if (IS_GEN9(i915))
+	if (IS_GEN(i915, 9))
 		attempts = 3;
 	else
 		attempts = 1;
diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h
index 25d73ada74ae..870faf9011b9 100644
--- a/drivers/gpu/drm/i915/intel_uc.h
+++ b/drivers/gpu/drm/i915/intel_uc.h
@@ -41,19 +41,19 @@ void intel_uc_fini(struct drm_i915_private *dev_priv);
 int intel_uc_suspend(struct drm_i915_private *dev_priv);
 int intel_uc_resume(struct drm_i915_private *dev_priv);
 
-static inline bool intel_uc_is_using_guc(void)
+static inline bool intel_uc_is_using_guc(struct drm_i915_private *i915)
 {
 	GEM_BUG_ON(i915_modparams.enable_guc < 0);
 	return i915_modparams.enable_guc > 0;
 }
 
-static inline bool intel_uc_is_using_guc_submission(void)
+static inline bool intel_uc_is_using_guc_submission(struct drm_i915_private *i915)
 {
 	GEM_BUG_ON(i915_modparams.enable_guc < 0);
 	return i915_modparams.enable_guc & ENABLE_GUC_SUBMISSION;
 }
 
-static inline bool intel_uc_is_using_huc(void)
+static inline bool intel_uc_is_using_huc(struct drm_i915_private *i915)
 {
 	GEM_BUG_ON(i915_modparams.enable_guc < 0);
 	return i915_modparams.enable_guc & ENABLE_GUC_LOAD_HUC;
diff --git a/drivers/gpu/drm/i915/intel_uc_fw.c b/drivers/gpu/drm/i915/intel_uc_fw.c
index fd496416087c..becf05ebae4d 100644
--- a/drivers/gpu/drm/i915/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/intel_uc_fw.c
@@ -46,12 +46,17 @@ void intel_uc_fw_fetch(struct drm_i915_private *dev_priv,
 	size_t size;
 	int err;
 
+	if (!uc_fw->path) {
+		dev_info(dev_priv->drm.dev,
+			 "%s: No firmware was defined for %s!\n",
+			 intel_uc_fw_type_repr(uc_fw->type),
+			 intel_platform_name(INTEL_INFO(dev_priv)->platform));
+		return;
+	}
+
 	DRM_DEBUG_DRIVER("%s fw fetch %s\n",
 			 intel_uc_fw_type_repr(uc_fw->type), uc_fw->path);
 
-	if (!uc_fw->path)
-		return;
-
 	uc_fw->fetch_status = INTEL_UC_FIRMWARE_PENDING;
 	DRM_DEBUG_DRIVER("%s fw fetch %s\n",
 			 intel_uc_fw_type_repr(uc_fw->type),
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 9289515108c3..75646a1e0051 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -528,7 +528,7 @@ check_for_unclaimed_mmio(struct drm_i915_private *dev_priv)
 	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
 		ret |= vlv_check_for_unclaimed_mmio(dev_priv);
 
-	if (IS_GEN6(dev_priv) || IS_GEN7(dev_priv))
+	if (IS_GEN_RANGE(dev_priv, 6, 7))
 		ret |= gen6_check_for_fifo_debug(dev_priv);
 
 	return ret;
@@ -556,7 +556,7 @@ static void __intel_uncore_early_sanitize(struct drm_i915_private *dev_priv,
 		dev_priv->uncore.funcs.force_wake_get(dev_priv,
 						      restore_forcewake);
 
-		if (IS_GEN6(dev_priv) || IS_GEN7(dev_priv))
+		if (IS_GEN_RANGE(dev_priv, 6, 7))
 			dev_priv->uncore.fifo_count =
 				fifo_free_entries(dev_priv);
 		spin_unlock_irq(&dev_priv->uncore.lock);
@@ -1398,7 +1398,7 @@ static void intel_uncore_fw_domains_init(struct drm_i915_private *dev_priv)
 	if (INTEL_GEN(dev_priv) <= 5 || intel_vgpu_active(dev_priv))
 		return;
 
-	if (IS_GEN6(dev_priv)) {
+	if (IS_GEN(dev_priv, 6)) {
 		dev_priv->uncore.fw_reset = 0;
 		dev_priv->uncore.fw_set = FORCEWAKE_KERNEL;
 		dev_priv->uncore.fw_clear = 0;
@@ -1437,7 +1437,7 @@ static void intel_uncore_fw_domains_init(struct drm_i915_private *dev_priv)
 				       FORCEWAKE_MEDIA_VEBOX_GEN11(i),
 				       FORCEWAKE_ACK_MEDIA_VEBOX_GEN11(i));
 		}
-	} else if (IS_GEN10(dev_priv) || IS_GEN9(dev_priv)) {
+	} else if (IS_GEN_RANGE(dev_priv, 9, 10)) {
 		dev_priv->uncore.funcs.force_wake_get =
 			fw_domains_get_with_fallback;
 		dev_priv->uncore.funcs.force_wake_put = fw_domains_put;
@@ -1503,7 +1503,7 @@ static void intel_uncore_fw_domains_init(struct drm_i915_private *dev_priv)
 			fw_domain_init(dev_priv, FW_DOMAIN_ID_RENDER,
 				       FORCEWAKE, FORCEWAKE_ACK);
 		}
-	} else if (IS_GEN6(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 6)) {
 		dev_priv->uncore.funcs.force_wake_get =
 			fw_domains_get_with_thread_status;
 		dev_priv->uncore.funcs.force_wake_put = fw_domains_put;
@@ -1567,13 +1567,13 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 	dev_priv->uncore.pmic_bus_access_nb.notifier_call =
 		i915_pmic_bus_access_notifier;
 
-	if (IS_GEN(dev_priv, 2, 4) || intel_vgpu_active(dev_priv)) {
+	if (IS_GEN_RANGE(dev_priv, 2, 4) || intel_vgpu_active(dev_priv)) {
 		ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, gen2);
 		ASSIGN_READ_MMIO_VFUNCS(dev_priv, gen2);
-	} else if (IS_GEN5(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 5)) {
 		ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, gen5);
 		ASSIGN_READ_MMIO_VFUNCS(dev_priv, gen5);
-	} else if (IS_GEN(dev_priv, 6, 7)) {
+	} else if (IS_GEN_RANGE(dev_priv, 6, 7)) {
 		ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, gen6);
 
 		if (IS_VALLEYVIEW(dev_priv)) {
@@ -1582,7 +1582,7 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 		} else {
 			ASSIGN_READ_MMIO_VFUNCS(dev_priv, gen6);
 		}
-	} else if (IS_GEN8(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 8)) {
 		if (IS_CHERRYVIEW(dev_priv)) {
 			ASSIGN_FW_DOMAINS_TABLE(__chv_fw_ranges);
 			ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, fwtable);
@@ -1592,7 +1592,7 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
 			ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, gen8);
 			ASSIGN_READ_MMIO_VFUNCS(dev_priv, gen6);
 		}
-	} else if (IS_GEN(dev_priv, 9, 10)) {
+	} else if (IS_GEN_RANGE(dev_priv, 9, 10)) {
 		ASSIGN_FW_DOMAINS_TABLE(__gen9_fw_ranges);
 		ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, fwtable);
 		ASSIGN_READ_MMIO_VFUNCS(dev_priv, fwtable);
@@ -1670,6 +1670,7 @@ int i915_reg_read_ioctl(struct drm_device *dev,
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct drm_i915_reg_read *reg = data;
 	struct reg_whitelist const *entry;
+	intel_wakeref_t wakeref;
 	unsigned int flags;
 	int remain;
 	int ret = 0;
@@ -1695,286 +1696,25 @@ int i915_reg_read_ioctl(struct drm_device *dev,
 
 	flags = reg->offset & (entry->size - 1);
 
-	intel_runtime_pm_get(dev_priv);
-	if (entry->size == 8 && flags == I915_REG_READ_8B_WA)
-		reg->val = I915_READ64_2x32(entry->offset_ldw,
-					    entry->offset_udw);
-	else if (entry->size == 8 && flags == 0)
-		reg->val = I915_READ64(entry->offset_ldw);
-	else if (entry->size == 4 && flags == 0)
-		reg->val = I915_READ(entry->offset_ldw);
-	else if (entry->size == 2 && flags == 0)
-		reg->val = I915_READ16(entry->offset_ldw);
-	else if (entry->size == 1 && flags == 0)
-		reg->val = I915_READ8(entry->offset_ldw);
-	else
-		ret = -EINVAL;
-	intel_runtime_pm_put(dev_priv);
-
-	return ret;
-}
-
-static void gen3_stop_engine(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	const u32 base = engine->mmio_base;
-
-	if (intel_engine_stop_cs(engine))
-		DRM_DEBUG_DRIVER("%s: timed out on STOP_RING\n", engine->name);
-
-	I915_WRITE_FW(RING_HEAD(base), I915_READ_FW(RING_TAIL(base)));
-	POSTING_READ_FW(RING_HEAD(base)); /* paranoia */
-
-	I915_WRITE_FW(RING_HEAD(base), 0);
-	I915_WRITE_FW(RING_TAIL(base), 0);
-	POSTING_READ_FW(RING_TAIL(base));
-
-	/* The ring must be empty before it is disabled */
-	I915_WRITE_FW(RING_CTL(base), 0);
-
-	/* Check acts as a post */
-	if (I915_READ_FW(RING_HEAD(base)) != 0)
-		DRM_DEBUG_DRIVER("%s: ring head not parked\n",
-				 engine->name);
-}
-
-static void i915_stop_engines(struct drm_i915_private *dev_priv,
-			      unsigned int engine_mask)
-{
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-
-	if (INTEL_GEN(dev_priv) < 3)
-		return;
-
-	for_each_engine_masked(engine, dev_priv, engine_mask, id)
-		gen3_stop_engine(engine);
-}
-
-static bool i915_in_reset(struct pci_dev *pdev)
-{
-	u8 gdrst;
-
-	pci_read_config_byte(pdev, I915_GDRST, &gdrst);
-	return gdrst & GRDOM_RESET_STATUS;
-}
-
-static int i915_do_reset(struct drm_i915_private *dev_priv,
-			 unsigned int engine_mask,
-			 unsigned int retry)
-{
-	struct pci_dev *pdev = dev_priv->drm.pdev;
-	int err;
-
-	/* Assert reset for at least 20 usec, and wait for acknowledgement. */
-	pci_write_config_byte(pdev, I915_GDRST, GRDOM_RESET_ENABLE);
-	usleep_range(50, 200);
-	err = wait_for(i915_in_reset(pdev), 500);
-
-	/* Clear the reset request. */
-	pci_write_config_byte(pdev, I915_GDRST, 0);
-	usleep_range(50, 200);
-	if (!err)
-		err = wait_for(!i915_in_reset(pdev), 500);
-
-	return err;
-}
-
-static bool g4x_reset_complete(struct pci_dev *pdev)
-{
-	u8 gdrst;
-
-	pci_read_config_byte(pdev, I915_GDRST, &gdrst);
-	return (gdrst & GRDOM_RESET_ENABLE) == 0;
-}
-
-static int g33_do_reset(struct drm_i915_private *dev_priv,
-			unsigned int engine_mask,
-			unsigned int retry)
-{
-	struct pci_dev *pdev = dev_priv->drm.pdev;
-
-	pci_write_config_byte(pdev, I915_GDRST, GRDOM_RESET_ENABLE);
-	return wait_for(g4x_reset_complete(pdev), 500);
-}
-
-static int g4x_do_reset(struct drm_i915_private *dev_priv,
-			unsigned int engine_mask,
-			unsigned int retry)
-{
-	struct pci_dev *pdev = dev_priv->drm.pdev;
-	int ret;
-
-	/* WaVcpClkGateDisableForMediaReset:ctg,elk */
-	I915_WRITE(VDECCLK_GATE_D,
-		   I915_READ(VDECCLK_GATE_D) | VCP_UNIT_CLOCK_GATE_DISABLE);
-	POSTING_READ(VDECCLK_GATE_D);
-
-	pci_write_config_byte(pdev, I915_GDRST,
-			      GRDOM_MEDIA | GRDOM_RESET_ENABLE);
-	ret =  wait_for(g4x_reset_complete(pdev), 500);
-	if (ret) {
-		DRM_DEBUG_DRIVER("Wait for media reset failed\n");
-		goto out;
-	}
-
-	pci_write_config_byte(pdev, I915_GDRST,
-			      GRDOM_RENDER | GRDOM_RESET_ENABLE);
-	ret =  wait_for(g4x_reset_complete(pdev), 500);
-	if (ret) {
-		DRM_DEBUG_DRIVER("Wait for render reset failed\n");
-		goto out;
-	}
-
-out:
-	pci_write_config_byte(pdev, I915_GDRST, 0);
-
-	I915_WRITE(VDECCLK_GATE_D,
-		   I915_READ(VDECCLK_GATE_D) & ~VCP_UNIT_CLOCK_GATE_DISABLE);
-	POSTING_READ(VDECCLK_GATE_D);
-
-	return ret;
-}
-
-static int ironlake_do_reset(struct drm_i915_private *dev_priv,
-			     unsigned int engine_mask,
-			     unsigned int retry)
-{
-	int ret;
-
-	I915_WRITE(ILK_GDSR, ILK_GRDOM_RENDER | ILK_GRDOM_RESET_ENABLE);
-	ret = intel_wait_for_register(dev_priv,
-				      ILK_GDSR, ILK_GRDOM_RESET_ENABLE, 0,
-				      500);
-	if (ret) {
-		DRM_DEBUG_DRIVER("Wait for render reset failed\n");
-		goto out;
-	}
-
-	I915_WRITE(ILK_GDSR, ILK_GRDOM_MEDIA | ILK_GRDOM_RESET_ENABLE);
-	ret = intel_wait_for_register(dev_priv,
-				      ILK_GDSR, ILK_GRDOM_RESET_ENABLE, 0,
-				      500);
-	if (ret) {
-		DRM_DEBUG_DRIVER("Wait for media reset failed\n");
-		goto out;
+	with_intel_runtime_pm(dev_priv, wakeref) {
+		if (entry->size == 8 && flags == I915_REG_READ_8B_WA)
+			reg->val = I915_READ64_2x32(entry->offset_ldw,
+						    entry->offset_udw);
+		else if (entry->size == 8 && flags == 0)
+			reg->val = I915_READ64(entry->offset_ldw);
+		else if (entry->size == 4 && flags == 0)
+			reg->val = I915_READ(entry->offset_ldw);
+		else if (entry->size == 2 && flags == 0)
+			reg->val = I915_READ16(entry->offset_ldw);
+		else if (entry->size == 1 && flags == 0)
+			reg->val = I915_READ8(entry->offset_ldw);
+		else
+			ret = -EINVAL;
 	}
 
-out:
-	I915_WRITE(ILK_GDSR, 0);
-	POSTING_READ(ILK_GDSR);
 	return ret;
 }
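This hunk is part of the series-wide conversion to tracked wakerefs; both idioms appear in this patch. A condensed sketch of the two forms:

	intel_wakeref_t wakeref;

	/* scoped form: the put happens automatically on block exit */
	with_intel_runtime_pm(dev_priv, wakeref) {
		reg->val = I915_READ(entry->offset_ldw);
	}

	/* open-coded form, as used in the selftest hunks further down */
	wakeref = intel_runtime_pm_get(dev_priv);
	/* ... MMIO access ... */
	intel_runtime_pm_put(dev_priv, wakeref);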
 
-/* Reset the hardware domains (GENX_GRDOM_*) specified by mask */
-static int gen6_hw_domain_reset(struct drm_i915_private *dev_priv,
-				u32 hw_domain_mask)
-{
-	int err;
-
-	/* GEN6_GDRST is not in the gt power well, no need to check
-	 * for fifo space for the write or forcewake the chip for
-	 * the read
-	 */
-	__raw_i915_write32(dev_priv, GEN6_GDRST, hw_domain_mask);
-
-	/* Wait for the device to ack the reset requests */
-	err = __intel_wait_for_register_fw(dev_priv,
-					   GEN6_GDRST, hw_domain_mask, 0,
-					   500, 0,
-					   NULL);
-	if (err)
-		DRM_DEBUG_DRIVER("Wait for 0x%08x engines reset failed\n",
-				 hw_domain_mask);
-
-	return err;
-}
-
-/**
- * gen6_reset_engines - reset individual engines
- * @dev_priv: i915 device
- * @engine_mask: mask of intel_ring_flag() engines or ALL_ENGINES for full reset
- * @retry: the count of of previous attempts to reset.
- *
- * This function will reset the individual engines that are set in engine_mask.
- * If you provide ALL_ENGINES as mask, full global domain reset will be issued.
- *
- * Note: It is responsibility of the caller to handle the difference between
- * asking full domain reset versus reset for all available individual engines.
- *
- * Returns 0 on success, nonzero on error.
- */
-static int gen6_reset_engines(struct drm_i915_private *dev_priv,
-			      unsigned int engine_mask,
-			      unsigned int retry)
-{
-	struct intel_engine_cs *engine;
-	const u32 hw_engine_mask[I915_NUM_ENGINES] = {
-		[RCS] = GEN6_GRDOM_RENDER,
-		[BCS] = GEN6_GRDOM_BLT,
-		[VCS] = GEN6_GRDOM_MEDIA,
-		[VCS2] = GEN8_GRDOM_MEDIA2,
-		[VECS] = GEN6_GRDOM_VECS,
-	};
-	u32 hw_mask;
-
-	if (engine_mask == ALL_ENGINES) {
-		hw_mask = GEN6_GRDOM_FULL;
-	} else {
-		unsigned int tmp;
-
-		hw_mask = 0;
-		for_each_engine_masked(engine, dev_priv, engine_mask, tmp)
-			hw_mask |= hw_engine_mask[engine->id];
-	}
-
-	return gen6_hw_domain_reset(dev_priv, hw_mask);
-}
-
-/**
- * gen11_reset_engines - reset individual engines
- * @dev_priv: i915 device
- * @engine_mask: mask of intel_ring_flag() engines or ALL_ENGINES for full reset
- *
- * This function will reset the individual engines that are set in engine_mask.
- * If you provide ALL_ENGINES as mask, full global domain reset will be issued.
- *
- * Note: It is responsibility of the caller to handle the difference between
- * asking full domain reset versus reset for all available individual engines.
- *
- * Returns 0 on success, nonzero on error.
- */
-static int gen11_reset_engines(struct drm_i915_private *dev_priv,
-			       unsigned int engine_mask)
-{
-	struct intel_engine_cs *engine;
-	const u32 hw_engine_mask[I915_NUM_ENGINES] = {
-		[RCS] = GEN11_GRDOM_RENDER,
-		[BCS] = GEN11_GRDOM_BLT,
-		[VCS] = GEN11_GRDOM_MEDIA,
-		[VCS2] = GEN11_GRDOM_MEDIA2,
-		[VCS3] = GEN11_GRDOM_MEDIA3,
-		[VCS4] = GEN11_GRDOM_MEDIA4,
-		[VECS] = GEN11_GRDOM_VECS,
-		[VECS2] = GEN11_GRDOM_VECS2,
-	};
-	u32 hw_mask;
-
-	BUILD_BUG_ON(VECS2 + 1 != I915_NUM_ENGINES);
-
-	if (engine_mask == ALL_ENGINES) {
-		hw_mask = GEN11_GRDOM_FULL;
-	} else {
-		unsigned int tmp;
-
-		hw_mask = 0;
-		for_each_engine_masked(engine, dev_priv, engine_mask, tmp)
-			hw_mask |= hw_engine_mask[engine->id];
-	}
-
-	return gen6_hw_domain_reset(dev_priv, hw_mask);
-}
-
 /**
  * __intel_wait_for_register_fw - wait until register matches expected state
  * @dev_priv: the i915 device
@@ -2079,202 +1819,15 @@ int __intel_wait_for_register(struct drm_i915_private *dev_priv,
 				 (reg_value & mask) == value,
 				 slow_timeout_ms * 1000, 10, 1000);
 
+	/* just trace the final value */
+	trace_i915_reg_rw(false, reg, reg_value, sizeof(reg_value), true);
+
 	if (out_value)
 		*out_value = reg_value;
 
 	return ret;
 }
 
-static int gen8_engine_reset_prepare(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-	int ret;
-
-	I915_WRITE_FW(RING_RESET_CTL(engine->mmio_base),
-		      _MASKED_BIT_ENABLE(RESET_CTL_REQUEST_RESET));
-
-	ret = __intel_wait_for_register_fw(dev_priv,
-					   RING_RESET_CTL(engine->mmio_base),
-					   RESET_CTL_READY_TO_RESET,
-					   RESET_CTL_READY_TO_RESET,
-					   700, 0,
-					   NULL);
-	if (ret)
-		DRM_ERROR("%s: reset request timeout\n", engine->name);
-
-	return ret;
-}
-
-static void gen8_engine_reset_cancel(struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-
-	I915_WRITE_FW(RING_RESET_CTL(engine->mmio_base),
-		      _MASKED_BIT_DISABLE(RESET_CTL_REQUEST_RESET));
-}
-
-static int reset_engines(struct drm_i915_private *i915,
-			 unsigned int engine_mask,
-			 unsigned int retry)
-{
-	if (INTEL_GEN(i915) >= 11)
-		return gen11_reset_engines(i915, engine_mask);
-	else
-		return gen6_reset_engines(i915, engine_mask, retry);
-}
-
-static int gen8_reset_engines(struct drm_i915_private *dev_priv,
-			      unsigned int engine_mask,
-			      unsigned int retry)
-{
-	struct intel_engine_cs *engine;
-	const bool reset_non_ready = retry >= 1;
-	unsigned int tmp;
-	int ret;
-
-	for_each_engine_masked(engine, dev_priv, engine_mask, tmp) {
-		ret = gen8_engine_reset_prepare(engine);
-		if (ret && !reset_non_ready)
-			goto skip_reset;
-
-		/*
-		 * If this is not the first failed attempt to prepare,
-		 * we decide to proceed anyway.
-		 *
-		 * By doing so we risk context corruption and with
-		 * some gens (kbl), possible system hang if reset
-		 * happens during active bb execution.
-		 *
-		 * We rather take context corruption instead of
-		 * failed reset with a wedged driver/gpu. And
-		 * active bb execution case should be covered by
-		 * i915_stop_engines we have before the reset.
-		 */
-	}
-
-	ret = reset_engines(dev_priv, engine_mask, retry);
-
-skip_reset:
-	for_each_engine_masked(engine, dev_priv, engine_mask, tmp)
-		gen8_engine_reset_cancel(engine);
-
-	return ret;
-}
-
-typedef int (*reset_func)(struct drm_i915_private *,
-			  unsigned int engine_mask, unsigned int retry);
-
-static reset_func intel_get_gpu_reset(struct drm_i915_private *dev_priv)
-{
-	if (!i915_modparams.reset)
-		return NULL;
-
-	if (INTEL_GEN(dev_priv) >= 8)
-		return gen8_reset_engines;
-	else if (INTEL_GEN(dev_priv) >= 6)
-		return gen6_reset_engines;
-	else if (IS_GEN5(dev_priv))
-		return ironlake_do_reset;
-	else if (IS_G4X(dev_priv))
-		return g4x_do_reset;
-	else if (IS_G33(dev_priv) || IS_PINEVIEW(dev_priv))
-		return g33_do_reset;
-	else if (INTEL_GEN(dev_priv) >= 3)
-		return i915_do_reset;
-	else
-		return NULL;
-}
-
-int intel_gpu_reset(struct drm_i915_private *dev_priv,
-		    const unsigned int engine_mask)
-{
-	reset_func reset = intel_get_gpu_reset(dev_priv);
-	unsigned int retry;
-	int ret;
-
-	GEM_BUG_ON(!engine_mask);
-
-	/*
-	 * We want to perform per-engine reset from atomic context (e.g.
-	 * softirq), which imposes the constraint that we cannot sleep.
-	 * However, experience suggests that spending a bit of time waiting
-	 * for a reset helps in various cases, so for a full-device reset
-	 * we apply the opposite rule and wait if we want to. As we should
-	 * always follow up a failed per-engine reset with a full device reset,
-	 * being a little faster, stricter and more error prone for the
-	 * atomic case seems an acceptable compromise.
-	 *
-	 * Unfortunately this leads to a bimodal routine, when the goal was
-	 * to have a single reset function that worked for resetting any
-	 * number of engines simultaneously.
-	 */
-	might_sleep_if(engine_mask == ALL_ENGINES);
-
-	/*
-	 * If the power well sleeps during the reset, the reset
-	 * request may be dropped and never completes (causing -EIO).
-	 */
-	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
-	for (retry = 0; retry < 3; retry++) {
-
-		/*
-		 * We stop engines, otherwise we might get failed reset and a
-		 * dead gpu (on elk). Also as modern gpu as kbl can suffer
-		 * from system hang if batchbuffer is progressing when
-		 * the reset is issued, regardless of READY_TO_RESET ack.
-		 * Thus assume it is best to stop engines on all gens
-		 * where we have a gpu reset.
-		 *
-		 * WaKBLVECSSemaphoreWaitPoll:kbl (on ALL_ENGINES)
-		 *
-		 * WaMediaResetMainRingCleanup:ctg,elk (presumably)
-		 *
-		 * FIXME: Wa for more modern gens needs to be validated
-		 */
-		i915_stop_engines(dev_priv, engine_mask);
-
-		ret = -ENODEV;
-		if (reset) {
-			ret = reset(dev_priv, engine_mask, retry);
-			GEM_TRACE("engine_mask=%x, ret=%d, retry=%d\n",
-				  engine_mask, ret, retry);
-		}
-		if (ret != -ETIMEDOUT || engine_mask != ALL_ENGINES)
-			break;
-
-		cond_resched();
-	}
-	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
-
-	return ret;
-}
-
-bool intel_has_gpu_reset(struct drm_i915_private *dev_priv)
-{
-	return intel_get_gpu_reset(dev_priv) != NULL;
-}
-
-bool intel_has_reset_engine(struct drm_i915_private *dev_priv)
-{
-	return (dev_priv->info.has_reset_engine &&
-		i915_modparams.reset >= 2);
-}
-
-int intel_reset_guc(struct drm_i915_private *dev_priv)
-{
-	u32 guc_domain = INTEL_GEN(dev_priv) >= 11 ? GEN11_GRDOM_GUC :
-						     GEN9_GRDOM_GUC;
-	int ret;
-
-	GEM_BUG_ON(!HAS_GUC(dev_priv));
-
-	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
-	ret = gen6_hw_domain_reset(dev_priv, guc_domain);
-	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
-
-	return ret;
-}
-
 bool intel_uncore_unclaimed_mmio(struct drm_i915_private *dev_priv)
 {
 	return check_for_unclaimed_mmio(dev_priv);
@@ -2321,7 +1874,7 @@ intel_uncore_forcewake_for_read(struct drm_i915_private *dev_priv,
 	} else if (INTEL_GEN(dev_priv) >= 6) {
 		fw_domains = __gen6_reg_read_fw_domains(offset);
 	} else {
-		WARN_ON(!IS_GEN(dev_priv, 2, 5));
+		WARN_ON(!IS_GEN_RANGE(dev_priv, 2, 5));
 		fw_domains = 0;
 	}
 
@@ -2341,12 +1894,12 @@ intel_uncore_forcewake_for_write(struct drm_i915_private *dev_priv,
 		fw_domains = __gen11_fwtable_reg_write_fw_domains(offset);
 	} else if (HAS_FWTABLE(dev_priv) && !IS_VALLEYVIEW(dev_priv)) {
 		fw_domains = __fwtable_reg_write_fw_domains(offset);
-	} else if (IS_GEN8(dev_priv)) {
+	} else if (IS_GEN(dev_priv, 8)) {
 		fw_domains = __gen8_reg_write_fw_domains(offset);
-	} else if (IS_GEN(dev_priv, 6, 7)) {
+	} else if (IS_GEN_RANGE(dev_priv, 6, 7)) {
 		fw_domains = FORCEWAKE_RENDER;
 	} else {
-		WARN_ON(!IS_GEN(dev_priv, 2, 5));
+		WARN_ON(!IS_GEN_RANGE(dev_priv, 2, 5));
 		fw_domains = 0;
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_vdsc.c b/drivers/gpu/drm/i915/intel_vdsc.c
index c56ba0e04044..23abf03736e7 100644
--- a/drivers/gpu/drm/i915/intel_vdsc.c
+++ b/drivers/gpu/drm/i915/intel_vdsc.c
@@ -6,7 +6,6 @@
  *         Manasi Navare <manasi.d.navare@intel.com>
  */
 
-#include <drm/drmP.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
 #include "intel_drv.h"
@@ -1083,6 +1082,6 @@ void intel_dsc_disable(const struct intel_crtc_state *old_crtc_state)
 	I915_WRITE(dss_ctl2_reg, dss_ctl2_val);
 
 	/* Disable Power wells for VDSC/joining */
-	intel_display_power_put(dev_priv,
-				intel_dsc_power_domain(old_crtc_state));
+	intel_display_power_put_unchecked(dev_priv,
+					  intel_dsc_power_domain(old_crtc_state));
 }
diff --git a/drivers/gpu/drm/i915/intel_wopcm.c b/drivers/gpu/drm/i915/intel_wopcm.c
index 92cb82dd0c07..f82a415ea2ba 100644
--- a/drivers/gpu/drm/i915/intel_wopcm.c
+++ b/drivers/gpu/drm/i915/intel_wopcm.c
@@ -130,11 +130,11 @@ static inline int check_hw_restriction(struct drm_i915_private *i915,
 {
 	int err = 0;
 
-	if (IS_GEN9(i915))
+	if (IS_GEN(i915, 9))
 		err = gen9_check_dword_gap(guc_wopcm_base, guc_wopcm_size);
 
 	if (!err &&
-	    (IS_GEN9(i915) || IS_CNL_REVID(i915, CNL_REVID_A0, CNL_REVID_A0)))
+	    (IS_GEN(i915, 9) || IS_CNL_REVID(i915, CNL_REVID_A0, CNL_REVID_A0)))
 		err = gen9_check_huc_fw_fits(guc_wopcm_size, huc_fw_size);
 
 	return err;
@@ -163,7 +163,7 @@ int intel_wopcm_init(struct intel_wopcm *wopcm)
 	u32 guc_wopcm_rsvd;
 	int err;
 
-	if (!USES_GUC(dev_priv))
+	if (!USES_GUC(i915))
 		return 0;
 
 	GEM_BUG_ON(!wopcm->size);
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c
index 4f41e326f3f3..15f4a6dee5aa 100644
--- a/drivers/gpu/drm/i915/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/intel_workarounds.c
@@ -142,7 +142,8 @@ static void _wa_add(struct i915_wa_list *wal, const struct i915_wa *wa)
 }
 
 static void
-__wa_add(struct i915_wa_list *wal, i915_reg_t reg, u32 mask, u32 val)
+wa_write_masked_or(struct i915_wa_list *wal, i915_reg_t reg, u32 mask,
+		   u32 val)
 {
 	struct i915_wa wa = {
 		.reg = reg,
@@ -153,16 +154,32 @@ __wa_add(struct i915_wa_list *wal, i915_reg_t reg, u32 mask, u32 val)
 	_wa_add(wal, &wa);
 }
 
-#define WA_REG(addr, mask, val) __wa_add(wal, (addr), (mask), (val))
+static void
+wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
+{
+	wa_write_masked_or(wal, reg, val, _MASKED_BIT_ENABLE(val));
+}
+
+static void
+wa_write(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
+{
+	wa_write_masked_or(wal, reg, ~0, val);
+}
+
+static void
+wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
+{
+	wa_write_masked_or(wal, reg, val, val);
+}
 
 #define WA_SET_BIT_MASKED(addr, mask) \
-	WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
+	wa_write_masked_or(wal, (addr), (mask), _MASKED_BIT_ENABLE(mask))
 
 #define WA_CLR_BIT_MASKED(addr, mask) \
-	WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
+	wa_write_masked_or(wal, (addr), (mask), _MASKED_BIT_DISABLE(mask))
 
 #define WA_SET_FIELD_MASKED(addr, mask, value) \
-	WA_REG(addr, (mask), _MASKED_FIELD(mask, value))
+	wa_write_masked_or(wal, (addr), (mask), _MASKED_FIELD((mask), (value)))
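All of these helpers funnel into wa_write_masked_or(). On the masked registers the WA_*_MASKED macros target, the upper 16 bits act as a per-bit write enable, which is what the _MASKED_* encodings supply:

/*
 * _MASKED_FIELD(mask, value) == (mask << 16) | value
 * _MASKED_BIT_ENABLE(bit)    == (bit << 16) | bit
 * _MASKED_BIT_DISABLE(bit)   == (bit << 16)
 *
 * so wa_masked_en(wal, reg, bit) records mask = bit and
 * val = (bit << 16) | bit for later re-application and verification.
 */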
 
 static void gen8_ctx_workarounds_init(struct intel_engine_cs *engine)
 {
@@ -366,7 +383,7 @@ static void skl_tune_iz_hashing(struct intel_engine_cs *engine)
 		 * Only consider slices where one, and only one, subslice has 7
 		 * EUs
 		 */
-		if (!is_power_of_2(INTEL_INFO(i915)->sseu.subslice_7eu[i]))
+		if (!is_power_of_2(RUNTIME_INFO(i915)->sseu.subslice_7eu[i]))
 			continue;
 
 		/*
@@ -375,7 +392,7 @@ static void skl_tune_iz_hashing(struct intel_engine_cs *engine)
 		 *
 		 * ->    0 <= ss <= 3;
 		 */
-		ss = ffs(INTEL_INFO(i915)->sseu.subslice_7eu[i]) - 1;
+		ss = ffs(RUNTIME_INFO(i915)->sseu.subslice_7eu[i]) - 1;
 		vals[i] = 3 - ss;
 	}
 
@@ -532,6 +549,12 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine)
 	if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
 		WA_SET_BIT_MASKED(GEN11_COMMON_SLICE_CHICKEN3,
 				  GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC);
+
+	/* WaEnableFloatBlendOptimization:icl */
+	wa_write_masked_or(wal,
+			   GEN10_CACHE_MODE_SS,
+			   0, /* write-only, so skip validation */
+			   _MASKED_BIT_ENABLE(FLOAT_BLEND_OPTIMIZATION_ENABLE));
 }
 
 void intel_engine_init_ctx_wa(struct intel_engine_cs *engine)
@@ -603,46 +626,8 @@ int intel_engine_emit_ctx_wa(struct i915_request *rq)
 }
 
 static void
-wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
+gen9_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
 {
-	struct i915_wa wa = {
-		.reg = reg,
-		.mask = val,
-		.val = _MASKED_BIT_ENABLE(val)
-	};
-
-	_wa_add(wal, &wa);
-}
-
-static void
-wa_write_masked_or(struct i915_wa_list *wal, i915_reg_t reg, u32 mask,
-		   u32 val)
-{
-	struct i915_wa wa = {
-		.reg = reg,
-		.mask = mask,
-		.val = val
-	};
-
-	_wa_add(wal, &wa);
-}
-
-static void
-wa_write(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
-{
-	wa_write_masked_or(wal, reg, ~0, val);
-}
-
-static void
-wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
-{
-	wa_write_masked_or(wal, reg, val, val);
-}
-
-static void gen9_gt_workarounds_init(struct drm_i915_private *i915)
-{
-	struct i915_wa_list *wal = &i915->gt_wa_list;
-
 	/* WaDisableKillLogic:bxt,skl,kbl */
 	if (!IS_COFFEELAKE(i915))
 		wa_write_or(wal,
@@ -666,11 +651,10 @@ static void gen9_gt_workarounds_init(struct drm_i915_private *i915)
 		    BDW_DISABLE_HDC_INVALIDATION);
 }
 
-static void skl_gt_workarounds_init(struct drm_i915_private *i915)
+static void
+skl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
 {
-	struct i915_wa_list *wal = &i915->gt_wa_list;
-
-	gen9_gt_workarounds_init(i915);
+	gen9_gt_workarounds_init(i915, wal);
 
 	/* WaDisableGafsUnitClkGating:skl */
 	wa_write_or(wal,
@@ -684,11 +668,10 @@ static void skl_gt_workarounds_init(struct drm_i915_private *i915)
 			    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
 }
 
-static void bxt_gt_workarounds_init(struct drm_i915_private *i915)
+static void
+bxt_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
 {
-	struct i915_wa_list *wal = &i915->gt_wa_list;
-
-	gen9_gt_workarounds_init(i915);
+	gen9_gt_workarounds_init(i915, wal);
 
 	/* WaInPlaceDecompressionHang:bxt */
 	wa_write_or(wal,
@@ -696,11 +679,10 @@ static void bxt_gt_workarounds_init(struct drm_i915_private *i915)
 		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
 }
 
-static void kbl_gt_workarounds_init(struct drm_i915_private *i915)
+static void
+kbl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
 {
-	struct i915_wa_list *wal = &i915->gt_wa_list;
-
-	gen9_gt_workarounds_init(i915);
+	gen9_gt_workarounds_init(i915, wal);
 
 	/* WaDisableDynamicCreditSharing:kbl */
 	if (IS_KBL_REVID(i915, 0, KBL_REVID_B0))
@@ -719,16 +701,16 @@ static void kbl_gt_workarounds_init(struct drm_i915_private *i915)
 		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
 }
 
-static void glk_gt_workarounds_init(struct drm_i915_private *i915)
+static void
+glk_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
 {
-	gen9_gt_workarounds_init(i915);
+	gen9_gt_workarounds_init(i915, wal);
 }
 
-static void cfl_gt_workarounds_init(struct drm_i915_private *i915)
+static void
+cfl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
 {
-	struct i915_wa_list *wal = &i915->gt_wa_list;
-
-	gen9_gt_workarounds_init(i915);
+	gen9_gt_workarounds_init(i915, wal);
 
 	/* WaDisableGafsUnitClkGating:cfl */
 	wa_write_or(wal,
@@ -741,10 +723,10 @@ static void cfl_gt_workarounds_init(struct drm_i915_private *i915)
 		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
 }
 
-static void wa_init_mcr(struct drm_i915_private *dev_priv)
+static void
+wa_init_mcr(struct drm_i915_private *dev_priv, struct i915_wa_list *wal)
 {
-	const struct sseu_dev_info *sseu = &(INTEL_INFO(dev_priv)->sseu);
-	struct i915_wa_list *wal = &dev_priv->gt_wa_list;
+	const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
 	u32 mcr_slice_subslice_mask;
 
 	/*
@@ -804,11 +786,10 @@ static void wa_init_mcr(struct drm_i915_private *dev_priv)
 			   intel_calculate_mcr_s_ss_select(dev_priv));
 }
 
-static void cnl_gt_workarounds_init(struct drm_i915_private *i915)
+static void
+cnl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
 {
-	struct i915_wa_list *wal = &i915->gt_wa_list;
-
-	wa_init_mcr(i915);
+	wa_init_mcr(i915, wal);
 
 	/* WaDisableI2mCycleOnWRPort:cnl (pre-prod) */
 	if (IS_CNL_REVID(i915, CNL_REVID_B0, CNL_REVID_B0))
@@ -822,11 +803,10 @@ static void cnl_gt_workarounds_init(struct drm_i915_private *i915)
 		    GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
 }
 
-static void icl_gt_workarounds_init(struct drm_i915_private *i915)
+static void
+icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
 {
-	struct i915_wa_list *wal = &i915->gt_wa_list;
-
-	wa_init_mcr(i915);
+	wa_init_mcr(i915, wal);
 
 	/* WaInPlaceDecompressionHang:icl */
 	wa_write_or(wal,
@@ -879,12 +859,9 @@ static void icl_gt_workarounds_init(struct drm_i915_private *i915)
 		    GAMT_CHKN_DISABLE_L3_COH_PIPE);
 }
 
-void intel_gt_init_workarounds(struct drm_i915_private *i915)
+static void
+gt_init_workarounds(struct drm_i915_private *i915, struct i915_wa_list *wal)
 {
-	struct i915_wa_list *wal = &i915->gt_wa_list;
-
-	wa_init_start(wal, "GT");
-
 	if (INTEL_GEN(i915) < 8)
 		return;
 	else if (IS_BROADWELL(i915))
@@ -892,22 +869,29 @@ void intel_gt_init_workarounds(struct drm_i915_private *i915)
 	else if (IS_CHERRYVIEW(i915))
 		return;
 	else if (IS_SKYLAKE(i915))
-		skl_gt_workarounds_init(i915);
+		skl_gt_workarounds_init(i915, wal);
 	else if (IS_BROXTON(i915))
-		bxt_gt_workarounds_init(i915);
+		bxt_gt_workarounds_init(i915, wal);
 	else if (IS_KABYLAKE(i915))
-		kbl_gt_workarounds_init(i915);
+		kbl_gt_workarounds_init(i915, wal);
 	else if (IS_GEMINILAKE(i915))
-		glk_gt_workarounds_init(i915);
+		glk_gt_workarounds_init(i915, wal);
 	else if (IS_COFFEELAKE(i915))
-		cfl_gt_workarounds_init(i915);
+		cfl_gt_workarounds_init(i915, wal);
 	else if (IS_CANNONLAKE(i915))
-		cnl_gt_workarounds_init(i915);
+		cnl_gt_workarounds_init(i915, wal);
 	else if (IS_ICELAKE(i915))
-		icl_gt_workarounds_init(i915);
+		icl_gt_workarounds_init(i915, wal);
 	else
 		MISSING_CASE(INTEL_GEN(i915));
+}
 
+void intel_gt_init_workarounds(struct drm_i915_private *i915)
+{
+	struct i915_wa_list *wal = &i915->gt_wa_list;
+
+	wa_init_start(wal, "GT");
+	gt_init_workarounds(i915, wal);
 	wa_init_finish(wal);
 }
 
@@ -955,8 +939,6 @@ wa_list_apply(struct drm_i915_private *dev_priv, const struct i915_wa_list *wal)
 
 	intel_uncore_forcewake_put__locked(dev_priv, fw);
 	spin_unlock_irqrestore(&dev_priv->uncore.lock, flags);
-
-	DRM_DEBUG_DRIVER("Applied %u %s workarounds\n", wal->count, wal->name);
 }
 
 void intel_gt_apply_workarounds(struct drm_i915_private *dev_priv)
@@ -1126,14 +1108,12 @@ void intel_engine_apply_whitelist(struct intel_engine_cs *engine)
 	for (; i < RING_MAX_NONPRIV_SLOTS; i++)
 		I915_WRITE(RING_FORCE_TO_NONPRIV(base, i),
 			   i915_mmio_reg_offset(RING_NOPID(base)));
-
-	DRM_DEBUG_DRIVER("Applied %u %s workarounds\n", wal->count, wal->name);
 }
 
-static void rcs_engine_wa_init(struct intel_engine_cs *engine)
+static void
+rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 {
 	struct drm_i915_private *i915 = engine->i915;
-	struct i915_wa_list *wal = &engine->wa_list;
 
 	if (IS_ICELAKE(i915)) {
 		/* This is not a Wa. Enable for better image quality */
@@ -1190,7 +1170,7 @@ static void rcs_engine_wa_init(struct intel_engine_cs *engine)
 				    GEN7_DISABLE_SAMPLER_PREFETCH);
 	}
 
-	if (IS_GEN9(i915) || IS_CANNONLAKE(i915)) {
+	if (IS_GEN(i915, 9) || IS_CANNONLAKE(i915)) {
 		/* WaEnablePreemptionGranularityControlByUMD:skl,bxt,kbl,cfl,cnl */
 		wa_masked_en(wal,
 			     GEN7_FF_SLICE_CS_CHICKEN1,
@@ -1211,7 +1191,7 @@ static void rcs_engine_wa_init(struct intel_engine_cs *engine)
 			     GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE);
 	}
 
-	if (IS_GEN9(i915)) {
+	if (IS_GEN(i915, 9)) {
 		/* WaContextSwitchWithConcurrentTLBInvalidate:skl,bxt,kbl,glk,cfl */
 		wa_masked_en(wal,
 			     GEN9_CSFE_CHICKEN1_RCS,
@@ -1237,10 +1217,10 @@ static void rcs_engine_wa_init(struct intel_engine_cs *engine)
 	}
 }
 
-static void xcs_engine_wa_init(struct intel_engine_cs *engine)
+static void
+xcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
 {
 	struct drm_i915_private *i915 = engine->i915;
-	struct i915_wa_list *wal = &engine->wa_list;
 
 	/* WaKBLVECSSemaphoreWaitPoll:kbl */
 	if (IS_KBL_REVID(i915, KBL_REVID_A0, KBL_REVID_E0)) {
@@ -1250,6 +1230,18 @@ static void xcs_engine_wa_init(struct intel_engine_cs *engine)
 	}
 }
 
+static void
+engine_init_workarounds(struct intel_engine_cs *engine, struct i915_wa_list *wal)
+{
+	if (I915_SELFTEST_ONLY(INTEL_GEN(engine->i915) < 8))
+		return;
+
+	if (engine->id == RCS)
+		rcs_engine_wa_init(engine, wal);
+	else
+		xcs_engine_wa_init(engine, wal);
+}
+
 void intel_engine_init_workarounds(struct intel_engine_cs *engine)
 {
 	struct i915_wa_list *wal = &engine->wa_list;
@@ -1258,12 +1250,7 @@ void intel_engine_init_workarounds(struct intel_engine_cs *engine)
 		return;
 
 	wa_init_start(wal, engine->name);
-
-	if (engine->id == RCS)
-		rcs_engine_wa_init(engine);
-	else
-		xcs_engine_wa_init(engine);
-
+	engine_init_workarounds(engine, wal);
 	wa_init_finish(wal);
 }
 
@@ -1273,11 +1260,5 @@ void intel_engine_apply_workarounds(struct intel_engine_cs *engine)
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-static bool intel_engine_verify_workarounds(struct intel_engine_cs *engine,
-					    const char *from)
-{
-	return wa_list_verify(engine->i915, &engine->wa_list, from);
-}
-
 #include "selftests/intel_workarounds.c"
 #endif
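
The refactor above separates list construction from its bookkeeping: the per-platform builders now take the i915_wa_list to fill as an explicit parameter, and intel_gt_init_workarounds() becomes a thin wrapper that brackets the dispatch with wa_init_start()/wa_init_finish(). The same builder shape can then feed either the device-global GT list or, later, a per-engine list. A minimal userspace model of the idea (the structure layout and register values here are invented for illustration, not the kernel's):

#include <stdio.h>
#include <stdlib.h>

struct wa { unsigned int reg, mask, val; };

struct wa_list {
	const char *name;
	struct wa *list;
	unsigned int count;
};

static void wa_init_start(struct wa_list *wal, const char *name)
{
	wal->name = name;
	wal->list = NULL;
	wal->count = 0;
}

static void wa_add(struct wa_list *wal,
		   unsigned int reg, unsigned int mask, unsigned int val)
{
	struct wa *grown;

	grown = realloc(wal->list, (wal->count + 1) * sizeof(*grown));
	if (!grown)
		return;	/* allocation-failure handling elided in this sketch */

	wal->list = grown;
	wal->list[wal->count++] = (struct wa){ reg, mask, val };
}

static void wa_init_finish(struct wa_list *wal)
{
	printf("Initialized %u %s workarounds\n", wal->count, wal->name);
}

/* A builder only fills the list it is handed (register value made up)... */
static void icl_gt_workarounds_init(struct wa_list *wal)
{
	wa_add(wal, 0x7004, 0xffff, 0x0001);
}

/* ...so one wrapper owns the start/finish bookkeeping for any target list. */
int main(void)
{
	struct wa_list gt_wa_list;

	wa_init_start(&gt_wa_list, "GT");
	icl_gt_workarounds_init(&gt_wa_list);
	wa_init_finish(&gt_wa_list);
	free(gt_wa_list.list);
	return 0;
}
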
diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c
index 26c065c8d2c0..a9a2fa35876f 100644
--- a/drivers/gpu/drm/i915/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
@@ -972,7 +972,6 @@ static int gpu_write(struct i915_vma *vma,
 {
 	struct i915_request *rq;
 	struct i915_vma *batch;
-	int flags = 0;
 	int err;
 
 	GEM_BUG_ON(!intel_engine_can_store_dword(engine));
@@ -981,14 +980,14 @@ static int gpu_write(struct i915_vma *vma,
 	if (err)
 		return err;
 
-	rq = i915_request_alloc(engine, ctx);
-	if (IS_ERR(rq))
-		return PTR_ERR(rq);
-
 	batch = gpu_write_dw(vma, dword * sizeof(u32), value);
-	if (IS_ERR(batch)) {
-		err = PTR_ERR(batch);
-		goto err_request;
+	if (IS_ERR(batch))
+		return PTR_ERR(batch);
+
+	rq = i915_request_alloc(engine, ctx);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto err_batch;
 	}
 
 	err = i915_vma_move_to_active(batch, rq, 0);
@@ -996,21 +995,21 @@ static int gpu_write(struct i915_vma *vma,
 		goto err_request;
 
 	i915_gem_object_set_active_reference(batch->obj);
-	i915_vma_unpin(batch);
-	i915_vma_close(batch);
 
-	err = engine->emit_bb_start(rq,
-				    batch->node.start, batch->node.size,
-				    flags);
+	err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
 	if (err)
 		goto err_request;
 
-	err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
+	err = engine->emit_bb_start(rq,
+				    batch->node.start, batch->node.size,
+				    0);
+err_request:
 	if (err)
 		i915_request_skip(rq, err);
-
-err_request:
 	i915_request_add(rq);
+err_batch:
+	i915_vma_unpin(batch);
+	i915_vma_close(batch);
 
 	return err;
 }
@@ -1450,7 +1449,7 @@ static int igt_ppgtt_pin_update(void *arg)
 	 * huge-gtt-pages.
 	 */
 
-	if (!HAS_FULL_48BIT_PPGTT(dev_priv)) {
+	if (!ppgtt || !i915_vm_is_48bit(&ppgtt->vm)) {
 		pr_info("48b PPGTT not supported, skipping\n");
 		return 0;
 	}
@@ -1703,7 +1702,6 @@ int i915_gem_huge_page_mock_selftests(void)
 	};
 	struct drm_i915_private *dev_priv;
 	struct i915_hw_ppgtt *ppgtt;
-	struct pci_dev *pdev;
 	int err;
 
 	dev_priv = mock_gem_device();
@@ -1713,9 +1711,6 @@ int i915_gem_huge_page_mock_selftests(void)
 	/* Pretend to be a device which supports the 48b PPGTT */
 	mkwrite_device_info(dev_priv)->ppgtt = INTEL_PPGTT_FULL_4LVL;
 
-	pdev = dev_priv->drm.pdev;
-	dma_coerce_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(39));
-
 	mutex_lock(&dev_priv->drm.struct_mutex);
 	ppgtt = i915_ppgtt_create(dev_priv, ERR_PTR(-ENODEV));
 	if (IS_ERR(ppgtt)) {
@@ -1761,6 +1756,7 @@ int i915_gem_huge_page_live_selftests(struct drm_i915_private *dev_priv)
 	};
 	struct drm_file *file;
 	struct i915_gem_context *ctx;
+	intel_wakeref_t wakeref;
 	int err;
 
 	if (!HAS_PPGTT(dev_priv)) {
@@ -1776,7 +1772,7 @@ int i915_gem_huge_page_live_selftests(struct drm_i915_private *dev_priv)
 		return PTR_ERR(file);
 
 	mutex_lock(&dev_priv->drm.struct_mutex);
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	ctx = live_context(dev_priv, file);
 	if (IS_ERR(ctx)) {
@@ -1790,7 +1786,7 @@ int i915_gem_huge_page_live_selftests(struct drm_i915_private *dev_priv)
 	err = i915_subtests(tests, ctx);
 
 out_unlock:
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
 	mock_file_free(dev_priv, file);
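
Throughout the selftests below, intel_runtime_pm_get() now returns an intel_wakeref_t cookie that must be handed back to intel_runtime_pm_put(), so every runtime-pm acquisition can be traced to its matching release. A toy sketch of cookie-paired get/put, assuming a plain counter-and-tag scheme (the kernel's wakeref tracking is more elaborate than this):

#include <assert.h>
#include <stdio.h>

typedef unsigned long wakeref_t;

static unsigned long next_cookie = 1;
static int live_wakerefs;

static wakeref_t runtime_pm_get(void)
{
	live_wakerefs++;
	return next_cookie++;	/* unique cookie names this acquisition */
}

static void runtime_pm_put(wakeref_t wref)
{
	assert(wref != 0);	/* catches a put without a matching get */
	live_wakerefs--;
}

int main(void)
{
	wakeref_t wref;

	wref = runtime_pm_get();
	/* ... touch hardware while the device is held awake ... */
	runtime_pm_put(wref);

	printf("live wakerefs after put: %d\n", live_wakerefs);
	return 0;
}

Because each caller must keep its cookie, an unbalanced or doubled put is attributable to a specific acquisition rather than an anonymous reference count.
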
diff --git a/drivers/gpu/drm/i915/selftests/i915_active.c b/drivers/gpu/drm/i915/selftests/i915_active.c
new file mode 100644
index 000000000000..337b1f98b923
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/i915_active.c
@@ -0,0 +1,157 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2018 Intel Corporation
+ */
+
+#include "../i915_selftest.h"
+
+#include "igt_flush_test.h"
+#include "lib_sw_fence.h"
+
+struct live_active {
+	struct i915_active base;
+	bool retired;
+};
+
+static void __live_active_retire(struct i915_active *base)
+{
+	struct live_active *active = container_of(base, typeof(*active), base);
+
+	active->retired = true;
+}
+
+static int __live_active_setup(struct drm_i915_private *i915,
+			       struct live_active *active)
+{
+	struct intel_engine_cs *engine;
+	struct i915_sw_fence *submit;
+	enum intel_engine_id id;
+	unsigned int count = 0;
+	int err = 0;
+
+	submit = heap_fence_create(GFP_KERNEL);
+	if (!submit)
+		return -ENOMEM;
+
+	i915_active_init(i915, &active->base, __live_active_retire);
+	active->retired = false;
+
+	if (!i915_active_acquire(&active->base)) {
+		pr_err("First i915_active_acquire should report being idle\n");
+		err = -EINVAL;
+		goto out;
+	}
+
+	for_each_engine(engine, i915, id) {
+		struct i915_request *rq;
+
+		rq = i915_request_alloc(engine, i915->kernel_context);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			break;
+		}
+
+		err = i915_sw_fence_await_sw_fence_gfp(&rq->submit,
+						       submit,
+						       GFP_KERNEL);
+		if (err >= 0)
+			err = i915_active_ref(&active->base,
+					      rq->fence.context, rq);
+		i915_request_add(rq);
+		if (err) {
+			pr_err("Failed to track active ref!\n");
+			break;
+		}
+
+		count++;
+	}
+
+	i915_active_release(&active->base);
+	if (active->retired && count) {
+		pr_err("i915_active retired before submission!\n");
+		err = -EINVAL;
+	}
+	if (active->base.count != count) {
+		pr_err("i915_active not tracking all requests, found %d, expected %d\n",
+		       active->base.count, count);
+		err = -EINVAL;
+	}
+
+out:
+	i915_sw_fence_commit(submit);
+	heap_fence_put(submit);
+
+	return err;
+}
+
+static int live_active_wait(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct live_active active;
+	intel_wakeref_t wakeref;
+	int err;
+
+	/* Check that we get a callback when requests retire upon waiting */
+
+	mutex_lock(&i915->drm.struct_mutex);
+	wakeref = intel_runtime_pm_get(i915);
+
+	err = __live_active_setup(i915, &active);
+
+	i915_active_wait(&active.base);
+	if (!active.retired) {
+		pr_err("i915_active not retired after waiting!\n");
+		err = -EINVAL;
+	}
+
+	i915_active_fini(&active.base);
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+
+	intel_runtime_pm_put(i915, wakeref);
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+}
+
+static int live_active_retire(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct live_active active;
+	intel_wakeref_t wakeref;
+	int err;
+
+	/* Check that we get a callback when requests are indirectly retired */
+
+	mutex_lock(&i915->drm.struct_mutex);
+	wakeref = intel_runtime_pm_get(i915);
+
+	err = __live_active_setup(i915, &active);
+
+	/* waits for & retires all requests */
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+
+	if (!active.retired) {
+		pr_err("i915_active not retired after flushing!\n");
+		err = -EINVAL;
+	}
+
+	i915_active_fini(&active.base);
+	intel_runtime_pm_put(i915, wakeref);
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+}
+
+int i915_active_live_selftests(struct drm_i915_private *i915)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(live_active_wait),
+		SUBTEST(live_active_retire),
+	};
+
+	if (i915_terminally_wedged(&i915->gpu_error))
+		return 0;
+
+	return i915_subtests(tests, i915);
+}
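
The new selftest above drives the i915_active lifecycle: init with a retire callback, acquire to keep the struct alive while requests are queued, one reference per tracked request, then release; the retire callback must fire only after the last reference is dropped. Stripped of the GPU, that contract is callback-on-last-release refcounting, sketched here as standalone C (a model of the contract, not the i915 implementation):

#include <stdbool.h>
#include <stdio.h>

struct active {
	int count;
	bool retired;
	void (*retire)(struct active *);
};

static void active_init(struct active *ref, void (*retire)(struct active *))
{
	ref->count = 0;
	ref->retired = false;
	ref->retire = retire;
}

/* Returns true if the struct was idle, mirroring i915_active_acquire. */
static bool active_acquire(struct active *ref)
{
	return ref->count++ == 0;
}

static void active_release(struct active *ref)
{
	if (--ref->count == 0)
		ref->retire(ref);	/* fires once, on the last release */
}

static void retire_cb(struct active *ref)
{
	ref->retired = true;
}

int main(void)
{
	struct active ref;

	active_init(&ref, retire_cb);
	active_acquire(&ref);	/* keepalive while queuing */
	active_acquire(&ref);	/* a tracked "request" */
	active_release(&ref);	/* the request completes */
	active_release(&ref);	/* drop the keepalive: retires now */
	printf("retired: %s\n", ref.retired ? "yes" : "no");
	return 0;
}
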
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c
index d0aa19d17653..e77b7ed449ae 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem.c
@@ -16,9 +16,10 @@ static int switch_to_context(struct drm_i915_private *i915,
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
 	int err = 0;
 
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
 	for_each_engine(engine, i915, id) {
 		struct i915_request *rq;
@@ -32,7 +33,7 @@ static int switch_to_context(struct drm_i915_private *i915,
 		i915_request_add(rq);
 	}
 
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 
 	return err;
 }
@@ -65,7 +66,9 @@ static void trash_stolen(struct drm_i915_private *i915)
 
 static void simulate_hibernate(struct drm_i915_private *i915)
 {
-	intel_runtime_pm_get(i915);
+	intel_wakeref_t wakeref;
+
+	wakeref = intel_runtime_pm_get(i915);
 
 	/*
 	 * As a final sting in the tail, invalidate stolen. Under a real S4,
@@ -76,7 +79,7 @@ static void simulate_hibernate(struct drm_i915_private *i915)
 	 */
 	trash_stolen(i915);
 
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 }
 
 static int pm_prepare(struct drm_i915_private *i915)
@@ -93,39 +96,39 @@ static int pm_prepare(struct drm_i915_private *i915)
 
 static void pm_suspend(struct drm_i915_private *i915)
 {
-	intel_runtime_pm_get(i915);
-
-	i915_gem_suspend_gtt_mappings(i915);
-	i915_gem_suspend_late(i915);
+	intel_wakeref_t wakeref;
 
-	intel_runtime_pm_put(i915);
+	with_intel_runtime_pm(i915, wakeref) {
+		i915_gem_suspend_gtt_mappings(i915);
+		i915_gem_suspend_late(i915);
+	}
 }
 
 static void pm_hibernate(struct drm_i915_private *i915)
 {
-	intel_runtime_pm_get(i915);
+	intel_wakeref_t wakeref;
 
-	i915_gem_suspend_gtt_mappings(i915);
+	with_intel_runtime_pm(i915, wakeref) {
+		i915_gem_suspend_gtt_mappings(i915);
 
-	i915_gem_freeze(i915);
-	i915_gem_freeze_late(i915);
-
-	intel_runtime_pm_put(i915);
+		i915_gem_freeze(i915);
+		i915_gem_freeze_late(i915);
+	}
 }
 
 static void pm_resume(struct drm_i915_private *i915)
 {
+	intel_wakeref_t wakeref;
+
 	/*
 	 * Both suspend and hibernate follow the same wakeup path and assume
 	 * that runtime-pm just works.
 	 */
-	intel_runtime_pm_get(i915);
-
-	intel_engines_sanitize(i915);
-	i915_gem_sanitize(i915);
-	i915_gem_resume(i915);
-
-	intel_runtime_pm_put(i915);
+	with_intel_runtime_pm(i915, wakeref) {
+		intel_engines_sanitize(i915, false);
+		i915_gem_sanitize(i915);
+		i915_gem_resume(i915);
+	}
 }
 
 static int igt_gem_suspend(void *arg)
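
pm_suspend(), pm_hibernate() and pm_resume() above switch to the scoped with_intel_runtime_pm() helper, which pairs the get and the put around a block. A plausible shape for such a guard macro is a single-iteration for loop; the stand-in get/put functions below are assumptions for illustration, not a copy of the i915 definition:

#include <stdio.h>

typedef unsigned long wakeref_t;

static wakeref_t pm_get(void)      { puts("get"); return 1; }
static void pm_put(wakeref_t wref) { (void)wref; puts("put"); }

/* One-iteration loop: the body runs with the wakeref held, put on exit. */
#define with_runtime_pm(wref) \
	for ((wref) = pm_get(); (wref); pm_put(wref), (wref) = 0)

int main(void)
{
	wakeref_t wref;

	with_runtime_pm(wref)
		puts("body runs with the device held awake");

	return 0;
}

The scoped form makes it impossible to forget the put on the normal exit path, at the cost that a bare break inside the body would skip it.
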
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
index f7392c1ffe75..fd89a5a33c1a 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
@@ -279,6 +279,7 @@ static int igt_gem_coherency(void *arg)
 	struct drm_i915_private *i915 = arg;
 	const struct igt_coherency_mode *read, *write, *over;
 	struct drm_i915_gem_object *obj;
+	intel_wakeref_t wakeref;
 	unsigned long count, n;
 	u32 *offsets, *values;
 	int err = 0;
@@ -298,7 +299,7 @@ static int igt_gem_coherency(void *arg)
 	values = offsets + ncachelines;
 
 	mutex_lock(&i915->drm.struct_mutex);
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 	for (over = igt_coherency_mode; over->name; over++) {
 		if (!over->set)
 			continue;
@@ -376,7 +377,7 @@ static int igt_gem_coherency(void *arg)
 		}
 	}
 unlock:
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
 	kfree(offsets);
 	return err;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 7d82043aff10..d00d0bb07784 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -24,9 +24,13 @@
 
 #include <linux/prime_numbers.h>
 
+#include "../i915_reset.h"
 #include "../i915_selftest.h"
 #include "i915_random.h"
 #include "igt_flush_test.h"
+#include "igt_live_test.h"
+#include "igt_reset.h"
+#include "igt_spinner.h"
 
 #include "mock_drm.h"
 #include "mock_gem_device.h"
@@ -34,84 +38,6 @@
 
 #define DW_PER_PAGE (PAGE_SIZE / sizeof(u32))
 
-struct live_test {
-	struct drm_i915_private *i915;
-	const char *func;
-	const char *name;
-
-	unsigned int reset_global;
-	unsigned int reset_engine[I915_NUM_ENGINES];
-};
-
-static int begin_live_test(struct live_test *t,
-			   struct drm_i915_private *i915,
-			   const char *func,
-			   const char *name)
-{
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-	int err;
-
-	t->i915 = i915;
-	t->func = func;
-	t->name = name;
-
-	err = i915_gem_wait_for_idle(i915,
-				     I915_WAIT_LOCKED,
-				     MAX_SCHEDULE_TIMEOUT);
-	if (err) {
-		pr_err("%s(%s): failed to idle before, with err=%d!",
-		       func, name, err);
-		return err;
-	}
-
-	i915->gpu_error.missed_irq_rings = 0;
-	t->reset_global = i915_reset_count(&i915->gpu_error);
-
-	for_each_engine(engine, i915, id)
-		t->reset_engine[id] =
-			i915_reset_engine_count(&i915->gpu_error, engine);
-
-	return 0;
-}
-
-static int end_live_test(struct live_test *t)
-{
-	struct drm_i915_private *i915 = t->i915;
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-
-	if (igt_flush_test(i915, I915_WAIT_LOCKED))
-		return -EIO;
-
-	if (t->reset_global != i915_reset_count(&i915->gpu_error)) {
-		pr_err("%s(%s): GPU was reset %d times!\n",
-		       t->func, t->name,
-		       i915_reset_count(&i915->gpu_error) - t->reset_global);
-		return -EIO;
-	}
-
-	for_each_engine(engine, i915, id) {
-		if (t->reset_engine[id] ==
-		    i915_reset_engine_count(&i915->gpu_error, engine))
-			continue;
-
-		pr_err("%s(%s): engine '%s' was reset %d times!\n",
-		       t->func, t->name, engine->name,
-		       i915_reset_engine_count(&i915->gpu_error, engine) -
-		       t->reset_engine[id]);
-		return -EIO;
-	}
-
-	if (i915->gpu_error.missed_irq_rings) {
-		pr_err("%s(%s): Missed interrupts on engines %lx\n",
-		       t->func, t->name, i915->gpu_error.missed_irq_rings);
-		return -EIO;
-	}
-
-	return 0;
-}
-
 static int live_nop_switch(void *arg)
 {
 	const unsigned int nctx = 1024;
@@ -119,8 +45,9 @@ static int live_nop_switch(void *arg)
 	struct intel_engine_cs *engine;
 	struct i915_gem_context **ctx;
 	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
+	struct igt_live_test t;
 	struct drm_file *file;
-	struct live_test t;
 	unsigned long n;
 	int err = -ENODEV;
 
@@ -140,7 +67,7 @@ static int live_nop_switch(void *arg)
 		return PTR_ERR(file);
 
 	mutex_lock(&i915->drm.struct_mutex);
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
 	ctx = kcalloc(nctx, sizeof(*ctx), GFP_KERNEL);
 	if (!ctx) {
@@ -184,7 +111,7 @@ static int live_nop_switch(void *arg)
 		pr_info("Populated %d contexts on %s in %lluns\n",
 			nctx, engine->name, ktime_to_ns(times[1] - times[0]));
 
-		err = begin_live_test(&t, i915, __func__, engine->name);
+		err = igt_live_test_begin(&t, i915, __func__, engine->name);
 		if (err)
 			goto out_unlock;
 
@@ -232,7 +159,7 @@ static int live_nop_switch(void *arg)
 				break;
 		}
 
-		err = end_live_test(&t);
+		err = igt_live_test_end(&t);
 		if (err)
 			goto out_unlock;
 
@@ -243,7 +170,7 @@ static int live_nop_switch(void *arg)
 	}
 
 out_unlock:
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
 	mock_file_free(i915, file);
 	return err;
@@ -553,10 +480,10 @@ static int igt_ctx_exec(void *arg)
 	struct drm_i915_private *i915 = arg;
 	struct drm_i915_gem_object *obj = NULL;
 	unsigned long ncontexts, ndwords, dw;
+	struct igt_live_test t;
 	struct drm_file *file;
 	IGT_TIMEOUT(end_time);
 	LIST_HEAD(objects);
-	struct live_test t;
 	int err = -ENODEV;
 
 	/*
@@ -574,7 +501,7 @@ static int igt_ctx_exec(void *arg)
 
 	mutex_lock(&i915->drm.struct_mutex);
 
-	err = begin_live_test(&t, i915, __func__, "");
+	err = igt_live_test_begin(&t, i915, __func__, "");
 	if (err)
 		goto out_unlock;
 
@@ -593,6 +520,8 @@ static int igt_ctx_exec(void *arg)
 		}
 
 		for_each_engine(engine, i915, id) {
+			intel_wakeref_t wakeref;
+
 			if (!engine->context_size)
 				continue; /* No logical context support in HW */
 
@@ -607,9 +536,9 @@ static int igt_ctx_exec(void *arg)
 				}
 			}
 
-			intel_runtime_pm_get(i915);
-			err = gpu_fill(obj, ctx, engine, dw);
-			intel_runtime_pm_put(i915);
+			err = 0;
+			with_intel_runtime_pm(i915, wakeref)
+				err = gpu_fill(obj, ctx, engine, dw);
 			if (err) {
 				pr_err("Failed to fill dword %lu [%lu/%lu] with gpu (%s) in ctx %u [full-ppgtt? %s], err=%d\n",
 				       ndwords, dw, max_dwords(obj),
@@ -627,7 +556,7 @@ static int igt_ctx_exec(void *arg)
 		ncontexts++;
 	}
 	pr_info("Submitted %lu contexts (across %u engines), filling %lu dwords\n",
-		ncontexts, INTEL_INFO(i915)->num_rings, ndwords);
+		ncontexts, RUNTIME_INFO(i915)->num_rings, ndwords);
 
 	dw = 0;
 	list_for_each_entry(obj, &objects, st_link) {
@@ -642,7 +571,7 @@ static int igt_ctx_exec(void *arg)
 	}
 
 out_unlock:
-	if (end_live_test(&t))
+	if (igt_live_test_end(&t))
 		err = -EIO;
 	mutex_unlock(&i915->drm.struct_mutex);
 
@@ -650,6 +579,469 @@ out_unlock:
 	return err;
 }
 
+static struct i915_vma *rpcs_query_batch(struct i915_vma *vma)
+{
+	struct drm_i915_gem_object *obj;
+	u32 *cmd;
+	int err;
+
+	if (INTEL_GEN(vma->vm->i915) < 8)
+		return ERR_PTR(-EINVAL);
+
+	obj = i915_gem_object_create_internal(vma->vm->i915, PAGE_SIZE);
+	if (IS_ERR(obj))
+		return ERR_CAST(obj);
+
+	cmd = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	if (IS_ERR(cmd)) {
+		err = PTR_ERR(cmd);
+		goto err;
+	}
+
+	*cmd++ = MI_STORE_REGISTER_MEM_GEN8;
+	*cmd++ = i915_mmio_reg_offset(GEN8_R_PWR_CLK_STATE);
+	*cmd++ = lower_32_bits(vma->node.start);
+	*cmd++ = upper_32_bits(vma->node.start);
+	*cmd = MI_BATCH_BUFFER_END;
+
+	i915_gem_object_unpin_map(obj);
+
+	err = i915_gem_object_set_to_gtt_domain(obj, false);
+	if (err)
+		goto err;
+
+	vma = i915_vma_instance(obj, vma->vm, NULL);
+	if (IS_ERR(vma)) {
+		err = PTR_ERR(vma);
+		goto err;
+	}
+
+	err = i915_vma_pin(vma, 0, 0, PIN_USER);
+	if (err)
+		goto err;
+
+	return vma;
+
+err:
+	i915_gem_object_put(obj);
+	return ERR_PTR(err);
+}
+
+static int
+emit_rpcs_query(struct drm_i915_gem_object *obj,
+		struct i915_gem_context *ctx,
+		struct intel_engine_cs *engine,
+		struct i915_request **rq_out)
+{
+	struct i915_request *rq;
+	struct i915_vma *batch;
+	struct i915_vma *vma;
+	int err;
+
+	GEM_BUG_ON(!intel_engine_can_store_dword(engine));
+
+	vma = i915_vma_instance(obj, &ctx->ppgtt->vm, NULL);
+	if (IS_ERR(vma))
+		return PTR_ERR(vma);
+
+	err = i915_gem_object_set_to_gtt_domain(obj, false);
+	if (err)
+		return err;
+
+	err = i915_vma_pin(vma, 0, 0, PIN_USER);
+	if (err)
+		return err;
+
+	batch = rpcs_query_batch(vma);
+	if (IS_ERR(batch)) {
+		err = PTR_ERR(batch);
+		goto err_vma;
+	}
+
+	rq = i915_request_alloc(engine, ctx);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto err_batch;
+	}
+
+	err = engine->emit_bb_start(rq, batch->node.start, batch->node.size, 0);
+	if (err)
+		goto err_request;
+
+	err = i915_vma_move_to_active(batch, rq, 0);
+	if (err)
+		goto skip_request;
+
+	err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
+	if (err)
+		goto skip_request;
+
+	i915_gem_object_set_active_reference(batch->obj);
+	i915_vma_unpin(batch);
+	i915_vma_close(batch);
+
+	i915_vma_unpin(vma);
+
+	*rq_out = i915_request_get(rq);
+
+	i915_request_add(rq);
+
+	return 0;
+
+skip_request:
+	i915_request_skip(rq, err);
+err_request:
+	i915_request_add(rq);
+err_batch:
+	i915_vma_unpin(batch);
+err_vma:
+	i915_vma_unpin(vma);
+
+	return err;
+}
+
+#define TEST_IDLE	BIT(0)
+#define TEST_BUSY	BIT(1)
+#define TEST_RESET	BIT(2)
+
+static int
+__sseu_prepare(struct drm_i915_private *i915,
+	       const char *name,
+	       unsigned int flags,
+	       struct i915_gem_context *ctx,
+	       struct intel_engine_cs *engine,
+	       struct igt_spinner **spin_out)
+{
+	int ret = 0;
+
+	if (flags & (TEST_BUSY | TEST_RESET)) {
+		struct igt_spinner *spin;
+		struct i915_request *rq;
+
+		spin = kzalloc(sizeof(*spin), GFP_KERNEL);
+		if (!spin) {
+			ret = -ENOMEM;
+			goto out;
+		}
+
+		ret = igt_spinner_init(spin, i915);
+		if (ret) {
+			kfree(spin);
+			goto out;
+		}
+
+		rq = igt_spinner_create_request(spin, ctx, engine, MI_NOOP);
+		if (IS_ERR(rq)) {
+			ret = PTR_ERR(rq);
+			igt_spinner_fini(spin);
+			kfree(spin);
+			goto out;
+		}
+
+		i915_request_add(rq);
+
+		if (!igt_wait_for_spinner(spin, rq)) {
+			pr_err("%s: Spinner failed to start!\n", name);
+			igt_spinner_end(spin);
+			igt_spinner_fini(spin);
+			kfree(spin);
+			ret = -ETIMEDOUT;
+			goto out;
+		}
+
+		*spin_out = spin;
+	}
+
+out:
+	return ret;
+}
+
+static int
+__read_slice_count(struct drm_i915_private *i915,
+		   struct i915_gem_context *ctx,
+		   struct intel_engine_cs *engine,
+		   struct drm_i915_gem_object *obj,
+		   struct igt_spinner *spin,
+		   u32 *rpcs)
+{
+	struct i915_request *rq = NULL;
+	u32 s_mask, s_shift;
+	unsigned int cnt;
+	u32 *buf, val;
+	long ret;
+
+	ret = emit_rpcs_query(obj, ctx, engine, &rq);
+	if (ret)
+		return ret;
+
+	if (spin)
+		igt_spinner_end(spin);
+
+	ret = i915_request_wait(rq, I915_WAIT_LOCKED, MAX_SCHEDULE_TIMEOUT);
+	i915_request_put(rq);
+	if (ret < 0)
+		return ret;
+
+	buf = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	if (IS_ERR(buf)) {
+		ret = PTR_ERR(buf);
+		return ret;
+	}
+
+	if (INTEL_GEN(i915) >= 11) {
+		s_mask = GEN11_RPCS_S_CNT_MASK;
+		s_shift = GEN11_RPCS_S_CNT_SHIFT;
+	} else {
+		s_mask = GEN8_RPCS_S_CNT_MASK;
+		s_shift = GEN8_RPCS_S_CNT_SHIFT;
+	}
+
+	val = *buf;
+	cnt = (val & s_mask) >> s_shift;
+	*rpcs = val;
+
+	i915_gem_object_unpin_map(obj);
+
+	return cnt;
+}
+
+static int
+__check_rpcs(const char *name, u32 rpcs, int slices, unsigned int expected,
+	     const char *prefix, const char *suffix)
+{
+	if (slices == expected)
+		return 0;
+
+	if (slices < 0) {
+		pr_err("%s: %s read slice count failed with %d%s\n",
+		       name, prefix, slices, suffix);
+		return slices;
+	}
+
+	pr_err("%s: %s slice count %d is not %u%s\n",
+	       name, prefix, slices, expected, suffix);
+
+	pr_info("RPCS=0x%x; %u%sx%u%s\n",
+		rpcs, slices,
+		(rpcs & GEN8_RPCS_S_CNT_ENABLE) ? "*" : "",
+		(rpcs & GEN8_RPCS_SS_CNT_MASK) >> GEN8_RPCS_SS_CNT_SHIFT,
+		(rpcs & GEN8_RPCS_SS_CNT_ENABLE) ? "*" : "");
+
+	return -EINVAL;
+}
+
+static int
+__sseu_finish(struct drm_i915_private *i915,
+	      const char *name,
+	      unsigned int flags,
+	      struct i915_gem_context *ctx,
+	      struct i915_gem_context *kctx,
+	      struct intel_engine_cs *engine,
+	      struct drm_i915_gem_object *obj,
+	      unsigned int expected,
+	      struct igt_spinner *spin)
+{
+	unsigned int slices =
+		hweight32(intel_device_default_sseu(i915).slice_mask);
+	u32 rpcs = 0;
+	int ret = 0;
+
+	if (flags & TEST_RESET) {
+		ret = i915_reset_engine(engine, "sseu");
+		if (ret)
+			goto out;
+	}
+
+	ret = __read_slice_count(i915, ctx, engine, obj,
+				 flags & TEST_RESET ? NULL : spin, &rpcs);
+	ret = __check_rpcs(name, rpcs, ret, expected, "Context", "!");
+	if (ret)
+		goto out;
+
+	ret = __read_slice_count(i915, kctx, engine, obj, NULL, &rpcs);
+	ret = __check_rpcs(name, rpcs, ret, slices, "Kernel context", "!");
+
+out:
+	if (spin)
+		igt_spinner_end(spin);
+
+	if ((flags & TEST_IDLE) && ret == 0) {
+		ret = i915_gem_wait_for_idle(i915,
+					     I915_WAIT_LOCKED,
+					     MAX_SCHEDULE_TIMEOUT);
+		if (ret)
+			return ret;
+
+		ret = __read_slice_count(i915, ctx, engine, obj, NULL, &rpcs);
+		ret = __check_rpcs(name, rpcs, ret, expected,
+				   "Context", " after idle!");
+	}
+
+	return ret;
+}
+
+static int
+__sseu_test(struct drm_i915_private *i915,
+	    const char *name,
+	    unsigned int flags,
+	    struct i915_gem_context *ctx,
+	    struct intel_engine_cs *engine,
+	    struct drm_i915_gem_object *obj,
+	    struct intel_sseu sseu)
+{
+	struct igt_spinner *spin = NULL;
+	struct i915_gem_context *kctx;
+	int ret;
+
+	kctx = kernel_context(i915);
+	if (IS_ERR(kctx))
+		return PTR_ERR(kctx);
+
+	ret = __sseu_prepare(i915, name, flags, ctx, engine, &spin);
+	if (ret)
+		goto out;
+
+	ret = __i915_gem_context_reconfigure_sseu(ctx, engine, sseu);
+	if (ret)
+		goto out;
+
+	ret = __sseu_finish(i915, name, flags, ctx, kctx, engine, obj,
+			    hweight32(sseu.slice_mask), spin);
+
+out:
+	if (spin) {
+		igt_spinner_end(spin);
+		igt_spinner_fini(spin);
+		kfree(spin);
+	}
+
+	kernel_context_close(kctx);
+
+	return ret;
+}
+
+static int
+__igt_ctx_sseu(struct drm_i915_private *i915,
+	       const char *name,
+	       unsigned int flags)
+{
+	struct intel_sseu default_sseu = intel_device_default_sseu(i915);
+	struct intel_engine_cs *engine = i915->engine[RCS];
+	struct drm_i915_gem_object *obj;
+	struct i915_gem_context *ctx;
+	struct intel_sseu pg_sseu;
+	intel_wakeref_t wakeref;
+	struct drm_file *file;
+	int ret;
+
+	if (INTEL_GEN(i915) < 9)
+		return 0;
+
+	if (!RUNTIME_INFO(i915)->sseu.has_slice_pg)
+		return 0;
+
+	if (hweight32(default_sseu.slice_mask) < 2)
+		return 0;
+
+	/*
+	 * Gen11 VME-friendly power-gated configuration with half of the
+	 * sub-slices enabled.
+	 */
+	pg_sseu = default_sseu;
+	pg_sseu.slice_mask = 1;
+	pg_sseu.subslice_mask =
+		~(~0 << (hweight32(default_sseu.subslice_mask) / 2));
+
+	pr_info("SSEU subtest '%s', flags=%x, def_slices=%u, pg_slices=%u\n",
+		name, flags, hweight32(default_sseu.slice_mask),
+		hweight32(pg_sseu.slice_mask));
+
+	file = mock_file(i915);
+	if (IS_ERR(file))
+		return PTR_ERR(file);
+
+	if (flags & TEST_RESET)
+		igt_global_reset_lock(i915);
+
+	mutex_lock(&i915->drm.struct_mutex);
+
+	ctx = i915_gem_create_context(i915, file->driver_priv);
+	if (IS_ERR(ctx)) {
+		ret = PTR_ERR(ctx);
+		goto out_unlock;
+	}
+
+	obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
+	if (IS_ERR(obj)) {
+		ret = PTR_ERR(obj);
+		goto out_unlock;
+	}
+
+	wakeref = intel_runtime_pm_get(i915);
+
+	/* First set the default mask. */
+	ret = __sseu_test(i915, name, flags, ctx, engine, obj, default_sseu);
+	if (ret)
+		goto out_fail;
+
+	/* Then set a power-gated configuration. */
+	ret = __sseu_test(i915, name, flags, ctx, engine, obj, pg_sseu);
+	if (ret)
+		goto out_fail;
+
+	/* Back to defaults. */
+	ret = __sseu_test(i915, name, flags, ctx, engine, obj, default_sseu);
+	if (ret)
+		goto out_fail;
+
+	/* One last power-gated configuration for the road. */
+	ret = __sseu_test(i915, name, flags, ctx, engine, obj, pg_sseu);
+	if (ret)
+		goto out_fail;
+
+out_fail:
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		ret = -EIO;
+
+	i915_gem_object_put(obj);
+
+	intel_runtime_pm_put(i915, wakeref);
+
+out_unlock:
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	if (flags & TEST_RESET)
+		igt_global_reset_unlock(i915);
+
+	mock_file_free(i915, file);
+
+	if (ret)
+		pr_err("%s: Failed with %d!\n", name, ret);
+
+	return ret;
+}
+
+static int igt_ctx_sseu(void *arg)
+{
+	struct {
+		const char *name;
+		unsigned int flags;
+	} *phase, phases[] = {
+		{ .name = "basic", .flags = 0 },
+		{ .name = "idle", .flags = TEST_IDLE },
+		{ .name = "busy", .flags = TEST_BUSY },
+		{ .name = "busy-reset", .flags = TEST_BUSY | TEST_RESET },
+		{ .name = "busy-idle", .flags = TEST_BUSY | TEST_IDLE },
+		{ .name = "reset-idle", .flags = TEST_RESET | TEST_IDLE },
+	};
+	unsigned int i;
+	int ret = 0;
+
+	for (i = 0, phase = phases; ret == 0 && i < ARRAY_SIZE(phases);
+	     i++, phase++)
+		ret = __igt_ctx_sseu(arg, phase->name, phase->flags);
+
+	return ret;
+}
+
 static int igt_ctx_readonly(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
@@ -657,11 +1049,11 @@ static int igt_ctx_readonly(void *arg)
 	struct i915_gem_context *ctx;
 	struct i915_hw_ppgtt *ppgtt;
 	unsigned long ndwords, dw;
+	struct igt_live_test t;
 	struct drm_file *file;
 	I915_RND_STATE(prng);
 	IGT_TIMEOUT(end_time);
 	LIST_HEAD(objects);
-	struct live_test t;
 	int err = -ENODEV;
 
 	/*
@@ -676,7 +1068,7 @@ static int igt_ctx_readonly(void *arg)
 
 	mutex_lock(&i915->drm.struct_mutex);
 
-	err = begin_live_test(&t, i915, __func__, "");
+	err = igt_live_test_begin(&t, i915, __func__, "");
 	if (err)
 		goto out_unlock;
 
@@ -699,6 +1091,8 @@ static int igt_ctx_readonly(void *arg)
 		unsigned int id;
 
 		for_each_engine(engine, i915, id) {
+			intel_wakeref_t wakeref;
+
 			if (!intel_engine_can_store_dword(engine))
 				continue;
 
@@ -713,9 +1107,9 @@ static int igt_ctx_readonly(void *arg)
 					i915_gem_object_set_readonly(obj);
 			}
 
-			intel_runtime_pm_get(i915);
-			err = gpu_fill(obj, ctx, engine, dw);
-			intel_runtime_pm_put(i915);
+			err = 0;
+			with_intel_runtime_pm(i915, wakeref)
+				err = gpu_fill(obj, ctx, engine, dw);
 			if (err) {
 				pr_err("Failed to fill dword %lu [%lu/%lu] with gpu (%s) in ctx %u [full-ppgtt? %s], err=%d\n",
 				       ndwords, dw, max_dwords(obj),
@@ -732,7 +1126,7 @@ static int igt_ctx_readonly(void *arg)
 		}
 	}
 	pr_info("Submitted %lu dwords (across %u engines)\n",
-		ndwords, INTEL_INFO(i915)->num_rings);
+		ndwords, RUNTIME_INFO(i915)->num_rings);
 
 	dw = 0;
 	list_for_each_entry(obj, &objects, st_link) {
@@ -752,7 +1146,7 @@ static int igt_ctx_readonly(void *arg)
 	}
 
 out_unlock:
-	if (end_live_test(&t))
+	if (igt_live_test_end(&t))
 		err = -EIO;
 	mutex_unlock(&i915->drm.struct_mutex);
 
@@ -976,10 +1370,11 @@ static int igt_vm_isolation(void *arg)
 	struct drm_i915_private *i915 = arg;
 	struct i915_gem_context *ctx_a, *ctx_b;
 	struct intel_engine_cs *engine;
+	intel_wakeref_t wakeref;
+	struct igt_live_test t;
 	struct drm_file *file;
 	I915_RND_STATE(prng);
 	unsigned long count;
-	struct live_test t;
 	unsigned int id;
 	u64 vm_total;
 	int err;
@@ -998,7 +1393,7 @@ static int igt_vm_isolation(void *arg)
 
 	mutex_lock(&i915->drm.struct_mutex);
 
-	err = begin_live_test(&t, i915, __func__, "");
+	err = igt_live_test_begin(&t, i915, __func__, "");
 	if (err)
 		goto out_unlock;
 
@@ -1022,7 +1417,7 @@ static int igt_vm_isolation(void *arg)
 	GEM_BUG_ON(ctx_b->ppgtt->vm.total != vm_total);
 	vm_total -= I915_GTT_PAGE_SIZE;
 
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
 	count = 0;
 	for_each_engine(engine, i915, id) {
@@ -1064,12 +1459,12 @@ static int igt_vm_isolation(void *arg)
 		count += this;
 	}
 	pr_info("Checked %lu scratch offsets across %d engines\n",
-		count, INTEL_INFO(i915)->num_rings);
+		count, RUNTIME_INFO(i915)->num_rings);
 
 out_rpm:
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 out_unlock:
-	if (end_live_test(&t))
+	if (igt_live_test_end(&t))
 		err = -EIO;
 	mutex_unlock(&i915->drm.struct_mutex);
 
@@ -1165,6 +1560,7 @@ static int igt_switch_to_kernel_context(void *arg)
 	struct intel_engine_cs *engine;
 	struct i915_gem_context *ctx;
 	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
 	int err;
 
 	/*
@@ -1175,7 +1571,7 @@ static int igt_switch_to_kernel_context(void *arg)
 	 */
 
 	mutex_lock(&i915->drm.struct_mutex);
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
 	ctx = kernel_context(i915);
 	if (IS_ERR(ctx)) {
@@ -1200,7 +1596,7 @@ out_unlock:
 	if (igt_flush_test(i915, I915_WAIT_LOCKED))
 		err = -EIO;
 
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
 
 	kernel_context_close(ctx);
@@ -1232,6 +1628,7 @@ int i915_gem_context_live_selftests(struct drm_i915_private *dev_priv)
 		SUBTEST(live_nop_switch),
 		SUBTEST(igt_ctx_exec),
 		SUBTEST(igt_ctx_readonly),
+		SUBTEST(igt_ctx_sseu),
 		SUBTEST(igt_vm_isolation),
 	};
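
The power-gated configuration in __igt_ctx_sseu() above keeps one slice and half of the default sub-slices, computed as ~(~0 << (hweight32(mask) / 2)). A standalone check of what that expression produces (hweight32 is the kernel's population count; __builtin_popcount stands in for it in this sketch):

#include <stdio.h>

static unsigned int half_mask(unsigned int mask)
{
	unsigned int n = __builtin_popcount(mask) / 2;

	return ~(~0u << n);	/* the lowest n bits set */
}

int main(void)
{
	/* eight sub-slices available -> keep the lowest four */
	printf("0x%x\n", half_mask(0xff));	/* prints 0xf */
	return 0;
}
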
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
index 4365979d8222..32dce7176f63 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
@@ -29,11 +29,23 @@
 #include "mock_drm.h"
 #include "mock_gem_device.h"
 
-static int populate_ggtt(struct drm_i915_private *i915)
+static void quirk_add(struct drm_i915_gem_object *obj,
+		      struct list_head *objects)
 {
+	/* quirked is only used for live tiled objects; reuse it to declare ownership */
+	GEM_BUG_ON(obj->mm.quirked);
+	obj->mm.quirked = true;
+	list_add(&obj->st_link, objects);
+}
+
+static int populate_ggtt(struct drm_i915_private *i915,
+			 struct list_head *objects)
+{
+	unsigned long unbound, bound, count;
 	struct drm_i915_gem_object *obj;
 	u64 size;
 
+	count = 0;
 	for (size = 0;
 	     size + I915_GTT_PAGE_SIZE <= i915->ggtt.vm.total;
 	     size += I915_GTT_PAGE_SIZE) {
@@ -43,21 +55,36 @@ static int populate_ggtt(struct drm_i915_private *i915)
 		if (IS_ERR(obj))
 			return PTR_ERR(obj);
 
+		quirk_add(obj, objects);
+
 		vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, 0);
 		if (IS_ERR(vma))
 			return PTR_ERR(vma);
+
+		count++;
 	}
 
-	if (!list_empty(&i915->mm.unbound_list)) {
-		size = 0;
-		list_for_each_entry(obj, &i915->mm.unbound_list, mm.link)
-			size++;
+	unbound = 0;
+	list_for_each_entry(obj, &i915->mm.unbound_list, mm.link)
+		if (obj->mm.quirked)
+			unbound++;
+	if (unbound) {
+		pr_err("%s: Found %lu objects unbound, expected %u!\n",
+		       __func__, unbound, 0);
+		return -EINVAL;
+	}
 
-		pr_err("Found %lld objects unbound!\n", size);
+	bound = 0;
+	list_for_each_entry(obj, &i915->mm.bound_list, mm.link)
+		if (obj->mm.quirked)
+			bound++;
+	if (bound != count) {
+		pr_err("%s: Found %lu objects bound, expected %lu!\n",
+		       __func__, bound, count);
 		return -EINVAL;
 	}
 
-	if (list_empty(&i915->ggtt.vm.inactive_list)) {
+	if (list_empty(&i915->ggtt.vm.bound_list)) {
 		pr_err("No objects on the GGTT inactive list!\n");
 		return -EINVAL;
 	}
@@ -67,21 +94,26 @@ static int populate_ggtt(struct drm_i915_private *i915)
 
 static void unpin_ggtt(struct drm_i915_private *i915)
 {
+	struct i915_ggtt *ggtt = &i915->ggtt;
 	struct i915_vma *vma;
 
-	list_for_each_entry(vma, &i915->ggtt.vm.inactive_list, vm_link)
-		i915_vma_unpin(vma);
+	mutex_lock(&ggtt->vm.mutex);
+	list_for_each_entry(vma, &i915->ggtt.vm.bound_list, vm_link)
+		if (vma->obj->mm.quirked)
+			i915_vma_unpin(vma);
+	mutex_unlock(&ggtt->vm.mutex);
 }
 
-static void cleanup_objects(struct drm_i915_private *i915)
+static void cleanup_objects(struct drm_i915_private *i915,
+			    struct list_head *list)
 {
 	struct drm_i915_gem_object *obj, *on;
 
-	list_for_each_entry_safe(obj, on, &i915->mm.unbound_list, mm.link)
-		i915_gem_object_put(obj);
-
-	list_for_each_entry_safe(obj, on, &i915->mm.bound_list, mm.link)
+	list_for_each_entry_safe(obj, on, list, st_link) {
+		GEM_BUG_ON(!obj->mm.quirked);
+		obj->mm.quirked = false;
 		i915_gem_object_put(obj);
+	}
 
 	mutex_unlock(&i915->drm.struct_mutex);
 
@@ -94,11 +126,12 @@ static int igt_evict_something(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
 	struct i915_ggtt *ggtt = &i915->ggtt;
+	LIST_HEAD(objects);
 	int err;
 
 	/* Fill the GGTT with pinned objects and try to evict one. */
 
-	err = populate_ggtt(i915);
+	err = populate_ggtt(i915, &objects);
 	if (err)
 		goto cleanup;
 
@@ -127,7 +160,7 @@ static int igt_evict_something(void *arg)
 	}
 
 cleanup:
-	cleanup_objects(i915);
+	cleanup_objects(i915, &objects);
 	return err;
 }
 
@@ -136,13 +169,14 @@ static int igt_overcommit(void *arg)
 	struct drm_i915_private *i915 = arg;
 	struct drm_i915_gem_object *obj;
 	struct i915_vma *vma;
+	LIST_HEAD(objects);
 	int err;
 
 	/* Fill the GGTT with pinned objects and then try to pin one more.
 	 * We expect it to fail.
 	 */
 
-	err = populate_ggtt(i915);
+	err = populate_ggtt(i915, &objects);
 	if (err)
 		goto cleanup;
 
@@ -152,6 +186,8 @@ static int igt_overcommit(void *arg)
 		goto cleanup;
 	}
 
+	quirk_add(obj, &objects);
+
 	vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, 0);
 	if (!IS_ERR(vma) || PTR_ERR(vma) != -ENOSPC) {
 		pr_err("Failed to evict+insert, i915_gem_object_ggtt_pin returned err=%d\n", (int)PTR_ERR(vma));
@@ -160,7 +196,7 @@ static int igt_overcommit(void *arg)
 	}
 
 cleanup:
-	cleanup_objects(i915);
+	cleanup_objects(i915, &objects);
 	return err;
 }
 
@@ -172,11 +208,12 @@ static int igt_evict_for_vma(void *arg)
 		.start = 0,
 		.size = 4096,
 	};
+	LIST_HEAD(objects);
 	int err;
 
 	/* Fill the GGTT with pinned objects and try to evict a range. */
 
-	err = populate_ggtt(i915);
+	err = populate_ggtt(i915, &objects);
 	if (err)
 		goto cleanup;
 
@@ -199,7 +236,7 @@ static int igt_evict_for_vma(void *arg)
 	}
 
 cleanup:
-	cleanup_objects(i915);
+	cleanup_objects(i915, &objects);
 	return err;
 }
 
@@ -222,6 +259,7 @@ static int igt_evict_for_cache_color(void *arg)
 	};
 	struct drm_i915_gem_object *obj;
 	struct i915_vma *vma;
+	LIST_HEAD(objects);
 	int err;
 
 	/* Currently the use of color_adjust is limited to cache domains within
@@ -237,6 +275,7 @@ static int igt_evict_for_cache_color(void *arg)
 		goto cleanup;
 	}
 	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
+	quirk_add(obj, &objects);
 
 	vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
 				       I915_GTT_PAGE_SIZE | flags);
@@ -252,6 +291,7 @@ static int igt_evict_for_cache_color(void *arg)
 		goto cleanup;
 	}
 	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
+	quirk_add(obj, &objects);
 
 	/* Neighbouring; same colour - should fit */
 	vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
@@ -287,7 +327,7 @@ static int igt_evict_for_cache_color(void *arg)
 
 cleanup:
 	unpin_ggtt(i915);
-	cleanup_objects(i915);
+	cleanup_objects(i915, &objects);
 	ggtt->vm.mm.color_adjust = NULL;
 	return err;
 }
@@ -296,11 +336,12 @@ static int igt_evict_vm(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
 	struct i915_ggtt *ggtt = &i915->ggtt;
+	LIST_HEAD(objects);
 	int err;
 
 	/* Fill the GGTT with pinned objects and try to evict everything. */
 
-	err = populate_ggtt(i915);
+	err = populate_ggtt(i915, &objects);
 	if (err)
 		goto cleanup;
 
@@ -322,7 +363,7 @@ static int igt_evict_vm(void *arg)
 	}
 
 cleanup:
-	cleanup_objects(i915);
+	cleanup_objects(i915, &objects);
 	return err;
 }
 
@@ -336,6 +377,7 @@ static int igt_evict_contexts(void *arg)
 		struct drm_mm_node node;
 		struct reserved *next;
 	} *reserved = NULL;
+	intel_wakeref_t wakeref;
 	struct drm_mm_node hole;
 	unsigned long count;
 	int err;
@@ -355,7 +397,7 @@ static int igt_evict_contexts(void *arg)
 		return 0;
 
 	mutex_lock(&i915->drm.struct_mutex);
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
 	/* Reserve a block so that we know we have enough to fit a few rq */
 	memset(&hole, 0, sizeof(hole));
@@ -400,8 +442,10 @@ static int igt_evict_contexts(void *arg)
 		struct drm_file *file;
 
 		file = mock_file(i915);
-		if (IS_ERR(file))
-			return PTR_ERR(file);
+		if (IS_ERR(file)) {
+			err = PTR_ERR(file);
+			break;
+		}
 
 		count = 0;
 		mutex_lock(&i915->drm.struct_mutex);
@@ -464,7 +508,7 @@ out_locked:
 	}
 	if (drm_mm_node_allocated(&hole))
 		drm_mm_remove_node(&hole);
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
 
 	return err;
@@ -480,14 +524,17 @@ int i915_gem_evict_mock_selftests(void)
 		SUBTEST(igt_overcommit),
 	};
 	struct drm_i915_private *i915;
-	int err;
+	intel_wakeref_t wakeref;
+	int err = 0;
 
 	i915 = mock_gem_device();
 	if (!i915)
 		return -ENOMEM;
 
 	mutex_lock(&i915->drm.struct_mutex);
-	err = i915_subtests(tests, i915);
+	with_intel_runtime_pm(i915, wakeref)
+		err = i915_subtests(tests, i915);
+
 	mutex_unlock(&i915->drm.struct_mutex);
 
 	drm_dev_put(&i915->drm);
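
The reworked eviction selftests above no longer assume they own every object on the global bound/unbound lists: quirk_add() tags each object the test created by reusing obj->mm.quirked, and the verification loops count only tagged objects. Reduced to its essentials, that is filtering a shared list by an ownership marker, as in this toy sketch:

#include <stdbool.h>
#include <stdio.h>

struct obj {
	bool mine;	/* stand-in for obj->mm.quirked */
	struct obj *next;
};

static unsigned long count_mine(const struct obj *head)
{
	unsigned long n = 0;

	for (; head; head = head->next)
		if (head->mine)
			n++;
	return n;
}

int main(void)
{
	struct obj c = { false, NULL };	/* someone else's object */
	struct obj b = { true, &c };
	struct obj a = { true, &b };

	printf("ours: %lu of 3\n", count_mine(&a));	/* prints 2 */
	return 0;
}
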
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index a9ed0ecc94e2..3850ef4a5ec8 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -275,6 +275,7 @@ static int lowlevel_hole(struct drm_i915_private *i915,
 
 		for (n = 0; n < count; n++) {
 			u64 addr = hole_start + order[n] * BIT_ULL(size);
+			intel_wakeref_t wakeref;
 
 			GEM_BUG_ON(addr + BIT_ULL(size) > vm->total);
 
@@ -293,9 +294,9 @@ static int lowlevel_hole(struct drm_i915_private *i915,
 			mock_vma.node.size = BIT_ULL(size);
 			mock_vma.node.start = addr;
 
-			intel_runtime_pm_get(i915);
+			wakeref = intel_runtime_pm_get(i915);
 			vm->insert_entries(vm, &mock_vma, I915_CACHE_NONE, 0);
-			intel_runtime_pm_put(i915);
+			intel_runtime_pm_put(i915, wakeref);
 		}
 		count = n;
 
@@ -1144,6 +1145,7 @@ static int igt_ggtt_page(void *arg)
 	struct drm_i915_private *i915 = arg;
 	struct i915_ggtt *ggtt = &i915->ggtt;
 	struct drm_i915_gem_object *obj;
+	intel_wakeref_t wakeref;
 	struct drm_mm_node tmp;
 	unsigned int *order, n;
 	int err;
@@ -1169,7 +1171,7 @@ static int igt_ggtt_page(void *arg)
 	if (err)
 		goto out_unpin;
 
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
 	for (n = 0; n < count; n++) {
 		u64 offset = tmp.start + n * PAGE_SIZE;
@@ -1216,7 +1218,7 @@ static int igt_ggtt_page(void *arg)
 	kfree(order);
 out_remove:
 	ggtt->vm.clear_range(&ggtt->vm, tmp.start, tmp.size);
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	drm_mm_remove_node(&tmp);
 out_unpin:
 	i915_gem_object_unpin_pages(obj);
@@ -1235,7 +1237,10 @@ static void track_vma_bind(struct i915_vma *vma)
 	__i915_gem_object_pin_pages(obj);
 
 	vma->pages = obj->mm.pages;
-	list_move_tail(&vma->vm_link, &vma->vm->inactive_list);
+
+	mutex_lock(&vma->vm->mutex);
+	list_move_tail(&vma->vm_link, &vma->vm->bound_list);
+	mutex_unlock(&vma->vm->mutex);
 }
 
 static int exercise_mock(struct drm_i915_private *i915,
@@ -1265,27 +1270,35 @@ static int exercise_mock(struct drm_i915_private *i915,
 
 static int igt_mock_fill(void *arg)
 {
-	return exercise_mock(arg, fill_hole);
+	struct i915_ggtt *ggtt = arg;
+
+	return exercise_mock(ggtt->vm.i915, fill_hole);
 }
 
 static int igt_mock_walk(void *arg)
 {
-	return exercise_mock(arg, walk_hole);
+	struct i915_ggtt *ggtt = arg;
+
+	return exercise_mock(ggtt->vm.i915, walk_hole);
 }
 
 static int igt_mock_pot(void *arg)
 {
-	return exercise_mock(arg, pot_hole);
+	struct i915_ggtt *ggtt = arg;
+
+	return exercise_mock(ggtt->vm.i915, pot_hole);
 }
 
 static int igt_mock_drunk(void *arg)
 {
-	return exercise_mock(arg, drunk_hole);
+	struct i915_ggtt *ggtt = arg;
+
+	return exercise_mock(ggtt->vm.i915, drunk_hole);
 }
 
 static int igt_gtt_reserve(void *arg)
 {
-	struct drm_i915_private *i915 = arg;
+	struct i915_ggtt *ggtt = arg;
 	struct drm_i915_gem_object *obj, *on;
 	LIST_HEAD(objects);
 	u64 total;
@@ -1298,11 +1311,12 @@ static int igt_gtt_reserve(void *arg)
 
 	/* Start by filling the GGTT */
 	for (total = 0;
-	     total + 2*I915_GTT_PAGE_SIZE <= i915->ggtt.vm.total;
-	     total += 2*I915_GTT_PAGE_SIZE) {
+	     total + 2 * I915_GTT_PAGE_SIZE <= ggtt->vm.total;
+	     total += 2 * I915_GTT_PAGE_SIZE) {
 		struct i915_vma *vma;
 
-		obj = i915_gem_object_create_internal(i915, 2*PAGE_SIZE);
+		obj = i915_gem_object_create_internal(ggtt->vm.i915,
+						      2 * PAGE_SIZE);
 		if (IS_ERR(obj)) {
 			err = PTR_ERR(obj);
 			goto out;
@@ -1316,20 +1330,20 @@ static int igt_gtt_reserve(void *arg)
 
 		list_add(&obj->st_link, &objects);
 
-		vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL);
+		vma = i915_vma_instance(obj, &ggtt->vm, NULL);
 		if (IS_ERR(vma)) {
 			err = PTR_ERR(vma);
 			goto out;
 		}
 
-		err = i915_gem_gtt_reserve(&i915->ggtt.vm, &vma->node,
+		err = i915_gem_gtt_reserve(&ggtt->vm, &vma->node,
 					   obj->base.size,
 					   total,
 					   obj->cache_level,
 					   0);
 		if (err) {
 			pr_err("i915_gem_gtt_reserve (pass 1) failed at %llu/%llu with err=%d\n",
-			       total, i915->ggtt.vm.total, err);
+			       total, ggtt->vm.total, err);
 			goto out;
 		}
 		track_vma_bind(vma);
@@ -1347,11 +1361,12 @@ static int igt_gtt_reserve(void *arg)
 
 	/* Now we start forcing evictions */
 	for (total = I915_GTT_PAGE_SIZE;
-	     total + 2*I915_GTT_PAGE_SIZE <= i915->ggtt.vm.total;
-	     total += 2*I915_GTT_PAGE_SIZE) {
+	     total + 2 * I915_GTT_PAGE_SIZE <= ggtt->vm.total;
+	     total += 2 * I915_GTT_PAGE_SIZE) {
 		struct i915_vma *vma;
 
-		obj = i915_gem_object_create_internal(i915, 2*PAGE_SIZE);
+		obj = i915_gem_object_create_internal(ggtt->vm.i915,
+						      2 * PAGE_SIZE);
 		if (IS_ERR(obj)) {
 			err = PTR_ERR(obj);
 			goto out;
@@ -1365,20 +1380,20 @@ static int igt_gtt_reserve(void *arg)
 
 		list_add(&obj->st_link, &objects);
 
-		vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL);
+		vma = i915_vma_instance(obj, &ggtt->vm, NULL);
 		if (IS_ERR(vma)) {
 			err = PTR_ERR(vma);
 			goto out;
 		}
 
-		err = i915_gem_gtt_reserve(&i915->ggtt.vm, &vma->node,
+		err = i915_gem_gtt_reserve(&ggtt->vm, &vma->node,
 					   obj->base.size,
 					   total,
 					   obj->cache_level,
 					   0);
 		if (err) {
 			pr_err("i915_gem_gtt_reserve (pass 2) failed at %llu/%llu with err=%d\n",
-			       total, i915->ggtt.vm.total, err);
+			       total, ggtt->vm.total, err);
 			goto out;
 		}
 		track_vma_bind(vma);
@@ -1399,7 +1414,7 @@ static int igt_gtt_reserve(void *arg)
 		struct i915_vma *vma;
 		u64 offset;
 
-		vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL);
+		vma = i915_vma_instance(obj, &ggtt->vm, NULL);
 		if (IS_ERR(vma)) {
 			err = PTR_ERR(vma);
 			goto out;
@@ -1411,18 +1426,18 @@ static int igt_gtt_reserve(void *arg)
 			goto out;
 		}
 
-		offset = random_offset(0, i915->ggtt.vm.total,
+		offset = random_offset(0, ggtt->vm.total,
 				       2*I915_GTT_PAGE_SIZE,
 				       I915_GTT_MIN_ALIGNMENT);
 
-		err = i915_gem_gtt_reserve(&i915->ggtt.vm, &vma->node,
+		err = i915_gem_gtt_reserve(&ggtt->vm, &vma->node,
 					   obj->base.size,
 					   offset,
 					   obj->cache_level,
 					   0);
 		if (err) {
 			pr_err("i915_gem_gtt_reserve (pass 3) failed at %llu/%llu with err=%d\n",
-			       total, i915->ggtt.vm.total, err);
+			       total, ggtt->vm.total, err);
 			goto out;
 		}
 		track_vma_bind(vma);
@@ -1448,7 +1463,7 @@ out:
 
 static int igt_gtt_insert(void *arg)
 {
-	struct drm_i915_private *i915 = arg;
+	struct i915_ggtt *ggtt = arg;
 	struct drm_i915_gem_object *obj, *on;
 	struct drm_mm_node tmp = {};
 	const struct invalid_insert {
@@ -1457,8 +1472,8 @@ static int igt_gtt_insert(void *arg)
 		u64 start, end;
 	} invalid_insert[] = {
 		{
-			i915->ggtt.vm.total + I915_GTT_PAGE_SIZE, 0,
-			0, i915->ggtt.vm.total,
+			ggtt->vm.total + I915_GTT_PAGE_SIZE, 0,
+			0, ggtt->vm.total,
 		},
 		{
 			2*I915_GTT_PAGE_SIZE, 0,
@@ -1488,7 +1503,7 @@ static int igt_gtt_insert(void *arg)
 
 	/* Check a couple of obviously invalid requests */
 	for (ii = invalid_insert; ii->size; ii++) {
-		err = i915_gem_gtt_insert(&i915->ggtt.vm, &tmp,
+		err = i915_gem_gtt_insert(&ggtt->vm, &tmp,
 					  ii->size, ii->alignment,
 					  I915_COLOR_UNEVICTABLE,
 					  ii->start, ii->end,
@@ -1503,11 +1518,12 @@ static int igt_gtt_insert(void *arg)
 
 	/* Start by filling the GGTT */
 	for (total = 0;
-	     total + I915_GTT_PAGE_SIZE <= i915->ggtt.vm.total;
+	     total + I915_GTT_PAGE_SIZE <= ggtt->vm.total;
 	     total += I915_GTT_PAGE_SIZE) {
 		struct i915_vma *vma;
 
-		obj = i915_gem_object_create_internal(i915, I915_GTT_PAGE_SIZE);
+		obj = i915_gem_object_create_internal(ggtt->vm.i915,
+						      I915_GTT_PAGE_SIZE);
 		if (IS_ERR(obj)) {
 			err = PTR_ERR(obj);
 			goto out;
@@ -1521,15 +1537,15 @@ static int igt_gtt_insert(void *arg)
 
 		list_add(&obj->st_link, &objects);
 
-		vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL);
+		vma = i915_vma_instance(obj, &ggtt->vm, NULL);
 		if (IS_ERR(vma)) {
 			err = PTR_ERR(vma);
 			goto out;
 		}
 
-		err = i915_gem_gtt_insert(&i915->ggtt.vm, &vma->node,
+		err = i915_gem_gtt_insert(&ggtt->vm, &vma->node,
 					  obj->base.size, 0, obj->cache_level,
-					  0, i915->ggtt.vm.total,
+					  0, ggtt->vm.total,
 					  0);
 		if (err == -ENOSPC) {
 			/* maxed out the GGTT space */
@@ -1538,7 +1554,7 @@ static int igt_gtt_insert(void *arg)
 		}
 		if (err) {
 			pr_err("i915_gem_gtt_insert (pass 1) failed at %llu/%llu with err=%d\n",
-			       total, i915->ggtt.vm.total, err);
+			       total, ggtt->vm.total, err);
 			goto out;
 		}
 		track_vma_bind(vma);
@@ -1550,7 +1566,7 @@ static int igt_gtt_insert(void *arg)
 	list_for_each_entry(obj, &objects, st_link) {
 		struct i915_vma *vma;
 
-		vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL);
+		vma = i915_vma_instance(obj, &ggtt->vm, NULL);
 		if (IS_ERR(vma)) {
 			err = PTR_ERR(vma);
 			goto out;
@@ -1570,7 +1586,7 @@ static int igt_gtt_insert(void *arg)
 		struct i915_vma *vma;
 		u64 offset;
 
-		vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL);
+		vma = i915_vma_instance(obj, &ggtt->vm, NULL);
 		if (IS_ERR(vma)) {
 			err = PTR_ERR(vma);
 			goto out;
@@ -1585,13 +1601,13 @@ static int igt_gtt_insert(void *arg)
 			goto out;
 		}
 
-		err = i915_gem_gtt_insert(&i915->ggtt.vm, &vma->node,
+		err = i915_gem_gtt_insert(&ggtt->vm, &vma->node,
 					  obj->base.size, 0, obj->cache_level,
-					  0, i915->ggtt.vm.total,
+					  0, ggtt->vm.total,
 					  0);
 		if (err) {
 			pr_err("i915_gem_gtt_insert (pass 2) failed at %llu/%llu with err=%d\n",
-			       total, i915->ggtt.vm.total, err);
+			       total, ggtt->vm.total, err);
 			goto out;
 		}
 		track_vma_bind(vma);
@@ -1607,11 +1623,12 @@ static int igt_gtt_insert(void *arg)
 
 	/* And then force evictions */
 	for (total = 0;
-	     total + 2*I915_GTT_PAGE_SIZE <= i915->ggtt.vm.total;
-	     total += 2*I915_GTT_PAGE_SIZE) {
+	     total + 2 * I915_GTT_PAGE_SIZE <= ggtt->vm.total;
+	     total += 2 * I915_GTT_PAGE_SIZE) {
 		struct i915_vma *vma;
 
-		obj = i915_gem_object_create_internal(i915, 2*I915_GTT_PAGE_SIZE);
+		obj = i915_gem_object_create_internal(ggtt->vm.i915,
+						      2 * I915_GTT_PAGE_SIZE);
 		if (IS_ERR(obj)) {
 			err = PTR_ERR(obj);
 			goto out;
@@ -1625,19 +1642,19 @@ static int igt_gtt_insert(void *arg)
 
 		list_add(&obj->st_link, &objects);
 
-		vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL);
+		vma = i915_vma_instance(obj, &ggtt->vm, NULL);
 		if (IS_ERR(vma)) {
 			err = PTR_ERR(vma);
 			goto out;
 		}
 
-		err = i915_gem_gtt_insert(&i915->ggtt.vm, &vma->node,
+		err = i915_gem_gtt_insert(&ggtt->vm, &vma->node,
 					  obj->base.size, 0, obj->cache_level,
-					  0, i915->ggtt.vm.total,
+					  0, ggtt->vm.total,
 					  0);
 		if (err) {
 			pr_err("i915_gem_gtt_insert (pass 3) failed at %llu/%llu with err=%d\n",
-			       total, i915->ggtt.vm.total, err);
+			       total, ggtt->vm.total, err);
 			goto out;
 		}
 		track_vma_bind(vma);
@@ -1664,17 +1681,25 @@ int i915_gem_gtt_mock_selftests(void)
 		SUBTEST(igt_gtt_insert),
 	};
 	struct drm_i915_private *i915;
+	struct i915_ggtt ggtt;
 	int err;
 
 	i915 = mock_gem_device();
 	if (!i915)
 		return -ENOMEM;
 
+	mock_init_ggtt(i915, &ggtt);
+
 	mutex_lock(&i915->drm.struct_mutex);
-	err = i915_subtests(tests, i915);
+	err = i915_subtests(tests, &ggtt);
+	mock_device_flush(i915);
 	mutex_unlock(&i915->drm.struct_mutex);
 
+	i915_gem_drain_freed_objects(i915);
+
+	mock_fini_ggtt(&ggtt);
 	drm_dev_put(&i915->drm);
+
 	return err;
 }
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
index c3999dd2021e..395ae878e0f7 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
@@ -238,6 +238,7 @@ static int check_partial_mapping(struct drm_i915_gem_object *obj,
 		u32 *cpu;
 
 		GEM_BUG_ON(view.partial.size > nreal);
+		cond_resched();
 
 		err = i915_gem_object_set_to_gtt_domain(obj, true);
 		if (err) {
@@ -307,6 +308,7 @@ static int igt_partial_tiling(void *arg)
 	const unsigned int nreal = 1 << 12; /* largest tile row x2 */
 	struct drm_i915_private *i915 = arg;
 	struct drm_i915_gem_object *obj;
+	intel_wakeref_t wakeref;
 	int tiling;
 	int err;
 
@@ -332,7 +334,7 @@ static int igt_partial_tiling(void *arg)
 	}
 
 	mutex_lock(&i915->drm.struct_mutex);
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
 	if (1) {
 		IGT_TIMEOUT(end);
@@ -443,7 +445,7 @@ next_tiling: ;
 	}
 
 out_unlock:
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
 	i915_gem_object_unpin_pages(obj);
 out:
@@ -505,11 +507,13 @@ static void disable_retire_worker(struct drm_i915_private *i915)
 
 	mutex_lock(&i915->drm.struct_mutex);
 	if (!i915->gt.active_requests++) {
-		intel_runtime_pm_get(i915);
-		i915_gem_unpark(i915);
-		intel_runtime_pm_put(i915);
+		intel_wakeref_t wakeref;
+
+		with_intel_runtime_pm(i915, wakeref)
+			i915_gem_unpark(i915);
 	}
 	mutex_unlock(&i915->drm.struct_mutex);
+
 	cancel_delayed_work_sync(&i915->gt.retire_work);
 	cancel_delayed_work_sync(&i915->gt.idle_work);
 }
@@ -577,6 +581,8 @@ static int igt_mmap_offset_exhaustion(void *arg)
 
 	/* Now fill with busy dead objects that we expect to reap */
 	for (loop = 0; loop < 3; loop++) {
+		intel_wakeref_t wakeref;
+
 		if (i915_terminally_wedged(&i915->gpu_error))
 			break;
 
@@ -586,10 +592,10 @@ static int igt_mmap_offset_exhaustion(void *arg)
 			goto out;
 		}
 
+		err = 0;
 		mutex_lock(&i915->drm.struct_mutex);
-		intel_runtime_pm_get(i915);
-		err = make_obj_busy(obj);
-		intel_runtime_pm_put(i915);
+		with_intel_runtime_pm(i915, wakeref)
+			err = make_obj_busy(obj);
 		mutex_unlock(&i915->drm.struct_mutex);
 		if (err) {
 			pr_err("[loop %d] Failed to busy the object\n", loop);
diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
index a15713cae3b3..6d766925ad04 100644
--- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
@@ -12,7 +12,9 @@
 selftest(sanitycheck, i915_live_sanitycheck) /* keep first (igt selfcheck) */
 selftest(uncore, intel_uncore_live_selftests)
 selftest(workarounds, intel_workarounds_live_selftests)
+selftest(timelines, i915_timeline_live_selftests)
 selftest(requests, i915_request_live_selftests)
+selftest(active, i915_active_live_selftests)
 selftest(objects, i915_gem_object_live_selftests)
 selftest(dmabuf, i915_gem_dmabuf_live_selftests)
 selftest(coherency, i915_gem_coherency_live_selftests)
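
For context, these selftest lists are x-macro headers: each includer defines selftest(name, func) before pulling the list in, so one header can expand into declarations, an enum, or a dispatch table. A self-contained illustration of the idiom (the test names are hypothetical and the list is inlined as a macro here, whereas i915 keeps it in a separate header):

#include <stdio.h>

static int run_sanitycheck(void) { return 0; }
static int run_timelines(void)   { return 0; }

#define SELFTEST_LIST \
	selftest(sanitycheck, run_sanitycheck) \
	selftest(timelines, run_timelines)

/* One expansion of the list: a table of name/function pairs. */
#define selftest(n, f) { #n, f },
static const struct test {
	const char *name;
	int (*func)(void);
} tests[] = {
	SELFTEST_LIST
};
#undef selftest

int main(void)
{
	size_t i;

	for (i = 0; i < sizeof(tests) / sizeof(tests[0]); i++)
		printf("%s: %d\n", tests[i].name, tests[i].func());
	return 0;
}
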
diff --git a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
index 1b70208eeea7..88e5ab586337 100644
--- a/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
@@ -15,8 +15,7 @@ selftest(scatterlist, scatterlist_mock_selftests)
 selftest(syncmap, i915_syncmap_mock_selftests)
 selftest(uncore, intel_uncore_mock_selftests)
 selftest(engine, intel_engine_cs_mock_selftests)
-selftest(breadcrumbs, intel_breadcrumbs_mock_selftests)
-selftest(timelines, i915_gem_timeline_mock_selftests)
+selftest(timelines, i915_timeline_mock_selftests)
 selftest(requests, i915_request_mock_selftests)
 selftest(objects, i915_gem_object_mock_selftests)
 selftest(dmabuf, i915_gem_dmabuf_mock_selftests)
diff --git a/drivers/gpu/drm/i915/selftests/i915_random.c b/drivers/gpu/drm/i915/selftests/i915_random.c
index 1f415ce47018..716a3f19f030 100644
--- a/drivers/gpu/drm/i915/selftests/i915_random.c
+++ b/drivers/gpu/drm/i915/selftests/i915_random.c
@@ -41,18 +41,37 @@ u64 i915_prandom_u64_state(struct rnd_state *rnd)
 	return x;
 }
 
-void i915_random_reorder(unsigned int *order, unsigned int count,
-			 struct rnd_state *state)
+void i915_prandom_shuffle(void *arr, size_t elsz, size_t count,
+			  struct rnd_state *state)
 {
-	unsigned int i, j;
+	char stack[128];
+
+	if (WARN_ON(elsz > sizeof(stack) || count > U32_MAX))
+		return;
+
+	if (!elsz || !count)
+		return;
+
+	/* Fisher-Yates shuffle courtesy of Knuth */
+	while (--count) {
+		size_t swp;
+
+		swp = i915_prandom_u32_max_state(count + 1, state);
+		if (swp == count)
+			continue;
 
-	for (i = 0; i < count; i++) {
-		BUILD_BUG_ON(sizeof(unsigned int) > sizeof(u32));
-		j = i915_prandom_u32_max_state(count, state);
-		swap(order[i], order[j]);
+		memcpy(stack, arr + count * elsz, elsz);
+		memcpy(arr + count * elsz, arr + swp * elsz, elsz);
+		memcpy(arr + swp * elsz, stack, elsz);
 	}
 }
 
+void i915_random_reorder(unsigned int *order, unsigned int count,
+			 struct rnd_state *state)
+{
+	i915_prandom_shuffle(order, sizeof(*order), count, state);
+}
+
 unsigned int *i915_random_order(unsigned int count, struct rnd_state *state)
 {
 	unsigned int *order, i;
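
i915_prandom_shuffle() above generalises the old unsigned-int-only reorder to arbitrary element sizes: a Fisher-Yates walk that three-way-swaps elements through a small stack buffer. The same routine in standalone form, with rand() standing in for the kernel's prandom state (that substitution is this sketch's only liberty):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static void shuffle(void *arr, size_t elsz, size_t count)
{
	char stack[128];

	if (elsz > sizeof(stack) || !elsz || !count)
		return;

	/* Fisher-Yates: swap element count with a random one in [0, count] */
	while (--count) {
		size_t swp = (size_t)rand() % (count + 1);

		if (swp == count)
			continue;

		/* three-way swap through the stack buffer */
		memcpy(stack, (char *)arr + count * elsz, elsz);
		memcpy((char *)arr + count * elsz,
		       (char *)arr + swp * elsz, elsz);
		memcpy((char *)arr + swp * elsz, stack, elsz);
	}
}

int main(void)
{
	unsigned int order[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
	size_t i;

	shuffle(order, sizeof(*order), 8);
	for (i = 0; i < 8; i++)
		printf("%u ", order[i]);
	printf("\n");
	return 0;
}
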
diff --git a/drivers/gpu/drm/i915/selftests/i915_random.h b/drivers/gpu/drm/i915/selftests/i915_random.h
index 7dffedc501ca..8e1ff9c105b6 100644
--- a/drivers/gpu/drm/i915/selftests/i915_random.h
+++ b/drivers/gpu/drm/i915/selftests/i915_random.h
@@ -54,4 +54,7 @@ void i915_random_reorder(unsigned int *order,
 			 unsigned int count,
 			 struct rnd_state *state);
 
+void i915_prandom_shuffle(void *arr, size_t elsz, size_t count,
+			  struct rnd_state *state);
+
 #endif /* !__I915_SELFTESTS_RANDOM_H__ */
diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
index 07e557815308..6733dc5b6b4c 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -25,8 +25,12 @@
 #include <linux/prime_numbers.h>
 
 #include "../i915_selftest.h"
+#include "i915_random.h"
+#include "igt_live_test.h"
+#include "lib_sw_fence.h"
 
 #include "mock_context.h"
+#include "mock_drm.h"
 #include "mock_gem_device.h"
 
 static int igt_add_request(void *arg)
@@ -246,93 +250,285 @@ err_context_0:
 	return err;
 }
 
-int i915_request_mock_selftests(void)
+struct smoketest {
+	struct intel_engine_cs *engine;
+	struct i915_gem_context **contexts;
+	atomic_long_t num_waits, num_fences;
+	int ncontexts, max_batch;
+	struct i915_request *(*request_alloc)(struct i915_gem_context *,
+					      struct intel_engine_cs *);
+};
+
+static struct i915_request *
+__mock_request_alloc(struct i915_gem_context *ctx,
+		     struct intel_engine_cs *engine)
 {
-	static const struct i915_subtest tests[] = {
-		SUBTEST(igt_add_request),
-		SUBTEST(igt_wait_request),
-		SUBTEST(igt_fence_wait),
-		SUBTEST(igt_request_rewind),
-	};
-	struct drm_i915_private *i915;
-	int err;
+	return mock_request(engine, ctx, 0);
+}
 
-	i915 = mock_gem_device();
-	if (!i915)
+static struct i915_request *
+__live_request_alloc(struct i915_gem_context *ctx,
+		     struct intel_engine_cs *engine)
+{
+	return i915_request_alloc(engine, ctx);
+}
+
+static int __igt_breadcrumbs_smoketest(void *arg)
+{
+	struct smoketest *t = arg;
+	struct mutex * const BKL = &t->engine->i915->drm.struct_mutex;
+	const unsigned int max_batch = min(t->ncontexts, t->max_batch) - 1;
+	const unsigned int total = 4 * t->ncontexts + 1;
+	unsigned int num_waits = 0, num_fences = 0;
+	struct i915_request **requests;
+	I915_RND_STATE(prng);
+	unsigned int *order;
+	int err = 0;
+
+	/*
+	 * A very simple test to catch the most egregious of list handling bugs.
+	 *
+	 * At its heart, we simply create oodles of requests running across
+	 * multiple kthreads and enable signaling on them, for the sole purpose
+	 * of stressing our breadcrumb handling. The only inspection we do is
+	 * that the fences were marked as signaled.
+	 */
+
+	requests = kmalloc_array(total, sizeof(*requests), GFP_KERNEL);
+	if (!requests)
 		return -ENOMEM;
 
-	err = i915_subtests(tests, i915);
-	drm_dev_put(&i915->drm);
+	order = i915_random_order(total, &prng);
+	if (!order) {
+		err = -ENOMEM;
+		goto out_requests;
+	}
 
-	return err;
-}
+	while (!kthread_should_stop()) {
+		struct i915_sw_fence *submit, *wait;
+		unsigned int n, count;
 
-struct live_test {
-	struct drm_i915_private *i915;
-	const char *func;
-	const char *name;
+		submit = heap_fence_create(GFP_KERNEL);
+		if (!submit) {
+			err = -ENOMEM;
+			break;
+		}
 
-	unsigned int reset_count;
-};
+		wait = heap_fence_create(GFP_KERNEL);
+		if (!wait) {
+			i915_sw_fence_commit(submit);
+			heap_fence_put(submit);
+			err = -ENOMEM;
+			break;
+		}
 
-static int begin_live_test(struct live_test *t,
-			   struct drm_i915_private *i915,
-			   const char *func,
-			   const char *name)
-{
-	int err;
+		i915_random_reorder(order, total, &prng);
+		count = 1 + i915_prandom_u32_max_state(max_batch, &prng);
 
-	t->i915 = i915;
-	t->func = func;
-	t->name = name;
+		for (n = 0; n < count; n++) {
+			struct i915_gem_context *ctx =
+				t->contexts[order[n] % t->ncontexts];
+			struct i915_request *rq;
 
-	err = i915_gem_wait_for_idle(i915,
-				     I915_WAIT_LOCKED,
-				     MAX_SCHEDULE_TIMEOUT);
-	if (err) {
-		pr_err("%s(%s): failed to idle before, with err=%d!",
-		       func, name, err);
-		return err;
+			mutex_lock(BKL);
+
+			rq = t->request_alloc(ctx, t->engine);
+			if (IS_ERR(rq)) {
+				mutex_unlock(BKL);
+				err = PTR_ERR(rq);
+				count = n;
+				break;
+			}
+
+			err = i915_sw_fence_await_sw_fence_gfp(&rq->submit,
+							       submit,
+							       GFP_KERNEL);
+
+			requests[n] = i915_request_get(rq);
+			i915_request_add(rq);
+
+			mutex_unlock(BKL);
+
+			if (err >= 0)
+				err = i915_sw_fence_await_dma_fence(wait,
+								    &rq->fence,
+								    0,
+								    GFP_KERNEL);
+
+			if (err < 0) {
+				i915_request_put(rq);
+				count = n;
+				break;
+			}
+		}
+
+		i915_sw_fence_commit(submit);
+		i915_sw_fence_commit(wait);
+
+		if (!wait_event_timeout(wait->wait,
+					i915_sw_fence_done(wait),
+					HZ / 2)) {
+			struct i915_request *rq = requests[count - 1];
+
+			pr_err("waiting for %d fences (last %llx:%lld) on %s timed out!\n",
+			       count,
+			       rq->fence.context, rq->fence.seqno,
+			       t->engine->name);
+			i915_gem_set_wedged(t->engine->i915);
+			GEM_BUG_ON(!i915_request_completed(rq));
+			i915_sw_fence_wait(wait);
+			err = -EIO;
+		}
+
+		for (n = 0; n < count; n++) {
+			struct i915_request *rq = requests[n];
+
+			if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
+				      &rq->fence.flags)) {
+				pr_err("%llu:%llu was not signaled!\n",
+				       rq->fence.context, rq->fence.seqno);
+				err = -EINVAL;
+			}
+
+			i915_request_put(rq);
+		}
+
+		heap_fence_put(wait);
+		heap_fence_put(submit);
+
+		if (err < 0)
+			break;
+
+		num_fences += count;
+		num_waits++;
+
+		cond_resched();
 	}
 
-	i915->gpu_error.missed_irq_rings = 0;
-	t->reset_count = i915_reset_count(&i915->gpu_error);
+	atomic_long_add(num_fences, &t->num_fences);
+	atomic_long_add(num_waits, &t->num_waits);
 
-	return 0;
+	kfree(order);
+out_requests:
+	kfree(requests);
+	return err;
 }
 
-static int end_live_test(struct live_test *t)
+static int mock_breadcrumbs_smoketest(void *arg)
 {
-	struct drm_i915_private *i915 = t->i915;
+	struct drm_i915_private *i915 = arg;
+	struct smoketest t = {
+		.engine = i915->engine[RCS],
+		.ncontexts = 1024,
+		.max_batch = 1024,
+		.request_alloc = __mock_request_alloc
+	};
+	unsigned int ncpus = num_online_cpus();
+	struct task_struct **threads;
+	unsigned int n;
+	int ret = 0;
+
+	/*
+	 * Smoketest our breadcrumb/signal handling for requests across multiple
+	 * threads. A very simple test to catch only the most egregious of bugs.
+	 * See __igt_breadcrumbs_smoketest().
+	 */
 
-	i915_retire_requests(i915);
+	threads = kmalloc_array(ncpus, sizeof(*threads), GFP_KERNEL);
+	if (!threads)
+		return -ENOMEM;
 
-	if (wait_for(intel_engines_are_idle(i915), 10)) {
-		pr_err("%s(%s): GPU not idle\n", t->func, t->name);
-		return -EIO;
+	t.contexts =
+		kmalloc_array(t.ncontexts, sizeof(*t.contexts), GFP_KERNEL);
+	if (!t.contexts) {
+		ret = -ENOMEM;
+		goto out_threads;
 	}
 
-	if (t->reset_count != i915_reset_count(&i915->gpu_error)) {
-		pr_err("%s(%s): GPU was reset %d times!\n",
-		       t->func, t->name,
-		       i915_reset_count(&i915->gpu_error) - t->reset_count);
-		return -EIO;
+	mutex_lock(&t.engine->i915->drm.struct_mutex);
+	for (n = 0; n < t.ncontexts; n++) {
+		t.contexts[n] = mock_context(t.engine->i915, "mock");
+		if (!t.contexts[n]) {
+			ret = -ENOMEM;
+			goto out_contexts;
+		}
 	}
+	mutex_unlock(&t.engine->i915->drm.struct_mutex);
+
+	for (n = 0; n < ncpus; n++) {
+		threads[n] = kthread_run(__igt_breadcrumbs_smoketest,
+					 &t, "igt/%d", n);
+		if (IS_ERR(threads[n])) {
+			ret = PTR_ERR(threads[n]);
+			ncpus = n;
+			break;
+		}
 
-	if (i915->gpu_error.missed_irq_rings) {
-		pr_err("%s(%s): Missed interrupts on engines %lx\n",
-		       t->func, t->name, i915->gpu_error.missed_irq_rings);
-		return -EIO;
+		get_task_struct(threads[n]);
 	}
 
-	return 0;
+	msleep(jiffies_to_msecs(i915_selftest.timeout_jiffies));
+
+	for (n = 0; n < ncpus; n++) {
+		int err;
+
+		err = kthread_stop(threads[n]);
+		if (err < 0 && !ret)
+			ret = err;
+
+		put_task_struct(threads[n]);
+	}
+	pr_info("Completed %lu waits for %lu fences across %d cpus\n",
+		atomic_long_read(&t.num_waits),
+		atomic_long_read(&t.num_fences),
+		ncpus);
+
+	mutex_lock(&t.engine->i915->drm.struct_mutex);
+out_contexts:
+	for (n = 0; n < t.ncontexts; n++) {
+		if (!t.contexts[n])
+			break;
+		mock_context_close(t.contexts[n]);
+	}
+	mutex_unlock(&t.engine->i915->drm.struct_mutex);
+	kfree(t.contexts);
+out_threads:
+	kfree(threads);
+
+	return ret;
+}
+
+int i915_request_mock_selftests(void)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(igt_add_request),
+		SUBTEST(igt_wait_request),
+		SUBTEST(igt_fence_wait),
+		SUBTEST(igt_request_rewind),
+		SUBTEST(mock_breadcrumbs_smoketest),
+	};
+	struct drm_i915_private *i915;
+	intel_wakeref_t wakeref;
+	int err = 0;
+
+	i915 = mock_gem_device();
+	if (!i915)
+		return -ENOMEM;
+
+	with_intel_runtime_pm(i915, wakeref)
+		err = i915_subtests(tests, i915);
+
+	drm_dev_put(&i915->drm);
+
+	return err;
 }
 
 static int live_nop_request(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
 	struct intel_engine_cs *engine;
-	struct live_test t;
+	intel_wakeref_t wakeref;
+	struct igt_live_test t;
 	unsigned int id;
 	int err = -ENODEV;
 
@@ -342,7 +538,7 @@ static int live_nop_request(void *arg)
 	 */
 
 	mutex_lock(&i915->drm.struct_mutex);
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
 	for_each_engine(engine, i915, id) {
 		struct i915_request *request = NULL;
@@ -350,7 +546,7 @@ static int live_nop_request(void *arg)
 		IGT_TIMEOUT(end_time);
 		ktime_t times[2] = {};
 
-		err = begin_live_test(&t, i915, __func__, engine->name);
+		err = igt_live_test_begin(&t, i915, __func__, engine->name);
 		if (err)
 			goto out_unlock;
 
@@ -392,7 +588,7 @@ static int live_nop_request(void *arg)
 				break;
 		}
 
-		err = end_live_test(&t);
+		err = igt_live_test_end(&t);
 		if (err)
 			goto out_unlock;
 
@@ -403,7 +599,7 @@ static int live_nop_request(void *arg)
 	}
 
 out_unlock:
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
 	return err;
 }
@@ -478,7 +674,8 @@ static int live_empty_request(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
 	struct intel_engine_cs *engine;
-	struct live_test t;
+	intel_wakeref_t wakeref;
+	struct igt_live_test t;
 	struct i915_vma *batch;
 	unsigned int id;
 	int err = 0;
@@ -489,7 +686,7 @@ static int live_empty_request(void *arg)
 	 */
 
 	mutex_lock(&i915->drm.struct_mutex);
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
 	batch = empty_batch(i915);
 	if (IS_ERR(batch)) {
@@ -503,7 +700,7 @@ static int live_empty_request(void *arg)
 		unsigned long n, prime;
 		ktime_t times[2] = {};
 
-		err = begin_live_test(&t, i915, __func__, engine->name);
+		err = igt_live_test_begin(&t, i915, __func__, engine->name);
 		if (err)
 			goto out_batch;
 
@@ -539,7 +736,7 @@ static int live_empty_request(void *arg)
 				break;
 		}
 
-		err = end_live_test(&t);
+		err = igt_live_test_end(&t);
 		if (err)
 			goto out_batch;
 
@@ -553,7 +750,7 @@ out_batch:
 	i915_vma_unpin(batch);
 	i915_vma_put(batch);
 out_unlock:
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
 	return err;
 }
@@ -637,8 +834,9 @@ static int live_all_engines(void *arg)
 	struct drm_i915_private *i915 = arg;
 	struct intel_engine_cs *engine;
 	struct i915_request *request[I915_NUM_ENGINES];
+	intel_wakeref_t wakeref;
+	struct igt_live_test t;
 	struct i915_vma *batch;
-	struct live_test t;
 	unsigned int id;
 	int err;
 
@@ -648,9 +846,9 @@ static int live_all_engines(void *arg)
 	 */
 
 	mutex_lock(&i915->drm.struct_mutex);
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
-	err = begin_live_test(&t, i915, __func__, "");
+	err = igt_live_test_begin(&t, i915, __func__, "");
 	if (err)
 		goto out_unlock;
 
@@ -722,7 +920,7 @@ static int live_all_engines(void *arg)
 		request[id] = NULL;
 	}
 
-	err = end_live_test(&t);
+	err = igt_live_test_end(&t);
 
 out_request:
 	for_each_engine(engine, i915, id)
@@ -731,7 +929,7 @@ out_request:
 	i915_vma_unpin(batch);
 	i915_vma_put(batch);
 out_unlock:
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
 	return err;
 }
@@ -742,7 +940,8 @@ static int live_sequential_engines(void *arg)
 	struct i915_request *request[I915_NUM_ENGINES] = {};
 	struct i915_request *prev = NULL;
 	struct intel_engine_cs *engine;
-	struct live_test t;
+	intel_wakeref_t wakeref;
+	struct igt_live_test t;
 	unsigned int id;
 	int err;
 
@@ -753,9 +952,9 @@ static int live_sequential_engines(void *arg)
 	 */
 
 	mutex_lock(&i915->drm.struct_mutex);
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
-	err = begin_live_test(&t, i915, __func__, "");
+	err = igt_live_test_begin(&t, i915, __func__, "");
 	if (err)
 		goto out_unlock;
 
@@ -838,7 +1037,7 @@ static int live_sequential_engines(void *arg)
 		GEM_BUG_ON(!i915_request_completed(request[id]));
 	}
 
-	err = end_live_test(&t);
+	err = igt_live_test_end(&t);
 
 out_request:
 	for_each_engine(engine, i915, id) {
@@ -860,11 +1059,183 @@ out_request:
 		i915_request_put(request[id]);
 	}
 out_unlock:
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
 	return err;
 }
 
+static int
+max_batches(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
+{
+	struct i915_request *rq;
+	int ret;
+
+	/*
+	 * Before execlists, all contexts share the same ringbuffer. With
+	 * execlists, each context/engine has a separate ringbuffer and
+	 * for the purposes of this test, inexhaustible.
+	 *
+	 * For the global ringbuffer though, we have to be very careful
+	 * that we do not wrap while preventing the execution of requests
+	 * with an unsignaled fence.
+	 */
+	if (HAS_EXECLISTS(ctx->i915))
+		return INT_MAX;
+
+	rq = i915_request_alloc(engine, ctx);
+	if (IS_ERR(rq)) {
+		ret = PTR_ERR(rq);
+	} else {
+		int sz;
+
+		ret = rq->ring->size - rq->reserved_space;
+		i915_request_add(rq);
+
+		sz = rq->ring->emit - rq->head;
+		if (sz < 0)
+			sz += rq->ring->size;
+		ret /= sz;
+		ret /= 2; /* leave half spare, in case of emergency! */
+	}
+
+	return ret;
+}
+
+static int live_breadcrumbs_smoketest(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct smoketest t[I915_NUM_ENGINES];
+	unsigned int ncpus = num_online_cpus();
+	unsigned long num_waits, num_fences;
+	struct intel_engine_cs *engine;
+	struct task_struct **threads;
+	struct igt_live_test live;
+	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
+	struct drm_file *file;
+	unsigned int n;
+	int ret = 0;
+
+	/*
+	 * Smoketest our breadcrumb/signal handling for requests across multiple
+	 * threads. A very simple test to catch only the most egregious of bugs.
+	 * See __igt_breadcrumbs_smoketest().
+	 *
+	 * On real hardware this time.
+	 */
+
+	wakeref = intel_runtime_pm_get(i915);
+
+	file = mock_file(i915);
+	if (IS_ERR(file)) {
+		ret = PTR_ERR(file);
+		goto out_rpm;
+	}
+
+	threads = kcalloc(ncpus * I915_NUM_ENGINES,
+			  sizeof(*threads),
+			  GFP_KERNEL);
+	if (!threads) {
+		ret = -ENOMEM;
+		goto out_file;
+	}
+
+	memset(&t[0], 0, sizeof(t[0]));
+	t[0].request_alloc = __live_request_alloc;
+	t[0].ncontexts = 64;
+	t[0].contexts = kmalloc_array(t[0].ncontexts,
+				      sizeof(*t[0].contexts),
+				      GFP_KERNEL);
+	if (!t[0].contexts) {
+		ret = -ENOMEM;
+		goto out_threads;
+	}
+
+	mutex_lock(&i915->drm.struct_mutex);
+	for (n = 0; n < t[0].ncontexts; n++) {
+		t[0].contexts[n] = live_context(i915, file);
+		if (IS_ERR(t[0].contexts[n])) {
+			ret = PTR_ERR(t[0].contexts[n]);
+			goto out_contexts;
+		}
+	}
+
+	ret = igt_live_test_begin(&live, i915, __func__, "");
+	if (ret)
+		goto out_contexts;
+
+	for_each_engine(engine, i915, id) {
+		t[id] = t[0];
+		t[id].engine = engine;
+		t[id].max_batch = max_batches(t[0].contexts[0], engine);
+		if (t[id].max_batch < 0) {
+			ret = t[id].max_batch;
+			mutex_unlock(&i915->drm.struct_mutex);
+			goto out_flush;
+		}
+		/* One ring interleaved between requests from all cpus */
+		t[id].max_batch /= num_online_cpus() + 1;
+		pr_debug("Limiting batches to %d requests on %s\n",
+			 t[id].max_batch, engine->name);
+
+		for (n = 0; n < ncpus; n++) {
+			struct task_struct *tsk;
+
+			tsk = kthread_run(__igt_breadcrumbs_smoketest,
+					  &t[id], "igt/%d.%d", id, n);
+			if (IS_ERR(tsk)) {
+				ret = PTR_ERR(tsk);
+				mutex_unlock(&i915->drm.struct_mutex);
+				goto out_flush;
+			}
+
+			get_task_struct(tsk);
+			threads[id * ncpus + n] = tsk;
+		}
+	}
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	msleep(jiffies_to_msecs(i915_selftest.timeout_jiffies));
+
+out_flush:
+	num_waits = 0;
+	num_fences = 0;
+	for_each_engine(engine, i915, id) {
+		for (n = 0; n < ncpus; n++) {
+			struct task_struct *tsk = threads[id * ncpus + n];
+			int err;
+
+			if (!tsk)
+				continue;
+
+			err = kthread_stop(tsk);
+			if (err < 0 && !ret)
+				ret = err;
+
+			put_task_struct(tsk);
+		}
+
+		num_waits += atomic_long_read(&t[id].num_waits);
+		num_fences += atomic_long_read(&t[id].num_fences);
+	}
+	pr_info("Completed %lu waits for %lu fences across %d engines and %d cpus\n",
+		num_waits, num_fences, RUNTIME_INFO(i915)->num_rings, ncpus);
+
+	mutex_lock(&i915->drm.struct_mutex);
+	ret = igt_live_test_end(&live) ?: ret;
+out_contexts:
+	mutex_unlock(&i915->drm.struct_mutex);
+	kfree(t[0].contexts);
+out_threads:
+	kfree(threads);
+out_file:
+	mock_file_free(i915, file);
+out_rpm:
+	intel_runtime_pm_put(i915, wakeref);
+
+	return ret;
+}
+
 int i915_request_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
@@ -872,6 +1243,7 @@ int i915_request_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_all_engines),
 		SUBTEST(live_sequential_engines),
 		SUBTEST(live_empty_request),
+		SUBTEST(live_breadcrumbs_smoketest),
 	};
 
 	if (i915_terminally_wedged(&i915->gpu_error))
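The smoketest above leans on two i915_sw_fences: every request in a batch is gated on a shared "submit" fence, and its completion is folded into a shared "wait" fence, so committing submit releases the whole batch at once and wait signals only when every request has completed. The skeleton of the pattern, with the per-request loop elided (a sketch of the calls used above, not additional patch content):

	submit = heap_fence_create(GFP_KERNEL); /* holds the batch back */
	wait = heap_fence_create(GFP_KERNEL);   /* signals when all are done */

	/* for each request rq in the batch: */
	err = i915_sw_fence_await_sw_fence_gfp(&rq->submit, submit, GFP_KERNEL);
	i915_request_add(rq);
	err = i915_sw_fence_await_dma_fence(wait, &rq->fence, 0, GFP_KERNEL);

	i915_sw_fence_commit(submit); /* release every queued request */
	i915_sw_fence_commit(wait);   /* arm; fires once all fences signal */
	wait_event_timeout(wait->wait, i915_sw_fence_done(wait), HZ / 2);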
diff --git a/drivers/gpu/drm/i915/selftests/i915_selftest.c b/drivers/gpu/drm/i915/selftests/i915_selftest.c
index 86c54ea37f48..10ef0e636a24 100644
--- a/drivers/gpu/drm/i915/selftests/i915_selftest.c
+++ b/drivers/gpu/drm/i915/selftests/i915_selftest.c
@@ -197,6 +197,49 @@ int i915_live_selftests(struct pci_dev *pdev)
 	return 0;
 }
 
+static bool apply_subtest_filter(const char *caller, const char *name)
+{
+	char *filter, *sep, *tok;
+	bool result = true;
+
+	filter = kstrdup(i915_selftest.filter, GFP_KERNEL);
+	for (sep = filter; (tok = strsep(&sep, ","));) {
+		bool allow = true;
+		char *sl;
+
+		if (*tok == '!') {
+			allow = false;
+			tok++;
+		}
+
+		if (*tok == '\0')
+			continue;
+
+		sl = strchr(tok, '/');
+		if (sl) {
+			*sl++ = '\0';
+			if (strcmp(tok, caller)) {
+				if (allow)
+					result = false;
+				continue;
+			}
+			tok = sl;
+		}
+
+		if (strcmp(tok, name)) {
+			if (allow)
+				result = false;
+			continue;
+		}
+
+		result = allow;
+		break;
+	}
+	kfree(filter);
+
+	return result;
+}
+
 int __i915_subtests(const char *caller,
 		    const struct i915_subtest *st,
 		    unsigned int count,
@@ -209,6 +252,9 @@ int __i915_subtests(const char *caller,
 		if (signal_pending(current))
 			return -EINTR;
 
+		if (!apply_subtest_filter(caller, st->name))
+			continue;
+
 		pr_debug(DRIVER_NAME ": Running %s/%s\n", caller, st->name);
 		GEM_TRACE("Running %s/%s\n", caller, st->name);
 
@@ -244,6 +290,7 @@ bool __igt_timeout(unsigned long timeout, const char *fmt, ...)
 
 module_param_named(st_random_seed, i915_selftest.random_seed, uint, 0400);
 module_param_named(st_timeout, i915_selftest.timeout_ms, uint, 0400);
+module_param_named(st_filter, i915_selftest.filter, charp, 0400);
 
 module_param_named_unsafe(mock_selftests, i915_selftest.mock, int, 0400);
 MODULE_PARM_DESC(mock_selftests, "Run selftests before loading, using mock hardware (0:disabled [default], 1:run tests then load driver, -1:run tests then exit module)");
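Reading apply_subtest_filter() above: st_filter takes a comma-separated list of tokens, each optionally prefixed with '!' to exclude and optionally qualified as caller/subtest; listing any non-negated token turns the run into an allow-list. For example (illustrative invocations using subtest names added in this series), i915.st_filter=i915_timeline_mock_selftests/mock_hwsp_freelist runs only the new HWSP freelist mock subtest, while i915.st_filter=!live_breadcrumbs_smoketest runs everything except the live smoketest.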
diff --git a/drivers/gpu/drm/i915/selftests/i915_timeline.c b/drivers/gpu/drm/i915/selftests/i915_timeline.c
index 19f1c6a5c8fb..12ea69b1a1e5 100644
--- a/drivers/gpu/drm/i915/selftests/i915_timeline.c
+++ b/drivers/gpu/drm/i915/selftests/i915_timeline.c
@@ -4,12 +4,155 @@
  * Copyright © 2017-2018 Intel Corporation
  */
 
+#include <linux/prime_numbers.h>
+
 #include "../i915_selftest.h"
 #include "i915_random.h"
 
+#include "igt_flush_test.h"
 #include "mock_gem_device.h"
 #include "mock_timeline.h"
 
+static struct page *hwsp_page(struct i915_timeline *tl)
+{
+	struct drm_i915_gem_object *obj = tl->hwsp_ggtt->obj;
+
+	GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
+	return sg_page(obj->mm.pages->sgl);
+}
+
+static unsigned long hwsp_cacheline(struct i915_timeline *tl)
+{
+	unsigned long address = (unsigned long)page_address(hwsp_page(tl));
+
+	return (address + tl->hwsp_offset) / CACHELINE_BYTES;
+}
+
+#define CACHELINES_PER_PAGE (PAGE_SIZE / CACHELINE_BYTES)
+
+struct mock_hwsp_freelist {
+	struct drm_i915_private *i915;
+	struct radix_tree_root cachelines;
+	struct i915_timeline **history;
+	unsigned long count, max;
+	struct rnd_state prng;
+};
+
+enum {
+	SHUFFLE = BIT(0),
+};
+
+static void __mock_hwsp_record(struct mock_hwsp_freelist *state,
+			       unsigned int idx,
+			       struct i915_timeline *tl)
+{
+	tl = xchg(&state->history[idx], tl);
+	if (tl) {
+		radix_tree_delete(&state->cachelines, hwsp_cacheline(tl));
+		i915_timeline_put(tl);
+	}
+}
+
+static int __mock_hwsp_timeline(struct mock_hwsp_freelist *state,
+				unsigned int count,
+				unsigned int flags)
+{
+	struct i915_timeline *tl;
+	unsigned int idx;
+
+	while (count--) {
+		unsigned long cacheline;
+		int err;
+
+		tl = i915_timeline_create(state->i915, "mock", NULL);
+		if (IS_ERR(tl))
+			return PTR_ERR(tl);
+
+		cacheline = hwsp_cacheline(tl);
+		err = radix_tree_insert(&state->cachelines, cacheline, tl);
+		if (err) {
+			if (err == -EEXIST) {
+				pr_err("HWSP cacheline %lu already used; duplicate allocation!\n",
+				       cacheline);
+			}
+			i915_timeline_put(tl);
+			return err;
+		}
+
+		idx = state->count++ % state->max;
+		__mock_hwsp_record(state, idx, tl);
+	}
+
+	if (flags & SHUFFLE)
+		i915_prandom_shuffle(state->history,
+				     sizeof(*state->history),
+				     min(state->count, state->max),
+				     &state->prng);
+
+	count = i915_prandom_u32_max_state(min(state->count, state->max),
+					   &state->prng);
+	while (count--) {
+		idx = --state->count % state->max;
+		__mock_hwsp_record(state, idx, NULL);
+	}
+
+	return 0;
+}
+
+static int mock_hwsp_freelist(void *arg)
+{
+	struct mock_hwsp_freelist state;
+	const struct {
+		const char *name;
+		unsigned int flags;
+	} phases[] = {
+		{ "linear", 0 },
+		{ "shuffled", SHUFFLE },
+		{ },
+	}, *p;
+	unsigned int na;
+	int err = 0;
+
+	INIT_RADIX_TREE(&state.cachelines, GFP_KERNEL);
+	state.prng = I915_RND_STATE_INITIALIZER(i915_selftest.random_seed);
+
+	state.i915 = mock_gem_device();
+	if (!state.i915)
+		return -ENOMEM;
+
+	/*
+	 * Create a bunch of timelines and check that their HWSPs do not overlap.
+	 * Free some, and try again.
+	 */
+
+	state.max = PAGE_SIZE / sizeof(*state.history);
+	state.count = 0;
+	state.history = kcalloc(state.max, sizeof(*state.history), GFP_KERNEL);
+	if (!state.history) {
+		err = -ENOMEM;
+		goto err_put;
+	}
+
+	mutex_lock(&state.i915->drm.struct_mutex);
+	for (p = phases; p->name; p++) {
+		pr_debug("%s(%s)\n", __func__, p->name);
+		for_each_prime_number_from(na, 1, 2 * CACHELINES_PER_PAGE) {
+			err = __mock_hwsp_timeline(&state, na, p->flags);
+			if (err)
+				goto out;
+		}
+	}
+
+out:
+	for (na = 0; na < state.max; na++)
+		__mock_hwsp_record(&state, na, NULL);
+	mutex_unlock(&state.i915->drm.struct_mutex);
+	kfree(state.history);
+err_put:
+	drm_dev_put(&state.i915->drm);
+	return err;
+}
+
 struct __igt_sync {
 	const char *name;
 	u32 seqno;
@@ -256,12 +399,331 @@ static int bench_sync(void *arg)
 	return 0;
 }
 
-int i915_gem_timeline_mock_selftests(void)
+int i915_timeline_mock_selftests(void)
 {
 	static const struct i915_subtest tests[] = {
+		SUBTEST(mock_hwsp_freelist),
 		SUBTEST(igt_sync),
 		SUBTEST(bench_sync),
 	};
 
 	return i915_subtests(tests, NULL);
 }
+
+static int emit_ggtt_store_dw(struct i915_request *rq, u32 addr, u32 value)
+{
+	u32 *cs;
+
+	cs = intel_ring_begin(rq, 4);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	if (INTEL_GEN(rq->i915) >= 8) {
+		*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
+		*cs++ = addr;
+		*cs++ = 0;
+		*cs++ = value;
+	} else if (INTEL_GEN(rq->i915) >= 4) {
+		*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
+		*cs++ = 0;
+		*cs++ = addr;
+		*cs++ = value;
+	} else {
+		*cs++ = MI_STORE_DWORD_IMM | MI_MEM_VIRTUAL;
+		*cs++ = addr;
+		*cs++ = value;
+		*cs++ = MI_NOOP;
+	}
+
+	intel_ring_advance(rq, cs);
+
+	return 0;
+}
+
+static struct i915_request *
+tl_write(struct i915_timeline *tl, struct intel_engine_cs *engine, u32 value)
+{
+	struct i915_request *rq;
+	int err;
+
+	lockdep_assert_held(&tl->i915->drm.struct_mutex); /* lazy rq refs */
+
+	err = i915_timeline_pin(tl);
+	if (err) {
+		rq = ERR_PTR(err);
+		goto out;
+	}
+
+	rq = i915_request_alloc(engine, engine->i915->kernel_context);
+	if (IS_ERR(rq))
+		goto out_unpin;
+
+	err = emit_ggtt_store_dw(rq, tl->hwsp_offset, value);
+	i915_request_add(rq);
+	if (err)
+		rq = ERR_PTR(err);
+
+out_unpin:
+	i915_timeline_unpin(tl);
+out:
+	if (IS_ERR(rq))
+		pr_err("Failed to write to timeline!\n");
+	return rq;
+}
+
+static struct i915_timeline *
+checked_i915_timeline_create(struct drm_i915_private *i915)
+{
+	struct i915_timeline *tl;
+
+	tl = i915_timeline_create(i915, "live", NULL);
+	if (IS_ERR(tl))
+		return tl;
+
+	if (*tl->hwsp_seqno != tl->seqno) {
+		pr_err("Timeline created with incorrect breadcrumb, found %x, expected %x\n",
+		       *tl->hwsp_seqno, tl->seqno);
+		i915_timeline_put(tl);
+		return ERR_PTR(-EINVAL);
+	}
+
+	return tl;
+}
+
+static int live_hwsp_engine(void *arg)
+{
+#define NUM_TIMELINES 4096
+	struct drm_i915_private *i915 = arg;
+	struct i915_timeline **timelines;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
+	unsigned long count, n;
+	int err = 0;
+
+	/*
+	 * Create a bunch of timelines and check we can write
+	 * independently to each of their breadcrumb slots.
+	 */
+
+	timelines = kvmalloc_array(NUM_TIMELINES * I915_NUM_ENGINES,
+				   sizeof(*timelines),
+				   GFP_KERNEL);
+	if (!timelines)
+		return -ENOMEM;
+
+	mutex_lock(&i915->drm.struct_mutex);
+	wakeref = intel_runtime_pm_get(i915);
+
+	count = 0;
+	for_each_engine(engine, i915, id) {
+		if (!intel_engine_can_store_dword(engine))
+			continue;
+
+		for (n = 0; n < NUM_TIMELINES; n++) {
+			struct i915_timeline *tl;
+			struct i915_request *rq;
+
+			tl = checked_i915_timeline_create(i915);
+			if (IS_ERR(tl)) {
+				err = PTR_ERR(tl);
+				goto out;
+			}
+
+			rq = tl_write(tl, engine, count);
+			if (IS_ERR(rq)) {
+				i915_timeline_put(tl);
+				err = PTR_ERR(rq);
+				goto out;
+			}
+
+			timelines[count++] = tl;
+		}
+	}
+
+out:
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+
+	for (n = 0; n < count; n++) {
+		struct i915_timeline *tl = timelines[n];
+
+		if (!err && *tl->hwsp_seqno != n) {
+			pr_err("Invalid seqno stored in timeline %lu, found 0x%x\n",
+			       n, *tl->hwsp_seqno);
+			err = -EINVAL;
+		}
+		i915_timeline_put(tl);
+	}
+
+	intel_runtime_pm_put(i915, wakeref);
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	kvfree(timelines);
+
+	return err;
+#undef NUM_TIMELINES
+}
+
+static int live_hwsp_alternate(void *arg)
+{
+#define NUM_TIMELINES 4096
+	struct drm_i915_private *i915 = arg;
+	struct i915_timeline **timelines;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
+	unsigned long count, n;
+	int err = 0;
+
+	/*
+	 * Create a bunch of timelines and check we can write
+	 * independently to each of their breadcrumb slots with adjacent
+	 * engines.
+	 */
+
+	timelines = kvmalloc_array(NUM_TIMELINES * I915_NUM_ENGINES,
+				   sizeof(*timelines),
+				   GFP_KERNEL);
+	if (!timelines)
+		return -ENOMEM;
+
+	mutex_lock(&i915->drm.struct_mutex);
+	wakeref = intel_runtime_pm_get(i915);
+
+	count = 0;
+	for (n = 0; n < NUM_TIMELINES; n++) {
+		for_each_engine(engine, i915, id) {
+			struct i915_timeline *tl;
+			struct i915_request *rq;
+
+			if (!intel_engine_can_store_dword(engine))
+				continue;
+
+			tl = checked_i915_timeline_create(i915);
+			if (IS_ERR(tl)) {
+				err = PTR_ERR(tl);
+				goto out;
+			}
+
+			rq = tl_write(tl, engine, count);
+			if (IS_ERR(rq)) {
+				i915_timeline_put(tl);
+				err = PTR_ERR(rq);
+				goto out;
+			}
+
+			timelines[count++] = tl;
+		}
+	}
+
+out:
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+
+	for (n = 0; n < count; n++) {
+		struct i915_timeline *tl = timelines[n];
+
+		if (!err && *tl->hwsp_seqno != n) {
+			pr_err("Invalid seqno stored in timeline %lu, found 0x%x\n",
+			       n, *tl->hwsp_seqno);
+			err = -EINVAL;
+		}
+		i915_timeline_put(tl);
+	}
+
+	intel_runtime_pm_put(i915, wakeref);
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	kvfree(timelines);
+
+	return err;
+#undef NUM_TIMELINES
+}
+
+static int live_hwsp_recycle(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
+	unsigned long count;
+	int err = 0;
+
+	/*
+	 * Check seqno writes into one timeline at a time. We expect to
+	 * recycle the breadcrumb slot between iterations and want to
+	 * confuse neither ourselves nor the GPU.
+	 */
+
+	mutex_lock(&i915->drm.struct_mutex);
+	wakeref = intel_runtime_pm_get(i915);
+
+	count = 0;
+	for_each_engine(engine, i915, id) {
+		IGT_TIMEOUT(end_time);
+
+		if (!intel_engine_can_store_dword(engine))
+			continue;
+
+		do {
+			struct i915_timeline *tl;
+			struct i915_request *rq;
+
+			tl = checked_i915_timeline_create(i915);
+			if (IS_ERR(tl)) {
+				err = PTR_ERR(tl);
+				goto out;
+			}
+
+			rq = tl_write(tl, engine, count);
+			if (IS_ERR(rq)) {
+				i915_timeline_put(tl);
+				err = PTR_ERR(rq);
+				goto out;
+			}
+
+			if (i915_request_wait(rq,
+					      I915_WAIT_LOCKED,
+					      HZ / 5) < 0) {
+				pr_err("Wait for timeline writes timed out!\n");
+				i915_timeline_put(tl);
+				err = -EIO;
+				goto out;
+			}
+
+			if (*tl->hwsp_seqno != count) {
+				pr_err("Invalid seqno stored in timeline %lu, found 0x%x\n",
+				       count, *tl->hwsp_seqno);
+				err = -EINVAL;
+			}
+
+			i915_timeline_put(tl);
+			count++;
+
+			if (err)
+				goto out;
+
+			i915_timelines_park(i915); /* Encourage recycling! */
+		} while (!__igt_timeout(end_time, NULL));
+	}
+
+out:
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+	intel_runtime_pm_put(i915, wakeref);
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	return err;
+}
+
+int i915_timeline_live_selftests(struct drm_i915_private *i915)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(live_hwsp_recycle),
+		SUBTEST(live_hwsp_engine),
+		SUBTEST(live_hwsp_alternate),
+	};
+
+	return i915_subtests(tests, i915);
+}
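For scale in the tests above: with the usual PAGE_SIZE of 4096 and CACHELINE_BYTES of 64, CACHELINES_PER_PAGE is 64, so a single HWSP page backs at most 64 timelines, and hwsp_cacheline() reduces (page address + hwsp_offset) to one global cacheline index that the mock test uses as its radix-tree key. A worked sketch of that arithmetic (the address below is an assumed example value):

	/* Assuming PAGE_SIZE == 4096 and CACHELINE_BYTES == 64: */
	unsigned long address = 0xffff888012340000UL; /* page_address(hwsp_page(tl)) */
	unsigned long offset  = 3 * 64;               /* tl->hwsp_offset: 4th slot */
	unsigned long key     = (address + offset) / 64;

	/* Two timelines collide only if they share both page and offset,
	 * which is exactly the duplicate that radix_tree_insert() catches.
	 */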
diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c b/drivers/gpu/drm/i915/selftests/i915_vma.c
index ffa74290e054..cf1de82741fa 100644
--- a/drivers/gpu/drm/i915/selftests/i915_vma.c
+++ b/drivers/gpu/drm/i915/selftests/i915_vma.c
@@ -28,6 +28,7 @@
 
 #include "mock_gem_device.h"
 #include "mock_context.h"
+#include "mock_gtt.h"
 
 static bool assert_vma(struct i915_vma *vma,
 		       struct drm_i915_gem_object *obj,
@@ -141,7 +142,8 @@ static int create_vmas(struct drm_i915_private *i915,
 
 static int igt_vma_create(void *arg)
 {
-	struct drm_i915_private *i915 = arg;
+	struct i915_ggtt *ggtt = arg;
+	struct drm_i915_private *i915 = ggtt->vm.i915;
 	struct drm_i915_gem_object *obj, *on;
 	struct i915_gem_context *ctx, *cn;
 	unsigned long num_obj, num_ctx;
@@ -245,7 +247,7 @@ static bool assert_pin_einval(const struct i915_vma *vma,
 
 static int igt_vma_pin1(void *arg)
 {
-	struct drm_i915_private *i915 = arg;
+	struct i915_ggtt *ggtt = arg;
 	const struct pin_mode modes[] = {
 #define VALID(sz, fl) { .size = (sz), .flags = (fl), .assert = assert_pin_valid, .string = #sz ", " #fl ", (valid) " }
 #define __INVALID(sz, fl, check, eval) { .size = (sz), .flags = (fl), .assert = (check), .string = #sz ", " #fl ", (invalid " #eval ")" }
@@ -256,30 +258,30 @@ static int igt_vma_pin1(void *arg)
 
 		VALID(0, PIN_GLOBAL | PIN_OFFSET_BIAS | 4096),
 		VALID(0, PIN_GLOBAL | PIN_OFFSET_BIAS | 8192),
-		VALID(0, PIN_GLOBAL | PIN_OFFSET_BIAS | (i915->ggtt.mappable_end - 4096)),
-		VALID(0, PIN_GLOBAL | PIN_MAPPABLE | PIN_OFFSET_BIAS | (i915->ggtt.mappable_end - 4096)),
-		VALID(0, PIN_GLOBAL | PIN_OFFSET_BIAS | (i915->ggtt.vm.total - 4096)),
-
-		VALID(0, PIN_GLOBAL | PIN_MAPPABLE | PIN_OFFSET_FIXED | (i915->ggtt.mappable_end - 4096)),
-		INVALID(0, PIN_GLOBAL | PIN_MAPPABLE | PIN_OFFSET_FIXED | i915->ggtt.mappable_end),
-		VALID(0, PIN_GLOBAL | PIN_OFFSET_FIXED | (i915->ggtt.vm.total - 4096)),
-		INVALID(0, PIN_GLOBAL | PIN_OFFSET_FIXED | i915->ggtt.vm.total),
+		VALID(0, PIN_GLOBAL | PIN_OFFSET_BIAS | (ggtt->mappable_end - 4096)),
+		VALID(0, PIN_GLOBAL | PIN_MAPPABLE | PIN_OFFSET_BIAS | (ggtt->mappable_end - 4096)),
+		VALID(0, PIN_GLOBAL | PIN_OFFSET_BIAS | (ggtt->vm.total - 4096)),
+
+		VALID(0, PIN_GLOBAL | PIN_MAPPABLE | PIN_OFFSET_FIXED | (ggtt->mappable_end - 4096)),
+		INVALID(0, PIN_GLOBAL | PIN_MAPPABLE | PIN_OFFSET_FIXED | ggtt->mappable_end),
+		VALID(0, PIN_GLOBAL | PIN_OFFSET_FIXED | (ggtt->vm.total - 4096)),
+		INVALID(0, PIN_GLOBAL | PIN_OFFSET_FIXED | ggtt->vm.total),
 		INVALID(0, PIN_GLOBAL | PIN_OFFSET_FIXED | round_down(U64_MAX, PAGE_SIZE)),
 
 		VALID(4096, PIN_GLOBAL),
 		VALID(8192, PIN_GLOBAL),
-		VALID(i915->ggtt.mappable_end - 4096, PIN_GLOBAL | PIN_MAPPABLE),
-		VALID(i915->ggtt.mappable_end, PIN_GLOBAL | PIN_MAPPABLE),
-		NOSPACE(i915->ggtt.mappable_end + 4096, PIN_GLOBAL | PIN_MAPPABLE),
-		VALID(i915->ggtt.vm.total - 4096, PIN_GLOBAL),
-		VALID(i915->ggtt.vm.total, PIN_GLOBAL),
-		NOSPACE(i915->ggtt.vm.total + 4096, PIN_GLOBAL),
+		VALID(ggtt->mappable_end - 4096, PIN_GLOBAL | PIN_MAPPABLE),
+		VALID(ggtt->mappable_end, PIN_GLOBAL | PIN_MAPPABLE),
+		NOSPACE(ggtt->mappable_end + 4096, PIN_GLOBAL | PIN_MAPPABLE),
+		VALID(ggtt->vm.total - 4096, PIN_GLOBAL),
+		VALID(ggtt->vm.total, PIN_GLOBAL),
+		NOSPACE(ggtt->vm.total + 4096, PIN_GLOBAL),
 		NOSPACE(round_down(U64_MAX, PAGE_SIZE), PIN_GLOBAL),
-		INVALID(8192, PIN_GLOBAL | PIN_MAPPABLE | PIN_OFFSET_FIXED | (i915->ggtt.mappable_end - 4096)),
-		INVALID(8192, PIN_GLOBAL | PIN_OFFSET_FIXED | (i915->ggtt.vm.total - 4096)),
+		INVALID(8192, PIN_GLOBAL | PIN_MAPPABLE | PIN_OFFSET_FIXED | (ggtt->mappable_end - 4096)),
+		INVALID(8192, PIN_GLOBAL | PIN_OFFSET_FIXED | (ggtt->vm.total - 4096)),
 		INVALID(8192, PIN_GLOBAL | PIN_OFFSET_FIXED | (round_down(U64_MAX, PAGE_SIZE) - 4096)),
 
-		VALID(8192, PIN_GLOBAL | PIN_OFFSET_BIAS | (i915->ggtt.mappable_end - 4096)),
+		VALID(8192, PIN_GLOBAL | PIN_OFFSET_BIAS | (ggtt->mappable_end - 4096)),
 
 #if !IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
 		/* Misusing BIAS is a programming error (it is not controllable
@@ -287,10 +289,10 @@ static int igt_vma_pin1(void *arg)
 		 * However, the tests are still quite interesting for checking
 		 * variable start, end and size.
 		 */
-		NOSPACE(0, PIN_GLOBAL | PIN_MAPPABLE | PIN_OFFSET_BIAS | i915->ggtt.mappable_end),
-		NOSPACE(0, PIN_GLOBAL | PIN_OFFSET_BIAS | i915->ggtt.vm.total),
-		NOSPACE(8192, PIN_GLOBAL | PIN_MAPPABLE | PIN_OFFSET_BIAS | (i915->ggtt.mappable_end - 4096)),
-		NOSPACE(8192, PIN_GLOBAL | PIN_OFFSET_BIAS | (i915->ggtt.vm.total - 4096)),
+		NOSPACE(0, PIN_GLOBAL | PIN_MAPPABLE | PIN_OFFSET_BIAS | ggtt->mappable_end),
+		NOSPACE(0, PIN_GLOBAL | PIN_OFFSET_BIAS | ggtt->vm.total),
+		NOSPACE(8192, PIN_GLOBAL | PIN_MAPPABLE | PIN_OFFSET_BIAS | (ggtt->mappable_end - 4096)),
+		NOSPACE(8192, PIN_GLOBAL | PIN_OFFSET_BIAS | (ggtt->vm.total - 4096)),
 #endif
 		{ },
 #undef NOSPACE
@@ -306,13 +308,13 @@ static int igt_vma_pin1(void *arg)
 	 * focusing on error handling of boundary conditions.
 	 */
 
-	GEM_BUG_ON(!drm_mm_clean(&i915->ggtt.vm.mm));
+	GEM_BUG_ON(!drm_mm_clean(&ggtt->vm.mm));
 
-	obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
+	obj = i915_gem_object_create_internal(ggtt->vm.i915, PAGE_SIZE);
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
-	vma = checked_vma_instance(obj, &i915->ggtt.vm, NULL);
+	vma = checked_vma_instance(obj, &ggtt->vm, NULL);
 	if (IS_ERR(vma))
 		goto out;
 
@@ -403,8 +405,8 @@ static unsigned int rotated_size(const struct intel_rotation_plane_info *a,
 
 static int igt_vma_rotate(void *arg)
 {
-	struct drm_i915_private *i915 = arg;
-	struct i915_address_space *vm = &i915->ggtt.vm;
+	struct i915_ggtt *ggtt = arg;
+	struct i915_address_space *vm = &ggtt->vm;
 	struct drm_i915_gem_object *obj;
 	const struct intel_rotation_plane_info planes[] = {
 		{ .width = 1, .height = 1, .stride = 1 },
@@ -431,7 +433,7 @@ static int igt_vma_rotate(void *arg)
 	 * that the page layout within the rotated VMA match our expectations.
 	 */
 
-	obj = i915_gem_object_create_internal(i915, max_pages * PAGE_SIZE);
+	obj = i915_gem_object_create_internal(vm->i915, max_pages * PAGE_SIZE);
 	if (IS_ERR(obj))
 		goto out;
 
@@ -602,8 +604,8 @@ static bool assert_pin(struct i915_vma *vma,
 
 static int igt_vma_partial(void *arg)
 {
-	struct drm_i915_private *i915 = arg;
-	struct i915_address_space *vm = &i915->ggtt.vm;
+	struct i915_ggtt *ggtt = arg;
+	struct i915_address_space *vm = &ggtt->vm;
 	const unsigned int npages = 1021; /* prime! */
 	struct drm_i915_gem_object *obj;
 	const struct phase {
@@ -621,7 +623,7 @@ static int igt_vma_partial(void *arg)
 	 * we are returned the same VMA when we later request the same range.
 	 */
 
-	obj = i915_gem_object_create_internal(i915, npages*PAGE_SIZE);
+	obj = i915_gem_object_create_internal(vm->i915, npages * PAGE_SIZE);
 	if (IS_ERR(obj))
 		goto out;
 
@@ -670,7 +672,7 @@ static int igt_vma_partial(void *arg)
 		}
 
 		count = 0;
-		list_for_each_entry(vma, &obj->vma_list, obj_link)
+		list_for_each_entry(vma, &obj->vma.list, obj_link)
 			count++;
 		if (count != nvma) {
 			pr_err("(%s) All partial vma were not recorded on the obj->vma_list: found %u, expected %u\n",
@@ -699,7 +701,7 @@ static int igt_vma_partial(void *arg)
 		i915_vma_unpin(vma);
 
 		count = 0;
-		list_for_each_entry(vma, &obj->vma_list, obj_link)
+		list_for_each_entry(vma, &obj->vma.list, obj_link)
 			count++;
 		if (count != nvma) {
 			pr_err("(%s) allocated an extra full vma!\n", p->name);
@@ -723,17 +725,24 @@ int i915_vma_mock_selftests(void)
 		SUBTEST(igt_vma_partial),
 	};
 	struct drm_i915_private *i915;
+	struct i915_ggtt ggtt;
 	int err;
 
 	i915 = mock_gem_device();
 	if (!i915)
 		return -ENOMEM;
 
+	mock_init_ggtt(i915, &ggtt);
+
 	mutex_lock(&i915->drm.struct_mutex);
-	err = i915_subtests(tests, i915);
+	err = i915_subtests(tests, &ggtt);
+	mock_device_flush(i915);
 	mutex_unlock(&i915->drm.struct_mutex);
 
+	i915_gem_drain_freed_objects(i915);
+
+	mock_fini_ggtt(&ggtt);
 	drm_dev_put(&i915->drm);
+
 	return err;
 }
-
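The mock VMA selftests now construct their own GGTT instead of borrowing the mock device's, so the address-space bounds exercised by igt_vma_pin1() come from a fixture the test controls. The new fixture lifecycle reduces to (a sketch of the flow added above):

	struct i915_ggtt ggtt;

	mock_init_ggtt(i915, &ggtt);       /* standalone address space */
	err = i915_subtests(tests, &ggtt); /* subtests take the ggtt   */
	i915_gem_drain_freed_objects(i915);
	mock_fini_ggtt(&ggtt);             /* only after objects drain */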
diff --git a/drivers/gpu/drm/i915/selftests/igt_live_test.c b/drivers/gpu/drm/i915/selftests/igt_live_test.c
new file mode 100644
index 000000000000..3e902761cd16
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/igt_live_test.c
@@ -0,0 +1,78 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2018 Intel Corporation
+ */
+
+#include "../i915_drv.h"
+
+#include "../i915_selftest.h"
+#include "igt_flush_test.h"
+#include "igt_live_test.h"
+
+int igt_live_test_begin(struct igt_live_test *t,
+			struct drm_i915_private *i915,
+			const char *func,
+			const char *name)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int err;
+
+	lockdep_assert_held(&i915->drm.struct_mutex);
+
+	t->i915 = i915;
+	t->func = func;
+	t->name = name;
+
+	err = i915_gem_wait_for_idle(i915,
+				     I915_WAIT_INTERRUPTIBLE |
+				     I915_WAIT_LOCKED,
+				     MAX_SCHEDULE_TIMEOUT);
+	if (err) {
+		pr_err("%s(%s): failed to idle before, with err=%d!\n",
+		       func, name, err);
+		return err;
+	}
+
+	t->reset_global = i915_reset_count(&i915->gpu_error);
+
+	for_each_engine(engine, i915, id)
+		t->reset_engine[id] =
+			i915_reset_engine_count(&i915->gpu_error, engine);
+
+	return 0;
+}
+
+int igt_live_test_end(struct igt_live_test *t)
+{
+	struct drm_i915_private *i915 = t->i915;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	lockdep_assert_held(&i915->drm.struct_mutex);
+
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		return -EIO;
+
+	if (t->reset_global != i915_reset_count(&i915->gpu_error)) {
+		pr_err("%s(%s): GPU was reset %d times!\n",
+		       t->func, t->name,
+		       i915_reset_count(&i915->gpu_error) - t->reset_global);
+		return -EIO;
+	}
+
+	for_each_engine(engine, i915, id) {
+		if (t->reset_engine[id] ==
+		    i915_reset_engine_count(&i915->gpu_error, engine))
+			continue;
+
+		pr_err("%s(%s): engine '%s' was reset %d times!\n",
+		       t->func, t->name, engine->name,
+		       i915_reset_engine_count(&i915->gpu_error, engine) -
+		       t->reset_engine[id]);
+		return -EIO;
+	}
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/selftests/igt_live_test.h b/drivers/gpu/drm/i915/selftests/igt_live_test.h
new file mode 100644
index 000000000000..c0e9f99d50de
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/igt_live_test.h
@@ -0,0 +1,35 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef IGT_LIVE_TEST_H
+#define IGT_LIVE_TEST_H
+
+#include "../i915_gem.h"
+
+struct drm_i915_private;
+
+struct igt_live_test {
+	struct drm_i915_private *i915;
+	const char *func;
+	const char *name;
+
+	unsigned int reset_global;
+	unsigned int reset_engine[I915_NUM_ENGINES];
+};
+
+/*
+ * Flush the GPU state before and after the test to ensure that no residual
+ * code is running on the GPU that may affect this test. Also compare the
+ * state before and after the test and alert if it unexpectedly changes,
+ * e.g. if the GPU was reset.
+ */
+int igt_live_test_begin(struct igt_live_test *t,
+			struct drm_i915_private *i915,
+			const char *func,
+			const char *name);
+int igt_live_test_end(struct igt_live_test *t);
+
+#endif /* IGT_LIVE_TEST_H */
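A typical conversion in this patch brackets a live test body with the new helper while holding struct_mutex, turning any unexpected GPU or engine reset during the test into -EIO (usage sketch distilled from live_nop_request() above):

	struct igt_live_test t;
	int err;

	mutex_lock(&i915->drm.struct_mutex);
	err = igt_live_test_begin(&t, i915, __func__, engine->name);
	if (err == 0) {
		/* ... build, submit and wait upon requests ... */
		err = igt_live_test_end(&t); /* -EIO if a reset slipped in */
	}
	mutex_unlock(&i915->drm.struct_mutex);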
diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
index 8cd34f6e6859..9ebd9225684e 100644
--- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
+++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
@@ -68,48 +68,65 @@ static u64 hws_address(const struct i915_vma *hws,
 	return hws->node.start + seqno_offset(rq->fence.context);
 }
 
-static int emit_recurse_batch(struct igt_spinner *spin,
-			      struct i915_request *rq,
-			      u32 arbitration_command)
+static int move_to_active(struct i915_vma *vma,
+			  struct i915_request *rq,
+			  unsigned int flags)
 {
-	struct i915_address_space *vm = &rq->gem_context->ppgtt->vm;
+	int err;
+
+	err = i915_vma_move_to_active(vma, rq, flags);
+	if (err)
+		return err;
+
+	if (!i915_gem_object_has_active_reference(vma->obj)) {
+		i915_gem_object_get(vma->obj);
+		i915_gem_object_set_active_reference(vma->obj);
+	}
+
+	return 0;
+}
+
+struct i915_request *
+igt_spinner_create_request(struct igt_spinner *spin,
+			   struct i915_gem_context *ctx,
+			   struct intel_engine_cs *engine,
+			   u32 arbitration_command)
+{
+	struct i915_address_space *vm = &ctx->ppgtt->vm;
+	struct i915_request *rq = NULL;
 	struct i915_vma *hws, *vma;
 	u32 *batch;
 	int err;
 
 	vma = i915_vma_instance(spin->obj, vm, NULL);
 	if (IS_ERR(vma))
-		return PTR_ERR(vma);
+		return ERR_CAST(vma);
 
 	hws = i915_vma_instance(spin->hws, vm, NULL);
 	if (IS_ERR(hws))
-		return PTR_ERR(hws);
+		return ERR_CAST(hws);
 
 	err = i915_vma_pin(vma, 0, 0, PIN_USER);
 	if (err)
-		return err;
+		return ERR_PTR(err);
 
 	err = i915_vma_pin(hws, 0, 0, PIN_USER);
 	if (err)
 		goto unpin_vma;
 
-	err = i915_vma_move_to_active(vma, rq, 0);
-	if (err)
+	rq = i915_request_alloc(engine, ctx);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
 		goto unpin_hws;
-
-	if (!i915_gem_object_has_active_reference(vma->obj)) {
-		i915_gem_object_get(vma->obj);
-		i915_gem_object_set_active_reference(vma->obj);
 	}
 
-	err = i915_vma_move_to_active(hws, rq, 0);
+	err = move_to_active(vma, rq, 0);
 	if (err)
-		goto unpin_hws;
+		goto cancel_rq;
 
-	if (!i915_gem_object_has_active_reference(hws->obj)) {
-		i915_gem_object_get(hws->obj);
-		i915_gem_object_set_active_reference(hws->obj);
-	}
+	err = move_to_active(hws, rq, 0);
+	if (err)
+		goto cancel_rq;
 
 	batch = spin->batch;
 
@@ -127,35 +144,18 @@ static int emit_recurse_batch(struct igt_spinner *spin,
 
 	i915_gem_chipset_flush(spin->i915);
 
-	err = rq->engine->emit_bb_start(rq, vma->node.start, PAGE_SIZE, 0);
+	err = engine->emit_bb_start(rq, vma->node.start, PAGE_SIZE, 0);
 
+cancel_rq:
+	if (err) {
+		i915_request_skip(rq, err);
+		i915_request_add(rq);
+	}
 unpin_hws:
 	i915_vma_unpin(hws);
 unpin_vma:
 	i915_vma_unpin(vma);
-	return err;
-}
-
-struct i915_request *
-igt_spinner_create_request(struct igt_spinner *spin,
-			   struct i915_gem_context *ctx,
-			   struct intel_engine_cs *engine,
-			   u32 arbitration_command)
-{
-	struct i915_request *rq;
-	int err;
-
-	rq = i915_request_alloc(engine, ctx);
-	if (IS_ERR(rq))
-		return rq;
-
-	err = emit_recurse_batch(spin, rq, arbitration_command);
-	if (err) {
-		i915_request_add(rq);
-		return ERR_PTR(err);
-	}
-
-	return rq;
+	return err ? ERR_PTR(err) : rq;
 }
 
 static u32
@@ -185,11 +185,6 @@ void igt_spinner_fini(struct igt_spinner *spin)
 
 bool igt_wait_for_spinner(struct igt_spinner *spin, struct i915_request *rq)
 {
-	if (!wait_event_timeout(rq->execute,
-				READ_ONCE(rq->global_seqno),
-				msecs_to_jiffies(10)))
-		return false;
-
 	return !(wait_for_us(i915_seqno_passed(hws_seqno(spin, rq),
 					       rq->fence.seqno),
 			     10) &&
diff --git a/drivers/gpu/drm/i915/selftests/intel_breadcrumbs.c b/drivers/gpu/drm/i915/selftests/intel_breadcrumbs.c
deleted file mode 100644
index f03b407fdbe2..000000000000
--- a/drivers/gpu/drm/i915/selftests/intel_breadcrumbs.c
+++ /dev/null
@@ -1,470 +0,0 @@
-/*
- * Copyright © 2016 Intel Corporation
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the next
- * paragraph) shall be included in all copies or substantial portions of the
- * Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
- * IN THE SOFTWARE.
- *
- */
-
-#include "../i915_selftest.h"
-#include "i915_random.h"
-
-#include "mock_gem_device.h"
-#include "mock_engine.h"
-
-static int check_rbtree(struct intel_engine_cs *engine,
-			const unsigned long *bitmap,
-			const struct intel_wait *waiters,
-			const int count)
-{
-	struct intel_breadcrumbs *b = &engine->breadcrumbs;
-	struct rb_node *rb;
-	int n;
-
-	if (&b->irq_wait->node != rb_first(&b->waiters)) {
-		pr_err("First waiter does not match first element of wait-tree\n");
-		return -EINVAL;
-	}
-
-	n = find_first_bit(bitmap, count);
-	for (rb = rb_first(&b->waiters); rb; rb = rb_next(rb)) {
-		struct intel_wait *w = container_of(rb, typeof(*w), node);
-		int idx = w - waiters;
-
-		if (!test_bit(idx, bitmap)) {
-			pr_err("waiter[%d, seqno=%d] removed but still in wait-tree\n",
-			       idx, w->seqno);
-			return -EINVAL;
-		}
-
-		if (n != idx) {
-			pr_err("waiter[%d, seqno=%d] does not match expected next element in tree [%d]\n",
-			       idx, w->seqno, n);
-			return -EINVAL;
-		}
-
-		n = find_next_bit(bitmap, count, n + 1);
-	}
-
-	return 0;
-}
-
-static int check_completion(struct intel_engine_cs *engine,
-			    const unsigned long *bitmap,
-			    const struct intel_wait *waiters,
-			    const int count)
-{
-	int n;
-
-	for (n = 0; n < count; n++) {
-		if (intel_wait_complete(&waiters[n]) != !!test_bit(n, bitmap))
-			continue;
-
-		pr_err("waiter[%d, seqno=%d] is %s, but expected %s\n",
-		       n, waiters[n].seqno,
-		       intel_wait_complete(&waiters[n]) ? "complete" : "active",
-		       test_bit(n, bitmap) ? "active" : "complete");
-		return -EINVAL;
-	}
-
-	return 0;
-}
-
-static int check_rbtree_empty(struct intel_engine_cs *engine)
-{
-	struct intel_breadcrumbs *b = &engine->breadcrumbs;
-
-	if (b->irq_wait) {
-		pr_err("Empty breadcrumbs still has a waiter\n");
-		return -EINVAL;
-	}
-
-	if (!RB_EMPTY_ROOT(&b->waiters)) {
-		pr_err("Empty breadcrumbs, but wait-tree not empty\n");
-		return -EINVAL;
-	}
-
-	return 0;
-}
-
-static int igt_random_insert_remove(void *arg)
-{
-	const u32 seqno_bias = 0x1000;
-	I915_RND_STATE(prng);
-	struct intel_engine_cs *engine = arg;
-	struct intel_wait *waiters;
-	const int count = 4096;
-	unsigned int *order;
-	unsigned long *bitmap;
-	int err = -ENOMEM;
-	int n;
-
-	mock_engine_reset(engine);
-
-	waiters = kvmalloc_array(count, sizeof(*waiters), GFP_KERNEL);
-	if (!waiters)
-		goto out_engines;
-
-	bitmap = kcalloc(DIV_ROUND_UP(count, BITS_PER_LONG), sizeof(*bitmap),
-			 GFP_KERNEL);
-	if (!bitmap)
-		goto out_waiters;
-
-	order = i915_random_order(count, &prng);
-	if (!order)
-		goto out_bitmap;
-
-	for (n = 0; n < count; n++)
-		intel_wait_init_for_seqno(&waiters[n], seqno_bias + n);
-
-	err = check_rbtree(engine, bitmap, waiters, count);
-	if (err)
-		goto out_order;
-
-	/* Add and remove waiters into the rbtree in random order. At each
-	 * step, we verify that the rbtree is correctly ordered.
-	 */
-	for (n = 0; n < count; n++) {
-		int i = order[n];
-
-		intel_engine_add_wait(engine, &waiters[i]);
-		__set_bit(i, bitmap);
-
-		err = check_rbtree(engine, bitmap, waiters, count);
-		if (err)
-			goto out_order;
-	}
-
-	i915_random_reorder(order, count, &prng);
-	for (n = 0; n < count; n++) {
-		int i = order[n];
-
-		intel_engine_remove_wait(engine, &waiters[i]);
-		__clear_bit(i, bitmap);
-
-		err = check_rbtree(engine, bitmap, waiters, count);
-		if (err)
-			goto out_order;
-	}
-
-	err = check_rbtree_empty(engine);
-out_order:
-	kfree(order);
-out_bitmap:
-	kfree(bitmap);
-out_waiters:
-	kvfree(waiters);
-out_engines:
-	mock_engine_flush(engine);
-	return err;
-}
-
-static int igt_insert_complete(void *arg)
-{
-	const u32 seqno_bias = 0x1000;
-	struct intel_engine_cs *engine = arg;
-	struct intel_wait *waiters;
-	const int count = 4096;
-	unsigned long *bitmap;
-	int err = -ENOMEM;
-	int n, m;
-
-	mock_engine_reset(engine);
-
-	waiters = kvmalloc_array(count, sizeof(*waiters), GFP_KERNEL);
-	if (!waiters)
-		goto out_engines;
-
-	bitmap = kcalloc(DIV_ROUND_UP(count, BITS_PER_LONG), sizeof(*bitmap),
-			 GFP_KERNEL);
-	if (!bitmap)
-		goto out_waiters;
-
-	for (n = 0; n < count; n++) {
-		intel_wait_init_for_seqno(&waiters[n], n + seqno_bias);
-		intel_engine_add_wait(engine, &waiters[n]);
-		__set_bit(n, bitmap);
-	}
-	err = check_rbtree(engine, bitmap, waiters, count);
-	if (err)
-		goto out_bitmap;
-
-	/* On each step, we advance the seqno so that several waiters are then
-	 * complete (we increase the seqno by increasingly larger values to
-	 * retire more and more waiters at once). All retired waiters should
-	 * be woken and removed from the rbtree, and so that we check.
-	 */
-	for (n = 0; n < count; n = m) {
-		int seqno = 2 * n;
-
-		GEM_BUG_ON(find_first_bit(bitmap, count) != n);
-
-		if (intel_wait_complete(&waiters[n])) {
-			pr_err("waiter[%d, seqno=%d] completed too early\n",
-			       n, waiters[n].seqno);
-			err = -EINVAL;
-			goto out_bitmap;
-		}
-
-		/* complete the following waiters */
-		mock_seqno_advance(engine, seqno + seqno_bias);
-		for (m = n; m <= seqno; m++) {
-			if (m == count)
-				break;
-
-			GEM_BUG_ON(!test_bit(m, bitmap));
-			__clear_bit(m, bitmap);
-		}
-
-		intel_engine_remove_wait(engine, &waiters[n]);
-		RB_CLEAR_NODE(&waiters[n].node);
-
-		err = check_rbtree(engine, bitmap, waiters, count);
-		if (err) {
-			pr_err("rbtree corrupt after seqno advance to %d\n",
-			       seqno + seqno_bias);
-			goto out_bitmap;
-		}
-
-		err = check_completion(engine, bitmap, waiters, count);
-		if (err) {
-			pr_err("completions after seqno advance to %d failed\n",
-			       seqno + seqno_bias);
-			goto out_bitmap;
-		}
-	}
-
-	err = check_rbtree_empty(engine);
-out_bitmap:
-	kfree(bitmap);
-out_waiters:
-	kvfree(waiters);
-out_engines:
-	mock_engine_flush(engine);
-	return err;
-}
-
-struct igt_wakeup {
-	struct task_struct *tsk;
-	atomic_t *ready, *set, *done;
-	struct intel_engine_cs *engine;
-	unsigned long flags;
-#define STOP 0
-#define IDLE 1
-	wait_queue_head_t *wq;
-	u32 seqno;
-};
-
-static bool wait_for_ready(struct igt_wakeup *w)
-{
-	DEFINE_WAIT(ready);
-
-	set_bit(IDLE, &w->flags);
-	if (atomic_dec_and_test(w->done))
-		wake_up_var(w->done);
-
-	if (test_bit(STOP, &w->flags))
-		goto out;
-
-	for (;;) {
-		prepare_to_wait(w->wq, &ready, TASK_INTERRUPTIBLE);
-		if (atomic_read(w->ready) == 0)
-			break;
-
-		schedule();
-	}
-	finish_wait(w->wq, &ready);
-
-out:
-	clear_bit(IDLE, &w->flags);
-	if (atomic_dec_and_test(w->set))
-		wake_up_var(w->set);
-
-	return !test_bit(STOP, &w->flags);
-}
-
-static int igt_wakeup_thread(void *arg)
-{
-	struct igt_wakeup *w = arg;
-	struct intel_wait wait;
-
-	while (wait_for_ready(w)) {
-		GEM_BUG_ON(kthread_should_stop());
-
-		intel_wait_init_for_seqno(&wait, w->seqno);
-		intel_engine_add_wait(w->engine, &wait);
-		for (;;) {
-			set_current_state(TASK_UNINTERRUPTIBLE);
-			if (i915_seqno_passed(intel_engine_get_seqno(w->engine),
-					      w->seqno))
-				break;
-
-			if (test_bit(STOP, &w->flags)) /* emergency escape */
-				break;
-
-			schedule();
-		}
-		intel_engine_remove_wait(w->engine, &wait);
-		__set_current_state(TASK_RUNNING);
-	}
-
-	return 0;
-}
-
-static void igt_wake_all_sync(atomic_t *ready,
-			      atomic_t *set,
-			      atomic_t *done,
-			      wait_queue_head_t *wq,
-			      int count)
-{
-	atomic_set(set, count);
-	atomic_set(ready, 0);
-	wake_up_all(wq);
-
-	wait_var_event(set, !atomic_read(set));
-	atomic_set(ready, count);
-	atomic_set(done, count);
-}
-
-static int igt_wakeup(void *arg)
-{
-	I915_RND_STATE(prng);
-	struct intel_engine_cs *engine = arg;
-	struct igt_wakeup *waiters;
-	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
-	const int count = 4096;
-	const u32 max_seqno = count / 4;
-	atomic_t ready, set, done;
-	int err = -ENOMEM;
-	int n, step;
-
-	mock_engine_reset(engine);
-
-	waiters = kvmalloc_array(count, sizeof(*waiters), GFP_KERNEL);
-	if (!waiters)
-		goto out_engines;
-
-	/* Create a large number of threads, each waiting on a random seqno.
-	 * Multiple waiters will be waiting for the same seqno.
-	 */
-	atomic_set(&ready, count);
-	for (n = 0; n < count; n++) {
-		waiters[n].wq = &wq;
-		waiters[n].ready = &ready;
-		waiters[n].set = &set;
-		waiters[n].done = &done;
-		waiters[n].engine = engine;
-		waiters[n].flags = BIT(IDLE);
-
-		waiters[n].tsk = kthread_run(igt_wakeup_thread, &waiters[n],
-					     "i915/igt:%d", n);
-		if (IS_ERR(waiters[n].tsk))
-			goto out_waiters;
-
-		get_task_struct(waiters[n].tsk);
-	}
-
-	for (step = 1; step <= max_seqno; step <<= 1) {
-		u32 seqno;
-
-		/* The waiter threads start paused as we assign them a random
-		 * seqno and reset the engine. Once the engine is reset,
-		 * we signal that the threads may begin their wait upon their
-		 * seqno.
-		 */
-		for (n = 0; n < count; n++) {
-			GEM_BUG_ON(!test_bit(IDLE, &waiters[n].flags));
-			waiters[n].seqno =
-				1 + prandom_u32_state(&prng) % max_seqno;
-		}
-		mock_seqno_advance(engine, 0);
-		igt_wake_all_sync(&ready, &set, &done, &wq, count);
-
-		/* Simulate the GPU doing chunks of work, with one or more
-		 * seqno appearing to finish at the same time. A random number
-		 * of threads will be waiting upon the update and hopefully be
-		 * woken.
-		 */
-		for (seqno = 1; seqno <= max_seqno + step; seqno += step) {
-			usleep_range(50, 500);
-			mock_seqno_advance(engine, seqno);
-		}
-		GEM_BUG_ON(intel_engine_get_seqno(engine) < 1 + max_seqno);
-
-		/* With the seqno now beyond any of the waiting threads, they
-		 * should all be woken, see that they are complete and signal
-		 * that they are ready for the next test. We wait until all
-		 * threads are complete and waiting for us (i.e. not a seqno).
-		 */
-		if (!wait_var_event_timeout(&done,
-					    !atomic_read(&done), 10 * HZ)) {
-			pr_err("Timed out waiting for %d remaining waiters\n",
-			       atomic_read(&done));
-			err = -ETIMEDOUT;
-			break;
-		}
-
-		err = check_rbtree_empty(engine);
-		if (err)
-			break;
-	}
-
-out_waiters:
-	for (n = 0; n < count; n++) {
-		if (IS_ERR(waiters[n].tsk))
-			break;
-
-		set_bit(STOP, &waiters[n].flags);
-	}
-	mock_seqno_advance(engine, INT_MAX); /* wakeup any broken waiters */
-	igt_wake_all_sync(&ready, &set, &done, &wq, n);
-
-	for (n = 0; n < count; n++) {
-		if (IS_ERR(waiters[n].tsk))
-			break;
-
-		kthread_stop(waiters[n].tsk);
-		put_task_struct(waiters[n].tsk);
-	}
-
-	kvfree(waiters);
-out_engines:
-	mock_engine_flush(engine);
-	return err;
-}
-
-int intel_breadcrumbs_mock_selftests(void)
-{
-	static const struct i915_subtest tests[] = {
-		SUBTEST(igt_random_insert_remove),
-		SUBTEST(igt_insert_complete),
-		SUBTEST(igt_wakeup),
-	};
-	struct drm_i915_private *i915;
-	int err;
-
-	i915 = mock_gem_device();
-	if (!i915)
-		return -ENOMEM;
-
-	err = i915_subtests(tests, i915->engine[RCS]);
-	drm_dev_put(&i915->drm);
-
-	return err;
-}
diff --git a/drivers/gpu/drm/i915/selftests/intel_guc.c b/drivers/gpu/drm/i915/selftests/intel_guc.c
index 32cba4cae31a..c5e0a0e98fcb 100644
--- a/drivers/gpu/drm/i915/selftests/intel_guc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_guc.c
@@ -137,12 +137,13 @@ static bool client_doorbell_in_sync(struct intel_guc_client *client)
 static int igt_guc_clients(void *args)
 {
 	struct drm_i915_private *dev_priv = args;
+	intel_wakeref_t wakeref;
 	struct intel_guc *guc;
 	int err = 0;
 
 	GEM_BUG_ON(!HAS_GUC(dev_priv));
 	mutex_lock(&dev_priv->drm.struct_mutex);
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	guc = &dev_priv->guc;
 	if (!guc) {
@@ -225,7 +226,7 @@ out:
 	guc_clients_create(guc);
 	guc_clients_enable(guc);
 unlock:
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 	return err;
 }
@@ -238,13 +239,14 @@ unlock:
 static int igt_guc_doorbells(void *arg)
 {
 	struct drm_i915_private *dev_priv = arg;
+	intel_wakeref_t wakeref;
 	struct intel_guc *guc;
 	int i, err = 0;
 	u16 db_id;
 
 	GEM_BUG_ON(!HAS_GUC(dev_priv));
 	mutex_lock(&dev_priv->drm.struct_mutex);
-	intel_runtime_pm_get(dev_priv);
+	wakeref = intel_runtime_pm_get(dev_priv);
 
 	guc = &dev_priv->guc;
 	if (!guc) {
@@ -337,7 +339,7 @@ out:
 			guc_client_free(clients[i]);
 		}
 unlock:
-	intel_runtime_pm_put(dev_priv);
+	intel_runtime_pm_put(dev_priv, wakeref);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
index 40efbed611de..7b6f3bea9ef8 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
@@ -103,52 +103,87 @@ static u64 hws_address(const struct i915_vma *hws,
 	return hws->node.start + offset_in_page(sizeof(u32)*rq->fence.context);
 }
 
-static int emit_recurse_batch(struct hang *h,
-			      struct i915_request *rq)
+static int move_to_active(struct i915_vma *vma,
+			  struct i915_request *rq,
+			  unsigned int flags)
+{
+	int err;
+
+	err = i915_vma_move_to_active(vma, rq, flags);
+	if (err)
+		return err;
+
+	if (!i915_gem_object_has_active_reference(vma->obj)) {
+		i915_gem_object_get(vma->obj);
+		i915_gem_object_set_active_reference(vma->obj);
+	}
+
+	return 0;
+}
+
+static struct i915_request *
+hang_create_request(struct hang *h, struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *i915 = h->i915;
 	struct i915_address_space *vm =
-		rq->gem_context->ppgtt ?
-		&rq->gem_context->ppgtt->vm :
-		&i915->ggtt.vm;
+		h->ctx->ppgtt ? &h->ctx->ppgtt->vm : &i915->ggtt.vm;
+	struct i915_request *rq = NULL;
 	struct i915_vma *hws, *vma;
 	unsigned int flags;
 	u32 *batch;
 	int err;
 
+	if (i915_gem_object_is_active(h->obj)) {
+		struct drm_i915_gem_object *obj;
+		void *vaddr;
+
+		obj = i915_gem_object_create_internal(h->i915, PAGE_SIZE);
+		if (IS_ERR(obj))
+			return ERR_CAST(obj);
+
+		vaddr = i915_gem_object_pin_map(obj,
+						i915_coherent_map_type(h->i915));
+		if (IS_ERR(vaddr)) {
+			i915_gem_object_put(obj);
+			return ERR_CAST(vaddr);
+		}
+
+		i915_gem_object_unpin_map(h->obj);
+		i915_gem_object_put(h->obj);
+
+		h->obj = obj;
+		h->batch = vaddr;
+	}
+
 	vma = i915_vma_instance(h->obj, vm, NULL);
 	if (IS_ERR(vma))
-		return PTR_ERR(vma);
+		return ERR_CAST(vma);
 
 	hws = i915_vma_instance(h->hws, vm, NULL);
 	if (IS_ERR(hws))
-		return PTR_ERR(hws);
+		return ERR_CAST(hws);
 
 	err = i915_vma_pin(vma, 0, 0, PIN_USER);
 	if (err)
-		return err;
+		return ERR_PTR(err);
 
 	err = i915_vma_pin(hws, 0, 0, PIN_USER);
 	if (err)
 		goto unpin_vma;
 
-	err = i915_vma_move_to_active(vma, rq, 0);
-	if (err)
+	rq = i915_request_alloc(engine, h->ctx);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
 		goto unpin_hws;
-
-	if (!i915_gem_object_has_active_reference(vma->obj)) {
-		i915_gem_object_get(vma->obj);
-		i915_gem_object_set_active_reference(vma->obj);
 	}
 
-	err = i915_vma_move_to_active(hws, rq, 0);
+	err = move_to_active(vma, rq, 0);
 	if (err)
-		goto unpin_hws;
+		goto cancel_rq;
 
-	if (!i915_gem_object_has_active_reference(hws->obj)) {
-		i915_gem_object_get(hws->obj);
-		i915_gem_object_set_active_reference(hws->obj);
-	}
+	err = move_to_active(hws, rq, 0);
+	if (err)
+		goto cancel_rq;
 
 	batch = h->batch;
 	if (INTEL_GEN(i915) >= 8) {
@@ -213,52 +248,16 @@ static int emit_recurse_batch(struct hang *h,
 
 	err = rq->engine->emit_bb_start(rq, vma->node.start, PAGE_SIZE, flags);
 
+cancel_rq:
+	if (err) {
+		i915_request_skip(rq, err);
+		i915_request_add(rq);
+	}
 unpin_hws:
 	i915_vma_unpin(hws);
 unpin_vma:
 	i915_vma_unpin(vma);
-	return err;
-}
-
-static struct i915_request *
-hang_create_request(struct hang *h, struct intel_engine_cs *engine)
-{
-	struct i915_request *rq;
-	int err;
-
-	if (i915_gem_object_is_active(h->obj)) {
-		struct drm_i915_gem_object *obj;
-		void *vaddr;
-
-		obj = i915_gem_object_create_internal(h->i915, PAGE_SIZE);
-		if (IS_ERR(obj))
-			return ERR_CAST(obj);
-
-		vaddr = i915_gem_object_pin_map(obj,
-						i915_coherent_map_type(h->i915));
-		if (IS_ERR(vaddr)) {
-			i915_gem_object_put(obj);
-			return ERR_CAST(vaddr);
-		}
-
-		i915_gem_object_unpin_map(h->obj);
-		i915_gem_object_put(h->obj);
-
-		h->obj = obj;
-		h->batch = vaddr;
-	}
-
-	rq = i915_request_alloc(engine, h->ctx);
-	if (IS_ERR(rq))
-		return rq;
-
-	err = emit_recurse_batch(h, rq);
-	if (err) {
-		i915_request_add(rq);
-		return ERR_PTR(err);
-	}
-
-	return rq;
+	return err ? ERR_PTR(err) : rq;
 }
 
 static u32 hws_seqno(const struct hang *h, const struct i915_request *rq)
@@ -364,9 +363,7 @@ static int igt_global_reset(void *arg)
 	/* Check that we can issue a global GPU reset */
 
 	igt_global_reset_lock(i915);
-	set_bit(I915_RESET_HANDOFF, &i915->gpu_error.flags);
 
-	mutex_lock(&i915->drm.struct_mutex);
 	reset_count = i915_reset_count(&i915->gpu_error);
 
 	i915_reset(i915, ALL_ENGINES, NULL);
@@ -375,9 +372,7 @@ static int igt_global_reset(void *arg)
 		pr_err("No GPU reset recorded!\n");
 		err = -EINVAL;
 	}
-	mutex_unlock(&i915->drm.struct_mutex);
 
-	GEM_BUG_ON(test_bit(I915_RESET_HANDOFF, &i915->gpu_error.flags));
 	igt_global_reset_unlock(i915);
 
 	if (i915_terminally_wedged(&i915->gpu_error))
@@ -386,6 +381,29 @@ static int igt_global_reset(void *arg)
 	return err;
 }
 
+static int igt_wedged_reset(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	intel_wakeref_t wakeref;
+
+	/* Check that we can recover a wedged device with a GPU reset */
+
+	igt_global_reset_lock(i915);
+	wakeref = intel_runtime_pm_get(i915);
+
+	i915_gem_set_wedged(i915);
+
+	mutex_lock(&i915->drm.struct_mutex);
+	GEM_BUG_ON(!i915_terminally_wedged(&i915->gpu_error));
+	i915_reset(i915, ALL_ENGINES, NULL);
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	intel_runtime_pm_put(i915, wakeref);
+	igt_global_reset_unlock(i915);
+
+	return i915_terminally_wedged(&i915->gpu_error) ? -EIO : 0;
+}
+
 static bool wait_for_idle(struct intel_engine_cs *engine)
 {
 	return wait_for(intel_engine_is_idle(engine), IGT_IDLE_TIMEOUT) == 0;
@@ -431,8 +449,6 @@ static int __igt_reset_engine(struct drm_i915_private *i915, bool active)
 
 		set_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags);
 		do {
-			u32 seqno = intel_engine_get_seqno(engine);
-
 			if (active) {
 				struct i915_request *rq;
 
@@ -451,7 +467,7 @@ static int __igt_reset_engine(struct drm_i915_private *i915, bool active)
 				if (!wait_until_running(&h, rq)) {
 					struct drm_printer p = drm_info_printer(i915->drm.dev);
 
-					pr_err("%s: Failed to start request %x, at %x\n",
+					pr_err("%s: Failed to start request %llx, at %x\n",
 					       __func__, rq->fence.seqno, hws_seqno(&h, rq));
 					intel_engine_dump(engine, &p,
 							  "%s\n", engine->name);
@@ -461,8 +477,6 @@ static int __igt_reset_engine(struct drm_i915_private *i915, bool active)
 					break;
 				}
 
-				GEM_BUG_ON(!rq->global_seqno);
-				seqno = rq->global_seqno - 1;
 				i915_request_put(rq);
 			}
 
@@ -478,16 +492,15 @@ static int __igt_reset_engine(struct drm_i915_private *i915, bool active)
 				break;
 			}
 
-			reset_engine_count += active;
 			if (i915_reset_engine_count(&i915->gpu_error, engine) !=
-			    reset_engine_count) {
-				pr_err("%s engine reset %srecorded!\n",
-				       engine->name, active ? "not " : "");
+			    ++reset_engine_count) {
+				pr_err("%s engine reset not recorded!\n",
+				       engine->name);
 				err = -EINVAL;
 				break;
 			}
 
-			if (!wait_for_idle(engine)) {
+			if (!i915_reset_flush(i915)) {
 				struct drm_printer p =
 					drm_info_printer(i915->drm.dev);
 
@@ -552,7 +565,7 @@ static int active_request_put(struct i915_request *rq)
 		return 0;
 
 	if (i915_request_wait(rq, 0, 5 * HZ) < 0) {
-		GEM_TRACE("%s timed out waiting for completion of fence %llx:%d, seqno %d.\n",
+		GEM_TRACE("%s timed out waiting for completion of fence %llx:%lld, seqno %d.\n",
 			  rq->engine->name,
 			  rq->fence.context,
 			  rq->fence.seqno,
@@ -710,7 +723,6 @@ static int __igt_reset_engines(struct drm_i915_private *i915,
 
 		set_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags);
 		do {
-			u32 seqno = intel_engine_get_seqno(engine);
 			struct i915_request *rq = NULL;
 
 			if (flags & TEST_ACTIVE) {
@@ -729,7 +741,7 @@ static int __igt_reset_engines(struct drm_i915_private *i915,
 				if (!wait_until_running(&h, rq)) {
 					struct drm_printer p = drm_info_printer(i915->drm.dev);
 
-					pr_err("%s: Failed to start request %x, at %x\n",
+					pr_err("%s: Failed to start request %llx, at %x\n",
 					       __func__, rq->fence.seqno, hws_seqno(&h, rq));
 					intel_engine_dump(engine, &p,
 							  "%s\n", engine->name);
@@ -738,9 +750,6 @@ static int __igt_reset_engines(struct drm_i915_private *i915,
 					err = -EIO;
 					break;
 				}
-
-				GEM_BUG_ON(!rq->global_seqno);
-				seqno = rq->global_seqno - 1;
 			}
 
 			err = i915_reset_engine(engine, NULL);
@@ -777,10 +786,9 @@ static int __igt_reset_engines(struct drm_i915_private *i915,
 
 		reported = i915_reset_engine_count(&i915->gpu_error, engine);
 		reported -= threads[engine->id].resets;
-		if (reported != (flags & TEST_ACTIVE ? count : 0)) {
-			pr_err("i915_reset_engine(%s:%s): reset %lu times, but reported %lu, expected %lu reported\n",
-			       engine->name, test_name, count, reported,
-			       (flags & TEST_ACTIVE ? count : 0));
+		if (reported != count) {
+			pr_err("i915_reset_engine(%s:%s): reset %lu times, but reported %lu\n",
+			       engine->name, test_name, count, reported);
 			if (!err)
 				err = -EINVAL;
 		}
@@ -879,20 +887,13 @@ static int igt_reset_engines(void *arg)
 	return 0;
 }
 
-static u32 fake_hangcheck(struct i915_request *rq, u32 mask)
+static u32 fake_hangcheck(struct drm_i915_private *i915, u32 mask)
 {
-	struct i915_gpu_error *error = &rq->i915->gpu_error;
-	u32 reset_count = i915_reset_count(error);
-
-	error->stalled_mask = mask;
-
-	/* set_bit() must be after we have setup the backchannel (mask) */
-	smp_mb__before_atomic();
-	set_bit(I915_RESET_HANDOFF, &error->flags);
+	u32 count = i915_reset_count(&i915->gpu_error);
 
-	wake_up_all(&error->wait_queue);
+	i915_reset(i915, mask, NULL);
 
-	return reset_count;
+	return count;
 }
 
 static int igt_reset_wait(void *arg)
@@ -928,7 +929,7 @@ static int igt_reset_wait(void *arg)
 	if (!wait_until_running(&h, rq)) {
 		struct drm_printer p = drm_info_printer(i915->drm.dev);
 
-		pr_err("%s: Failed to start request %x, at %x\n",
+		pr_err("%s: Failed to start request %llx, at %x\n",
 		       __func__, rq->fence.seqno, hws_seqno(&h, rq));
 		intel_engine_dump(rq->engine, &p, "%s\n", rq->engine->name);
 
@@ -938,7 +939,7 @@ static int igt_reset_wait(void *arg)
 		goto out_rq;
 	}
 
-	reset_count = fake_hangcheck(rq, ALL_ENGINES);
+	reset_count = fake_hangcheck(i915, ALL_ENGINES);
 
 	timeout = i915_request_wait(rq, I915_WAIT_LOCKED, 10);
 	if (timeout < 0) {
@@ -948,7 +949,6 @@ static int igt_reset_wait(void *arg)
 		goto out_rq;
 	}
 
-	GEM_BUG_ON(test_bit(I915_RESET_HANDOFF, &i915->gpu_error.flags));
 	if (i915_reset_count(&i915->gpu_error) == reset_count) {
 		pr_err("No GPU reset recorded!\n");
 		err = -EINVAL;
@@ -1107,7 +1107,7 @@ static int __igt_reset_evict_vma(struct drm_i915_private *i915,
 	if (!wait_until_running(&h, rq)) {
 		struct drm_printer p = drm_info_printer(i915->drm.dev);
 
-		pr_err("%s: Failed to start request %x, at %x\n",
+		pr_err("%s: Failed to start request %llx, at %x\n",
 		       __func__, rq->fence.seqno, hws_seqno(&h, rq));
 		intel_engine_dump(rq->engine, &p, "%s\n", rq->engine->name);
 
@@ -1127,7 +1127,7 @@ static int __igt_reset_evict_vma(struct drm_i915_private *i915,
 
 	wait_for_completion(&arg.completion);
 
-	if (wait_for(waitqueue_active(&rq->execute), 10)) {
+	if (wait_for(!list_empty(&rq->fence.cb_list), 10)) {
 		struct drm_printer p = drm_info_printer(i915->drm.dev);
 
 		pr_err("igt/evict_vma kthread did not wait\n");
@@ -1138,7 +1138,7 @@ static int __igt_reset_evict_vma(struct drm_i915_private *i915,
 	}
 
 out_reset:
-	fake_hangcheck(rq, intel_engine_flag(rq->engine));
+	fake_hangcheck(rq->i915, intel_engine_flag(rq->engine));
 
 	if (tsk) {
 		struct igt_wedge_me w;
@@ -1302,7 +1302,7 @@ static int igt_reset_queue(void *arg)
 			if (!wait_until_running(&h, prev)) {
 				struct drm_printer p = drm_info_printer(i915->drm.dev);
 
-				pr_err("%s(%s): Failed to start request %x, at %x\n",
+				pr_err("%s(%s): Failed to start request %llx, at %x\n",
 				       __func__, engine->name,
 				       prev->fence.seqno, hws_seqno(&h, prev));
 				intel_engine_dump(engine, &p,
@@ -1317,12 +1317,7 @@ static int igt_reset_queue(void *arg)
 				goto fini;
 			}
 
-			reset_count = fake_hangcheck(prev, ENGINE_MASK(id));
-
-			i915_reset(i915, ENGINE_MASK(id), NULL);
-
-			GEM_BUG_ON(test_bit(I915_RESET_HANDOFF,
-					    &i915->gpu_error.flags));
+			reset_count = fake_hangcheck(i915, ENGINE_MASK(id));
 
 			if (prev->fence.error != -EIO) {
 				pr_err("GPU reset not recorded on hanging request [fence.error=%d]!\n",
@@ -1413,7 +1408,7 @@ static int igt_handle_error(void *arg)
 	if (!wait_until_running(&h, rq)) {
 		struct drm_printer p = drm_info_printer(i915->drm.dev);
 
-		pr_err("%s: Failed to start request %x, at %x\n",
+		pr_err("%s: Failed to start request %llx, at %x\n",
 		       __func__, rq->fence.seqno, hws_seqno(&h, rq));
 		intel_engine_dump(rq->engine, &p, "%s\n", rq->engine->name);
 
@@ -1449,10 +1444,203 @@ err_unlock:
 	return err;
 }
 
+static void __preempt_begin(void)
+{
+	preempt_disable();
+}
+
+static void __preempt_end(void)
+{
+	preempt_enable();
+}
+
+static void __softirq_begin(void)
+{
+	local_bh_disable();
+}
+
+static void __softirq_end(void)
+{
+	local_bh_enable();
+}
+
+static void __hardirq_begin(void)
+{
+	local_irq_disable();
+}
+
+static void __hardirq_end(void)
+{
+	local_irq_enable();
+}
+
+struct atomic_section {
+	const char *name;
+	void (*critical_section_begin)(void);
+	void (*critical_section_end)(void);
+};
+
+static int __igt_atomic_reset_engine(struct intel_engine_cs *engine,
+				     const struct atomic_section *p,
+				     const char *mode)
+{
+	struct tasklet_struct * const t = &engine->execlists.tasklet;
+	int err;
+
+	GEM_TRACE("i915_reset_engine(%s:%s) under %s\n",
+		  engine->name, mode, p->name);
+
+	tasklet_disable_nosync(t);
+	p->critical_section_begin();
+
+	err = i915_reset_engine(engine, NULL);
+
+	p->critical_section_end();
+	tasklet_enable(t);
+
+	if (err)
+		pr_err("i915_reset_engine(%s:%s) failed under %s\n",
+		       engine->name, mode, p->name);
+
+	return err;
+}
+
+static int igt_atomic_reset_engine(struct intel_engine_cs *engine,
+				   const struct atomic_section *p)
+{
+	struct drm_i915_private *i915 = engine->i915;
+	struct i915_request *rq;
+	struct hang h;
+	int err;
+
+	err = __igt_atomic_reset_engine(engine, p, "idle");
+	if (err)
+		return err;
+
+	err = hang_init(&h, i915);
+	if (err)
+		return err;
+
+	rq = hang_create_request(&h, engine);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto out;
+	}
+
+	i915_request_get(rq);
+	i915_request_add(rq);
+
+	if (wait_until_running(&h, rq)) {
+		err = __igt_atomic_reset_engine(engine, p, "active");
+	} else {
+		pr_err("%s(%s): Failed to start request %llx, at %x\n",
+		       __func__, engine->name,
+		       rq->fence.seqno, hws_seqno(&h, rq));
+		i915_gem_set_wedged(i915);
+		err = -EIO;
+	}
+
+	if (err == 0) {
+		struct igt_wedge_me w;
+
+		igt_wedge_on_timeout(&w, i915, HZ / 20 /* 50ms timeout */)
+			i915_request_wait(rq,
+					  I915_WAIT_LOCKED,
+					  MAX_SCHEDULE_TIMEOUT);
+		if (i915_terminally_wedged(&i915->gpu_error))
+			err = -EIO;
+	}
+
+	i915_request_put(rq);
+out:
+	hang_fini(&h);
+	return err;
+}
+
+static void force_reset(struct drm_i915_private *i915)
+{
+	i915_gem_set_wedged(i915);
+	i915_reset(i915, 0, NULL);
+}
+
+static int igt_atomic_reset(void *arg)
+{
+	static const struct atomic_section phases[] = {
+		{ "preempt", __preempt_begin, __preempt_end },
+		{ "softirq", __softirq_begin, __softirq_end },
+		{ "hardirq", __hardirq_begin, __hardirq_end },
+		{ }
+	};
+	struct drm_i915_private *i915 = arg;
+	intel_wakeref_t wakeref;
+	int err = 0;
+
+	/* Check that the resets are usable from atomic context */
+
+	if (USES_GUC_SUBMISSION(i915))
+		return 0; /* guc is dead; long live the guc */
+
+	igt_global_reset_lock(i915);
+	mutex_lock(&i915->drm.struct_mutex);
+	wakeref = intel_runtime_pm_get(i915);
+
+	/* Flush any requests before we get started and check basics */
+	force_reset(i915);
+	if (i915_terminally_wedged(&i915->gpu_error))
+		goto unlock;
+
+	if (intel_has_gpu_reset(i915)) {
+		const typeof(*phases) *p;
+
+		for (p = phases; p->name; p++) {
+			GEM_TRACE("intel_gpu_reset under %s\n", p->name);
+
+			p->critical_section_begin();
+			err = intel_gpu_reset(i915, ALL_ENGINES);
+			p->critical_section_end();
+
+			if (err) {
+				pr_err("intel_gpu_reset failed under %s\n",
+				       p->name);
+				goto out;
+			}
+		}
+
+		force_reset(i915);
+	}
+
+	if (intel_has_reset_engine(i915)) {
+		struct intel_engine_cs *engine;
+		enum intel_engine_id id;
+
+		for_each_engine(engine, i915, id) {
+			const typeof(*phases) *p;
+
+			for (p = phases; p->name; p++) {
+				err = igt_atomic_reset_engine(engine, p);
+				if (err)
+					goto out;
+			}
+		}
+	}
+
+out:
+	/* As we poke around the guts, do a full reset before continuing. */
+	force_reset(i915);
+
+unlock:
+	intel_runtime_pm_put(i915, wakeref);
+	mutex_unlock(&i915->drm.struct_mutex);
+	igt_global_reset_unlock(i915);
+
+	return err;
+}
+
 int intel_hangcheck_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
 		SUBTEST(igt_global_reset), /* attempt to recover GPU first */
+		SUBTEST(igt_wedged_reset),
 		SUBTEST(igt_hang_sanitycheck),
 		SUBTEST(igt_reset_idle_engine),
 		SUBTEST(igt_reset_active_engine),
@@ -1463,7 +1651,9 @@ int intel_hangcheck_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(igt_reset_evict_ppgtt),
 		SUBTEST(igt_reset_evict_fence),
 		SUBTEST(igt_handle_error),
+		SUBTEST(igt_atomic_reset),
 	};
+	intel_wakeref_t wakeref;
 	bool saved_hangcheck;
 	int err;
 
@@ -1473,8 +1663,9 @@ int intel_hangcheck_live_selftests(struct drm_i915_private *i915)
 	if (i915_terminally_wedged(&i915->gpu_error))
 		return -EIO; /* we're long past hope of a successful reset */
 
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 	saved_hangcheck = fetch_and_zero(&i915_modparams.enable_hangcheck);
+	drain_delayed_work(&i915->gpu_error.hangcheck_work); /* flush param */
 
 	err = i915_subtests(tests, i915);
 
@@ -1483,7 +1674,7 @@ int intel_hangcheck_live_selftests(struct drm_i915_private *i915)
 	mutex_unlock(&i915->drm.struct_mutex);
 
 	i915_modparams.enable_hangcheck = saved_hangcheck;
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
index ca461e3a5f27..58144e024751 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
@@ -4,6 +4,10 @@
  * Copyright © 2018 Intel Corporation
  */
 
+#include <linux/prime_numbers.h>
+
+#include "../i915_reset.h"
+
 #include "../i915_selftest.h"
 #include "igt_flush_test.h"
 #include "igt_spinner.h"
@@ -18,13 +22,14 @@ static int live_sanitycheck(void *arg)
 	struct i915_gem_context *ctx;
 	enum intel_engine_id id;
 	struct igt_spinner spin;
+	intel_wakeref_t wakeref;
 	int err = -ENOMEM;
 
 	if (!HAS_LOGICAL_RING_CONTEXTS(i915))
 		return 0;
 
 	mutex_lock(&i915->drm.struct_mutex);
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
 	if (igt_spinner_init(&spin, i915))
 		goto err_unlock;
@@ -65,7 +70,7 @@ err_spin:
 	igt_spinner_fini(&spin);
 err_unlock:
 	igt_flush_test(i915, I915_WAIT_LOCKED);
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
 	return err;
 }
@@ -77,13 +82,14 @@ static int live_preempt(void *arg)
 	struct igt_spinner spin_hi, spin_lo;
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
 	int err = -ENOMEM;
 
 	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
 		return 0;
 
 	mutex_lock(&i915->drm.struct_mutex);
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
 	if (igt_spinner_init(&spin_hi, i915))
 		goto err_unlock;
@@ -158,7 +164,7 @@ err_spin_hi:
 	igt_spinner_fini(&spin_hi);
 err_unlock:
 	igt_flush_test(i915, I915_WAIT_LOCKED);
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
 	return err;
 }
@@ -171,13 +177,14 @@ static int live_late_preempt(void *arg)
 	struct intel_engine_cs *engine;
 	struct i915_sched_attr attr = {};
 	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
 	int err = -ENOMEM;
 
 	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
 		return 0;
 
 	mutex_lock(&i915->drm.struct_mutex);
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
 	if (igt_spinner_init(&spin_hi, i915))
 		goto err_unlock;
@@ -251,7 +258,7 @@ err_spin_hi:
 	igt_spinner_fini(&spin_hi);
 err_unlock:
 	igt_flush_test(i915, I915_WAIT_LOCKED);
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
 	return err;
 
@@ -263,6 +270,243 @@ err_wedged:
 	goto err_ctx_lo;
 }
 
+struct preempt_client {
+	struct igt_spinner spin;
+	struct i915_gem_context *ctx;
+};
+
+static int preempt_client_init(struct drm_i915_private *i915,
+			       struct preempt_client *c)
+{
+	c->ctx = kernel_context(i915);
+	if (!c->ctx)
+		return -ENOMEM;
+
+	if (igt_spinner_init(&c->spin, i915))
+		goto err_ctx;
+
+	return 0;
+
+err_ctx:
+	kernel_context_close(c->ctx);
+	return -ENOMEM;
+}
+
+static void preempt_client_fini(struct preempt_client *c)
+{
+	igt_spinner_fini(&c->spin);
+	kernel_context_close(c->ctx);
+}
+
+static int live_suppress_self_preempt(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *engine;
+	struct i915_sched_attr attr = {
+		.priority = I915_USER_PRIORITY(I915_PRIORITY_MAX)
+	};
+	struct preempt_client a, b;
+	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
+	int err = -ENOMEM;
+
+	/*
+	 * Verify that if a preemption request does not cause a change in
+	 * the current execution order, the preempt-to-idle injection is
+	 * skipped and that we do not accidentally apply it after the CS
+	 * completion event.
+	 */
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
+		return 0;
+
+	if (USES_GUC_SUBMISSION(i915))
+		return 0; /* presume black box */
+
+	mutex_lock(&i915->drm.struct_mutex);
+	wakeref = intel_runtime_pm_get(i915);
+
+	if (preempt_client_init(i915, &a))
+		goto err_unlock;
+	if (preempt_client_init(i915, &b))
+		goto err_client_a;
+
+	for_each_engine(engine, i915, id) {
+		struct i915_request *rq_a, *rq_b;
+		int depth;
+
+		engine->execlists.preempt_hang.count = 0;
+
+		rq_a = igt_spinner_create_request(&a.spin,
+						  a.ctx, engine,
+						  MI_NOOP);
+		if (IS_ERR(rq_a)) {
+			err = PTR_ERR(rq_a);
+			goto err_client_b;
+		}
+
+		i915_request_add(rq_a);
+		if (!igt_wait_for_spinner(&a.spin, rq_a)) {
+			pr_err("First client failed to start\n");
+			goto err_wedged;
+		}
+
+		for (depth = 0; depth < 8; depth++) {
+			rq_b = igt_spinner_create_request(&b.spin,
+							  b.ctx, engine,
+							  MI_NOOP);
+			if (IS_ERR(rq_b)) {
+				err = PTR_ERR(rq_b);
+				goto err_client_b;
+			}
+			i915_request_add(rq_b);
+
+			GEM_BUG_ON(i915_request_completed(rq_a));
+			engine->schedule(rq_a, &attr);
+			igt_spinner_end(&a.spin);
+
+			if (!igt_wait_for_spinner(&b.spin, rq_b)) {
+				pr_err("Second client failed to start\n");
+				goto err_wedged;
+			}
+
+			swap(a, b);
+			rq_a = rq_b;
+		}
+		igt_spinner_end(&a.spin);
+
+		if (engine->execlists.preempt_hang.count) {
+			pr_err("Preemption recorded x%d, depth %d; should have been suppressed!\n",
+			       engine->execlists.preempt_hang.count,
+			       depth);
+			err = -EINVAL;
+			goto err_client_b;
+		}
+
+		if (igt_flush_test(i915, I915_WAIT_LOCKED))
+			goto err_wedged;
+	}
+
+	err = 0;
+err_client_b:
+	preempt_client_fini(&b);
+err_client_a:
+	preempt_client_fini(&a);
+err_unlock:
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+	intel_runtime_pm_put(i915, wakeref);
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+
+err_wedged:
+	igt_spinner_end(&b.spin);
+	igt_spinner_end(&a.spin);
+	i915_gem_set_wedged(i915);
+	err = -EIO;
+	goto err_client_b;
+}
+
+static int live_chain_preempt(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *engine;
+	struct preempt_client hi, lo;
+	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
+	int err = -ENOMEM;
+
+	/*
+	 * Build a chain AB...BA between two contexts (A, B) and request
+	 * preemption of the last request. It should then complete before
+	 * the previously submitted spinner in B.
+	 */
+
+	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
+		return 0;
+
+	mutex_lock(&i915->drm.struct_mutex);
+	wakeref = intel_runtime_pm_get(i915);
+
+	if (preempt_client_init(i915, &hi))
+		goto err_unlock;
+
+	if (preempt_client_init(i915, &lo))
+		goto err_client_hi;
+
+	for_each_engine(engine, i915, id) {
+		struct i915_sched_attr attr = {
+			.priority = I915_USER_PRIORITY(I915_PRIORITY_MAX),
+		};
+		int count, i;
+
+		for_each_prime_number_from(count, 1, 32) { /* must fit ring! */
+			struct i915_request *rq;
+
+			rq = igt_spinner_create_request(&hi.spin,
+							hi.ctx, engine,
+							MI_ARB_CHECK);
+			if (IS_ERR(rq))
+				goto err_wedged;
+			i915_request_add(rq);
+			if (!igt_wait_for_spinner(&hi.spin, rq))
+				goto err_wedged;
+
+			rq = igt_spinner_create_request(&lo.spin,
+							lo.ctx, engine,
+							MI_ARB_CHECK);
+			if (IS_ERR(rq))
+				goto err_wedged;
+			i915_request_add(rq);
+
+			for (i = 0; i < count; i++) {
+				rq = i915_request_alloc(engine, lo.ctx);
+				if (IS_ERR(rq))
+					goto err_wedged;
+				i915_request_add(rq);
+			}
+
+			rq = i915_request_alloc(engine, hi.ctx);
+			if (IS_ERR(rq))
+				goto err_wedged;
+			i915_request_add(rq);
+			engine->schedule(rq, &attr);
+
+			igt_spinner_end(&hi.spin);
+			if (i915_request_wait(rq, I915_WAIT_LOCKED, HZ / 5) < 0) {
+				struct drm_printer p =
+					drm_info_printer(i915->drm.dev);
+
+				pr_err("Failed to preempt over chain of %d\n",
+				       count);
+				intel_engine_dump(engine, &p,
+						  "%s\n", engine->name);
+				goto err_wedged;
+			}
+			igt_spinner_end(&lo.spin);
+		}
+	}
+
+	err = 0;
+err_client_lo:
+	preempt_client_fini(&lo);
+err_client_hi:
+	preempt_client_fini(&hi);
+err_unlock:
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+	intel_runtime_pm_put(i915, wakeref);
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+
+err_wedged:
+	igt_spinner_end(&hi.spin);
+	igt_spinner_end(&lo.spin);
+	i915_gem_set_wedged(i915);
+	err = -EIO;
+	goto err_client_lo;
+}
+
 static int live_preempt_hang(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
@@ -270,6 +514,7 @@ static int live_preempt_hang(void *arg)
 	struct igt_spinner spin_hi, spin_lo;
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
+	intel_wakeref_t wakeref;
 	int err = -ENOMEM;
 
 	if (!HAS_LOGICAL_RING_PREEMPTION(i915))
@@ -279,7 +524,7 @@ static int live_preempt_hang(void *arg)
 		return 0;
 
 	mutex_lock(&i915->drm.struct_mutex);
-	intel_runtime_pm_get(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
 	if (igt_spinner_init(&spin_hi, i915))
 		goto err_unlock;
@@ -374,7 +619,7 @@ err_spin_hi:
 	igt_spinner_fini(&spin_hi);
 err_unlock:
 	igt_flush_test(i915, I915_WAIT_LOCKED);
-	intel_runtime_pm_put(i915);
+	intel_runtime_pm_put(i915, wakeref);
 	mutex_unlock(&i915->drm.struct_mutex);
 	return err;
 }
@@ -522,7 +767,7 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags)
 
 	pr_info("Submitted %lu crescendo:%x requests across %d engines and %d contexts\n",
 		count, flags,
-		INTEL_INFO(smoke->i915)->num_rings, smoke->ncontext);
+		RUNTIME_INFO(smoke->i915)->num_rings, smoke->ncontext);
 	return 0;
 }
 
@@ -550,7 +795,7 @@ static int smoke_random(struct preempt_smoke *smoke, unsigned int flags)
 
 	pr_info("Submitted %lu random:%x requests across %d engines and %d contexts\n",
 		count, flags,
-		INTEL_INFO(smoke->i915)->num_rings, smoke->ncontext);
+		RUNTIME_INFO(smoke->i915)->num_rings, smoke->ncontext);
 	return 0;
 }
 
@@ -562,6 +807,7 @@ static int live_preempt_smoke(void *arg)
 		.ncontext = 1024,
 	};
 	const unsigned int phase[] = { 0, BATCH };
+	intel_wakeref_t wakeref;
 	int err = -ENOMEM;
 	u32 *cs;
 	int n;
@@ -576,7 +822,7 @@ static int live_preempt_smoke(void *arg)
 		return -ENOMEM;
 
 	mutex_lock(&smoke.i915->drm.struct_mutex);
-	intel_runtime_pm_get(smoke.i915);
+	wakeref = intel_runtime_pm_get(smoke.i915);
 
 	smoke.batch = i915_gem_object_create_internal(smoke.i915, PAGE_SIZE);
 	if (IS_ERR(smoke.batch)) {
@@ -627,7 +873,7 @@ err_ctx:
 err_batch:
 	i915_gem_object_put(smoke.batch);
 err_unlock:
-	intel_runtime_pm_put(smoke.i915);
+	intel_runtime_pm_put(smoke.i915, wakeref);
 	mutex_unlock(&smoke.i915->drm.struct_mutex);
 	kfree(smoke.contexts);
 
@@ -640,6 +886,8 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_sanitycheck),
 		SUBTEST(live_preempt),
 		SUBTEST(live_late_preempt),
+		SUBTEST(live_suppress_self_preempt),
+		SUBTEST(live_chain_preempt),
 		SUBTEST(live_preempt_hang),
 		SUBTEST(live_preempt_smoke),
 	};
diff --git a/drivers/gpu/drm/i915/selftests/intel_workarounds.c b/drivers/gpu/drm/i915/selftests/intel_workarounds.c
index 67017d5175b8..b15c4f26c593 100644
--- a/drivers/gpu/drm/i915/selftests/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/selftests/intel_workarounds.c
@@ -5,6 +5,7 @@
  */
 
 #include "../i915_selftest.h"
+#include "../i915_reset.h"
 
 #include "igt_flush_test.h"
 #include "igt_reset.h"
@@ -12,13 +13,59 @@
 #include "igt_wedge_me.h"
 #include "mock_context.h"
 
+#define REF_NAME_MAX (INTEL_ENGINE_CS_MAX_NAME + 4)
+struct wa_lists {
+	struct i915_wa_list gt_wa_list;
+	struct {
+		char name[REF_NAME_MAX];
+		struct i915_wa_list wa_list;
+	} engine[I915_NUM_ENGINES];
+};
+
+static void
+reference_lists_init(struct drm_i915_private *i915, struct wa_lists *lists)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	memset(lists, 0, sizeof(*lists));
+
+	wa_init_start(&lists->gt_wa_list, "GT_REF");
+	gt_init_workarounds(i915, &lists->gt_wa_list);
+	wa_init_finish(&lists->gt_wa_list);
+
+	for_each_engine(engine, i915, id) {
+		struct i915_wa_list *wal = &lists->engine[id].wa_list;
+		char *name = lists->engine[id].name;
+
+		snprintf(name, REF_NAME_MAX, "%s_REF", engine->name);
+
+		wa_init_start(wal, name);
+		engine_init_workarounds(engine, wal);
+		wa_init_finish(wal);
+	}
+}
+
+static void
+reference_lists_fini(struct drm_i915_private *i915, struct wa_lists *lists)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	for_each_engine(engine, i915, id)
+		intel_wa_list_free(&lists->engine[id].wa_list);
+
+	intel_wa_list_free(&lists->gt_wa_list);
+}
+
 static struct drm_i915_gem_object *
 read_nonprivs(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
 {
+	const u32 base = engine->mmio_base;
 	struct drm_i915_gem_object *result;
+	intel_wakeref_t wakeref;
 	struct i915_request *rq;
 	struct i915_vma *vma;
-	const u32 base = engine->mmio_base;
 	u32 srm, *cs;
 	int err;
 	int i;
@@ -47,9 +94,9 @@ read_nonprivs(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
 	if (err)
 		goto err_obj;
 
-	intel_runtime_pm_get(engine->i915);
-	rq = i915_request_alloc(engine, ctx);
-	intel_runtime_pm_put(engine->i915);
+	rq = ERR_PTR(-ENODEV);
+	with_intel_runtime_pm(engine->i915, wakeref)
+		rq = i915_request_alloc(engine, ctx);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
 		goto err_pin;
@@ -167,7 +214,6 @@ out_put:
 
 static int do_device_reset(struct intel_engine_cs *engine)
 {
-	set_bit(I915_RESET_HANDOFF, &engine->i915->gpu_error.flags);
 	i915_reset(engine->i915, ENGINE_MASK(engine->id), "live_workarounds");
 	return 0;
 }
@@ -183,20 +229,22 @@ switch_to_scratch_context(struct intel_engine_cs *engine,
 {
 	struct i915_gem_context *ctx;
 	struct i915_request *rq;
+	intel_wakeref_t wakeref;
 	int err = 0;
 
 	ctx = kernel_context(engine->i915);
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
 
-	intel_runtime_pm_get(engine->i915);
-
-	if (spin)
-		rq = igt_spinner_create_request(spin, ctx, engine, MI_NOOP);
-	else
-		rq = i915_request_alloc(engine, ctx);
-
-	intel_runtime_pm_put(engine->i915);
+	rq = ERR_PTR(-ENODEV);
+	with_intel_runtime_pm(engine->i915, wakeref) {
+		if (spin)
+			rq = igt_spinner_create_request(spin,
+							ctx, engine,
+							MI_NOOP);
+		else
+			rq = i915_request_alloc(engine, ctx);
+	}
 
 	kernel_context_close(ctx);
 
@@ -228,6 +276,7 @@ static int check_whitelist_across_reset(struct intel_engine_cs *engine,
 	bool want_spin = reset == do_engine_reset;
 	struct i915_gem_context *ctx;
 	struct igt_spinner spin;
+	intel_wakeref_t wakeref;
 	int err;
 
 	pr_info("Checking %d whitelisted registers (RING_NONPRIV) [%s]\n",
@@ -253,9 +302,8 @@ static int check_whitelist_across_reset(struct intel_engine_cs *engine,
 	if (err)
 		goto out;
 
-	intel_runtime_pm_get(i915);
-	err = reset(engine);
-	intel_runtime_pm_put(i915);
+	with_intel_runtime_pm(i915, wakeref)
+		err = reset(engine);
 
 	if (want_spin) {
 		igt_spinner_end(&spin);
@@ -326,16 +374,17 @@ out:
 	return err;
 }
 
-static bool verify_gt_engine_wa(struct drm_i915_private *i915, const char *str)
+static bool verify_gt_engine_wa(struct drm_i915_private *i915,
+				struct wa_lists *lists, const char *str)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 	bool ok = true;
 
-	ok &= intel_gt_verify_workarounds(i915, str);
+	ok &= wa_list_verify(i915, &lists->gt_wa_list, str);
 
 	for_each_engine(engine, i915, id)
-		ok &= intel_engine_verify_workarounds(engine, str);
+		ok &= wa_list_verify(i915, &lists->engine[id].wa_list, str);
 
 	return ok;
 }
@@ -344,7 +393,8 @@ static int
 live_gpu_reset_gt_engine_workarounds(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
-	struct i915_gpu_error *error = &i915->gpu_error;
+	intel_wakeref_t wakeref;
+	struct wa_lists lists;
 	bool ok;
 
 	if (!intel_has_gpu_reset(i915))
@@ -353,19 +403,21 @@ live_gpu_reset_gt_engine_workarounds(void *arg)
 	pr_info("Verifying after GPU reset...\n");
 
 	igt_global_reset_lock(i915);
+	wakeref = intel_runtime_pm_get(i915);
 
-	ok = verify_gt_engine_wa(i915, "before reset");
+	reference_lists_init(i915, &lists);
+
+	ok = verify_gt_engine_wa(i915, &lists, "before reset");
 	if (!ok)
 		goto out;
 
-	intel_runtime_pm_get(i915);
-	set_bit(I915_RESET_HANDOFF, &error->flags);
 	i915_reset(i915, ALL_ENGINES, "live_workarounds");
-	intel_runtime_pm_put(i915);
 
-	ok = verify_gt_engine_wa(i915, "after reset");
+	ok = verify_gt_engine_wa(i915, &lists, "after reset");
 
 out:
+	reference_lists_fini(i915, &lists);
+	intel_runtime_pm_put(i915, wakeref);
 	igt_global_reset_unlock(i915);
 
 	return ok ? 0 : -ESRCH;
@@ -380,6 +432,8 @@ live_engine_reset_gt_engine_workarounds(void *arg)
 	struct igt_spinner spin;
 	enum intel_engine_id id;
 	struct i915_request *rq;
+	intel_wakeref_t wakeref;
+	struct wa_lists lists;
 	int ret = 0;
 
 	if (!intel_has_reset_engine(i915))
@@ -390,23 +444,24 @@ live_engine_reset_gt_engine_workarounds(void *arg)
 		return PTR_ERR(ctx);
 
 	igt_global_reset_lock(i915);
+	wakeref = intel_runtime_pm_get(i915);
+
+	reference_lists_init(i915, &lists);
 
 	for_each_engine(engine, i915, id) {
 		bool ok;
 
 		pr_info("Verifying after %s reset...\n", engine->name);
 
-		ok = verify_gt_engine_wa(i915, "before reset");
+		ok = verify_gt_engine_wa(i915, &lists, "before reset");
 		if (!ok) {
 			ret = -ESRCH;
 			goto err;
 		}
 
-		intel_runtime_pm_get(i915);
 		i915_reset_engine(engine, "live_workarounds");
-		intel_runtime_pm_put(i915);
 
-		ok = verify_gt_engine_wa(i915, "after idle reset");
+		ok = verify_gt_engine_wa(i915, &lists, "after idle reset");
 		if (!ok) {
 			ret = -ESRCH;
 			goto err;
@@ -416,13 +471,10 @@ live_engine_reset_gt_engine_workarounds(void *arg)
 		if (ret)
 			goto err;
 
-		intel_runtime_pm_get(i915);
-
 		rq = igt_spinner_create_request(&spin, ctx, engine, MI_NOOP);
 		if (IS_ERR(rq)) {
 			ret = PTR_ERR(rq);
 			igt_spinner_fini(&spin);
-			intel_runtime_pm_put(i915);
 			goto err;
 		}
 
@@ -431,19 +483,16 @@ live_engine_reset_gt_engine_workarounds(void *arg)
 		if (!igt_wait_for_spinner(&spin, rq)) {
 			pr_err("Spinner failed to start\n");
 			igt_spinner_fini(&spin);
-			intel_runtime_pm_put(i915);
 			ret = -ETIMEDOUT;
 			goto err;
 		}
 
 		i915_reset_engine(engine, "live_workarounds");
 
-		intel_runtime_pm_put(i915);
-
 		igt_spinner_end(&spin);
 		igt_spinner_fini(&spin);
 
-		ok = verify_gt_engine_wa(i915, "after busy reset");
+		ok = verify_gt_engine_wa(i915, &lists, "after busy reset");
 		if (!ok) {
 			ret = -ESRCH;
 			goto err;
@@ -451,6 +500,8 @@ live_engine_reset_gt_engine_workarounds(void *arg)
 	}
 
 err:
+	reference_lists_fini(i915, &lists);
+	intel_runtime_pm_put(i915, wakeref);
 	igt_global_reset_unlock(i915);
 	kernel_context_close(ctx);
 
diff --git a/drivers/gpu/drm/i915/selftests/lib_sw_fence.c b/drivers/gpu/drm/i915/selftests/lib_sw_fence.c
index b26f07b55d86..2bfa72c1654b 100644
--- a/drivers/gpu/drm/i915/selftests/lib_sw_fence.c
+++ b/drivers/gpu/drm/i915/selftests/lib_sw_fence.c
@@ -76,3 +76,57 @@ void timed_fence_fini(struct timed_fence *tf)
 	destroy_timer_on_stack(&tf->timer);
 	i915_sw_fence_fini(&tf->fence);
 }
+
+struct heap_fence {
+	struct i915_sw_fence fence;
+	union {
+		struct kref ref;
+		struct rcu_head rcu;
+	};
+};
+
+static int __i915_sw_fence_call
+heap_fence_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
+{
+	struct heap_fence *h = container_of(fence, typeof(*h), fence);
+
+	switch (state) {
+	case FENCE_COMPLETE:
+		break;
+
+	case FENCE_FREE:
+		heap_fence_put(&h->fence);
+	}
+
+	return NOTIFY_DONE;
+}
+
+struct i915_sw_fence *heap_fence_create(gfp_t gfp)
+{
+	struct heap_fence *h;
+
+	h = kmalloc(sizeof(*h), gfp);
+	if (!h)
+		return NULL;
+
+	i915_sw_fence_init(&h->fence, heap_fence_notify);
+	refcount_set(&h->ref.refcount, 2);
+
+	return &h->fence;
+}
+
+static void heap_fence_release(struct kref *ref)
+{
+	struct heap_fence *h = container_of(ref, typeof(*h), ref);
+
+	i915_sw_fence_fini(&h->fence);
+
+	kfree_rcu(h, rcu);
+}
+
+void heap_fence_put(struct i915_sw_fence *fence)
+{
+	struct heap_fence *h = container_of(fence, typeof(*h), fence);
+
+	kref_put(&h->ref, heap_fence_release);
+}
diff --git a/drivers/gpu/drm/i915/selftests/lib_sw_fence.h b/drivers/gpu/drm/i915/selftests/lib_sw_fence.h
index 474aafb92ae1..1f9927e10f3a 100644
--- a/drivers/gpu/drm/i915/selftests/lib_sw_fence.h
+++ b/drivers/gpu/drm/i915/selftests/lib_sw_fence.h
@@ -39,4 +39,7 @@ struct timed_fence {
 void timed_fence_init(struct timed_fence *tf, unsigned long expires);
 void timed_fence_fini(struct timed_fence *tf);
 
+struct i915_sw_fence *heap_fence_create(gfp_t gfp);
+void heap_fence_put(struct i915_sw_fence *fence);
+
 #endif /* _LIB_SW_FENCE_H_ */
diff --git a/drivers/gpu/drm/i915/selftests/mock_context.c b/drivers/gpu/drm/i915/selftests/mock_context.c
index d937bdff26f9..b646cdcdd602 100644
--- a/drivers/gpu/drm/i915/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/selftests/mock_context.c
@@ -45,11 +45,8 @@ mock_context(struct drm_i915_private *i915,
 	INIT_LIST_HEAD(&ctx->handles_list);
 	INIT_LIST_HEAD(&ctx->hw_id_link);
 
-	for (n = 0; n < ARRAY_SIZE(ctx->__engine); n++) {
-		struct intel_context *ce = &ctx->__engine[n];
-
-		ce->gem_context = ctx;
-	}
+	for (n = 0; n < ARRAY_SIZE(ctx->__engine); n++)
+		intel_context_init(&ctx->__engine[n], ctx, i915->engine[n]);
 
 	ret = i915_gem_context_pin_hw_id(ctx);
 	if (ret < 0)
diff --git a/drivers/gpu/drm/i915/selftests/mock_engine.c b/drivers/gpu/drm/i915/selftests/mock_engine.c
index d0c44c18db42..08f0cab02e0f 100644
--- a/drivers/gpu/drm/i915/selftests/mock_engine.c
+++ b/drivers/gpu/drm/i915/selftests/mock_engine.c
@@ -30,6 +30,52 @@ struct mock_ring {
 	struct i915_timeline timeline;
 };
 
+static void mock_timeline_pin(struct i915_timeline *tl)
+{
+	tl->pin_count++;
+}
+
+static void mock_timeline_unpin(struct i915_timeline *tl)
+{
+	GEM_BUG_ON(!tl->pin_count);
+	tl->pin_count--;
+}
+
+static struct intel_ring *mock_ring(struct intel_engine_cs *engine)
+{
+	const unsigned long sz = PAGE_SIZE / 2;
+	struct mock_ring *ring;
+
+	ring = kzalloc(sizeof(*ring) + sz, GFP_KERNEL);
+	if (!ring)
+		return NULL;
+
+	if (i915_timeline_init(engine->i915,
+			       &ring->timeline, engine->name,
+			       NULL)) {
+		kfree(ring);
+		return NULL;
+	}
+
+	ring->base.size = sz;
+	ring->base.effective_size = sz;
+	ring->base.vaddr = (void *)(ring + 1);
+	ring->base.timeline = &ring->timeline;
+
+	INIT_LIST_HEAD(&ring->base.request_list);
+	intel_ring_update_space(&ring->base);
+
+	return &ring->base;
+}
+
+static void mock_ring_free(struct intel_ring *base)
+{
+	struct mock_ring *ring = container_of(base, typeof(*ring), base);
+
+	i915_timeline_fini(&ring->timeline);
+	kfree(ring);
+}
+
 static struct mock_request *first_request(struct mock_engine *engine)
 {
 	return list_first_entry_or_null(&engine->hw_queue,
@@ -37,24 +83,29 @@ static struct mock_request *first_request(struct mock_engine *engine)
 					link);
 }
 
-static void advance(struct mock_engine *engine,
-		    struct mock_request *request)
+static void advance(struct mock_request *request)
 {
 	list_del_init(&request->link);
-	mock_seqno_advance(&engine->base, request->base.global_seqno);
+	intel_engine_write_global_seqno(request->base.engine,
+					request->base.global_seqno);
+	i915_request_mark_complete(&request->base);
+	GEM_BUG_ON(!i915_request_completed(&request->base));
+
+	intel_engine_queue_breadcrumbs(request->base.engine);
 }
 
 static void hw_delay_complete(struct timer_list *t)
 {
 	struct mock_engine *engine = from_timer(engine, t, hw_delay);
 	struct mock_request *request;
+	unsigned long flags;
 
-	spin_lock(&engine->hw_lock);
+	spin_lock_irqsave(&engine->hw_lock, flags);
 
 	/* Timer fired, first request is complete */
 	request = first_request(engine);
 	if (request)
-		advance(engine, request);
+		advance(request);
 
 	/*
 	 * Also immediately signal any subsequent 0-delay requests, but
@@ -66,20 +117,24 @@ static void hw_delay_complete(struct timer_list *t)
 			break;
 		}
 
-		advance(engine, request);
+		advance(request);
 	}
 
-	spin_unlock(&engine->hw_lock);
+	spin_unlock_irqrestore(&engine->hw_lock, flags);
 }
 
 static void mock_context_unpin(struct intel_context *ce)
 {
+	mock_timeline_unpin(ce->ring->timeline);
 	i915_gem_context_put(ce->gem_context);
 }
 
 static void mock_context_destroy(struct intel_context *ce)
 {
 	GEM_BUG_ON(ce->pin_count);
+
+	if (ce->ring)
+		mock_ring_free(ce->ring);
 }
 
 static const struct intel_context_ops mock_context_ops = {
@@ -92,14 +147,26 @@ mock_context_pin(struct intel_engine_cs *engine,
 		 struct i915_gem_context *ctx)
 {
 	struct intel_context *ce = to_intel_context(ctx, engine);
+	int err = -ENOMEM;
 
-	if (!ce->pin_count++) {
-		i915_gem_context_get(ctx);
-		ce->ring = engine->buffer;
-		ce->ops = &mock_context_ops;
+	if (ce->pin_count++)
+		return ce;
+
+	if (!ce->ring) {
+		ce->ring = mock_ring(engine);
+		if (!ce->ring)
+			goto err;
 	}
 
+	mock_timeline_pin(ce->ring->timeline);
+
+	ce->ops = &mock_context_ops;
+	i915_gem_context_get(ctx);
 	return ce;
+
+err:
+	ce->pin_count = 0;
+	return ERR_PTR(err);
 }
 
 static int mock_request_alloc(struct i915_request *request)
@@ -118,9 +185,9 @@ static int mock_emit_flush(struct i915_request *request,
 	return 0;
 }
 
-static void mock_emit_breadcrumb(struct i915_request *request,
-				 u32 *flags)
+static u32 *mock_emit_breadcrumb(struct i915_request *request, u32 *cs)
 {
+	return cs;
 }
 
 static void mock_submit_request(struct i915_request *request)
@@ -128,51 +195,20 @@ static void mock_submit_request(struct i915_request *request)
 	struct mock_request *mock = container_of(request, typeof(*mock), base);
 	struct mock_engine *engine =
 		container_of(request->engine, typeof(*engine), base);
+	unsigned long flags;
 
 	i915_request_submit(request);
 	GEM_BUG_ON(!request->global_seqno);
 
-	spin_lock_irq(&engine->hw_lock);
+	spin_lock_irqsave(&engine->hw_lock, flags);
 	list_add_tail(&mock->link, &engine->hw_queue);
 	if (mock->link.prev == &engine->hw_queue) {
 		if (mock->delay)
 			mod_timer(&engine->hw_delay, jiffies + mock->delay);
 		else
-			advance(engine, mock);
+			advance(mock);
 	}
-	spin_unlock_irq(&engine->hw_lock);
-}
-
-static struct intel_ring *mock_ring(struct intel_engine_cs *engine)
-{
-	const unsigned long sz = PAGE_SIZE / 2;
-	struct mock_ring *ring;
-
-	BUILD_BUG_ON(MIN_SPACE_FOR_ADD_REQUEST > sz);
-
-	ring = kzalloc(sizeof(*ring) + sz, GFP_KERNEL);
-	if (!ring)
-		return NULL;
-
-	i915_timeline_init(engine->i915, &ring->timeline, engine->name);
-
-	ring->base.size = sz;
-	ring->base.effective_size = sz;
-	ring->base.vaddr = (void *)(ring + 1);
-	ring->base.timeline = &ring->timeline;
-
-	INIT_LIST_HEAD(&ring->base.request_list);
-	intel_ring_update_space(&ring->base);
-
-	return &ring->base;
-}
-
-static void mock_ring_free(struct intel_ring *base)
-{
-	struct mock_ring *ring = container_of(base, typeof(*ring), base);
-
-	i915_timeline_fini(&ring->timeline);
-	kfree(ring);
+	spin_unlock_irqrestore(&engine->hw_lock, flags);
 }
 
 struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
@@ -191,39 +227,37 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
 	engine->base.i915 = i915;
 	snprintf(engine->base.name, sizeof(engine->base.name), "%s", name);
 	engine->base.id = id;
-	engine->base.status_page.page_addr = (void *)(engine + 1);
+	engine->base.status_page.addr = (void *)(engine + 1);
 
 	engine->base.context_pin = mock_context_pin;
 	engine->base.request_alloc = mock_request_alloc;
 	engine->base.emit_flush = mock_emit_flush;
-	engine->base.emit_breadcrumb = mock_emit_breadcrumb;
+	engine->base.emit_fini_breadcrumb = mock_emit_breadcrumb;
 	engine->base.submit_request = mock_submit_request;
 
-	i915_timeline_init(i915, &engine->base.timeline, engine->base.name);
+	if (i915_timeline_init(i915,
+			       &engine->base.timeline,
+			       engine->base.name,
+			       NULL))
+		goto err_free;
 	i915_timeline_set_subclass(&engine->base.timeline, TIMELINE_ENGINE);
 
 	intel_engine_init_breadcrumbs(&engine->base);
-	engine->base.breadcrumbs.mock = true; /* prevent touching HW for irqs */
 
 	/* fake hw queue */
 	spin_lock_init(&engine->hw_lock);
 	timer_setup(&engine->hw_delay, hw_delay_complete, 0);
 	INIT_LIST_HEAD(&engine->hw_queue);
 
-	engine->base.buffer = mock_ring(&engine->base);
-	if (!engine->base.buffer)
-		goto err_breadcrumbs;
-
 	if (IS_ERR(intel_context_pin(i915->kernel_context, &engine->base)))
-		goto err_ring;
+		goto err_breadcrumbs;
 
 	return &engine->base;
 
-err_ring:
-	mock_ring_free(engine->base.buffer);
 err_breadcrumbs:
 	intel_engine_fini_breadcrumbs(&engine->base);
 	i915_timeline_fini(&engine->base.timeline);
+err_free:
 	kfree(engine);
 	return NULL;
 }
@@ -237,16 +271,14 @@ void mock_engine_flush(struct intel_engine_cs *engine)
 	del_timer_sync(&mock->hw_delay);
 
 	spin_lock_irq(&mock->hw_lock);
-	list_for_each_entry_safe(request, rn, &mock->hw_queue, link) {
-		list_del_init(&request->link);
-		mock_seqno_advance(&mock->base, request->base.global_seqno);
-	}
+	list_for_each_entry_safe(request, rn, &mock->hw_queue, link)
+		advance(request);
 	spin_unlock_irq(&mock->hw_lock);
 }
 
 void mock_engine_reset(struct intel_engine_cs *engine)
 {
-	intel_write_status_page(engine, I915_GEM_HWS_INDEX, 0);
+	intel_engine_write_global_seqno(engine, 0);
 }
 
 void mock_engine_free(struct intel_engine_cs *engine)
@@ -263,8 +295,6 @@ void mock_engine_free(struct intel_engine_cs *engine)
 
 	__intel_context_unpin(engine->i915->kernel_context, engine);
 
-	mock_ring_free(engine->buffer);
-
 	intel_engine_fini_breadcrumbs(engine);
 	i915_timeline_fini(&engine->timeline);
 
diff --git a/drivers/gpu/drm/i915/selftests/mock_engine.h b/drivers/gpu/drm/i915/selftests/mock_engine.h
index 133d0c21790d..b9cc3a245f16 100644
--- a/drivers/gpu/drm/i915/selftests/mock_engine.h
+++ b/drivers/gpu/drm/i915/selftests/mock_engine.h
@@ -46,10 +46,4 @@ void mock_engine_flush(struct intel_engine_cs *engine);
 void mock_engine_reset(struct intel_engine_cs *engine);
 void mock_engine_free(struct intel_engine_cs *engine);
 
-static inline void mock_seqno_advance(struct intel_engine_cs *engine, u32 seqno)
-{
-	intel_write_status_page(engine, I915_GEM_HWS_INDEX, seqno);
-	intel_engine_wakeup(engine);
-}
-
 #endif /* !__MOCK_ENGINE_H__ */
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 43ed8b28aeaa..14ae46fda49f 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -58,8 +58,8 @@ static void mock_device_release(struct drm_device *dev)
 	i915_gem_contexts_lost(i915);
 	mutex_unlock(&i915->drm.struct_mutex);
 
-	cancel_delayed_work_sync(&i915->gt.retire_work);
-	cancel_delayed_work_sync(&i915->gt.idle_work);
+	drain_delayed_work(&i915->gt.retire_work);
+	drain_delayed_work(&i915->gt.idle_work);
 	i915_gem_drain_workqueue(i915);
 
 	mutex_lock(&i915->drm.struct_mutex);
@@ -68,13 +68,14 @@ static void mock_device_release(struct drm_device *dev)
 	i915_gem_contexts_fini(i915);
 	mutex_unlock(&i915->drm.struct_mutex);
 
+	i915_timelines_fini(i915);
+
 	drain_workqueue(i915->wq);
 	i915_gem_drain_freed_objects(i915);
 
 	mutex_lock(&i915->drm.struct_mutex);
-	mock_fini_ggtt(i915);
+	mock_fini_ggtt(&i915->ggtt);
 	mutex_unlock(&i915->drm.struct_mutex);
-	WARN_ON(!list_empty(&i915->gt.timelines));
 
 	destroy_workqueue(i915->wq);
 
@@ -147,22 +148,24 @@ struct drm_i915_private *mock_gem_device(void)
 	pdev->class = PCI_BASE_CLASS_DISPLAY << 16;
 	pdev->dev.release = release_dev;
 	dev_set_name(&pdev->dev, "mock");
-	dma_coerce_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
+	dma_coerce_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
 
 #if IS_ENABLED(CONFIG_IOMMU_API) && defined(CONFIG_INTEL_IOMMU)
 	/* hack to disable iommu for the fake device; force identity mapping */
 	pdev->dev.archdata.iommu = (void *)-1;
 #endif
 
+	i915 = (struct drm_i915_private *)(pdev + 1);
+	pci_set_drvdata(pdev, i915);
+
+	intel_runtime_pm_init_early(i915);
+
 	dev_pm_domain_set(&pdev->dev, &pm_domain);
 	pm_runtime_enable(&pdev->dev);
 	pm_runtime_dont_use_autosuspend(&pdev->dev);
 	if (pm_runtime_enabled(&pdev->dev))
 		WARN_ON(pm_runtime_get_sync(&pdev->dev));
 
-	i915 = (struct drm_i915_private *)(pdev + 1);
-	pci_set_drvdata(pdev, i915);
-
 	err = drm_dev_init(&i915->drm, &mock_driver, &pdev->dev);
 	if (err) {
 		pr_err("Failed to initialise mock GEM device: err=%d\n", err);
@@ -186,6 +189,7 @@ struct drm_i915_private *mock_gem_device(void)
 
 	init_waitqueue_head(&i915->gpu_error.wait_queue);
 	init_waitqueue_head(&i915->gpu_error.reset_queue);
+	mutex_init(&i915->gpu_error.wedge_mutex);
 
 	i915->wq = alloc_ordered_workqueue("mock", 0);
 	if (!i915->wq)
@@ -223,13 +227,14 @@ struct drm_i915_private *mock_gem_device(void)
 	if (!i915->priorities)
 		goto err_dependencies;
 
-	INIT_LIST_HEAD(&i915->gt.timelines);
+	i915_timelines_init(i915);
+
 	INIT_LIST_HEAD(&i915->gt.active_rings);
 	INIT_LIST_HEAD(&i915->gt.closed_vma);
 
 	mutex_lock(&i915->drm.struct_mutex);
 
-	mock_init_ggtt(i915);
+	mock_init_ggtt(i915, &i915->ggtt);
 
 	mkwrite_device_info(i915)->ring_mask = BIT(0);
 	i915->kernel_context = mock_context(i915, NULL);
@@ -250,6 +255,7 @@ err_context:
 	i915_gem_contexts_fini(i915);
 err_unlock:
 	mutex_unlock(&i915->drm.struct_mutex);
+	i915_timelines_fini(i915);
 	kmem_cache_destroy(i915->priorities);
 err_dependencies:
 	kmem_cache_destroy(i915->dependencies);
diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
index 6ae418c76015..cd83929fde8e 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
@@ -70,7 +70,7 @@ mock_ppgtt(struct drm_i915_private *i915,
 	ppgtt->vm.total = round_down(U64_MAX, PAGE_SIZE);
 	ppgtt->vm.file = ERR_PTR(-ENODEV);
 
-	i915_address_space_init(&ppgtt->vm, i915);
+	i915_address_space_init(&ppgtt->vm, VM_CLASS_PPGTT);
 
 	ppgtt->vm.clear_range = nop_clear_range;
 	ppgtt->vm.insert_page = mock_insert_page;
@@ -97,11 +97,12 @@ static void mock_unbind_ggtt(struct i915_vma *vma)
 {
 }
 
-void mock_init_ggtt(struct drm_i915_private *i915)
+void mock_init_ggtt(struct drm_i915_private *i915, struct i915_ggtt *ggtt)
 {
-	struct i915_ggtt *ggtt = &i915->ggtt;
+	memset(ggtt, 0, sizeof(*ggtt));
 
 	ggtt->vm.i915 = i915;
+	ggtt->vm.is_ggtt = true;
 
 	ggtt->gmadr = (struct resource) DEFINE_RES_MEM(0, 2048 * PAGE_SIZE);
 	ggtt->mappable_end = resource_size(&ggtt->gmadr);
@@ -117,14 +118,10 @@ void mock_init_ggtt(struct drm_i915_private *i915)
 	ggtt->vm.vma_ops.set_pages   = ggtt_set_pages;
 	ggtt->vm.vma_ops.clear_pages = clear_pages;
 
-	i915_address_space_init(&ggtt->vm, i915);
-
-	ggtt->vm.is_ggtt = true;
+	i915_address_space_init(&ggtt->vm, VM_CLASS_GGTT);
 }
 
-void mock_fini_ggtt(struct drm_i915_private *i915)
+void mock_fini_ggtt(struct i915_ggtt *ggtt)
 {
-	struct i915_ggtt *ggtt = &i915->ggtt;
-
 	i915_address_space_fini(&ggtt->vm);
 }
diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.h b/drivers/gpu/drm/i915/selftests/mock_gtt.h
index 9a0a833bb545..40d544bde1d5 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gtt.h
+++ b/drivers/gpu/drm/i915/selftests/mock_gtt.h
@@ -25,8 +25,8 @@
 #ifndef __MOCK_GTT_H
 #define __MOCK_GTT_H
 
-void mock_init_ggtt(struct drm_i915_private *i915);
-void mock_fini_ggtt(struct drm_i915_private *i915);
+void mock_init_ggtt(struct drm_i915_private *i915, struct i915_ggtt *ggtt);
+void mock_fini_ggtt(struct i915_ggtt *ggtt);
 
 struct i915_hw_ppgtt *
 mock_ppgtt(struct drm_i915_private *i915,
diff --git a/drivers/gpu/drm/i915/selftests/mock_timeline.c b/drivers/gpu/drm/i915/selftests/mock_timeline.c
index dcf3b16f5a07..d2de9ece2118 100644
--- a/drivers/gpu/drm/i915/selftests/mock_timeline.c
+++ b/drivers/gpu/drm/i915/selftests/mock_timeline.c
@@ -10,11 +10,13 @@
 
 void mock_timeline_init(struct i915_timeline *timeline, u64 context)
 {
+	timeline->i915 = NULL;
 	timeline->fence_context = context;
 
 	spin_lock_init(&timeline->lock);
 
-	init_request_active(&timeline->last_request, NULL);
+	INIT_ACTIVE_REQUEST(&timeline->barrier);
+	INIT_ACTIVE_REQUEST(&timeline->last_request);
 	INIT_LIST_HEAD(&timeline->requests);
 
 	i915_syncmap_init(&timeline->sync);
@@ -24,5 +26,5 @@ void mock_timeline_init(struct i915_timeline *timeline, u64 context)
 
 void mock_timeline_fini(struct i915_timeline *timeline)
 {
-	i915_timeline_fini(timeline);
+	i915_syncmap_free(&timeline->sync);
 }
diff --git a/drivers/gpu/drm/i915/vlv_dsi.c b/drivers/gpu/drm/i915/vlv_dsi.c
index 361e962a7969..6403728fe778 100644
--- a/drivers/gpu/drm/i915/vlv_dsi.c
+++ b/drivers/gpu/drm/i915/vlv_dsi.c
@@ -23,7 +23,6 @@
  * Author: Jani Nikula <jani.nikula@intel.com>
  */
 
-#include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
 #include <drm/drm_edid.h>
@@ -257,9 +256,9 @@ static void band_gap_reset(struct drm_i915_private *dev_priv)
 	mutex_unlock(&dev_priv->sb_lock);
 }
 
-static bool intel_dsi_compute_config(struct intel_encoder *encoder,
-				     struct intel_crtc_state *pipe_config,
-				     struct drm_connector_state *conn_state)
+static int intel_dsi_compute_config(struct intel_encoder *encoder,
+				    struct intel_crtc_state *pipe_config,
+				    struct drm_connector_state *conn_state)
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	struct intel_dsi *intel_dsi = container_of(encoder, struct intel_dsi,
@@ -276,7 +275,7 @@ static bool intel_dsi_compute_config(struct intel_encoder *encoder,
 	if (fixed_mode) {
 		intel_fixed_panel_mode(fixed_mode, adjusted_mode);
 
-		if (HAS_GMCH_DISPLAY(dev_priv))
+		if (HAS_GMCH(dev_priv))
 			intel_gmch_panel_fitting(crtc, pipe_config,
 						 conn_state->scaling_mode);
 		else
@@ -285,11 +284,16 @@ static bool intel_dsi_compute_config(struct intel_encoder *encoder,
 	}
 
 	if (adjusted_mode->flags & DRM_MODE_FLAG_DBLSCAN)
-		return false;
+		return -EINVAL;
 
 	/* DSI uses short packets for sync events, so clear mode flags for DSI */
 	adjusted_mode->flags = 0;
 
+	if (intel_dsi->pixel_format == MIPI_DSI_FMT_RGB888)
+		pipe_config->pipe_bpp = 24;
+	else
+		pipe_config->pipe_bpp = 18;
+
 	if (IS_GEN9_LP(dev_priv)) {
 		/* Enable Frame time stamp based scanline reporting */
 		adjusted_mode->private_flags |=
@@ -303,16 +307,16 @@ static bool intel_dsi_compute_config(struct intel_encoder *encoder,
 
 		ret = bxt_dsi_pll_compute(encoder, pipe_config);
 		if (ret)
-			return false;
+			return -EINVAL;
 	} else {
 		ret = vlv_dsi_pll_compute(encoder, pipe_config);
 		if (ret)
-			return false;
+			return -EINVAL;
 	}
 
 	pipe_config->clock_set = true;
 
-	return true;
+	return 0;
 }
 
 static bool glk_dsi_enable_io(struct intel_encoder *encoder)
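
The compute_config hunks above convert the hook from a bool to the kernel's 0/-errno convention. A boolean can only say "failed"; a negative errno says why, and the caller can propagate it unchanged. A minimal user-space sketch of that convention (stand-in flag value is an assumption):

#include <errno.h>
#include <stdio.h>

static int compute_config(int mode_flags)
{
	if (mode_flags & 0x1)	/* stand-in for DRM_MODE_FLAG_DBLSCAN */
		return -EINVAL;	/* reject with a specific reason */
	return 0;
}

int main(void)
{
	int ret = compute_config(0x1);

	if (ret)
		fprintf(stderr, "compute_config failed: %d\n", ret);
	return ret ? 1 : 0;
}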
@@ -674,6 +678,10 @@ static void intel_dsi_port_enable(struct intel_encoder *encoder,
 					LANE_CONFIGURATION_DUAL_LINK_B :
 					LANE_CONFIGURATION_DUAL_LINK_A;
 		}
+
+		if (intel_dsi->pixel_format != MIPI_DSI_FMT_RGB888)
+			temp |= DITHERING_ENABLE;
+
 		/* assert ip_tg_enable signal */
 		I915_WRITE(port_ctrl, temp | DPI_ENABLE);
 		POSTING_READ(port_ctrl);
@@ -960,13 +968,15 @@ static bool intel_dsi_get_hw_state(struct intel_encoder *encoder,
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	struct intel_dsi *intel_dsi = enc_to_intel_dsi(&encoder->base);
+	intel_wakeref_t wakeref;
 	enum port port;
 	bool active = false;
 
 	DRM_DEBUG_KMS("\n");
 
-	if (!intel_display_power_get_if_enabled(dev_priv,
-						encoder->power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv,
+						     encoder->power_domain);
+	if (!wakeref)
 		return false;
 
 	/*
@@ -1022,7 +1032,7 @@ static bool intel_dsi_get_hw_state(struct intel_encoder *encoder,
 	}
 
 out_put_power:
-	intel_display_power_put(dev_priv, encoder->power_domain);
+	intel_display_power_put(dev_priv, encoder->power_domain, wakeref);
 
 	return active;
 }
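
The get_hw_state hunk above is part of the runtime-PM wakeref tracking work: the power-domain get now returns an opaque cookie that must be handed back to put, so a leaked reference can be attributed to the call site that took it. Illustration only (plain C, invented names modelling the pattern, not the i915 API):

#include <stdbool.h>
#include <stdio.h>

typedef unsigned long wakeref_t;

static int power_count;

static wakeref_t power_get_if_enabled(bool enabled)
{
	if (!enabled)
		return 0;	/* no reference taken, cookie is 0 */
	power_count++;
	/* record who took the reference as the cookie */
	return (wakeref_t)__builtin_return_address(0);
}

static void power_put(wakeref_t wakeref)
{
	if (!wakeref)
		return;
	power_count--;
}

int main(void)
{
	wakeref_t wf = power_get_if_enabled(true);

	if (wf) {
		/* ... access the hardware ... */
		power_put(wf);
	}
	return power_count;	/* non-zero here would mean a leak */
}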
@@ -1058,10 +1068,8 @@ static void bxt_dsi_get_pipe_config(struct intel_encoder *encoder,
 	}
 
 	fmt = I915_READ(MIPI_DSI_FUNC_PRG(port)) & VID_MODE_FORMAT_MASK;
-	pipe_config->pipe_bpp =
-			mipi_dsi_pixel_format_to_bpp(
-				pixel_format_from_register_bits(fmt));
-	bpp = pipe_config->pipe_bpp;
+	bpp = mipi_dsi_pixel_format_to_bpp(
+			pixel_format_from_register_bits(fmt));
 
 	/* Enable Frame time stamp based scanline reporting */
 	adjusted_mode->private_flags |=
@@ -1199,11 +1207,9 @@ static void intel_dsi_get_config(struct intel_encoder *encoder,
 
 	if (IS_GEN9_LP(dev_priv)) {
 		bxt_dsi_get_pipe_config(encoder, pipe_config);
-		pclk = bxt_dsi_get_pclk(encoder, pipe_config->pipe_bpp,
-					pipe_config);
+		pclk = bxt_dsi_get_pclk(encoder, pipe_config);
 	} else {
-		pclk = vlv_dsi_get_pclk(encoder, pipe_config->pipe_bpp,
-					pipe_config);
+		pclk = vlv_dsi_get_pclk(encoder, pipe_config);
 	}
 
 	if (pclk) {
@@ -1575,6 +1581,7 @@ vlv_dsi_get_hw_panel_orientation(struct intel_connector *connector)
 	enum drm_panel_orientation orientation;
 	struct intel_plane *plane;
 	struct intel_crtc *crtc;
+	intel_wakeref_t wakeref;
 	enum pipe pipe;
 	u32 val;
 
@@ -1585,7 +1592,8 @@ vlv_dsi_get_hw_panel_orientation(struct intel_connector *connector)
 	plane = to_intel_plane(crtc->base.primary);
 
 	power_domain = POWER_DOMAIN_PIPE(pipe);
-	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+	wakeref = intel_display_power_get_if_enabled(dev_priv, power_domain);
+	if (!wakeref)
 		return DRM_MODE_PANEL_ORIENTATION_UNKNOWN;
 
 	val = I915_READ(DSPCNTR(plane->i9xx_plane));
@@ -1597,7 +1605,7 @@ vlv_dsi_get_hw_panel_orientation(struct intel_connector *connector)
 	else
 		orientation = DRM_MODE_PANEL_ORIENTATION_NORMAL;
 
-	intel_display_power_put(dev_priv, power_domain);
+	intel_display_power_put(dev_priv, power_domain, wakeref);
 
 	return orientation;
 }
@@ -1625,7 +1633,7 @@ static void intel_dsi_add_properties(struct intel_connector *connector)
 		u32 allowed_scalers;
 
 		allowed_scalers = BIT(DRM_MODE_SCALE_ASPECT) | BIT(DRM_MODE_SCALE_FULLSCREEN);
-		if (!HAS_GMCH_DISPLAY(dev_priv))
+		if (!HAS_GMCH(dev_priv))
 			allowed_scalers |= BIT(DRM_MODE_SCALE_CENTER);
 
 		drm_connector_attach_scaling_mode_property(&connector->base,
@@ -1689,6 +1697,7 @@ void vlv_dsi_init(struct drm_i915_private *dev_priv)
 	intel_encoder->post_disable = intel_dsi_post_disable;
 	intel_encoder->get_hw_state = intel_dsi_get_hw_state;
 	intel_encoder->get_config = intel_dsi_get_config;
+	intel_encoder->update_pipe = intel_panel_update_backlight;
 
 	intel_connector->get_hw_state = intel_connector_get_hw_state;
 
diff --git a/drivers/gpu/drm/i915/vlv_dsi_pll.c b/drivers/gpu/drm/i915/vlv_dsi_pll.c
index a132a8037ecc..954d5a8c4fa7 100644
--- a/drivers/gpu/drm/i915/vlv_dsi_pll.c
+++ b/drivers/gpu/drm/i915/vlv_dsi_pll.c
@@ -252,20 +252,12 @@ void bxt_dsi_pll_disable(struct intel_encoder *encoder)
 		DRM_ERROR("Timeout waiting for PLL lock deassertion\n");
 }
 
-static void assert_bpp_mismatch(enum mipi_dsi_pixel_format fmt, int pipe_bpp)
-{
-	int bpp = mipi_dsi_pixel_format_to_bpp(fmt);
-
-	WARN(bpp != pipe_bpp,
-	     "bpp match assertion failure (expected %d, current %d)\n",
-	     bpp, pipe_bpp);
-}
-
-u32 vlv_dsi_get_pclk(struct intel_encoder *encoder, int pipe_bpp,
+u32 vlv_dsi_get_pclk(struct intel_encoder *encoder,
 		     struct intel_crtc_state *config)
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	struct intel_dsi *intel_dsi = enc_to_intel_dsi(&encoder->base);
+	int bpp = mipi_dsi_pixel_format_to_bpp(intel_dsi->pixel_format);
 	u32 dsi_clock, pclk;
 	u32 pll_ctl, pll_div;
 	u32 m = 0, p = 0, n;
@@ -319,15 +311,12 @@ u32 vlv_dsi_get_pclk(struct intel_encoder *encoder, int pipe_bpp,
 
 	dsi_clock = (m * refclk) / (p * n);
 
-	/* pixel_format and pipe_bpp should agree */
-	assert_bpp_mismatch(intel_dsi->pixel_format, pipe_bpp);
-
-	pclk = DIV_ROUND_CLOSEST(dsi_clock * intel_dsi->lane_count, pipe_bpp);
+	pclk = DIV_ROUND_CLOSEST(dsi_clock * intel_dsi->lane_count, bpp);
 
 	return pclk;
 }
 
-u32 bxt_dsi_get_pclk(struct intel_encoder *encoder, int pipe_bpp,
+u32 bxt_dsi_get_pclk(struct intel_encoder *encoder,
 		     struct intel_crtc_state *config)
 {
 	u32 pclk;
@@ -335,12 +324,7 @@ u32 bxt_dsi_get_pclk(struct intel_encoder *encoder, int pipe_bpp,
 	u32 dsi_ratio;
 	struct intel_dsi *intel_dsi = enc_to_intel_dsi(&encoder->base);
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
-
-	/* Divide by zero */
-	if (!pipe_bpp) {
-		DRM_ERROR("Invalid BPP(0)\n");
-		return 0;
-	}
+	int bpp = mipi_dsi_pixel_format_to_bpp(intel_dsi->pixel_format);
 
 	config->dsi_pll.ctrl = I915_READ(BXT_DSI_PLL_CTL);
 
@@ -348,10 +332,7 @@ u32 bxt_dsi_get_pclk(struct intel_encoder *encoder, int pipe_bpp,
 
 	dsi_clk = (dsi_ratio * BXT_REF_CLOCK_KHZ) / 2;
 
-	/* pixel_format and pipe_bpp should agree */
-	assert_bpp_mismatch(intel_dsi->pixel_format, pipe_bpp);
-
-	pclk = DIV_ROUND_CLOSEST(dsi_clk * intel_dsi->lane_count, pipe_bpp);
+	pclk = DIV_ROUND_CLOSEST(dsi_clk * intel_dsi->lane_count, bpp);
 
 	DRM_DEBUG_DRIVER("Calculated pclk=%u\n", pclk);
 	return pclk;
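
With the pipe_bpp parameter gone, both PLL readout paths derive bpp from the configured DSI pixel format and recover the pixel clock as dsi_clk * lanes / bpp, rounded to nearest. A self-contained sketch with example numbers (the clock value is illustrative, not from the patch): 4 lanes of a 445500 kHz DSI bit clock carrying RGB888 (24 bpp) yields a 74250 kHz pixel clock.

#include <stdio.h>

#define DIV_ROUND_CLOSEST(x, d) (((x) + (d) / 2) / (d))

int main(void)
{
	unsigned int dsi_clk = 445500;	/* kHz per lane, example value */
	unsigned int lanes = 4;
	unsigned int bpp = 24;		/* from MIPI_DSI_FMT_RGB888 */
	unsigned int pclk = DIV_ROUND_CLOSEST(dsi_clk * lanes, bpp);

	printf("pclk = %u kHz\n", pclk);	/* prints 74250 */
	return 0;
}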
diff --git a/drivers/gpu/drm/imx/Kconfig b/drivers/gpu/drm/imx/Kconfig
index c9e439c82241..c3c84a09e628 100644
--- a/drivers/gpu/drm/imx/Kconfig
+++ b/drivers/gpu/drm/imx/Kconfig
@@ -4,7 +4,7 @@ config DRM_IMX
 	select VIDEOMODE_HELPERS
 	select DRM_GEM_CMA_HELPER
 	select DRM_KMS_CMA_HELPER
-	depends on DRM && (ARCH_MXC || ARCH_MULTIPLATFORM)
+	depends on DRM && (ARCH_MXC || ARCH_MULTIPLATFORM || COMPILE_TEST)
 	depends on IMX_IPUV3_CORE
 	help
 	  enable i.MX graphics support
@@ -18,6 +18,7 @@ config DRM_IMX_PARALLEL_DISPLAY
 config DRM_IMX_TVE
 	tristate "Support for TV and VGA displays"
 	depends on DRM_IMX
+	depends on COMMON_CLK
 	select REGMAP_MMIO
 	help
 	  Choose this to enable the internal Television Encoder (TVe)
diff --git a/drivers/gpu/drm/imx/dw_hdmi-imx.c b/drivers/gpu/drm/imx/dw_hdmi-imx.c
index 77a26fd3a44a..06393cd1067d 100644
--- a/drivers/gpu/drm/imx/dw_hdmi-imx.c
+++ b/drivers/gpu/drm/imx/dw_hdmi-imx.c
@@ -13,7 +13,7 @@
 #include <linux/regmap.h>
 #include <drm/drm_of.h>
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_atomic_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_encoder_slave.h>
 
diff --git a/drivers/gpu/drm/imx/imx-drm-core.c b/drivers/gpu/drm/imx/imx-drm-core.c
index 820c7e3878f0..c935cbe059a7 100644
--- a/drivers/gpu/drm/imx/imx-drm-core.c
+++ b/drivers/gpu/drm/imx/imx-drm-core.c
@@ -12,13 +12,13 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_fb_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
-#include <drm/drm_fb_cma_helper.h>
-#include <drm/drm_plane_helper.h>
 #include <drm/drm_of.h>
+#include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <video/imx-ipu-v3.h>
 
 #include "imx-drm.h"
@@ -49,11 +49,7 @@ static int imx_drm_atomic_check(struct drm_device *dev,
 {
 	int ret;
 
-	ret = drm_atomic_helper_check_modeset(dev, state);
-	if (ret)
-		return ret;
-
-	ret = drm_atomic_helper_check_planes(dev, state);
+	ret = drm_atomic_helper_check(dev, state);
 	if (ret)
 		return ret;
 
@@ -229,6 +225,7 @@ static int imx_drm_bind(struct device *dev)
 	drm->mode_config.funcs = &imx_drm_mode_config_funcs;
 	drm->mode_config.helper_private = &imx_drm_mode_config_helpers;
 	drm->mode_config.allow_fb_modifiers = true;
+	drm->mode_config.normalize_zpos = true;
 
 	drm_mode_config_init(drm);
 
diff --git a/drivers/gpu/drm/imx/imx-ldb.c b/drivers/gpu/drm/imx/imx-ldb.c
index e31e263cf86b..383733302280 100644
--- a/drivers/gpu/drm/imx/imx-ldb.c
+++ b/drivers/gpu/drm/imx/imx-ldb.c
@@ -12,9 +12,9 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_fb_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/mfd/syscon.h>
 #include <linux/mfd/syscon/imx6q-iomuxc-gpr.h>
 #include <linux/of_device.h>
diff --git a/drivers/gpu/drm/imx/imx-tve.c b/drivers/gpu/drm/imx/imx-tve.c
index 293dd5752583..e725af8a0025 100644
--- a/drivers/gpu/drm/imx/imx-tve.c
+++ b/drivers/gpu/drm/imx/imx-tve.c
@@ -17,7 +17,7 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_fb_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <video/imx-ipu-v3.h>
 
 #include "imx-drm.h"
diff --git a/drivers/gpu/drm/imx/ipuv3-crtc.c b/drivers/gpu/drm/imx/ipuv3-crtc.c
index 058b53c0aa7e..ec3602ebbc1c 100644
--- a/drivers/gpu/drm/imx/ipuv3-crtc.c
+++ b/drivers/gpu/drm/imx/ipuv3-crtc.c
@@ -4,19 +4,19 @@
  *
  * Copyright (C) 2011 Sascha Hauer, Pengutronix
  */
+#include <linux/clk.h>
 #include <linux/component.h>
-#include <linux/module.h>
-#include <linux/export.h>
 #include <linux/device.h>
+#include <linux/errno.h>
+#include <linux/export.h>
+#include <linux/module.h>
 #include <linux/platform_device.h>
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
-#include <linux/clk.h>
-#include <linux/errno.h>
-#include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_fb_cma_helper.h>
+#include <drm/drm_gem_cma_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include <video/imx-ipu-v3.h>
 #include "imx-drm.h"
@@ -34,6 +34,7 @@ struct ipu_crtc {
 	struct ipu_dc		*dc;
 	struct ipu_di		*di;
 	int			irq;
+	struct drm_pending_vblank_event *event;
 };
 
 static inline struct ipu_crtc *to_ipu_crtc(struct drm_crtc *crtc)
@@ -173,8 +174,31 @@ static const struct drm_crtc_funcs ipu_crtc_funcs = {
 static irqreturn_t ipu_irq_handler(int irq, void *dev_id)
 {
 	struct ipu_crtc *ipu_crtc = dev_id;
+	struct drm_crtc *crtc = &ipu_crtc->base;
+	unsigned long flags;
+	int i;
+
+	drm_crtc_handle_vblank(crtc);
+
+	if (ipu_crtc->event) {
+		for (i = 0; i < ARRAY_SIZE(ipu_crtc->plane); i++) {
+			struct ipu_plane *plane = ipu_crtc->plane[i];
 
-	drm_crtc_handle_vblank(&ipu_crtc->base);
+			if (!plane)
+				continue;
+
+			if (ipu_plane_atomic_update_pending(&plane->base))
+				break;
+		}
+
+		if (i == ARRAY_SIZE(ipu_crtc->plane)) {
+			spin_lock_irqsave(&crtc->dev->event_lock, flags);
+			drm_crtc_send_vblank_event(crtc, ipu_crtc->event);
+			ipu_crtc->event = NULL;
+			drm_crtc_vblank_put(crtc);
+			spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
+		}
+	}
 
 	return IRQ_HANDLED;
 }
@@ -223,8 +247,10 @@ static void ipu_crtc_atomic_flush(struct drm_crtc *crtc,
 {
 	spin_lock_irq(&crtc->dev->event_lock);
 	if (crtc->state->event) {
+		struct ipu_crtc *ipu_crtc = to_ipu_crtc(crtc);
+
 		WARN_ON(drm_crtc_vblank_get(crtc));
-		drm_crtc_arm_vblank_event(crtc, crtc->state->event);
+		ipu_crtc->event = crtc->state->event;
 		crtc->state->event = NULL;
 	}
 	spin_unlock_irq(&crtc->dev->event_lock);
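
The ipuv3-crtc hunks above stop arming the vblank event directly: the CRTC stashes it, and the IRQ handler delivers it only once no plane still has an update pending in hardware, so userspace is not signalled before the new buffers are actually latched. Illustration only, in plain C, of that completion gate:

#include <stdbool.h>
#include <stdio.h>

#define NUM_PLANES 3

static bool plane_pending[NUM_PLANES];
static bool event_armed = true;

static void handle_vblank(void)
{
	for (int i = 0; i < NUM_PLANES; i++)
		if (plane_pending[i])
			return;		/* retry on the next vblank */

	if (event_armed) {
		printf("send pageflip event\n");
		event_armed = false;
	}
}

int main(void)
{
	plane_pending[1] = true;
	handle_vblank();		/* held back */
	plane_pending[1] = false;
	handle_vblank();		/* delivered */
	return 0;
}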
diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c b/drivers/gpu/drm/imx/ipuv3-plane.c
index 21e964f6ab5c..d7a727a6e3d7 100644
--- a/drivers/gpu/drm/imx/ipuv3-plane.c
+++ b/drivers/gpu/drm/imx/ipuv3-plane.c
@@ -273,6 +273,7 @@ static void ipu_plane_destroy(struct drm_plane *plane)
 
 static void ipu_plane_state_reset(struct drm_plane *plane)
 {
+	unsigned int zpos = (plane->type == DRM_PLANE_TYPE_PRIMARY) ? 0 : 1;
 	struct ipu_plane_state *ipu_state;
 
 	if (plane->state) {
@@ -284,8 +285,11 @@ static void ipu_plane_state_reset(struct drm_plane *plane)
 
 	ipu_state = kzalloc(sizeof(*ipu_state), GFP_KERNEL);
 
-	if (ipu_state)
+	if (ipu_state) {
 		__drm_atomic_helper_plane_reset(plane, &ipu_state->base);
+		ipu_state->base.zpos = zpos;
+		ipu_state->base.normalized_zpos = zpos;
+	}
 }
 
 static struct drm_plane_state *
@@ -560,6 +564,25 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
 	if (ipu_plane->dp_flow == IPU_DP_FLOW_SYNC_FG)
 		ipu_dp_set_window_pos(ipu_plane->dp, dst->x1, dst->y1);
 
+	switch (ipu_plane->dp_flow) {
+	case IPU_DP_FLOW_SYNC_BG:
+		if (state->normalized_zpos == 1) {
+			ipu_dp_set_global_alpha(ipu_plane->dp,
+						!fb->format->has_alpha, 0xff,
+						true);
+		} else {
+			ipu_dp_set_global_alpha(ipu_plane->dp, true, 0, true);
+		}
+		break;
+	case IPU_DP_FLOW_SYNC_FG:
+		if (state->normalized_zpos == 1) {
+			ipu_dp_set_global_alpha(ipu_plane->dp,
+						!fb->format->has_alpha, 0xff,
+						false);
+		}
+		break;
+	}
+
 	eba = drm_plane_state_to_eba(state, 0);
 
 	/*
@@ -582,6 +605,7 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
 		active = ipu_idmac_get_current_buffer(ipu_plane->ipu_ch);
 		ipu_cpmem_set_buffer(ipu_plane->ipu_ch, !active, eba);
 		ipu_idmac_select_buffer(ipu_plane->ipu_ch, !active);
+		ipu_plane->next_buf = !active;
 		if (ipu_plane_separate_alpha(ipu_plane)) {
 			active = ipu_idmac_get_current_buffer(ipu_plane->alpha_ch);
 			ipu_cpmem_set_buffer(ipu_plane->alpha_ch, !active,
@@ -595,34 +619,11 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
 	switch (ipu_plane->dp_flow) {
 	case IPU_DP_FLOW_SYNC_BG:
 		ipu_dp_setup_channel(ipu_plane->dp, ics, IPUV3_COLORSPACE_RGB);
-		ipu_dp_set_global_alpha(ipu_plane->dp, true, 0, true);
 		break;
 	case IPU_DP_FLOW_SYNC_FG:
 		ipu_dp_setup_channel(ipu_plane->dp, ics,
 					IPUV3_COLORSPACE_UNKNOWN);
-		/* Enable local alpha on partial plane */
-		switch (fb->format->format) {
-		case DRM_FORMAT_ARGB1555:
-		case DRM_FORMAT_ABGR1555:
-		case DRM_FORMAT_RGBA5551:
-		case DRM_FORMAT_BGRA5551:
-		case DRM_FORMAT_ARGB4444:
-		case DRM_FORMAT_ARGB8888:
-		case DRM_FORMAT_ABGR8888:
-		case DRM_FORMAT_RGBA8888:
-		case DRM_FORMAT_BGRA8888:
-		case DRM_FORMAT_RGB565_A8:
-		case DRM_FORMAT_BGR565_A8:
-		case DRM_FORMAT_RGB888_A8:
-		case DRM_FORMAT_BGR888_A8:
-		case DRM_FORMAT_RGBX8888_A8:
-		case DRM_FORMAT_BGRX8888_A8:
-			ipu_dp_set_global_alpha(ipu_plane->dp, false, 0, false);
-			break;
-		default:
-			ipu_dp_set_global_alpha(ipu_plane->dp, true, 0, true);
-			break;
-		}
+		break;
 	}
 
 	ipu_dmfc_config_wait4eot(ipu_plane->dmfc, drm_rect_width(dst));
@@ -709,6 +710,7 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
 	ipu_cpmem_set_buffer(ipu_plane->ipu_ch, 1, eba);
 	ipu_idmac_lock_enable(ipu_plane->ipu_ch, num_bursts);
 	ipu_plane_enable(ipu_plane);
+	ipu_plane->next_buf = -1;
 }
 
 static const struct drm_plane_helper_funcs ipu_plane_helper_funcs = {
@@ -718,6 +720,24 @@ static const struct drm_plane_helper_funcs ipu_plane_helper_funcs = {
 	.atomic_update = ipu_plane_atomic_update,
 };
 
+bool ipu_plane_atomic_update_pending(struct drm_plane *plane)
+{
+	struct ipu_plane *ipu_plane = to_ipu_plane(plane);
+	struct drm_plane_state *state = plane->state;
+	struct ipu_plane_state *ipu_state = to_ipu_plane_state(state);
+
+	/* disabled crtcs must not block the update */
+	if (!state->crtc)
+		return false;
+
+	if (ipu_state->use_pre)
+		return ipu_prg_channel_configure_pending(ipu_plane->ipu_ch);
+	else if (ipu_plane->next_buf >= 0)
+		return ipu_idmac_get_current_buffer(ipu_plane->ipu_ch) !=
+		       ipu_plane->next_buf;
+
+	return false;
+}
 int ipu_planes_assign_pre(struct drm_device *dev,
 			  struct drm_atomic_state *state)
 {
@@ -806,6 +826,7 @@ struct ipu_plane *ipu_plane_init(struct drm_device *dev, struct ipu_soc *ipu,
 {
 	struct ipu_plane *ipu_plane;
 	const uint64_t *modifiers = ipu_format_modifiers;
+	unsigned int zpos = (type == DRM_PLANE_TYPE_PRIMARY) ? 0 : 1;
 	int ret;
 
 	DRM_DEBUG_KMS("channel %d, dp flow %d, possible_crtcs=0x%x\n",
@@ -836,5 +857,10 @@ struct ipu_plane *ipu_plane_init(struct drm_device *dev, struct ipu_soc *ipu,
 
 	drm_plane_helper_add(&ipu_plane->base, &ipu_plane_helper_funcs);
 
+	if (dp == IPU_DP_FLOW_SYNC_BG || dp == IPU_DP_FLOW_SYNC_FG)
+		drm_plane_create_zpos_property(&ipu_plane->base, zpos, 0, 1);
+	else
+		drm_plane_create_zpos_immutable_property(&ipu_plane->base, 0);
+
 	return ipu_plane;
 }
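
With a zpos property on the two DP planes, the atomic_update hunks above decide blending from the normalized zpos: the plane normalized to z == 1 is the foreground and gets per-pixel alpha when its format carries alpha (otherwise opaque global alpha); the background plane is unconditionally opaque. A simplified, illustration-only sketch of that decision (the real code programs the DP combiner via ipu_dp_set_global_alpha()):

#include <stdbool.h>
#include <stdio.h>

static void setup_alpha(int normalized_zpos, bool fmt_has_alpha)
{
	if (normalized_zpos == 1 && fmt_has_alpha)
		printf("foreground: per-pixel alpha\n");
	else if (normalized_zpos == 1)
		printf("foreground: opaque (global alpha 0xff)\n");
	else
		printf("background: opaque\n");
}

int main(void)
{
	setup_alpha(0, false);	/* bottom plane */
	setup_alpha(1, true);	/* top plane, ARGB-style format */
	return 0;
}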
diff --git a/drivers/gpu/drm/imx/ipuv3-plane.h b/drivers/gpu/drm/imx/ipuv3-plane.h
index e563ea17a827..15e85e15d35c 100644
--- a/drivers/gpu/drm/imx/ipuv3-plane.h
+++ b/drivers/gpu/drm/imx/ipuv3-plane.h
@@ -27,6 +27,7 @@ struct ipu_plane {
 	int			dp_flow;
 
 	bool			disabling;
+	int			next_buf;
 };
 
 struct ipu_plane *ipu_plane_init(struct drm_device *dev, struct ipu_soc *ipu,
@@ -48,5 +49,6 @@ int ipu_plane_irq(struct ipu_plane *plane);
 
 void ipu_plane_disable(struct ipu_plane *ipu_plane, bool disable_dp_channel);
 void ipu_plane_disable_deferred(struct drm_plane *plane);
+bool ipu_plane_atomic_update_pending(struct drm_plane *plane);
 
 #endif
diff --git a/drivers/gpu/drm/imx/parallel-display.c b/drivers/gpu/drm/imx/parallel-display.c
index f3ce51121dd6..1a76de1e8e7b 100644
--- a/drivers/gpu/drm/imx/parallel-display.c
+++ b/drivers/gpu/drm/imx/parallel-display.c
@@ -10,9 +10,9 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_fb_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/videodev2.h>
 #include <video/of_display_timing.h>
 
diff --git a/drivers/gpu/drm/mediatek/mtk_dpi.c b/drivers/gpu/drm/mediatek/mtk_dpi.c
index 62a9d47df948..22e68a100e7b 100644
--- a/drivers/gpu/drm/mediatek/mtk_dpi.c
+++ b/drivers/gpu/drm/mediatek/mtk_dpi.c
@@ -13,7 +13,7 @@
  */
 #include <drm/drmP.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_atomic_helper.h>
 #include <drm/drm_of.h>
 #include <linux/kernel.h>
 #include <linux/component.h>
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
index 92ecb9bf982c..acad088173da 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
@@ -14,8 +14,8 @@
 #include <asm/barrier.h>
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/clk.h>
 #include <linux/pm_runtime.h>
 #include <soc/mediatek/smi.h>
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index 6422e99952fe..cf59ea9bccfd 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -15,10 +15,10 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_gem.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_of.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/component.h>
 #include <linux/iommu.h>
 #include <linux/of_address.h>
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_fb.c b/drivers/gpu/drm/mediatek/mtk_drm_fb.c
index be5f6f1daf55..e20fcaef2851 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_fb.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_fb.c
@@ -12,7 +12,7 @@
  */
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_modeset_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_gem.h>
 #include <drm/drm_gem_framebuffer_helper.h>
diff --git a/drivers/gpu/drm/mediatek/mtk_dsi.c b/drivers/gpu/drm/mediatek/mtk_dsi.c
index 27b507eb4a99..b00eb2d2e086 100644
--- a/drivers/gpu/drm/mediatek/mtk_dsi.c
+++ b/drivers/gpu/drm/mediatek/mtk_dsi.c
@@ -13,10 +13,10 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_mipi_dsi.h>
 #include <drm/drm_panel.h>
 #include <drm/drm_of.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/clk.h>
 #include <linux/component.h>
 #include <linux/iopoll.h>
diff --git a/drivers/gpu/drm/mediatek/mtk_hdmi.c b/drivers/gpu/drm/mediatek/mtk_hdmi.c
index 862f3ec22131..915cc84621ae 100644
--- a/drivers/gpu/drm/mediatek/mtk_hdmi.c
+++ b/drivers/gpu/drm/mediatek/mtk_hdmi.c
@@ -14,7 +14,7 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_edid.h>
 #include <linux/arm-smccc.h>
 #include <linux/clk.h>
@@ -981,7 +981,8 @@ static int mtk_hdmi_setup_avi_infoframe(struct mtk_hdmi *hdmi,
 	u8 buffer[17];
 	ssize_t err;
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame,
+						       &hdmi->conn, mode);
 	if (err < 0) {
 		dev_err(hdmi->dev,
 			"Failed to get AVI infoframe from mode: %zd\n", err);
@@ -1370,8 +1371,8 @@ static void mtk_hdmi_bridge_post_disable(struct drm_bridge *bridge)
 }
 
 static void mtk_hdmi_bridge_mode_set(struct drm_bridge *bridge,
-				     struct drm_display_mode *mode,
-				     struct drm_display_mode *adjusted_mode)
+				const struct drm_display_mode *mode,
+				const struct drm_display_mode *adjusted_mode)
 {
 	struct mtk_hdmi *hdmi = hdmi_ctx_from_bridge(bridge);
 
diff --git a/drivers/gpu/drm/meson/meson_crtc.c b/drivers/gpu/drm/meson/meson_crtc.c
index 4f5c67f70c4d..43e29984f8b1 100644
--- a/drivers/gpu/drm/meson/meson_crtc.c
+++ b/drivers/gpu/drm/meson/meson_crtc.c
@@ -30,7 +30,7 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_flip_work.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "meson_crtc.h"
 #include "meson_plane.h"
diff --git a/drivers/gpu/drm/meson/meson_drv.c b/drivers/gpu/drm/meson/meson_drv.c
index 12ff47b13668..2281ed3eb774 100644
--- a/drivers/gpu/drm/meson/meson_drv.c
+++ b/drivers/gpu/drm/meson/meson_drv.c
@@ -30,14 +30,14 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_fb_cma_helper.h>
+#include <drm/drm_fb_helper.h>
 #include <drm/drm_flip_work.h>
-#include <drm/drm_crtc_helper.h>
-#include <drm/drm_plane_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
-#include <drm/drm_fb_cma_helper.h>
+#include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_rect.h>
-#include <drm/drm_fb_helper.h>
 
 #include "meson_drv.h"
 #include "meson_plane.h"
@@ -94,7 +94,7 @@ static irqreturn_t meson_irq(int irq, void *arg)
 DEFINE_DRM_GEM_CMA_FOPS(fops);
 
 static struct drm_driver meson_driver = {
-	.driver_features	= DRIVER_HAVE_IRQ | DRIVER_GEM |
+	.driver_features	= DRIVER_GEM |
 				  DRIVER_MODESET | DRIVER_PRIME |
 				  DRIVER_ATOMIC,
 
@@ -156,6 +156,23 @@ static void meson_vpu_init(struct meson_drm *priv)
 	writel_relaxed(0x20000, priv->io_base + _REG(VPU_WRARB_MODE_L2C1));
 }
 
+static void meson_remove_framebuffers(void)
+{
+	struct apertures_struct *ap;
+
+	ap = alloc_apertures(1);
+	if (!ap)
+		return;
+
+	/* The framebuffer can be located anywhere in RAM */
+	ap->ranges[0].base = 0;
+	ap->ranges[0].size = ~0;
+
+	drm_fb_helper_remove_conflicting_framebuffers(ap, "meson-drm-fb",
+						      false);
+	kfree(ap);
+}
+
 static int meson_drv_bind_master(struct device *dev, bool has_components)
 {
 	struct platform_device *pdev = to_platform_device(dev);
@@ -266,6 +283,9 @@ static int meson_drv_bind_master(struct device *dev, bool has_components)
 	if (ret)
 		goto free_drm;
 
+	/* Remove early framebuffers (i.e. simplefb) */
+	meson_remove_framebuffers();
+
 	drm_mode_config_init(drm);
 	drm->mode_config.max_width = 3840;
 	drm->mode_config.max_height = 2160;
diff --git a/drivers/gpu/drm/meson/meson_dw_hdmi.c b/drivers/gpu/drm/meson/meson_dw_hdmi.c
index 807111ebfdd9..e28814f4ea6c 100644
--- a/drivers/gpu/drm/meson/meson_dw_hdmi.c
+++ b/drivers/gpu/drm/meson/meson_dw_hdmi.c
@@ -26,9 +26,9 @@
 #include <linux/regulator/consumer.h>
 
 #include <drm/drmP.h>
-#include <drm/drm_edid.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_edid.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/bridge/dw_hdmi.h>
 
 #include <uapi/linux/media-bus-format.h>
@@ -365,7 +365,8 @@ static int dw_hdmi_phy_init(struct dw_hdmi *hdmi, void *data,
 	unsigned int wr_clk =
 		readl_relaxed(priv->io_base + _REG(VPU_HDMI_SETTING));
 
-	DRM_DEBUG_DRIVER("%d:\"%s\"\n", mode->base.id, mode->name);
+	DRM_DEBUG_DRIVER("\"%s\" div%d\n", mode->name,
+			 mode->clock > 340000 ? 40 : 10);
 
 	/* Enable clocks */
 	regmap_update_bits(priv->hhi, HHI_HDMI_CLK_CNTL, 0xffff, 0x100);
@@ -385,9 +386,17 @@ static int dw_hdmi_phy_init(struct dw_hdmi *hdmi, void *data,
 	/* Enable normal output to PHY */
 	dw_hdmi_top_write(dw_hdmi, HDMITX_TOP_BIST_CNTL, BIT(12));
 
-	/* TMDS pattern setup (TOFIX pattern for 4k2k scrambling) */
-	dw_hdmi_top_write(dw_hdmi, HDMITX_TOP_TMDS_CLK_PTTN_01, 0x001f001f);
-	dw_hdmi_top_write(dw_hdmi, HDMITX_TOP_TMDS_CLK_PTTN_23, 0x001f001f);
+	/* TMDS pattern setup (TOFIX Handle the YUV420 case) */
+	if (mode->clock > 340000) {
+		dw_hdmi_top_write(dw_hdmi, HDMITX_TOP_TMDS_CLK_PTTN_01, 0);
+		dw_hdmi_top_write(dw_hdmi, HDMITX_TOP_TMDS_CLK_PTTN_23,
+				  0x03ff03ff);
+	} else {
+		dw_hdmi_top_write(dw_hdmi, HDMITX_TOP_TMDS_CLK_PTTN_01,
+				  0x001f001f);
+		dw_hdmi_top_write(dw_hdmi, HDMITX_TOP_TMDS_CLK_PTTN_23,
+				  0x001f001f);
+	}
 
 	/* Load TMDS pattern */
 	dw_hdmi_top_write(dw_hdmi, HDMITX_TOP_TMDS_CLK_PTTN_CNTL, 0x1);
@@ -413,6 +422,8 @@ static int dw_hdmi_phy_init(struct dw_hdmi *hdmi, void *data,
 	/* Disable clock, fifo, fifo_wr */
 	regmap_update_bits(priv->hhi, HHI_HDMI_PHY_CNTL1, 0xf, 0);
 
+	dw_hdmi_set_high_tmds_clock_ratio(hdmi);
+
 	msleep(100);
 
 	/* Reset PHY 3 times in a row */
@@ -555,12 +566,11 @@ dw_hdmi_mode_valid(struct drm_connector *connector,
 	int vic = drm_match_cea_mode(mode);
 	enum drm_mode_status status;
 
-	DRM_DEBUG_DRIVER("Modeline %d:\"%s\" %d %d %d %d %d %d %d %d %d %d 0x%x 0x%x\n",
-		mode->base.id, mode->name, mode->vrefresh, mode->clock,
-		mode->hdisplay, mode->hsync_start,
-		mode->hsync_end, mode->htotal,
-		mode->vdisplay, mode->vsync_start,
-		mode->vsync_end, mode->vtotal, mode->type, mode->flags);
+	DRM_DEBUG_DRIVER("Modeline " DRM_MODE_FMT "\n", DRM_MODE_ARG(mode));
+
+	/* Reject modes above the sink's max TMDS clock, when known */
+	if (connector->display_info.max_tmds_clock &&
+	    mode->clock > connector->display_info.max_tmds_clock)
+		return MODE_BAD;
 
 	/* Check against non-VIC supported modes */
 	if (!vic) {
@@ -650,8 +660,7 @@ static void meson_venc_hdmi_encoder_mode_set(struct drm_encoder *encoder,
 	struct meson_drm *priv = dw_hdmi->priv;
 	int vic = drm_match_cea_mode(mode);
 
-	DRM_DEBUG_DRIVER("%d:\"%s\" vic %d\n",
-			 mode->base.id, mode->name, vic);
+	DRM_DEBUG_DRIVER("\"%s\" vic %d\n", mode->name, vic);
 
 	/* VENC + VENC-DVI Mode setup */
 	meson_venc_hdmi_mode_set(priv, vic, mode);
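
The meson HDMI hunks above encode the HDMI 2.0 rate split visible in the div40/div10 debug print: above a 340000 kHz TMDS clock the 1/40 TMDS clock ratio applies (with scrambling, hence the different clock pattern and the dw_hdmi_set_high_tmds_clock_ratio() call); at or below it the classic 1/10 ratio is used. A self-contained sketch of the split:

#include <stdio.h>

int main(void)
{
	/* example clocks: 1080p60, 4k30, 4k60, in kHz */
	unsigned int clocks[] = { 148500, 297000, 594000 };

	for (unsigned int i = 0; i < sizeof(clocks) / sizeof(clocks[0]); i++)
		printf("%u kHz -> TMDS ratio 1/%d\n", clocks[i],
		       clocks[i] > 340000 ? 40 : 10);
	return 0;
}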
diff --git a/drivers/gpu/drm/meson/meson_venc.c b/drivers/gpu/drm/meson/meson_venc.c
index 0ba04f6813e6..66d73a932d19 100644
--- a/drivers/gpu/drm/meson/meson_venc.c
+++ b/drivers/gpu/drm/meson/meson_venc.c
@@ -848,6 +848,8 @@ struct meson_hdmi_venc_vic_mode {
 	{ 93, &meson_hdmi_encp_mode_2160p24 },
 	{ 94, &meson_hdmi_encp_mode_2160p25 },
 	{ 95, &meson_hdmi_encp_mode_2160p30 },
+	{ 96, &meson_hdmi_encp_mode_2160p25 },
+	{ 97, &meson_hdmi_encp_mode_2160p30 },
 	{ 0, NULL}, /* sentinel */
 };
 
diff --git a/drivers/gpu/drm/meson/meson_venc_cvbs.c b/drivers/gpu/drm/meson/meson_venc_cvbs.c
index f7945bae3b4a..d622d817b6df 100644
--- a/drivers/gpu/drm/meson/meson_venc_cvbs.c
+++ b/drivers/gpu/drm/meson/meson_venc_cvbs.c
@@ -26,9 +26,9 @@
 #include <linux/of_graph.h>
 
 #include <drm/drmP.h>
-#include <drm/drm_edid.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_edid.h>
+#include <drm/drm_probe_helper.h>
 
 #include "meson_venc_cvbs.h"
 #include "meson_venc.h"
diff --git a/drivers/gpu/drm/mga/mga_drv.c b/drivers/gpu/drm/mga/mga_drv.c
index 1aad27813c23..6e1d1054ad06 100644
--- a/drivers/gpu/drm/mga/mga_drv.c
+++ b/drivers/gpu/drm/mga/mga_drv.c
@@ -57,7 +57,7 @@ static const struct file_operations mga_driver_fops = {
 static struct drm_driver driver = {
 	.driver_features =
 	    DRIVER_USE_AGP | DRIVER_PCI_DMA | DRIVER_LEGACY |
-	    DRIVER_HAVE_DMA | DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED,
+	    DRIVER_HAVE_DMA | DRIVER_HAVE_IRQ,
 	.dev_priv_size = sizeof(drm_mga_buf_priv_t),
 	.load = mga_driver_load,
 	.unload = mga_driver_unload,
diff --git a/drivers/gpu/drm/mgag200/mgag200_fb.c b/drivers/gpu/drm/mgag200/mgag200_fb.c
index 30726c9fe28c..6893934b26c0 100644
--- a/drivers/gpu/drm/mgag200/mgag200_fb.c
+++ b/drivers/gpu/drm/mgag200/mgag200_fb.c
@@ -12,6 +12,7 @@
  */
 #include <linux/module.h>
 #include <drm/drmP.h>
+#include <drm/drm_util.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_crtc_helper.h>
 
diff --git a/drivers/gpu/drm/mgag200/mgag200_main.c b/drivers/gpu/drm/mgag200/mgag200_main.c
index 79d54103d470..163255099779 100644
--- a/drivers/gpu/drm/mgag200/mgag200_main.c
+++ b/drivers/gpu/drm/mgag200/mgag200_main.c
@@ -33,7 +33,7 @@ int mgag200_framebuffer_init(struct drm_device *dev,
 			     struct drm_gem_object *obj)
 {
 	int ret;
-	
+
 	drm_helper_mode_fill_fb_struct(dev, &gfb->base, mode_cmd);
 	gfb->obj = obj;
 	ret = drm_framebuffer_init(dev, &gfb->base, &mga_fb_funcs);
@@ -318,13 +318,9 @@ int mgag200_dumb_create(struct drm_file *file,
 
 static void mgag200_bo_unref(struct mgag200_bo **bo)
 {
-	struct ttm_buffer_object *tbo;
-
 	if ((*bo) == NULL)
 		return;
-
-	tbo = &((*bo)->bo);
-	ttm_bo_unref(&tbo);
+	ttm_bo_put(&((*bo)->bo));
 	*bo = NULL;
 }
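
The mgag200 hunk above follows the TTM ref/unref renaming: ttm_bo_put() drops one reference on the object itself, and clearing the caller's pointer is now done explicitly at the call site. Illustration only, with invented types, of that split of responsibilities:

#include <stdio.h>
#include <stdlib.h>

struct bo { int refcount; };

static void bo_put(struct bo *bo)
{
	if (--bo->refcount == 0)
		free(bo);	/* last reference gone */
}

static void bo_unref(struct bo **bo)
{
	if (*bo == NULL)
		return;
	bo_put(*bo);
	*bo = NULL;	/* pointer hygiene stays with the caller */
}

int main(void)
{
	struct bo *b = malloc(sizeof(*b));

	b->refcount = 1;
	bo_unref(&b);
	return b == NULL ? 0 : 1;
}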
 
diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c b/drivers/gpu/drm/mgag200/mgag200_mode.c
index acf7bfe68454..7481a3d556ad 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -16,6 +16,7 @@
 #include <drm/drmP.h>
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "mgag200_drv.h"
 
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
index 9be7c355debd..b776fca571f3 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
@@ -20,10 +20,10 @@
 #include <linux/sort.h>
 #include <linux/debugfs.h>
 #include <linux/ktime.h>
-#include <drm/drm_mode.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_flip_work.h>
+#include <drm/drm_mode.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_rect.h>
 
 #include "dpu_kms.h"
@@ -465,8 +465,6 @@ static void _dpu_crtc_setup_mixer_for_encoder(
 			return;
 		}
 
-		mixer->encoder = enc;
-
 		cstate->num_mixers++;
 		DPU_DEBUG("setup mixer %d: lm %d\n",
 				i, mixer->hw_lm->idx - LM_0);
@@ -718,11 +716,8 @@ void dpu_crtc_commit_kickoff(struct drm_crtc *crtc, bool async)
 	 * may delay and flush at an irq event (e.g. ppdone)
 	 */
 	drm_for_each_encoder_mask(encoder, crtc->dev,
-				  crtc->state->encoder_mask) {
-		struct dpu_encoder_kickoff_params params = { 0 };
-		dpu_encoder_prepare_for_kickoff(encoder, &params, async);
-	}
-
+				  crtc->state->encoder_mask)
+		dpu_encoder_prepare_for_kickoff(encoder, async);
 
 	if (!async) {
 		/* wait for frame_event_done completion */
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h
index dbfb38a1986c..e59d62be4980 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h
@@ -84,14 +84,12 @@ struct dpu_crtc_smmu_state_data {
  * struct dpu_crtc_mixer: stores the map for each virtual pipeline in the CRTC
  * @hw_lm:	LM HW Driver context
  * @lm_ctl:	CTL Path HW driver context
- * @encoder:	Encoder attached to this lm & ctl
  * @mixer_op_mode:	mixer blending operation mode
  * @flush_mask:	mixer flush mask for ctl, mixer and pipe
  */
 struct dpu_crtc_mixer {
 	struct dpu_hw_mixer *hw_lm;
 	struct dpu_hw_ctl *lm_ctl;
-	struct drm_encoder *encoder;
 	u32 mixer_op_mode;
 	u32 flush_mask;
 };
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index 36158b7d99cd..5aa3307f3f0c 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -24,7 +24,7 @@
 #include "msm_drv.h"
 #include "dpu_kms.h"
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include "dpu_hwio.h"
 #include "dpu_hw_catalog.h"
 #include "dpu_hw_intf.h"
@@ -205,7 +205,7 @@ struct dpu_encoder_virt {
 	bool idle_pc_supported;
 	struct mutex rc_lock;
 	enum dpu_enc_rc_states rc_state;
-	struct kthread_delayed_work delayed_off_work;
+	struct delayed_work delayed_off_work;
 	struct kthread_work vsync_event_work;
 	struct msm_display_topology topology;
 	bool mode_set_complete;
@@ -742,7 +742,6 @@ static int dpu_encoder_resource_control(struct drm_encoder *drm_enc,
 {
 	struct dpu_encoder_virt *dpu_enc;
 	struct msm_drm_private *priv;
-	struct msm_drm_thread *disp_thread;
 	bool is_vid_mode = false;
 
 	if (!drm_enc || !drm_enc->dev || !drm_enc->dev->dev_private ||
@@ -755,12 +754,6 @@ static int dpu_encoder_resource_control(struct drm_encoder *drm_enc,
 	is_vid_mode = dpu_enc->disp_info.capabilities &
 						MSM_DISPLAY_CAP_VID_MODE;
 
-	if (drm_enc->crtc->index >= ARRAY_SIZE(priv->disp_thread)) {
-		DPU_ERROR("invalid crtc index\n");
-		return -EINVAL;
-	}
-	disp_thread = &priv->disp_thread[drm_enc->crtc->index];
-
 	/*
 	 * when idle_pc is not supported, process only KICKOFF, STOP and MODESET
 	 * events and return early for other events (ie wb display).
@@ -777,8 +770,7 @@ static int dpu_encoder_resource_control(struct drm_encoder *drm_enc,
 	switch (sw_event) {
 	case DPU_ENC_RC_EVENT_KICKOFF:
 		/* cancel delayed off work, if any */
-		if (kthread_cancel_delayed_work_sync(
-				&dpu_enc->delayed_off_work))
+		if (cancel_delayed_work_sync(&dpu_enc->delayed_off_work))
 			DPU_DEBUG_ENC(dpu_enc, "sw_event:%d, work cancelled\n",
 					sw_event);
 
@@ -837,10 +829,8 @@ static int dpu_encoder_resource_control(struct drm_encoder *drm_enc,
 			return 0;
 		}
 
-		kthread_queue_delayed_work(
-			&disp_thread->worker,
-			&dpu_enc->delayed_off_work,
-			msecs_to_jiffies(dpu_enc->idle_timeout));
+		queue_delayed_work(priv->wq, &dpu_enc->delayed_off_work,
+				   msecs_to_jiffies(dpu_enc->idle_timeout));
 
 		trace_dpu_enc_rc(DRMID(drm_enc), sw_event,
 				 dpu_enc->idle_pc_supported, dpu_enc->rc_state,
@@ -849,8 +839,7 @@ static int dpu_encoder_resource_control(struct drm_encoder *drm_enc,
 
 	case DPU_ENC_RC_EVENT_PRE_STOP:
 		/* cancel delayed off work, if any */
-		if (kthread_cancel_delayed_work_sync(
-				&dpu_enc->delayed_off_work))
+		if (cancel_delayed_work_sync(&dpu_enc->delayed_off_work))
 			DPU_DEBUG_ENC(dpu_enc, "sw_event:%d, work cancelled\n",
 					sw_event);
 
@@ -1368,7 +1357,7 @@ static void dpu_encoder_frame_done_callback(
 	}
 }
 
-static void dpu_encoder_off_work(struct kthread_work *work)
+static void dpu_encoder_off_work(struct work_struct *work)
 {
 	struct dpu_encoder_virt *dpu_enc = container_of(work,
 			struct dpu_encoder_virt, delayed_off_work.work);
@@ -1756,15 +1745,14 @@ static void dpu_encoder_vsync_event_work_handler(struct kthread_work *work)
 			nsecs_to_jiffies(ktime_to_ns(wakeup_time)));
 }
 
-void dpu_encoder_prepare_for_kickoff(struct drm_encoder *drm_enc,
-		struct dpu_encoder_kickoff_params *params, bool async)
+void dpu_encoder_prepare_for_kickoff(struct drm_encoder *drm_enc, bool async)
 {
 	struct dpu_encoder_virt *dpu_enc;
 	struct dpu_encoder_phys *phys;
 	bool needs_hw_reset = false;
 	unsigned int i;
 
-	if (!drm_enc || !params) {
+	if (!drm_enc) {
 		DPU_ERROR("invalid args\n");
 		return;
 	}
@@ -1778,7 +1766,7 @@ void dpu_encoder_prepare_for_kickoff(struct drm_encoder *drm_enc,
 		phys = dpu_enc->phys_encs[i];
 		if (phys) {
 			if (phys->ops.prepare_for_kickoff)
-				phys->ops.prepare_for_kickoff(phys, params);
+				phys->ops.prepare_for_kickoff(phys);
 			if (phys->enable_state == DPU_ENC_ERR_NEEDS_HW_RESET)
 				needs_hw_reset = true;
 		}
@@ -2193,7 +2181,7 @@ int dpu_encoder_setup(struct drm_device *dev, struct drm_encoder *enc,
 
 
 	mutex_init(&dpu_enc->rc_lock);
-	kthread_init_delayed_work(&dpu_enc->delayed_off_work,
+	INIT_DELAYED_WORK(&dpu_enc->delayed_off_work,
 			dpu_encoder_off_work);
 	dpu_enc->idle_timeout = IDLE_TIMEOUT;
 
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
index 3f5dafe00580..d77f74fb26d4 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
@@ -38,15 +38,6 @@ struct dpu_encoder_hw_resources {
 };
 
 /**
- * dpu_encoder_kickoff_params - info encoder requires at kickoff
- * @affected_displays:  bitmask, bit set means the ROI of the commit lies within
- *                      the bounds of the physical display at the bit index
- */
-struct dpu_encoder_kickoff_params {
-	unsigned long affected_displays;
-};
-
-/**
  * dpu_encoder_get_hw_resources - Populate table of required hardware resources
  * @encoder:	encoder pointer
  * @hw_res:	resource table to populate with encoder required resources
@@ -88,11 +79,9 @@ void dpu_encoder_register_frame_event_callback(struct drm_encoder *encoder,
  *	Immediately: if no previous commit is outstanding.
  *	Delayed: Block until next trigger can be issued.
  * @encoder:	encoder pointer
- * @params:	kickoff time parameters
  * @async:	true if this is an asynchronous commit
  */
-void dpu_encoder_prepare_for_kickoff(struct drm_encoder *encoder,
-		struct dpu_encoder_kickoff_params *params, bool async);
+void dpu_encoder_prepare_for_kickoff(struct drm_encoder *encoder, bool async);
 
 /**
  * dpu_encoder_trigger_kickoff_pending - Clear the flush bits from previous
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
index 44e6f8b68e70..db94f3d3bea3 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
@@ -144,8 +144,7 @@ struct dpu_encoder_phys_ops {
 	int (*wait_for_commit_done)(struct dpu_encoder_phys *phys_enc);
 	int (*wait_for_tx_complete)(struct dpu_encoder_phys *phys_enc);
 	int (*wait_for_vblank)(struct dpu_encoder_phys *phys_enc);
-	void (*prepare_for_kickoff)(struct dpu_encoder_phys *phys_enc,
-			struct dpu_encoder_kickoff_params *params);
+	void (*prepare_for_kickoff)(struct dpu_encoder_phys *phys_enc);
 	void (*handle_post_kickoff)(struct dpu_encoder_phys *phys_enc);
 	void (*trigger_start)(struct dpu_encoder_phys *phys_enc);
 	bool (*needs_single_flush)(struct dpu_encoder_phys *phys_enc);
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
index 99ab5ca9bed3..a399e1edd313 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
@@ -594,8 +594,7 @@ static void dpu_encoder_phys_cmd_get_hw_resources(
 }
 
 static void dpu_encoder_phys_cmd_prepare_for_kickoff(
-		struct dpu_encoder_phys *phys_enc,
-		struct dpu_encoder_kickoff_params *params)
+		struct dpu_encoder_phys *phys_enc)
 {
 	struct dpu_encoder_phys_cmd *cmd_enc =
 			to_dpu_encoder_phys_cmd(phys_enc);
@@ -693,7 +692,7 @@ static int dpu_encoder_phys_cmd_wait_for_commit_done(
 
 	/* required for both controllers */
 	if (!rc && cmd_enc->serialize_wait4pp)
-		dpu_encoder_phys_cmd_prepare_for_kickoff(phys_enc, NULL);
+		dpu_encoder_phys_cmd_prepare_for_kickoff(phys_enc);
 
 	return rc;
 }
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
index acdab5b0db18..3c4eb470a82c 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
@@ -587,14 +587,13 @@ static int dpu_encoder_phys_vid_wait_for_vblank(
 }
 
 static void dpu_encoder_phys_vid_prepare_for_kickoff(
-		struct dpu_encoder_phys *phys_enc,
-		struct dpu_encoder_kickoff_params *params)
+		struct dpu_encoder_phys *phys_enc)
 {
 	struct dpu_encoder_phys_vid *vid_enc;
 	struct dpu_hw_ctl *ctl;
 	int rc;
 
-	if (!phys_enc || !params) {
+	if (!phys_enc) {
 		DPU_ERROR("invalid encoder/parameters\n");
 		return;
 	}
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c
index 0874f0a53bf9..f59fe1a9f4b9 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c
@@ -263,13 +263,13 @@ static const struct dpu_format dpu_format_map[] = {
 
 	INTERLEAVED_RGB_FMT(RGB565,
 		0, COLOR_5BIT, COLOR_6BIT, COLOR_5BIT,
-		C1_B_Cb, C0_G_Y, C2_R_Cr, 0, 3,
+		C2_R_Cr, C0_G_Y, C1_B_Cb, 0, 3,
 		false, 2, 0,
 		DPU_FETCH_LINEAR, 1),
 
 	INTERLEAVED_RGB_FMT(BGR565,
 		0, COLOR_5BIT, COLOR_6BIT, COLOR_5BIT,
-		C2_R_Cr, C0_G_Y, C1_B_Cb, 0, 3,
+		C1_B_Cb, C0_G_Y, C2_R_Cr, 0, 3,
 		false, 2, 0,
 		DPU_FETCH_LINEAR, 1),
 
@@ -1137,36 +1137,3 @@ const struct msm_format *dpu_get_msm_format(
 		return &fmt->base;
 	return NULL;
 }
-
-uint32_t dpu_populate_formats(
-		const struct dpu_format_extended *format_list,
-		uint32_t *pixel_formats,
-		uint64_t *pixel_modifiers,
-		uint32_t pixel_formats_max)
-{
-	uint32_t i, fourcc_format;
-
-	if (!format_list || !pixel_formats)
-		return 0;
-
-	for (i = 0, fourcc_format = 0;
-			format_list->fourcc_format && i < pixel_formats_max;
-			++format_list) {
-		/* verify if listed format is in dpu_format_map? */
-
-		/* optionally return modified formats */
-		if (pixel_modifiers) {
-			/* assume same modifier for all fb planes */
-			pixel_formats[i] = format_list->fourcc_format;
-			pixel_modifiers[i++] = format_list->modifier;
-		} else {
-			/* assume base formats grouped together */
-			if (fourcc_format != format_list->fourcc_format) {
-				fourcc_format = format_list->fourcc_format;
-				pixel_formats[i++] = fourcc_format;
-			}
-		}
-	}
-
-	return i;
-}
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h
index a54451d8d011..c02c81e7a667 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h
@@ -41,20 +41,6 @@ const struct msm_format *dpu_get_msm_format(
 		const uint64_t modifiers);
 
 /**
- * dpu_populate_formats - populate the given array with fourcc codes supported
- * @format_list:       pointer to list of possible formats
- * @pixel_formats:     array to populate with fourcc codes
- * @pixel_modifiers:   array to populate with drm modifiers, can be NULL
- * @pixel_formats_max: length of pixel formats array
- * Return: number of elements populated
- */
-uint32_t dpu_populate_formats(
-		const struct dpu_format_extended *format_list,
-		uint32_t *pixel_formats,
-		uint64_t *pixel_modifiers,
-		uint32_t pixel_formats_max);
-
-/**
  * dpu_format_check_modified_format - validate format and buffers for
  *                   dpu non-standard, i.e. modified format
  * @kms:             kms driver
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
index 512ac0834d2b..df6852cc98b9 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
@@ -151,7 +151,9 @@ static const struct dpu_sspp_blks_common sdm845_sspp_common = {
 		.id = DPU_SSPP_CSC_10BIT, \
 		.base = 0x1a00, .len = 0x100,}, \
 	.format_list = plane_formats_yuv, \
+	.num_formats = ARRAY_SIZE(plane_formats_yuv), \
 	.virt_format_list = plane_formats, \
+	.virt_num_formats = ARRAY_SIZE(plane_formats), \
 	}
 
 #define _DMA_SBLK(num, sdma_pri) \
@@ -163,7 +165,9 @@ static const struct dpu_sspp_blks_common sdm845_sspp_common = {
 	.src_blk = {.name = STRCAT("sspp_src_", num), \
 		.id = DPU_SSPP_SRC, .base = 0x00, .len = 0x150,}, \
 	.format_list = plane_formats, \
+	.num_formats = ARRAY_SIZE(plane_formats), \
 	.virt_format_list = plane_formats, \
+	.virt_num_formats = ARRAY_SIZE(plane_formats), \
 	}
 
 static const struct dpu_sspp_sub_blks sdm845_vig_sblk_0 = _VIG_SBLK("0", 5);
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
index 144358a3d0fb..a55653b2e466 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
@@ -252,17 +252,6 @@ struct dpu_pp_blk {
 };
 
 /**
- * struct dpu_format_extended - define dpu specific pixel format+modifier
- * @fourcc_format: Base FOURCC pixel format code
- * @modifier: 64-bit drm format modifier, same modifier must be applied to all
- *            framebuffer planes
- */
-struct dpu_format_extended {
-	uint32_t fourcc_format;
-	uint64_t modifier;
-};
-
-/**
  * enum dpu_qos_lut_usage - define QoS LUT use cases
  */
 enum dpu_qos_lut_usage {
@@ -348,7 +337,9 @@ struct dpu_sspp_blks_common {
  * @pcc_blk:
  * @igc_blk:
  * @format_list: Pointer to list of supported formats
+ * @num_formats: Number of supported formats
  * @virt_format_list: Pointer to list of supported formats for virtual planes
+ * @virt_num_formats: Number of supported formats for virtual planes
  */
 struct dpu_sspp_sub_blks {
 	const struct dpu_sspp_blks_common *common;
@@ -366,8 +357,10 @@ struct dpu_sspp_sub_blks {
 	struct dpu_pp_blk pcc_blk;
 	struct dpu_pp_blk igc_blk;
 
-	const struct dpu_format_extended *format_list;
-	const struct dpu_format_extended *virt_format_list;
+	const u32 *format_list;
+	u32 num_formats;
+	const u32 *virt_format_list;
+	u32 virt_num_formats;
 };
 
 /**
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog_format.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog_format.h
index 3c9f028628ef..d09730985951 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog_format.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog_format.h
@@ -12,157 +12,81 @@
 
 #include "dpu_hw_mdss.h"
 
-static const struct dpu_format_extended plane_formats[] = {
-	{DRM_FORMAT_ARGB8888, 0},
-	{DRM_FORMAT_ABGR8888, 0},
-	{DRM_FORMAT_RGBA8888, 0},
-	{DRM_FORMAT_ABGR8888, DRM_FORMAT_MOD_QCOM_COMPRESSED},
-	{DRM_FORMAT_BGRA8888, 0},
-	{DRM_FORMAT_XRGB8888, 0},
-	{DRM_FORMAT_RGBX8888, 0},
-	{DRM_FORMAT_BGRX8888, 0},
-	{DRM_FORMAT_XBGR8888, 0},
-	{DRM_FORMAT_XBGR8888, DRM_FORMAT_MOD_QCOM_COMPRESSED},
-	{DRM_FORMAT_RGB888, 0},
-	{DRM_FORMAT_BGR888, 0},
-	{DRM_FORMAT_RGB565, 0},
-	{DRM_FORMAT_BGR565, DRM_FORMAT_MOD_QCOM_COMPRESSED},
-	{DRM_FORMAT_BGR565, 0},
-	{DRM_FORMAT_ARGB1555, 0},
-	{DRM_FORMAT_ABGR1555, 0},
-	{DRM_FORMAT_RGBA5551, 0},
-	{DRM_FORMAT_BGRA5551, 0},
-	{DRM_FORMAT_XRGB1555, 0},
-	{DRM_FORMAT_XBGR1555, 0},
-	{DRM_FORMAT_RGBX5551, 0},
-	{DRM_FORMAT_BGRX5551, 0},
-	{DRM_FORMAT_ARGB4444, 0},
-	{DRM_FORMAT_ABGR4444, 0},
-	{DRM_FORMAT_RGBA4444, 0},
-	{DRM_FORMAT_BGRA4444, 0},
-	{DRM_FORMAT_XRGB4444, 0},
-	{DRM_FORMAT_XBGR4444, 0},
-	{DRM_FORMAT_RGBX4444, 0},
-	{DRM_FORMAT_BGRX4444, 0},
-	{0, 0},
+static const uint32_t qcom_compressed_supported_formats[] = {
+	DRM_FORMAT_ABGR8888,
+	DRM_FORMAT_XBGR8888,
+	DRM_FORMAT_BGR565,
 };
 
-static const struct dpu_format_extended plane_formats_yuv[] = {
-	{DRM_FORMAT_ARGB8888, 0},
-	{DRM_FORMAT_ABGR8888, 0},
-	{DRM_FORMAT_RGBA8888, 0},
-	{DRM_FORMAT_BGRX8888, 0},
-	{DRM_FORMAT_ABGR8888, DRM_FORMAT_MOD_QCOM_COMPRESSED},
-	{DRM_FORMAT_BGRA8888, 0},
-	{DRM_FORMAT_XRGB8888, 0},
-	{DRM_FORMAT_XBGR8888, 0},
-	{DRM_FORMAT_RGBX8888, 0},
-	{DRM_FORMAT_XBGR8888, DRM_FORMAT_MOD_QCOM_COMPRESSED},
-	{DRM_FORMAT_RGB888, 0},
-	{DRM_FORMAT_BGR888, 0},
-	{DRM_FORMAT_RGB565, 0},
-	{DRM_FORMAT_BGR565, DRM_FORMAT_MOD_QCOM_COMPRESSED},
-	{DRM_FORMAT_BGR565, 0},
-	{DRM_FORMAT_ARGB1555, 0},
-	{DRM_FORMAT_ABGR1555, 0},
-	{DRM_FORMAT_RGBA5551, 0},
-	{DRM_FORMAT_BGRA5551, 0},
-	{DRM_FORMAT_XRGB1555, 0},
-	{DRM_FORMAT_XBGR1555, 0},
-	{DRM_FORMAT_RGBX5551, 0},
-	{DRM_FORMAT_BGRX5551, 0},
-	{DRM_FORMAT_ARGB4444, 0},
-	{DRM_FORMAT_ABGR4444, 0},
-	{DRM_FORMAT_RGBA4444, 0},
-	{DRM_FORMAT_BGRA4444, 0},
-	{DRM_FORMAT_XRGB4444, 0},
-	{DRM_FORMAT_XBGR4444, 0},
-	{DRM_FORMAT_RGBX4444, 0},
-	{DRM_FORMAT_BGRX4444, 0},
-
-	{DRM_FORMAT_NV12, 0},
-	{DRM_FORMAT_NV12, DRM_FORMAT_MOD_QCOM_COMPRESSED},
-	{DRM_FORMAT_NV21, 0},
-	{DRM_FORMAT_NV16, 0},
-	{DRM_FORMAT_NV61, 0},
-	{DRM_FORMAT_VYUY, 0},
-	{DRM_FORMAT_UYVY, 0},
-	{DRM_FORMAT_YUYV, 0},
-	{DRM_FORMAT_YVYU, 0},
-	{DRM_FORMAT_YUV420, 0},
-	{DRM_FORMAT_YVU420, 0},
-	{0, 0},
-};
-
-static const struct dpu_format_extended cursor_formats[] = {
-	{DRM_FORMAT_ARGB8888, 0},
-	{DRM_FORMAT_ABGR8888, 0},
-	{DRM_FORMAT_RGBA8888, 0},
-	{DRM_FORMAT_BGRA8888, 0},
-	{DRM_FORMAT_XRGB8888, 0},
-	{DRM_FORMAT_ARGB1555, 0},
-	{DRM_FORMAT_ABGR1555, 0},
-	{DRM_FORMAT_RGBA5551, 0},
-	{DRM_FORMAT_BGRA5551, 0},
-	{DRM_FORMAT_ARGB4444, 0},
-	{DRM_FORMAT_ABGR4444, 0},
-	{DRM_FORMAT_RGBA4444, 0},
-	{DRM_FORMAT_BGRA4444, 0},
-	{0, 0},
+static const uint32_t plane_formats[] = {
+	DRM_FORMAT_ARGB8888,
+	DRM_FORMAT_ABGR8888,
+	DRM_FORMAT_RGBA8888,
+	DRM_FORMAT_BGRA8888,
+	DRM_FORMAT_XRGB8888,
+	DRM_FORMAT_RGBX8888,
+	DRM_FORMAT_BGRX8888,
+	DRM_FORMAT_XBGR8888,
+	DRM_FORMAT_RGB888,
+	DRM_FORMAT_BGR888,
+	DRM_FORMAT_RGB565,
+	DRM_FORMAT_BGR565,
+	DRM_FORMAT_ARGB1555,
+	DRM_FORMAT_ABGR1555,
+	DRM_FORMAT_RGBA5551,
+	DRM_FORMAT_BGRA5551,
+	DRM_FORMAT_XRGB1555,
+	DRM_FORMAT_XBGR1555,
+	DRM_FORMAT_RGBX5551,
+	DRM_FORMAT_BGRX5551,
+	DRM_FORMAT_ARGB4444,
+	DRM_FORMAT_ABGR4444,
+	DRM_FORMAT_RGBA4444,
+	DRM_FORMAT_BGRA4444,
+	DRM_FORMAT_XRGB4444,
+	DRM_FORMAT_XBGR4444,
+	DRM_FORMAT_RGBX4444,
+	DRM_FORMAT_BGRX4444,
 };
 
-static const struct dpu_format_extended wb2_formats[] = {
-	{DRM_FORMAT_RGB565, 0},
-	{DRM_FORMAT_BGR565, DRM_FORMAT_MOD_QCOM_COMPRESSED},
-	{DRM_FORMAT_RGB888, 0},
-	{DRM_FORMAT_ARGB8888, 0},
-	{DRM_FORMAT_RGBA8888, 0},
-	{DRM_FORMAT_ABGR8888, DRM_FORMAT_MOD_QCOM_COMPRESSED},
-	{DRM_FORMAT_XRGB8888, 0},
-	{DRM_FORMAT_RGBX8888, 0},
-	{DRM_FORMAT_XBGR8888, DRM_FORMAT_MOD_QCOM_COMPRESSED},
-	{DRM_FORMAT_ARGB1555, 0},
-	{DRM_FORMAT_RGBA5551, 0},
-	{DRM_FORMAT_XRGB1555, 0},
-	{DRM_FORMAT_RGBX5551, 0},
-	{DRM_FORMAT_ARGB4444, 0},
-	{DRM_FORMAT_RGBA4444, 0},
-	{DRM_FORMAT_RGBX4444, 0},
-	{DRM_FORMAT_XRGB4444, 0},
-
-	{DRM_FORMAT_BGR565, 0},
-	{DRM_FORMAT_BGR888, 0},
-	{DRM_FORMAT_ABGR8888, 0},
-	{DRM_FORMAT_BGRA8888, 0},
-	{DRM_FORMAT_BGRX8888, 0},
-	{DRM_FORMAT_XBGR8888, 0},
-	{DRM_FORMAT_ABGR1555, 0},
-	{DRM_FORMAT_BGRA5551, 0},
-	{DRM_FORMAT_XBGR1555, 0},
-	{DRM_FORMAT_BGRX5551, 0},
-	{DRM_FORMAT_ABGR4444, 0},
-	{DRM_FORMAT_BGRA4444, 0},
-	{DRM_FORMAT_BGRX4444, 0},
-	{DRM_FORMAT_XBGR4444, 0},
-
-	{DRM_FORMAT_YUV420, 0},
-	{DRM_FORMAT_NV12, 0},
-	{DRM_FORMAT_NV12, DRM_FORMAT_MOD_QCOM_COMPRESSED},
-	{DRM_FORMAT_NV16, 0},
-	{DRM_FORMAT_YUYV, 0},
-
-	{0, 0},
-};
+static const uint32_t plane_formats_yuv[] = {
+	DRM_FORMAT_ARGB8888,
+	DRM_FORMAT_ABGR8888,
+	DRM_FORMAT_RGBA8888,
+	DRM_FORMAT_BGRX8888,
+	DRM_FORMAT_BGRA8888,
+	DRM_FORMAT_XRGB8888,
+	DRM_FORMAT_XBGR8888,
+	DRM_FORMAT_RGBX8888,
+	DRM_FORMAT_RGB888,
+	DRM_FORMAT_BGR888,
+	DRM_FORMAT_RGB565,
+	DRM_FORMAT_BGR565,
+	DRM_FORMAT_ARGB1555,
+	DRM_FORMAT_ABGR1555,
+	DRM_FORMAT_RGBA5551,
+	DRM_FORMAT_BGRA5551,
+	DRM_FORMAT_XRGB1555,
+	DRM_FORMAT_XBGR1555,
+	DRM_FORMAT_RGBX5551,
+	DRM_FORMAT_BGRX5551,
+	DRM_FORMAT_ARGB4444,
+	DRM_FORMAT_ABGR4444,
+	DRM_FORMAT_RGBA4444,
+	DRM_FORMAT_BGRA4444,
+	DRM_FORMAT_XRGB4444,
+	DRM_FORMAT_XBGR4444,
+	DRM_FORMAT_RGBX4444,
+	DRM_FORMAT_BGRX4444,
 
-static const struct dpu_format_extended rgb_10bit_formats[] = {
-	{DRM_FORMAT_BGRA1010102, 0},
-	{DRM_FORMAT_BGRX1010102, 0},
-	{DRM_FORMAT_RGBA1010102, 0},
-	{DRM_FORMAT_RGBX1010102, 0},
-	{DRM_FORMAT_ABGR2101010, 0},
-	{DRM_FORMAT_ABGR2101010, DRM_FORMAT_MOD_QCOM_COMPRESSED},
-	{DRM_FORMAT_XBGR2101010, 0},
-	{DRM_FORMAT_XBGR2101010, DRM_FORMAT_MOD_QCOM_COMPRESSED},
-	{DRM_FORMAT_ARGB2101010, 0},
-	{DRM_FORMAT_XRGB2101010, 0},
+	DRM_FORMAT_NV12,
+	DRM_FORMAT_NV21,
+	DRM_FORMAT_NV16,
+	DRM_FORMAT_NV61,
+	DRM_FORMAT_VYUY,
+	DRM_FORMAT_UYVY,
+	DRM_FORMAT_YUYV,
+	DRM_FORMAT_YVYU,
+	DRM_FORMAT_YUV420,
+	DRM_FORMAT_YVU420,
 };
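
The catalog hunks above replace {fourcc, modifier} pairs terminated by a {0, 0} sentinel with bare fourcc arrays whose length travels alongside as an ARRAY_SIZE() count, with compressed-capable formats split into their own list. A self-contained user-space sketch of the new shape (the fourcc macro mirrors drm_fourcc.h; format values shown are the real XR24/NV12 codes):

#include <stdint.h>
#include <stdio.h>

#define fourcc_code(a, b, c, d) ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
				 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))
#define DRM_FORMAT_XRGB8888 fourcc_code('X', 'R', '2', '4')
#define DRM_FORMAT_NV12     fourcc_code('N', 'V', '1', '2')
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const uint32_t plane_formats[] = {
	DRM_FORMAT_XRGB8888,
	DRM_FORMAT_NV12,
};

int main(void)
{
	/* no sentinel walk needed; the count travels with the list */
	printf("%zu formats\n", ARRAY_SIZE(plane_formats));
	return 0;
}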
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
index c0b7f0049365..8a28a03ac6a9 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c
@@ -170,10 +170,6 @@
 /**
  * AD4 interrupt status bit definitions
  */
-#define DPU_INTR_BRIGHTPR_UPDATED BIT(4)
-#define DPU_INTR_DARKENH_UPDATED BIT(3)
-#define DPU_INTR_STREN_OUTROI_UPDATED BIT(2)
-#define DPU_INTR_STREN_INROI_UPDATED BIT(1)
 #define DPU_INTR_BACKLIGHT_UPDATED BIT(0)
 /**
  * struct dpu_intr_reg - array of DPU register sets
@@ -782,18 +778,6 @@ static int dpu_hw_intr_irqidx_lookup(enum dpu_intr_type intr_type,
 	return -EINVAL;
 }
 
-static void dpu_hw_intr_set_mask(struct dpu_hw_intr *intr, uint32_t reg_off,
-		uint32_t mask)
-{
-	if (!intr)
-		return;
-
-	DPU_REG_WRITE(&intr->hw, reg_off, mask);
-
-	/* ensure register writes go through */
-	wmb();
-}
-
 static void dpu_hw_intr_dispatch_irq(struct dpu_hw_intr *intr,
 		void (*cbfunc)(void *, int),
 		void *arg)
@@ -1004,18 +988,6 @@ static int dpu_hw_intr_disable_irqs(struct dpu_hw_intr *intr)
 	return 0;
 }
 
-static int dpu_hw_intr_get_valid_interrupts(struct dpu_hw_intr *intr,
-		uint32_t *mask)
-{
-	if (!intr || !mask)
-		return -EINVAL;
-
-	*mask = IRQ_SOURCE_MDP | IRQ_SOURCE_DSI0 | IRQ_SOURCE_DSI1
-		| IRQ_SOURCE_HDMI | IRQ_SOURCE_EDP;
-
-	return 0;
-}
-
 static void dpu_hw_intr_get_interrupt_statuses(struct dpu_hw_intr *intr)
 {
 	int i;
@@ -1065,19 +1037,6 @@ static void dpu_hw_intr_clear_intr_status_nolock(struct dpu_hw_intr *intr,
 	wmb();
 }
 
-static void dpu_hw_intr_clear_interrupt_status(struct dpu_hw_intr *intr,
-		int irq_idx)
-{
-	unsigned long irq_flags;
-
-	if (!intr)
-		return;
-
-	spin_lock_irqsave(&intr->irq_lock, irq_flags);
-	dpu_hw_intr_clear_intr_status_nolock(intr, irq_idx);
-	spin_unlock_irqrestore(&intr->irq_lock, irq_flags);
-}
-
 static u32 dpu_hw_intr_get_interrupt_status(struct dpu_hw_intr *intr,
 		int irq_idx, bool clear)
 {
@@ -1113,16 +1072,13 @@ static u32 dpu_hw_intr_get_interrupt_status(struct dpu_hw_intr *intr,
 
 static void __setup_intr_ops(struct dpu_hw_intr_ops *ops)
 {
-	ops->set_mask = dpu_hw_intr_set_mask;
 	ops->irq_idx_lookup = dpu_hw_intr_irqidx_lookup;
 	ops->enable_irq = dpu_hw_intr_enable_irq;
 	ops->disable_irq = dpu_hw_intr_disable_irq;
 	ops->dispatch_irqs = dpu_hw_intr_dispatch_irq;
 	ops->clear_all_irqs = dpu_hw_intr_clear_irqs;
 	ops->disable_all_irqs = dpu_hw_intr_disable_irqs;
-	ops->get_valid_interrupts = dpu_hw_intr_get_valid_interrupts;
 	ops->get_interrupt_statuses = dpu_hw_intr_get_interrupt_statuses;
-	ops->clear_interrupt_status = dpu_hw_intr_clear_interrupt_status;
 	ops->clear_intr_status_nolock = dpu_hw_intr_clear_intr_status_nolock;
 	ops->get_interrupt_status = dpu_hw_intr_get_interrupt_status;
 }
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h
index 61e4cba36562..4d7a1c727ce2 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h
@@ -20,13 +20,6 @@
 #include "dpu_hw_util.h"
 #include "dpu_hw_mdss.h"
 
-#define IRQ_SOURCE_MDP		BIT(0)
-#define IRQ_SOURCE_DSI0		BIT(4)
-#define IRQ_SOURCE_DSI1		BIT(5)
-#define IRQ_SOURCE_HDMI		BIT(8)
-#define IRQ_SOURCE_EDP		BIT(12)
-#define IRQ_SOURCE_MHL		BIT(16)
-
 /**
  * dpu_intr_type - HW Interrupt Type
  * @DPU_IRQ_TYPE_WB_ROT_COMP:		WB rotator done
@@ -96,18 +89,6 @@ struct dpu_hw_intr;
  */
 struct dpu_hw_intr_ops {
 	/**
-	 * set_mask - Programs the given interrupt register with the
-	 *            given interrupt mask. Register value will get overwritten.
-	 * @intr:	HW interrupt handle
-	 * @reg_off:	MDSS HW register offset
-	 * @irqmask:	IRQ mask value
-	 */
-	void (*set_mask)(
-			struct dpu_hw_intr *intr,
-			uint32_t reg,
-			uint32_t irqmask);
-
-	/**
 	 * irq_idx_lookup - Lookup IRQ index on the HW interrupt type
 	 *                 Used for all irq related ops
 	 * @intr_type:		Interrupt type defined in dpu_intr_type
@@ -177,16 +158,6 @@ struct dpu_hw_intr_ops {
 			struct dpu_hw_intr *intr);
 
 	/**
-	 * clear_interrupt_status - Clears HW interrupt status based on given
-	 *                          lookup IRQ index.
-	 * @intr:	HW interrupt handle
-	 * @irq_idx:	Lookup irq index return from irq_idx_lookup
-	 */
-	void (*clear_interrupt_status)(
-			struct dpu_hw_intr *intr,
-			int irq_idx);
-
-	/**
 	 * clear_intr_status_nolock() - clears the HW interrupts without lock
 	 * @intr:	HW interrupt handle
 	 * @irq_idx:	Lookup irq index return from irq_idx_lookup
@@ -206,21 +177,6 @@ struct dpu_hw_intr_ops {
 			struct dpu_hw_intr *intr,
 			int irq_idx,
 			bool clear);
-
-	/**
-	 * get_valid_interrupts - Gets a mask of all valid interrupt sources
-	 *                        within DPU. These are actually status bits
-	 *                        within interrupt registers that specify the
-	 *                        source of the interrupt in IRQs. For example,
-	 *                        valid interrupt sources can be MDP, DSI,
-	 *                        HDMI etc.
-	 * @intr:	HW interrupt handle
-	 * @mask:	Returning the interrupt source MASK
-	 * @return:	0 for success, otherwise failure
-	 */
-	int (*get_valid_interrupts)(
-			struct dpu_hw_intr *intr,
-			uint32_t *mask);
 };
 
 /**
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
index 68c54d2c9677..1ab8d4a889f7 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
@@ -258,12 +258,6 @@ enum dpu_vbif {
 	VBIF_NRT = VBIF_1
 };
 
-enum dpu_iommu_domain {
-	DPU_IOMMU_DOMAIN_UNSECURE,
-	DPU_IOMMU_DOMAIN_SECURE,
-	DPU_IOMMU_DOMAIN_MAX
-};
-
 /**
  * DPU HW,Component order color map
  */
@@ -358,7 +352,6 @@ enum dpu_3d_blend_mode {
  * @alpha_enable: whether the format has an alpha channel
  * @num_planes: number of planes (including meta data planes)
  * @fetch_mode: linear, tiled, or ubwc hw fetch behavior
- * @is_yuv: is format a yuv variant
  * @flag: usage bit flags
  * @tile_width: format tile width
  * @tile_height: format tile height
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.h
index 321fc64ddd0e..efe70c508ee0 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_util.h
@@ -18,7 +18,6 @@
 #include "dpu_hw_mdss.h"
 
 #define REG_MASK(n)                     ((BIT(n)) - 1)
-struct dpu_format_extended;
 
 /*
  * This is the common struct maintained by each sub block
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index 4d67b3c96702..885bf88afa3e 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -405,35 +405,38 @@ static void dpu_kms_wait_for_commit_done(struct msm_kms *kms,
 	}
 }
 
-static void _dpu_kms_initialize_dsi(struct drm_device *dev,
+static int _dpu_kms_initialize_dsi(struct drm_device *dev,
 				    struct msm_drm_private *priv,
 				    struct dpu_kms *dpu_kms)
 {
 	struct drm_encoder *encoder = NULL;
-	int i, rc;
+	int i, rc = 0;
+
+	if (!(priv->dsi[0] || priv->dsi[1]))
+		return rc;
 
 	/*TODO: Support two independent DSI connectors */
 	encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI);
-	if (IS_ERR_OR_NULL(encoder)) {
+	if (IS_ERR(encoder)) {
 		DPU_ERROR("encoder init failed for dsi display\n");
-		return;
+		return PTR_ERR(encoder);
 	}
 
 	priv->encoders[priv->num_encoders++] = encoder;
 
 	for (i = 0; i < ARRAY_SIZE(priv->dsi); i++) {
-		if (!priv->dsi[i]) {
-			DPU_DEBUG("invalid msm_dsi for ctrl %d\n", i);
-			return;
-		}
+		if (!priv->dsi[i])
+			continue;
 
 		rc = msm_dsi_modeset_init(priv->dsi[i], dev, encoder);
 		if (rc) {
 			DPU_ERROR("modeset_init failed for dsi[%d], rc = %d\n",
 				i, rc);
-			continue;
+			break;
 		}
 	}
+
+	return rc;
 }
 
 /**
@@ -444,16 +447,16 @@ static void _dpu_kms_initialize_dsi(struct drm_device *dev,
  * @dpu_kms:    Pointer to dpu kms structure
  * Returns:     Zero on success
  */
-static void _dpu_kms_setup_displays(struct drm_device *dev,
+static int _dpu_kms_setup_displays(struct drm_device *dev,
 				    struct msm_drm_private *priv,
 				    struct dpu_kms *dpu_kms)
 {
-	_dpu_kms_initialize_dsi(dev, priv, dpu_kms);
-
 	/**
 	 * Extend this function to initialize other
 	 * types of displays
 	 */
+
+	return _dpu_kms_initialize_dsi(dev, priv, dpu_kms);
 }
 
 static void _dpu_kms_drm_obj_destroy(struct dpu_kms *dpu_kms)
@@ -516,7 +519,9 @@ static int _dpu_kms_drm_obj_init(struct dpu_kms *dpu_kms)
 	 * Create encoder and query display drivers to create
 	 * bridges and connectors
 	 */
-	_dpu_kms_setup_displays(dev, priv, dpu_kms);
+	ret = _dpu_kms_setup_displays(dev, priv, dpu_kms);
+	if (ret)
+		goto fail;
 
 	max_crtc_count = min(catalog->mixer_count, priv->num_encoders);
 
@@ -627,6 +632,10 @@ static void _dpu_kms_hw_destroy(struct dpu_kms *dpu_kms)
 		devm_iounmap(&dpu_kms->pdev->dev, dpu_kms->vbif[VBIF_RT]);
 	dpu_kms->vbif[VBIF_RT] = NULL;
 
+	if (dpu_kms->hw_mdp)
+		dpu_hw_mdp_destroy(dpu_kms->hw_mdp);
+	dpu_kms->hw_mdp = NULL;
+
 	if (dpu_kms->mmio)
 		devm_iounmap(&dpu_kms->pdev->dev, dpu_kms->mmio);
 	dpu_kms->mmio = NULL;
@@ -877,8 +886,7 @@ static int dpu_kms_hw_init(struct msm_kms *kms)
 		goto power_error;
 	}
 
-	rc = dpu_rm_init(&dpu_kms->rm, dpu_kms->catalog, dpu_kms->mmio,
-			dpu_kms->dev);
+	rc = dpu_rm_init(&dpu_kms->rm, dpu_kms->catalog, dpu_kms->mmio);
 	if (rc) {
 		DPU_ERROR("rm init failed: %d\n", rc);
 		goto power_error;
@@ -886,11 +894,10 @@ static int dpu_kms_hw_init(struct msm_kms *kms)
 
 	dpu_kms->rm_init = true;
 
-	dpu_kms->hw_mdp = dpu_rm_get_mdp(&dpu_kms->rm);
-	if (IS_ERR_OR_NULL(dpu_kms->hw_mdp)) {
+	dpu_kms->hw_mdp = dpu_hw_mdptop_init(MDP_TOP, dpu_kms->mmio,
+					     dpu_kms->catalog);
+	if (IS_ERR(dpu_kms->hw_mdp)) {
 		rc = PTR_ERR(dpu_kms->hw_mdp);
-		if (!dpu_kms->hw_mdp)
-			rc = -EINVAL;
 		DPU_ERROR("failed to get hw_mdp: %d\n", rc);
 		dpu_kms->hw_mdp = NULL;
 		goto power_error;
@@ -926,16 +933,6 @@ static int dpu_kms_hw_init(struct msm_kms *kms)
 		goto hw_intr_init_err;
 	}
 
-	/*
-	 * _dpu_kms_drm_obj_init should create the DRM related objects
-	 * i.e. CRTCs, planes, encoders, connectors and so forth
-	 */
-	rc = _dpu_kms_drm_obj_init(dpu_kms);
-	if (rc) {
-		DPU_ERROR("modeset init failed: %d\n", rc);
-		goto drm_obj_init_err;
-	}
-
 	dev->mode_config.min_width = 0;
 	dev->mode_config.min_height = 0;
 
@@ -952,6 +949,16 @@ static int dpu_kms_hw_init(struct msm_kms *kms)
 	 */
 	dev->mode_config.allow_fb_modifiers = true;
 
+	/*
+	 * _dpu_kms_drm_obj_init should create the DRM related objects
+	 * i.e. CRTCs, planes, encoders, connectors and so forth
+	 */
+	rc = _dpu_kms_drm_obj_init(dpu_kms);
+	if (rc) {
+		DPU_ERROR("modeset init failed: %d\n", rc);
+		goto drm_obj_init_err;
+	}
+
 	dpu_vbif_init_memtypes(dpu_kms);
 
 	pm_runtime_put_sync(&dpu_kms->pdev->dev);
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
index cb307a2abf06..7316b4ab1b85 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
@@ -23,11 +23,14 @@ struct dpu_mdss {
 	struct dpu_irq_controller irq_controller;
 };
 
-static irqreturn_t dpu_mdss_irq(int irq, void *arg)
+static void dpu_mdss_irq(struct irq_desc *desc)
 {
-	struct dpu_mdss *dpu_mdss = arg;
+	struct dpu_mdss *dpu_mdss = irq_desc_get_handler_data(desc);
+	struct irq_chip *chip = irq_desc_get_chip(desc);
 	u32 interrupts;
 
+	chained_irq_enter(chip, desc);
+
 	interrupts = readl_relaxed(dpu_mdss->mmio + HW_INTR_STATUS);
 
 	while (interrupts) {
@@ -39,20 +42,20 @@ static irqreturn_t dpu_mdss_irq(int irq, void *arg)
 					   hwirq);
 		if (mapping == 0) {
 			DRM_ERROR("couldn't find irq mapping for %lu\n", hwirq);
-			return IRQ_NONE;
+			break;
 		}
 
 		rc = generic_handle_irq(mapping);
 		if (rc < 0) {
 			DRM_ERROR("handle irq fail: irq=%lu mapping=%u rc=%d\n",
 				  hwirq, mapping, rc);
-			return IRQ_NONE;
+			break;
 		}
 
 		interrupts &= ~(1 << hwirq);
 	}
 
-	return IRQ_HANDLED;
+	chained_irq_exit(chip, desc);
 }
 
 static void dpu_mdss_irq_mask(struct irq_data *irqd)
@@ -83,16 +86,16 @@ static struct irq_chip dpu_mdss_irq_chip = {
 	.irq_unmask = dpu_mdss_irq_unmask,
 };
 
+static struct lock_class_key dpu_mdss_lock_key, dpu_mdss_request_key;
+
 static int dpu_mdss_irqdomain_map(struct irq_domain *domain,
 		unsigned int irq, irq_hw_number_t hwirq)
 {
 	struct dpu_mdss *dpu_mdss = domain->host_data;
-	int ret;
 
+	irq_set_lockdep_class(irq, &dpu_mdss_lock_key, &dpu_mdss_request_key);
 	irq_set_chip_and_handler(irq, &dpu_mdss_irq_chip, handle_level_irq);
-	ret = irq_set_chip_data(irq, dpu_mdss);
-
-	return ret;
+	return irq_set_chip_data(irq, dpu_mdss);
 }
 
 static const struct irq_domain_ops dpu_mdss_irqdomain_ops = {
@@ -159,11 +162,13 @@ static void dpu_mdss_destroy(struct drm_device *dev)
 	struct msm_drm_private *priv = dev->dev_private;
 	struct dpu_mdss *dpu_mdss = to_dpu_mdss(priv->mdss);
 	struct dss_module_power *mp = &dpu_mdss->mp;
+	int irq;
 
 	pm_runtime_suspend(dev->dev);
 	pm_runtime_disable(dev->dev);
 	_dpu_mdss_irq_domain_fini(dpu_mdss);
-	free_irq(platform_get_irq(pdev, 0), dpu_mdss);
+	irq = platform_get_irq(pdev, 0);
+	irq_set_chained_handler_and_data(irq, NULL, NULL);
 	msm_dss_put_clk(mp->clk_config, mp->num_clk);
 	devm_kfree(&pdev->dev, mp->clk_config);
 
@@ -187,6 +192,7 @@ int dpu_mdss_init(struct drm_device *dev)
 	struct dpu_mdss *dpu_mdss;
 	struct dss_module_power *mp;
 	int ret = 0;
+	int irq;
 
 	dpu_mdss = devm_kzalloc(dev->dev, sizeof(*dpu_mdss), GFP_KERNEL);
 	if (!dpu_mdss)
@@ -219,12 +225,12 @@ int dpu_mdss_init(struct drm_device *dev)
 	if (ret)
 		goto irq_domain_error;
 
-	ret = request_irq(platform_get_irq(pdev, 0),
-			dpu_mdss_irq, 0, "dpu_mdss_isr", dpu_mdss);
-	if (ret) {
-		DPU_ERROR("failed to init irq: %d\n", ret);
+	irq = platform_get_irq(pdev, 0);
+	if (irq < 0) {
+		ret = irq; /* propagate the error from platform_get_irq() */
 		goto irq_error;
 	}
+
+	irq_set_chained_handler_and_data(irq, dpu_mdss_irq,
+					 dpu_mdss);
 
 	pm_runtime_enable(dev->dev);
 
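
The dpu_mdss conversion above swaps a flat request_irq() handler for a chained flow handler: the MDSS interrupt demultiplexes one upstream line into per-block children, so it belongs in the parent's flow rather than in a handler of its own. A minimal sketch of the pattern, assuming a single 32-bit status register and a linear irq domain (every "my_" name is hypothetical):

#include <linux/bitops.h>
#include <linux/io.h>
#include <linux/irq.h>
#include <linux/irqchip/chained_irq.h>
#include <linux/irqdomain.h>

#define MY_INTR_STATUS	0x0010		/* hypothetical status register */

struct my_hw {
	void __iomem *base;
	struct irq_domain *domain;
};

static void my_chained_handler(struct irq_desc *desc)
{
	struct my_hw *hw = irq_desc_get_handler_data(desc);
	struct irq_chip *chip = irq_desc_get_chip(desc);
	unsigned long status;
	unsigned int hwirq;

	chained_irq_enter(chip, desc);		/* ack/mask at the parent */

	status = readl_relaxed(hw->base + MY_INTR_STATUS);
	for_each_set_bit(hwirq, &status, 32) {
		unsigned int virq = irq_find_mapping(hw->domain, hwirq);

		if (virq)
			generic_handle_irq(virq);
	}

	chained_irq_exit(chip, desc);		/* eoi/unmask at the parent */
}

static void my_hw_hook_irq(struct my_hw *hw, int parent_irq)
{
	/* replaces request_irq(); teardown passes NULL, NULL instead of
	 * calling free_irq(), as dpu_mdss_destroy() now does */
	irq_set_chained_handler_and_data(parent_irq, my_chained_handler, hw);
}

A chained handler runs inside the parent interrupt's flow, which is why dpu_mdss_irq() loses its irqreturn_t and must bracket the demux with chained_irq_enter()/chained_irq_exit(); the lockdep class keys keep the nested irq_desc locks distinguishable from the parent's.
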
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
index 6aefcd6db46b..b01183b309b9 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
@@ -95,8 +95,6 @@ struct dpu_plane {
 
 	enum dpu_sspp pipe;
 	uint32_t features;      /* capabilities from catalog */
-	uint32_t nformats;
-	uint32_t formats[64];
 
 	struct dpu_hw_pipe *pipe_hw;
 	struct dpu_hw_pipe_cfg pipe_cfg;
@@ -121,6 +119,12 @@ struct dpu_plane {
 	bool debugfs_default_scale;
 };
 
+static const uint64_t supported_format_modifiers[] = {
+	DRM_FORMAT_MOD_QCOM_COMPRESSED,
+	DRM_FORMAT_MOD_LINEAR,
+	DRM_FORMAT_MOD_INVALID
+};
+
 #define to_dpu_plane(x) container_of(x, struct dpu_plane, base)
 
 static struct dpu_kms *_dpu_plane_get_kms(struct drm_plane *plane)
@@ -1410,6 +1414,23 @@ static void dpu_plane_early_unregister(struct drm_plane *plane)
 	debugfs_remove_recursive(pdpu->debugfs_root);
 }
 
+static bool dpu_plane_format_mod_supported(struct drm_plane *plane,
+		uint32_t format, uint64_t modifier)
+{
+	if (modifier == DRM_FORMAT_MOD_LINEAR)
+		return true;
+
+	if (modifier == DRM_FORMAT_MOD_QCOM_COMPRESSED) {
+		int i;
+
+		for (i = 0; i < ARRAY_SIZE(qcom_compressed_supported_formats); i++) {
+			if (format == qcom_compressed_supported_formats[i])
+				return true;
+		}
+	}
+
+	return false;
+}
+
 static const struct drm_plane_funcs dpu_plane_funcs = {
 		.update_plane = drm_atomic_helper_update_plane,
 		.disable_plane = drm_atomic_helper_disable_plane,
@@ -1419,6 +1440,7 @@ static const struct drm_plane_funcs dpu_plane_funcs = {
 		.atomic_destroy_state = dpu_plane_destroy_state,
 		.late_register = dpu_plane_late_register,
 		.early_unregister = dpu_plane_early_unregister,
+		.format_mod_supported = dpu_plane_format_mod_supported,
 };
 
 static const struct drm_plane_helper_funcs dpu_plane_helper_funcs = {
@@ -1444,11 +1466,12 @@ struct drm_plane *dpu_plane_init(struct drm_device *dev,
 		unsigned long possible_crtcs, u32 master_plane_id)
 {
 	struct drm_plane *plane = NULL, *master_plane = NULL;
-	const struct dpu_format_extended *format_list;
+	const uint32_t *format_list;
 	struct dpu_plane *pdpu;
 	struct msm_drm_private *priv = dev->dev_private;
 	struct dpu_kms *kms = to_dpu_kms(priv->kms);
 	int zpos_max = DPU_ZPOS_MAX;
+	uint32_t num_formats;
 	int ret = -EINVAL;
 
 	/* create and zero local structure */
@@ -1491,24 +1514,18 @@ struct drm_plane *dpu_plane_init(struct drm_device *dev,
 		goto clean_sspp;
 	}
 
-	if (!master_plane_id)
-		format_list = pdpu->pipe_sblk->format_list;
-	else
+	if (pdpu->is_virtual) {
 		format_list = pdpu->pipe_sblk->virt_format_list;
-
-	pdpu->nformats = dpu_populate_formats(format_list,
-				pdpu->formats,
-				0,
-				ARRAY_SIZE(pdpu->formats));
-
-	if (!pdpu->nformats) {
-		DPU_ERROR("[%u]no valid formats for plane\n", pipe);
-		goto clean_sspp;
+		num_formats = pdpu->pipe_sblk->virt_num_formats;
+	} else {
+		format_list = pdpu->pipe_sblk->format_list;
+		num_formats = pdpu->pipe_sblk->num_formats;
 	}
 
 	ret = drm_universal_plane_init(dev, plane, 0xff, &dpu_plane_funcs,
-				pdpu->formats, pdpu->nformats,
-				NULL, type, NULL);
+				format_list, num_formats,
+				supported_format_modifiers, type, NULL);
 	if (ret)
 		goto clean_sspp;
 
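
dpu_plane_format_mod_supported() above is a plain membership test against qcom_compressed_supported_formats, a catalog table defined elsewhere in this series. Purely for illustration (the authoritative table is not in this hunk), a plausible shape mirrors the entries the old format lists tagged with DRM_FORMAT_MOD_QCOM_COMPRESSED:

/* hypothetical stand-in; the real table lives in the DPU catalog */
static const uint32_t qcom_compressed_supported_formats[] = {
	DRM_FORMAT_NV12,
	DRM_FORMAT_ABGR2101010,
	DRM_FORMAT_XBGR2101010,
};

The core invokes the hook once per (format, modifier) pair while building the plane's IN_FORMATS property blob, so a linear scan is sufficient.
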
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h
index 7fed0b627708..0e6063acd041 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h
@@ -28,23 +28,18 @@
 /**
  * struct dpu_plane_state: Define dpu extension of drm plane state object
  * @base:	base drm plane state object
- * @property_state: Local storage for msm_prop properties
- * @property_values:	cached plane property values
  * @aspace:	pointer to address space for input/output buffers
- * @input_fence:	dereferenced input fence pointer
  * @stage:	assigned by crtc blender
  * @multirect_index: index of the rectangle of SSPP
  * @multirect_mode: parallel or time multiplex multirect mode
  * @pending:	whether the current update is still pending
  * @scaler3_cfg: configuration data for scaler3
  * @pixel_ext: configuration data for pixel extensions
- * @scaler_check_state: indicates status of user provided pixel extension data
  * @cdp_cfg:	CDP configuration
  */
 struct dpu_plane_state {
 	struct drm_plane_state base;
 	struct msm_gem_address_space *aspace;
-	void *input_fence;
 	enum dpu_stage stage;
 	uint32_t multirect_index;
 	uint32_t multirect_mode;
@@ -107,12 +102,6 @@ void dpu_plane_restore(struct drm_plane *plane);
 void dpu_plane_flush(struct drm_plane *plane);
 
 /**
- * dpu_plane_kickoff - final plane operations before commit kickoff
- * @plane: Pointer to drm plane structure
- */
-void dpu_plane_kickoff(struct drm_plane *plane);
-
-/**
  * dpu_plane_set_error: enable/disable error condition
  * @plane: pointer to drm_plane structure
  */
@@ -147,14 +136,6 @@ int dpu_plane_validate_multirect_v2(struct dpu_multirect_plane_states *plane);
 void dpu_plane_clear_multirect(const struct drm_plane_state *drm_state);
 
 /**
- * dpu_plane_wait_input_fence - wait for input fence object
- * @plane:   Pointer to DRM plane object
- * @wait_ms: Wait timeout value
- * Returns: Zero on success
- */
-int dpu_plane_wait_input_fence(struct drm_plane *plane, uint32_t wait_ms);
-
-/**
  * dpu_plane_color_fill - enables color fill on plane
  * @plane:  Pointer to DRM plane object
  * @color:  RGB fill color value, [23..16] Blue, [15..8] Green, [7..0] Red
@@ -164,12 +145,4 @@ int dpu_plane_wait_input_fence(struct drm_plane *plane, uint32_t wait_ms);
 int dpu_plane_color_fill(struct drm_plane *plane,
 		uint32_t color, uint32_t alpha);
 
-/**
- * dpu_plane_set_revalidate - sets revalidate flag which forces a full
- *	validation of the plane properties in the next atomic check
- * @plane: Pointer to DRM plane object
- * @enable: Boolean to set/unset the flag
- */
-void dpu_plane_set_revalidate(struct drm_plane *plane, bool enable);
-
 #endif /* _DPU_PLANE_H_ */
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
index bdb117709674..037d9f4187f9 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
@@ -21,8 +21,8 @@
 #include "dpu_encoder.h"
 #include "dpu_trace.h"
 
-#define RESERVED_BY_OTHER(h, r) \
-	((h)->rsvp && ((h)->rsvp->enc_id != (r)->enc_id))
+#define RESERVED_BY_OTHER(h, r) \
+	((h)->enc_id && (h)->enc_id != r)
 
 /**
  * struct dpu_rm_requirements - Reservation requirements parameter bundle
@@ -34,90 +34,21 @@ struct dpu_rm_requirements {
 	struct dpu_encoder_hw_resources hw_res;
 };
 
-/**
- * struct dpu_rm_rsvp - Use Case Reservation tagging structure
- *	Used to tag HW blocks as reserved by a CRTC->Encoder->Connector chain
- *	By using as a tag, rather than lists of pointers to HW blocks used
- *	we can avoid some list management since we don't know how many blocks
- *	of each type a given use case may require.
- * @list:	List head for list of all reservations
- * @seq:	Global RSVP sequence number for debugging, especially for
- *		differentiating differenct allocations for same encoder.
- * @enc_id:	Reservations are tracked by Encoder DRM object ID.
- *		CRTCs may be connected to multiple Encoders.
- *		An encoder or connector id identifies the display path.
- */
-struct dpu_rm_rsvp {
-	struct list_head list;
-	uint32_t seq;
-	uint32_t enc_id;
-};
 
 /**
  * struct dpu_rm_hw_blk - hardware block tracking list member
  * @list:	List head for list of all hardware blocks tracking items
- * @rsvp:	Pointer to use case reservation if reserved by a client
- * @rsvp_nxt:	Temporary pointer used during reservation to the incoming
- *		request. Will be swapped into rsvp if proposal is accepted
- * @type:	Type of hardware block this structure tracks
  * @id:		Hardware ID number, within its own space, i.e. LM_X
- * @catalog:	Pointer to the hardware catalog entry for this block
+ * @enc_id:	Encoder id to which this block is bound
  * @hw:		Pointer to the hardware register access object for this block
  */
 struct dpu_rm_hw_blk {
 	struct list_head list;
-	struct dpu_rm_rsvp *rsvp;
-	struct dpu_rm_rsvp *rsvp_nxt;
-	enum dpu_hw_blk_type type;
 	uint32_t id;
+	uint32_t enc_id;
 	struct dpu_hw_blk *hw;
 };
 
-/**
- * dpu_rm_dbg_rsvp_stage - enum of steps in making reservation for event logging
- */
-enum dpu_rm_dbg_rsvp_stage {
-	DPU_RM_STAGE_BEGIN,
-	DPU_RM_STAGE_AFTER_CLEAR,
-	DPU_RM_STAGE_AFTER_RSVPNEXT,
-	DPU_RM_STAGE_FINAL
-};
-
-static void _dpu_rm_print_rsvps(
-		struct dpu_rm *rm,
-		enum dpu_rm_dbg_rsvp_stage stage)
-{
-	struct dpu_rm_rsvp *rsvp;
-	struct dpu_rm_hw_blk *blk;
-	enum dpu_hw_blk_type type;
-
-	DPU_DEBUG("%d\n", stage);
-
-	list_for_each_entry(rsvp, &rm->rsvps, list) {
-		DRM_DEBUG_KMS("%d rsvp[s%ue%u]\n", stage, rsvp->seq,
-			      rsvp->enc_id);
-	}
-
-	for (type = 0; type < DPU_HW_BLK_MAX; type++) {
-		list_for_each_entry(blk, &rm->hw_blks[type], list) {
-			if (!blk->rsvp && !blk->rsvp_nxt)
-				continue;
-
-			DRM_DEBUG_KMS("%d rsvp[s%ue%u->s%ue%u] %d %d\n", stage,
-				(blk->rsvp) ? blk->rsvp->seq : 0,
-				(blk->rsvp) ? blk->rsvp->enc_id : 0,
-				(blk->rsvp_nxt) ? blk->rsvp_nxt->seq : 0,
-				(blk->rsvp_nxt) ? blk->rsvp_nxt->enc_id : 0,
-				blk->type, blk->id);
-		}
-	}
-}
-
-struct dpu_hw_mdp *dpu_rm_get_mdp(struct dpu_rm *rm)
-{
-	return rm->hw_mdp;
-}
-
 void dpu_rm_init_hw_iter(
 		struct dpu_rm_hw_iter *iter,
 		uint32_t enc_id,
@@ -148,15 +79,7 @@ static bool _dpu_rm_get_hw_locked(struct dpu_rm *rm, struct dpu_rm_hw_iter *i)
 	i->blk = list_prepare_entry(i->blk, blk_list, list);
 
 	list_for_each_entry_continue(i->blk, blk_list, list) {
-		struct dpu_rm_rsvp *rsvp = i->blk->rsvp;
-
-		if (i->blk->type != i->type) {
-			DPU_ERROR("found incorrect block type %d on %d list\n",
-					i->blk->type, i->type);
-			return false;
-		}
-
-		if ((i->enc_id == 0) || (rsvp && rsvp->enc_id == i->enc_id)) {
+		if (i->enc_id == i->blk->enc_id) {
 			i->hw = i->blk->hw;
 			DPU_DEBUG("found type %d id %d for enc %d\n",
 					i->type, i->blk->id, i->enc_id);
@@ -208,34 +131,18 @@ static void _dpu_rm_hw_destroy(enum dpu_hw_blk_type type, void *hw)
 
 int dpu_rm_destroy(struct dpu_rm *rm)
 {
-
-	struct dpu_rm_rsvp *rsvp_cur, *rsvp_nxt;
 	struct dpu_rm_hw_blk *hw_cur, *hw_nxt;
 	enum dpu_hw_blk_type type;
 
-	if (!rm) {
-		DPU_ERROR("invalid rm\n");
-		return -EINVAL;
-	}
-
-	list_for_each_entry_safe(rsvp_cur, rsvp_nxt, &rm->rsvps, list) {
-		list_del(&rsvp_cur->list);
-		kfree(rsvp_cur);
-	}
-
-
 	for (type = 0; type < DPU_HW_BLK_MAX; type++) {
 		list_for_each_entry_safe(hw_cur, hw_nxt, &rm->hw_blks[type],
 				list) {
 			list_del(&hw_cur->list);
-			_dpu_rm_hw_destroy(hw_cur->type, hw_cur->hw);
+			_dpu_rm_hw_destroy(type, hw_cur->hw);
 			kfree(hw_cur);
 		}
 	}
 
-	dpu_hw_mdp_destroy(rm->hw_mdp);
-	rm->hw_mdp = NULL;
-
 	mutex_destroy(&rm->rm_lock);
 
 	return 0;
@@ -250,11 +157,8 @@ static int _dpu_rm_hw_blk_create(
 		void *hw_catalog_info)
 {
 	struct dpu_rm_hw_blk *blk;
-	struct dpu_hw_mdp *hw_mdp;
 	void *hw;
 
-	hw_mdp = rm->hw_mdp;
-
 	switch (type) {
 	case DPU_HW_BLK_LM:
 		hw = dpu_hw_lm_init(id, mmio, cat);
@@ -290,9 +194,9 @@ static int _dpu_rm_hw_blk_create(
 		return -ENOMEM;
 	}
 
-	blk->type = type;
 	blk->id = id;
 	blk->hw = hw;
+	blk->enc_id = 0;
 	list_add_tail(&blk->list, &rm->hw_blks[type]);
 
 	return 0;
@@ -300,13 +204,12 @@ static int _dpu_rm_hw_blk_create(
 
 int dpu_rm_init(struct dpu_rm *rm,
 		struct dpu_mdss_cfg *cat,
-		void __iomem *mmio,
-		struct drm_device *dev)
+		void __iomem *mmio)
 {
 	int rc, i;
 	enum dpu_hw_blk_type type;
 
-	if (!rm || !cat || !mmio || !dev) {
+	if (!rm || !cat || !mmio) {
 		DPU_ERROR("invalid kms\n");
 		return -EINVAL;
 	}
@@ -316,21 +219,9 @@ int dpu_rm_init(struct dpu_rm *rm,
 
 	mutex_init(&rm->rm_lock);
 
-	INIT_LIST_HEAD(&rm->rsvps);
 	for (type = 0; type < DPU_HW_BLK_MAX; type++)
 		INIT_LIST_HEAD(&rm->hw_blks[type]);
 
-	rm->dev = dev;
-
-	/* Some of the sub-blocks require an mdptop to be created */
-	rm->hw_mdp = dpu_hw_mdptop_init(MDP_TOP, mmio, cat);
-	if (IS_ERR_OR_NULL(rm->hw_mdp)) {
-		rc = PTR_ERR(rm->hw_mdp);
-		rm->hw_mdp = NULL;
-		DPU_ERROR("failed: mdp hw not available\n");
-		goto fail;
-	}
-
 	/* Interrogate HW catalog and create tracking items for hw blocks */
 	for (i = 0; i < cat->mixer_count; i++) {
 		struct dpu_lm_cfg *lm = &cat->mixer[i];
@@ -410,7 +301,7 @@ static bool _dpu_rm_needs_split_display(const struct msm_display_topology *top)
  *	proposed use case requirements, incl. hardwired dependent blocks like
  *	pingpong
  * @rm: dpu resource manager handle
- * @rsvp: reservation currently being created
+ * @enc_id: encoder id requesting the allocation
  * @reqs: proposed use case requirements
  * @lm: proposed layer mixer, function checks if lm, and all other hardwired
  *      blocks connected to the lm (pp) is available and appropriate
@@ -422,7 +313,7 @@ static bool _dpu_rm_needs_split_display(const struct msm_display_topology *top)
  */
 static bool _dpu_rm_check_lm_and_get_connected_blks(
 		struct dpu_rm *rm,
-		struct dpu_rm_rsvp *rsvp,
+		uint32_t enc_id,
 		struct dpu_rm_requirements *reqs,
 		struct dpu_rm_hw_blk *lm,
 		struct dpu_rm_hw_blk **pp,
@@ -449,7 +340,7 @@ static bool _dpu_rm_check_lm_and_get_connected_blks(
 	}
 
 	/* Already reserved? */
-	if (RESERVED_BY_OTHER(lm, rsvp)) {
+	if (RESERVED_BY_OTHER(lm, enc_id)) {
 		DPU_DEBUG("lm %d already reserved\n", lm_cfg->id);
 		return false;
 	}
@@ -467,7 +358,7 @@ static bool _dpu_rm_check_lm_and_get_connected_blks(
 		return false;
 	}
 
-	if (RESERVED_BY_OTHER(*pp, rsvp)) {
+	if (RESERVED_BY_OTHER(*pp, enc_id)) {
 		DPU_DEBUG("lm %d pp %d already reserved\n", lm->id,
 				(*pp)->id);
 		return false;
@@ -476,10 +367,8 @@ static bool _dpu_rm_check_lm_and_get_connected_blks(
 	return true;
 }
 
-static int _dpu_rm_reserve_lms(
-		struct dpu_rm *rm,
-		struct dpu_rm_rsvp *rsvp,
-		struct dpu_rm_requirements *reqs)
+static int _dpu_rm_reserve_lms(struct dpu_rm *rm, uint32_t enc_id,
+			       struct dpu_rm_requirements *reqs)
 
 {
 	struct dpu_rm_hw_blk *lm[MAX_BLOCKS];
@@ -504,7 +393,7 @@ static int _dpu_rm_reserve_lms(
 		lm[lm_count] = iter_i.blk;
 
 		if (!_dpu_rm_check_lm_and_get_connected_blks(
-				rm, rsvp, reqs, lm[lm_count],
+				rm, enc_id, reqs, lm[lm_count],
 				&pp[lm_count], NULL))
 			continue;
 
@@ -519,7 +408,7 @@ static int _dpu_rm_reserve_lms(
 				continue;
 
 			if (!_dpu_rm_check_lm_and_get_connected_blks(
-					rm, rsvp, reqs, iter_j.blk,
+					rm, enc_id, reqs, iter_j.blk,
 					&pp[lm_count], iter_i.blk))
 				continue;
 
@@ -537,11 +426,10 @@ static int _dpu_rm_reserve_lms(
 		if (!lm[i])
 			break;
 
-		lm[i]->rsvp_nxt = rsvp;
-		pp[i]->rsvp_nxt = rsvp;
+		lm[i]->enc_id = enc_id;
+		pp[i]->enc_id = enc_id;
 
-		trace_dpu_rm_reserve_lms(lm[i]->id, lm[i]->type, rsvp->enc_id,
-					 pp[i]->id);
+		trace_dpu_rm_reserve_lms(lm[i]->id, enc_id, pp[i]->id);
 	}
 
 	return rc;
@@ -549,7 +437,7 @@ static int _dpu_rm_reserve_lms(
 
 static int _dpu_rm_reserve_ctls(
 		struct dpu_rm *rm,
-		struct dpu_rm_rsvp *rsvp,
+		uint32_t enc_id,
 		const struct msm_display_topology *top)
 {
 	struct dpu_rm_hw_blk *ctls[MAX_BLOCKS];
@@ -570,7 +458,7 @@ static int _dpu_rm_reserve_ctls(
 		unsigned long features = ctl->caps->features;
 		bool has_split_display;
 
-		if (RESERVED_BY_OTHER(iter.blk, rsvp))
+		if (RESERVED_BY_OTHER(iter.blk, enc_id))
 			continue;
 
 		has_split_display = BIT(DPU_CTL_SPLIT_DISPLAY) & features;
@@ -591,9 +479,8 @@ static int _dpu_rm_reserve_ctls(
 		return -ENAVAIL;
 
 	for (i = 0; i < ARRAY_SIZE(ctls) && i < num_ctls; i++) {
-		ctls[i]->rsvp_nxt = rsvp;
-		trace_dpu_rm_reserve_ctls(ctls[i]->id, ctls[i]->type,
-					  rsvp->enc_id);
+		ctls[i]->enc_id = enc_id;
+		trace_dpu_rm_reserve_ctls(ctls[i]->id, enc_id);
 	}
 
 	return 0;
@@ -601,7 +488,7 @@ static int _dpu_rm_reserve_ctls(
 
 static int _dpu_rm_reserve_intf(
 		struct dpu_rm *rm,
-		struct dpu_rm_rsvp *rsvp,
+		uint32_t enc_id,
 		uint32_t id,
 		enum dpu_hw_blk_type type)
 {
@@ -614,14 +501,13 @@ static int _dpu_rm_reserve_intf(
 		if (iter.blk->id != id)
 			continue;
 
-		if (RESERVED_BY_OTHER(iter.blk, rsvp)) {
+		if (RESERVED_BY_OTHER(iter.blk, enc_id)) {
 			DPU_ERROR("type %d id %d already reserved\n", type, id);
 			return -ENAVAIL;
 		}
 
-		iter.blk->rsvp_nxt = rsvp;
-		trace_dpu_rm_reserve_intf(iter.blk->id, iter.blk->type,
-					  rsvp->enc_id);
+		iter.blk->enc_id = enc_id;
+		trace_dpu_rm_reserve_intf(iter.blk->id, enc_id);
 		break;
 	}
 
@@ -636,7 +522,7 @@ static int _dpu_rm_reserve_intf(
 
 static int _dpu_rm_reserve_intf_related_hw(
 		struct dpu_rm *rm,
-		struct dpu_rm_rsvp *rsvp,
+		uint32_t enc_id,
 		struct dpu_encoder_hw_resources *hw_res)
 {
 	int i, ret = 0;
@@ -646,7 +532,7 @@ static int _dpu_rm_reserve_intf_related_hw(
 		if (hw_res->intfs[i] == INTF_MODE_NONE)
 			continue;
 		id = i + INTF_0;
-		ret = _dpu_rm_reserve_intf(rm, rsvp, id,
+		ret = _dpu_rm_reserve_intf(rm, enc_id, id,
 				DPU_HW_BLK_INTF);
 		if (ret)
 			return ret;
@@ -655,33 +541,27 @@ static int _dpu_rm_reserve_intf_related_hw(
 	return ret;
 }
 
-static int _dpu_rm_make_next_rsvp(
+static int _dpu_rm_make_reservation(
 		struct dpu_rm *rm,
 		struct drm_encoder *enc,
 		struct drm_crtc_state *crtc_state,
-		struct dpu_rm_rsvp *rsvp,
 		struct dpu_rm_requirements *reqs)
 {
 	int ret;
 
-	/* Create reservation info, tag reserved blocks with it as we go */
-	rsvp->seq = ++rm->rsvp_next_seq;
-	rsvp->enc_id = enc->base.id;
-	list_add_tail(&rsvp->list, &rm->rsvps);
-
-	ret = _dpu_rm_reserve_lms(rm, rsvp, reqs);
+	ret = _dpu_rm_reserve_lms(rm, enc->base.id, reqs);
 	if (ret) {
 		DPU_ERROR("unable to find appropriate mixers\n");
 		return ret;
 	}
 
-	ret = _dpu_rm_reserve_ctls(rm, rsvp, &reqs->topology);
+	ret = _dpu_rm_reserve_ctls(rm, enc->base.id, &reqs->topology);
 	if (ret) {
 		DPU_ERROR("unable to find appropriate CTL\n");
 		return ret;
 	}
 
-	ret = _dpu_rm_reserve_intf_related_hw(rm, rsvp, &reqs->hw_res);
+	ret = _dpu_rm_reserve_intf_related_hw(rm, enc->base.id, &reqs->hw_res);
 	if (ret)
 		return ret;
 
@@ -706,108 +586,31 @@ static int _dpu_rm_populate_requirements(
 	return 0;
 }
 
-static struct dpu_rm_rsvp *_dpu_rm_get_rsvp(
-		struct dpu_rm *rm,
-		struct drm_encoder *enc)
+static void _dpu_rm_release_reservation(struct dpu_rm *rm, uint32_t enc_id)
 {
-	struct dpu_rm_rsvp *i;
-
-	if (!rm || !enc) {
-		DPU_ERROR("invalid params\n");
-		return NULL;
-	}
-
-	if (list_empty(&rm->rsvps))
-		return NULL;
-
-	list_for_each_entry(i, &rm->rsvps, list)
-		if (i->enc_id == enc->base.id)
-			return i;
-
-	return NULL;
-}
-
-/**
- * _dpu_rm_release_rsvp - release resources and release a reservation
- * @rm:	KMS handle
- * @rsvp:	RSVP pointer to release and release resources for
- */
-static void _dpu_rm_release_rsvp(struct dpu_rm *rm, struct dpu_rm_rsvp *rsvp)
-{
-	struct dpu_rm_rsvp *rsvp_c, *rsvp_n;
 	struct dpu_rm_hw_blk *blk;
 	enum dpu_hw_blk_type type;
 
-	if (!rsvp)
-		return;
-
-	DPU_DEBUG("rel rsvp %d enc %d\n", rsvp->seq, rsvp->enc_id);
-
-	list_for_each_entry_safe(rsvp_c, rsvp_n, &rm->rsvps, list) {
-		if (rsvp == rsvp_c) {
-			list_del(&rsvp_c->list);
-			break;
-		}
-	}
-
 	for (type = 0; type < DPU_HW_BLK_MAX; type++) {
 		list_for_each_entry(blk, &rm->hw_blks[type], list) {
-			if (blk->rsvp == rsvp) {
-				blk->rsvp = NULL;
-				DPU_DEBUG("rel rsvp %d enc %d %d %d\n",
-						rsvp->seq, rsvp->enc_id,
-						blk->type, blk->id);
-			}
-			if (blk->rsvp_nxt == rsvp) {
-				blk->rsvp_nxt = NULL;
-				DPU_DEBUG("rel rsvp_nxt %d enc %d %d %d\n",
-						rsvp->seq, rsvp->enc_id,
-						blk->type, blk->id);
+			if (blk->enc_id == enc_id) {
+				blk->enc_id = 0;
+				DPU_DEBUG("rel enc %d %d %d\n", enc_id,
+					  type, blk->id);
 			}
 		}
 	}
-
-	kfree(rsvp);
 }
 
 void dpu_rm_release(struct dpu_rm *rm, struct drm_encoder *enc)
 {
-	struct dpu_rm_rsvp *rsvp;
-
-	if (!rm || !enc) {
-		DPU_ERROR("invalid params\n");
-		return;
-	}
-
 	mutex_lock(&rm->rm_lock);
 
-	rsvp = _dpu_rm_get_rsvp(rm, enc);
-	if (!rsvp) {
-		DPU_ERROR("failed to find rsvp for enc %d\n", enc->base.id);
-		goto end;
-	}
+	_dpu_rm_release_reservation(rm, enc->base.id);
 
-	_dpu_rm_release_rsvp(rm, rsvp);
-end:
 	mutex_unlock(&rm->rm_lock);
 }
 
-static void _dpu_rm_commit_rsvp(struct dpu_rm *rm, struct dpu_rm_rsvp *rsvp)
-{
-	struct dpu_rm_hw_blk *blk;
-	enum dpu_hw_blk_type type;
-
-	/* Swap next rsvp to be the active */
-	for (type = 0; type < DPU_HW_BLK_MAX; type++) {
-		list_for_each_entry(blk, &rm->hw_blks[type], list) {
-			if (blk->rsvp_nxt) {
-				blk->rsvp = blk->rsvp_nxt;
-				blk->rsvp_nxt = NULL;
-			}
-		}
-	}
-}
-
 int dpu_rm_reserve(
 		struct dpu_rm *rm,
 		struct drm_encoder *enc,
@@ -815,7 +618,6 @@ int dpu_rm_reserve(
 		struct msm_display_topology topology,
 		bool test_only)
 {
-	struct dpu_rm_rsvp *rsvp_cur, *rsvp_nxt;
 	struct dpu_rm_requirements reqs;
 	int ret;
 
@@ -828,8 +630,6 @@ int dpu_rm_reserve(
 
 	mutex_lock(&rm->rm_lock);
 
-	_dpu_rm_print_rsvps(rm, DPU_RM_STAGE_BEGIN);
-
 	ret = _dpu_rm_populate_requirements(rm, enc, crtc_state, &reqs,
 					    topology);
 	if (ret) {
@@ -837,50 +637,17 @@ int dpu_rm_reserve(
 		goto end;
 	}
 
-	/*
-	 * We only support one active reservation per-hw-block. But to implement
-	 * transactional semantics for test-only, and for allowing failure while
-	 * modifying your existing reservation, over the course of this
-	 * function we can have two reservations:
-	 * Current: Existing reservation
-	 * Next: Proposed reservation. The proposed reservation may fail, or may
-	 *       be discarded if in test-only mode.
-	 * If reservation is successful, and we're not in test-only, then we
-	 * replace the current with the next.
-	 */
-	rsvp_nxt = kzalloc(sizeof(*rsvp_nxt), GFP_KERNEL);
-	if (!rsvp_nxt) {
-		ret = -ENOMEM;
-		goto end;
-	}
-
-	rsvp_cur = _dpu_rm_get_rsvp(rm, enc);
-
-	/* Check the proposed reservation, store it in hw's "next" field */
-	ret = _dpu_rm_make_next_rsvp(rm, enc, crtc_state, rsvp_nxt, &reqs);
-
-	_dpu_rm_print_rsvps(rm, DPU_RM_STAGE_AFTER_RSVPNEXT);
-
+	ret = _dpu_rm_make_reservation(rm, enc, crtc_state, &reqs);
 	if (ret) {
 		DPU_ERROR("failed to reserve hw resources: %d\n", ret);
-		_dpu_rm_release_rsvp(rm, rsvp_nxt);
+		_dpu_rm_release_reservation(rm, enc->base.id);
 	} else if (test_only) {
-		/*
-		 * Normally, if test_only, test the reservation and then undo
-		 * However, if the user requests LOCK, then keep the reservation
-		 * made during the atomic_check phase.
-		 */
-		DPU_DEBUG("test_only: discard test rsvp[s%de%d]\n",
-				rsvp_nxt->seq, rsvp_nxt->enc_id);
-		_dpu_rm_release_rsvp(rm, rsvp_nxt);
-	} else {
-		_dpu_rm_release_rsvp(rm, rsvp_cur);
-
-		_dpu_rm_commit_rsvp(rm, rsvp_nxt);
+		/* test_only: test the reservation and then undo it */
+		DPU_DEBUG("test_only: discard test [enc: %d]\n",
+				enc->base.id);
+		_dpu_rm_release_reservation(rm, enc->base.id);
 	}
 
-	_dpu_rm_print_rsvps(rm, DPU_RM_STAGE_FINAL);
-
 end:
 	mutex_unlock(&rm->rm_lock);
 
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h
index b8273bd23801..381611fc5877 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h
@@ -22,22 +22,14 @@
 
 /**
  * struct dpu_rm - DPU dynamic hardware resource manager
- * @dev: device handle for event logging purposes
- * @rsvps: list of hardware reservations by each crtc->encoder->connector
  * @hw_blks: array of lists of hardware resources present in the system, one
  *	list per type of hardware block
- * @hw_mdp: hardware object for mdp_top
  * @lm_max_width: cached layer mixer maximum width
- * @rsvp_next_seq: sequence number for next reservation for debugging purposes
  * @rm_lock: resource manager mutex
  */
 struct dpu_rm {
-	struct drm_device *dev;
-	struct list_head rsvps;
 	struct list_head hw_blks[DPU_HW_BLK_MAX];
-	struct dpu_hw_mdp *hw_mdp;
 	uint32_t lm_max_width;
-	uint32_t rsvp_next_seq;
 	struct mutex rm_lock;
 };
 
@@ -67,13 +59,11 @@ struct dpu_rm_hw_iter {
  * @rm: DPU Resource Manager handle
  * @cat: Pointer to hardware catalog
  * @mmio: mapped register io address of MDP
- * @dev: device handle for event logging purposes
  * @Return: 0 on Success otherwise -ERROR
  */
 int dpu_rm_init(struct dpu_rm *rm,
 		struct dpu_mdss_cfg *cat,
-		void __iomem *mmio,
-		struct drm_device *dev);
+		void __iomem *mmio);
 
 /**
  * dpu_rm_destroy - Free all memory allocated by dpu_rm_init
@@ -112,14 +102,6 @@ int dpu_rm_reserve(struct dpu_rm *rm,
 void dpu_rm_release(struct dpu_rm *rm, struct drm_encoder *enc);
 
 /**
- * dpu_rm_get_mdp - Retrieve HW block for MDP TOP.
- *	This is never reserved, and is usable by any display.
- * @rm: DPU Resource Manager handle
- * @Return: Pointer to hw block or NULL
- */
-struct dpu_hw_mdp *dpu_rm_get_mdp(struct dpu_rm *rm);
-
-/**
  * dpu_rm_init_hw_iter - setup given iterator for new iteration over hw list
  *	using dpu_rm_get_hw
  * @iter: iter object to initialize
@@ -144,12 +126,4 @@ void dpu_rm_init_hw_iter(
  * @Return: true on match found, false on no match found
  */
 bool dpu_rm_get_hw(struct dpu_rm *rm, struct dpu_rm_hw_iter *iter);
-
-/**
- * dpu_rm_check_property_topctl - validate property bitmask before it is set
- * @val: user's proposed topology control bitmask
- * @Return: 0 on success or error
- */
-int dpu_rm_check_property_topctl(uint64_t val);
-
 #endif /* __DPU_RM_H__ */
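
The rsvp/rsvp_nxt double bookkeeping is gone from the resource manager: a block is free when its enc_id is zero and owned otherwise. Reduced to its essentials, the new model looks like the sketch below (helper names are hypothetical; the real code open-codes the tagging in _dpu_rm_reserve_*() and _dpu_rm_release_reservation()):

/* enc_id == 0 means free; otherwise the block belongs to that encoder */
static bool my_rm_try_reserve(struct dpu_rm_hw_blk *blk, uint32_t enc_id)
{
	if (blk->enc_id && blk->enc_id != enc_id)
		return false;		/* RESERVED_BY_OTHER() */

	blk->enc_id = enc_id;		/* reservation takes effect at once */
	return true;
}

static void my_rm_release(struct dpu_rm_hw_blk *blk, uint32_t enc_id)
{
	if (blk->enc_id == enc_id)
		blk->enc_id = 0;	/* block is free again */
}

Test-only semantics fall out naturally: dpu_rm_reserve() tags the blocks and, in the test_only case, immediately releases them again rather than staging a second "next" reservation.
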
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h
index c78b521ceda1..8bb46090bd16 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h
@@ -831,48 +831,42 @@ TRACE_EVENT(dpu_plane_disable,
 );
 
 DECLARE_EVENT_CLASS(dpu_rm_iter_template,
-	TP_PROTO(uint32_t id, enum dpu_hw_blk_type type, uint32_t enc_id),
-	TP_ARGS(id, type, enc_id),
+	TP_PROTO(uint32_t id, uint32_t enc_id),
+	TP_ARGS(id, enc_id),
 	TP_STRUCT__entry(
 		__field(	uint32_t,		id	)
-		__field(	enum dpu_hw_blk_type,	type	)
 		__field(	uint32_t,		enc_id	)
 	),
 	TP_fast_assign(
 		__entry->id = id;
-		__entry->type = type;
 		__entry->enc_id = enc_id;
 	),
-	TP_printk("id:%d type:%d enc_id:%u", __entry->id, __entry->type,
-		  __entry->enc_id)
+	TP_printk("id:%d enc_id:%u", __entry->id, __entry->enc_id)
 );
 DEFINE_EVENT(dpu_rm_iter_template, dpu_rm_reserve_intf,
-	TP_PROTO(uint32_t id, enum dpu_hw_blk_type type, uint32_t enc_id),
-	TP_ARGS(id, type, enc_id)
+	TP_PROTO(uint32_t id, uint32_t enc_id),
+	TP_ARGS(id, enc_id)
 );
 DEFINE_EVENT(dpu_rm_iter_template, dpu_rm_reserve_ctls,
-	TP_PROTO(uint32_t id, enum dpu_hw_blk_type type, uint32_t enc_id),
-	TP_ARGS(id, type, enc_id)
+	TP_PROTO(uint32_t id, uint32_t enc_id),
+	TP_ARGS(id, enc_id)
 );
 
 TRACE_EVENT(dpu_rm_reserve_lms,
-	TP_PROTO(uint32_t id, enum dpu_hw_blk_type type, uint32_t enc_id,
-		 uint32_t pp_id),
-	TP_ARGS(id, type, enc_id, pp_id),
+	TP_PROTO(uint32_t id, uint32_t enc_id, uint32_t pp_id),
+	TP_ARGS(id, enc_id, pp_id),
 	TP_STRUCT__entry(
 		__field(	uint32_t,		id	)
-		__field(	enum dpu_hw_blk_type,	type	)
 		__field(	uint32_t,		enc_id	)
 		__field(	uint32_t,		pp_id	)
 	),
 	TP_fast_assign(
 		__entry->id = id;
-		__entry->type = type;
 		__entry->enc_id = enc_id;
 		__entry->pp_id = pp_id;
 	),
-	TP_printk("id:%d type:%d enc_id:%u pp_id:%u", __entry->id,
-		  __entry->type, __entry->enc_id, __entry->pp_id)
+	TP_printk("id:%d enc_id:%u pp_id:%u", __entry->id,
+		  __entry->enc_id, __entry->pp_id)
 );
 
 TRACE_EVENT(dpu_vbif_wait_xin_halt_fail,
diff --git a/drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c b/drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c
index 8f2359dc87b4..0cfd4c06b610 100644
--- a/drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c
+++ b/drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c
@@ -16,9 +16,9 @@
  */
 
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_flip_work.h>
 #include <drm/drm_mode.h>
+#include <drm/drm_probe_helper.h>
 
 #include "mdp4_kms.h"
 
@@ -244,14 +244,8 @@ static void mdp4_crtc_mode_set_nofb(struct drm_crtc *crtc)
 
 	mode = &crtc->state->adjusted_mode;
 
-	DBG("%s: set mode: %d:\"%s\" %d %d %d %d %d %d %d %d %d %d 0x%x 0x%x",
-			mdp4_crtc->name, mode->base.id, mode->name,
-			mode->vrefresh, mode->clock,
-			mode->hdisplay, mode->hsync_start,
-			mode->hsync_end, mode->htotal,
-			mode->vdisplay, mode->vsync_start,
-			mode->vsync_end, mode->vtotal,
-			mode->type, mode->flags);
+	DBG("%s: set mode: " DRM_MODE_FMT,
+			mdp4_crtc->name, DRM_MODE_ARG(mode));
 
 	mdp4_write(mdp4_kms, REG_MDP4_DMA_SRC_SIZE(dma),
 			MDP4_DMA_SRC_SIZE_WIDTH(mode->hdisplay) |
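
This and the following mode_set hunks all make the same substitution: the hand-rolled fourteen-specifier format string becomes the shared DRM_MODE_FMT/DRM_MODE_ARG pair from <drm/drm_modes.h>. The idiom in one hypothetical helper:

#include <drm/drm_modes.h>
#include <drm/drm_print.h>

static void my_dump_mode(const struct drm_display_mode *mode)
{
	/* DRM_MODE_FMT is the format string, DRM_MODE_ARG the arguments */
	DRM_DEBUG_KMS("set mode: " DRM_MODE_FMT "\n", DRM_MODE_ARG(mode));
}
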
diff --git a/drivers/gpu/drm/msm/disp/mdp4/mdp4_dsi_encoder.c b/drivers/gpu/drm/msm/disp/mdp4/mdp4_dsi_encoder.c
index 6a1ebdace391..caa39b4621e3 100644
--- a/drivers/gpu/drm/msm/disp/mdp4/mdp4_dsi_encoder.c
+++ b/drivers/gpu/drm/msm/disp/mdp4/mdp4_dsi_encoder.c
@@ -18,7 +18,7 @@
  */
 
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "mdp4_kms.h"
 
@@ -58,14 +58,7 @@ static void mdp4_dsi_encoder_mode_set(struct drm_encoder *encoder,
 
 	mode = adjusted_mode;
 
-	DBG("set mode: %d:\"%s\" %d %d %d %d %d %d %d %d %d %d 0x%x 0x%x",
-			mode->base.id, mode->name,
-			mode->vrefresh, mode->clock,
-			mode->hdisplay, mode->hsync_start,
-			mode->hsync_end, mode->htotal,
-			mode->vdisplay, mode->vsync_start,
-			mode->vsync_end, mode->vtotal,
-			mode->type, mode->flags);
+	DBG("set mode: " DRM_MODE_FMT, DRM_MODE_ARG(mode));
 
 	ctrl_pol = 0;
 	if (mode->flags & DRM_MODE_FLAG_NHSYNC)
diff --git a/drivers/gpu/drm/msm/disp/mdp4/mdp4_dtv_encoder.c b/drivers/gpu/drm/msm/disp/mdp4/mdp4_dtv_encoder.c
index a8fd14d4846b..259d51971401 100644
--- a/drivers/gpu/drm/msm/disp/mdp4/mdp4_dtv_encoder.c
+++ b/drivers/gpu/drm/msm/disp/mdp4/mdp4_dtv_encoder.c
@@ -16,7 +16,7 @@
  */
 
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "mdp4_kms.h"
 
@@ -104,14 +104,7 @@ static void mdp4_dtv_encoder_mode_set(struct drm_encoder *encoder,
 
 	mode = adjusted_mode;
 
-	DBG("set mode: %d:\"%s\" %d %d %d %d %d %d %d %d %d %d 0x%x 0x%x",
-			mode->base.id, mode->name,
-			mode->vrefresh, mode->clock,
-			mode->hdisplay, mode->hsync_start,
-			mode->hsync_end, mode->htotal,
-			mode->vdisplay, mode->vsync_start,
-			mode->vsync_end, mode->vtotal,
-			mode->type, mode->flags);
+	DBG("set mode: " DRM_MODE_FMT, DRM_MODE_ARG(mode));
 
 	mdp4_dtv_encoder->pixclock = mode->clock * 1000;
 
diff --git a/drivers/gpu/drm/msm/disp/mdp4/mdp4_lcdc_encoder.c b/drivers/gpu/drm/msm/disp/mdp4/mdp4_lcdc_encoder.c
index c9e34501a89e..df6f9803a1d7 100644
--- a/drivers/gpu/drm/msm/disp/mdp4/mdp4_lcdc_encoder.c
+++ b/drivers/gpu/drm/msm/disp/mdp4/mdp4_lcdc_encoder.c
@@ -17,7 +17,7 @@
  */
 
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "mdp4_kms.h"
 
@@ -273,14 +273,7 @@ static void mdp4_lcdc_encoder_mode_set(struct drm_encoder *encoder,
 
 	mode = adjusted_mode;
 
-	DBG("set mode: %d:\"%s\" %d %d %d %d %d %d %d %d %d %d 0x%x 0x%x",
-			mode->base.id, mode->name,
-			mode->vrefresh, mode->clock,
-			mode->hdisplay, mode->hsync_start,
-			mode->hsync_end, mode->htotal,
-			mode->vdisplay, mode->vsync_start,
-			mode->vsync_end, mode->vtotal,
-			mode->type, mode->flags);
+	DBG("set mode: " DRM_MODE_FMT, DRM_MODE_ARG(mode));
 
 	mdp4_lcdc_encoder->pixclock = mode->clock * 1000;
 
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
index c1962f29ec7d..9bf9d6065c55 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_cmd_encoder.c
@@ -12,7 +12,7 @@
  */
 
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "mdp5_kms.h"
 
@@ -134,14 +134,7 @@ void mdp5_cmd_encoder_mode_set(struct drm_encoder *encoder,
 {
 	mode = adjusted_mode;
 
-	DBG("set mode: %d:\"%s\" %d %d %d %d %d %d %d %d %d %d 0x%x 0x%x",
-			mode->base.id, mode->name,
-			mode->vrefresh, mode->clock,
-			mode->hdisplay, mode->hsync_start,
-			mode->hsync_end, mode->htotal,
-			mode->vdisplay, mode->vsync_start,
-			mode->vsync_end, mode->vtotal,
-			mode->type, mode->flags);
+	DBG("set mode: " DRM_MODE_FMT, DRM_MODE_ARG(mode));
 	pingpong_tearcheck_setup(encoder, mode);
 	mdp5_crtc_set_pipeline(encoder->crtc);
 }
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
index c5fde1a4191a..b0cf63c4e3d7 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
@@ -19,8 +19,8 @@
 #include <linux/sort.h>
 #include <drm/drm_mode.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_flip_work.h>
+#include <drm/drm_probe_helper.h>
 
 #include "mdp5_kms.h"
 
@@ -384,14 +384,7 @@ static void mdp5_crtc_mode_set_nofb(struct drm_crtc *crtc)
 
 	mode = &crtc->state->adjusted_mode;
 
-	DBG("%s: set mode: %d:\"%s\" %d %d %d %d %d %d %d %d %d %d 0x%x 0x%x",
-			crtc->name, mode->base.id, mode->name,
-			mode->vrefresh, mode->clock,
-			mode->hdisplay, mode->hsync_start,
-			mode->hsync_end, mode->htotal,
-			mode->vdisplay, mode->vsync_start,
-			mode->vsync_end, mode->vtotal,
-			mode->type, mode->flags);
+	DBG("%s: set mode: " DRM_MODE_FMT, crtc->name, DRM_MODE_ARG(mode));
 
 	mixer_width = mode->hdisplay;
 	if (r_mixer)
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_encoder.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_encoder.c
index fcd44d1d1068..820a62c40063 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_encoder.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_encoder.c
@@ -17,7 +17,7 @@
  */
 
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "mdp5_kms.h"
 
@@ -118,14 +118,7 @@ static void mdp5_vid_encoder_mode_set(struct drm_encoder *encoder,
 
 	mode = adjusted_mode;
 
-	DBG("set mode: %d:\"%s\" %d %d %d %d %d %d %d %d %d %d 0x%x 0x%x",
-			mode->base.id, mode->name,
-			mode->vrefresh, mode->clock,
-			mode->hdisplay, mode->hsync_start,
-			mode->hsync_end, mode->htotal,
-			mode->vdisplay, mode->vsync_start,
-			mode->vsync_end, mode->vtotal,
-			mode->type, mode->flags);
+	DBG("set mode: " DRM_MODE_FMT, DRM_MODE_ARG(mode));
 
 	ctrl_pol = 0;
 
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
index d27e35a217bd..97179bec8902 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
@@ -144,7 +144,7 @@ static int mdp5_global_obj_init(struct mdp5_kms *mdp5_kms)
 
 	state->mdp5_kms = mdp5_kms;
 
-	drm_atomic_private_obj_init(&mdp5_kms->glob_state,
+	drm_atomic_private_obj_init(mdp5_kms->dev, &mdp5_kms->glob_state,
 				    &state->base,
 				    &mdp5_global_state_funcs);
 	return 0;
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c
index 7cebcb2b3a37..6153514db04c 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c
@@ -16,6 +16,7 @@
  * this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <drm/drm_util.h>
 
 #include "mdp5_kms.h"
 #include "mdp5_smp.h"
diff --git a/drivers/gpu/drm/msm/dsi/dsi.h b/drivers/gpu/drm/msm/dsi/dsi.h
index 08f3fc6771b7..9c6b31c2d79f 100644
--- a/drivers/gpu/drm/msm/dsi/dsi.h
+++ b/drivers/gpu/drm/msm/dsi/dsi.h
@@ -168,7 +168,7 @@ int msm_dsi_host_power_on(struct mipi_dsi_host *host,
 			bool is_dual_dsi);
 int msm_dsi_host_power_off(struct mipi_dsi_host *host);
 int msm_dsi_host_set_display_mode(struct mipi_dsi_host *host,
-					struct drm_display_mode *mode);
+				  const struct drm_display_mode *mode);
 struct drm_panel *msm_dsi_host_get_panel(struct mipi_dsi_host *host,
 					unsigned long *panel_flags);
 struct drm_bridge *msm_dsi_host_get_bridge(struct mipi_dsi_host *host);
diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c b/drivers/gpu/drm/msm/dsi/dsi_host.c
index 38e481d2d606..610183db1daf 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_host.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
@@ -2424,7 +2424,7 @@ unlock_ret:
 }
 
 int msm_dsi_host_set_display_mode(struct mipi_dsi_host *host,
-					struct drm_display_mode *mode)
+				  const struct drm_display_mode *mode)
 {
 	struct msm_dsi_host *msm_host = to_msm_dsi_host(host);
 
diff --git a/drivers/gpu/drm/msm/dsi/dsi_manager.c b/drivers/gpu/drm/msm/dsi/dsi_manager.c
index 80aa6344185e..979a8e929341 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_manager.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_manager.c
@@ -527,8 +527,8 @@ disable_phy:
 }
 
 static void dsi_mgr_bridge_mode_set(struct drm_bridge *bridge,
-		struct drm_display_mode *mode,
-		struct drm_display_mode *adjusted_mode)
+		const struct drm_display_mode *mode,
+		const struct drm_display_mode *adjusted_mode)
 {
 	int id = dsi_mgr_bridge_get_id(bridge);
 	struct msm_dsi *msm_dsi = dsi_mgr_get_dsi(id);
@@ -536,14 +536,7 @@ static void dsi_mgr_bridge_mode_set(struct drm_bridge *bridge,
 	struct mipi_dsi_host *host = msm_dsi->host;
 	bool is_dual_dsi = IS_DUAL_DSI();
 
-	DBG("set mode: %d:\"%s\" %d %d %d %d %d %d %d %d %d %d 0x%x 0x%x",
-			mode->base.id, mode->name,
-			mode->vrefresh, mode->clock,
-			mode->hdisplay, mode->hsync_start,
-			mode->hsync_end, mode->htotal,
-			mode->vdisplay, mode->vsync_start,
-			mode->vsync_end, mode->vtotal,
-			mode->type, mode->flags);
+	DBG("set mode: " DRM_MODE_FMT, DRM_MODE_ARG(mode));
 
 	if (is_dual_dsi && !IS_MASTER_DSI_LINK(id))
 		return;
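
dsi_mgr_bridge_mode_set(), like the eDP and HDMI bridges below, picks up the constified drm_bridge_funcs prototype: the modes are owned by atomic state and may only be read. A sketch of a conforming hook (my_bridge_mode_set is hypothetical):

#include <linux/printk.h>
#include <drm/drm_bridge.h>
#include <drm/drm_modes.h>

static void my_bridge_mode_set(struct drm_bridge *bridge,
			       const struct drm_display_mode *mode,
			       const struct drm_display_mode *adjusted_mode)
{
	/* read-only access: derive whatever the hardware needs */
	unsigned long pixclock = adjusted_mode->clock * 1000UL;

	pr_debug("pixel clock: %lu Hz\n", pixclock);
	/* ... program timings from adjusted_mode ... */
}
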
diff --git a/drivers/gpu/drm/msm/edp/edp_bridge.c b/drivers/gpu/drm/msm/edp/edp_bridge.c
index 931a5c97cccf..11166bf232ff 100644
--- a/drivers/gpu/drm/msm/edp/edp_bridge.c
+++ b/drivers/gpu/drm/msm/edp/edp_bridge.c
@@ -52,22 +52,15 @@ static void edp_bridge_post_disable(struct drm_bridge *bridge)
 }
 
 static void edp_bridge_mode_set(struct drm_bridge *bridge,
-		struct drm_display_mode *mode,
-		struct drm_display_mode *adjusted_mode)
+		const struct drm_display_mode *mode,
+		const struct drm_display_mode *adjusted_mode)
 {
 	struct drm_device *dev = bridge->dev;
 	struct drm_connector *connector;
 	struct edp_bridge *edp_bridge = to_edp_bridge(bridge);
 	struct msm_edp *edp = edp_bridge->edp;
 
-	DBG("set mode: %d:\"%s\" %d %d %d %d %d %d %d %d %d %d 0x%x 0x%x",
-			mode->base.id, mode->name,
-			mode->vrefresh, mode->clock,
-			mode->hdisplay, mode->hsync_start,
-			mode->hsync_end, mode->htotal,
-			mode->vdisplay, mode->vsync_start,
-			mode->vsync_end, mode->vtotal,
-			mode->type, mode->flags);
+	DBG("set mode: " DRM_MODE_FMT, DRM_MODE_ARG(mode));
 
 	list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
 		if ((connector->encoder != NULL) &&
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c b/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
index 98d61c690260..03197b8959ba 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
@@ -101,7 +101,8 @@ static void msm_hdmi_config_avi_infoframe(struct hdmi *hdmi)
 	u32 val;
 	int len;
 
-	drm_hdmi_avi_infoframe_from_display_mode(&frame.avi, mode, false);
+	drm_hdmi_avi_infoframe_from_display_mode(&frame.avi,
+						 hdmi->connector, mode);
 
 	len = hdmi_infoframe_pack(&frame, buffer, sizeof(buffer));
 	if (len < 0) {
@@ -207,8 +208,8 @@ static void msm_hdmi_bridge_post_disable(struct drm_bridge *bridge)
 }
 
 static void msm_hdmi_bridge_mode_set(struct drm_bridge *bridge,
-		 struct drm_display_mode *mode,
-		 struct drm_display_mode *adjusted_mode)
+		 const struct drm_display_mode *mode,
+		 const struct drm_display_mode *adjusted_mode)
 {
 	struct hdmi_bridge *hdmi_bridge = to_hdmi_bridge(bridge);
 	struct hdmi *hdmi = hdmi_bridge->hdmi;
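
drm_hdmi_avi_infoframe_from_display_mode() now takes the connector instead of a bool is_hdmi2 flag, deriving HDMI 2.0 behavior from connector state, and it returns an error code. The hunk above ignores that return; a stricter caller might look like this sketch (my_fill_avi is hypothetical):

#include <drm/drm_edid.h>
#include <drm/drm_print.h>

static int my_fill_avi(struct hdmi_avi_infoframe *avi,
		       struct drm_connector *connector,
		       const struct drm_display_mode *mode)
{
	int ret = drm_hdmi_avi_infoframe_from_display_mode(avi, connector,
							   mode);

	if (ret < 0)
		DRM_ERROR("failed to build AVI infoframe: %d\n", ret);

	return ret;
}
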
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index d2cdc7b553fe..0bdd93648761 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -207,62 +207,44 @@ u32 msm_readl(const void __iomem *addr)
 	return val;
 }
 
-struct vblank_event {
-	struct list_head node;
+struct msm_vblank_work {
+	struct work_struct work;
 	int crtc_id;
 	bool enable;
+	struct msm_drm_private *priv;
 };
 
-static void vblank_ctrl_worker(struct kthread_work *work)
+static void vblank_ctrl_worker(struct work_struct *work)
 {
-	struct msm_vblank_ctrl *vbl_ctrl = container_of(work,
-						struct msm_vblank_ctrl, work);
-	struct msm_drm_private *priv = container_of(vbl_ctrl,
-					struct msm_drm_private, vblank_ctrl);
+	struct msm_vblank_work *vbl_work = container_of(work,
+						struct msm_vblank_work, work);
+	struct msm_drm_private *priv = vbl_work->priv;
 	struct msm_kms *kms = priv->kms;
-	struct vblank_event *vbl_ev, *tmp;
-	unsigned long flags;
-
-	spin_lock_irqsave(&vbl_ctrl->lock, flags);
-	list_for_each_entry_safe(vbl_ev, tmp, &vbl_ctrl->event_list, node) {
-		list_del(&vbl_ev->node);
-		spin_unlock_irqrestore(&vbl_ctrl->lock, flags);
-
-		if (vbl_ev->enable)
-			kms->funcs->enable_vblank(kms,
-						priv->crtcs[vbl_ev->crtc_id]);
-		else
-			kms->funcs->disable_vblank(kms,
-						priv->crtcs[vbl_ev->crtc_id]);
-
-		kfree(vbl_ev);
 
-		spin_lock_irqsave(&vbl_ctrl->lock, flags);
-	}
+	if (vbl_work->enable)
+		kms->funcs->enable_vblank(kms, priv->crtcs[vbl_work->crtc_id]);
+	else
+		kms->funcs->disable_vblank(kms, priv->crtcs[vbl_work->crtc_id]);
 
-	spin_unlock_irqrestore(&vbl_ctrl->lock, flags);
+	kfree(vbl_work);
 }
 
 static int vblank_ctrl_queue_work(struct msm_drm_private *priv,
 					int crtc_id, bool enable)
 {
-	struct msm_vblank_ctrl *vbl_ctrl = &priv->vblank_ctrl;
-	struct vblank_event *vbl_ev;
-	unsigned long flags;
+	struct msm_vblank_work *vbl_work;
 
-	vbl_ev = kzalloc(sizeof(*vbl_ev), GFP_ATOMIC);
-	if (!vbl_ev)
+	vbl_work = kzalloc(sizeof(*vbl_work), GFP_ATOMIC);
+	if (!vbl_work)
 		return -ENOMEM;
 
-	vbl_ev->crtc_id = crtc_id;
-	vbl_ev->enable = enable;
+	INIT_WORK(&vbl_work->work, vblank_ctrl_worker);
 
-	spin_lock_irqsave(&vbl_ctrl->lock, flags);
-	list_add_tail(&vbl_ev->node, &vbl_ctrl->event_list);
-	spin_unlock_irqrestore(&vbl_ctrl->lock, flags);
+	vbl_work->crtc_id = crtc_id;
+	vbl_work->enable = enable;
+	vbl_work->priv = priv;
 
-	kthread_queue_work(&priv->disp_thread[crtc_id].worker,
-			&vbl_ctrl->work);
+	queue_work(priv->wq, &vbl_work->work);
 
 	return 0;
 }
@@ -274,31 +256,20 @@ static int msm_drm_uninit(struct device *dev)
 	struct msm_drm_private *priv = ddev->dev_private;
 	struct msm_kms *kms = priv->kms;
 	struct msm_mdss *mdss = priv->mdss;
-	struct msm_vblank_ctrl *vbl_ctrl = &priv->vblank_ctrl;
-	struct vblank_event *vbl_ev, *tmp;
 	int i;
 
 	/* We must cancel and clean up any pending vblank enable/disable
 	 * work before drm_irq_uninstall() to avoid work re-enabling an
 	 * irq after uninstall has disabled it.
 	 */
-	kthread_flush_work(&vbl_ctrl->work);
-	list_for_each_entry_safe(vbl_ev, tmp, &vbl_ctrl->event_list, node) {
-		list_del(&vbl_ev->node);
-		kfree(vbl_ev);
-	}
 
-	/* clean up display commit/event worker threads */
-	for (i = 0; i < priv->num_crtcs; i++) {
-		if (priv->disp_thread[i].thread) {
-			kthread_flush_worker(&priv->disp_thread[i].worker);
-			kthread_stop(priv->disp_thread[i].thread);
-			priv->disp_thread[i].thread = NULL;
-		}
+	flush_workqueue(priv->wq);
+	destroy_workqueue(priv->wq);
 
+	/* clean up event worker threads */
+	for (i = 0; i < priv->num_crtcs; i++) {
 		if (priv->event_thread[i].thread) {
-			kthread_flush_worker(&priv->event_thread[i].worker);
-			kthread_stop(priv->event_thread[i].thread);
+			kthread_destroy_worker(&priv->event_thread[i].worker);
 			priv->event_thread[i].thread = NULL;
 		}
 	}
@@ -323,9 +294,6 @@ static int msm_drm_uninit(struct device *dev)
 	drm_irq_uninstall(ddev);
 	pm_runtime_put_sync(dev);
 
-	flush_workqueue(priv->wq);
-	destroy_workqueue(priv->wq);
-
 	if (kms && kms->funcs)
 		kms->funcs->destroy(kms);
 
@@ -490,9 +458,6 @@ static int msm_drm_init(struct device *dev, struct drm_driver *drv)
 	priv->wq = alloc_ordered_workqueue("msm", 0);
 
 	INIT_LIST_HEAD(&priv->inactive_list);
-	INIT_LIST_HEAD(&priv->vblank_ctrl.event_list);
-	kthread_init_work(&priv->vblank_ctrl.work, vblank_ctrl_worker);
-	spin_lock_init(&priv->vblank_ctrl.lock);
 
 	drm_mode_config_init(ddev);
 
@@ -554,27 +519,6 @@ static int msm_drm_init(struct device *dev, struct drm_driver *drv)
 	 */
 	param.sched_priority = 16;
 	for (i = 0; i < priv->num_crtcs; i++) {
-
-		/* initialize display thread */
-		priv->disp_thread[i].crtc_id = priv->crtcs[i]->base.id;
-		kthread_init_worker(&priv->disp_thread[i].worker);
-		priv->disp_thread[i].dev = ddev;
-		priv->disp_thread[i].thread =
-			kthread_run(kthread_worker_fn,
-				&priv->disp_thread[i].worker,
-				"crtc_commit:%d", priv->disp_thread[i].crtc_id);
-		if (IS_ERR(priv->disp_thread[i].thread)) {
-			DRM_DEV_ERROR(dev, "failed to create crtc_commit kthread\n");
-			priv->disp_thread[i].thread = NULL;
-			goto err_msm_uninit;
-		}
-
-		ret = sched_setscheduler(priv->disp_thread[i].thread,
-					 SCHED_FIFO, &param);
-		if (ret)
-			dev_warn(dev, "disp_thread set priority failed: %d\n",
-				 ret);
-
 		/* initialize event thread */
 		priv->event_thread[i].crtc_id = priv->crtcs[i]->base.id;
 		kthread_init_worker(&priv->event_thread[i].worker);
@@ -589,13 +533,6 @@ static int msm_drm_init(struct device *dev, struct drm_driver *drv)
 			goto err_msm_uninit;
 		}
 
-		/**
-		 * event thread should also run at same priority as disp_thread
-		 * because it is handling frame_done events. A lower priority
-		 * event thread and higher priority disp_thread can causes
-		 * frame_pending counters beyond 2. This can lead to commit
-		 * failure at crtc commit level.
-		 */
 		ret = sched_setscheduler(priv->event_thread[i].thread,
 					 SCHED_FIFO, &param);
 		if (ret)
@@ -914,8 +851,12 @@ static int msm_ioctl_gem_info(struct drm_device *dev, void *data,
 			ret = -EINVAL;
 			break;
 		}
-		ret = copy_from_user(msm_obj->name,
-			u64_to_user_ptr(args->value), args->len);
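+		/* On a faulting copy, clear the name rather than keeping a partial one. */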
+		if (copy_from_user(msm_obj->name, u64_to_user_ptr(args->value),
+				   args->len)) {
+			msm_obj->name[0] = '\0';
+			ret = -EFAULT;
+			break;
+		}
 		msm_obj->name[args->len] = '\0';
 		for (i = 0; i < args->len; i++) {
 			if (!isprint(msm_obj->name[i])) {
@@ -931,8 +872,9 @@ static int msm_ioctl_gem_info(struct drm_device *dev, void *data,
 		}
 		args->len = strlen(msm_obj->name);
 		if (args->value) {
-			ret = copy_to_user(u64_to_user_ptr(args->value),
-					msm_obj->name, args->len);
+			if (copy_to_user(u64_to_user_ptr(args->value),
+					 msm_obj->name, args->len))
+				ret = -EFAULT;
 		}
 		break;
 	}
@@ -1063,8 +1005,7 @@ static const struct file_operations fops = {
 };
 
 static struct drm_driver msm_driver = {
-	.driver_features    = DRIVER_HAVE_IRQ |
-				DRIVER_GEM |
+	.driver_features    = DRIVER_GEM |
 				DRIVER_PRIME |
 				DRIVER_RENDER |
 				DRIVER_ATOMIC |
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 927e5d86f7c1..c56dade2c1dc 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -39,8 +39,8 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/msm_drm.h>
 #include <drm/drm_gem.h>
@@ -77,12 +77,6 @@ enum msm_mdp_plane_property {
 	PLANE_PROP_MAX_NUM
 };
 
-struct msm_vblank_ctrl {
-	struct kthread_work work;
-	struct list_head event_list;
-	spinlock_t lock;
-};
-
 #define MSM_GPU_MAX_RINGS 4
 #define MAX_H_TILES_PER_DISPLAY 2
 
@@ -126,7 +120,7 @@ struct msm_display_topology {
 
 /**
  * struct msm_display_info - defines display properties
- * @intf_type:          DRM_MODE_CONNECTOR_ display type
+ * @intf_type:          DRM_MODE_ENCODER_ type
  * @capabilities:       Bitmask of display flags
  * @num_of_h_tiles:     Number of horizontal tiles in case of split interface
  * @h_tile_instance:    Controller instance used per tile. Number of elements is
@@ -199,7 +193,6 @@ struct msm_drm_private {
 	unsigned int num_crtcs;
 	struct drm_crtc *crtcs[MAX_CRTCS];
 
-	struct msm_drm_thread disp_thread[MAX_CRTCS];
 	struct msm_drm_thread event_thread[MAX_CRTCS];
 
 	unsigned int num_encoders;
@@ -228,7 +221,6 @@ struct msm_drm_private {
 	struct notifier_block vmap_notifier;
 	struct shrinker shrinker;
 
-	struct msm_vblank_ctrl vblank_ctrl;
 	struct drm_atomic_state *pm_state;
 };
 
diff --git a/drivers/gpu/drm/msm/msm_fb.c b/drivers/gpu/drm/msm/msm_fb.c
index 67dfd8d3dc12..136058978e0f 100644
--- a/drivers/gpu/drm/msm/msm_fb.c
+++ b/drivers/gpu/drm/msm/msm_fb.c
@@ -16,8 +16,8 @@
  */
 
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "msm_drv.h"
 #include "msm_kms.h"
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index c8886d3071fa..18ca651ab942 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -762,7 +762,7 @@ static void describe_fence(struct dma_fence *fence, const char *type,
 		struct seq_file *m)
 {
 	if (!dma_fence_is_signaled(fence))
-		seq_printf(m, "\t%9s: %s %s seq %u\n", type,
+		seq_printf(m, "\t%9s: %s %s seq %llu\n", type,
 				fence->ops->get_driver_name(fence),
 				fence->ops->get_timeline_name(fence),
 				fence->seqno);
diff --git a/drivers/gpu/drm/mxsfb/mxsfb_crtc.c b/drivers/gpu/drm/mxsfb/mxsfb_crtc.c
index 24b1f0c1432e..0ee1ca8a316a 100644
--- a/drivers/gpu/drm/mxsfb/mxsfb_crtc.c
+++ b/drivers/gpu/drm/mxsfb/mxsfb_crtc.c
@@ -19,12 +19,12 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_of.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_simple_kms_helper.h>
 #include <linux/clk.h>
 #include <linux/iopoll.h>
diff --git a/drivers/gpu/drm/mxsfb/mxsfb_drv.c b/drivers/gpu/drm/mxsfb/mxsfb_drv.c
index 88ba003979e6..967379f3f571 100644
--- a/drivers/gpu/drm/mxsfb/mxsfb_drv.c
+++ b/drivers/gpu/drm/mxsfb/mxsfb_drv.c
@@ -31,13 +31,13 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_simple_kms_helper.h>
 
 #include "mxsfb_drv.h"
@@ -263,23 +263,12 @@ static int mxsfb_load(struct drm_device *drm, unsigned long flags)
 
 	drm_kms_helper_poll_init(drm);
 
-	mxsfb->fbdev = drm_fbdev_cma_init(drm, 32,
-					  drm->mode_config.num_connector);
-	if (IS_ERR(mxsfb->fbdev)) {
-		ret = PTR_ERR(mxsfb->fbdev);
-		mxsfb->fbdev = NULL;
-		dev_err(drm->dev, "Failed to init FB CMA area\n");
-		goto err_cma;
-	}
-
 	platform_set_drvdata(pdev, drm);
 
 	drm_helper_hpd_irq_event(drm);
 
 	return 0;
 
-err_cma:
-	drm_irq_uninstall(drm);
 err_irq:
 	drm_panel_detach(mxsfb->panel);
 err_vblank:
@@ -290,11 +279,6 @@ err_vblank:
 
 static void mxsfb_unload(struct drm_device *drm)
 {
-	struct mxsfb_drm_private *mxsfb = drm->dev_private;
-
-	if (mxsfb->fbdev)
-		drm_fbdev_cma_fini(mxsfb->fbdev);
-
 	drm_kms_helper_poll_fini(drm);
 	drm_mode_config_cleanup(drm);
 
@@ -307,13 +291,6 @@ static void mxsfb_unload(struct drm_device *drm)
 	pm_runtime_disable(drm->dev);
 }
 
-static void mxsfb_lastclose(struct drm_device *drm)
-{
-	struct mxsfb_drm_private *mxsfb = drm->dev_private;
-
-	drm_fbdev_cma_restore_mode(mxsfb->fbdev);
-}
-
 static void mxsfb_irq_preinstall(struct drm_device *drm)
 {
 	struct mxsfb_drm_private *mxsfb = drm->dev_private;
@@ -345,9 +322,7 @@ DEFINE_DRM_GEM_CMA_FOPS(fops);
 
 static struct drm_driver mxsfb_driver = {
 	.driver_features	= DRIVER_GEM | DRIVER_MODESET |
-				  DRIVER_PRIME | DRIVER_ATOMIC |
-				  DRIVER_HAVE_IRQ,
-	.lastclose		= mxsfb_lastclose,
+				  DRIVER_PRIME | DRIVER_ATOMIC,
 	.irq_handler		= mxsfb_irq_handler,
 	.irq_preinstall		= mxsfb_irq_preinstall,
 	.irq_uninstall		= mxsfb_irq_preinstall,
@@ -412,6 +387,8 @@ static int mxsfb_probe(struct platform_device *pdev)
 	if (ret)
 		goto err_unload;
 
+	drm_fbdev_generic_setup(drm, 32);
+
 	return 0;
 
 err_unload:
diff --git a/drivers/gpu/drm/mxsfb/mxsfb_drv.h b/drivers/gpu/drm/mxsfb/mxsfb_drv.h
index 5d0883fc805b..bedd6801edca 100644
--- a/drivers/gpu/drm/mxsfb/mxsfb_drv.h
+++ b/drivers/gpu/drm/mxsfb/mxsfb_drv.h
@@ -37,7 +37,6 @@ struct mxsfb_drm_private {
 	struct drm_simple_display_pipe	pipe;
 	struct drm_connector		connector;
 	struct drm_panel		*panel;
-	struct drm_fbdev_cma		*fbdev;
 };
 
 int mxsfb_setup_crtc(struct drm_device *dev);
diff --git a/drivers/gpu/drm/mxsfb/mxsfb_out.c b/drivers/gpu/drm/mxsfb/mxsfb_out.c
index e5edf016a439..27add9976931 100644
--- a/drivers/gpu/drm/mxsfb/mxsfb_out.c
+++ b/drivers/gpu/drm/mxsfb/mxsfb_out.c
@@ -16,12 +16,12 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_simple_kms_helper.h>
 #include <drm/drmP.h>
 
diff --git a/drivers/gpu/drm/nouveau/Kbuild b/drivers/gpu/drm/nouveau/Kbuild
index b17843dd050d..581404e6544d 100644
--- a/drivers/gpu/drm/nouveau/Kbuild
+++ b/drivers/gpu/drm/nouveau/Kbuild
@@ -30,6 +30,8 @@ nouveau-y += nouveau_vga.o
 # DRM - memory management
 nouveau-y += nouveau_bo.o
 nouveau-y += nouveau_gem.o
+nouveau-$(CONFIG_DRM_NOUVEAU_SVM) += nouveau_svm.o
+nouveau-$(CONFIG_DRM_NOUVEAU_SVM) += nouveau_dmem.o
 nouveau-y += nouveau_mem.o
 nouveau-y += nouveau_prime.o
 nouveau-y += nouveau_sgdma.o
diff --git a/drivers/gpu/drm/nouveau/Kconfig b/drivers/gpu/drm/nouveau/Kconfig
index 432c440223bb..00cd9ab8948d 100644
--- a/drivers/gpu/drm/nouveau/Kconfig
+++ b/drivers/gpu/drm/nouveau/Kconfig
@@ -71,3 +71,15 @@ config DRM_NOUVEAU_BACKLIGHT
 	help
 	  Say Y here if you want to control the backlight of your display
 	  (e.g. a laptop panel).
+
+config DRM_NOUVEAU_SVM
+	bool "(EXPERIMENTAL) Enable SVM (Shared Virtual Memory) support"
+	depends on ARCH_HAS_HMM
+	depends on DRM_NOUVEAU
+	depends on STAGING
+	select HMM_MIRROR
+	select DEVICE_PRIVATE
+	default n
+	help
+	  Say Y here if you want to enable experimental support for
+	  Shared Virtual Memory (SVM).
diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
index 2c569e264df3..f22f01020625 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
@@ -40,6 +40,7 @@
 #include "nvreg.h"
 #include "nouveau_fbcon.h"
 #include "disp.h"
+#include "nouveau_dma.h"
 
 #include <subdev/bios/pll.h>
 #include <subdev/clk.h>
@@ -1077,12 +1078,223 @@ nouveau_crtc_set_config(struct drm_mode_set *set,
 	return ret;
 }
 
+struct nv04_page_flip_state {
+	struct list_head head;
+	struct drm_pending_vblank_event *event;
+	struct drm_crtc *crtc;
+	int bpp, pitch;
+	u64 offset;
+};
+
+static int
+nv04_finish_page_flip(struct nouveau_channel *chan,
+		      struct nv04_page_flip_state *ps)
+{
+	struct nouveau_fence_chan *fctx = chan->fence;
+	struct nouveau_drm *drm = chan->drm;
+	struct drm_device *dev = drm->dev;
+	struct nv04_page_flip_state *s;
+	unsigned long flags;
+
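+	/* fctx->flip is protected by dev->event_lock. */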
+	spin_lock_irqsave(&dev->event_lock, flags);
+
+	if (list_empty(&fctx->flip)) {
+		NV_ERROR(drm, "unexpected pageflip\n");
+		spin_unlock_irqrestore(&dev->event_lock, flags);
+		return -EINVAL;
+	}
+
+	s = list_first_entry(&fctx->flip, struct nv04_page_flip_state, head);
+	if (s->event) {
+		drm_crtc_arm_vblank_event(s->crtc, s->event);
+	} else {
+		/* Give up ownership of vblank for page-flipped crtc */
+		drm_crtc_vblank_put(s->crtc);
+	}
+
+	list_del(&s->head);
+	if (ps)
+		*ps = *s;
+	kfree(s);
+
+	spin_unlock_irqrestore(&dev->event_lock, flags);
+	return 0;
+}
+
+int
+nv04_flip_complete(struct nvif_notify *notify)
+{
+	struct nouveau_cli *cli = (void *)notify->object->client;
+	struct nouveau_drm *drm = cli->drm;
+	struct nouveau_channel *chan = drm->channel;
+	struct nv04_page_flip_state state;
+
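+	/* Program the new scanout base, adjusted for the CRTC's x/y panning. */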
+	if (!nv04_finish_page_flip(chan, &state)) {
+		nv_set_crtc_base(drm->dev, drm_crtc_index(state.crtc),
+				 state.offset + state.crtc->y *
+				 state.pitch + state.crtc->x *
+				 state.bpp / 8);
+	}
+
+	return NVIF_NOTIFY_KEEP;
+}
+
+static int
+nv04_page_flip_emit(struct nouveau_channel *chan,
+		    struct nouveau_bo *old_bo,
+		    struct nouveau_bo *new_bo,
+		    struct nv04_page_flip_state *s,
+		    struct nouveau_fence **pfence)
+{
+	struct nouveau_fence_chan *fctx = chan->fence;
+	struct nouveau_drm *drm = chan->drm;
+	struct drm_device *dev = drm->dev;
+	unsigned long flags;
+	int ret;
+
+	/* Queue it to the pending list */
+	spin_lock_irqsave(&dev->event_lock, flags);
+	list_add_tail(&s->head, &fctx->flip);
+	spin_unlock_irqrestore(&dev->event_lock, flags);
+
+	/* Synchronize with the old framebuffer */
+	ret = nouveau_fence_sync(old_bo, chan, false, false);
+	if (ret)
+		goto fail;
+
+	/* Emit the pageflip */
+	ret = RING_SPACE(chan, 2);
+	if (ret)
+		goto fail;
+
+	BEGIN_NV04(chan, NvSubSw, NV_SW_PAGE_FLIP, 1);
+	OUT_RING  (chan, 0x00000000);
+	FIRE_RING (chan);
+
+	ret = nouveau_fence_new(chan, false, pfence);
+	if (ret)
+		goto fail;
+
+	return 0;
+fail:
+	spin_lock_irqsave(&dev->event_lock, flags);
+	list_del(&s->head);
+	spin_unlock_irqrestore(&dev->event_lock, flags);
+	return ret;
+}
+
+static int
+nv04_crtc_page_flip(struct drm_crtc *crtc, struct drm_framebuffer *fb,
+		    struct drm_pending_vblank_event *event, u32 flags,
+		    struct drm_modeset_acquire_ctx *ctx)
+{
+	const int swap_interval = (flags & DRM_MODE_PAGE_FLIP_ASYNC) ? 0 : 1;
+	struct drm_device *dev = crtc->dev;
+	struct nouveau_drm *drm = nouveau_drm(dev);
+	struct nouveau_bo *old_bo = nouveau_framebuffer(crtc->primary->fb)->nvbo;
+	struct nouveau_bo *new_bo = nouveau_framebuffer(fb)->nvbo;
+	struct nv04_page_flip_state *s;
+	struct nouveau_channel *chan;
+	struct nouveau_cli *cli;
+	struct nouveau_fence *fence;
+	struct nv04_display *dispnv04 = nv04_display(dev);
+	int head = nouveau_crtc(crtc)->index;
+	int ret;
+
+	chan = drm->channel;
+	if (!chan)
+		return -ENODEV;
+	cli = (void *)chan->user.client;
+
+	s = kzalloc(sizeof(*s), GFP_KERNEL);
+	if (!s)
+		return -ENOMEM;
+
+	if (new_bo != old_bo) {
+		ret = nouveau_bo_pin(new_bo, TTM_PL_FLAG_VRAM, true);
+		if (ret)
+			goto fail_free;
+	}
+
+	mutex_lock(&cli->mutex);
+	ret = ttm_bo_reserve(&new_bo->bo, true, false, NULL);
+	if (ret)
+		goto fail_unpin;
+
+	/* synchronise rendering channel with the kernel's channel */
+	ret = nouveau_fence_sync(new_bo, chan, false, true);
+	if (ret) {
+		ttm_bo_unreserve(&new_bo->bo);
+		goto fail_unpin;
+	}
+
+	if (new_bo != old_bo) {
+		ttm_bo_unreserve(&new_bo->bo);
+
+		ret = ttm_bo_reserve(&old_bo->bo, true, false, NULL);
+		if (ret)
+			goto fail_unpin;
+	}
+
+	/* Initialize a page flip struct */
+	*s = (struct nv04_page_flip_state)
+		{ { }, event, crtc, fb->format->cpp[0] * 8, fb->pitches[0],
+		  new_bo->bo.offset };
+
+	/* Keep vblanks on during flip, for the target crtc of this flip */
+	drm_crtc_vblank_get(crtc);
+
+	/* Emit a page flip */
+	if (swap_interval) {
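+		/* Make the flip wait for vblank on the target head (swap interval 1). */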
+		ret = RING_SPACE(chan, 8);
+		if (ret)
+			goto fail_unreserve;
+
+		BEGIN_NV04(chan, NvSubImageBlit, 0x012c, 1);
+		OUT_RING  (chan, 0);
+		BEGIN_NV04(chan, NvSubImageBlit, 0x0134, 1);
+		OUT_RING  (chan, head);
+		BEGIN_NV04(chan, NvSubImageBlit, 0x0100, 1);
+		OUT_RING  (chan, 0);
+		BEGIN_NV04(chan, NvSubImageBlit, 0x0130, 1);
+		OUT_RING  (chan, 0);
+	}
+
+	nouveau_bo_ref(new_bo, &dispnv04->image[head]);
+
+	ret = nv04_page_flip_emit(chan, old_bo, new_bo, s, &fence);
+	if (ret)
+		goto fail_unreserve;
+	mutex_unlock(&cli->mutex);
+
+	/* Update the crtc struct and cleanup */
+	crtc->primary->fb = fb;
+
+	nouveau_bo_fence(old_bo, fence, false);
+	ttm_bo_unreserve(&old_bo->bo);
+	if (old_bo != new_bo)
+		nouveau_bo_unpin(old_bo);
+	nouveau_fence_unref(&fence);
+	return 0;
+
+fail_unreserve:
+	drm_crtc_vblank_put(crtc);
+	ttm_bo_unreserve(&old_bo->bo);
+fail_unpin:
+	mutex_unlock(&cli->mutex);
+	if (old_bo != new_bo)
+		nouveau_bo_unpin(new_bo);
+fail_free:
+	kfree(s);
+	return ret;
+}
+
 static const struct drm_crtc_funcs nv04_crtc_funcs = {
 	.cursor_set = nv04_crtc_cursor_set,
 	.cursor_move = nv04_crtc_cursor_move,
 	.gamma_set = nv_crtc_gamma_set,
 	.set_config = nouveau_crtc_set_config,
-	.page_flip = nouveau_crtc_page_flip,
+	.page_flip = nv04_crtc_page_flip,
 	.destroy = nv_crtc_destroy,
 };
 
diff --git a/drivers/gpu/drm/nouveau/dispnv04/disp.c b/drivers/gpu/drm/nouveau/dispnv04/disp.c
index 1727d399833c..5713bacaee80 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/disp.c
@@ -30,6 +30,160 @@
 #include "hw.h"
 #include "nouveau_encoder.h"
 #include "nouveau_connector.h"
+#include "nouveau_bo.h"
+
+#include <nvif/if0004.h>
+
+static void
+nv04_display_fini(struct drm_device *dev, bool suspend)
+{
+	struct nv04_display *disp = nv04_display(dev);
+	struct drm_crtc *crtc;
+
+	/* Disable flip completion events. */
+	nvif_notify_put(&disp->flip);
+
+	/* Disable vblank interrupts. */
+	NVWriteCRTC(dev, 0, NV_PCRTC_INTR_EN_0, 0);
+	if (nv_two_heads(dev))
+		NVWriteCRTC(dev, 1, NV_PCRTC_INTR_EN_0, 0);
+
+	if (!suspend)
+		return;
+
+	/* Un-pin FB and cursors so they'll be evicted to system memory. */
+	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+		struct nouveau_framebuffer *nouveau_fb;
+
+		nouveau_fb = nouveau_framebuffer(crtc->primary->fb);
+		if (!nouveau_fb || !nouveau_fb->nvbo)
+			continue;
+
+		nouveau_bo_unpin(nouveau_fb->nvbo);
+	}
+
+	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+		struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
+		if (nv_crtc->cursor.nvbo) {
+			if (nv_crtc->cursor.set_offset)
+				nouveau_bo_unmap(nv_crtc->cursor.nvbo);
+			nouveau_bo_unpin(nv_crtc->cursor.nvbo);
+		}
+	}
+}
+
+static int
+nv04_display_init(struct drm_device *dev, bool resume, bool runtime)
+{
+	struct nv04_display *disp = nv04_display(dev);
+	struct nouveau_drm *drm = nouveau_drm(dev);
+	struct nouveau_encoder *encoder;
+	struct drm_crtc *crtc;
+	int ret;
+
+	/* meh.. modeset apparently doesn't set up all the regs and depends
+	 * on pre-existing state, for now load the state of the card *before*
+	 * nouveau was loaded, and then do a modeset.
+	 *
+	 * best thing to do probably is to make save/restore routines not
+	 * save/restore "pre-load" state, but more general so we can save
+	 * on suspend too.
+	 */
+	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+		struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
+		nv_crtc->save(&nv_crtc->base);
+	}
+
+	list_for_each_entry(encoder, &dev->mode_config.encoder_list, base.base.head)
+		encoder->enc_save(&encoder->base.base);
+
+	/* Enable flip completion events. */
+	nvif_notify_get(&disp->flip);
+
+	if (!resume)
+		return 0;
+
+	/* Re-pin FB/cursors. */
+	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+		struct nouveau_framebuffer *nouveau_fb;
+
+		nouveau_fb = nouveau_framebuffer(crtc->primary->fb);
+		if (!nouveau_fb || !nouveau_fb->nvbo)
+			continue;
+
+		ret = nouveau_bo_pin(nouveau_fb->nvbo, TTM_PL_FLAG_VRAM, true);
+		if (ret)
+			NV_ERROR(drm, "Could not pin framebuffer\n");
+	}
+
+	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+		struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
+		if (!nv_crtc->cursor.nvbo)
+			continue;
+
+		ret = nouveau_bo_pin(nv_crtc->cursor.nvbo, TTM_PL_FLAG_VRAM, true);
+		if (!ret && nv_crtc->cursor.set_offset)
+			ret = nouveau_bo_map(nv_crtc->cursor.nvbo);
+		if (ret)
+			NV_ERROR(drm, "Could not pin/map cursor.\n");
+	}
+
+	/* Force CLUT to get re-loaded during modeset. */
+	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+		struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
+
+		nv_crtc->lut.depth = 0;
+	}
+
+	/* This should ensure we don't hit a locking problem when someone
+	 * wakes us up via a connector.  We should never go into suspend
+	 * while the display is on anyway.
+	 */
+	if (runtime)
+		return 0;
+
+	/* Restore mode. */
+	drm_helper_resume_force_mode(dev);
+
+	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+		struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
+
+		if (!nv_crtc->cursor.nvbo)
+			continue;
+
+		if (nv_crtc->cursor.set_offset)
+			nv_crtc->cursor.set_offset(nv_crtc, nv_crtc->cursor.nvbo->bo.offset);
+		nv_crtc->cursor.set_pos(nv_crtc, nv_crtc->cursor_saved_x,
+						 nv_crtc->cursor_saved_y);
+	}
+
+	return 0;
+}
+
+static void
+nv04_display_destroy(struct drm_device *dev)
+{
+	struct nv04_display *disp = nv04_display(dev);
+	struct nouveau_drm *drm = nouveau_drm(dev);
+	struct nouveau_encoder *encoder;
+	struct nouveau_crtc *nv_crtc;
+
+	/* Restore state */
+	list_for_each_entry(encoder, &dev->mode_config.encoder_list, base.base.head)
+		encoder->enc_restore(&encoder->base.base);
+
+	list_for_each_entry(nv_crtc, &dev->mode_config.crtc_list, base.head)
+		nv_crtc->restore(&nv_crtc->base);
+
+	nouveau_hw_save_vga_fonts(dev, 0);
+
+	nvif_notify_fini(&disp->flip);
+
+	nouveau_display(dev)->priv = NULL;
+	kfree(disp);
+
+	nvif_object_unmap(&drm->client.device.object);
+}
 
 int
 nv04_display_create(struct drm_device *dev)
@@ -58,6 +212,13 @@ nv04_display_create(struct drm_device *dev)
 	/* Pre-nv50 doesn't support atomic, so don't expose the ioctls */
 	dev->driver->driver_features &= ~DRIVER_ATOMIC;
 
+	/* Request page flip completion event. */
+	if (drm->nvsw.client) {
+		nvif_notify_init(&drm->nvsw, nv04_flip_complete,
+				 false, NV04_NVSW_NTFY_UEVENT,
+				 NULL, 0, 0, &disp->flip);
+	}
+
 	nouveau_hw_save_vga_fonts(dev, 1);
 
 	nv04_crtc_create(dev, 0);
@@ -121,58 +282,3 @@ nv04_display_create(struct drm_device *dev)
 
 	return 0;
 }
-
-void
-nv04_display_destroy(struct drm_device *dev)
-{
-	struct nv04_display *disp = nv04_display(dev);
-	struct nouveau_drm *drm = nouveau_drm(dev);
-	struct nouveau_encoder *encoder;
-	struct nouveau_crtc *nv_crtc;
-
-	/* Restore state */
-	list_for_each_entry(encoder, &dev->mode_config.encoder_list, base.base.head)
-		encoder->enc_restore(&encoder->base.base);
-
-	list_for_each_entry(nv_crtc, &dev->mode_config.crtc_list, base.head)
-		nv_crtc->restore(&nv_crtc->base);
-
-	nouveau_hw_save_vga_fonts(dev, 0);
-
-	nouveau_display(dev)->priv = NULL;
-	kfree(disp);
-
-	nvif_object_unmap(&drm->client.device.object);
-}
-
-int
-nv04_display_init(struct drm_device *dev)
-{
-	struct nouveau_encoder *encoder;
-	struct nouveau_crtc *crtc;
-
-	/* meh.. modeset apparently doesn't setup all the regs and depends
-	 * on pre-existing state, for now load the state of the card *before*
-	 * nouveau was loaded, and then do a modeset.
-	 *
-	 * best thing to do probably is to make save/restore routines not
-	 * save/restore "pre-load" state, but more general so we can save
-	 * on suspend too.
-	 */
-	list_for_each_entry(crtc, &dev->mode_config.crtc_list, base.head)
-		crtc->save(&crtc->base);
-
-	list_for_each_entry(encoder, &dev->mode_config.encoder_list, base.base.head)
-		encoder->enc_save(&encoder->base.base);
-
-	return 0;
-}
-
-void
-nv04_display_fini(struct drm_device *dev)
-{
-	/* disable vblank interrupts */
-	NVWriteCRTC(dev, 0, NV_PCRTC_INTR_EN_0, 0);
-	if (nv_two_heads(dev))
-		NVWriteCRTC(dev, 1, NV_PCRTC_INTR_EN_0, 0);
-}
diff --git a/drivers/gpu/drm/nouveau/dispnv04/disp.h b/drivers/gpu/drm/nouveau/dispnv04/disp.h
index f74f1f2b186e..c6ed20a09f4a 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/disp.h
+++ b/drivers/gpu/drm/nouveau/dispnv04/disp.h
@@ -82,6 +82,7 @@ struct nv04_display {
 	uint32_t saved_vga_font[4][16384];
 	uint32_t dac_users[4];
 	struct nouveau_bo *image[2];
+	struct nvif_notify flip;
 };
 
 static inline struct nv04_display *
@@ -92,9 +93,6 @@ nv04_display(struct drm_device *dev)
 
 /* nv04_display.c */
 int nv04_display_create(struct drm_device *);
-void nv04_display_destroy(struct drm_device *);
-int nv04_display_init(struct drm_device *);
-void nv04_display_fini(struct drm_device *);
 
 /* nv04_crtc.c */
 int nv04_crtc_create(struct drm_device *, int index);
@@ -176,4 +174,5 @@ nouveau_bios_run_init_table(struct drm_device *dev, u16 table,
 	);
 }
 
+int nv04_flip_complete(struct nvif_notify *);
 #endif
diff --git a/drivers/gpu/drm/nouveau/dispnv04/tvnv17.c b/drivers/gpu/drm/nouveau/dispnv04/tvnv17.c
index 6a4ca139cf5d..26fd71c06626 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/tvnv17.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/tvnv17.c
@@ -26,6 +26,7 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include "nouveau_drv.h"
 #include "nouveau_reg.h"
 #include "nouveau_encoder.h"
@@ -750,7 +751,9 @@ static int nv17_tv_set_property(struct drm_encoder *encoder,
 		/* Disable the crtc to ensure a full modeset is
 		 * performed whenever it's turned on again. */
 		if (crtc)
-			drm_crtc_force_disable(crtc);
+			drm_crtc_helper_set_mode(crtc, &crtc->mode,
+						 crtc->x, crtc->y,
+						 crtc->primary->fb);
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/nouveau/dispnv50/atom.h b/drivers/gpu/drm/nouveau/dispnv50/atom.h
index a194990d2b0d..b5fae5ab3fa8 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/atom.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/atom.h
@@ -116,6 +116,12 @@ struct nv50_head_atom {
 		u8 depth:4;
 	} or;
 
+	/* Currently only used for MST */
+	struct {
+		int pbn;
+		u8 tu:6;
+	} dp;
+
 	union nv50_head_atom_mask {
 		struct {
 			bool olut:1;
diff --git a/drivers/gpu/drm/nouveau/dispnv50/core.c b/drivers/gpu/drm/nouveau/dispnv50/core.c
index c25e0ebe3c92..27ea3f34706d 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/core.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/core.c
@@ -42,7 +42,7 @@ nv50_core_new(struct nouveau_drm *drm, struct nv50_core **pcore)
 		int version;
 		int (*new)(struct nouveau_drm *, s32, struct nv50_core **);
 	} cores[] = {
-		{ TU104_DISP_CORE_CHANNEL_DMA, 0, corec57d_new },
+		{ TU102_DISP_CORE_CHANNEL_DMA, 0, corec57d_new },
 		{ GV100_DISP_CORE_CHANNEL_DMA, 0, corec37d_new },
 		{ GP102_DISP_CORE_CHANNEL_DMA, 0, core917d_new },
 		{ GP100_DISP_CORE_CHANNEL_DMA, 0, core917d_new },
diff --git a/drivers/gpu/drm/nouveau/dispnv50/curs.c b/drivers/gpu/drm/nouveau/dispnv50/curs.c
index cb6e4d2b1b45..121c24a18f11 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/curs.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/curs.c
@@ -31,7 +31,7 @@ nv50_curs_new(struct nouveau_drm *drm, int head, struct nv50_wndw **pwndw)
 		int version;
 		int (*new)(struct nouveau_drm *, int, s32, struct nv50_wndw **);
 	} curses[] = {
-		{ TU104_DISP_CURSOR, 0, cursc37a_new },
+		{ TU102_DISP_CURSOR, 0, cursc37a_new },
 		{ GV100_DISP_CURSOR, 0, cursc37a_new },
 		{ GK104_DISP_CURSOR, 0, curs907a_new },
 		{ GF110_DISP_CURSOR, 0, curs907a_new },
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index 134701a837c8..4b1650f51955 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -32,10 +32,10 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_dp_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_scdc_helper.h>
 #include <drm/drm_edid.h>
 
@@ -561,7 +561,7 @@ nv50_hdmi_enable(struct drm_encoder *encoder, struct drm_display_mode *mode)
 	u32 max_ac_packet;
 	union hdmi_infoframe avi_frame;
 	union hdmi_infoframe vendor_frame;
-	bool scdc_supported, high_tmds_clock_ratio = false, scrambling = false;
+	bool high_tmds_clock_ratio = false, scrambling = false;
 	u8 config;
 	int ret;
 	int size;
@@ -571,10 +571,9 @@ nv50_hdmi_enable(struct drm_encoder *encoder, struct drm_display_mode *mode)
 		return;
 
 	hdmi = &nv_connector->base.display_info.hdmi;
-	scdc_supported = hdmi->scdc.supported;
 
-	ret = drm_hdmi_avi_infoframe_from_display_mode(&avi_frame.avi, mode,
-						       scdc_supported);
+	ret = drm_hdmi_avi_infoframe_from_display_mode(&avi_frame.avi,
+						       &nv_connector->base, mode);
 	if (!ret) {
 		/* We have an AVI InfoFrame, populate it to the display */
 		args.pwr.avi_infoframe_length
@@ -660,8 +659,6 @@ struct nv50_mstc {
 
 	struct drm_display_mode *native;
 	struct edid *edid;
-
-	int pbn;
 };
 
 struct nv50_msto {
@@ -680,6 +677,8 @@ nv50_msto_payload(struct nv50_msto *msto)
 	struct nv50_mstm *mstm = mstc->mstm;
 	int vcpi = mstc->port->vcpi.vcpi, i;
 
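+	/* mgr.payloads may only be walked with the payload lock held. */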
+	WARN_ON(!mutex_is_locked(&mstm->mgr.payload_lock));
+
 	NV_ATOMIC(drm, "%s: vcpi %d\n", msto->encoder.name, vcpi);
 	for (i = 0; i < mstm->mgr.max_payloads; i++) {
 		struct drm_dp_payload *payload = &mstm->mgr.payloads[i];
@@ -704,14 +703,16 @@ nv50_msto_cleanup(struct nv50_msto *msto)
 	struct nv50_mstc *mstc = msto->mstc;
 	struct nv50_mstm *mstm = mstc->mstm;
 
+	if (!msto->disabled)
+		return;
+
 	NV_ATOMIC(drm, "%s: msto cleanup\n", msto->encoder.name);
-	if (mstc->port && mstc->port->vcpi.vcpi > 0 && !nv50_msto_payload(msto))
-		drm_dp_mst_deallocate_vcpi(&mstm->mgr, mstc->port);
-	if (msto->disabled) {
-		msto->mstc = NULL;
-		msto->head = NULL;
-		msto->disabled = false;
-	}
+
+	drm_dp_mst_deallocate_vcpi(&mstm->mgr, mstc->port);
+
+	msto->mstc = NULL;
+	msto->head = NULL;
+	msto->disabled = false;
 }
 
 static void
@@ -731,8 +732,10 @@ nv50_msto_prepare(struct nv50_msto *msto)
 			       (0x0100 << msto->head->base.index),
 	};
 
+	mutex_lock(&mstm->mgr.payload_lock);
+
 	NV_ATOMIC(drm, "%s: msto prepare\n", msto->encoder.name);
-	if (mstc->port && mstc->port->vcpi.vcpi > 0) {
+	if (mstc->port->vcpi.vcpi > 0) {
 		struct drm_dp_payload *payload = nv50_msto_payload(msto);
 		if (payload) {
 			args.vcpi.start_slot = payload->start_slot;
@@ -746,7 +749,9 @@ nv50_msto_prepare(struct nv50_msto *msto)
 		  msto->encoder.name, msto->head->base.base.name,
 		  args.vcpi.start_slot, args.vcpi.num_slots,
 		  args.vcpi.pbn, args.vcpi.aligned_pbn);
+
 	nvif_mthd(&drm->display->disp.object, 0, &args, sizeof(args));
+	mutex_unlock(&mstm->mgr.payload_lock);
 }
 
 static int
@@ -754,16 +759,31 @@ nv50_msto_atomic_check(struct drm_encoder *encoder,
 		       struct drm_crtc_state *crtc_state,
 		       struct drm_connector_state *conn_state)
 {
-	struct nv50_mstc *mstc = nv50_mstc(conn_state->connector);
+	struct drm_atomic_state *state = crtc_state->state;
+	struct drm_connector *connector = conn_state->connector;
+	struct nv50_mstc *mstc = nv50_mstc(connector);
 	struct nv50_mstm *mstm = mstc->mstm;
-	int bpp = conn_state->connector->display_info.bpc * 3;
+	struct nv50_head_atom *asyh = nv50_head_atom(crtc_state);
 	int slots;
 
-	mstc->pbn = drm_dp_calc_pbn_mode(crtc_state->adjusted_mode.clock, bpp);
+	/* When restoring duplicated states, we need to make sure that the
+	 * bw remains the same and avoid recalculating it, as the connector's
+	 * bpc may have changed after the state was duplicated
+	 */
+	if (!state->duplicated)
+		asyh->dp.pbn =
+			drm_dp_calc_pbn_mode(crtc_state->adjusted_mode.clock,
+					     connector->display_info.bpc * 3);
+
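+	/* VCPI slots only need to be (re)allocated across a full modeset. */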
+	if (drm_atomic_crtc_needs_modeset(crtc_state)) {
+		slots = drm_dp_atomic_find_vcpi_slots(state, &mstm->mgr,
+						      mstc->port,
+						      asyh->dp.pbn);
+		if (slots < 0)
+			return slots;
 
-	slots = drm_dp_find_vcpi_slots(&mstm->mgr, mstc->pbn);
-	if (slots < 0)
-		return slots;
+		asyh->dp.tu = slots;
+	}
 
 	return nv50_outp_atomic_check_view(encoder, crtc_state, conn_state,
 					   mstc->native);
@@ -773,13 +793,13 @@ static void
 nv50_msto_enable(struct drm_encoder *encoder)
 {
 	struct nv50_head *head = nv50_head(encoder->crtc);
+	struct nv50_head_atom *armh = nv50_head_atom(head->base.base.state);
 	struct nv50_msto *msto = nv50_msto(encoder);
 	struct nv50_mstc *mstc = NULL;
 	struct nv50_mstm *mstm = NULL;
 	struct drm_connector *connector;
 	struct drm_connector_list_iter conn_iter;
 	u8 proto, depth;
-	int slots;
 	bool r;
 
 	drm_connector_list_iter_begin(encoder->dev, &conn_iter);
@@ -795,9 +815,10 @@ nv50_msto_enable(struct drm_encoder *encoder)
 	if (WARN_ON(!mstc))
 		return;
 
-	slots = drm_dp_find_vcpi_slots(&mstm->mgr, mstc->pbn);
-	r = drm_dp_mst_allocate_vcpi(&mstm->mgr, mstc->port, mstc->pbn, slots);
-	WARN_ON(!r);
+	r = drm_dp_mst_allocate_vcpi(&mstm->mgr, mstc->port, armh->dp.pbn,
+				     armh->dp.tu);
+	if (!r)
+		DRM_DEBUG_KMS("Failed to allocate VCPI\n");
 
 	if (!mstm->links++)
 		nv50_outp_acquire(mstm->outp);
@@ -814,8 +835,7 @@ nv50_msto_enable(struct drm_encoder *encoder)
 	default: depth = 0x6; break;
 	}
 
-	mstm->outp->update(mstm->outp, head->base.index,
-			   nv50_head_atom(head->base.base.state), proto, depth);
+	mstm->outp->update(mstm->outp, head->base.index, armh, proto, depth);
 
 	msto->head = head;
 	msto->mstc = mstc;
@@ -829,8 +849,7 @@ nv50_msto_disable(struct drm_encoder *encoder)
 	struct nv50_mstc *mstc = msto->mstc;
 	struct nv50_mstm *mstm = mstc->mstm;
 
-	if (mstc->port)
-		drm_dp_mst_reset_vcpi_slots(&mstm->mgr, mstc->port);
+	drm_dp_mst_reset_vcpi_slots(&mstm->mgr, mstc->port);
 
 	mstm->outp->update(mstm->outp, msto->head->base.index, NULL, 0, 0);
 	mstm->modified = true;
@@ -927,12 +946,43 @@ nv50_mstc_get_modes(struct drm_connector *connector)
 	return ret;
 }
 
+static int
+nv50_mstc_atomic_check(struct drm_connector *connector,
+		       struct drm_connector_state *new_conn_state)
+{
+	struct drm_atomic_state *state = new_conn_state->state;
+	struct nv50_mstc *mstc = nv50_mstc(connector);
+	struct drm_dp_mst_topology_mgr *mgr = &mstc->mstm->mgr;
+	struct drm_connector_state *old_conn_state =
+		drm_atomic_get_old_connector_state(state, connector);
+	struct drm_crtc_state *crtc_state;
+	struct drm_crtc *new_crtc = new_conn_state->crtc;
+
+	if (!old_conn_state->crtc)
+		return 0;
+
+	/* We only want to free VCPI if this state disables the CRTC on this
+	 * connector
+	 */
+	if (new_crtc) {
+		crtc_state = drm_atomic_get_new_crtc_state(state, new_crtc);
+
+		if (!crtc_state ||
+		    !drm_atomic_crtc_needs_modeset(crtc_state) ||
+		    crtc_state->enable)
+			return 0;
+	}
+
+	return drm_dp_atomic_release_vcpi_slots(state, mgr, mstc->port);
+}
+
 static const struct drm_connector_helper_funcs
 nv50_mstc_help = {
 	.get_modes = nv50_mstc_get_modes,
 	.mode_valid = nv50_mstc_mode_valid,
 	.best_encoder = nv50_mstc_best_encoder,
 	.atomic_best_encoder = nv50_mstc_atomic_best_encoder,
+	.atomic_check = nv50_mstc_atomic_check,
 };
 
 static enum drm_connector_status
@@ -942,7 +992,7 @@ nv50_mstc_detect(struct drm_connector *connector, bool force)
 	enum drm_connector_status conn_status;
 	int ret;
 
-	if (!mstc->port)
+	if (drm_connector_is_unregistered(connector))
 		return connector_status_disconnected;
 
 	ret = pm_runtime_get_sync(connector->dev->dev);
@@ -961,7 +1011,10 @@ static void
 nv50_mstc_destroy(struct drm_connector *connector)
 {
 	struct nv50_mstc *mstc = nv50_mstc(connector);
+
 	drm_connector_cleanup(&mstc->connector);
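+	/* Drop the malloc reference taken on the port in nv50_mstc_new(). */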
+	drm_dp_mst_put_port_malloc(mstc->port);
+
 	kfree(mstc);
 }
 
@@ -1009,6 +1062,7 @@ nv50_mstc_new(struct nv50_mstm *mstm, struct drm_dp_mst_port *port,
 	drm_object_attach_property(&mstc->connector.base, dev->mode_config.path_property, 0);
 	drm_object_attach_property(&mstc->connector.base, dev->mode_config.tile_property, 0);
 	drm_connector_set_path_property(&mstc->connector, path);
+	drm_dp_mst_get_port_malloc(port);
 	return 0;
 }
 
@@ -1063,13 +1117,6 @@ nv50_mstm_prepare(struct nv50_mstm *mstm)
 }
 
 static void
-nv50_mstm_hotplug(struct drm_dp_mst_topology_mgr *mgr)
-{
-	struct nv50_mstm *mstm = nv50_mstm(mgr);
-	drm_kms_helper_hotplug_event(mstm->outp->base.base.dev);
-}
-
-static void
 nv50_mstm_destroy_connector(struct drm_dp_mst_topology_mgr *mgr,
 			    struct drm_connector *connector)
 {
@@ -1080,10 +1127,6 @@ nv50_mstm_destroy_connector(struct drm_dp_mst_topology_mgr *mgr,
 
 	drm_fb_helper_remove_one_connector(&drm->fbcon->helper, &mstc->connector);
 
-	drm_modeset_lock(&drm->dev->mode_config.connection_mutex, NULL);
-	mstc->port = NULL;
-	drm_modeset_unlock(&drm->dev->mode_config.connection_mutex);
-
 	drm_connector_put(&mstc->connector);
 }
 
@@ -1106,11 +1149,8 @@ nv50_mstm_add_connector(struct drm_dp_mst_topology_mgr *mgr,
 	int ret;
 
 	ret = nv50_mstc_new(mstm, port, path, &mstc);
-	if (ret) {
-		if (mstc)
-			mstc->connector.funcs->destroy(&mstc->connector);
+	if (ret)
 		return NULL;
-	}
 
 	return &mstc->connector;
 }
@@ -1120,7 +1160,6 @@ nv50_mstm = {
 	.add_connector = nv50_mstm_add_connector,
 	.register_connector = nv50_mstm_register_connector,
 	.destroy_connector = nv50_mstm_destroy_connector,
-	.hotplug = nv50_mstm_hotplug,
 };
 
 void
@@ -2125,6 +2164,10 @@ nv50_disp_atomic_check(struct drm_device *dev, struct drm_atomic_state *state)
 			return ret;
 	}
 
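+	/* Validate the MST VCPI/bandwidth allocations in this state. */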
+	ret = drm_dp_mst_atomic_check(state);
+	if (ret)
+		return ret;
+
 	return 0;
 }
 
@@ -2178,8 +2221,8 @@ nv50_disp_func = {
  * Init
  *****************************************************************************/
 
-void
-nv50_display_fini(struct drm_device *dev)
+static void
+nv50_display_fini(struct drm_device *dev, bool suspend)
 {
 	struct nouveau_encoder *nv_encoder;
 	struct drm_encoder *encoder;
@@ -2200,8 +2243,8 @@ nv50_display_fini(struct drm_device *dev)
 	}
 }
 
-int
-nv50_display_init(struct drm_device *dev)
+static int
+nv50_display_init(struct drm_device *dev, bool resume, bool runtime)
 {
 	struct nv50_core *core = nv50_disp(dev)->core;
 	struct drm_encoder *encoder;
@@ -2227,7 +2270,7 @@ nv50_display_init(struct drm_device *dev)
 	return 0;
 }
 
-void
+static void
 nv50_display_destroy(struct drm_device *dev)
 {
 	struct nv50_disp *disp = nv50_disp(dev);
diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.c b/drivers/gpu/drm/nouveau/dispnv50/head.c
index ac97ebce5b35..2e7a0c347ddb 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/head.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/head.c
@@ -413,6 +413,7 @@ nv50_head_atomic_duplicate_state(struct drm_crtc *crtc)
 	asyh->ovly = armh->ovly;
 	asyh->dither = armh->dither;
 	asyh->procamp = armh->procamp;
+	asyh->dp = armh->dp;
 	asyh->clr.mask = 0;
 	asyh->set.mask = 0;
 	return &asyh->state;
diff --git a/drivers/gpu/drm/nouveau/dispnv50/wimm.c b/drivers/gpu/drm/nouveau/dispnv50/wimm.c
index bc9eeaf212ae..a1ac153d5e98 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/wimm.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/wimm.c
@@ -31,7 +31,7 @@ nv50_wimm_init(struct nouveau_drm *drm, struct nv50_wndw *wndw)
 		int version;
 		int (*init)(struct nouveau_drm *, s32, struct nv50_wndw *);
 	} wimms[] = {
-		{ TU104_DISP_WINDOW_IMM_CHANNEL_DMA, 0, wimmc37b_init },
+		{ TU102_DISP_WINDOW_IMM_CHANNEL_DMA, 0, wimmc37b_init },
 		{ GV100_DISP_WINDOW_IMM_CHANNEL_DMA, 0, wimmc37b_init },
 		{}
 	};
diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
index ba9eea2ff16b..b95181027b31 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
@@ -626,7 +626,7 @@ nv50_wndw_new(struct nouveau_drm *drm, enum drm_plane_type type, int index,
 		int (*new)(struct nouveau_drm *, enum drm_plane_type,
 			   int, s32, struct nv50_wndw **);
 	} wndws[] = {
-		{ TU104_DISP_WINDOW_CHANNEL_DMA, 0, wndwc57e_new },
+		{ TU102_DISP_WINDOW_CHANNEL_DMA, 0, wndwc57e_new },
 		{ GV100_DISP_WINDOW_CHANNEL_DMA, 0, wndwc37e_new },
 		{}
 	};
diff --git a/drivers/gpu/drm/nouveau/include/nvif/class.h b/drivers/gpu/drm/nouveau/include/nvif/class.h
index 1d82cbf70cf4..7d556a1c92fa 100644
--- a/drivers/gpu/drm/nouveau/include/nvif/class.h
+++ b/drivers/gpu/drm/nouveau/include/nvif/class.h
@@ -54,6 +54,9 @@
 
 #define VOLTA_USERMODE_A                                             0x0000c361
 
+#define MAXWELL_FAULT_BUFFER_A                        /* clb069.h */ 0x0000b069
+#define VOLTA_FAULT_BUFFER_A                          /* clb069.h */ 0x0000c369
+
 #define NV03_CHANNEL_DMA                              /* cl506b.h */ 0x0000006b
 #define NV10_CHANNEL_DMA                              /* cl506b.h */ 0x0000006e
 #define NV17_CHANNEL_DMA                              /* cl506b.h */ 0x0000176e
@@ -84,7 +87,7 @@
 #define GP100_DISP                                    /* cl5070.h */ 0x00009770
 #define GP102_DISP                                    /* cl5070.h */ 0x00009870
 #define GV100_DISP                                    /* cl5070.h */ 0x0000c370
-#define TU104_DISP                                    /* cl5070.h */ 0x0000c570
+#define TU102_DISP                                    /* cl5070.h */ 0x0000c570
 
 #define NV31_MPEG                                                    0x00003174
 #define G82_MPEG                                                     0x00008274
@@ -97,7 +100,7 @@
 #define GF110_DISP_CURSOR                             /* cl507a.h */ 0x0000907a
 #define GK104_DISP_CURSOR                             /* cl507a.h */ 0x0000917a
 #define GV100_DISP_CURSOR                             /* cl507a.h */ 0x0000c37a
-#define TU104_DISP_CURSOR                             /* cl507a.h */ 0x0000c57a
+#define TU102_DISP_CURSOR                             /* cl507a.h */ 0x0000c57a
 
 #define NV50_DISP_OVERLAY                             /* cl507b.h */ 0x0000507b
 #define G82_DISP_OVERLAY                              /* cl507b.h */ 0x0000827b
@@ -106,7 +109,7 @@
 #define GK104_DISP_OVERLAY                            /* cl507b.h */ 0x0000917b
 
 #define GV100_DISP_WINDOW_IMM_CHANNEL_DMA             /* clc37b.h */ 0x0000c37b
-#define TU104_DISP_WINDOW_IMM_CHANNEL_DMA             /* clc37b.h */ 0x0000c57b
+#define TU102_DISP_WINDOW_IMM_CHANNEL_DMA             /* clc37b.h */ 0x0000c57b
 
 #define NV50_DISP_BASE_CHANNEL_DMA                    /* cl507c.h */ 0x0000507c
 #define G82_DISP_BASE_CHANNEL_DMA                     /* cl507c.h */ 0x0000827c
@@ -129,7 +132,7 @@
 #define GP100_DISP_CORE_CHANNEL_DMA                   /* cl507d.h */ 0x0000977d
 #define GP102_DISP_CORE_CHANNEL_DMA                   /* cl507d.h */ 0x0000987d
 #define GV100_DISP_CORE_CHANNEL_DMA                   /* cl507d.h */ 0x0000c37d
-#define TU104_DISP_CORE_CHANNEL_DMA                   /* cl507d.h */ 0x0000c57d
+#define TU102_DISP_CORE_CHANNEL_DMA                   /* cl507d.h */ 0x0000c57d
 
 #define NV50_DISP_OVERLAY_CHANNEL_DMA                 /* cl507e.h */ 0x0000507e
 #define G82_DISP_OVERLAY_CHANNEL_DMA                  /* cl507e.h */ 0x0000827e
@@ -139,7 +142,7 @@
 #define GK104_DISP_OVERLAY_CONTROL_DMA                /* cl507e.h */ 0x0000917e
 
 #define GV100_DISP_WINDOW_CHANNEL_DMA                 /* clc37e.h */ 0x0000c37e
-#define TU104_DISP_WINDOW_CHANNEL_DMA                 /* clc37e.h */ 0x0000c57e
+#define TU102_DISP_WINDOW_CHANNEL_DMA                 /* clc37e.h */ 0x0000c57e
 
 #define NV50_TESLA                                                   0x00005097
 #define G82_TESLA                                                    0x00008297
diff --git a/drivers/gpu/drm/nouveau/include/nvif/clb069.h b/drivers/gpu/drm/nouveau/include/nvif/clb069.h
new file mode 100644
index 000000000000..eef5d0227bab
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/include/nvif/clb069.h
@@ -0,0 +1,12 @@
+#ifndef __NVIF_CLB069_H__
+#define __NVIF_CLB069_H__
+struct nvif_clb069_v0 {
+	__u8  version;
+	__u8  pad01[3];
+	__u32 entries;
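+	/* get/put are read/write indices into the fault entry ring. */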
+	__u32 get;
+	__u32 put;
+};
+
+#define NVB069_V0_NTFY_FAULT                                                0x00
+#endif
diff --git a/drivers/gpu/drm/nouveau/include/nvif/if000c.h b/drivers/gpu/drm/nouveau/include/nvif/if000c.h
index 2928ecd989ad..d6dd40f21eed 100644
--- a/drivers/gpu/drm/nouveau/include/nvif/if000c.h
+++ b/drivers/gpu/drm/nouveau/include/nvif/if000c.h
@@ -3,7 +3,8 @@
 struct nvif_vmm_v0 {
 	__u8  version;
 	__u8  page_nr;
-	__u8  pad02[6];
+	__u8  managed;
+	__u8  pad03[5];
 	__u64 addr;
 	__u64 size;
 	__u8  data[];
@@ -14,6 +15,9 @@ struct nvif_vmm_v0 {
 #define NVIF_VMM_V0_PUT                                                    0x02
 #define NVIF_VMM_V0_MAP                                                    0x03
 #define NVIF_VMM_V0_UNMAP                                                  0x04
+#define NVIF_VMM_V0_PFNMAP                                                 0x05
+#define NVIF_VMM_V0_PFNCLR                                                 0x06
+#define NVIF_VMM_V0_MTHD(i)                                         ((i) + 0x80)
 
 struct nvif_vmm_page_v0 {
 	__u8  version;
@@ -61,4 +65,28 @@ struct nvif_vmm_unmap_v0 {
 	__u8  pad01[7];
 	__u64 addr;
 };
+
+struct nvif_vmm_pfnmap_v0 {
+	__u8  version;
+	__u8  page;
+	__u8  pad02[6];
+	__u64 addr;
+	__u64 size;
+#define NVIF_VMM_PFNMAP_V0_ADDR                           0xfffffffffffff000ULL
+#define NVIF_VMM_PFNMAP_V0_ADDR_SHIFT                                        12
+#define NVIF_VMM_PFNMAP_V0_APER                           0x00000000000000f0ULL
+#define NVIF_VMM_PFNMAP_V0_HOST                           0x0000000000000000ULL
+#define NVIF_VMM_PFNMAP_V0_VRAM                           0x0000000000000010ULL
+#define NVIF_VMM_PFNMAP_V0_W                              0x0000000000000002ULL
+#define NVIF_VMM_PFNMAP_V0_V                              0x0000000000000001ULL
+#define NVIF_VMM_PFNMAP_V0_NONE                           0x0000000000000000ULL
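+	/* One packed entry per page: address in the upper bits, aperture/access flags below. */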
+	__u64 phys[];
+};
+
+struct nvif_vmm_pfnclr_v0 {
+	__u8  version;
+	__u8  pad01[7];
+	__u64 addr;
+	__u64 size;
+};
 #endif
diff --git a/drivers/gpu/drm/nouveau/include/nvif/ifc00d.h b/drivers/gpu/drm/nouveau/include/nvif/ifc00d.h
index 1d9c637859f3..4cabd613a280 100644
--- a/drivers/gpu/drm/nouveau/include/nvif/ifc00d.h
+++ b/drivers/gpu/drm/nouveau/include/nvif/ifc00d.h
@@ -6,6 +6,12 @@ struct gp100_vmm_vn {
 	/* nvif_vmm_vX ... */
 };
 
+struct gp100_vmm_v0 {
+	/* nvif_vmm_vX ... */
+	__u8  version;
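+	/* Non-zero requests replayable page faults (used for SVM). */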
+	__u8  fault_replay;
+};
+
 struct gp100_vmm_map_vn {
 	/* nvif_vmm_map_vX ... */
 };
@@ -18,4 +24,19 @@ struct gp100_vmm_map_v0 {
 	__u8  priv;
 	__u8  kind;
 };
+
+#define GP100_VMM_VN_FAULT_REPLAY                         NVIF_VMM_V0_MTHD(0x00)
+#define GP100_VMM_VN_FAULT_CANCEL                         NVIF_VMM_V0_MTHD(0x01)
+
+struct gp100_vmm_fault_replay_vn {
+};
+
+struct gp100_vmm_fault_cancel_v0 {
+	__u8  version;
+	__u8  hub;
+	__u8  gpc;
+	__u8  client;
+	__u8  pad04[4];
+	__u64 inst;
+};
 #endif
diff --git a/drivers/gpu/drm/nouveau/include/nvif/vmm.h b/drivers/gpu/drm/nouveau/include/nvif/vmm.h
index c5db8a2e82df..79bf85d2f43a 100644
--- a/drivers/gpu/drm/nouveau/include/nvif/vmm.h
+++ b/drivers/gpu/drm/nouveau/include/nvif/vmm.h
@@ -30,8 +30,8 @@ struct nvif_vmm {
 	int page_nr;
 };
 
-int nvif_vmm_init(struct nvif_mmu *, s32 oclass, u64 addr, u64 size,
-		  void *argv, u32 argc, struct nvif_vmm *);
+int nvif_vmm_init(struct nvif_mmu *, s32 oclass, bool managed, u64 addr,
+		  u64 size, void *argv, u32 argc, struct nvif_vmm *);
 void nvif_vmm_fini(struct nvif_vmm *);
 int nvif_vmm_get(struct nvif_vmm *, enum nvif_vmm_get, bool sparse,
 		 u8 page, u8 align, u64 size, struct nvif_vma *);
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/core/device.h b/drivers/gpu/drm/nouveau/include/nvkm/core/device.h
index 72e4dc1f0236..642492344196 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/core/device.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/core/device.h
@@ -28,6 +28,7 @@ enum nvkm_devidx {
 	NVKM_SUBDEV_ICCSENSE,
 	NVKM_SUBDEV_THERM,
 	NVKM_SUBDEV_CLK,
+	NVKM_SUBDEV_GSP,
 	NVKM_SUBDEV_SECBOOT,
 
 	NVKM_ENGINE_BSP,
@@ -137,6 +138,7 @@ struct nvkm_device {
 	struct nvkm_fb *fb;
 	struct nvkm_fuse *fuse;
 	struct nvkm_gpio *gpio;
+	struct nvkm_gsp *gsp;
 	struct nvkm_i2c *i2c;
 	struct nvkm_subdev *ibus;
 	struct nvkm_iccsense *iccsense;
@@ -209,6 +211,7 @@ struct nvkm_device_chip {
 	int (*fb      )(struct nvkm_device *, int idx, struct nvkm_fb **);
 	int (*fuse    )(struct nvkm_device *, int idx, struct nvkm_fuse **);
 	int (*gpio    )(struct nvkm_device *, int idx, struct nvkm_gpio **);
+	int (*gsp     )(struct nvkm_device *, int idx, struct nvkm_gsp **);
 	int (*i2c     )(struct nvkm_device *, int idx, struct nvkm_i2c **);
 	int (*ibus    )(struct nvkm_device *, int idx, struct nvkm_subdev **);
 	int (*iccsense)(struct nvkm_device *, int idx, struct nvkm_iccsense **);
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/engine/ce.h b/drivers/gpu/drm/nouveau/include/nvkm/engine/ce.h
index 86abe76023c2..5f3650692e4d 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/engine/ce.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/engine/ce.h
@@ -11,5 +11,5 @@ int gm200_ce_new(struct nvkm_device *, int, struct nvkm_engine **);
 int gp100_ce_new(struct nvkm_device *, int, struct nvkm_engine **);
 int gp102_ce_new(struct nvkm_device *, int, struct nvkm_engine **);
 int gv100_ce_new(struct nvkm_device *, int, struct nvkm_engine **);
-int tu104_ce_new(struct nvkm_device *, int, struct nvkm_engine **);
+int tu102_ce_new(struct nvkm_device *, int, struct nvkm_engine **);
 #endif
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/engine/disp.h b/drivers/gpu/drm/nouveau/include/nvkm/engine/disp.h
index 5ca86e178bb9..3026b22d44fb 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/engine/disp.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/engine/disp.h
@@ -36,5 +36,5 @@ int gm200_disp_new(struct nvkm_device *, int, struct nvkm_disp **);
 int gp100_disp_new(struct nvkm_device *, int, struct nvkm_disp **);
 int gp102_disp_new(struct nvkm_device *, int, struct nvkm_disp **);
 int gv100_disp_new(struct nvkm_device *, int, struct nvkm_disp **);
-int tu104_disp_new(struct nvkm_device *, int, struct nvkm_disp **);
+int tu102_disp_new(struct nvkm_device *, int, struct nvkm_disp **);
 #endif
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h b/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h
index 3b2b685778eb..b7fc04dd1628 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h
@@ -74,5 +74,5 @@ int gm20b_fifo_new(struct nvkm_device *, int, struct nvkm_fifo **);
 int gp100_fifo_new(struct nvkm_device *, int, struct nvkm_fifo **);
 int gp10b_fifo_new(struct nvkm_device *, int, struct nvkm_fifo **);
 int gv100_fifo_new(struct nvkm_device *, int, struct nvkm_fifo **);
-int tu104_fifo_new(struct nvkm_device *, int, struct nvkm_fifo **);
+int tu102_fifo_new(struct nvkm_device *, int, struct nvkm_fifo **);
 #endif
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/engine/gr.h b/drivers/gpu/drm/nouveau/include/nvkm/engine/gr.h
index ba1518ff8b66..1e924c7f7ba7 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/engine/gr.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/engine/gr.h
@@ -10,6 +10,9 @@ struct nvkm_gr {
 
 u64 nvkm_gr_units(struct nvkm_gr *);
 int nvkm_gr_tlb_flush(struct nvkm_gr *);
+int nvkm_gr_ctxsw_pause(struct nvkm_device *);
+int nvkm_gr_ctxsw_resume(struct nvkm_device *);
+u32 nvkm_gr_ctxsw_inst(struct nvkm_device *);
 
 int nv04_gr_new(struct nvkm_device *, int, struct nvkm_gr **);
 int nv10_gr_new(struct nvkm_device *, int, struct nvkm_gr **);
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/engine/nvdec.h b/drivers/gpu/drm/nouveau/include/nvkm/engine/nvdec.h
index fe716859d4a9..b72a4844c5f7 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/engine/nvdec.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/engine/nvdec.h
@@ -6,6 +6,8 @@
 
 struct nvkm_nvdec {
 	struct nvkm_engine engine;
+	u32 addr;
+
 	struct nvkm_falcon *falcon;
 };
 
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/engine/sec2.h b/drivers/gpu/drm/nouveau/include/nvkm/engine/sec2.h
index f7d89822b905..c93ad332461a 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/engine/sec2.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/engine/sec2.h
@@ -5,10 +5,13 @@
 
 struct nvkm_sec2 {
 	struct nvkm_engine engine;
+	u32 addr;
+
 	struct nvkm_falcon *falcon;
 	struct nvkm_msgqueue *queue;
 	struct work_struct work;
 };
 
 int gp102_sec2_new(struct nvkm_device *, int, struct nvkm_sec2 **);
+int tu102_sec2_new(struct nvkm_device *, int, struct nvkm_sec2 **);
 #endif
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/bar.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/bar.h
index fd9d713b611c..da14486317ca 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/bar.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/bar.h
@@ -29,5 +29,5 @@ int gf100_bar_new(struct nvkm_device *, int, struct nvkm_bar **);
 int gk20a_bar_new(struct nvkm_device *, int, struct nvkm_bar **);
 int gm107_bar_new(struct nvkm_device *, int, struct nvkm_bar **);
 int gm20b_bar_new(struct nvkm_device *, int, struct nvkm_bar **);
-int tu104_bar_new(struct nvkm_device *, int, struct nvkm_bar **);
+int tu102_bar_new(struct nvkm_device *, int, struct nvkm_bar **);
 #endif
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/devinit.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/devinit.h
index 1b71812a790b..8ba982c2fdfb 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/devinit.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/devinit.h
@@ -31,5 +31,5 @@ int gf100_devinit_new(struct nvkm_device *, int, struct nvkm_devinit **);
 int gm107_devinit_new(struct nvkm_device *, int, struct nvkm_devinit **);
 int gm200_devinit_new(struct nvkm_device *, int, struct nvkm_devinit **);
 int gv100_devinit_new(struct nvkm_device *, int, struct nvkm_devinit **);
-int tu104_devinit_new(struct nvkm_device *, int, struct nvkm_devinit **);
+int tu102_devinit_new(struct nvkm_device *, int, struct nvkm_devinit **);
 #endif
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h
index 127f48066026..97322f95b3ee 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/fault.h
@@ -13,6 +13,8 @@ struct nvkm_fault {
 	struct nvkm_event event;
 
 	struct nvkm_notify nrpfb;
+
+	struct nvkm_device_oclass user;
 };
 
 struct nvkm_fault_data {
@@ -30,5 +32,5 @@ struct nvkm_fault_data {
 
 int gp100_fault_new(struct nvkm_device *, int, struct nvkm_fault **);
 int gv100_fault_new(struct nvkm_device *, int, struct nvkm_fault **);
-int tu104_fault_new(struct nvkm_device *, int, struct nvkm_fault **);
+int tu102_fault_new(struct nvkm_device *, int, struct nvkm_fault **);
 #endif
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
new file mode 100644
index 000000000000..4c672a5c4cd5
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
@@ -0,0 +1,14 @@
+#ifndef __NVKM_GSP_H__
+#define __NVKM_GSP_H__
+#define nvkm_gsp(p) container_of((p), struct nvkm_gsp, subdev)
+#include <core/subdev.h>
+
+struct nvkm_gsp {
+	struct nvkm_subdev subdev;
+	u32 addr;
+
+	struct nvkm_falcon *falcon;
+};
+
+int gv100_gsp_new(struct nvkm_device *, int, struct nvkm_gsp **);
+#endif
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/mc.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/mc.h
index b66dedd8abb6..e38f4958dea2 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/mc.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/mc.h
@@ -31,5 +31,5 @@ int gk104_mc_new(struct nvkm_device *, int, struct nvkm_mc **);
 int gk20a_mc_new(struct nvkm_device *, int, struct nvkm_mc **);
 int gp100_mc_new(struct nvkm_device *, int, struct nvkm_mc **);
 int gp10b_mc_new(struct nvkm_device *, int, struct nvkm_mc **);
-int tu104_mc_new(struct nvkm_device *, int, struct nvkm_mc **);
+int tu102_mc_new(struct nvkm_device *, int, struct nvkm_mc **);
 #endif
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h
index 0a0e064f22e5..28ade86f74c5 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h
@@ -17,6 +17,7 @@ struct nvkm_vma {
 	bool part:1; /* Region was split from an allocated region by map(). */
 	bool user:1; /* Region user-allocated. */
 	bool busy:1; /* Region busy (for temporarily preventing user access). */
+	bool mapped:1; /* Region contains valid pages. */
 	struct nvkm_memory *memory; /* Memory currently mapped into VMA. */
 	struct nvkm_tags *tags; /* Compression tag reference. */
 };
@@ -44,6 +45,8 @@ struct nvkm_vmm {
 
 	dma_addr_t null;
 	void *nullp;
+
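+	/* Replayable page faults are enabled on this VMM. */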
+	bool replay;
 };
 
 int nvkm_vmm_new(struct nvkm_device *, u64 addr, u64 size, void *argv, u32 argc,
@@ -63,6 +66,7 @@ struct nvkm_vmm_map {
 	struct nvkm_mm_node *mem;
 	struct scatterlist *sgl;
 	dma_addr_t *dma;
+	u64 *pfn;
 	u64 off;
 
 	const struct nvkm_vmm_page *page;
@@ -130,5 +134,5 @@ int gm20b_mmu_new(struct nvkm_device *, int, struct nvkm_mmu **);
 int gp100_mmu_new(struct nvkm_device *, int, struct nvkm_mmu **);
 int gp10b_mmu_new(struct nvkm_device *, int, struct nvkm_mmu **);
 int gv100_mmu_new(struct nvkm_device *, int, struct nvkm_mmu **);
-int tu104_mmu_new(struct nvkm_device *, int, struct nvkm_mmu **);
+int tu102_mmu_new(struct nvkm_device *, int, struct nvkm_mmu **);
 #endif
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/top.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/top.h
index f7d3eb647e2e..2904e67d79d2 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/top.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/top.h
@@ -9,6 +9,7 @@ struct nvkm_top {
 	struct list_head device;
 };
 
+u32 nvkm_top_addr(struct nvkm_device *, enum nvkm_devidx);
 u32 nvkm_top_reset(struct nvkm_device *, enum nvkm_devidx);
 u32 nvkm_top_intr(struct nvkm_device *, u32 intr, u64 *subdevs);
 u32 nvkm_top_intr_mask(struct nvkm_device *, enum nvkm_devidx);
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/volt.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/volt.h
index 8a0f85f5fc1a..6a765682fbfa 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/volt.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/volt.h
@@ -38,6 +38,7 @@ int nvkm_volt_set_id(struct nvkm_volt *, u8 id, u8 min_id, u8 temp,
 
 int nv40_volt_new(struct nvkm_device *, int, struct nvkm_volt **);
 int gf100_volt_new(struct nvkm_device *, int, struct nvkm_volt **);
+int gf117_volt_new(struct nvkm_device *, int, struct nvkm_volt **);
 int gk104_volt_new(struct nvkm_device *, int, struct nvkm_volt **);
 int gk20a_volt_new(struct nvkm_device *, int, struct nvkm_volt **);
 int gm20b_volt_new(struct nvkm_device *, int, struct nvkm_volt **);
diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.c b/drivers/gpu/drm/nouveau/nouveau_abi16.c
index b06cdac8f3a2..c3fd5dd39ed9 100644
--- a/drivers/gpu/drm/nouveau/nouveau_abi16.c
+++ b/drivers/gpu/drm/nouveau/nouveau_abi16.c
@@ -214,6 +214,7 @@ nouveau_abi16_ioctl_getparam(ABI16_IOCTL_ARGS)
 			WARN_ON(1);
 			break;
 		}
+		break;
 	case NOUVEAU_GETPARAM_FB_SIZE:
 		getparam->value = drm->gem.vram_available;
 		break;
@@ -338,7 +339,8 @@ nouveau_abi16_ioctl_channel_alloc(ABI16_IOCTL_ARGS)
 		goto done;
 
 	if (device->info.family >= NV_DEVICE_INFO_V0_TESLA) {
-		ret = nouveau_vma_new(chan->ntfy, &cli->vmm, &chan->ntfy_vma);
+		ret = nouveau_vma_new(chan->ntfy, chan->chan->vmm,
+				      &chan->ntfy_vma);
 		if (ret)
 			goto done;
 	}
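
The break added in the getparam switch above closes an unintended fall-through:
without it, control dropped out of the preceding case straight into
NOUVEAU_GETPARAM_FB_SIZE and clobbered the value that case had just stored.  A
reduced sketch of the hazard, using hypothetical case names rather than the
driver's real ones:

	switch (param) {
	case GETPARAM_A:
		value = compute_a();
		break;			/* the statement the patch adds */
	case GETPARAM_B:		/* without it, A fell through here */
		value = compute_b();
		break;
	}
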
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 73eff52036d2..34a998012bf6 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -194,7 +194,7 @@ nouveau_bo_new(struct nouveau_cli *cli, u64 size, int align,
 	struct nouveau_drm *drm = cli->drm;
 	struct nouveau_bo *nvbo;
 	struct nvif_mmu *mmu = &cli->mmu;
-	struct nvif_vmm *vmm = &cli->vmm.vmm;
+	struct nvif_vmm *vmm = cli->svm.cli ? &cli->svm.vmm : &cli->vmm.vmm;
 	size_t acc_size;
 	int type = ttm_bo_type_device;
 	int ret, i, pi = -1;
@@ -1434,7 +1434,7 @@ nouveau_ttm_io_mem_reserve(struct ttm_bo_device *bdev, struct ttm_mem_reg *reg)
 		if (drm->client.mem->oclass < NVIF_CLASS_MEM_NV50 || !mem->kind)
 			/* untiled */
 			break;
-		/* fallthrough, tiled memory */
+		/* fall through - tiled memory */
 	case TTM_PL_VRAM:
 		reg->bus.offset = reg->start << PAGE_SHIFT;
 		reg->bus.base = device->func->resource_addr(device, 1);
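
The comment rewrite in nouveau_ttm_io_mem_reserve() above is presumably about
-Wimplicit-fallthrough: at its stricter levels GCC only suppresses the warning
for comments close to the literal phrase "fall through", which is the spelling
the kernel was converging on.  A sketch of the accepted shape (untiled stands
in for the real predicate):

	switch (reg->mem_type) {
	case TTM_PL_TT:
		if (untiled)
			break;
		/* fall through - tiled memory */
	case TTM_PL_VRAM:
		/* set up the tiled/VRAM mapping */
		break;
	}
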
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.h b/drivers/gpu/drm/nouveau/nouveau_bo.h
index 73c48440d4d7..846f4bdec0de 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.h
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.h
@@ -61,12 +61,14 @@ nouveau_bo_ref(struct nouveau_bo *ref, struct nouveau_bo **pnvbo)
 		return -EINVAL;
 	prev = *pnvbo;
 
-	*pnvbo = ref ? nouveau_bo(ttm_bo_reference(&ref->bo)) : NULL;
-	if (prev) {
-		struct ttm_buffer_object *bo = &prev->bo;
-
-		ttm_bo_unref(&bo);
+	if (ref) {
+		ttm_bo_get(&ref->bo);
+		*pnvbo = nouveau_bo(&ref->bo);
+	} else {
+		*pnvbo = NULL;
 	}
+	if (prev)
+		ttm_bo_put(&prev->bo);
 
 	return 0;
 }
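
The nouveau_bo_ref() rewrite above moves from the deprecated
ttm_bo_reference()/ttm_bo_unref() pair to ttm_bo_get()/ttm_bo_put(), keeping
the acquire-before-release ordering that makes a reference swap safe even when
the old and new object are the same.  A generic sketch of the idiom
(my_swap_ref() is a hypothetical helper, not a kernel API):

	static void my_swap_ref(struct ttm_buffer_object *ref,
				struct ttm_buffer_object **slot)
	{
		struct ttm_buffer_object *prev = *slot;

		if (ref)
			ttm_bo_get(ref);	/* take the new reference first */
		*slot = ref;
		if (prev)
			ttm_bo_put(prev);	/* only then drop the old one */
	}
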
diff --git a/drivers/gpu/drm/nouveau/nouveau_chan.c b/drivers/gpu/drm/nouveau/nouveau_chan.c
index 668afbc29c3e..282fd90b65e1 100644
--- a/drivers/gpu/drm/nouveau/nouveau_chan.c
+++ b/drivers/gpu/drm/nouveau/nouveau_chan.c
@@ -42,6 +42,7 @@
 #include "nouveau_fence.h"
 #include "nouveau_abi16.h"
 #include "nouveau_vmm.h"
+#include "nouveau_svm.h"
 
 MODULE_PARM_DESC(vram_pushbuf, "Create DMA push buffers in VRAM");
 int nouveau_vram_pushbuf;
@@ -95,6 +96,10 @@ nouveau_channel_del(struct nouveau_channel **pchan)
 
 		if (chan->fence)
 			nouveau_fence(chan->drm)->context_del(chan);
+
+		if (cli)
+			nouveau_svmm_part(chan->vmm->svmm, chan->inst);
+
 		nvif_object_fini(&chan->nvsw);
 		nvif_object_fini(&chan->gart);
 		nvif_object_fini(&chan->vram);
@@ -130,6 +135,7 @@ nouveau_channel_prep(struct nouveau_drm *drm, struct nvif_device *device,
 
 	chan->device = device;
 	chan->drm = drm;
+	chan->vmm = cli->svm.cli ? &cli->svm : &cli->vmm;
 	atomic_set(&chan->killed, 0);
 
 	/* allocate memory for dma push buffer */
@@ -157,7 +163,7 @@ nouveau_channel_prep(struct nouveau_drm *drm, struct nvif_device *device,
 	chan->push.addr = chan->push.buffer->bo.offset;
 
 	if (device->info.family >= NV_DEVICE_INFO_V0_TESLA) {
-		ret = nouveau_vma_new(chan->push.buffer, &cli->vmm,
+		ret = nouveau_vma_new(chan->push.buffer, chan->vmm,
 				      &chan->push.vma);
 		if (ret) {
 			nouveau_channel_del(pchan);
@@ -172,7 +178,7 @@ nouveau_channel_prep(struct nouveau_drm *drm, struct nvif_device *device,
 		args.target = NV_DMA_V0_TARGET_VM;
 		args.access = NV_DMA_V0_ACCESS_VM;
 		args.start = 0;
-		args.limit = cli->vmm.vmm.limit - 1;
+		args.limit = chan->vmm->vmm.limit - 1;
 	} else
 	if (chan->push.buffer->bo.mem.mem_type == TTM_PL_VRAM) {
 		if (device->info.family == NV_DEVICE_INFO_V0_TNT) {
@@ -202,7 +208,7 @@ nouveau_channel_prep(struct nouveau_drm *drm, struct nvif_device *device,
 			args.target = NV_DMA_V0_TARGET_VM;
 			args.access = NV_DMA_V0_ACCESS_RDWR;
 			args.start = 0;
-			args.limit = cli->vmm.vmm.limit - 1;
+			args.limit = chan->vmm->vmm.limit - 1;
 		}
 	}
 
@@ -220,7 +226,6 @@ static int
 nouveau_channel_ind(struct nouveau_drm *drm, struct nvif_device *device,
 		    u64 runlist, bool priv, struct nouveau_channel **pchan)
 {
-	struct nouveau_cli *cli = (void *)device->object.client;
 	static const u16 oclasses[] = { TURING_CHANNEL_GPFIFO_A,
 					VOLTA_CHANNEL_GPFIFO_A,
 					PASCAL_CHANNEL_GPFIFO_A,
@@ -255,7 +260,7 @@ nouveau_channel_ind(struct nouveau_drm *drm, struct nvif_device *device,
 			args.volta.ilength = 0x02000;
 			args.volta.ioffset = 0x10000 + chan->push.addr;
 			args.volta.runlist = runlist;
-			args.volta.vmm = nvif_handle(&cli->vmm.vmm.object);
+			args.volta.vmm = nvif_handle(&chan->vmm->vmm.object);
 			args.volta.priv = priv;
 			size = sizeof(args.volta);
 		} else
@@ -264,7 +269,7 @@ nouveau_channel_ind(struct nouveau_drm *drm, struct nvif_device *device,
 			args.kepler.ilength = 0x02000;
 			args.kepler.ioffset = 0x10000 + chan->push.addr;
 			args.kepler.runlist = runlist;
-			args.kepler.vmm = nvif_handle(&cli->vmm.vmm.object);
+			args.kepler.vmm = nvif_handle(&chan->vmm->vmm.object);
 			args.kepler.priv = priv;
 			size = sizeof(args.kepler);
 		} else
@@ -272,14 +277,14 @@ nouveau_channel_ind(struct nouveau_drm *drm, struct nvif_device *device,
 			args.fermi.version = 0;
 			args.fermi.ilength = 0x02000;
 			args.fermi.ioffset = 0x10000 + chan->push.addr;
-			args.fermi.vmm = nvif_handle(&cli->vmm.vmm.object);
+			args.fermi.vmm = nvif_handle(&chan->vmm->vmm.object);
 			size = sizeof(args.fermi);
 		} else {
 			args.nv50.version = 0;
 			args.nv50.ilength = 0x02000;
 			args.nv50.ioffset = 0x10000 + chan->push.addr;
 			args.nv50.pushbuf = nvif_handle(&chan->push.ctxdma);
-			args.nv50.vmm = nvif_handle(&cli->vmm.vmm.object);
+			args.nv50.vmm = nvif_handle(&chan->vmm->vmm.object);
 			size = sizeof(args.nv50);
 		}
 
@@ -350,7 +355,6 @@ static int
 nouveau_channel_init(struct nouveau_channel *chan, u32 vram, u32 gart)
 {
 	struct nvif_device *device = chan->device;
-	struct nouveau_cli *cli = (void *)chan->user.client;
 	struct nouveau_drm *drm = chan->drm;
 	struct nv_dma_v0 args = {};
 	int ret, i;
@@ -376,7 +380,7 @@ nouveau_channel_init(struct nouveau_channel *chan, u32 vram, u32 gart)
 			args.target = NV_DMA_V0_TARGET_VM;
 			args.access = NV_DMA_V0_ACCESS_VM;
 			args.start = 0;
-			args.limit = cli->vmm.vmm.limit - 1;
+			args.limit = chan->vmm->vmm.limit - 1;
 		} else {
 			args.target = NV_DMA_V0_TARGET_VRAM;
 			args.access = NV_DMA_V0_ACCESS_RDWR;
@@ -393,7 +397,7 @@ nouveau_channel_init(struct nouveau_channel *chan, u32 vram, u32 gart)
 			args.target = NV_DMA_V0_TARGET_VM;
 			args.access = NV_DMA_V0_ACCESS_VM;
 			args.start = 0;
-			args.limit = cli->vmm.vmm.limit - 1;
+			args.limit = chan->vmm->vmm.limit - 1;
 		} else
 		if (chan->drm->agp.bridge) {
 			args.target = NV_DMA_V0_TARGET_AGP;
@@ -405,7 +409,7 @@ nouveau_channel_init(struct nouveau_channel *chan, u32 vram, u32 gart)
 			args.target = NV_DMA_V0_TARGET_VM;
 			args.access = NV_DMA_V0_ACCESS_RDWR;
 			args.start = 0;
-			args.limit = cli->vmm.vmm.limit - 1;
+			args.limit = chan->vmm->vmm.limit - 1;
 		}
 
 		ret = nvif_object_init(&chan->user, gart, NV_DMA_IN_MEMORY,
@@ -495,6 +499,10 @@ nouveau_channel_new(struct nouveau_drm *drm, struct nvif_device *device,
 		nouveau_channel_del(pchan);
 	}
 
+	ret = nouveau_svmm_join((*pchan)->vmm->svmm, (*pchan)->inst);
+	if (ret)
+		nouveau_channel_del(pchan);
+
 done:
 	cli->base.super = super;
 	return ret;
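
Two related changes run through this file: each channel now caches its address
space in chan->vmm (the SVM-managed VMM when the client has initialised one,
the ordinary VMM otherwise), and creation/teardown bracket the channel with
nouveau_svmm_join()/nouveau_svmm_part().  A condensed sketch of the lifecycle
as the hunks above arrange it (assuming, as the guards suggest, that the SVM
helpers tolerate a NULL svmm):

	chan->vmm = cli->svm.cli ? &cli->svm : &cli->vmm;
	...
	ret = nouveau_svmm_join(chan->vmm->svmm, chan->inst);	/* on create */
	...
	nouveau_svmm_part(chan->vmm->svmm, chan->inst);		/* on delete */
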
diff --git a/drivers/gpu/drm/nouveau/nouveau_chan.h b/drivers/gpu/drm/nouveau/nouveau_chan.h
index 28418f4e5748..93814d1d31e4 100644
--- a/drivers/gpu/drm/nouveau/nouveau_chan.h
+++ b/drivers/gpu/drm/nouveau/nouveau_chan.h
@@ -8,6 +8,7 @@ struct nvif_device;
 struct nouveau_channel {
 	struct nvif_device *device;
 	struct nouveau_drm *drm;
+	struct nouveau_vmm *vmm;
 
 	int chid;
 	u64 inst;
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
index 3f463c91314a..4116ee62adaf 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
@@ -33,6 +33,7 @@
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_atomic.h>
 
 #include "nouveau_reg.h"
diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c
index f326ffd86766..55c0fa451163 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -30,19 +30,15 @@
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_helper.h>
-
-#include <nvif/class.h>
+#include <drm/drm_probe_helper.h>
 
 #include "nouveau_fbcon.h"
-#include "dispnv04/hw.h"
 #include "nouveau_crtc.h"
-#include "nouveau_dma.h"
 #include "nouveau_gem.h"
 #include "nouveau_connector.h"
 #include "nv50_display.h"
 
-#include "nouveau_fence.h"
-
+#include <nvif/class.h>
 #include <nvif/cl0046.h>
 #include <nvif/event.h>
 
@@ -411,15 +407,14 @@ nouveau_display_acpi_ntfy(struct notifier_block *nb, unsigned long val,
 #endif
 
 int
-nouveau_display_init(struct drm_device *dev)
+nouveau_display_init(struct drm_device *dev, bool resume, bool runtime)
 {
 	struct nouveau_display *disp = nouveau_display(dev);
-	struct nouveau_drm *drm = nouveau_drm(dev);
 	struct drm_connector *connector;
 	struct drm_connector_list_iter conn_iter;
 	int ret;
 
-	ret = disp->init(dev);
+	ret = disp->init(dev, resume, runtime);
 	if (ret)
 		return ret;
 
@@ -436,8 +431,6 @@ nouveau_display_init(struct drm_device *dev)
 	}
 	drm_connector_list_iter_end(&conn_iter);
 
-	/* enable flip completion events */
-	nvif_notify_get(&drm->flip);
 	return ret;
 }
 
@@ -453,12 +446,9 @@ nouveau_display_fini(struct drm_device *dev, bool suspend, bool runtime)
 		if (drm_drv_uses_atomic_modeset(dev))
 			drm_atomic_helper_shutdown(dev);
 		else
-			drm_crtc_force_disable_all(dev);
+			drm_helper_force_disable_all(dev);
 	}
 
-	/* disable flip completion events */
-	nvif_notify_put(&drm->flip);
-
 	/* disable hotplug interrupts */
 	drm_connector_list_iter_begin(dev, &conn_iter);
 	nouveau_for_each_non_mst_connector_iter(connector, &conn_iter) {
@@ -471,7 +461,7 @@ nouveau_display_fini(struct drm_device *dev, bool suspend, bool runtime)
 		cancel_work_sync(&drm->hpd_work);
 
 	drm_kms_helper_poll_disable(dev);
-	disp->fini(dev);
+	disp->fini(dev, suspend);
 }
 
 static void
@@ -624,7 +614,6 @@ int
 nouveau_display_suspend(struct drm_device *dev, bool runtime)
 {
 	struct nouveau_display *disp = nouveau_display(dev);
-	struct drm_crtc *crtc;
 
 	if (drm_drv_uses_atomic_modeset(dev)) {
 		if (!runtime) {
@@ -635,32 +624,9 @@ nouveau_display_suspend(struct drm_device *dev, bool runtime)
 				return ret;
 			}
 		}
-
-		nouveau_display_fini(dev, true, runtime);
-		return 0;
 	}
 
 	nouveau_display_fini(dev, true, runtime);
-
-	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
-		struct nouveau_framebuffer *nouveau_fb;
-
-		nouveau_fb = nouveau_framebuffer(crtc->primary->fb);
-		if (!nouveau_fb || !nouveau_fb->nvbo)
-			continue;
-
-		nouveau_bo_unpin(nouveau_fb->nvbo);
-	}
-
-	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
-		struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
-		if (nv_crtc->cursor.nvbo) {
-			if (nv_crtc->cursor.set_offset)
-				nouveau_bo_unmap(nv_crtc->cursor.nvbo);
-			nouveau_bo_unpin(nv_crtc->cursor.nvbo);
-		}
-	}
-
 	return 0;
 }
 
@@ -668,275 +634,16 @@ void
 nouveau_display_resume(struct drm_device *dev, bool runtime)
 {
 	struct nouveau_display *disp = nouveau_display(dev);
-	struct nouveau_drm *drm = nouveau_drm(dev);
-	struct drm_crtc *crtc;
-	int ret;
+
+	nouveau_display_init(dev, true, runtime);
 
 	if (drm_drv_uses_atomic_modeset(dev)) {
-		nouveau_display_init(dev);
 		if (disp->suspend) {
 			drm_atomic_helper_resume(dev, disp->suspend);
 			disp->suspend = NULL;
 		}
 		return;
 	}
-
-	/* re-pin fb/cursors */
-	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
-		struct nouveau_framebuffer *nouveau_fb;
-
-		nouveau_fb = nouveau_framebuffer(crtc->primary->fb);
-		if (!nouveau_fb || !nouveau_fb->nvbo)
-			continue;
-
-		ret = nouveau_bo_pin(nouveau_fb->nvbo, TTM_PL_FLAG_VRAM, true);
-		if (ret)
-			NV_ERROR(drm, "Could not pin framebuffer\n");
-	}
-
-	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
-		struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
-		if (!nv_crtc->cursor.nvbo)
-			continue;
-
-		ret = nouveau_bo_pin(nv_crtc->cursor.nvbo, TTM_PL_FLAG_VRAM, true);
-		if (!ret && nv_crtc->cursor.set_offset)
-			ret = nouveau_bo_map(nv_crtc->cursor.nvbo);
-		if (ret)
-			NV_ERROR(drm, "Could not pin/map cursor.\n");
-	}
-
-	nouveau_display_init(dev);
-
-	/* Force CLUT to get re-loaded during modeset */
-	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
-		struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
-
-		nv_crtc->lut.depth = 0;
-	}
-
-	/* This should ensure we don't hit a locking problem when someone
-	 * wakes us up via a connector.  We should never go into suspend
-	 * while the display is on anyways.
-	 */
-	if (runtime)
-		return;
-
-	drm_helper_resume_force_mode(dev);
-
-	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
-		struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
-
-		if (!nv_crtc->cursor.nvbo)
-			continue;
-
-		if (nv_crtc->cursor.set_offset)
-			nv_crtc->cursor.set_offset(nv_crtc, nv_crtc->cursor.nvbo->bo.offset);
-		nv_crtc->cursor.set_pos(nv_crtc, nv_crtc->cursor_saved_x,
-						 nv_crtc->cursor_saved_y);
-	}
-}
-
-static int
-nouveau_page_flip_emit(struct nouveau_channel *chan,
-		       struct nouveau_bo *old_bo,
-		       struct nouveau_bo *new_bo,
-		       struct nouveau_page_flip_state *s,
-		       struct nouveau_fence **pfence)
-{
-	struct nouveau_fence_chan *fctx = chan->fence;
-	struct nouveau_drm *drm = chan->drm;
-	struct drm_device *dev = drm->dev;
-	unsigned long flags;
-	int ret;
-
-	/* Queue it to the pending list */
-	spin_lock_irqsave(&dev->event_lock, flags);
-	list_add_tail(&s->head, &fctx->flip);
-	spin_unlock_irqrestore(&dev->event_lock, flags);
-
-	/* Synchronize with the old framebuffer */
-	ret = nouveau_fence_sync(old_bo, chan, false, false);
-	if (ret)
-		goto fail;
-
-	/* Emit the pageflip */
-	ret = RING_SPACE(chan, 2);
-	if (ret)
-		goto fail;
-
-	BEGIN_NV04(chan, NvSubSw, NV_SW_PAGE_FLIP, 1);
-	OUT_RING  (chan, 0x00000000);
-	FIRE_RING (chan);
-
-	ret = nouveau_fence_new(chan, false, pfence);
-	if (ret)
-		goto fail;
-
-	return 0;
-fail:
-	spin_lock_irqsave(&dev->event_lock, flags);
-	list_del(&s->head);
-	spin_unlock_irqrestore(&dev->event_lock, flags);
-	return ret;
-}
-
-int
-nouveau_crtc_page_flip(struct drm_crtc *crtc, struct drm_framebuffer *fb,
-		       struct drm_pending_vblank_event *event, u32 flags,
-		       struct drm_modeset_acquire_ctx *ctx)
-{
-	const int swap_interval = (flags & DRM_MODE_PAGE_FLIP_ASYNC) ? 0 : 1;
-	struct drm_device *dev = crtc->dev;
-	struct nouveau_drm *drm = nouveau_drm(dev);
-	struct nouveau_bo *old_bo = nouveau_framebuffer(crtc->primary->fb)->nvbo;
-	struct nouveau_bo *new_bo = nouveau_framebuffer(fb)->nvbo;
-	struct nouveau_page_flip_state *s;
-	struct nouveau_channel *chan;
-	struct nouveau_cli *cli;
-	struct nouveau_fence *fence;
-	struct nv04_display *dispnv04 = nv04_display(dev);
-	int head = nouveau_crtc(crtc)->index;
-	int ret;
-
-	chan = drm->channel;
-	if (!chan)
-		return -ENODEV;
-	cli = (void *)chan->user.client;
-
-	s = kzalloc(sizeof(*s), GFP_KERNEL);
-	if (!s)
-		return -ENOMEM;
-
-	if (new_bo != old_bo) {
-		ret = nouveau_bo_pin(new_bo, TTM_PL_FLAG_VRAM, true);
-		if (ret)
-			goto fail_free;
-	}
-
-	mutex_lock(&cli->mutex);
-	ret = ttm_bo_reserve(&new_bo->bo, true, false, NULL);
-	if (ret)
-		goto fail_unpin;
-
-	/* synchronise rendering channel with the kernel's channel */
-	ret = nouveau_fence_sync(new_bo, chan, false, true);
-	if (ret) {
-		ttm_bo_unreserve(&new_bo->bo);
-		goto fail_unpin;
-	}
-
-	if (new_bo != old_bo) {
-		ttm_bo_unreserve(&new_bo->bo);
-
-		ret = ttm_bo_reserve(&old_bo->bo, true, false, NULL);
-		if (ret)
-			goto fail_unpin;
-	}
-
-	/* Initialize a page flip struct */
-	*s = (struct nouveau_page_flip_state)
-		{ { }, event, crtc, fb->format->cpp[0] * 8, fb->pitches[0],
-		  new_bo->bo.offset };
-
-	/* Keep vblanks on during flip, for the target crtc of this flip */
-	drm_crtc_vblank_get(crtc);
-
-	/* Emit a page flip */
-	if (swap_interval) {
-		ret = RING_SPACE(chan, 8);
-		if (ret)
-			goto fail_unreserve;
-
-		BEGIN_NV04(chan, NvSubImageBlit, 0x012c, 1);
-		OUT_RING  (chan, 0);
-		BEGIN_NV04(chan, NvSubImageBlit, 0x0134, 1);
-		OUT_RING  (chan, head);
-		BEGIN_NV04(chan, NvSubImageBlit, 0x0100, 1);
-		OUT_RING  (chan, 0);
-		BEGIN_NV04(chan, NvSubImageBlit, 0x0130, 1);
-		OUT_RING  (chan, 0);
-	}
-
-	nouveau_bo_ref(new_bo, &dispnv04->image[head]);
-
-	ret = nouveau_page_flip_emit(chan, old_bo, new_bo, s, &fence);
-	if (ret)
-		goto fail_unreserve;
-	mutex_unlock(&cli->mutex);
-
-	/* Update the crtc struct and cleanup */
-	crtc->primary->fb = fb;
-
-	nouveau_bo_fence(old_bo, fence, false);
-	ttm_bo_unreserve(&old_bo->bo);
-	if (old_bo != new_bo)
-		nouveau_bo_unpin(old_bo);
-	nouveau_fence_unref(&fence);
-	return 0;
-
-fail_unreserve:
-	drm_crtc_vblank_put(crtc);
-	ttm_bo_unreserve(&old_bo->bo);
-fail_unpin:
-	mutex_unlock(&cli->mutex);
-	if (old_bo != new_bo)
-		nouveau_bo_unpin(new_bo);
-fail_free:
-	kfree(s);
-	return ret;
-}
-
-int
-nouveau_finish_page_flip(struct nouveau_channel *chan,
-			 struct nouveau_page_flip_state *ps)
-{
-	struct nouveau_fence_chan *fctx = chan->fence;
-	struct nouveau_drm *drm = chan->drm;
-	struct drm_device *dev = drm->dev;
-	struct nouveau_page_flip_state *s;
-	unsigned long flags;
-
-	spin_lock_irqsave(&dev->event_lock, flags);
-
-	if (list_empty(&fctx->flip)) {
-		NV_ERROR(drm, "unexpected pageflip\n");
-		spin_unlock_irqrestore(&dev->event_lock, flags);
-		return -EINVAL;
-	}
-
-	s = list_first_entry(&fctx->flip, struct nouveau_page_flip_state, head);
-	if (s->event) {
-		drm_crtc_arm_vblank_event(s->crtc, s->event);
-	} else {
-		/* Give up ownership of vblank for page-flipped crtc */
-		drm_crtc_vblank_put(s->crtc);
-	}
-
-	list_del(&s->head);
-	if (ps)
-		*ps = *s;
-	kfree(s);
-
-	spin_unlock_irqrestore(&dev->event_lock, flags);
-	return 0;
-}
-
-int
-nouveau_flip_complete(struct nvif_notify *notify)
-{
-	struct nouveau_drm *drm = container_of(notify, typeof(*drm), flip);
-	struct nouveau_channel *chan = drm->channel;
-	struct nouveau_page_flip_state state;
-
-	if (!nouveau_finish_page_flip(chan, &state)) {
-		nv_set_crtc_base(drm->dev, drm_crtc_index(state.crtc),
-				 state.offset + state.crtc->y *
-				 state.pitch + state.crtc->x *
-				 state.bpp / 8);
-	}
-
-	return NVIF_NOTIFY_KEEP;
 }
 
 int
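
With the legacy pre-atomic page-flip machinery gone, suspend and resume for
both the atomic and non-atomic paths funnel through the same two entry points,
which now take explicit flags instead of re-deriving that state.  The
resulting call shape, condensed from the hunks above:

	/* suspend side */
	nouveau_display_fini(dev, true /* suspend */, runtime);

	/* resume side */
	nouveau_display_init(dev, true /* resume */, runtime);
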
diff --git a/drivers/gpu/drm/nouveau/nouveau_display.h b/drivers/gpu/drm/nouveau/nouveau_display.h
index eb77e41c2d4e..311e175f0513 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.h
+++ b/drivers/gpu/drm/nouveau/nouveau_display.h
@@ -25,19 +25,11 @@ int nouveau_framebuffer_new(struct drm_device *,
 			    const struct drm_mode_fb_cmd2 *,
 			    struct nouveau_bo *, struct nouveau_framebuffer **);
 
-struct nouveau_page_flip_state {
-	struct list_head head;
-	struct drm_pending_vblank_event *event;
-	struct drm_crtc *crtc;
-	int bpp, pitch;
-	u64 offset;
-};
-
 struct nouveau_display {
 	void *priv;
 	void (*dtor)(struct drm_device *);
-	int  (*init)(struct drm_device *);
-	void (*fini)(struct drm_device *);
+	int  (*init)(struct drm_device *, bool resume, bool runtime);
+	void (*fini)(struct drm_device *, bool suspend);
 
 	struct nvif_disp disp;
 
@@ -61,7 +53,7 @@ nouveau_display(struct drm_device *dev)
 
 int  nouveau_display_create(struct drm_device *dev);
 void nouveau_display_destroy(struct drm_device *dev);
-int  nouveau_display_init(struct drm_device *dev);
+int  nouveau_display_init(struct drm_device *dev, bool resume, bool runtime);
 void nouveau_display_fini(struct drm_device *dev, bool suspend, bool runtime);
 int  nouveau_display_suspend(struct drm_device *dev, bool runtime);
 void nouveau_display_resume(struct drm_device *dev, bool runtime);
@@ -71,13 +63,6 @@ bool  nouveau_display_scanoutpos(struct drm_device *, unsigned int,
 				 bool, int *, int *, ktime_t *,
 				 ktime_t *, const struct drm_display_mode *);
 
-int  nouveau_crtc_page_flip(struct drm_crtc *crtc, struct drm_framebuffer *fb,
-			    struct drm_pending_vblank_event *event,
-			    uint32_t page_flip_flags,
-			    struct drm_modeset_acquire_ctx *ctx);
-int  nouveau_finish_page_flip(struct nouveau_channel *,
-			      struct nouveau_page_flip_state *);
-
 int  nouveau_display_dumb_create(struct drm_file *, struct drm_device *,
 				 struct drm_mode_create_dumb *args);
 int  nouveau_display_dumb_map_offset(struct drm_file *, struct drm_device *,
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
new file mode 100644
index 000000000000..8be7a83ced9b
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -0,0 +1,887 @@
+/*
+ * Copyright 2018 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+#include "nouveau_dmem.h"
+#include "nouveau_drv.h"
+#include "nouveau_chan.h"
+#include "nouveau_dma.h"
+#include "nouveau_mem.h"
+#include "nouveau_bo.h"
+
+#include <nvif/class.h>
+#include <nvif/object.h>
+#include <nvif/if500b.h>
+#include <nvif/if900b.h>
+
+#include <linux/sched/mm.h>
+#include <linux/hmm.h>
+
+/*
+ * FIXME: this is ugly.  Right now we use TTM to allocate VRAM, and we pin it
+ * in VRAM while it is in use.  We likely want to overhaul nouveau's memory
+ * management to be more page-like (not necessarily with the system page size,
+ * but some bigger page size) at the lowest level, with a shim layer on top
+ * that provides the same functionality as TTM.
+ */
+#define DMEM_CHUNK_SIZE (2UL << 20)
+#define DMEM_CHUNK_NPAGES (DMEM_CHUNK_SIZE >> PAGE_SHIFT)
+
+struct nouveau_migrate;
+
+enum nouveau_aper {
+	NOUVEAU_APER_VIRT,
+	NOUVEAU_APER_VRAM,
+	NOUVEAU_APER_HOST,
+};
+
+typedef int (*nouveau_migrate_copy_t)(struct nouveau_drm *drm, u64 npages,
+				      enum nouveau_aper, u64 dst_addr,
+				      enum nouveau_aper, u64 src_addr);
+
+struct nouveau_dmem_chunk {
+	struct list_head list;
+	struct nouveau_bo *bo;
+	struct nouveau_drm *drm;
+	unsigned long pfn_first;
+	unsigned long callocated;
+	unsigned long bitmap[BITS_TO_LONGS(DMEM_CHUNK_NPAGES)];
+	spinlock_t lock;
+};
+
+struct nouveau_dmem_migrate {
+	nouveau_migrate_copy_t copy_func;
+	struct nouveau_channel *chan;
+};
+
+struct nouveau_dmem {
+	struct hmm_devmem *devmem;
+	struct nouveau_dmem_migrate migrate;
+	struct list_head chunk_free;
+	struct list_head chunk_full;
+	struct list_head chunk_empty;
+	struct mutex mutex;
+};
+
+struct nouveau_dmem_fault {
+	struct nouveau_drm *drm;
+	struct nouveau_fence *fence;
+	dma_addr_t *dma;
+	unsigned long npages;
+};
+
+struct nouveau_migrate {
+	struct vm_area_struct *vma;
+	struct nouveau_drm *drm;
+	struct nouveau_fence *fence;
+	unsigned long npages;
+	dma_addr_t *dma;
+	unsigned long dma_nr;
+};
+
+static void
+nouveau_dmem_free(struct hmm_devmem *devmem, struct page *page)
+{
+	struct nouveau_dmem_chunk *chunk;
+	struct nouveau_drm *drm;
+	unsigned long idx;
+
+	chunk = (void *)hmm_devmem_page_get_drvdata(page);
+	idx = page_to_pfn(page) - chunk->pfn_first;
+	drm = chunk->drm;
+
+	/*
+	 * FIXME:
+	 *
+	 * This is really a bad example; we need to overhaul nouveau memory
+	 * management to be more page focused, and to allow a lighter locking
+	 * scheme to be used in the process.
+	 */
+	spin_lock(&chunk->lock);
+	clear_bit(idx, chunk->bitmap);
+	WARN_ON(!chunk->callocated);
+	chunk->callocated--;
+	/*
+	 * FIXME: when chunk->callocated reaches 0 we should add the chunk to
+	 * a reclaim list so that it can be freed in case of memory pressure.
+	 */
+	spin_unlock(&chunk->lock);
+}
+
+static void
+nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
+				  const unsigned long *src_pfns,
+				  unsigned long *dst_pfns,
+				  unsigned long start,
+				  unsigned long end,
+				  void *private)
+{
+	struct nouveau_dmem_fault *fault = private;
+	struct nouveau_drm *drm = fault->drm;
+	struct device *dev = drm->dev->dev;
+	unsigned long addr, i, npages = 0;
+	nouveau_migrate_copy_t copy;
+	int ret;
+
+
+	/* First allocate new memory */
+	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
+		struct page *dpage, *spage;
+
+		dst_pfns[i] = 0;
+		spage = migrate_pfn_to_page(src_pfns[i]);
+		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE))
+			continue;
+
+		dpage = hmm_vma_alloc_locked_page(vma, addr);
+		if (!dpage) {
+			dst_pfns[i] = MIGRATE_PFN_ERROR;
+			continue;
+		}
+
+		dst_pfns[i] = migrate_pfn(page_to_pfn(dpage)) |
+			      MIGRATE_PFN_LOCKED;
+		npages++;
+	}
+
+	/* Allocate storage for DMA addresses, so we can unmap later. */
+	fault->dma = kmalloc(sizeof(*fault->dma) * npages, GFP_KERNEL);
+	if (!fault->dma)
+		goto error;
+
+	/* Copy things over */
+	copy = drm->dmem->migrate.copy_func;
+	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
+		struct nouveau_dmem_chunk *chunk;
+		struct page *spage, *dpage;
+		u64 src_addr, dst_addr;
+
+		dpage = migrate_pfn_to_page(dst_pfns[i]);
+		if (!dpage || dst_pfns[i] == MIGRATE_PFN_ERROR)
+			continue;
+
+		spage = migrate_pfn_to_page(src_pfns[i]);
+		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE)) {
+			dst_pfns[i] = MIGRATE_PFN_ERROR;
+			__free_page(dpage);
+			continue;
+		}
+
+		fault->dma[fault->npages] =
+			dma_map_page_attrs(dev, dpage, 0, PAGE_SIZE,
+					   PCI_DMA_BIDIRECTIONAL,
+					   DMA_ATTR_SKIP_CPU_SYNC);
+		if (dma_mapping_error(dev, fault->dma[fault->npages])) {
+			dst_pfns[i] = MIGRATE_PFN_ERROR;
+			__free_page(dpage);
+			continue;
+		}
+
+		dst_addr = fault->dma[fault->npages++];
+
+		chunk = (void *)hmm_devmem_page_get_drvdata(spage);
+		src_addr = page_to_pfn(spage) - chunk->pfn_first;
+		src_addr = (src_addr << PAGE_SHIFT) + chunk->bo->bo.offset;
+
+		ret = copy(drm, 1, NOUVEAU_APER_HOST, dst_addr,
+				   NOUVEAU_APER_VRAM, src_addr);
+		if (ret) {
+			dst_pfns[i] = MIGRATE_PFN_ERROR;
+			__free_page(dpage);
+			continue;
+		}
+	}
+
+	nouveau_fence_new(drm->dmem->migrate.chan, false, &fault->fence);
+
+	return;
+
+error:
+	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, ++i) {
+		struct page *page;
+
+		if (!dst_pfns[i] || dst_pfns[i] == MIGRATE_PFN_ERROR)
+			continue;
+
+		page = migrate_pfn_to_page(dst_pfns[i]);
+		dst_pfns[i] = MIGRATE_PFN_ERROR;
+		if (page == NULL)
+			continue;
+
+		__free_page(page);
+	}
+}
+
+void nouveau_dmem_fault_finalize_and_map(struct vm_area_struct *vma,
+					 const unsigned long *src_pfns,
+					 const unsigned long *dst_pfns,
+					 unsigned long start,
+					 unsigned long end,
+					 void *private)
+{
+	struct nouveau_dmem_fault *fault = private;
+	struct nouveau_drm *drm = fault->drm;
+
+	if (fault->fence) {
+		nouveau_fence_wait(fault->fence, true, false);
+		nouveau_fence_unref(&fault->fence);
+	} else {
+		/*
+		 * FIXME: wait for the channel to be idle before finalizing
+		 * the hmem object below (nouveau_migrate_hmem_fini()).
+		 */
+	}
+
+	while (fault->npages--) {
+		dma_unmap_page(drm->dev->dev, fault->dma[fault->npages],
+			       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+	}
+	kfree(fault->dma);
+}
+
+static const struct migrate_vma_ops nouveau_dmem_fault_migrate_ops = {
+	.alloc_and_copy		= nouveau_dmem_fault_alloc_and_copy,
+	.finalize_and_map	= nouveau_dmem_fault_finalize_and_map,
+};
+
+static vm_fault_t
+nouveau_dmem_fault(struct hmm_devmem *devmem,
+		   struct vm_area_struct *vma,
+		   unsigned long addr,
+		   const struct page *page,
+		   unsigned int flags,
+		   pmd_t *pmdp)
+{
+	struct drm_device *drm_dev = dev_get_drvdata(devmem->device);
+	unsigned long src[1] = {0}, dst[1] = {0};
+	struct nouveau_dmem_fault fault = {0};
+	int ret;
+
+
+
+	/*
+	 * FIXME: what we really want is some heuristic to migrate more than
+	 * just one page on CPU fault.  When such a fault happens, it is very
+	 * likely that the surrounding pages will CPU-fault too.
+	 */
+	fault.drm = nouveau_drm(drm_dev);
+	ret = migrate_vma(&nouveau_dmem_fault_migrate_ops, vma, addr,
+			  addr + PAGE_SIZE, src, dst, &fault);
+	if (ret)
+		return VM_FAULT_SIGBUS;
+
+	if (dst[0] == MIGRATE_PFN_ERROR)
+		return VM_FAULT_SIGBUS;
+
+	return 0;
+}
+
+static const struct hmm_devmem_ops
+nouveau_dmem_devmem_ops = {
+	.free = nouveau_dmem_free,
+	.fault = nouveau_dmem_fault,
+};
+
+static int
+nouveau_dmem_chunk_alloc(struct nouveau_drm *drm)
+{
+	struct nouveau_dmem_chunk *chunk;
+	int ret;
+
+	if (drm->dmem == NULL)
+		return -EINVAL;
+
+	mutex_lock(&drm->dmem->mutex);
+	chunk = list_first_entry_or_null(&drm->dmem->chunk_empty,
+					 struct nouveau_dmem_chunk,
+					 list);
+	if (chunk == NULL) {
+		mutex_unlock(&drm->dmem->mutex);
+		return -ENOMEM;
+	}
+
+	list_del(&chunk->list);
+	mutex_unlock(&drm->dmem->mutex);
+
+	ret = nouveau_bo_new(&drm->client, DMEM_CHUNK_SIZE, 0,
+			     TTM_PL_FLAG_VRAM, 0, 0, NULL, NULL,
+			     &chunk->bo);
+	if (ret)
+		goto out;
+
+	ret = nouveau_bo_pin(chunk->bo, TTM_PL_FLAG_VRAM, false);
+	if (ret) {
+		nouveau_bo_ref(NULL, &chunk->bo);
+		goto out;
+	}
+
+	bitmap_zero(chunk->bitmap, DMEM_CHUNK_NPAGES);
+	spin_lock_init(&chunk->lock);
+
+out:
+	mutex_lock(&drm->dmem->mutex);
+	if (chunk->bo)
+		list_add(&chunk->list, &drm->dmem->chunk_empty);
+	else
+		list_add_tail(&chunk->list, &drm->dmem->chunk_empty);
+	mutex_unlock(&drm->dmem->mutex);
+
+	return ret;
+}
+
+static struct nouveau_dmem_chunk *
+nouveau_dmem_chunk_first_free_locked(struct nouveau_drm *drm)
+{
+	struct nouveau_dmem_chunk *chunk;
+
+	chunk = list_first_entry_or_null(&drm->dmem->chunk_free,
+					 struct nouveau_dmem_chunk,
+					 list);
+	if (chunk)
+		return chunk;
+
+	chunk = list_first_entry_or_null(&drm->dmem->chunk_empty,
+					 struct nouveau_dmem_chunk,
+					 list);
+	if (chunk && chunk->bo)
+		return chunk;
+
+	return NULL;
+}
+
+static int
+nouveau_dmem_pages_alloc(struct nouveau_drm *drm,
+			 unsigned long npages,
+			 unsigned long *pages)
+{
+	struct nouveau_dmem_chunk *chunk;
+	unsigned long c;
+	int ret;
+
+	memset(pages, 0xff, npages * sizeof(*pages));
+
+	mutex_lock(&drm->dmem->mutex);
+	for (c = 0; c < npages;) {
+		unsigned long i;
+
+		chunk = nouveau_dmem_chunk_first_free_locked(drm);
+		if (chunk == NULL) {
+			mutex_unlock(&drm->dmem->mutex);
+			ret = nouveau_dmem_chunk_alloc(drm);
+			if (ret) {
+				if (c)
+					break;
+				return ret;
+			}
+			continue;
+		}
+
+		spin_lock(&chunk->lock);
+		i = find_first_zero_bit(chunk->bitmap, DMEM_CHUNK_NPAGES);
+		while (i < DMEM_CHUNK_NPAGES && c < npages) {
+			pages[c] = chunk->pfn_first + i;
+			set_bit(i, chunk->bitmap);
+			chunk->callocated++;
+			c++;
+
+			i = find_next_zero_bit(chunk->bitmap,
+					DMEM_CHUNK_NPAGES, i);
+		}
+		spin_unlock(&chunk->lock);
+	}
+	mutex_unlock(&drm->dmem->mutex);
+
+	return 0;
+}
+
+static struct page *
+nouveau_dmem_page_alloc_locked(struct nouveau_drm *drm)
+{
+	unsigned long pfns[1];
+	struct page *page;
+	int ret;
+
+	/* FIXME: stop all this mismatched API use ... */
+	ret = nouveau_dmem_pages_alloc(drm, 1, pfns);
+	if (ret)
+		return NULL;
+
+	page = pfn_to_page(pfns[0]);
+	get_page(page);
+	lock_page(page);
+	return page;
+}
+
+static void
+nouveau_dmem_page_free_locked(struct nouveau_drm *drm, struct page *page)
+{
+	unlock_page(page);
+	put_page(page);
+}
+
+void
+nouveau_dmem_resume(struct nouveau_drm *drm)
+{
+	struct nouveau_dmem_chunk *chunk;
+	int ret;
+
+	if (drm->dmem == NULL)
+		return;
+
+	mutex_lock(&drm->dmem->mutex);
+	list_for_each_entry (chunk, &drm->dmem->chunk_free, list) {
+		ret = nouveau_bo_pin(chunk->bo, TTM_PL_FLAG_VRAM, false);
+		/* FIXME handle pin failure */
+		WARN_ON(ret);
+	}
+	list_for_each_entry (chunk, &drm->dmem->chunk_full, list) {
+		ret = nouveau_bo_pin(chunk->bo, TTM_PL_FLAG_VRAM, false);
+		/* FIXME handle pin failure */
+		WARN_ON(ret);
+	}
+	list_for_each_entry (chunk, &drm->dmem->chunk_empty, list) {
+		ret = nouveau_bo_pin(chunk->bo, TTM_PL_FLAG_VRAM, false);
+		/* FIXME handle pin failure */
+		WARN_ON(ret);
+	}
+	mutex_unlock(&drm->dmem->mutex);
+}
+
+void
+nouveau_dmem_suspend(struct nouveau_drm *drm)
+{
+	struct nouveau_dmem_chunk *chunk;
+
+	if (drm->dmem == NULL)
+		return;
+
+	mutex_lock(&drm->dmem->mutex);
+	list_for_each_entry (chunk, &drm->dmem->chunk_free, list) {
+		nouveau_bo_unpin(chunk->bo);
+	}
+	list_for_each_entry (chunk, &drm->dmem->chunk_full, list) {
+		nouveau_bo_unpin(chunk->bo);
+	}
+	list_for_each_entry (chunk, &drm->dmem->chunk_empty, list) {
+		nouveau_bo_unpin(chunk->bo);
+	}
+	mutex_unlock(&drm->dmem->mutex);
+}
+
+void
+nouveau_dmem_fini(struct nouveau_drm *drm)
+{
+	struct nouveau_dmem_chunk *chunk, *tmp;
+
+	if (drm->dmem == NULL)
+		return;
+
+	mutex_lock(&drm->dmem->mutex);
+
+	WARN_ON(!list_empty(&drm->dmem->chunk_free));
+	WARN_ON(!list_empty(&drm->dmem->chunk_full));
+
+	list_for_each_entry_safe (chunk, tmp, &drm->dmem->chunk_empty, list) {
+		if (chunk->bo) {
+			nouveau_bo_unpin(chunk->bo);
+			nouveau_bo_ref(NULL, &chunk->bo);
+		}
+		list_del(&chunk->list);
+		kfree(chunk);
+	}
+
+	mutex_unlock(&drm->dmem->mutex);
+}
+
+static int
+nvc0b5_migrate_copy(struct nouveau_drm *drm, u64 npages,
+		    enum nouveau_aper dst_aper, u64 dst_addr,
+		    enum nouveau_aper src_aper, u64 src_addr)
+{
+	struct nouveau_channel *chan = drm->dmem->migrate.chan;
+	u32 launch_dma = (1 << 9) /* MULTI_LINE_ENABLE. */ |
+			 (1 << 8) /* DST_MEMORY_LAYOUT_PITCH. */ |
+			 (1 << 7) /* SRC_MEMORY_LAYOUT_PITCH. */ |
+			 (1 << 2) /* FLUSH_ENABLE_TRUE. */ |
+			 (2 << 0) /* DATA_TRANSFER_TYPE_NON_PIPELINED. */;
+	int ret;
+
+	ret = RING_SPACE(chan, 13);
+	if (ret)
+		return ret;
+
+	if (src_aper != NOUVEAU_APER_VIRT) {
+		switch (src_aper) {
+		case NOUVEAU_APER_VRAM:
+			BEGIN_IMC0(chan, NvSubCopy, 0x0260, 0);
+			break;
+		case NOUVEAU_APER_HOST:
+			BEGIN_IMC0(chan, NvSubCopy, 0x0260, 1);
+			break;
+		default:
+			return -EINVAL;
+		}
+		launch_dma |= 0x00001000; /* SRC_TYPE_PHYSICAL. */
+	}
+
+	if (dst_aper != NOUVEAU_APER_VIRT) {
+		switch (dst_aper) {
+		case NOUVEAU_APER_VRAM:
+			BEGIN_IMC0(chan, NvSubCopy, 0x0264, 0);
+			break;
+		case NOUVEAU_APER_HOST:
+			BEGIN_IMC0(chan, NvSubCopy, 0x0264, 1);
+			break;
+		default:
+			return -EINVAL;
+		}
+		launch_dma |= 0x00002000; /* DST_TYPE_PHYSICAL. */
+	}
+
+	BEGIN_NVC0(chan, NvSubCopy, 0x0400, 8);
+	OUT_RING  (chan, upper_32_bits(src_addr));
+	OUT_RING  (chan, lower_32_bits(src_addr));
+	OUT_RING  (chan, upper_32_bits(dst_addr));
+	OUT_RING  (chan, lower_32_bits(dst_addr));
+	OUT_RING  (chan, PAGE_SIZE);
+	OUT_RING  (chan, PAGE_SIZE);
+	OUT_RING  (chan, PAGE_SIZE);
+	OUT_RING  (chan, npages);
+	BEGIN_NVC0(chan, NvSubCopy, 0x0300, 1);
+	OUT_RING  (chan, launch_dma);
+	return 0;
+}
+
+static int
+nouveau_dmem_migrate_init(struct nouveau_drm *drm)
+{
+	switch (drm->ttm.copy.oclass) {
+	case PASCAL_DMA_COPY_A:
+	case PASCAL_DMA_COPY_B:
+	case  VOLTA_DMA_COPY_A:
+	case TURING_DMA_COPY_A:
+		drm->dmem->migrate.copy_func = nvc0b5_migrate_copy;
+		drm->dmem->migrate.chan = drm->ttm.chan;
+		return 0;
+	default:
+		break;
+	}
+	return -ENODEV;
+}
+
+void
+nouveau_dmem_init(struct nouveau_drm *drm)
+{
+	struct device *device = drm->dev->dev;
+	unsigned long i, size;
+	int ret;
+
+	/* This only makes sense on PASCAL or newer. */
+	if (drm->client.device.info.family < NV_DEVICE_INFO_V0_PASCAL)
+		return;
+
+	if (!(drm->dmem = kzalloc(sizeof(*drm->dmem), GFP_KERNEL)))
+		return;
+
+	mutex_init(&drm->dmem->mutex);
+	INIT_LIST_HEAD(&drm->dmem->chunk_free);
+	INIT_LIST_HEAD(&drm->dmem->chunk_full);
+	INIT_LIST_HEAD(&drm->dmem->chunk_empty);
+
+	size = ALIGN(drm->client.device.info.ram_user, DMEM_CHUNK_SIZE);
+
+	/* Initialize migration DMA helpers before registering memory. */
+	ret = nouveau_dmem_migrate_init(drm);
+	if (ret) {
+		kfree(drm->dmem);
+		drm->dmem = NULL;
+		return;
+	}
+
+	/*
+	 * FIXME: we need some kind of policy to decide how much VRAM we
+	 * want to register with HMM.  For now just register everything,
+	 * and later, if we want to do things like overcommit, we can
+	 * revisit this.
+	 */
+	drm->dmem->devmem = hmm_devmem_add(&nouveau_dmem_devmem_ops,
+					   device, size);
+	if (drm->dmem->devmem == NULL) {
+		kfree(drm->dmem);
+		drm->dmem = NULL;
+		return;
+	}
+
+	for (i = 0; i < (size / DMEM_CHUNK_SIZE); ++i) {
+		struct nouveau_dmem_chunk *chunk;
+		struct page *page;
+		unsigned long j;
+
+		chunk = kzalloc(sizeof(*chunk), GFP_KERNEL);
+		if (chunk == NULL) {
+			nouveau_dmem_fini(drm);
+			return;
+		}
+
+		chunk->drm = drm;
+		chunk->pfn_first = drm->dmem->devmem->pfn_first;
+		chunk->pfn_first += (i * DMEM_CHUNK_NPAGES);
+		list_add_tail(&chunk->list, &drm->dmem->chunk_empty);
+
+		page = pfn_to_page(chunk->pfn_first);
+		for (j = 0; j < DMEM_CHUNK_NPAGES; ++j, ++page) {
+			hmm_devmem_page_set_drvdata(page, (long)chunk);
+		}
+	}
+
+	NV_INFO(drm, "DMEM: registered %luMB of device memory\n", size >> 20);
+}
+
+static void
+nouveau_dmem_migrate_alloc_and_copy(struct vm_area_struct *vma,
+				    const unsigned long *src_pfns,
+				    unsigned long *dst_pfns,
+				    unsigned long start,
+				    unsigned long end,
+				    void *private)
+{
+	struct nouveau_migrate *migrate = private;
+	struct nouveau_drm *drm = migrate->drm;
+	struct device *dev = drm->dev->dev;
+	unsigned long addr, i, npages = 0;
+	nouveau_migrate_copy_t copy;
+	int ret;
+
+	/* First allocate new memory */
+	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
+		struct page *dpage, *spage;
+
+		dst_pfns[i] = 0;
+		spage = migrate_pfn_to_page(src_pfns[i]);
+		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE))
+			continue;
+
+		dpage = nouveau_dmem_page_alloc_locked(drm);
+		if (!dpage)
+			continue;
+
+		dst_pfns[i] = migrate_pfn(page_to_pfn(dpage)) |
+			      MIGRATE_PFN_LOCKED |
+			      MIGRATE_PFN_DEVICE;
+		npages++;
+	}
+
+	if (!npages)
+		return;
+
+	/* Allocate storage for DMA addresses, so we can unmap later. */
+	migrate->dma = kmalloc(sizeof(*migrate->dma) * npages, GFP_KERNEL);
+	if (!migrate->dma)
+		goto error;
+
+	/* Copy things over */
+	copy = drm->dmem->migrate.copy_func;
+	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
+		struct nouveau_dmem_chunk *chunk;
+		struct page *spage, *dpage;
+		u64 src_addr, dst_addr;
+
+		dpage = migrate_pfn_to_page(dst_pfns[i]);
+		if (!dpage || dst_pfns[i] == MIGRATE_PFN_ERROR)
+			continue;
+
+		chunk = (void *)hmm_devmem_page_get_drvdata(dpage);
+		dst_addr = page_to_pfn(dpage) - chunk->pfn_first;
+		dst_addr = (dst_addr << PAGE_SHIFT) + chunk->bo->bo.offset;
+
+		spage = migrate_pfn_to_page(src_pfns[i]);
+		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE)) {
+			nouveau_dmem_page_free_locked(drm, dpage);
+			dst_pfns[i] = 0;
+			continue;
+		}
+
+		migrate->dma[migrate->dma_nr] =
+			dma_map_page_attrs(dev, spage, 0, PAGE_SIZE,
+					   PCI_DMA_BIDIRECTIONAL,
+					   DMA_ATTR_SKIP_CPU_SYNC);
+		if (dma_mapping_error(dev, migrate->dma[migrate->dma_nr])) {
+			nouveau_dmem_page_free_locked(drm, dpage);
+			dst_pfns[i] = 0;
+			continue;
+		}
+
+		src_addr = migrate->dma[migrate->dma_nr++];
+
+		ret = copy(drm, 1, NOUVEAU_APER_VRAM, dst_addr,
+				   NOUVEAU_APER_HOST, src_addr);
+		if (ret) {
+			nouveau_dmem_page_free_locked(drm, dpage);
+			dst_pfns[i] = 0;
+			continue;
+		}
+	}
+
+	nouveau_fence_new(drm->dmem->migrate.chan, false, &migrate->fence);
+
+	return;
+
+error:
+	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, ++i) {
+		struct page *page;
+
+		if (!dst_pfns[i] || dst_pfns[i] == MIGRATE_PFN_ERROR)
+			continue;
+
+		page = migrate_pfn_to_page(dst_pfns[i]);
+		dst_pfns[i] = MIGRATE_PFN_ERROR;
+		if (page == NULL)
+			continue;
+
+		__free_page(page);
+	}
+}
+
+void nouveau_dmem_migrate_finalize_and_map(struct vm_area_struct *vma,
+					   const unsigned long *src_pfns,
+					   const unsigned long *dst_pfns,
+					   unsigned long start,
+					   unsigned long end,
+					   void *private)
+{
+	struct nouveau_migrate *migrate = private;
+	struct nouveau_drm *drm = migrate->drm;
+
+	if (migrate->fence) {
+		nouveau_fence_wait(migrate->fence, true, false);
+		nouveau_fence_unref(&migrate->fence);
+	} else {
+		/*
+		 * FIXME: should we wait for the channel to be idle before
+		 * finalizing the hmem object below (nouveau_migrate_hmem_fini())?
+		 */
+	}
+
+	while (migrate->dma_nr--) {
+		dma_unmap_page(drm->dev->dev, migrate->dma[migrate->dma_nr],
+			       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+	}
+	kfree(migrate->dma);
+
+	/*
+	 * FIXME optimization: update GPU page table to point to newly
+	 * migrated memory.
+	 */
+}
+
+static const struct migrate_vma_ops nouveau_dmem_migrate_ops = {
+	.alloc_and_copy		= nouveau_dmem_migrate_alloc_and_copy,
+	.finalize_and_map	= nouveau_dmem_migrate_finalize_and_map,
+};
+
+int
+nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
+			 struct vm_area_struct *vma,
+			 unsigned long start,
+			 unsigned long end)
+{
+	unsigned long *src_pfns, *dst_pfns, npages;
+	struct nouveau_migrate migrate = {0};
+	unsigned long i, c, max;
+	int ret = 0;
+
+	npages = (end - start) >> PAGE_SHIFT;
+	max = min(SG_MAX_SINGLE_ALLOC, npages);
+	src_pfns = kzalloc(sizeof(long) * max, GFP_KERNEL);
+	if (src_pfns == NULL)
+		return -ENOMEM;
+	dst_pfns = kzalloc(sizeof(long) * max, GFP_KERNEL);
+	if (dst_pfns == NULL) {
+		kfree(src_pfns);
+		return -ENOMEM;
+	}
+
+	migrate.drm = drm;
+	migrate.vma = vma;
+	migrate.npages = npages;
+	for (i = 0; i < npages; i += c) {
+		unsigned long next;
+
+		c = min(SG_MAX_SINGLE_ALLOC, npages - i);
+		next = start + (c << PAGE_SHIFT);
+		ret = migrate_vma(&nouveau_dmem_migrate_ops, vma, start,
+				  next, src_pfns, dst_pfns, &migrate);
+		if (ret)
+			goto out;
+		start = next;
+	}
+
+out:
+	kfree(dst_pfns);
+	kfree(src_pfns);
+	return ret;
+}
+
+static inline bool
+nouveau_dmem_page(struct nouveau_drm *drm, struct page *page)
+{
+	if (!is_device_private_page(page))
+		return false;
+
+	if (drm->dmem->devmem != page->pgmap->data)
+		return false;
+
+	return true;
+}
+
+void
+nouveau_dmem_convert_pfn(struct nouveau_drm *drm,
+			 struct hmm_range *range)
+{
+	unsigned long i, npages;
+
+	npages = (range->end - range->start) >> PAGE_SHIFT;
+	for (i = 0; i < npages; ++i) {
+		struct nouveau_dmem_chunk *chunk;
+		struct page *page;
+		uint64_t addr;
+
+		page = hmm_pfn_to_page(range, range->pfns[i]);
+		if (page == NULL)
+			continue;
+
+		if (!(range->pfns[i] & range->flags[HMM_PFN_DEVICE_PRIVATE])) {
+			continue;
+		}
+
+		if (!nouveau_dmem_page(drm, page)) {
+			WARN(1, "Some unknown device memory!\n");
+			range->pfns[i] = 0;
+			continue;
+		}
+
+		chunk = (void *)hmm_devmem_page_get_drvdata(page);
+		addr = page_to_pfn(page) - chunk->pfn_first;
+		addr = (addr + chunk->bo->bo.mem.start) << PAGE_SHIFT;
+
+		range->pfns[i] &= ((1UL << range->pfn_shift) - 1);
+		range->pfns[i] |= (addr >> PAGE_SHIFT) << range->pfn_shift;
+	}
+}
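
The file above is built around the two-phase migrate_vma() protocol of this
kernel: alloc_and_copy() picks destination pages and kicks off (possibly
asynchronous) copies, then finalize_and_map() waits on the copy fence and
tears down the DMA mappings.  A stripped-down sketch of a caller of that
protocol, using hypothetical my_* names (the ops-based migrate_vma()
signature shown is the one used above):

	static void my_alloc_and_copy(struct vm_area_struct *vma,
				      const unsigned long *src_pfns,
				      unsigned long *dst_pfns,
				      unsigned long start, unsigned long end,
				      void *private)
	{
		/* allocate dst pages, start copies, stash a fence in private */
	}

	static void my_finalize_and_map(struct vm_area_struct *vma,
					const unsigned long *src_pfns,
					const unsigned long *dst_pfns,
					unsigned long start, unsigned long end,
					void *private)
	{
		/* wait on the fence, then dma_unmap the staging pages */
	}

	static const struct migrate_vma_ops my_migrate_ops = {
		.alloc_and_copy		= my_alloc_and_copy,
		.finalize_and_map	= my_finalize_and_map,
	};

	/* migrate [start, end) of @vma in one call */
	ret = migrate_vma(&my_migrate_ops, vma, start, end,
			  src_pfns, dst_pfns, &my_state);
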
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.h b/drivers/gpu/drm/nouveau/nouveau_dmem.h
new file mode 100644
index 000000000000..9d97d756fb7d
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.h
@@ -0,0 +1,60 @@
+/*
+ * Copyright 2018 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+#ifndef __NOUVEAU_DMEM_H__
+#define __NOUVEAU_DMEM_H__
+#include <nvif/os.h>
+struct drm_device;
+struct drm_file;
+struct nouveau_drm;
+struct hmm_range;
+
+#if IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM)
+void nouveau_dmem_init(struct nouveau_drm *);
+void nouveau_dmem_fini(struct nouveau_drm *);
+void nouveau_dmem_suspend(struct nouveau_drm *);
+void nouveau_dmem_resume(struct nouveau_drm *);
+
+int nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
+			     struct vm_area_struct *vma,
+			     unsigned long start,
+			     unsigned long end);
+
+void nouveau_dmem_convert_pfn(struct nouveau_drm *drm,
+			      struct hmm_range *range);
+#else /* IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM) */
+static inline void nouveau_dmem_init(struct nouveau_drm *drm) {}
+static inline void nouveau_dmem_fini(struct nouveau_drm *drm) {}
+static inline void nouveau_dmem_suspend(struct nouveau_drm *drm) {}
+static inline void nouveau_dmem_resume(struct nouveau_drm *drm) {}
+
+static inline int nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
+					   struct vm_area_struct *vma,
+					   unsigned long start,
+					   unsigned long end)
+{
+	return 0;
+}
+
+static inline void nouveau_dmem_convert_pfn(struct nouveau_drm *drm,
+					    struct hmm_range *range) {}
+#endif /* IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM) */
+#endif
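
The header follows the usual config-stub idiom, so callers never need an
#ifdef of their own: with CONFIG_DRM_NOUVEAU_SVM disabled, every entry point
collapses to an empty static inline the compiler discards.  The pattern in
general form (hypothetical names):

	#if IS_ENABLED(CONFIG_MY_FEATURE)
	void my_feature_init(struct my_dev *);
	#else
	static inline void my_feature_init(struct my_dev *dev) {}
	#endif
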
diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index f900e94592f8..5020265bfbd9 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -44,7 +44,6 @@
 #include <nvif/class.h>
 #include <nvif/cl0002.h>
 #include <nvif/cla06f.h>
-#include <nvif/if0004.h>
 
 #include "nouveau_drv.h"
 #include "nouveau_dma.h"
@@ -63,6 +62,8 @@
 #include "nouveau_usif.h"
 #include "nouveau_connector.h"
 #include "nouveau_platform.h"
+#include "nouveau_svm.h"
+#include "nouveau_dmem.h"
 
 MODULE_PARM_DESC(config, "option string to pass to driver core");
 static char *nouveau_config;
@@ -173,6 +174,7 @@ nouveau_cli_fini(struct nouveau_cli *cli)
 	WARN_ON(!list_empty(&cli->worker));
 
 	usif_client_fini(cli);
+	nouveau_vmm_fini(&cli->svm);
 	nouveau_vmm_fini(&cli->vmm);
 	nvif_mmu_fini(&cli->mmu);
 	nvif_device_fini(&cli->device);
@@ -283,19 +285,134 @@ done:
 }
 
 static void
-nouveau_accel_fini(struct nouveau_drm *drm)
+nouveau_accel_ce_fini(struct nouveau_drm *drm)
+{
+	nouveau_channel_idle(drm->cechan);
+	nvif_object_fini(&drm->ttm.copy);
+	nouveau_channel_del(&drm->cechan);
+}
+
+static void
+nouveau_accel_ce_init(struct nouveau_drm *drm)
+{
+	struct nvif_device *device = &drm->client.device;
+	int ret = 0;
+
+	/* Allocate channel that has access to a (preferably async) copy
+	 * engine, to use for TTM buffer moves.
+	 */
+	if (device->info.family >= NV_DEVICE_INFO_V0_KEPLER) {
+		ret = nouveau_channel_new(drm, device,
+					  nvif_fifo_runlist_ce(device), 0,
+					  true, &drm->cechan);
+	} else
+	if (device->info.chipset >= 0xa3 &&
+	    device->info.chipset != 0xaa &&
+	    device->info.chipset != 0xac) {
+		/* Prior to Kepler, there's only a single runlist, so all
+		 * engines can be accessed from any channel.
+		 *
+		 * We still want to use a separate channel though.
+		 */
+		ret = nouveau_channel_new(drm, device, NvDmaFB, NvDmaTT, false,
+					  &drm->cechan);
+	}
+
+	if (ret)
+		NV_ERROR(drm, "failed to create ce channel, %d\n", ret);
+}
+
+static void
+nouveau_accel_gr_fini(struct nouveau_drm *drm)
 {
 	nouveau_channel_idle(drm->channel);
 	nvif_object_fini(&drm->ntfy);
 	nvkm_gpuobj_del(&drm->notify);
-	nvif_notify_fini(&drm->flip);
 	nvif_object_fini(&drm->nvsw);
 	nouveau_channel_del(&drm->channel);
+}
 
-	nouveau_channel_idle(drm->cechan);
-	nvif_object_fini(&drm->ttm.copy);
-	nouveau_channel_del(&drm->cechan);
+static void
+nouveau_accel_gr_init(struct nouveau_drm *drm)
+{
+	struct nvif_device *device = &drm->client.device;
+	u32 arg0, arg1;
+	int ret;
+
+	/* Allocate channel that has access to the graphics engine. */
+	if (device->info.family >= NV_DEVICE_INFO_V0_KEPLER) {
+		arg0 = nvif_fifo_runlist(device, NV_DEVICE_INFO_ENGINE_GR);
+		arg1 = 1;
+	} else {
+		arg0 = NvDmaFB;
+		arg1 = NvDmaTT;
+	}
 
+	ret = nouveau_channel_new(drm, device, arg0, arg1, false,
+				  &drm->channel);
+	if (ret) {
+		NV_ERROR(drm, "failed to create kernel channel, %d\n", ret);
+		nouveau_accel_gr_fini(drm);
+		return;
+	}
+
+	/* A SW class is used on pre-NV50 HW to assist with handling the
+	 * synchronisation of page flips, as well as to implement fences
+	 * on TNT/TNT2 HW that lacks any kind of support in host.
+	 */
+	if (device->info.family < NV_DEVICE_INFO_V0_TESLA) {
+		ret = nvif_object_init(&drm->channel->user, NVDRM_NVSW,
+				       nouveau_abi16_swclass(drm), NULL, 0,
+				       &drm->nvsw);
+		if (ret == 0) {
+			ret = RING_SPACE(drm->channel, 2);
+			if (ret == 0) {
+				BEGIN_NV04(drm->channel, NvSubSw, 0, 1);
+				OUT_RING  (drm->channel, drm->nvsw.handle);
+			}
+		}
+
+		if (ret) {
+			NV_ERROR(drm, "failed to allocate sw class, %d\n", ret);
+			nouveau_accel_gr_fini(drm);
+			return;
+		}
+	}
+
+	/* NvMemoryToMemoryFormat requires a notifier ctxdma for some reason,
+	 * even if notification is never requested, so, allocate a ctxdma on
+	 * any GPU where it's possible we'll end up using M2MF for BO moves.
+	 */
+	if (device->info.family < NV_DEVICE_INFO_V0_FERMI) {
+		ret = nvkm_gpuobj_new(nvxx_device(device), 32, 0, false, NULL,
+				      &drm->notify);
+		if (ret) {
+			NV_ERROR(drm, "failed to allocate notifier, %d\n", ret);
+			nouveau_accel_gr_fini(drm);
+			return;
+		}
+
+		ret = nvif_object_init(&drm->channel->user, NvNotify0,
+				       NV_DMA_IN_MEMORY,
+				       &(struct nv_dma_v0) {
+						.target = NV_DMA_V0_TARGET_VRAM,
+						.access = NV_DMA_V0_ACCESS_RDWR,
+						.start = drm->notify->addr,
+						.limit = drm->notify->addr + 31
+				       }, sizeof(struct nv_dma_v0),
+				       &drm->ntfy);
+		if (ret) {
+			nouveau_accel_gr_fini(drm);
+			return;
+		}
+	}
+}
+
+static void
+nouveau_accel_fini(struct nouveau_drm *drm)
+{
+	nouveau_accel_ce_fini(drm);
+	nouveau_accel_gr_fini(drm);
 	if (drm->fence)
 		nouveau_fence(drm)->dtor(drm);
 }
@@ -305,23 +422,16 @@ nouveau_accel_init(struct nouveau_drm *drm)
 {
 	struct nvif_device *device = &drm->client.device;
 	struct nvif_sclass *sclass;
-	u32 arg0, arg1;
 	int ret, i, n;
 
 	if (nouveau_noaccel)
 		return;
 
+	/* Initialise global support for channels, and synchronisation. */
 	ret = nouveau_channels_init(drm);
 	if (ret)
 		return;
 
-	if (drm->client.device.info.family >= NV_DEVICE_INFO_V0_VOLTA) {
-		ret = nvif_user_init(device);
-		if (ret)
-			return;
-	}
-
-	/* initialise synchronisation routines */
 	/*XXX: this is crap, but the fence/channel stuff is a little
 	 *     backwards in some places.  this will be fixed.
 	 */
@@ -368,95 +478,18 @@ nouveau_accel_init(struct nouveau_drm *drm)
 		return;
 	}
 
-	if (device->info.family >= NV_DEVICE_INFO_V0_KEPLER) {
-		ret = nouveau_channel_new(drm, &drm->client.device,
-					  nvif_fifo_runlist_ce(device), 0,
-					  true, &drm->cechan);
-		if (ret)
-			NV_ERROR(drm, "failed to create ce channel, %d\n", ret);
-
-		arg0 = nvif_fifo_runlist(device, NV_DEVICE_INFO_ENGINE_GR);
-		arg1 = 1;
-	} else
-	if (device->info.chipset >= 0xa3 &&
-	    device->info.chipset != 0xaa &&
-	    device->info.chipset != 0xac) {
-		ret = nouveau_channel_new(drm, &drm->client.device,
-					  NvDmaFB, NvDmaTT, false,
-					  &drm->cechan);
+	/* Volta requires access to a doorbell register for kickoff. */
+	if (drm->client.device.info.family >= NV_DEVICE_INFO_V0_VOLTA) {
+		ret = nvif_user_init(device);
 		if (ret)
-			NV_ERROR(drm, "failed to create ce channel, %d\n", ret);
-
-		arg0 = NvDmaFB;
-		arg1 = NvDmaTT;
-	} else {
-		arg0 = NvDmaFB;
-		arg1 = NvDmaTT;
-	}
-
-	ret = nouveau_channel_new(drm, &drm->client.device,
-				  arg0, arg1, false, &drm->channel);
-	if (ret) {
-		NV_ERROR(drm, "failed to create kernel channel, %d\n", ret);
-		nouveau_accel_fini(drm);
-		return;
-	}
-
-	if (device->info.family < NV_DEVICE_INFO_V0_TESLA) {
-		ret = nvif_object_init(&drm->channel->user, NVDRM_NVSW,
-				       nouveau_abi16_swclass(drm), NULL, 0,
-				       &drm->nvsw);
-		if (ret == 0) {
-			ret = RING_SPACE(drm->channel, 2);
-			if (ret == 0) {
-				BEGIN_NV04(drm->channel, NvSubSw, 0, 1);
-				OUT_RING  (drm->channel, drm->nvsw.handle);
-			}
-
-			ret = nvif_notify_init(&drm->nvsw,
-					       nouveau_flip_complete,
-					       false, NV04_NVSW_NTFY_UEVENT,
-					       NULL, 0, 0, &drm->flip);
-			if (ret == 0)
-				ret = nvif_notify_get(&drm->flip);
-			if (ret) {
-				nouveau_accel_fini(drm);
-				return;
-			}
-		}
-
-		if (ret) {
-			NV_ERROR(drm, "failed to allocate sw class, %d\n", ret);
-			nouveau_accel_fini(drm);
-			return;
-		}
-	}
-
-	if (device->info.family < NV_DEVICE_INFO_V0_FERMI) {
-		ret = nvkm_gpuobj_new(nvxx_device(&drm->client.device), 32, 0,
-				      false, NULL, &drm->notify);
-		if (ret) {
-			NV_ERROR(drm, "failed to allocate notifier, %d\n", ret);
-			nouveau_accel_fini(drm);
 			return;
-		}
-
-		ret = nvif_object_init(&drm->channel->user, NvNotify0,
-				       NV_DMA_IN_MEMORY,
-				       &(struct nv_dma_v0) {
-						.target = NV_DMA_V0_TARGET_VRAM,
-						.access = NV_DMA_V0_ACCESS_RDWR,
-						.start = drm->notify->addr,
-						.limit = drm->notify->addr + 31
-				       }, sizeof(struct nv_dma_v0),
-				       &drm->ntfy);
-		if (ret) {
-			nouveau_accel_fini(drm);
-			return;
-		}
 	}
 
+	/* Allocate channels we need to support various functions. */
+	nouveau_accel_gr_init(drm);
+	nouveau_accel_ce_init(drm);
 
+	/* Initialise accelerated TTM buffer moves. */
 	nouveau_bo_move_init(drm);
 }
 
@@ -504,19 +537,22 @@ nouveau_drm_device_init(struct drm_device *dev)
 	if (ret)
 		goto fail_bios;
 
+	nouveau_accel_init(drm);
+
 	ret = nouveau_display_create(dev);
 	if (ret)
 		goto fail_dispctor;
 
 	if (dev->mode_config.num_crtc) {
-		ret = nouveau_display_init(dev);
+		ret = nouveau_display_init(dev, false, false);
 		if (ret)
 			goto fail_dispinit;
 	}
 
 	nouveau_debugfs_init(drm);
 	nouveau_hwmon_init(dev);
-	nouveau_accel_init(drm);
+	nouveau_svm_init(drm);
+	nouveau_dmem_init(drm);
 	nouveau_fbcon_init(dev);
 	nouveau_led_init(dev);
 
@@ -534,6 +570,7 @@ nouveau_drm_device_init(struct drm_device *dev)
 fail_dispinit:
 	nouveau_display_destroy(dev);
 fail_dispctor:
+	nouveau_accel_fini(drm);
 	nouveau_bios_takedown(dev);
 fail_bios:
 	nouveau_ttm_fini(drm);
@@ -559,7 +596,8 @@ nouveau_drm_device_fini(struct drm_device *dev)
 
 	nouveau_led_fini(dev);
 	nouveau_fbcon_fini(dev);
-	nouveau_accel_fini(drm);
+	nouveau_dmem_fini(drm);
+	nouveau_svm_fini(drm);
 	nouveau_hwmon_fini(dev);
 	nouveau_debugfs_fini(drm);
 
@@ -567,6 +605,7 @@ nouveau_drm_device_fini(struct drm_device *dev)
 		nouveau_display_fini(dev, false, false);
 	nouveau_display_destroy(dev);
 
+	nouveau_accel_fini(drm);
 	nouveau_bios_takedown(dev);
 
 	nouveau_ttm_fini(drm);
@@ -704,6 +743,8 @@ nouveau_do_suspend(struct drm_device *dev, bool runtime)
 	struct nouveau_drm *drm = nouveau_drm(dev);
 	int ret;
 
+	nouveau_svm_suspend(drm);
+	nouveau_dmem_suspend(drm);
 	nouveau_led_suspend(dev);
 
 	if (dev->mode_config.num_crtc) {
@@ -780,7 +821,8 @@ nouveau_do_resume(struct drm_device *dev, bool runtime)
 	}
 
 	nouveau_led_resume(dev);
-
+	nouveau_dmem_resume(drm);
+	nouveau_svm_resume(drm);
 	return 0;
 }
 
@@ -1000,6 +1042,8 @@ nouveau_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(NOUVEAU_GROBJ_ALLOC, nouveau_abi16_ioctl_grobj_alloc, DRM_AUTH|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(NOUVEAU_NOTIFIEROBJ_ALLOC, nouveau_abi16_ioctl_notifierobj_alloc, DRM_AUTH|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(NOUVEAU_GPUOBJ_FREE, nouveau_abi16_ioctl_gpuobj_free, DRM_AUTH|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(NOUVEAU_SVM_INIT, nouveau_svmm_init, DRM_AUTH|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(NOUVEAU_SVM_BIND, nouveau_svmm_bind, DRM_AUTH|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_NEW, nouveau_gem_ioctl_new, DRM_AUTH|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_PUSHBUF, nouveau_gem_ioctl_pushbuf, DRM_AUTH|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_CPU_PREP, nouveau_gem_ioctl_cpu_prep, DRM_AUTH|DRM_RENDER_ALLOW),
diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h b/drivers/gpu/drm/nouveau/nouveau_drv.h
index d20b9ba4b1c1..da847244479d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -96,6 +96,7 @@ struct nouveau_cli {
 	struct nvif_device device;
 	struct nvif_mmu mmu;
 	struct nouveau_vmm vmm;
+	struct nouveau_vmm svm;
 	const struct nvif_mclass *mem;
 
 	struct list_head head;
@@ -181,7 +182,6 @@ struct nouveau_drm {
 	struct nouveau_fbdev *fbcon;
 	struct nvif_object nvsw;
 	struct nvif_object ntfy;
-	struct nvif_notify flip;
 
 	/* nv10-nv40 tiling regions */
 	struct {
@@ -210,6 +210,10 @@ struct nouveau_drm {
 	bool have_disp_power_ref;
 
 	struct dev_pm_domain vga_pm_domain;
+
+	struct nouveau_svm *svm;
+
+	struct nouveau_dmem *dmem;
 };
 
 static inline struct nouveau_drm *
diff --git a/drivers/gpu/drm/nouveau/nouveau_fbcon.c b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
index 032317c81bf0..0d3cd4e05728 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fbcon.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
@@ -353,7 +353,7 @@ nouveau_fbcon_create(struct drm_fb_helper *helper,
 
 	chan = nouveau_nofbaccel ? NULL : drm->channel;
 	if (chan && device->info.family >= NV_DEVICE_INFO_V0_TESLA) {
-		ret = nouveau_vma_new(nvbo, &drm->client.vmm, &fb->vma);
+		ret = nouveau_vma_new(nvbo, chan->vmm, &fb->vma);
 		if (ret) {
 			NV_ERROR(drm, "failed to map fb into chan: %d\n", ret);
 			chan = NULL;
@@ -374,9 +374,9 @@ nouveau_fbcon_create(struct drm_fb_helper *helper,
 
 	strcpy(info->fix.id, "nouveaufb");
 	if (!chan)
-		info->flags = FBINFO_DEFAULT | FBINFO_HWACCEL_DISABLED;
+		info->flags = FBINFO_HWACCEL_DISABLED;
 	else
-		info->flags = FBINFO_DEFAULT | FBINFO_HWACCEL_COPYAREA |
+		info->flags = FBINFO_HWACCEL_COPYAREA |
 			      FBINFO_HWACCEL_FILLRECT |
 			      FBINFO_HWACCEL_IMAGEBLIT;
 	info->fbops = &nouveau_fbcon_sw_ops;
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h b/drivers/gpu/drm/nouveau/nouveau_fence.h
index b999e6058046..ad27caeca0fd 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.h
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.h
@@ -82,8 +82,6 @@ int nv50_fence_create(struct nouveau_drm *);
 int nv84_fence_create(struct nouveau_drm *);
 int nvc0_fence_create(struct nouveau_drm *);
 
-int nouveau_flip_complete(struct nvif_notify *);
-
 struct nv84_fence_chan {
 	struct nouveau_fence_chan base;
 	struct nouveau_vma *vma;
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index b56524d343c3..b4bda716564d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -41,7 +41,6 @@ nouveau_gem_object_del(struct drm_gem_object *gem)
 {
 	struct nouveau_bo *nvbo = nouveau_gem_object(gem);
 	struct nouveau_drm *drm = nouveau_bdev(nvbo->bo.bdev);
-	struct ttm_buffer_object *bo = &nvbo->bo;
 	struct device *dev = drm->dev->dev;
 	int ret;
 
@@ -56,7 +55,7 @@ nouveau_gem_object_del(struct drm_gem_object *gem)
 
 	/* reset filp so nouveau_bo_del_ttm() can test for it */
 	gem->filp = NULL;
-	ttm_bo_unref(&bo);
+	ttm_bo_put(&nvbo->bo);
 
 	pm_runtime_mark_last_busy(dev);
 	pm_runtime_put_autosuspend(dev);
@@ -69,10 +68,11 @@ nouveau_gem_object_open(struct drm_gem_object *gem, struct drm_file *file_priv)
 	struct nouveau_bo *nvbo = nouveau_gem_object(gem);
 	struct nouveau_drm *drm = nouveau_bdev(nvbo->bo.bdev);
 	struct device *dev = drm->dev->dev;
+	struct nouveau_vmm *vmm = cli->svm.cli ? &cli->svm : &cli->vmm;
 	struct nouveau_vma *vma;
 	int ret;
 
-	if (cli->vmm.vmm.object.oclass < NVIF_CLASS_VMM_NV50)
+	if (vmm->vmm.object.oclass < NVIF_CLASS_VMM_NV50)
 		return 0;
 
 	ret = ttm_bo_reserve(&nvbo->bo, false, false, NULL);
@@ -83,7 +83,7 @@ nouveau_gem_object_open(struct drm_gem_object *gem, struct drm_file *file_priv)
 	if (ret < 0 && ret != -EACCES)
 		goto out;
 
-	ret = nouveau_vma_new(nvbo, &cli->vmm, &vma);
+	ret = nouveau_vma_new(nvbo, vmm, &vma);
 	pm_runtime_mark_last_busy(dev);
 	pm_runtime_put_autosuspend(dev);
 out:
@@ -143,17 +143,18 @@ nouveau_gem_object_close(struct drm_gem_object *gem, struct drm_file *file_priv)
 	struct nouveau_bo *nvbo = nouveau_gem_object(gem);
 	struct nouveau_drm *drm = nouveau_bdev(nvbo->bo.bdev);
 	struct device *dev = drm->dev->dev;
+	struct nouveau_vmm *vmm = cli->svm.cli ? &cli->svm : &cli->vmm;
 	struct nouveau_vma *vma;
 	int ret;
 
-	if (cli->vmm.vmm.object.oclass < NVIF_CLASS_VMM_NV50)
+	if (vmm->vmm.object.oclass < NVIF_CLASS_VMM_NV50)
 		return;
 
 	ret = ttm_bo_reserve(&nvbo->bo, false, false, NULL);
 	if (ret)
 		return;
 
-	vma = nouveau_vma_find(nvbo, &cli->vmm);
+	vma = nouveau_vma_find(nvbo, vmm);
 	if (vma) {
 		if (--vma->refs == 0) {
 			ret = pm_runtime_get_sync(dev);
@@ -220,6 +221,7 @@ nouveau_gem_info(struct drm_file *file_priv, struct drm_gem_object *gem,
 {
 	struct nouveau_cli *cli = nouveau_cli(file_priv);
 	struct nouveau_bo *nvbo = nouveau_gem_object(gem);
+	struct nouveau_vmm *vmm = cli->svm.cli ? &cli->svm : &cli->vmm;
 	struct nouveau_vma *vma;
 
 	if (is_power_of_2(nvbo->valid_domains))
@@ -229,8 +231,8 @@ nouveau_gem_info(struct drm_file *file_priv, struct drm_gem_object *gem,
 	else
 		rep->domain = NOUVEAU_GEM_DOMAIN_VRAM;
 	rep->offset = nvbo->bo.offset;
-	if (cli->vmm.vmm.object.oclass >= NVIF_CLASS_VMM_NV50) {
-		vma = nouveau_vma_find(nvbo, &cli->vmm);
+	if (vmm->vmm.object.oclass >= NVIF_CLASS_VMM_NV50) {
+		vma = nouveau_vma_find(nvbo, vmm);
 		if (!vma)
 			return -EINVAL;
 
@@ -322,7 +324,8 @@ struct validate_op {
 };
 
 static void
-validate_fini_no_ticket(struct validate_op *op, struct nouveau_fence *fence,
+validate_fini_no_ticket(struct validate_op *op, struct nouveau_channel *chan,
+			struct nouveau_fence *fence,
 			struct drm_nouveau_gem_pushbuf_bo *pbbo)
 {
 	struct nouveau_bo *nvbo;
@@ -333,13 +336,11 @@ validate_fini_no_ticket(struct validate_op *op, struct nouveau_fence *fence,
 		b = &pbbo[nvbo->pbbo_index];
 
 		if (likely(fence)) {
-			struct nouveau_drm *drm = nouveau_bdev(nvbo->bo.bdev);
-			struct nouveau_vma *vma;
-
 			nouveau_bo_fence(nvbo, fence, !!b->write_domains);
 
-			if (drm->client.vmm.vmm.object.oclass >= NVIF_CLASS_VMM_NV50) {
-				vma = (void *)(unsigned long)b->user_priv;
+			if (chan->vmm->vmm.object.oclass >= NVIF_CLASS_VMM_NV50) {
+				struct nouveau_vma *vma =
+					(void *)(unsigned long)b->user_priv;
 				nouveau_fence_unref(&vma->fence);
 				dma_fence_get(&fence->base);
 				vma->fence = fence;
@@ -359,10 +360,11 @@ validate_fini_no_ticket(struct validate_op *op, struct nouveau_fence *fence,
 }
 
 static void
-validate_fini(struct validate_op *op, struct nouveau_fence *fence,
+validate_fini(struct validate_op *op, struct nouveau_channel *chan,
+	      struct nouveau_fence *fence,
 	      struct drm_nouveau_gem_pushbuf_bo *pbbo)
 {
-	validate_fini_no_ticket(op, fence, pbbo);
+	validate_fini_no_ticket(op, chan, fence, pbbo);
 	ww_acquire_fini(&op->ticket);
 }
 
@@ -417,7 +419,7 @@ retry:
 			list_splice_tail_init(&vram_list, &op->list);
 			list_splice_tail_init(&gart_list, &op->list);
 			list_splice_tail_init(&both_list, &op->list);
-			validate_fini_no_ticket(op, NULL, NULL);
+			validate_fini_no_ticket(op, chan, NULL, NULL);
 			if (unlikely(ret == -EDEADLK)) {
 				ret = ttm_bo_reserve_slowpath(&nvbo->bo, true,
 							      &op->ticket);
@@ -431,8 +433,8 @@ retry:
 			}
 		}
 
-		if (cli->vmm.vmm.object.oclass >= NVIF_CLASS_VMM_NV50) {
-			struct nouveau_vmm *vmm = &cli->vmm;
+		if (chan->vmm->vmm.object.oclass >= NVIF_CLASS_VMM_NV50) {
+			struct nouveau_vmm *vmm = chan->vmm;
 			struct nouveau_vma *vma = nouveau_vma_find(nvbo, vmm);
 			if (!vma) {
 				NV_PRINTK(err, cli, "vma not found!\n");
@@ -472,7 +474,7 @@ retry:
 	list_splice_tail(&gart_list, &op->list);
 	list_splice_tail(&both_list, &op->list);
 	if (ret)
-		validate_fini(op, NULL, NULL);
+		validate_fini(op, chan, NULL, NULL);
 	return ret;
 
 }
@@ -564,7 +566,7 @@ nouveau_gem_pushbuf_validate(struct nouveau_channel *chan,
 	if (unlikely(ret < 0)) {
 		if (ret != -ERESTARTSYS)
 			NV_PRINTK(err, cli, "validating bo list\n");
-		validate_fini(op, NULL, NULL);
+		validate_fini(op, chan, NULL, NULL);
 		return ret;
 	}
 	*apply_relocs = ret;
@@ -843,7 +845,7 @@ nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data,
 	}
 
 out:
-	validate_fini(&op, fence, bo);
+	validate_fini(&op, chan, fence, bo);
 	nouveau_fence_unref(&fence);
 
 out_prevalid:
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
new file mode 100644
index 000000000000..93ed43c413f0
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -0,0 +1,835 @@
+/*
+ * Copyright 2018 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+#include "nouveau_svm.h"
+#include "nouveau_drv.h"
+#include "nouveau_chan.h"
+#include "nouveau_dmem.h"
+
+#include <nvif/notify.h>
+#include <nvif/object.h>
+#include <nvif/vmm.h>
+
+#include <nvif/class.h>
+#include <nvif/clb069.h>
+#include <nvif/ifc00d.h>
+
+#include <linux/sched/mm.h>
+#include <linux/sort.h>
+#include <linux/hmm.h>
+
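+/* Per-device SVM state: the replayable fault buffer, plus a list linking
+ * channel instance pointers back to their owning SVMM.
+ */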
+struct nouveau_svm {
+	struct nouveau_drm *drm;
+	struct mutex mutex;
+	struct list_head inst;
+
+	struct nouveau_svm_fault_buffer {
+		int id;
+		struct nvif_object object;
+		u32 entries;
+		u32 getaddr;
+		u32 putaddr;
+		u32 get;
+		u32 put;
+		struct nvif_notify notify;
+
+		struct nouveau_svm_fault {
+			u64 inst;
+			u64 addr;
+			u64 time;
+			u32 engine;
+			u8  gpc;
+			u8  hub;
+			u8  access;
+			u8  client;
+			u8  fault;
+			struct nouveau_svmm *svmm;
+		} **fault;
+		int fault_nr;
+	} buffer[1];
+};
+
+#define SVM_DBG(s,f,a...) NV_DEBUG((s)->drm, "svm: "f"\n", ##a)
+#define SVM_ERR(s,f,a...) NV_WARN((s)->drm, "svm: "f"\n", ##a)
+
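+/* Associates a channel instance pointer with its SVMM, so that faults,
+ * which report only the instance, can be routed to the right mirror.
+ */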
+struct nouveau_ivmm {
+	struct nouveau_svmm *svmm;
+	u64 inst;
+	struct list_head head;
+};
+
+static struct nouveau_ivmm *
+nouveau_ivmm_find(struct nouveau_svm *svm, u64 inst)
+{
+	struct nouveau_ivmm *ivmm;
+	list_for_each_entry(ivmm, &svm->inst, head) {
+		if (ivmm->inst == inst)
+			return ivmm;
+	}
+	return NULL;
+}
+
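+/* Per-client SVM address-space, mirroring the CPU mm via HMM.  The
+ * "unmanaged" window of GPU VA is excluded from SVM handling and left
+ * under normal driver control.
+ */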
+struct nouveau_svmm {
+	struct nouveau_vmm *vmm;
+	struct {
+		unsigned long start;
+		unsigned long limit;
+	} unmanaged;
+
+	struct mutex mutex;
+
+	struct mm_struct *mm;
+	struct hmm_mirror mirror;
+};
+
+#define SVMM_DBG(s,f,a...)                                                     \
+	NV_DEBUG((s)->vmm->cli->drm, "svm-%p: "f"\n", (s), ##a)
+#define SVMM_ERR(s,f,a...)                                                     \
+	NV_WARN((s)->vmm->cli->drm, "svm-%p: "f"\n", (s), ##a)
+
+int
+nouveau_svmm_bind(struct drm_device *dev, void *data,
+		  struct drm_file *file_priv)
+{
+	struct nouveau_cli *cli = nouveau_cli(file_priv);
+	struct drm_nouveau_svm_bind *args = data;
+	unsigned target, cmd, priority;
+	unsigned long addr, end, size;
+	struct mm_struct *mm;
+
+	args->va_start &= PAGE_MASK;
+	args->va_end &= PAGE_MASK;
+
+	/* Sanity check arguments */
+	if (args->reserved0 || args->reserved1)
+		return -EINVAL;
+	if (args->header & (~NOUVEAU_SVM_BIND_VALID_MASK))
+		return -EINVAL;
+	if (args->va_start >= args->va_end)
+		return -EINVAL;
+	if (!args->npages)
+		return -EINVAL;
+
+	cmd = args->header >> NOUVEAU_SVM_BIND_COMMAND_SHIFT;
+	cmd &= NOUVEAU_SVM_BIND_COMMAND_MASK;
+	switch (cmd) {
+	case NOUVEAU_SVM_BIND_COMMAND__MIGRATE:
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	priority = args->header >> NOUVEAU_SVM_BIND_PRIORITY_SHIFT;
+	priority &= NOUVEAU_SVM_BIND_PRIORITY_MASK;
+
+	/* FIXME: support CPU targets, ie. all target values < GPU_VRAM. */
+	target = args->header >> NOUVEAU_SVM_BIND_TARGET_SHIFT;
+	target &= NOUVEAU_SVM_BIND_TARGET_MASK;
+	switch (target) {
+	case NOUVEAU_SVM_BIND_TARGET__GPU_VRAM:
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	/*
+	 * FIXME: For now, refuse a non-zero stride.  The migrate kernel
+	 * function needs to be changed to handle strides, to avoid creating
+	 * a mess within each device driver.
+	 */
+	if (args->stride)
+		return -EINVAL;
+
+	size = ((unsigned long)args->npages) << PAGE_SHIFT;
+	if ((args->va_start + size) <= args->va_start)
+		return -EINVAL;
+	if ((args->va_start + size) > args->va_end)
+		return -EINVAL;
+
+	/*
+	 * We have been asked to do something sane.  For now we only support
+	 * migrate commands, but things like memory policy (what to do on
+	 * page fault) and other commands may be added later.
+	 */
+
+	mm = get_task_mm(current);
+	down_read(&mm->mmap_sem);
+
+	for (addr = args->va_start, end = args->va_start + size; addr < end;) {
+		struct vm_area_struct *vma;
+		unsigned long next;
+
+		vma = find_vma_intersection(mm, addr, end);
+		if (!vma)
+			break;
+
+		next = min(vma->vm_end, end);
+		/* This is a best effort so we ignore errors */
+		nouveau_dmem_migrate_vma(cli->drm, vma, addr, next);
+		addr = next;
+	}
+
+	/*
+	 * FIXME: Return the number of pages we have migrated.  Again, the
+	 * migrate API needs updating to return that information so that we
+	 * can report it to user space.
+	 */
+	args->result = 0;
+
+	up_read(&mm->mmap_sem);
+	mmput(mm);
+
+	return 0;
+}
+
+/* Unlink channel instance from SVMM. */
+void
+nouveau_svmm_part(struct nouveau_svmm *svmm, u64 inst)
+{
+	struct nouveau_ivmm *ivmm;
+	if (svmm) {
+		mutex_lock(&svmm->vmm->cli->drm->svm->mutex);
+		ivmm = nouveau_ivmm_find(svmm->vmm->cli->drm->svm, inst);
+		if (ivmm) {
+			list_del(&ivmm->head);
+			kfree(ivmm);
+		}
+		mutex_unlock(&svmm->vmm->cli->drm->svm->mutex);
+	}
+}
+
+/* Link channel instance to SVMM. */
+int
+nouveau_svmm_join(struct nouveau_svmm *svmm, u64 inst)
+{
+	struct nouveau_ivmm *ivmm;
+	if (svmm) {
+		if (!(ivmm = kmalloc(sizeof(*ivmm), GFP_KERNEL)))
+			return -ENOMEM;
+		ivmm->svmm = svmm;
+		ivmm->inst = inst;
+
+		mutex_lock(&svmm->vmm->cli->drm->svm->mutex);
+		list_add(&ivmm->head, &svmm->vmm->cli->drm->svm->inst);
+		mutex_unlock(&svmm->vmm->cli->drm->svm->mutex);
+	}
+	return 0;
+}
+
+/* Invalidate SVMM address-range on GPU. */
+static void
+nouveau_svmm_invalidate(struct nouveau_svmm *svmm, u64 start, u64 limit)
+{
+	if (limit > start) {
+		bool super = svmm->vmm->vmm.object.client->super;
+		svmm->vmm->vmm.object.client->super = true;
+		nvif_object_mthd(&svmm->vmm->vmm.object, NVIF_VMM_V0_PFNCLR,
+				 &(struct nvif_vmm_pfnclr_v0) {
+					.addr = start,
+					.size = limit - start,
+				 }, sizeof(struct nvif_vmm_pfnclr_v0));
+		svmm->vmm->vmm.object.client->super = super;
+	}
+}
+
+static int
+nouveau_svmm_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
+					const struct hmm_update *update)
+{
+	struct nouveau_svmm *svmm = container_of(mirror, typeof(*svmm), mirror);
+	unsigned long start = update->start;
+	unsigned long limit = update->end;
+
+	if (!update->blockable)
+		return -EAGAIN;
+
+	SVMM_DBG(svmm, "invalidate %016lx-%016lx", start, limit);
+
+	mutex_lock(&svmm->mutex);
+	if (limit > svmm->unmanaged.start && start < svmm->unmanaged.limit) {
+		if (start < svmm->unmanaged.start) {
+			nouveau_svmm_invalidate(svmm, start,
+						svmm->unmanaged.limit);
+		}
+		start = svmm->unmanaged.limit;
+	}
+
+	nouveau_svmm_invalidate(svmm, start, limit);
+	mutex_unlock(&svmm->mutex);
+	return 0;
+}
+
+static void
+nouveau_svmm_release(struct hmm_mirror *mirror)
+{
+}
+
+static const struct hmm_mirror_ops
+nouveau_svmm = {
+	.sync_cpu_device_pagetables = nouveau_svmm_sync_cpu_device_pagetables,
+	.release = nouveau_svmm_release,
+};
+
+void
+nouveau_svmm_fini(struct nouveau_svmm **psvmm)
+{
+	struct nouveau_svmm *svmm = *psvmm;
+	if (svmm) {
+		hmm_mirror_unregister(&svmm->mirror);
+		kfree(*psvmm);
+		*psvmm = NULL;
+	}
+}
+
+int
+nouveau_svmm_init(struct drm_device *dev, void *data,
+		  struct drm_file *file_priv)
+{
+	struct nouveau_cli *cli = nouveau_cli(file_priv);
+	struct nouveau_svmm *svmm;
+	struct drm_nouveau_svm_init *args = data;
+	int ret;
+
+	/* Allocate tracking for SVM-enabled VMM. */
+	if (!(svmm = kzalloc(sizeof(*svmm), GFP_KERNEL)))
+		return -ENOMEM;
+	svmm->vmm = &cli->svm;
+	svmm->unmanaged.start = args->unmanaged_addr;
+	svmm->unmanaged.limit = args->unmanaged_addr + args->unmanaged_size;
+	mutex_init(&svmm->mutex);
+
+	/* Check that SVM isn't already enabled for the client. */
+	mutex_lock(&cli->mutex);
+	if (cli->svm.cli) {
+		ret = -EBUSY;
+		goto done;
+	}
+
+	/* Allocate a new GPU VMM that can support SVM (managed by the
+	 * client, with replayable faults enabled).
+	 *
+	 * All future channel/memory allocations will make use of this
+	 * VMM instead of the standard one.
+	 */
+	ret = nvif_vmm_init(&cli->mmu, cli->vmm.vmm.object.oclass, true,
+			    args->unmanaged_addr, args->unmanaged_size,
+			    &(struct gp100_vmm_v0) {
+				.fault_replay = true,
+			    }, sizeof(struct gp100_vmm_v0), &cli->svm.vmm);
+	if (ret)
+		goto done;
+
+	/* Enable HMM mirroring of CPU address-space to VMM. */
+	svmm->mm = get_task_mm(current);
+	down_write(&svmm->mm->mmap_sem);
+	svmm->mirror.ops = &nouveau_svmm;
+	ret = hmm_mirror_register(&svmm->mirror, svmm->mm);
+	if (ret == 0) {
+		cli->svm.svmm = svmm;
+		cli->svm.cli = cli;
+	}
+	up_write(&svmm->mm->mmap_sem);
+	mmput(svmm->mm);
+
+done:
+	if (ret)
+		nouveau_svmm_fini(&svmm);
+	mutex_unlock(&cli->mutex);
+	return ret;
+}
+
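+/* Translate HMM pfn flags/values to their NVIF_VMM_PFNMAP equivalents;
+ * hmm_vma_fault() fills args.phys[] below according to these tables.
+ */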
+static const u64
+nouveau_svm_pfn_flags[HMM_PFN_FLAG_MAX] = {
+	[HMM_PFN_VALID         ] = NVIF_VMM_PFNMAP_V0_V,
+	[HMM_PFN_WRITE         ] = NVIF_VMM_PFNMAP_V0_W,
+	[HMM_PFN_DEVICE_PRIVATE] = NVIF_VMM_PFNMAP_V0_VRAM,
+};
+
+static const u64
+nouveau_svm_pfn_values[HMM_PFN_VALUE_MAX] = {
+	[HMM_PFN_ERROR  ] = ~NVIF_VMM_PFNMAP_V0_V,
+	[HMM_PFN_NONE   ] =  NVIF_VMM_PFNMAP_V0_NONE,
+	[HMM_PFN_SPECIAL] = ~NVIF_VMM_PFNMAP_V0_V,
+};
+
+/* Issue fault replay for GPU to retry accesses that faulted previously. */
+static void
+nouveau_svm_fault_replay(struct nouveau_svm *svm)
+{
+	SVM_DBG(svm, "replay");
+	WARN_ON(nvif_object_mthd(&svm->drm->client.vmm.vmm.object,
+				 GP100_VMM_VN_FAULT_REPLAY,
+				 &(struct gp100_vmm_fault_replay_vn) {},
+				 sizeof(struct gp100_vmm_fault_replay_vn)));
+}
+
+/* Cancel a replayable fault that could not be handled.
+ *
+ * Cancelling the fault will trigger recovery to reset the engine
+ * and kill the offending channel (ie. GPU SIGSEGV).
+ */
+static void
+nouveau_svm_fault_cancel(struct nouveau_svm *svm,
+			 u64 inst, u8 hub, u8 gpc, u8 client)
+{
+	SVM_DBG(svm, "cancel %016llx %d %02x %02x", inst, hub, gpc, client);
+	WARN_ON(nvif_object_mthd(&svm->drm->client.vmm.vmm.object,
+				 GP100_VMM_VN_FAULT_CANCEL,
+				 &(struct gp100_vmm_fault_cancel_v0) {
+					.hub = hub,
+					.gpc = gpc,
+					.client = client,
+					.inst = inst,
+				 }, sizeof(struct gp100_vmm_fault_cancel_v0)));
+}
+
+static void
+nouveau_svm_fault_cancel_fault(struct nouveau_svm *svm,
+			       struct nouveau_svm_fault *fault)
+{
+	nouveau_svm_fault_cancel(svm, fault->inst,
+				      fault->hub,
+				      fault->gpc,
+				      fault->client);
+}
+
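+/* Sort faults by instance pointer, then address, then access type, with
+ * writes/atomics ordered ahead of reads/prefetches so that duplicates
+ * at the same address need only their first (strictest) entry handled.
+ */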
+static int
+nouveau_svm_fault_cmp(const void *a, const void *b)
+{
+	const struct nouveau_svm_fault *fa = *(struct nouveau_svm_fault **)a;
+	const struct nouveau_svm_fault *fb = *(struct nouveau_svm_fault **)b;
+	if (fa->inst != fb->inst)
+		return fa->inst < fb->inst ? -1 : 1;
+	if (fa->addr != fb->addr)
+		return fa->addr < fb->addr ? -1 : 1;
+	/*XXX: atomic? */
+	return (fa->access == 0 || fa->access == 3) -
+	       (fb->access == 0 || fb->access == 3);
+}
+
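+/* Decode a single 32-byte fault buffer entry into the software cache,
+ * acknowledging it to HW by clearing the entry's valid bit.
+ */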
+static void
+nouveau_svm_fault_cache(struct nouveau_svm *svm,
+			struct nouveau_svm_fault_buffer *buffer, u32 offset)
+{
+	struct nvif_object *memory = &buffer->object;
+	const u32 instlo = nvif_rd32(memory, offset + 0x00);
+	const u32 insthi = nvif_rd32(memory, offset + 0x04);
+	const u32 addrlo = nvif_rd32(memory, offset + 0x08);
+	const u32 addrhi = nvif_rd32(memory, offset + 0x0c);
+	const u32 timelo = nvif_rd32(memory, offset + 0x10);
+	const u32 timehi = nvif_rd32(memory, offset + 0x14);
+	const u32 engine = nvif_rd32(memory, offset + 0x18);
+	const u32   info = nvif_rd32(memory, offset + 0x1c);
+	const u64   inst = (u64)insthi << 32 | instlo;
+	const u8     gpc = (info & 0x1f000000) >> 24;
+	const u8     hub = (info & 0x00100000) >> 20;
+	const u8  client = (info & 0x00007f00) >> 8;
+	struct nouveau_svm_fault *fault;
+
+	/*XXX: I think we're supposed to spin waiting */
+	if (WARN_ON(!(info & 0x80000000)))
+		return;
+
+	nvif_mask(memory, offset + 0x1c, 0x80000000, 0x00000000);
+
+	if (!buffer->fault[buffer->fault_nr]) {
+		fault = kmalloc(sizeof(*fault), GFP_KERNEL);
+		if (WARN_ON(!fault)) {
+			nouveau_svm_fault_cancel(svm, inst, hub, gpc, client);
+			return;
+		}
+		buffer->fault[buffer->fault_nr] = fault;
+	}
+
+	fault = buffer->fault[buffer->fault_nr++];
+	fault->inst   = inst;
+	fault->addr   = (u64)addrhi << 32 | addrlo;
+	fault->time   = (u64)timehi << 32 | timelo;
+	fault->engine = engine;
+	fault->gpc    = gpc;
+	fault->hub    = hub;
+	fault->access = (info & 0x000f0000) >> 16;
+	fault->client = client;
+	fault->fault  = (info & 0x0000001f);
+
+	SVM_DBG(svm, "fault %016llx %016llx %02x",
+		fault->inst, fault->addr, fault->access);
+}
+
+static int
+nouveau_svm_fault(struct nvif_notify *notify)
+{
+	struct nouveau_svm_fault_buffer *buffer =
+		container_of(notify, typeof(*buffer), notify);
+	struct nouveau_svm *svm =
+		container_of(buffer, typeof(*svm), buffer[buffer->id]);
+	struct nvif_object *device = &svm->drm->client.device.object;
+	struct nouveau_svmm *svmm;
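+	/* Staging for a single NVIF_VMM_V0_PFNMAP method call; phys[] caps
+	 * each update window at 16 pages.
+	 */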
+	struct {
+		struct {
+			struct nvif_ioctl_v0 i;
+			struct nvif_ioctl_mthd_v0 m;
+			struct nvif_vmm_pfnmap_v0 p;
+		} i;
+		u64 phys[16];
+	} args;
+	struct hmm_range range;
+	struct vm_area_struct *vma;
+	u64 inst, start, limit;
+	int fi, fn, pi, fill;
+	int replay = 0, ret;
+
+	/* Parse available fault buffer entries into a cache, and update
+	 * the GET pointer so HW can reuse the entries.
+	 */
+	SVM_DBG(svm, "fault handler");
+	if (buffer->get == buffer->put) {
+		buffer->put = nvif_rd32(device, buffer->putaddr);
+		buffer->get = nvif_rd32(device, buffer->getaddr);
+		if (buffer->get == buffer->put)
+			return NVIF_NOTIFY_KEEP;
+	}
+	buffer->fault_nr = 0;
+
+	SVM_DBG(svm, "get %08x put %08x", buffer->get, buffer->put);
+	while (buffer->get != buffer->put) {
+		nouveau_svm_fault_cache(svm, buffer, buffer->get * 0x20);
+		if (++buffer->get == buffer->entries)
+			buffer->get = 0;
+	}
+	nvif_wr32(device, buffer->getaddr, buffer->get);
+	SVM_DBG(svm, "%d fault(s) pending", buffer->fault_nr);
+
+	/* Sort parsed faults by instance pointer to prevent unnecessary
+	 * instance to SVMM translations, followed by address and access
+	 * type to reduce the amount of work when handling the faults.
+	 */
+	sort(buffer->fault, buffer->fault_nr, sizeof(*buffer->fault),
+	     nouveau_svm_fault_cmp, NULL);
+
+	/* Lookup SVMM structure for each unique instance pointer. */
+	mutex_lock(&svm->mutex);
+	for (fi = 0, svmm = NULL; fi < buffer->fault_nr; fi++) {
+		if (!svmm || buffer->fault[fi]->inst != inst) {
+			struct nouveau_ivmm *ivmm =
+				nouveau_ivmm_find(svm, buffer->fault[fi]->inst);
+			svmm = ivmm ? ivmm->svmm : NULL;
+			inst = buffer->fault[fi]->inst;
+			SVM_DBG(svm, "inst %016llx -> svm-%p", inst, svmm);
+		}
+		buffer->fault[fi]->svmm = svmm;
+	}
+	mutex_unlock(&svm->mutex);
+
+	/* Process list of faults. */
+	args.i.i.version = 0;
+	args.i.i.type = NVIF_IOCTL_V0_MTHD;
+	args.i.m.version = 0;
+	args.i.m.method = NVIF_VMM_V0_PFNMAP;
+	args.i.p.version = 0;
+
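+	/* fn marks the end of each fault window; the inner loop below
+	 * advances it past every fault folded into a single PFNMAP update.
+	 */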
+	for (fi = 0; fn = fi + 1, fi < buffer->fault_nr; fi = fn) {
+		/* Cancel any faults from non-SVM channels. */
+		if (!(svmm = buffer->fault[fi]->svmm)) {
+			nouveau_svm_fault_cancel_fault(svm, buffer->fault[fi]);
+			continue;
+		}
+		SVMM_DBG(svmm, "addr %016llx", buffer->fault[fi]->addr);
+
+		/* We try and group handling of faults within a small
+		 * window into a single update.
+		 */
+		start = buffer->fault[fi]->addr;
+		limit = start + (ARRAY_SIZE(args.phys) << PAGE_SHIFT);
+		if (start < svmm->unmanaged.limit)
+			limit = min_t(u64, limit, svmm->unmanaged.start);
+		else
+		if (limit > svmm->unmanaged.start)
+			start = max_t(u64, start, svmm->unmanaged.limit);
+		SVMM_DBG(svmm, "wndw %016llx-%016llx", start, limit);
+
+		/* Intersect fault window with the CPU VMA, cancelling
+		 * the fault if the address is invalid.
+		 */
+		down_read(&svmm->mm->mmap_sem);
+		vma = find_vma_intersection(svmm->mm, start, limit);
+		if (!vma) {
+			SVMM_ERR(svmm, "wndw %016llx-%016llx", start, limit);
+			up_read(&svmm->mm->mmap_sem);
+			nouveau_svm_fault_cancel_fault(svm, buffer->fault[fi]);
+			continue;
+		}
+		start = max_t(u64, start, vma->vm_start);
+		limit = min_t(u64, limit, vma->vm_end);
+		SVMM_DBG(svmm, "wndw %016llx-%016llx", start, limit);
+
+		if (buffer->fault[fi]->addr != start) {
+			SVMM_ERR(svmm, "addr %016llx", buffer->fault[fi]->addr);
+			up_read(&svmm->mm->mmap_sem);
+			nouveau_svm_fault_cancel_fault(svm, buffer->fault[fi]);
+			continue;
+		}
+
+		/* Prepare the GPU-side update of all pages within the
+		 * fault window, determining required pages and access
+		 * permissions based on pending faults.
+		 */
+		args.i.p.page = PAGE_SHIFT;
+		args.i.p.addr = start;
+		for (fn = fi, pi = 0;;) {
+			/* Determine required permissions based on GPU fault
+			 * access flags.
+			 *XXX: atomic?
+			 */
+			if (buffer->fault[fn]->access != 0 /* READ. */ &&
+			    buffer->fault[fn]->access != 3 /* PREFETCH. */) {
+				args.phys[pi++] = NVIF_VMM_PFNMAP_V0_V |
+						  NVIF_VMM_PFNMAP_V0_W;
+			} else {
+				args.phys[pi++] = NVIF_VMM_PFNMAP_V0_V;
+			}
+			args.i.p.size = pi << PAGE_SHIFT;
+
+			/* It's okay to skip over duplicate addresses from the
+			 * same SVMM as faults are ordered by access type such
+			 * that only the first one needs to be handled.
+			 *
+			 * ie. WRITE faults appear first, thus any handling of
+			 * pending READ faults will already be satisfied.
+			 */
+			while (++fn < buffer->fault_nr &&
+			       buffer->fault[fn]->svmm == svmm &&
+			       buffer->fault[fn    ]->addr ==
+			       buffer->fault[fn - 1]->addr);
+
+			/* If the next fault is outside the window, or all GPU
+			 * faults have been dealt with, we're done here.
+			 */
+			if (fn >= buffer->fault_nr ||
+			    buffer->fault[fn]->svmm != svmm ||
+			    buffer->fault[fn]->addr >= limit)
+				break;
+
+			/* Fill in the gap between this fault and the next. */
+			fill = (buffer->fault[fn    ]->addr -
+				buffer->fault[fn - 1]->addr) >> PAGE_SHIFT;
+			while (--fill)
+				args.phys[pi++] = NVIF_VMM_PFNMAP_V0_NONE;
+		}
+
+		SVMM_DBG(svmm, "wndw %016llx-%016llx covering %d fault(s)",
+			 args.i.p.addr,
+			 args.i.p.addr + args.i.p.size, fn - fi);
+
+		/* Have HMM fault pages within the fault window to the GPU. */
+		range.vma = vma;
+		range.start = args.i.p.addr;
+		range.end = args.i.p.addr + args.i.p.size;
+		range.pfns = args.phys;
+		range.flags = nouveau_svm_pfn_flags;
+		range.values = nouveau_svm_pfn_values;
+		range.pfn_shift = NVIF_VMM_PFNMAP_V0_ADDR_SHIFT;
+again:
+		ret = hmm_vma_fault(&range, true);
+		if (ret == 0) {
+			mutex_lock(&svmm->mutex);
+			if (!hmm_vma_range_done(&range)) {
+				mutex_unlock(&svmm->mutex);
+				goto again;
+			}
+
+			nouveau_dmem_convert_pfn(svm->drm, &range);
+
+			svmm->vmm->vmm.object.client->super = true;
+			ret = nvif_object_ioctl(&svmm->vmm->vmm.object,
+						&args, sizeof(args.i) +
+						pi * sizeof(args.phys[0]),
+						NULL);
+			svmm->vmm->vmm.object.client->super = false;
+			mutex_unlock(&svmm->mutex);
+		}
+		up_read(&svmm->mm->mmap_sem);
+
+		/* Cancel any faults in the window whose pages didn't manage
+		 * to keep their valid bit, or stay writeable when required.
+		 *
+		 * If handling failed completely, cancel all faults.
+		 */
+		while (fi < fn) {
+			struct nouveau_svm_fault *fault = buffer->fault[fi++];
+			pi = (fault->addr - range.start) >> PAGE_SHIFT;
+			if (ret ||
+			    !(range.pfns[pi] & NVIF_VMM_PFNMAP_V0_V) ||
+			    (!(range.pfns[pi] & NVIF_VMM_PFNMAP_V0_W) &&
+			     fault->access != 0 && fault->access != 3)) {
+				nouveau_svm_fault_cancel_fault(svm, fault);
+				continue;
+			}
+			replay++;
+		}
+	}
+
+	/* Issue fault replay to the GPU. */
+	if (replay)
+		nouveau_svm_fault_replay(svm);
+	return NVIF_NOTIFY_KEEP;
+}
+
+static void
+nouveau_svm_fault_buffer_fini(struct nouveau_svm *svm, int id)
+{
+	struct nouveau_svm_fault_buffer *buffer = &svm->buffer[id];
+	nvif_notify_put(&buffer->notify);
+}
+
+static int
+nouveau_svm_fault_buffer_init(struct nouveau_svm *svm, int id)
+{
+	struct nouveau_svm_fault_buffer *buffer = &svm->buffer[id];
+	struct nvif_object *device = &svm->drm->client.device.object;
+	buffer->get = nvif_rd32(device, buffer->getaddr);
+	buffer->put = nvif_rd32(device, buffer->putaddr);
+	SVM_DBG(svm, "get %08x put %08x (init)", buffer->get, buffer->put);
+	return nvif_notify_get(&buffer->notify);
+}
+
+static void
+nouveau_svm_fault_buffer_dtor(struct nouveau_svm *svm, int id)
+{
+	struct nouveau_svm_fault_buffer *buffer = &svm->buffer[id];
+	int i;
+
+	if (buffer->fault) {
+		for (i = 0; i < buffer->entries && buffer->fault[i]; i++)
+			kfree(buffer->fault[i]);
+		kvfree(buffer->fault);
+	}
+
+	nouveau_svm_fault_buffer_fini(svm, id);
+
+	nvif_notify_fini(&buffer->notify);
+	nvif_object_fini(&buffer->object);
+}
+
+static int
+nouveau_svm_fault_buffer_ctor(struct nouveau_svm *svm, s32 oclass, int id)
+{
+	struct nouveau_svm_fault_buffer *buffer = &svm->buffer[id];
+	struct nouveau_drm *drm = svm->drm;
+	struct nvif_object *device = &drm->client.device.object;
+	struct nvif_clb069_v0 args = {};
+	int ret;
+
+	buffer->id = id;
+
+	ret = nvif_object_init(device, 0, oclass, &args, sizeof(args),
+			       &buffer->object);
+	if (ret < 0) {
+		SVM_ERR(svm, "Fault buffer allocation failed: %d", ret);
+		return ret;
+	}
+
+	nvif_object_map(&buffer->object, NULL, 0);
+	buffer->entries = args.entries;
+	buffer->getaddr = args.get;
+	buffer->putaddr = args.put;
+
+	ret = nvif_notify_init(&buffer->object, nouveau_svm_fault, true,
+			       NVB069_V0_NTFY_FAULT, NULL, 0, 0,
+			       &buffer->notify);
+	if (ret)
+		return ret;
+
+	buffer->fault = kvzalloc(sizeof(*buffer->fault) * buffer->entries,
+				 GFP_KERNEL);
+	if (!buffer->fault)
+		return -ENOMEM;
+
+	return nouveau_svm_fault_buffer_init(svm, id);
+}
+
+void
+nouveau_svm_resume(struct nouveau_drm *drm)
+{
+	struct nouveau_svm *svm = drm->svm;
+	if (svm)
+		nouveau_svm_fault_buffer_init(svm, 0);
+}
+
+void
+nouveau_svm_suspend(struct nouveau_drm *drm)
+{
+	struct nouveau_svm *svm = drm->svm;
+	if (svm)
+		nouveau_svm_fault_buffer_fini(svm, 0);
+}
+
+void
+nouveau_svm_fini(struct nouveau_drm *drm)
+{
+	struct nouveau_svm *svm = drm->svm;
+	if (svm) {
+		nouveau_svm_fault_buffer_dtor(svm, 0);
+		kfree(drm->svm);
+		drm->svm = NULL;
+	}
+}
+
+void
+nouveau_svm_init(struct nouveau_drm *drm)
+{
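+	/* Fault buffer classes to probe; nvif_mclass() selects the first
+	 * one supported by the device, newest first.
+	 */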
+	static const struct nvif_mclass buffers[] = {
+		{   VOLTA_FAULT_BUFFER_A, 0 },
+		{ MAXWELL_FAULT_BUFFER_A, 0 },
+		{}
+	};
+	struct nouveau_svm *svm;
+	int ret;
+
+	/* Disable on Volta and newer until channel recovery is fixed,
+	 * otherwise clients will have a trivial way to trash the GPU
+	 * for everyone.
+	 */
+	if (drm->client.device.info.family > NV_DEVICE_INFO_V0_PASCAL)
+		return;
+
+	if (!(drm->svm = svm = kzalloc(sizeof(*drm->svm), GFP_KERNEL)))
+		return;
+
+	drm->svm->drm = drm;
+	mutex_init(&drm->svm->mutex);
+	INIT_LIST_HEAD(&drm->svm->inst);
+
+	ret = nvif_mclass(&drm->client.device.object, buffers);
+	if (ret < 0) {
+		SVM_DBG(svm, "No supported fault buffer class");
+		nouveau_svm_fini(drm);
+		return;
+	}
+
+	ret = nouveau_svm_fault_buffer_ctor(svm, buffers[ret].oclass, 0);
+	if (ret) {
+		nouveau_svm_fini(drm);
+		return;
+	}
+
+	SVM_DBG(svm, "Initialised");
+}
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.h b/drivers/gpu/drm/nouveau/nouveau_svm.h
new file mode 100644
index 000000000000..e839d8189461
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.h
@@ -0,0 +1,48 @@
+#ifndef __NOUVEAU_SVM_H__
+#define __NOUVEAU_SVM_H__
+#include <nvif/os.h>
+struct drm_device;
+struct drm_file;
+struct nouveau_drm;
+
+struct nouveau_svmm;
+
+#if IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM)
+void nouveau_svm_init(struct nouveau_drm *);
+void nouveau_svm_fini(struct nouveau_drm *);
+void nouveau_svm_suspend(struct nouveau_drm *);
+void nouveau_svm_resume(struct nouveau_drm *);
+
+int nouveau_svmm_init(struct drm_device *, void *, struct drm_file *);
+void nouveau_svmm_fini(struct nouveau_svmm **);
+int nouveau_svmm_join(struct nouveau_svmm *, u64 inst);
+void nouveau_svmm_part(struct nouveau_svmm *, u64 inst);
+int nouveau_svmm_bind(struct drm_device *, void *, struct drm_file *);
+#else /* IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM) */
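+/* Inline stubs for kernels built without CONFIG_DRM_NOUVEAU_SVM; the SVM
+ * ioctls fail with -ENOSYS, everything else is a no-op.
+ */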
+static inline void nouveau_svm_init(struct nouveau_drm *drm) {}
+static inline void nouveau_svm_fini(struct nouveau_drm *drm) {}
+static inline void nouveau_svm_suspend(struct nouveau_drm *drm) {}
+static inline void nouveau_svm_resume(struct nouveau_drm *drm) {}
+
+static inline int nouveau_svmm_init(struct drm_device *device, void *p,
+				    struct drm_file *file)
+{
+	return -ENOSYS;
+}
+
+static inline void nouveau_svmm_fini(struct nouveau_svmm **svmmp) {}
+
+static inline int nouveau_svmm_join(struct nouveau_svmm *svmm, u64 inst)
+{
+	return 0;
+}
+
+static inline void nouveau_svmm_part(struct nouveau_svmm *svmm, u64 inst) {}
+
+static inline int nouveau_svmm_bind(struct drm_device *device, void *p,
+				    struct drm_file *file)
+{
+	return -ENOSYS;
+}
+#endif /* IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM) */
+#endif
diff --git a/drivers/gpu/drm/nouveau/nouveau_vmm.c b/drivers/gpu/drm/nouveau/nouveau_vmm.c
index 2032c3e4f6e5..77061182a1cf 100644
--- a/drivers/gpu/drm/nouveau/nouveau_vmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_vmm.c
@@ -22,6 +22,7 @@
 #include "nouveau_vmm.h"
 #include "nouveau_drv.h"
 #include "nouveau_bo.h"
+#include "nouveau_svm.h"
 #include "nouveau_mem.h"
 
 void
@@ -119,6 +120,7 @@ done:
 void
 nouveau_vmm_fini(struct nouveau_vmm *vmm)
 {
+	nouveau_svmm_fini(&vmm->svmm);
 	nvif_vmm_fini(&vmm->vmm);
 	vmm->cli = NULL;
 }
@@ -126,7 +128,7 @@ nouveau_vmm_fini(struct nouveau_vmm *vmm)
 int
 nouveau_vmm_init(struct nouveau_cli *cli, s32 oclass, struct nouveau_vmm *vmm)
 {
-	int ret = nvif_vmm_init(&cli->mmu, oclass, PAGE_SIZE, 0, NULL, 0,
+	int ret = nvif_vmm_init(&cli->mmu, oclass, false, PAGE_SIZE, 0, NULL, 0,
 				&vmm->vmm);
 	if (ret)
 		return ret;
diff --git a/drivers/gpu/drm/nouveau/nouveau_vmm.h b/drivers/gpu/drm/nouveau/nouveau_vmm.h
index ede872f6f668..2b98d975f37e 100644
--- a/drivers/gpu/drm/nouveau/nouveau_vmm.h
+++ b/drivers/gpu/drm/nouveau/nouveau_vmm.h
@@ -25,6 +25,7 @@ void nouveau_vma_unmap(struct nouveau_vma *);
 struct nouveau_vmm {
 	struct nouveau_cli *cli;
 	struct nvif_vmm vmm;
+	struct nouveau_svmm *svmm;
 };
 
 int nouveau_vmm_init(struct nouveau_cli *, s32 oclass, struct nouveau_vmm *);
diff --git a/drivers/gpu/drm/nouveau/nv84_fence.c b/drivers/gpu/drm/nouveau/nv84_fence.c
index e721bb2163a0..f07da00f285f 100644
--- a/drivers/gpu/drm/nouveau/nv84_fence.c
+++ b/drivers/gpu/drm/nouveau/nv84_fence.c
@@ -109,7 +109,6 @@ nv84_fence_context_del(struct nouveau_channel *chan)
 int
 nv84_fence_context_new(struct nouveau_channel *chan)
 {
-	struct nouveau_cli *cli = (void *)chan->user.client;
 	struct nv84_fence_priv *priv = chan->drm->fence;
 	struct nv84_fence_chan *fctx;
 	int ret;
@@ -127,7 +126,7 @@ nv84_fence_context_new(struct nouveau_channel *chan)
 	fctx->base.sequence = nv84_fence_read(chan);
 
 	mutex_lock(&priv->mutex);
-	ret = nouveau_vma_new(priv->bo, &cli->vmm, &fctx->vma);
+	ret = nouveau_vma_new(priv->bo, chan->vmm, &fctx->vma);
 	mutex_unlock(&priv->mutex);
 
 	if (ret)
diff --git a/drivers/gpu/drm/nouveau/nvif/disp.c b/drivers/gpu/drm/nouveau/nvif/disp.c
index ef97dd223a32..61638b3b9d3d 100644
--- a/drivers/gpu/drm/nouveau/nvif/disp.c
+++ b/drivers/gpu/drm/nouveau/nvif/disp.c
@@ -34,7 +34,7 @@ int
 nvif_disp_ctor(struct nvif_device *device, s32 oclass, struct nvif_disp *disp)
 {
 	static const struct nvif_mclass disps[] = {
-		{ TU104_DISP, -1 },
+		{ TU102_DISP, -1 },
 		{ GV100_DISP, -1 },
 		{ GP102_DISP, -1 },
 		{ GP100_DISP, -1 },
diff --git a/drivers/gpu/drm/nouveau/nvif/vmm.c b/drivers/gpu/drm/nouveau/nvif/vmm.c
index 6b9c5776547f..11487c00b909 100644
--- a/drivers/gpu/drm/nouveau/nvif/vmm.c
+++ b/drivers/gpu/drm/nouveau/nvif/vmm.c
@@ -112,8 +112,8 @@ nvif_vmm_fini(struct nvif_vmm *vmm)
 }
 
 int
-nvif_vmm_init(struct nvif_mmu *mmu, s32 oclass, u64 addr, u64 size,
-	      void *argv, u32 argc, struct nvif_vmm *vmm)
+nvif_vmm_init(struct nvif_mmu *mmu, s32 oclass, bool managed, u64 addr,
+	      u64 size, void *argv, u32 argc, struct nvif_vmm *vmm)
 {
 	struct nvif_vmm_v0 *args;
 	u32 argn = sizeof(*args) + argc;
@@ -125,6 +125,7 @@ nvif_vmm_init(struct nvif_mmu *mmu, s32 oclass, u64 addr, u64 size,
 	if (!(args = kmalloc(argn, GFP_KERNEL)))
 		return -ENOMEM;
 	args->version = 0;
+	args->managed = managed;
 	args->addr = addr;
 	args->size = size;
 	memcpy(args->data, argv, argc);
diff --git a/drivers/gpu/drm/nouveau/nvkm/core/subdev.c b/drivers/gpu/drm/nouveau/nvkm/core/subdev.c
index c61b467cf45e..245990de1e90 100644
--- a/drivers/gpu/drm/nouveau/nvkm/core/subdev.c
+++ b/drivers/gpu/drm/nouveau/nvkm/core/subdev.c
@@ -39,6 +39,7 @@ nvkm_subdev_name[NVKM_SUBDEV_NR] = {
 	[NVKM_SUBDEV_FB      ] = "fb",
 	[NVKM_SUBDEV_FUSE    ] = "fuse",
 	[NVKM_SUBDEV_GPIO    ] = "gpio",
+	[NVKM_SUBDEV_GSP     ] = "gsp",
 	[NVKM_SUBDEV_I2C     ] = "i2c",
 	[NVKM_SUBDEV_IBUS    ] = "priv",
 	[NVKM_SUBDEV_ICCSENSE] = "iccsense",
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/ce/Kbuild b/drivers/gpu/drm/nouveau/nvkm/engine/ce/Kbuild
index 177a23301d6a..9211663239af 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/ce/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/ce/Kbuild
@@ -6,4 +6,4 @@ nvkm-y += nvkm/engine/ce/gm200.o
 nvkm-y += nvkm/engine/ce/gp100.o
 nvkm-y += nvkm/engine/ce/gp102.o
 nvkm-y += nvkm/engine/ce/gv100.o
-nvkm-y += nvkm/engine/ce/tu104.o
+nvkm-y += nvkm/engine/ce/tu102.o
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/ce/tu104.c b/drivers/gpu/drm/nouveau/nvkm/engine/ce/tu102.c
index 3c25043bbb33..b4308e2d8c75 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/ce/tu104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/ce/tu102.c
@@ -24,7 +24,7 @@
 #include <nvif/class.h>
 
 static const struct nvkm_engine_func
-tu104_ce = {
+tu102_ce = {
 	.intr = gp100_ce_intr,
 	.sclass = {
 		{ -1, -1, TURING_DMA_COPY_A },
@@ -33,8 +33,8 @@ tu104_ce = {
 };
 
 int
-tu104_ce_new(struct nvkm_device *device, int index,
+tu102_ce_new(struct nvkm_device *device, int index,
 	     struct nvkm_engine **pengine)
 {
-	return nvkm_engine_new_(&tu104_ce, device, index, true, pengine);
+	return nvkm_engine_new_(&tu102_ce, device, index, true, pengine);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
index d9edb5785813..7971096b6767 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
@@ -1613,7 +1613,7 @@ nvd7_chipset = {
 	.pci = gf106_pci_new,
 	.therm = gf119_therm_new,
 	.timer = nv41_timer_new,
-	.volt = gf100_volt_new,
+	.volt = gf117_volt_new,
 	.ce[0] = gf100_ce_new,
 	.disp = gf119_disp_new,
 	.dma = gf119_dma_new,
@@ -2405,6 +2405,7 @@ nv140_chipset = {
 	.fb = gv100_fb_new,
 	.fuse = gm107_fuse_new,
 	.gpio = gk104_gpio_new,
+	.gsp = gv100_gsp_new,
 	.i2c = gm200_i2c_new,
 	.ibus = gm200_ibus_new,
 	.imem = nv50_instmem_new,
@@ -2437,97 +2438,106 @@ nv140_chipset = {
 static const struct nvkm_device_chip
 nv162_chipset = {
 	.name = "TU102",
-	.bar = tu104_bar_new,
+	.bar = tu102_bar_new,
 	.bios = nvkm_bios_new,
 	.bus = gf100_bus_new,
-	.devinit = tu104_devinit_new,
-	.fault = tu104_fault_new,
+	.devinit = tu102_devinit_new,
+	.fault = tu102_fault_new,
 	.fb = gv100_fb_new,
 	.fuse = gm107_fuse_new,
 	.gpio = gk104_gpio_new,
+	.gsp = gv100_gsp_new,
 	.i2c = gm200_i2c_new,
 	.ibus = gm200_ibus_new,
 	.imem = nv50_instmem_new,
 	.ltc = gp102_ltc_new,
-	.mc = tu104_mc_new,
-	.mmu = tu104_mmu_new,
+	.mc = tu102_mc_new,
+	.mmu = tu102_mmu_new,
 	.pci = gp100_pci_new,
 	.pmu = gp102_pmu_new,
 	.therm = gp100_therm_new,
 	.timer = gk20a_timer_new,
 	.top = gk104_top_new,
-	.ce[0] = tu104_ce_new,
-	.ce[1] = tu104_ce_new,
-	.ce[2] = tu104_ce_new,
-	.ce[3] = tu104_ce_new,
-	.ce[4] = tu104_ce_new,
-	.disp = tu104_disp_new,
+	.ce[0] = tu102_ce_new,
+	.ce[1] = tu102_ce_new,
+	.ce[2] = tu102_ce_new,
+	.ce[3] = tu102_ce_new,
+	.ce[4] = tu102_ce_new,
+	.disp = tu102_disp_new,
 	.dma = gv100_dma_new,
-	.fifo = tu104_fifo_new,
+	.fifo = tu102_fifo_new,
+	.nvdec[0] = gp102_nvdec_new,
+	.sec2 = tu102_sec2_new,
 };
 
 static const struct nvkm_device_chip
 nv164_chipset = {
 	.name = "TU104",
-	.bar = tu104_bar_new,
+	.bar = tu102_bar_new,
 	.bios = nvkm_bios_new,
 	.bus = gf100_bus_new,
-	.devinit = tu104_devinit_new,
-	.fault = tu104_fault_new,
+	.devinit = tu102_devinit_new,
+	.fault = tu102_fault_new,
 	.fb = gv100_fb_new,
 	.fuse = gm107_fuse_new,
 	.gpio = gk104_gpio_new,
+	.gsp = gv100_gsp_new,
 	.i2c = gm200_i2c_new,
 	.ibus = gm200_ibus_new,
 	.imem = nv50_instmem_new,
 	.ltc = gp102_ltc_new,
-	.mc = tu104_mc_new,
-	.mmu = tu104_mmu_new,
+	.mc = tu102_mc_new,
+	.mmu = tu102_mmu_new,
 	.pci = gp100_pci_new,
 	.pmu = gp102_pmu_new,
 	.therm = gp100_therm_new,
 	.timer = gk20a_timer_new,
 	.top = gk104_top_new,
-	.ce[0] = tu104_ce_new,
-	.ce[1] = tu104_ce_new,
-	.ce[2] = tu104_ce_new,
-	.ce[3] = tu104_ce_new,
-	.ce[4] = tu104_ce_new,
-	.disp = tu104_disp_new,
+	.ce[0] = tu102_ce_new,
+	.ce[1] = tu102_ce_new,
+	.ce[2] = tu102_ce_new,
+	.ce[3] = tu102_ce_new,
+	.ce[4] = tu102_ce_new,
+	.disp = tu102_disp_new,
 	.dma = gv100_dma_new,
-	.fifo = tu104_fifo_new,
+	.fifo = tu102_fifo_new,
+	.nvdec[0] = gp102_nvdec_new,
+	.sec2 = tu102_sec2_new,
 };
 
 static const struct nvkm_device_chip
 nv166_chipset = {
 	.name = "TU106",
-	.bar = tu104_bar_new,
+	.bar = tu102_bar_new,
 	.bios = nvkm_bios_new,
 	.bus = gf100_bus_new,
-	.devinit = tu104_devinit_new,
-	.fault = tu104_fault_new,
+	.devinit = tu102_devinit_new,
+	.fault = tu102_fault_new,
 	.fb = gv100_fb_new,
 	.fuse = gm107_fuse_new,
 	.gpio = gk104_gpio_new,
+	.gsp = gv100_gsp_new,
 	.i2c = gm200_i2c_new,
 	.ibus = gm200_ibus_new,
 	.imem = nv50_instmem_new,
 	.ltc = gp102_ltc_new,
-	.mc = tu104_mc_new,
-	.mmu = tu104_mmu_new,
+	.mc = tu102_mc_new,
+	.mmu = tu102_mmu_new,
 	.pci = gp100_pci_new,
 	.pmu = gp102_pmu_new,
 	.therm = gp100_therm_new,
 	.timer = gk20a_timer_new,
 	.top = gk104_top_new,
-	.ce[0] = tu104_ce_new,
-	.ce[1] = tu104_ce_new,
-	.ce[2] = tu104_ce_new,
-	.ce[3] = tu104_ce_new,
-	.ce[4] = tu104_ce_new,
-	.disp = tu104_disp_new,
+	.ce[0] = tu102_ce_new,
+	.ce[1] = tu102_ce_new,
+	.ce[2] = tu102_ce_new,
+	.ce[3] = tu102_ce_new,
+	.ce[4] = tu102_ce_new,
+	.disp = tu102_disp_new,
 	.dma = gv100_dma_new,
-	.fifo = tu104_fifo_new,
+	.fifo = tu102_fifo_new,
+	.nvdec[0] = gp102_nvdec_new,
+	.sec2 = tu102_sec2_new,
 };
 
 static int
@@ -2567,6 +2577,7 @@ nvkm_device_subdev(struct nvkm_device *device, int index)
 	_(FB      , device->fb      , &device->fb->subdev);
 	_(FUSE    , device->fuse    , &device->fuse->subdev);
 	_(GPIO    , device->gpio    , &device->gpio->subdev);
+	_(GSP     , device->gsp     , &device->gsp->subdev);
 	_(I2C     , device->i2c     , &device->i2c->subdev);
 	_(IBUS    , device->ibus    ,  device->ibus);
 	_(ICCSENSE, device->iccsense, &device->iccsense->subdev);
@@ -3050,6 +3061,7 @@ nvkm_device_ctor(const struct nvkm_device_func *func,
 		_(NVKM_SUBDEV_FB      ,       fb);
 		_(NVKM_SUBDEV_FUSE    ,     fuse);
 		_(NVKM_SUBDEV_GPIO    ,     gpio);
+		_(NVKM_SUBDEV_GSP     ,      gsp);
 		_(NVKM_SUBDEV_I2C     ,      i2c);
 		_(NVKM_SUBDEV_IBUS    ,     ibus);
 		_(NVKM_SUBDEV_ICCSENSE, iccsense);
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/priv.h b/drivers/gpu/drm/nouveau/nvkm/engine/device/priv.h
index 253ab914a8ef..2a53e37dfa7a 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/priv.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/priv.h
@@ -12,6 +12,7 @@
 #include <subdev/fb.h>
 #include <subdev/fuse.h>
 #include <subdev/gpio.h>
+#include <subdev/gsp.h>
 #include <subdev/i2c.h>
 #include <subdev/ibus.h>
 #include <subdev/iccsense.h>
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/user.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/user.c
index 092ddc4ffefa..03c6d9aef075 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/user.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/user.c
@@ -365,16 +365,15 @@ nvkm_udevice_child_get(struct nvkm_object *object, int index,
 	}
 
 	if (!sclass) {
-		switch (index) {
-		case 0: sclass = &nvkm_control_oclass; break;
-		case 1:
-			if (!device->mmu)
-				return -EINVAL;
+		if (index-- == 0)
+			sclass = &nvkm_control_oclass;
+		else if (device->mmu && index-- == 0)
 			sclass = &device->mmu->user;
-			break;
-		default:
+		else if (device->fault && index-- == 0)
+			sclass = &device->fault->user;
+		else
 			return -EINVAL;
-		}
+
 		oclass->base = sclass->base;
 	}
 
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/Kbuild b/drivers/gpu/drm/nouveau/nvkm/engine/disp/Kbuild
index c6a257ba4347..2c28a5e747cc 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/Kbuild
@@ -15,7 +15,7 @@ nvkm-y += nvkm/engine/disp/gm200.o
 nvkm-y += nvkm/engine/disp/gp100.o
 nvkm-y += nvkm/engine/disp/gp102.o
 nvkm-y += nvkm/engine/disp/gv100.o
-nvkm-y += nvkm/engine/disp/tu104.o
+nvkm-y += nvkm/engine/disp/tu102.o
 nvkm-y += nvkm/engine/disp/vga.o
 
 nvkm-y += nvkm/engine/disp/head.o
@@ -39,7 +39,7 @@ nvkm-y += nvkm/engine/disp/sorgk104.o
 nvkm-y += nvkm/engine/disp/sorgm107.o
 nvkm-y += nvkm/engine/disp/sorgm200.o
 nvkm-y += nvkm/engine/disp/sorgv100.o
-nvkm-y += nvkm/engine/disp/sortu104.o
+nvkm-y += nvkm/engine/disp/sortu102.o
 
 nvkm-y += nvkm/engine/disp/outp.o
 nvkm-y += nvkm/engine/disp/dp.o
@@ -71,7 +71,7 @@ nvkm-y += nvkm/engine/disp/rootgm200.o
 nvkm-y += nvkm/engine/disp/rootgp100.o
 nvkm-y += nvkm/engine/disp/rootgp102.o
 nvkm-y += nvkm/engine/disp/rootgv100.o
-nvkm-y += nvkm/engine/disp/roottu104.o
+nvkm-y += nvkm/engine/disp/roottu102.o
 
 nvkm-y += nvkm/engine/disp/channv50.o
 nvkm-y += nvkm/engine/disp/changf119.o
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/gf119.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/gf119.c
index 794e90982641..e675d9b9d5d7 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/gf119.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/gf119.c
@@ -91,15 +91,21 @@ gf119_disp_intr_error(struct nv50_disp *disp, int chid)
 {
 	struct nvkm_subdev *subdev = &disp->base.engine.subdev;
 	struct nvkm_device *device = subdev->device;
-	u32 mthd = nvkm_rd32(device, 0x6101f0 + (chid * 12));
+	u32 stat = nvkm_rd32(device, 0x6101f0 + (chid * 12));
+	u32 type = (stat & 0x00007000) >> 12;
+	u32 mthd = (stat & 0x00000ffc);
 	u32 data = nvkm_rd32(device, 0x6101f4 + (chid * 12));
-	u32 unkn = nvkm_rd32(device, 0x6101f8 + (chid * 12));
+	u32 code = nvkm_rd32(device, 0x6101f8 + (chid * 12));
+	const struct nvkm_enum *reason =
+		nvkm_enum_find(nv50_disp_intr_error_type, type);
 
-	nvkm_error(subdev, "chid %d mthd %04x data %08x %08x %08x\n",
-		   chid, (mthd & 0x0000ffc), data, mthd, unkn);
+	nvkm_error(subdev, "chid %d stat %08x reason %d [%s] mthd %04x "
+			   "data %08x code %08x\n",
+		   chid, stat, type, reason ? reason->name : "",
+		   mthd, data, code);
 
 	if (chid < ARRAY_SIZE(disp->chan)) {
-		switch (mthd & 0xffc) {
+		switch (mthd) {
 		case 0x0080:
 			nv50_disp_chan_mthd(disp->chan[chid], NV_DBG_ERROR);
 			break;
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/gv100.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/gv100.c
index 47be0ba4aebe..892be6c9b76c 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/gv100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/gv100.c
@@ -103,10 +103,13 @@ gv100_disp_exception(struct nv50_disp *disp, int chid)
 	u32 mthd = (stat & 0x00000fff) << 2;
 	u32 data = nvkm_rd32(device, 0x611024 + (chid * 12));
 	u32 code = nvkm_rd32(device, 0x611028 + (chid * 12));
+	const struct nvkm_enum *reason =
+		nvkm_enum_find(nv50_disp_intr_error_type, type);
 
-	nvkm_error(subdev, "chid %d %08x [type %d mthd %04x] "
+	nvkm_error(subdev, "chid %d stat %08x reason %d [%s] mthd %04x "
 			   "data %08x code %08x\n",
-		   chid, stat, type, mthd, data, code);
+		   chid, stat, type, reason ? reason->name : "",
+		   mthd, data, code);
 
 	if (chid < ARRAY_SIZE(disp->chan) && disp->chan[chid]) {
 		switch (mthd) {
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/ior.h b/drivers/gpu/drm/nouveau/nvkm/engine/disp/ior.h
index 790e42f460fd..1681ddccd298 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/ior.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/ior.h
@@ -201,5 +201,5 @@ int gm200_sor_new(struct nvkm_disp *, int);
 int gv100_sor_cnt(struct nvkm_disp *, unsigned long *);
 int gv100_sor_new(struct nvkm_disp *, int);
 
-int tu104_sor_new(struct nvkm_disp *, int);
+int tu102_sor_new(struct nvkm_disp *, int);
 #endif
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/nv50.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/nv50.c
index def005dd5fda..e21556bf2cb1 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/nv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/nv50.c
@@ -28,7 +28,6 @@
 #include "rootnv50.h"
 
 #include <core/client.h>
-#include <core/enum.h>
 #include <core/ramht.h>
 #include <subdev/bios.h>
 #include <subdev/bios/disp.h>
@@ -593,12 +592,15 @@ nv50_disp_super(struct work_struct *work)
 	nvkm_wr32(device, 0x610030, 0x80000000);
 }
 
-static const struct nvkm_enum
+const struct nvkm_enum
 nv50_disp_intr_error_type[] = {
-	{ 3, "ILLEGAL_MTHD" },
-	{ 4, "INVALID_VALUE" },
+	{ 0, "NONE" },
+	{ 1, "PUSHBUFFER_ERR" },
+	{ 2, "TRAP" },
+	{ 3, "RESERVED_METHOD" },
+	{ 4, "INVALID_ARG" },
 	{ 5, "INVALID_STATE" },
-	{ 7, "INVALID_HANDLE" },
+	{ 7, "UNRESOLVABLE_HANDLE" },
 	{}
 };
 
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/nv50.h b/drivers/gpu/drm/nouveau/nvkm/engine/disp/nv50.h
index c36a8a7cafa1..e5d00f478bb1 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/nv50.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/nv50.h
@@ -5,6 +5,8 @@
 #include "priv.h"
 struct nvkm_head;
 
+#include <core/enum.h>
+
 struct nv50_disp {
 	const struct nv50_disp_func *func;
 	struct nvkm_disp base;
@@ -71,6 +73,7 @@ int nv50_disp_init(struct nv50_disp *);
 void nv50_disp_fini(struct nv50_disp *);
 void nv50_disp_intr(struct nv50_disp *);
 void nv50_disp_super(struct work_struct *);
+extern const struct nvkm_enum nv50_disp_intr_error_type[];
 
 int gf119_disp_init(struct nv50_disp *);
 void gf119_disp_fini(struct nv50_disp *);
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/rootnv50.h b/drivers/gpu/drm/nouveau/nvkm/engine/disp/rootnv50.h
index 97de928cbde1..aee9822a7a87 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/rootnv50.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/rootnv50.h
@@ -37,5 +37,5 @@ extern const struct nvkm_disp_oclass gm200_disp_root_oclass;
 extern const struct nvkm_disp_oclass gp100_disp_root_oclass;
 extern const struct nvkm_disp_oclass gp102_disp_root_oclass;
 extern const struct nvkm_disp_oclass gv100_disp_root_oclass;
-extern const struct nvkm_disp_oclass tu104_disp_root_oclass;
+extern const struct nvkm_disp_oclass tu102_disp_root_oclass;
 #endif
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/roottu104.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/roottu102.c
index ad438c62f66c..579a5d02308a 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/roottu104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/roottu102.c
@@ -25,28 +25,28 @@
 #include <nvif/class.h>
 
 static const struct nv50_disp_root_func
-tu104_disp_root = {
+tu102_disp_root = {
 	.user = {
-		{{0,0,TU104_DISP_CURSOR                }, gv100_disp_curs_new },
-		{{0,0,TU104_DISP_WINDOW_IMM_CHANNEL_DMA}, gv100_disp_wimm_new },
-		{{0,0,TU104_DISP_CORE_CHANNEL_DMA      }, gv100_disp_core_new },
-		{{0,0,TU104_DISP_WINDOW_CHANNEL_DMA    }, gv100_disp_wndw_new },
+		{{0,0,TU102_DISP_CURSOR                }, gv100_disp_curs_new },
+		{{0,0,TU102_DISP_WINDOW_IMM_CHANNEL_DMA}, gv100_disp_wimm_new },
+		{{0,0,TU102_DISP_CORE_CHANNEL_DMA      }, gv100_disp_core_new },
+		{{0,0,TU102_DISP_WINDOW_CHANNEL_DMA    }, gv100_disp_wndw_new },
 		{}
 	},
 };
 
 static int
-tu104_disp_root_new(struct nvkm_disp *disp, const struct nvkm_oclass *oclass,
+tu102_disp_root_new(struct nvkm_disp *disp, const struct nvkm_oclass *oclass,
 		    void *data, u32 size, struct nvkm_object **pobject)
 {
-	return nv50_disp_root_new_(&tu104_disp_root, disp, oclass,
+	return nv50_disp_root_new_(&tu102_disp_root, disp, oclass,
 				   data, size, pobject);
 }
 
 const struct nvkm_disp_oclass
-tu104_disp_root_oclass = {
-	.base.oclass = TU104_DISP,
+tu102_disp_root_oclass = {
+	.base.oclass = TU102_DISP,
 	.base.minver = -1,
 	.base.maxver = -1,
-	.ctor = tu104_disp_root_new,
+	.ctor = tu102_disp_root_new,
 };
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/sortu104.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/sortu102.c
index df026a525ef1..d57b73ada89e 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/sortu104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/sortu102.c
@@ -24,7 +24,7 @@
 #include <subdev/timer.h>
 
 static void
-tu104_sor_dp_vcpi(struct nvkm_ior *sor, int head,
+tu102_sor_dp_vcpi(struct nvkm_ior *sor, int head,
 		  u8 slot, u8 slot_nr, u16 pbn, u16 aligned)
 {
 	struct nvkm_device *device = sor->disp->engine.subdev.device;
@@ -35,7 +35,7 @@ tu104_sor_dp_vcpi(struct nvkm_ior *sor, int head,
 }
 
 static int
-tu104_sor_dp_links(struct nvkm_ior *sor, struct nvkm_i2c_aux *aux)
+tu102_sor_dp_links(struct nvkm_ior *sor, struct nvkm_i2c_aux *aux)
 {
 	struct nvkm_device *device = sor->disp->engine.subdev.device;
 	const u32 soff = nv50_ior_base(sor);
@@ -62,7 +62,7 @@ tu104_sor_dp_links(struct nvkm_ior *sor, struct nvkm_i2c_aux *aux)
 }
 
 static const struct nvkm_ior_func
-tu104_sor = {
+tu102_sor = {
 	.route = {
 		.get = gm200_sor_route_get,
 		.set = gm200_sor_route_set,
@@ -75,11 +75,11 @@ tu104_sor = {
 	},
 	.dp = {
 		.lanes = { 0, 1, 2, 3 },
-		.links = tu104_sor_dp_links,
+		.links = tu102_sor_dp_links,
 		.power = g94_sor_dp_power,
 		.pattern = gm107_sor_dp_pattern,
 		.drive = gm200_sor_dp_drive,
-		.vcpi = tu104_sor_dp_vcpi,
+		.vcpi = tu102_sor_dp_vcpi,
 		.audio = gv100_sor_dp_audio,
 		.audio_sym = gv100_sor_dp_audio_sym,
 		.watermark = gv100_sor_dp_watermark,
@@ -91,7 +91,7 @@ tu104_sor = {
 };
 
 int
-tu104_sor_new(struct nvkm_disp *disp, int id)
+tu102_sor_new(struct nvkm_disp *disp, int id)
 {
-	return nvkm_ior_new_(&tu104_sor, disp, SOR, id);
+	return nvkm_ior_new_(&tu102_sor, disp, SOR, id);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/tu104.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/tu102.c
index 13fa21459d38..883ae4151ff8 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/tu104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/tu102.c
@@ -29,7 +29,7 @@
 #include <subdev/timer.h>
 
 static int
-tu104_disp_init(struct nv50_disp *disp)
+tu102_disp_init(struct nv50_disp *disp)
 {
 	struct nvkm_device *device = disp->base.engine.subdev.device;
 	struct nvkm_head *head;
@@ -132,21 +132,21 @@ tu104_disp_init(struct nv50_disp *disp)
 }
 
 static const struct nv50_disp_func
-tu104_disp = {
-	.init = tu104_disp_init,
+tu102_disp = {
+	.init = tu102_disp_init,
 	.fini = gv100_disp_fini,
 	.intr = gv100_disp_intr,
 	.uevent = &gv100_disp_chan_uevent,
 	.super = gv100_disp_super,
-	.root = &tu104_disp_root_oclass,
+	.root = &tu102_disp_root_oclass,
 	.wndw = { .cnt = gv100_disp_wndw_cnt },
 	.head = { .cnt = gv100_head_cnt, .new = gv100_head_new },
-	.sor = { .cnt = gv100_sor_cnt, .new = tu104_sor_new },
+	.sor = { .cnt = gv100_sor_cnt, .new = tu102_sor_new },
 	.ramht_size = 0x2000,
 };
 
 int
-tu104_disp_new(struct nvkm_device *device, int index, struct nvkm_disp **pdisp)
+tu102_disp_new(struct nvkm_device *device, int index, struct nvkm_disp **pdisp)
 {
-	return nv50_disp_new_(&tu104_disp, device, index, pdisp);
+	return nv50_disp_new_(&tu102_disp, device, index, pdisp);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/Kbuild b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/Kbuild
index 87d8e054e40a..05aada541ea5 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/Kbuild
@@ -16,7 +16,7 @@ nvkm-y += nvkm/engine/fifo/gm20b.o
 nvkm-y += nvkm/engine/fifo/gp100.o
 nvkm-y += nvkm/engine/fifo/gp10b.o
 nvkm-y += nvkm/engine/fifo/gv100.o
-nvkm-y += nvkm/engine/fifo/tu104.o
+nvkm-y += nvkm/engine/fifo/tu102.o
 
 nvkm-y += nvkm/engine/fifo/chan.o
 nvkm-y += nvkm/engine/fifo/channv50.o
@@ -34,7 +34,7 @@ nvkm-y += nvkm/engine/fifo/gpfifog84.o
 nvkm-y += nvkm/engine/fifo/gpfifogf100.o
 nvkm-y += nvkm/engine/fifo/gpfifogk104.o
 nvkm-y += nvkm/engine/fifo/gpfifogv100.o
-nvkm-y += nvkm/engine/fifo/gpfifotu104.o
+nvkm-y += nvkm/engine/fifo/gpfifotu102.o
 
 nvkm-y += nvkm/engine/fifo/usergv100.o
-nvkm-y += nvkm/engine/fifo/usertu104.o
+nvkm-y += nvkm/engine/fifo/usertu102.o
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/changk104.h b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/changk104.h
index a14545d871d8..f8557cdfbd81 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/changk104.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/changk104.h
@@ -47,6 +47,6 @@ int gv100_fifo_gpfifo_engine_init(struct nvkm_fifo_chan *,
 int gv100_fifo_gpfifo_engine_fini(struct nvkm_fifo_chan *,
 				  struct nvkm_engine *, bool);
 
-int tu104_fifo_gpfifo_new(struct gk104_fifo *, const struct nvkm_oclass *,
+int tu102_fifo_gpfifo_new(struct gk104_fifo *, const struct nvkm_oclass *,
 			  void *data, u32 size, struct nvkm_object **);
 #endif
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifotu104.c b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifotu102.c
index ff70484dd01a..abef7fb6e2d3 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifotu104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifotu102.c
@@ -29,14 +29,14 @@
 #include <nvif/unpack.h>
 
 static u32
-tu104_fifo_gpfifo_submit_token(struct nvkm_fifo_chan *base)
+tu102_fifo_gpfifo_submit_token(struct nvkm_fifo_chan *base)
 {
 	struct gk104_fifo_chan *chan = gk104_fifo_chan(base);
 	return (chan->runl << 16) | chan->base.chid;
 }
 
 static const struct nvkm_fifo_chan_func
-tu104_fifo_gpfifo = {
+tu102_fifo_gpfifo = {
 	.dtor = gk104_fifo_gpfifo_dtor,
 	.init = gk104_fifo_gpfifo_init,
 	.fini = gk104_fifo_gpfifo_fini,
@@ -45,11 +45,11 @@ tu104_fifo_gpfifo = {
 	.engine_dtor = gk104_fifo_gpfifo_engine_dtor,
 	.engine_init = gv100_fifo_gpfifo_engine_init,
 	.engine_fini = gv100_fifo_gpfifo_engine_fini,
-	.submit_token = tu104_fifo_gpfifo_submit_token,
+	.submit_token = tu102_fifo_gpfifo_submit_token,
 };
 
 int
-tu104_fifo_gpfifo_new(struct gk104_fifo *fifo, const struct nvkm_oclass *oclass,
+tu102_fifo_gpfifo_new(struct gk104_fifo *fifo, const struct nvkm_oclass *oclass,
 		      void *data, u32 size, struct nvkm_object **pobject)
 {
 	struct nvkm_object *parent = oclass->parent;
@@ -67,7 +67,7 @@ tu104_fifo_gpfifo_new(struct gk104_fifo *fifo, const struct nvkm_oclass *oclass,
 			   args->v0.ilength, args->v0.runlist, args->v0.priv);
 		if (args->v0.priv && !oclass->client->super)
 			return -EINVAL;
-		return gv100_fifo_gpfifo_new_(&tu104_fifo_gpfifo, fifo,
+		return gv100_fifo_gpfifo_new_(&tu102_fifo_gpfifo, fifo,
 					      &args->v0.runlist,
 					      &args->v0.chid,
 					       args->v0.vmm,
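
As a side note, the Turing doorbell token computed by tu102_fifo_gpfifo_submit_token() above simply packs the runlist index into the upper 16 bits and the channel ID into the lower 16. A tiny illustrative sketch of that split (the field widths are inferred from the shift in the code, not from hardware documentation):

	/* Sketch only: decode a token built as (chan->runl << 16) | chan->base.chid. */
	static inline u32 example_token_runl(u32 token) { return token >> 16; }
	static inline u32 example_token_chid(u32 token) { return token & 0xffff; }
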
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/tu104.c b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/tu102.c
index 98c80705bc61..005f3e1729b9 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/tu104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/tu102.c
@@ -29,7 +29,7 @@
 #include <nvif/class.h>
 
 static void
-tu104_fifo_runlist_commit(struct gk104_fifo *fifo, int runl,
+tu102_fifo_runlist_commit(struct gk104_fifo *fifo, int runl,
 			  struct nvkm_memory *mem, int nr)
 {
 	struct nvkm_device *device = fifo->base.engine.subdev.device;
@@ -44,15 +44,15 @@ tu104_fifo_runlist_commit(struct gk104_fifo *fifo, int runl,
 }
 
 const struct gk104_fifo_runlist_func
-tu104_fifo_runlist = {
+tu102_fifo_runlist = {
 	.size = 16,
 	.cgrp = gv100_fifo_runlist_cgrp,
 	.chan = gv100_fifo_runlist_chan,
-	.commit = tu104_fifo_runlist_commit,
+	.commit = tu102_fifo_runlist_commit,
 };
 
 static const struct nvkm_enum
-tu104_fifo_fault_engine[] = {
+tu102_fifo_fault_engine[] = {
 	{ 0x01, "DISPLAY" },
 	{ 0x03, "PTP" },
 	{ 0x06, "PWR_PMU" },
@@ -80,7 +80,7 @@ tu104_fifo_fault_engine[] = {
 };
 
 static void
-tu104_fifo_pbdma_init(struct gk104_fifo *fifo)
+tu102_fifo_pbdma_init(struct gk104_fifo *fifo)
 {
 	struct nvkm_device *device = fifo->base.engine.subdev.device;
 	const u32 mask = (1 << fifo->pbdma_nr) - 1;
@@ -89,28 +89,28 @@ tu104_fifo_pbdma_init(struct gk104_fifo *fifo)
 }
 
 static const struct gk104_fifo_pbdma_func
-tu104_fifo_pbdma = {
+tu102_fifo_pbdma = {
 	.nr = gm200_fifo_pbdma_nr,
-	.init = tu104_fifo_pbdma_init,
+	.init = tu102_fifo_pbdma_init,
 	.init_timeout = gk208_fifo_pbdma_init_timeout,
 };
 
 static const struct gk104_fifo_func
-tu104_fifo = {
-	.pbdma = &tu104_fifo_pbdma,
+tu102_fifo = {
+	.pbdma = &tu102_fifo_pbdma,
 	.fault.access = gv100_fifo_fault_access,
-	.fault.engine = tu104_fifo_fault_engine,
+	.fault.engine = tu102_fifo_fault_engine,
 	.fault.reason = gv100_fifo_fault_reason,
 	.fault.hubclient = gv100_fifo_fault_hubclient,
 	.fault.gpcclient = gv100_fifo_fault_gpcclient,
-	.runlist = &tu104_fifo_runlist,
-	.user = {{-1,-1,VOLTA_USERMODE_A       }, tu104_fifo_user_new   },
-	.chan = {{ 0, 0,TURING_CHANNEL_GPFIFO_A}, tu104_fifo_gpfifo_new },
+	.runlist = &tu102_fifo_runlist,
+	.user = {{-1,-1,VOLTA_USERMODE_A       }, tu102_fifo_user_new   },
+	.chan = {{ 0, 0,TURING_CHANNEL_GPFIFO_A}, tu102_fifo_gpfifo_new },
 	.cgrp_force = true,
 };
 
 int
-tu104_fifo_new(struct nvkm_device *device, int index, struct nvkm_fifo **pfifo)
+tu102_fifo_new(struct nvkm_device *device, int index, struct nvkm_fifo **pfifo)
 {
-	return gk104_fifo_new_(&tu104_fifo, device, index, 4096, pfifo);
+	return gk104_fifo_new_(&tu102_fifo, device, index, 4096, pfifo);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/user.h b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/user.h
index 14b0c6bde8eb..54a3a3092cc0 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/user.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/user.h
@@ -3,6 +3,6 @@
 #include "priv.h"
 int gv100_fifo_user_new(const struct nvkm_oclass *, void *, u32,
 			struct nvkm_object **);
-int tu104_fifo_user_new(const struct nvkm_oclass *, void *, u32,
+int tu102_fifo_user_new(const struct nvkm_oclass *, void *, u32,
 			struct nvkm_object **);
 #endif
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/usertu104.c b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/usertu102.c
index 8f98548a21f6..217268f8ccad 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/fifo/usertu104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/fifo/usertu102.c
@@ -22,7 +22,7 @@
 #include "user.h"
 
 static int
-tu104_fifo_user_map(struct nvkm_object *object, void *argv, u32 argc,
+tu102_fifo_user_map(struct nvkm_object *object, void *argv, u32 argc,
 		    enum nvkm_object_map *type, u64 *addr, u64 *size)
 {
 	struct nvkm_device *device = object->engine->subdev.device;
@@ -33,13 +33,13 @@ tu104_fifo_user_map(struct nvkm_object *object, void *argv, u32 argc,
 }
 
 static const struct nvkm_object_func
-tu104_fifo_user = {
-	.map = tu104_fifo_user_map,
+tu102_fifo_user = {
+	.map = tu102_fifo_user_map,
 };
 
 int
-tu104_fifo_user_new(const struct nvkm_oclass *oclass, void *argv, u32 argc,
+tu102_fifo_user_new(const struct nvkm_oclass *oclass, void *argv, u32 argc,
 		    struct nvkm_object **pobject)
 {
-	return nvkm_object_new_(&tu104_fifo_user, oclass, argv, argc, pobject);
+	return nvkm_object_new_(&tu102_fifo_user, oclass, argv, argc, pobject);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/base.c
index cd8cf6f7024c..d41fb94524e9 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/base.c
@@ -25,6 +25,33 @@
 
 #include <engine/fifo.h>
 
+u32
+nvkm_gr_ctxsw_inst(struct nvkm_device *device)
+{
+	struct nvkm_gr *gr = device->gr;
+	if (gr && gr->func->ctxsw.inst)
+		return gr->func->ctxsw.inst(gr);
+	return 0;
+}
+
+int
+nvkm_gr_ctxsw_resume(struct nvkm_device *device)
+{
+	struct nvkm_gr *gr = device->gr;
+	if (gr && gr->func->ctxsw.resume)
+		return gr->func->ctxsw.resume(gr);
+	return 0;
+}
+
+int
+nvkm_gr_ctxsw_pause(struct nvkm_device *device)
+{
+	struct nvkm_gr *gr = device->gr;
+	if (gr && gr->func->ctxsw.pause)
+		return gr->func->ctxsw.pause(gr);
+	return 0;
+}
+
 static bool
 nvkm_gr_chsw_load(struct nvkm_engine *engine)
 {
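
A minimal sketch of how these new wrappers are meant to be paired, assuming a caller that wants a stable snapshot of the current graphics channel: pause context switching, read the instance, resume. Because the gf100 implementations refcount via fecs.disable (see the gf100.c hunk below), the pair nests safely, and only the outermost stop/start actually issues the FECS methods. The function itself is hypothetical, not part of this patch:

	/* Hypothetical caller, for illustration only. */
	static int example_sample_gr_inst(struct nvkm_device *device, u32 *inst)
	{
		int ret = nvkm_gr_ctxsw_pause(device);
		if (ret)
			return ret;
		*inst = nvkm_gr_ctxsw_inst(device); /* 0 if unsupported */
		return nvkm_gr_ctxsw_resume(device);
	}
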
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c
index e813a3f8ea93..85f2d1e950e8 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.c
@@ -1523,13 +1523,9 @@ gf100_grctx_generate(struct gf100_gr *gr)
 	/* Make channel current. */
 	addr = nvkm_memory_addr(inst) >> 12;
 	if (gr->firmware) {
-		nvkm_wr32(device, 0x409840, 0x00000030);
-		nvkm_wr32(device, 0x409500, 0x80000000 | addr);
-		nvkm_wr32(device, 0x409504, 0x00000003);
-		nvkm_msec(device, 2000,
-			if (nvkm_rd32(device, 0x409800) & 0x00000010)
-				break;
-		);
+		ret = gf100_gr_fecs_bind_pointer(gr, 0x80000000 | addr);
+		if (ret)
+			goto done;
 
 		nvkm_kmap(data);
 		nvkm_wo32(data, 0x1c, 1);
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c
index 70d3d41e616c..81a13cf9a292 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c
@@ -715,6 +715,211 @@ gf100_gr_pack_mmio[] = {
  * PGRAPH engine/subdev functions
  ******************************************************************************/
 
+static u32
+gf100_gr_ctxsw_inst(struct nvkm_gr *gr)
+{
+	return nvkm_rd32(gr->engine.subdev.device, 0x409b00);
+}
+
+static int
+gf100_gr_fecs_ctrl_ctxsw(struct gf100_gr *gr, u32 mthd)
+{
+	struct nvkm_device *device = gr->base.engine.subdev.device;
+
+	nvkm_wr32(device, 0x409804, 0xffffffff);
+	nvkm_wr32(device, 0x409840, 0xffffffff);
+	nvkm_wr32(device, 0x409500, 0xffffffff);
+	nvkm_wr32(device, 0x409504, mthd);
+	nvkm_msec(device, 2000,
+		u32 stat = nvkm_rd32(device, 0x409804);
+		if (stat == 0x00000002)
+			return -EIO;
+		if (stat == 0x00000001)
+			return 0;
+	);
+
+	return -ETIMEDOUT;
+}
+
+int
+gf100_gr_fecs_start_ctxsw(struct nvkm_gr *base)
+{
+	struct gf100_gr *gr = gf100_gr(base);
+	int ret = 0;
+
+	mutex_lock(&gr->fecs.mutex);
+	if (!--gr->fecs.disable) {
+		if (WARN_ON(ret = gf100_gr_fecs_ctrl_ctxsw(gr, 0x39)))
+			gr->fecs.disable++;
+	}
+	mutex_unlock(&gr->fecs.mutex);
+	return ret;
+}
+
+int
+gf100_gr_fecs_stop_ctxsw(struct nvkm_gr *base)
+{
+	struct gf100_gr *gr = gf100_gr(base);
+	int ret = 0;
+
+	mutex_lock(&gr->fecs.mutex);
+	if (!gr->fecs.disable++) {
+		if (WARN_ON(ret = gf100_gr_fecs_ctrl_ctxsw(gr, 0x38)))
+			gr->fecs.disable--;
+	}
+	mutex_unlock(&gr->fecs.mutex);
+	return ret;
+}
+
+int
+gf100_gr_fecs_bind_pointer(struct gf100_gr *gr, u32 inst)
+{
+	struct nvkm_device *device = gr->base.engine.subdev.device;
+
+	nvkm_wr32(device, 0x409840, 0x00000030);
+	nvkm_wr32(device, 0x409500, inst);
+	nvkm_wr32(device, 0x409504, 0x00000003);
+	nvkm_msec(device, 2000,
+		u32 stat = nvkm_rd32(device, 0x409800);
+		if (stat & 0x00000020)
+			return -EIO;
+		if (stat & 0x00000010)
+			return 0;
+	);
+
+	return -ETIMEDOUT;
+}
+
+static int
+gf100_gr_fecs_set_reglist_virtual_address(struct gf100_gr *gr, u64 addr)
+{
+	struct nvkm_device *device = gr->base.engine.subdev.device;
+
+	nvkm_wr32(device, 0x409810, addr >> 8);
+	nvkm_wr32(device, 0x409800, 0x00000000);
+	nvkm_wr32(device, 0x409500, 0x00000001);
+	nvkm_wr32(device, 0x409504, 0x00000032);
+	nvkm_msec(device, 2000,
+		if (nvkm_rd32(device, 0x409800) == 0x00000001)
+			return 0;
+	);
+
+	return -ETIMEDOUT;
+}
+
+static int
+gf100_gr_fecs_set_reglist_bind_instance(struct gf100_gr *gr, u32 inst)
+{
+	struct nvkm_device *device = gr->base.engine.subdev.device;
+
+	nvkm_wr32(device, 0x409810, inst);
+	nvkm_wr32(device, 0x409800, 0x00000000);
+	nvkm_wr32(device, 0x409500, 0x00000001);
+	nvkm_wr32(device, 0x409504, 0x00000031);
+	nvkm_msec(device, 2000,
+		if (nvkm_rd32(device, 0x409800) == 0x00000001)
+			return 0;
+	);
+
+	return -ETIMEDOUT;
+}
+
+static int
+gf100_gr_fecs_discover_reglist_image_size(struct gf100_gr *gr, u32 *psize)
+{
+	struct nvkm_device *device = gr->base.engine.subdev.device;
+
+	nvkm_wr32(device, 0x409800, 0x00000000);
+	nvkm_wr32(device, 0x409500, 0x00000001);
+	nvkm_wr32(device, 0x409504, 0x00000030);
+	nvkm_msec(device, 2000,
+		if ((*psize = nvkm_rd32(device, 0x409800)))
+			return 0;
+	);
+
+	return -ETIMEDOUT;
+}
+
+static int
+gf100_gr_fecs_elpg_bind(struct gf100_gr *gr)
+{
+	u32 size;
+	int ret;
+
+	ret = gf100_gr_fecs_discover_reglist_image_size(gr, &size);
+	if (ret)
+		return ret;
+
+	/*XXX: We need to allocate + map the above into PMU's inst block,
+	 *     which means we probably need a proper PMU before we
+	 *     even bother.
+	 */
+
+	ret = gf100_gr_fecs_set_reglist_bind_instance(gr, 0);
+	if (ret)
+		return ret;
+
+	return gf100_gr_fecs_set_reglist_virtual_address(gr, 0);
+}
+
+static int
+gf100_gr_fecs_discover_pm_image_size(struct gf100_gr *gr, u32 *psize)
+{
+	struct nvkm_device *device = gr->base.engine.subdev.device;
+
+	nvkm_wr32(device, 0x409840, 0xffffffff);
+	nvkm_wr32(device, 0x409500, 0x00000000);
+	nvkm_wr32(device, 0x409504, 0x00000025);
+	nvkm_msec(device, 2000,
+		if ((*psize = nvkm_rd32(device, 0x409800)))
+			return 0;
+	);
+
+	return -ETIMEDOUT;
+}
+
+static int
+gf100_gr_fecs_discover_zcull_image_size(struct gf100_gr *gr, u32 *psize)
+{
+	struct nvkm_device *device = gr->base.engine.subdev.device;
+
+	nvkm_wr32(device, 0x409840, 0xffffffff);
+	nvkm_wr32(device, 0x409500, 0x00000000);
+	nvkm_wr32(device, 0x409504, 0x00000016);
+	nvkm_msec(device, 2000,
+		if ((*psize = nvkm_rd32(device, 0x409800)))
+			return 0;
+	);
+
+	return -ETIMEDOUT;
+}
+
+static int
+gf100_gr_fecs_discover_image_size(struct gf100_gr *gr, u32 *psize)
+{
+	struct nvkm_device *device = gr->base.engine.subdev.device;
+
+	nvkm_wr32(device, 0x409840, 0xffffffff);
+	nvkm_wr32(device, 0x409500, 0x00000000);
+	nvkm_wr32(device, 0x409504, 0x00000010);
+	nvkm_msec(device, 2000,
+		if ((*psize = nvkm_rd32(device, 0x409800)))
+			return 0;
+	);
+
+	return -ETIMEDOUT;
+}
+
+static void
+gf100_gr_fecs_set_watchdog_timeout(struct gf100_gr *gr, u32 timeout)
+{
+	struct nvkm_device *device = gr->base.engine.subdev.device;
+
+	nvkm_wr32(device, 0x409840, 0xffffffff);
+	nvkm_wr32(device, 0x409500, timeout);
+	nvkm_wr32(device, 0x409504, 0x00000021);
+}
+
 static bool
 gf100_gr_chsw_load(struct nvkm_gr *base)
 {
@@ -1487,6 +1692,7 @@ gf100_gr_init_ctxctl_ext(struct gf100_gr *gr)
 	struct nvkm_device *device = subdev->device;
 	struct nvkm_secboot *sb = device->secboot;
 	u32 secboot_mask = 0;
+	int ret;
 
 	/* load fuc microcode */
 	nvkm_mc_unk260(device, 0);
@@ -1495,12 +1701,12 @@ gf100_gr_init_ctxctl_ext(struct gf100_gr *gr)
 	if (nvkm_secboot_is_managed(sb, NVKM_SECBOOT_FALCON_FECS))
 		secboot_mask |= BIT(NVKM_SECBOOT_FALCON_FECS);
 	else
-		gf100_gr_init_fw(gr->fecs, &gr->fuc409c, &gr->fuc409d);
+		gf100_gr_init_fw(gr->fecs.falcon, &gr->fuc409c, &gr->fuc409d);
 
 	if (nvkm_secboot_is_managed(sb, NVKM_SECBOOT_FALCON_GPCCS))
 		secboot_mask |= BIT(NVKM_SECBOOT_FALCON_GPCCS);
 	else
-		gf100_gr_init_fw(gr->gpccs, &gr->fuc41ac, &gr->fuc41ad);
+		gf100_gr_init_fw(gr->gpccs.falcon, &gr->fuc41ac, &gr->fuc41ad);
 
 	if (secboot_mask != 0) {
 		int ret = nvkm_secboot_reset(sb, secboot_mask);
@@ -1515,8 +1721,8 @@ gf100_gr_init_ctxctl_ext(struct gf100_gr *gr)
 	nvkm_wr32(device, 0x41a10c, 0x00000000);
 	nvkm_wr32(device, 0x40910c, 0x00000000);
 
-	nvkm_falcon_start(gr->gpccs);
-	nvkm_falcon_start(gr->fecs);
+	nvkm_falcon_start(gr->gpccs.falcon);
+	nvkm_falcon_start(gr->fecs.falcon);
 
 	if (nvkm_msec(device, 2000,
 		if (nvkm_rd32(device, 0x409800) & 0x00000001)
@@ -1524,72 +1730,36 @@ gf100_gr_init_ctxctl_ext(struct gf100_gr *gr)
 	) < 0)
 		return -EBUSY;
 
-	nvkm_wr32(device, 0x409840, 0xffffffff);
-	nvkm_wr32(device, 0x409500, 0x7fffffff);
-	nvkm_wr32(device, 0x409504, 0x00000021);
+	gf100_gr_fecs_set_watchdog_timeout(gr, 0x7fffffff);
 
-	nvkm_wr32(device, 0x409840, 0xffffffff);
-	nvkm_wr32(device, 0x409500, 0x00000000);
-	nvkm_wr32(device, 0x409504, 0x00000010);
-	if (nvkm_msec(device, 2000,
-		if ((gr->size = nvkm_rd32(device, 0x409800)))
-			break;
-	) < 0)
-		return -EBUSY;
-
-	nvkm_wr32(device, 0x409840, 0xffffffff);
-	nvkm_wr32(device, 0x409500, 0x00000000);
-	nvkm_wr32(device, 0x409504, 0x00000016);
-	if (nvkm_msec(device, 2000,
-		if (nvkm_rd32(device, 0x409800))
-			break;
-	) < 0)
-		return -EBUSY;
+	/* Determine how much memory is required to store main context image. */
+	ret = gf100_gr_fecs_discover_image_size(gr, &gr->size);
+	if (ret)
+		return ret;
 
-	nvkm_wr32(device, 0x409840, 0xffffffff);
-	nvkm_wr32(device, 0x409500, 0x00000000);
-	nvkm_wr32(device, 0x409504, 0x00000025);
-	if (nvkm_msec(device, 2000,
-		if (nvkm_rd32(device, 0x409800))
-			break;
-	) < 0)
-		return -EBUSY;
+	/* Determine how much memory is required to store ZCULL image. */
+	ret = gf100_gr_fecs_discover_zcull_image_size(gr, &gr->size_zcull);
+	if (ret)
+		return ret;
 
-	if (device->chipset >= 0xe0) {
-		nvkm_wr32(device, 0x409800, 0x00000000);
-		nvkm_wr32(device, 0x409500, 0x00000001);
-		nvkm_wr32(device, 0x409504, 0x00000030);
-		if (nvkm_msec(device, 2000,
-			if (nvkm_rd32(device, 0x409800))
-				break;
-		) < 0)
-			return -EBUSY;
-
-		nvkm_wr32(device, 0x409810, 0xb00095c8);
-		nvkm_wr32(device, 0x409800, 0x00000000);
-		nvkm_wr32(device, 0x409500, 0x00000001);
-		nvkm_wr32(device, 0x409504, 0x00000031);
-		if (nvkm_msec(device, 2000,
-			if (nvkm_rd32(device, 0x409800))
-				break;
-		) < 0)
-			return -EBUSY;
-
-		nvkm_wr32(device, 0x409810, 0x00080420);
-		nvkm_wr32(device, 0x409800, 0x00000000);
-		nvkm_wr32(device, 0x409500, 0x00000001);
-		nvkm_wr32(device, 0x409504, 0x00000032);
-		if (nvkm_msec(device, 2000,
-			if (nvkm_rd32(device, 0x409800))
-				break;
-		) < 0)
-			return -EBUSY;
+	/* Determine how much memory is required to store PerfMon image. */
+	ret = gf100_gr_fecs_discover_pm_image_size(gr, &gr->size_pm);
+	if (ret)
+		return ret;
 
-		nvkm_wr32(device, 0x409614, 0x00000070);
-		nvkm_wr32(device, 0x409614, 0x00000770);
-		nvkm_wr32(device, 0x40802c, 0x00000001);
+	/*XXX: We (likely) require PMU support to even bother with this.
+	 *
+	 *     Also, it seems like not all GPUs support ELPG.  Traces I
+	 *     have here show RM enabling it on Kepler/Turing, but none
+	 *     of the GPUs between those.  NVGPU decides this by PCI ID.
+	 */
+	if (0) {
+		ret = gf100_gr_fecs_elpg_bind(gr);
+		if (ret)
+			return ret;
 	}
 
+	/* Generate golden context image. */
 	if (gr->data == NULL) {
 		int ret = gf100_grctx_generate(gr);
 		if (ret) {
@@ -1614,15 +1784,19 @@ gf100_gr_init_ctxctl_int(struct gf100_gr *gr)
 
 	/* load HUB microcode */
 	nvkm_mc_unk260(device, 0);
-	nvkm_falcon_load_dmem(gr->fecs, gr->func->fecs.ucode->data.data, 0x0,
+	nvkm_falcon_load_dmem(gr->fecs.falcon,
+			      gr->func->fecs.ucode->data.data, 0x0,
 			      gr->func->fecs.ucode->data.size, 0);
-	nvkm_falcon_load_imem(gr->fecs, gr->func->fecs.ucode->code.data, 0x0,
+	nvkm_falcon_load_imem(gr->fecs.falcon,
+			      gr->func->fecs.ucode->code.data, 0x0,
 			      gr->func->fecs.ucode->code.size, 0, 0, false);
 
 	/* load GPC microcode */
-	nvkm_falcon_load_dmem(gr->gpccs, gr->func->gpccs.ucode->data.data, 0x0,
+	nvkm_falcon_load_dmem(gr->gpccs.falcon,
+			      gr->func->gpccs.ucode->data.data, 0x0,
 			      gr->func->gpccs.ucode->data.size, 0);
-	nvkm_falcon_load_imem(gr->gpccs, gr->func->gpccs.ucode->code.data, 0x0,
+	nvkm_falcon_load_imem(gr->gpccs.falcon,
+			      gr->func->gpccs.ucode->code.data, 0x0,
 			      gr->func->gpccs.ucode->code.size, 0, 0, false);
 	nvkm_mc_unk260(device, 1);
 
@@ -1769,11 +1943,13 @@ gf100_gr_oneinit(struct nvkm_gr *base)
 	int i, j;
 	int ret;
 
-	ret = nvkm_falcon_v1_new(subdev, "FECS", 0x409000, &gr->fecs);
+	ret = nvkm_falcon_v1_new(subdev, "FECS", 0x409000, &gr->fecs.falcon);
 	if (ret)
 		return ret;
 
-	ret = nvkm_falcon_v1_new(subdev, "GPCCS", 0x41a000, &gr->gpccs);
+	mutex_init(&gr->fecs.mutex);
+
+	ret = nvkm_falcon_v1_new(subdev, "GPCCS", 0x41a000, &gr->gpccs.falcon);
 	if (ret)
 		return ret;
 
@@ -1816,11 +1992,11 @@ gf100_gr_init_(struct nvkm_gr *base)
 
 	nvkm_pmu_pgob(gr->base.engine.subdev.device->pmu, false);
 
-	ret = nvkm_falcon_get(gr->fecs, subdev);
+	ret = nvkm_falcon_get(gr->fecs.falcon, subdev);
 	if (ret)
 		return ret;
 
-	ret = nvkm_falcon_get(gr->gpccs, subdev);
+	ret = nvkm_falcon_get(gr->gpccs.falcon, subdev);
 	if (ret)
 		return ret;
 
@@ -1832,8 +2008,8 @@ gf100_gr_fini_(struct nvkm_gr *base, bool suspend)
 {
 	struct gf100_gr *gr = gf100_gr(base);
 	struct nvkm_subdev *subdev = &gr->base.engine.subdev;
-	nvkm_falcon_put(gr->gpccs, subdev);
-	nvkm_falcon_put(gr->fecs, subdev);
+	nvkm_falcon_put(gr->gpccs.falcon, subdev);
+	nvkm_falcon_put(gr->fecs.falcon, subdev);
 	return 0;
 }
 
@@ -1859,8 +2035,8 @@ gf100_gr_dtor(struct nvkm_gr *base)
 		gr->func->dtor(gr);
 	kfree(gr->data);
 
-	nvkm_falcon_del(&gr->gpccs);
-	nvkm_falcon_del(&gr->fecs);
+	nvkm_falcon_del(&gr->gpccs.falcon);
+	nvkm_falcon_del(&gr->fecs.falcon);
 
 	gf100_gr_dtor_fw(&gr->fuc409c);
 	gf100_gr_dtor_fw(&gr->fuc409d);
@@ -1886,6 +2062,9 @@ gf100_gr_ = {
 	.chan_new = gf100_gr_chan_new,
 	.object_get = gf100_gr_object_get,
 	.chsw_load = gf100_gr_chsw_load,
+	.ctxsw.pause = gf100_gr_fecs_stop_ctxsw,
+	.ctxsw.resume = gf100_gr_fecs_start_ctxsw,
+	.ctxsw.inst = gf100_gr_ctxsw_inst,
 };
 
 int
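
All of the new gf100_gr_fecs_*() helpers above share one handshake: write the method argument to 0x409500, write the method ID to 0x409504, then poll a mailbox (0x409800, or 0x409804 for the ctxsw enable/disable pair) for up to 2ms. A condensed sketch of that common shape, hedged in that it illustrates the pattern rather than being a helper the patch adds:

	/* Illustration of the FECS method handshake the helpers above share. */
	static int example_fecs_method(struct nvkm_device *device, u32 mthd, u32 arg)
	{
		nvkm_wr32(device, 0x409500, arg);  /* method argument */
		nvkm_wr32(device, 0x409504, mthd); /* method ID, kicks FECS */
		nvkm_msec(device, 2000,            /* poll mailbox, 2ms budget */
			if (nvkm_rd32(device, 0x409800))
				return 0;
		);
		return -ETIMEDOUT;
	}
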
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h
index dc46cf0131db..fafdd0bbea9b 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h
@@ -82,8 +82,16 @@ struct gf100_gr {
 	const struct gf100_gr_func *func;
 	struct nvkm_gr base;
 
-	struct nvkm_falcon *fecs;
-	struct nvkm_falcon *gpccs;
+	struct {
+		struct nvkm_falcon *falcon;
+		struct mutex mutex;
+		u32 disable;
+	} fecs;
+
+	struct {
+		struct nvkm_falcon *falcon;
+	} gpccs;
+
 	struct gf100_gr_fuc fuc409c;
 	struct gf100_gr_fuc fuc409d;
 	struct gf100_gr_fuc fuc41ac;
@@ -128,6 +136,8 @@ struct gf100_gr {
 	struct gf100_gr_mmio mmio_list[4096/8];
 	u32  size;
 	u32 *data;
+	u32 size_zcull;
+	u32 size_pm;
 };
 
 int gf100_gr_ctor(const struct gf100_gr_func *, struct nvkm_device *,
@@ -136,6 +146,8 @@ int gf100_gr_new_(const struct gf100_gr_func *, struct nvkm_device *,
 		  int, struct nvkm_gr **);
 void *gf100_gr_dtor(struct nvkm_gr *);
 
+int gf100_gr_fecs_bind_pointer(struct gf100_gr *, u32 inst);
+
 struct gf100_gr_func_zbc {
 	void (*clear_color)(struct gf100_gr *, int zbc);
 	void (*clear_depth)(struct gf100_gr *, int zbc);
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/priv.h b/drivers/gpu/drm/nouveau/nvkm/engine/gr/priv.h
index 66359c23cbce..d4d5601c51e7 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/priv.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/priv.h
@@ -27,6 +27,11 @@ struct nvkm_gr_func {
 	 */
 	u64 (*units)(struct nvkm_gr *);
 	bool (*chsw_load)(struct nvkm_gr *);
+	struct {
+		int (*pause)(struct nvkm_gr *);
+		int (*resume)(struct nvkm_gr *);
+		u32 (*inst)(struct nvkm_gr *);
+	} ctxsw;
 	struct nvkm_sclass sclass[];
 };
 
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/nvdec/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/nvdec/base.c
index 4807021fd990..4a63581bdd5e 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/nvdec/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/nvdec/base.c
@@ -21,13 +21,21 @@
  */
 #include "priv.h"
 
+#include <subdev/top.h>
 #include <engine/falcon.h>
 
 static int
 nvkm_nvdec_oneinit(struct nvkm_engine *engine)
 {
 	struct nvkm_nvdec *nvdec = nvkm_nvdec(engine);
-	return nvkm_falcon_v1_new(&nvdec->engine.subdev, "NVDEC", 0x84000,
+	struct nvkm_subdev *subdev = &nvdec->engine.subdev;
+
+	nvdec->addr = nvkm_top_addr(subdev->device, subdev->index);
+	if (!nvdec->addr)
+		return -EINVAL;
+
+	/*XXX: fix naming of this when adding support for multiple NVDECs */
+	return nvkm_falcon_v1_new(subdev, "NVDEC", nvdec->addr,
 				  &nvdec->falcon);
 }
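
NVDEC now discovers its falcon's PRI base from the device TOP tables instead of hard-coding 0x84000; SEC2 and GSP below gain the same lookup. The shared shape, shown as a fragment with names abbreviated from the three call sites:

	/* Common discovery pattern across NVDEC/SEC2/GSP in this series. */
	addr = nvkm_top_addr(subdev->device, subdev->index);
	if (!addr)
		return -EINVAL; /* no TOP entry describes this unit */
	return nvkm_falcon_v1_new(subdev, name, addr, &falcon);
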
 
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/sec2/Kbuild b/drivers/gpu/drm/nouveau/nvkm/engine/sec2/Kbuild
index 4b17254cfbd0..d9cdea7d9353 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/sec2/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/sec2/Kbuild
@@ -1,2 +1,3 @@
 nvkm-y += nvkm/engine/sec2/base.o
 nvkm-y += nvkm/engine/sec2/gp102.o
+nvkm-y += nvkm/engine/sec2/tu102.o
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/sec2/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/sec2/base.c
index f865d2a3e184..1b49e5b6717f 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/sec2/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/sec2/base.c
@@ -22,6 +22,7 @@
 #include "priv.h"
 
 #include <core/msgqueue.h>
+#include <subdev/top.h>
 #include <engine/falcon.h>
 
 static void *
@@ -39,18 +40,18 @@ nvkm_sec2_intr(struct nvkm_engine *engine)
 	struct nvkm_sec2 *sec2 = nvkm_sec2(engine);
 	struct nvkm_subdev *subdev = &engine->subdev;
 	struct nvkm_device *device = subdev->device;
-	u32 disp = nvkm_rd32(device, 0x8701c);
-	u32 intr = nvkm_rd32(device, 0x87008) & disp & ~(disp >> 16);
+	u32 disp = nvkm_rd32(device, sec2->addr + 0x01c);
+	u32 intr = nvkm_rd32(device, sec2->addr + 0x008) & disp & ~(disp >> 16);
 
 	if (intr & 0x00000040) {
 		schedule_work(&sec2->work);
-		nvkm_wr32(device, 0x87004, 0x00000040);
+		nvkm_wr32(device, sec2->addr + 0x004, 0x00000040);
 		intr &= ~0x00000040;
 	}
 
 	if (intr) {
 		nvkm_error(subdev, "unhandled intr %08x\n", intr);
-		nvkm_wr32(device, 0x87004, intr);
+		nvkm_wr32(device, sec2->addr + 0x004, intr);
 
 	}
 }
@@ -74,8 +75,15 @@ static int
 nvkm_sec2_oneinit(struct nvkm_engine *engine)
 {
 	struct nvkm_sec2 *sec2 = nvkm_sec2(engine);
-	return nvkm_falcon_v1_new(&sec2->engine.subdev, "SEC2", 0x87000,
-				  &sec2->falcon);
+	struct nvkm_subdev *subdev = &sec2->engine.subdev;
+
+	if (!sec2->addr) {
+		sec2->addr = nvkm_top_addr(subdev->device, subdev->index);
+		if (WARN_ON(!sec2->addr))
+			return -EINVAL;
+	}
+
+	return nvkm_falcon_v1_new(subdev, "SEC2", sec2->addr, &sec2->falcon);
 }
 
 static int
@@ -95,13 +103,14 @@ nvkm_sec2 = {
 };
 
 int
-nvkm_sec2_new_(struct nvkm_device *device, int index,
+nvkm_sec2_new_(struct nvkm_device *device, int index, u32 addr,
 	       struct nvkm_sec2 **psec2)
 {
 	struct nvkm_sec2 *sec2;
 
 	if (!(sec2 = *psec2 = kzalloc(sizeof(*sec2), GFP_KERNEL)))
 		return -ENOMEM;
+	sec2->addr = addr;
 	INIT_WORK(&sec2->work, nvkm_sec2_recv);
 
 	return nvkm_engine_ctor(&nvkm_sec2, device, index, true, &sec2->engine);
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/sec2/gp102.c b/drivers/gpu/drm/nouveau/nvkm/engine/sec2/gp102.c
index 9be1524c08f5..858cf27fa010 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/sec2/gp102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/sec2/gp102.c
@@ -26,5 +26,5 @@ int
 gp102_sec2_new(struct nvkm_device *device, int index,
 	       struct nvkm_sec2 **psec2)
 {
-	return nvkm_sec2_new_(device, index, psec2);
+	return nvkm_sec2_new_(device, index, 0, psec2);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/sec2/priv.h b/drivers/gpu/drm/nouveau/nvkm/engine/sec2/priv.h
index 2f97c806a79d..ab0165e2d1a3 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/sec2/priv.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/sec2/priv.h
@@ -5,6 +5,5 @@
 
 #define nvkm_sec2(p) container_of((p), struct nvkm_sec2, engine)
 
-int nvkm_sec2_new_(struct nvkm_device *, int, struct nvkm_sec2 **);
-
+int nvkm_sec2_new_(struct nvkm_device *, int, u32 addr, struct nvkm_sec2 **);
 #endif
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/dce120/i2caux_dce120.h b/drivers/gpu/drm/nouveau/nvkm/engine/sec2/tu102.c
index b6ac47617c70..d655576164b1 100644
--- a/drivers/gpu/drm/amd/display/dc/i2caux/dce120/i2caux_dce120.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/sec2/tu102.c
@@ -1,5 +1,5 @@
 /*
- * Copyright 2012-16 Advanced Micro Devices, Inc.
+ * Copyright 2019 Red Hat Inc.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -18,15 +18,16 @@
  * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
  * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
  */
 
-#ifndef __DAL_I2C_AUX_DCE120_H__
-#define __DAL_I2C_AUX_DCE120_H__
-
-struct i2caux *dal_i2caux_dce120_create(
-	struct dc_context *ctx);
+#include "priv.h"
 
-#endif /* __DAL_I2C_AUX_DCE120_H__ */
+int
+tu102_sec2_new(struct nvkm_device *device, int index,
+	       struct nvkm_sec2 **psec2)
+{
+	/* TOP info wasn't updated on Turing to reflect the PRI
+	 * address change for some reason.  We override it here.
+	 */
+	return nvkm_sec2_new_(device, index, 0x840000, psec2);
+}
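
With the base carried in sec2->addr, the interrupt handler's relative offsets resolve to the same registers the old constants named on GP102, while Turing picks up the relocated base from the override above. For example, assuming TOP reports 0x87000 for SEC2 on GP102, as the old hard-coded registers imply:

	/* GP102: sec2->addr == 0x87000, so addr + 0x01c == the old 0x8701c. */
	/* TU102: sec2->addr == 0x840000, so the same read hits 0x84001c.   */
	u32 disp = nvkm_rd32(device, sec2->addr + 0x01c);
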
diff --git a/drivers/gpu/drm/nouveau/nvkm/falcon/base.c b/drivers/gpu/drm/nouveau/nvkm/falcon/base.c
index 427340153640..366c87de6e72 100644
--- a/drivers/gpu/drm/nouveau/nvkm/falcon/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/falcon/base.c
@@ -204,6 +204,9 @@ nvkm_falcon_ctor(const struct nvkm_falcon_func *func,
 		debug_reg = 0x408;
 		falcon->has_emem = true;
 		break;
+	case NVKM_SUBDEV_GSP:
+		debug_reg = 0x0; /*XXX*/
+		break;
 	default:
 		nvkm_warn(subdev, "unsupported falcon %s!\n",
 			  nvkm_subdev_name[subdev->index]);
diff --git a/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c b/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c
index 771e16a16267..a8bee1e046aa 100644
--- a/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c
+++ b/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c
@@ -269,7 +269,7 @@ cmd_write(struct nvkm_msgqueue *priv, struct nvkm_msgqueue_hdr *cmd,
 		commit = false;
 	}
 
-	   cmd_queue_close(priv, queue, commit);
+	cmd_queue_close(priv, queue, commit);
 
 	return ret;
 }
@@ -347,7 +347,7 @@ nvkm_msgqueue_post(struct nvkm_msgqueue *priv, enum msgqueue_msg_priority prio,
 	ret = cmd_write(priv, cmd, queue);
 	if (ret) {
 		seq->state = SEQ_STATE_PENDING;
-		      msgqueue_seq_release(priv, seq);
+		msgqueue_seq_release(priv, seq);
 	}
 
 	return ret;
@@ -373,7 +373,7 @@ msgqueue_msg_handle(struct nvkm_msgqueue *priv, struct nvkm_msgqueue_hdr *hdr)
 	if (seq->completion)
 		complete(seq->completion);
 
-	   msgqueue_seq_release(priv, seq);
+	msgqueue_seq_release(priv, seq);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/Kbuild b/drivers/gpu/drm/nouveau/nvkm/subdev/Kbuild
index cfdffef1afb9..a339fe03d423 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/Kbuild
@@ -7,6 +7,7 @@ include $(src)/nvkm/subdev/fault/Kbuild
 include $(src)/nvkm/subdev/fb/Kbuild
 include $(src)/nvkm/subdev/fuse/Kbuild
 include $(src)/nvkm/subdev/gpio/Kbuild
+include $(src)/nvkm/subdev/gsp/Kbuild
 include $(src)/nvkm/subdev/i2c/Kbuild
 include $(src)/nvkm/subdev/ibus/Kbuild
 include $(src)/nvkm/subdev/iccsense/Kbuild
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/Kbuild b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/Kbuild
index ab0282dc0736..dc300600c019 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/Kbuild
@@ -5,4 +5,4 @@ nvkm-y += nvkm/subdev/bar/gf100.o
 nvkm-y += nvkm/subdev/bar/gk20a.o
 nvkm-y += nvkm/subdev/bar/gm107.o
 nvkm-y += nvkm/subdev/bar/gm20b.o
-nvkm-y += nvkm/subdev/bar/tu104.o
+nvkm-y += nvkm/subdev/bar/tu102.o
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/tu104.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/tu102.c
index ecaead156e9b..798f65ec3a86 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/tu104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/tu102.c
@@ -25,7 +25,7 @@
 #include <subdev/timer.h>
 
 static void
-tu104_bar_bar2_wait(struct nvkm_bar *bar)
+tu102_bar_bar2_wait(struct nvkm_bar *bar)
 {
 	struct nvkm_device *device = bar->subdev.device;
 	nvkm_msec(device, 2000,
@@ -35,13 +35,13 @@ tu104_bar_bar2_wait(struct nvkm_bar *bar)
 }
 
 static void
-tu104_bar_bar2_fini(struct nvkm_bar *bar)
+tu102_bar_bar2_fini(struct nvkm_bar *bar)
 {
 	nvkm_mask(bar->subdev.device, 0xb80f48, 0x80000000, 0x00000000);
 }
 
 static void
-tu104_bar_bar2_init(struct nvkm_bar *base)
+tu102_bar_bar2_init(struct nvkm_bar *base)
 {
 	struct nvkm_device *device = base->subdev.device;
 	struct gf100_bar *bar = gf100_bar(base);
@@ -52,7 +52,7 @@ tu104_bar_bar2_init(struct nvkm_bar *base)
 }
 
 static void
-tu104_bar_bar1_wait(struct nvkm_bar *bar)
+tu102_bar_bar1_wait(struct nvkm_bar *bar)
 {
 	struct nvkm_device *device = bar->subdev.device;
 	nvkm_msec(device, 2000,
@@ -62,13 +62,13 @@ tu104_bar_bar1_wait(struct nvkm_bar *bar)
 }
 
 static void
-tu104_bar_bar1_fini(struct nvkm_bar *bar)
+tu102_bar_bar1_fini(struct nvkm_bar *bar)
 {
 	nvkm_mask(bar->subdev.device, 0xb80f40, 0x80000000, 0x00000000);
 }
 
 static void
-tu104_bar_bar1_init(struct nvkm_bar *base)
+tu102_bar_bar1_init(struct nvkm_bar *base)
 {
 	struct nvkm_device *device = base->subdev.device;
 	struct gf100_bar *bar = gf100_bar(base);
@@ -77,22 +77,22 @@ tu104_bar_bar1_init(struct nvkm_bar *base)
 }
 
 static const struct nvkm_bar_func
-tu104_bar = {
+tu102_bar = {
 	.dtor = gf100_bar_dtor,
 	.oneinit = gf100_bar_oneinit,
-	.bar1.init = tu104_bar_bar1_init,
-	.bar1.fini = tu104_bar_bar1_fini,
-	.bar1.wait = tu104_bar_bar1_wait,
+	.bar1.init = tu102_bar_bar1_init,
+	.bar1.fini = tu102_bar_bar1_fini,
+	.bar1.wait = tu102_bar_bar1_wait,
 	.bar1.vmm = gf100_bar_bar1_vmm,
-	.bar2.init = tu104_bar_bar2_init,
-	.bar2.fini = tu104_bar_bar2_fini,
-	.bar2.wait = tu104_bar_bar2_wait,
+	.bar2.init = tu102_bar_bar2_init,
+	.bar2.fini = tu102_bar_bar2_fini,
+	.bar2.wait = tu102_bar_bar2_wait,
 	.bar2.vmm = gf100_bar_bar2_vmm,
 	.flush = g84_bar_flush,
 };
 
 int
-tu104_bar_new(struct nvkm_device *device, int index, struct nvkm_bar **pbar)
+tu102_bar_new(struct nvkm_device *device, int index, struct nvkm_bar **pbar)
 {
-	return gf100_bar_new_(&tu104_bar, device, index, pbar);
+	return gf100_bar_new_(&tu102_bar, device, index, pbar);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/dp.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/dp.c
index 3133b28f849c..b099d1209be8 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/dp.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/dp.c
@@ -212,7 +212,7 @@ nvbios_dpcfg_match(struct nvkm_bios *bios, u16 outp, u8 pc, u8 vs, u8 pe,
 	u16 data;
 
 	if (*ver >= 0x30) {
-		const u8 vsoff[] = { 0, 4, 7, 9 };
+		static const u8 vsoff[] = { 0, 4, 7, 9 };
 		idx = (pc * 10) + vsoff[vs] + pe;
 		if (*ver >= 0x40 && *ver <= 0x41 && *hdr >= 0x12)
 			idx += nvbios_rd08(bios, outp + 0x11) * 40;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c
index 9cc10e438b3d..ec0e9f7224b5 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c
@@ -806,12 +806,12 @@ init_generic_condition(struct nvbios_init *init)
 	init->offset += 3;
 
 	switch (cond) {
-	case 0:
+	case 0: /* CONDITION_ID_INT_DP. */
 		if (init_conn(init) != DCB_CONNECTOR_eDP)
 			init_exec_set(init, false);
 		break;
-	case 1:
-	case 2:
+	case 1: /* CONDITION_ID_USE_SPPLL0. */
+	case 2: /* CONDITION_ID_USE_SPPLL1. */
 		if ( init->outp &&
 		    (data = nvbios_dpout_match(bios, DCB_OUTPUT_DP,
 					       (init->outp->or << 0) |
@@ -826,10 +826,13 @@ init_generic_condition(struct nvbios_init *init)
 		if (init_exec(init))
 			warn("script needs dp output table data\n");
 		break;
-	case 5:
+	case 5: /* CONDITION_ID_ASSR_SUPPORT. */
 		if (!(init_rdauxr(init, 0x0d) & 1))
 			init_exec_set(init, false);
 		break;
+	case 7: /* CONDITION_ID_NO_PANEL_SEQ_DELAYS. */
+		init_exec_set(init, false);
+		break;
 	default:
 		warn("INIT_GENERIC_CONDITON: unknown 0x%02x\n", cond);
 		init->offset += size;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/Kbuild b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/Kbuild
index 3ef505a5c01b..f3c388932b6f 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/Kbuild
@@ -13,4 +13,4 @@ nvkm-y += nvkm/subdev/devinit/gf100.o
 nvkm-y += nvkm/subdev/devinit/gm107.o
 nvkm-y += nvkm/subdev/devinit/gm200.o
 nvkm-y += nvkm/subdev/devinit/gv100.o
-nvkm-y += nvkm/subdev/devinit/tu104.o
+nvkm-y += nvkm/subdev/devinit/tu102.o
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/tu104.c b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/tu102.c
index aae87b3fc429..397670e72fff 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/tu104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/tu102.c
@@ -26,7 +26,7 @@
 #include <subdev/clk/pll.h>
 
 static int
-tu104_devinit_pll_set(struct nvkm_devinit *init, u32 type, u32 freq)
+tu102_devinit_pll_set(struct nvkm_devinit *init, u32 type, u32 freq)
 {
 	struct nvkm_subdev *subdev = &init->subdev;
 	struct nvkm_device *device = subdev->device;
@@ -66,7 +66,7 @@ tu104_devinit_pll_set(struct nvkm_devinit *init, u32 type, u32 freq)
 }
 
 static int
-tu104_devinit_post(struct nvkm_devinit *base, bool post)
+tu102_devinit_post(struct nvkm_devinit *base, bool post)
 {
 	struct nv50_devinit *init = nv50_devinit(base);
 	gm200_devinit_preos(init, post);
@@ -74,16 +74,16 @@ tu104_devinit_post(struct nvkm_devinit *base, bool post)
 }
 
 static const struct nvkm_devinit_func
-tu104_devinit = {
+tu102_devinit = {
 	.init = nv50_devinit_init,
-	.post = tu104_devinit_post,
-	.pll_set = tu104_devinit_pll_set,
+	.post = tu102_devinit_post,
+	.pll_set = tu102_devinit_pll_set,
 	.disable = gm107_devinit_disable,
 };
 
 int
-tu104_devinit_new(struct nvkm_device *device, int index,
+tu102_devinit_new(struct nvkm_device *device, int index,
 		struct nvkm_devinit **pinit)
 {
-	return nv50_devinit_new_(&tu104_devinit, device, index, pinit);
+	return nv50_devinit_new_(&tu102_devinit, device, index, pinit);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/Kbuild b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/Kbuild
index 794eb1745b2f..42586267fc08 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/Kbuild
@@ -1,4 +1,5 @@
 nvkm-y += nvkm/subdev/fault/base.o
+nvkm-y += nvkm/subdev/fault/user.o
 nvkm-y += nvkm/subdev/fault/gp100.o
 nvkm-y += nvkm/subdev/fault/gv100.o
-nvkm-y += nvkm/subdev/fault/tu104.o
+nvkm-y += nvkm/subdev/fault/tu102.o
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c
index 4ba1e21e8fda..ca251560d3e0 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c
@@ -176,5 +176,7 @@ nvkm_fault_new_(const struct nvkm_fault_func *func, struct nvkm_device *device,
 		return -ENOMEM;
 	nvkm_subdev_ctor(&nvkm_fault, device, index, &fault->subdev);
 	fault->func = func;
+	fault->user.ctor = nvkm_ufault_new;
+	fault->user.base = func->user.base;
 	return 0;
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp100.c
index 8fb96fe614f9..4f3c4e091117 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gp100.c
@@ -23,6 +23,8 @@
 
 #include <subdev/mc.h>
 
+#include <nvif/class.h>
+
 static void
 gp100_fault_buffer_intr(struct nvkm_fault_buffer *buffer, bool enable)
 {
@@ -69,6 +71,7 @@ gp100_fault = {
 	.buffer.init = gp100_fault_buffer_init,
 	.buffer.fini = gp100_fault_buffer_fini,
 	.buffer.intr = gp100_fault_buffer_intr,
+	.user = { { 0, 0, MAXWELL_FAULT_BUFFER_A }, 0 },
 };
 
 int
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c
index 6fc54e17c935..6747f09c2dc3 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/gv100.c
@@ -25,6 +25,8 @@
 #include <subdev/mmu.h>
 #include <engine/fifo.h>
 
+#include <nvif/class.h>
+
 static void
 gv100_fault_buffer_process(struct nvkm_fault_buffer *buffer)
 {
@@ -166,6 +168,13 @@ gv100_fault_intr(struct nvkm_fault *fault)
 		}
 	}
 
+	if (stat & 0x08000000) {
+		if (fault->buffer[1]) {
+			nvkm_event_send(&fault->event, 1, 1, NULL, 0);
+			stat &= ~0x08000000;
+		}
+	}
+
 	if (stat) {
 		nvkm_debug(subdev, "intr %08x\n", stat);
 	}
@@ -208,6 +217,13 @@ gv100_fault = {
 	.buffer.init = gv100_fault_buffer_init,
 	.buffer.fini = gv100_fault_buffer_fini,
 	.buffer.intr = gv100_fault_buffer_intr,
+	/*TODO: Figure out how to expose the non-replayable fault buffer,
+	 *      which, for some reason, is where recoverable CE faults
+	 *      appear...
+	 *
+	 *      It's a bit tricky, as both NVKM and SVM will need access to
+	 *      the non-replayable fault buffer.
+	 */
+	.user = { { 0, 0, VOLTA_FAULT_BUFFER_A }, 1 },
 };
 
 int
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/priv.h b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/priv.h
index 8ca8b2876dad..975e66ac6344 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/priv.h
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/priv.h
@@ -34,7 +34,14 @@ struct nvkm_fault_func {
 		void (*fini)(struct nvkm_fault_buffer *);
 		void (*intr)(struct nvkm_fault_buffer *, bool enable);
 	} buffer;
+	struct {
+		struct nvkm_sclass base;
+		int rp;
+	} user;
 };
 
 int gv100_fault_oneinit(struct nvkm_fault *);
+
+int nvkm_ufault_new(struct nvkm_device *, const struct nvkm_oclass *,
+		    void *, u32, struct nvkm_object **);
 #endif
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/tu104.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/tu102.c
index 9c8a3adf99d7..fa1dfe5692b0 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/tu104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/tu102.c
@@ -28,7 +28,7 @@
 #include <nvif/class.h>
 
 static void
-tu104_fault_buffer_intr(struct nvkm_fault_buffer *buffer, bool enable)
+tu102_fault_buffer_intr(struct nvkm_fault_buffer *buffer, bool enable)
 {
 	/*XXX: Earlier versions of RM touched the old regs on Turing,
 	 *     which don't appear to actually work anymore, but newer
@@ -37,7 +37,7 @@ tu104_fault_buffer_intr(struct nvkm_fault_buffer *buffer, bool enable)
 }
 
 static void
-tu104_fault_buffer_fini(struct nvkm_fault_buffer *buffer)
+tu102_fault_buffer_fini(struct nvkm_fault_buffer *buffer)
 {
 	struct nvkm_device *device = buffer->fault->subdev.device;
 	const u32 foff = buffer->id * 0x20;
@@ -45,7 +45,7 @@ tu104_fault_buffer_fini(struct nvkm_fault_buffer *buffer)
 }
 
 static void
-tu104_fault_buffer_init(struct nvkm_fault_buffer *buffer)
+tu102_fault_buffer_init(struct nvkm_fault_buffer *buffer)
 {
 	struct nvkm_device *device = buffer->fault->subdev.device;
 	const u32 foff = buffer->id * 0x20;
@@ -57,7 +57,7 @@ tu104_fault_buffer_init(struct nvkm_fault_buffer *buffer)
 }
 
 static void
-tu104_fault_buffer_info(struct nvkm_fault_buffer *buffer)
+tu102_fault_buffer_info(struct nvkm_fault_buffer *buffer)
 {
 	struct nvkm_device *device = buffer->fault->subdev.device;
 	const u32 foff = buffer->id * 0x20;
@@ -70,7 +70,7 @@ tu104_fault_buffer_info(struct nvkm_fault_buffer *buffer)
 }
 
 static void
-tu104_fault_intr_fault(struct nvkm_fault *fault)
+tu102_fault_intr_fault(struct nvkm_fault *fault)
 {
 	struct nvkm_subdev *subdev = &fault->subdev;
 	struct nvkm_device *device = subdev->device;
@@ -96,14 +96,14 @@ tu104_fault_intr_fault(struct nvkm_fault *fault)
 }
 
 static void
-tu104_fault_intr(struct nvkm_fault *fault)
+tu102_fault_intr(struct nvkm_fault *fault)
 {
 	struct nvkm_subdev *subdev = &fault->subdev;
 	struct nvkm_device *device = subdev->device;
 	u32 stat = nvkm_rd32(device, 0xb83094);
 
 	if (stat & 0x80000000) {
-		tu104_fault_intr_fault(fault);
+		tu102_fault_intr_fault(fault);
 		nvkm_wr32(device, 0xb83094, 0x80000000);
 		stat &= ~0x80000000;
 	}
@@ -129,7 +129,7 @@ tu104_fault_intr(struct nvkm_fault *fault)
 }
 
 static void
-tu104_fault_fini(struct nvkm_fault *fault)
+tu102_fault_fini(struct nvkm_fault *fault)
 {
 	nvkm_notify_put(&fault->nrpfb);
 	if (fault->buffer[0])
@@ -138,7 +138,7 @@ tu104_fault_fini(struct nvkm_fault *fault)
 }
 
 static void
-tu104_fault_init(struct nvkm_fault *fault)
+tu102_fault_init(struct nvkm_fault *fault)
 {
 	/*XXX: enable priv faults */
 	fault->func->buffer.init(fault->buffer[0]);
@@ -146,22 +146,23 @@ tu104_fault_init(struct nvkm_fault *fault)
 }
 
 static const struct nvkm_fault_func
-tu104_fault = {
+tu102_fault = {
 	.oneinit = gv100_fault_oneinit,
-	.init = tu104_fault_init,
-	.fini = tu104_fault_fini,
-	.intr = tu104_fault_intr,
+	.init = tu102_fault_init,
+	.fini = tu102_fault_fini,
+	.intr = tu102_fault_intr,
 	.buffer.nr = 2,
 	.buffer.entry_size = 32,
-	.buffer.info = tu104_fault_buffer_info,
-	.buffer.init = tu104_fault_buffer_init,
-	.buffer.fini = tu104_fault_buffer_fini,
-	.buffer.intr = tu104_fault_buffer_intr,
+	.buffer.info = tu102_fault_buffer_info,
+	.buffer.init = tu102_fault_buffer_init,
+	.buffer.fini = tu102_fault_buffer_fini,
+	.buffer.intr = tu102_fault_buffer_intr,
+	.user = { { 0, 0, VOLTA_FAULT_BUFFER_A }, 1 },
 };
 
 int
-tu104_fault_new(struct nvkm_device *device, int index,
+tu102_fault_new(struct nvkm_device *device, int index,
 		struct nvkm_fault **pfault)
 {
-	return nvkm_fault_new_(&tu104_fault, device, index, pfault);
+	return nvkm_fault_new_(&tu102_fault, device, index, pfault);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/user.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/user.c
new file mode 100644
index 000000000000..ac835c9582fd
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/user.c
@@ -0,0 +1,106 @@
+/*
+ * Copyright 2018 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+#include "priv.h"
+
+#include <core/memory.h>
+#include <subdev/mmu.h>
+
+#include <nvif/clb069.h>
+#include <nvif/unpack.h>
+
+static int
+nvkm_ufault_map(struct nvkm_object *object, void *argv, u32 argc,
+		enum nvkm_object_map *type, u64 *addr, u64 *size)
+{
+	struct nvkm_fault_buffer *buffer = nvkm_fault_buffer(object);
+	struct nvkm_device *device = buffer->fault->subdev.device;
+	*type = NVKM_OBJECT_MAP_IO;
+	*addr = device->func->resource_addr(device, 3) + buffer->addr;
+	*size = nvkm_memory_size(buffer->mem);
+	return 0;
+}
+
+static int
+nvkm_ufault_ntfy(struct nvkm_object *object, u32 type,
+		 struct nvkm_event **pevent)
+{
+	struct nvkm_fault_buffer *buffer = nvkm_fault_buffer(object);
+	if (type == NVB069_V0_NTFY_FAULT) {
+		*pevent = &buffer->fault->event;
+		return 0;
+	}
+	return -EINVAL;
+}
+
+static int
+nvkm_ufault_fini(struct nvkm_object *object, bool suspend)
+{
+	struct nvkm_fault_buffer *buffer = nvkm_fault_buffer(object);
+	buffer->fault->func->buffer.fini(buffer);
+	return 0;
+}
+
+static int
+nvkm_ufault_init(struct nvkm_object *object)
+{
+	struct nvkm_fault_buffer *buffer = nvkm_fault_buffer(object);
+	buffer->fault->func->buffer.init(buffer);
+	return 0;
+}
+
+static void *
+nvkm_ufault_dtor(struct nvkm_object *object)
+{
+	return NULL;
+}
+
+static const struct nvkm_object_func
+nvkm_ufault = {
+	.dtor = nvkm_ufault_dtor,
+	.init = nvkm_ufault_init,
+	.fini = nvkm_ufault_fini,
+	.ntfy = nvkm_ufault_ntfy,
+	.map = nvkm_ufault_map,
+};
+
+int
+nvkm_ufault_new(struct nvkm_device *device, const struct nvkm_oclass *oclass,
+		void *argv, u32 argc, struct nvkm_object **pobject)
+{
+	union {
+		struct nvif_clb069_v0 v0;
+	} *args = argv;
+	struct nvkm_fault *fault = device->fault;
+	struct nvkm_fault_buffer *buffer = fault->buffer[fault->func->user.rp];
+	int ret = -ENOSYS;
+
+	if (!(ret = nvif_unpack(ret, &argv, &argc, args->v0, 0, 0, false))) {
+		args->v0.entries = buffer->entries;
+		args->v0.get = buffer->get;
+		args->v0.put = buffer->put;
+	} else
+		return ret;
+
+	nvkm_object_ctor(&nvkm_ufault, oclass, &buffer->object);
+	*pobject = &buffer->object;
+	return 0;
+}
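
A hedged sketch of the consumer side of the new fault-buffer object: allocation above returns entries/get/put in struct nvif_clb069_v0, map() exposes the ring through BAR, and NVB069_V0_NTFY_FAULT signals new records. Each record is .buffer.entry_size bytes (32 on tu102, per the hunk above). The loop below is hypothetical, and it assumes the indices wrap at 'entries', as rings of this shape usually do:

	/* Hypothetical consumer; 'map' is the mapped fault buffer. */
	static void example_drain_faults(void *map, u32 entries, u32 get, u32 put)
	{
		while (get != put) {
			const void *record = (const char *)map + get * 32;
			(void)record; /* decode and service the fault here */
			get = (get + 1) % entries;
		}
	}
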
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gddr3.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gddr3.c
index 60ece0a8a2e1..1d2d6bae73cd 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gddr3.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gddr3.c
@@ -87,7 +87,7 @@ nvkm_gddr3_calc(struct nvkm_ram *ram)
 		WR  = (ram->next->bios.timing[2] & 0x007f0000) >> 16;
 		/* XXX: Get these values from the VBIOS instead */
 		DLL = !(ram->mr[1] & 0x1);
-		RON = !(ram->mr[1] & 0x300) >> 8;
+		RON = !((ram->mr[1] & 0x300) >> 8);
 		break;
 	default:
 		return -ENOSYS;
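
The gddr3.c hunk above is a real fix rather than part of the rename churn: logical NOT binds tighter than the shift, so the old expression reduced (ram->mr[1] & 0x300) to 0 or 1 first and then shifted that right by 8, yielding 0 unconditionally. The fixed form extracts the field first and then negates it. A worked check (the mr[1] value is an arbitrary example):

	unsigned mr1 = 0x100;	/* any value with bits set within 0x300 */
	/* old: (!(mr1 & 0x300)) >> 8 == 0 >> 8 == 0   (always 0)          */
	/* new: !((mr1 & 0x300) >> 8) == !1      == 0   (1 when the field  */
	/*      is clear, so RON now actually tracks the MR1 bits)         */
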
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/Kbuild b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/Kbuild
new file mode 100644
index 000000000000..26fc6feb807e
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/Kbuild
@@ -0,0 +1 @@
+nvkm-y += nvkm/subdev/gsp/gv100.o
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/gv100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/gv100.c
new file mode 100644
index 000000000000..dccfaf1162e2
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/gv100.c
@@ -0,0 +1,62 @@
+/*
+ * Copyright 2019 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+#include <subdev/gsp.h>
+#include <subdev/top.h>
+#include <engine/falcon.h>
+
+static int
+gv100_gsp_oneinit(struct nvkm_subdev *subdev)
+{
+	struct nvkm_gsp *gsp = nvkm_gsp(subdev);
+
+	gsp->addr = nvkm_top_addr(subdev->device, subdev->index);
+	if (!gsp->addr)
+		return -EINVAL;
+
+	return nvkm_falcon_v1_new(subdev, "GSP", gsp->addr, &gsp->falcon);
+}
+
+static void *
+gv100_gsp_dtor(struct nvkm_subdev *subdev)
+{
+	struct nvkm_gsp *gsp = nvkm_gsp(subdev);
+	nvkm_falcon_del(&gsp->falcon);
+	return gsp;
+}
+
+static const struct nvkm_subdev_func
+gv100_gsp = {
+	.dtor = gv100_gsp_dtor,
+	.oneinit = gv100_gsp_oneinit,
+};
+
+int
+gv100_gsp_new(struct nvkm_device *device, int index, struct nvkm_gsp **pgsp)
+{
+	struct nvkm_gsp *gsp;
+
+	if (!(gsp = *pgsp = kzalloc(sizeof(*gsp), GFP_KERNEL)))
+		return -ENOMEM;
+
+	nvkm_subdev_ctor(&gv100_gsp, device, index, &gsp->subdev);
+	return 0;
+}
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mc/Kbuild b/drivers/gpu/drm/nouveau/nvkm/subdev/mc/Kbuild
index f3b06329c338..c64e399326b3 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mc/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mc/Kbuild
@@ -12,4 +12,4 @@ nvkm-y += nvkm/subdev/mc/gk104.o
 nvkm-y += nvkm/subdev/mc/gk20a.o
 nvkm-y += nvkm/subdev/mc/gp100.o
 nvkm-y += nvkm/subdev/mc/gp10b.o
-nvkm-y += nvkm/subdev/mc/tu104.o
+nvkm-y += nvkm/subdev/mc/tu102.o
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu104.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c
index b7165bd18999..d098c44a4fcb 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mc/tu102.c
@@ -22,7 +22,7 @@
 #include "priv.h"
 
 static void
-tu104_mc_intr_hack(struct nvkm_mc *mc, bool *handled)
+tu102_mc_intr_hack(struct nvkm_mc *mc, bool *handled)
 {
 	struct nvkm_device *device = mc->subdev.device;
 	u32 stat = nvkm_rd32(device, 0xb81010);
@@ -37,19 +37,19 @@ tu104_mc_intr_hack(struct nvkm_mc *mc, bool *handled)
 }
 
 static const struct nvkm_mc_func
-tu104_mc = {
+tu102_mc = {
 	.init = nv50_mc_init,
 	.intr = gp100_mc_intr,
 	.intr_unarm = gp100_mc_intr_unarm,
 	.intr_rearm = gp100_mc_intr_rearm,
 	.intr_mask = gp100_mc_intr_mask,
 	.intr_stat = gf100_mc_intr_stat,
-	.intr_hack = tu104_mc_intr_hack,
+	.intr_hack = tu102_mc_intr_hack,
 	.reset = gk104_mc_reset,
 };
 
 int
-tu104_mc_new(struct nvkm_device *device, int index, struct nvkm_mc **pmc)
+tu102_mc_new(struct nvkm_device *device, int index, struct nvkm_mc **pmc)
 {
-	return gp100_mc_new_(&tu104_mc, device, index, pmc);
+	return gp100_mc_new_(&tu102_mc, device, index, pmc);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/Kbuild b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/Kbuild
index 8966180b36cc..db9c56028f21 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/Kbuild
@@ -13,7 +13,7 @@ nvkm-y += nvkm/subdev/mmu/gm20b.o
 nvkm-y += nvkm/subdev/mmu/gp100.o
 nvkm-y += nvkm/subdev/mmu/gp10b.o
 nvkm-y += nvkm/subdev/mmu/gv100.o
-nvkm-y += nvkm/subdev/mmu/tu104.o
+nvkm-y += nvkm/subdev/mmu/tu102.o
 
 nvkm-y += nvkm/subdev/mmu/mem.o
 nvkm-y += nvkm/subdev/mmu/memnv04.o
@@ -34,7 +34,7 @@ nvkm-y += nvkm/subdev/mmu/vmmgm20b.o
 nvkm-y += nvkm/subdev/mmu/vmmgp100.o
 nvkm-y += nvkm/subdev/mmu/vmmgp10b.o
 nvkm-y += nvkm/subdev/mmu/vmmgv100.o
-nvkm-y += nvkm/subdev/mmu/vmmtu104.o
+nvkm-y += nvkm/subdev/mmu/vmmtu102.o
 
 nvkm-y += nvkm/subdev/mmu/umem.o
 nvkm-y += nvkm/subdev/mmu/ummu.o
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gp100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gp100.c
index 651b8805c67c..65cb9d28e60e 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gp100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gp100.c
@@ -31,7 +31,7 @@ gp100_mmu = {
 	.dma_bits = 47,
 	.mmu = {{ -1, -1, NVIF_CLASS_MMU_GF100}},
 	.mem = {{ -1,  0, NVIF_CLASS_MEM_GF100}, gf100_mem_new, gf100_mem_map },
-	.vmm = {{ -1, -1, NVIF_CLASS_VMM_GP100}, gp100_vmm_new },
+	.vmm = {{ -1,  0, NVIF_CLASS_VMM_GP100}, gp100_vmm_new },
 	.kind = gm200_mmu_kind,
 	.kind_sys = true,
 };
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gp10b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gp10b.c
index 3bd3db31e0bb..0a50be9a785a 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gp10b.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gp10b.c
@@ -31,7 +31,7 @@ gp10b_mmu = {
 	.dma_bits = 47,
 	.mmu = {{ -1, -1, NVIF_CLASS_MMU_GF100}},
 	.mem = {{ -1, -1, NVIF_CLASS_MEM_GF100}, .umap = gf100_mem_map },
-	.vmm = {{ -1, -1, NVIF_CLASS_VMM_GP100}, gp10b_vmm_new },
+	.vmm = {{ -1,  0, NVIF_CLASS_VMM_GP100}, gp10b_vmm_new },
 	.kind = gm200_mmu_kind,
 	.kind_sys = true,
 };
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gv100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gv100.c
index f666cb57f69e..e0997eedd6d9 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gv100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/gv100.c
@@ -31,7 +31,7 @@ gv100_mmu = {
 	.dma_bits = 47,
 	.mmu = {{ -1, -1, NVIF_CLASS_MMU_GF100}},
 	.mem = {{ -1,  0, NVIF_CLASS_MEM_GF100}, gf100_mem_new, gf100_mem_map },
-	.vmm = {{ -1, -1, NVIF_CLASS_VMM_GP100}, gv100_vmm_new },
+	.vmm = {{ -1,  0, NVIF_CLASS_VMM_GP100}, gv100_vmm_new },
 	.kind = gm200_mmu_kind,
 	.kind_sys = true,
 };
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/priv.h b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/priv.h
index 948a48c21be4..2ad1102a4e31 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/priv.h
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/priv.h
@@ -28,7 +28,7 @@ struct nvkm_mmu_func {
 
 	struct {
 		struct nvkm_sclass user;
-		int (*ctor)(struct nvkm_mmu *, u64 addr, u64 size,
+		int (*ctor)(struct nvkm_mmu *, bool managed, u64 addr, u64 size,
 			    void *argv, u32 argc, struct lock_class_key *,
 			    const char *name, struct nvkm_vmm **);
 		bool global;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/tu104.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/tu102.c
index 8e6f4096170d..c0db0ce10cba 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/tu104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/tu102.c
@@ -27,17 +27,17 @@
 #include <nvif/class.h>
 
 static const struct nvkm_mmu_func
-tu104_mmu = {
+tu102_mmu = {
 	.dma_bits = 47,
 	.mmu = {{ -1, -1, NVIF_CLASS_MMU_GF100}},
 	.mem = {{ -1,  0, NVIF_CLASS_MEM_GF100}, gf100_mem_new, gf100_mem_map },
-	.vmm = {{ -1,  0, NVIF_CLASS_VMM_GP100}, tu104_vmm_new },
+	.vmm = {{ -1,  0, NVIF_CLASS_VMM_GP100}, tu102_vmm_new },
 	.kind = gm200_mmu_kind,
 	.kind_sys = true,
 };
 
 int
-tu104_mmu_new(struct nvkm_device *device, int index, struct nvkm_mmu **pmmu)
+tu102_mmu_new(struct nvkm_device *device, int index, struct nvkm_mmu **pmmu)
 {
-	return nvkm_mmu_new_(&tu104_mmu, device, index, pmmu);
+	return nvkm_mmu_new_(&tu102_mmu, device, index, pmmu);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c
index 6889076097ec..c43b8248c682 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c
@@ -43,6 +43,69 @@ nvkm_uvmm_search(struct nvkm_client *client, u64 handle)
 }
 
 static int
+nvkm_uvmm_mthd_pfnclr(struct nvkm_uvmm *uvmm, void *argv, u32 argc)
+{
+	struct nvkm_client *client = uvmm->object.client;
+	union {
+		struct nvif_vmm_pfnclr_v0 v0;
+	} *args = argv;
+	struct nvkm_vmm *vmm = uvmm->vmm;
+	int ret = -ENOSYS;
+	u64 addr, size;
+
+	if (!(ret = nvif_unpack(ret, &argv, &argc, args->v0, 0, 0, false))) {
+		addr = args->v0.addr;
+		size = args->v0.size;
+	} else
+		return ret;
+
+	if (!client->super)
+		return -ENOENT;
+
+	if (size) {
+		mutex_lock(&vmm->mutex);
+		ret = nvkm_vmm_pfn_unmap(vmm, addr, size);
+		mutex_unlock(&vmm->mutex);
+	}
+
+	return ret;
+}
+
+static int
+nvkm_uvmm_mthd_pfnmap(struct nvkm_uvmm *uvmm, void *argv, u32 argc)
+{
+	struct nvkm_client *client = uvmm->object.client;
+	union {
+		struct nvif_vmm_pfnmap_v0 v0;
+	} *args = argv;
+	struct nvkm_vmm *vmm = uvmm->vmm;
+	int ret = -ENOSYS;
+	u64 addr, size, *phys;
+	u8  page;
+
+	if (!(ret = nvif_unpack(ret, &argv, &argc, args->v0, 0, 0, true))) {
+		page = args->v0.page;
+		addr = args->v0.addr;
+		size = args->v0.size;
+		phys = args->v0.phys;
+		if (argc != (size >> page) * sizeof(args->v0.phys[0]))
+			return -EINVAL;
+	} else
+		return ret;
+
+	if (!client->super)
+		return -ENOENT;
+
+	if (size) {
+		mutex_lock(&vmm->mutex);
+		ret = nvkm_vmm_pfn_map(vmm, page, addr, size, phys);
+		mutex_unlock(&vmm->mutex);
+	}
+
+	return ret;
+}
+
+static int
 nvkm_uvmm_mthd_unmap(struct nvkm_uvmm *uvmm, void *argv, u32 argc)
 {
 	struct nvkm_client *client = uvmm->object.client;
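
The PFNMAP handler above validates that the trailing PFN array matches the mapped range: argc (the bytes left after nvif_unpack() consumes the fixed header) must equal (size >> page) * sizeof(phys[0]). A caller-side sizing sketch — the struct layout is assumed from the fields the handler reads, not quoted from the nvif headers:

    #include <stdint.h>
    #include <stdlib.h>

    /* Hypothetical mirror of the v0 args: fixed header, then one
     * 64-bit PFN entry per (1 << page)-byte page in the range. */
    struct pfnmap_v0 {
            uint8_t  version, page, pad[6];
            uint64_t addr, size, phys[];
    };

    static struct pfnmap_v0 *pfnmap_args(uint8_t page, uint64_t addr,
                                         uint64_t size, size_t *argc)
    {
            size_t n = size >> page;        /* entries the handler expects */
            struct pfnmap_v0 *a = calloc(1, sizeof(*a) + n * sizeof(a->phys[0]));

            if (!a)
                    return NULL;
            a->page = page;
            a->addr = addr;
            a->size = size;
            *argc = sizeof(*a) + n * sizeof(a->phys[0]);
            return a;
    }
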
@@ -78,7 +141,7 @@ nvkm_uvmm_mthd_unmap(struct nvkm_uvmm *uvmm, void *argv, u32 argc)
 		goto done;
 	}
 
-	nvkm_vmm_unmap_locked(vmm, vma);
+	nvkm_vmm_unmap_locked(vmm, vma, false);
 	ret = 0;
 done:
 	mutex_unlock(&vmm->mutex);
@@ -124,6 +187,11 @@ nvkm_uvmm_mthd_map(struct nvkm_uvmm *uvmm, void *argv, u32 argc)
 		goto fail;
 	}
 
+	if (ret = -EINVAL, vma->mapped && !vma->memory) {
+		VMM_DEBUG(vmm, "pfnmap %016llx", addr);
+		goto fail;
+	}
+
 	if (ret = -EINVAL, vma->addr != addr || vma->size != size) {
 		if (addr + size > vma->addr + vma->size || vma->memory ||
 		    (vma->refd == NVKM_VMA_PAGE_NONE && !vma->mapref)) {
@@ -271,6 +339,15 @@ nvkm_uvmm_mthd(struct nvkm_object *object, u32 mthd, void *argv, u32 argc)
 	case NVIF_VMM_V0_PUT   : return nvkm_uvmm_mthd_put   (uvmm, argv, argc);
 	case NVIF_VMM_V0_MAP   : return nvkm_uvmm_mthd_map   (uvmm, argv, argc);
 	case NVIF_VMM_V0_UNMAP : return nvkm_uvmm_mthd_unmap (uvmm, argv, argc);
+	case NVIF_VMM_V0_PFNMAP: return nvkm_uvmm_mthd_pfnmap(uvmm, argv, argc);
+	case NVIF_VMM_V0_PFNCLR: return nvkm_uvmm_mthd_pfnclr(uvmm, argv, argc);
+	case NVIF_VMM_V0_MTHD(0x00) ... NVIF_VMM_V0_MTHD(0x7f):
+		if (uvmm->vmm->func->mthd) {
+			return uvmm->vmm->func->mthd(uvmm->vmm,
+						     uvmm->object.client,
+						     mthd, argv, argc);
+		}
+		break;
 	default:
 		break;
 	}
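
The `NVIF_VMM_V0_MTHD(0x00) ... NVIF_VMM_V0_MTHD(0x7f)` case above uses the GCC/Clang case-range extension (standard practice in kernel code) to forward a whole window of method numbers to the per-GPU backend when one is registered. The dispatch shape, reduced to a standalone sketch with made-up names:

    /* Illustration only: forward a 0x00..0x7f method window to a
     * backend hook when present, otherwise fall through. */
    typedef int (*mthd_fn)(unsigned mthd);

    static int dispatch(mthd_fn backend, unsigned mthd)
    {
            switch (mthd) {
            case 0x00 ... 0x7f:
                    if (backend)
                            return backend(mthd);
                    break;
            default:
                    break;
            }
            return -1;      /* cf. -EINVAL upstream */
    }
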
@@ -304,8 +381,10 @@ nvkm_uvmm_new(const struct nvkm_oclass *oclass, void *argv, u32 argc,
 	struct nvkm_uvmm *uvmm;
 	int ret = -ENOSYS;
 	u64 addr, size;
+	bool managed;
 
 	if (!(ret = nvif_unpack(ret, &argv, &argc, args->v0, 0, 0, more))) {
+		managed = args->v0.managed != 0;
 		addr = args->v0.addr;
 		size = args->v0.size;
 	} else
@@ -317,7 +396,7 @@ nvkm_uvmm_new(const struct nvkm_oclass *oclass, void *argv, u32 argc,
 	*pobject = &uvmm->object;
 
 	if (!mmu->vmm) {
-		ret = mmu->func->vmm.ctor(mmu, addr, size, argv, argc,
+		ret = mmu->func->vmm.ctor(mmu, managed, addr, size, argv, argc,
 					  NULL, "user", &uvmm->vmm);
 		if (ret)
 			return ret;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
index 6b87fff014b3..fa93f964e6a4 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
@@ -255,11 +255,23 @@ nvkm_vmm_unref_sptes(struct nvkm_vmm_iter *it, struct nvkm_vmm_pt *pgt,
 }
 
 static bool
-nvkm_vmm_unref_ptes(struct nvkm_vmm_iter *it, u32 ptei, u32 ptes)
+nvkm_vmm_unref_ptes(struct nvkm_vmm_iter *it, bool pfn, u32 ptei, u32 ptes)
 {
 	const struct nvkm_vmm_desc *desc = it->desc;
 	const int type = desc->type == SPT;
 	struct nvkm_vmm_pt *pgt = it->pt[0];
+	bool dma;
+
+	if (pfn) {
+		/* Need to clear PTE valid bits before we dma_unmap_page(). */
+		dma = desc->func->pfn_clear(it->vmm, pgt->pt[type], ptei, ptes);
+		if (dma) {
+			/* GPU may have cached the PT, flush before unmap. */
+			nvkm_vmm_flush_mark(it);
+			nvkm_vmm_flush(it);
+			desc->func->pfn_unmap(it->vmm, pgt->pt[type], ptei, ptes);
+		}
+	}
 
 	/* Drop PTE references. */
 	pgt->refs[type] -= ptes;
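
The ordering in the pfn branch above matters: valid bits are cleared first, the GPU's page-table cache is flushed, and only then is the page DMA-unmapped, so the GPU can never walk a stale cached PTE into a page the IOMMU no longer maps. The required sequence as a stubbed, standalone sketch (function names illustrative):

    #include <stdio.h>

    static void pte_clear_valid(void) { puts("1: clear PTE VALID bits"); }
    static void pt_cache_flush(void)  { puts("2: flush GPU PT cache/TLB"); }
    static void page_dma_unmap(void)  { puts("3: dma_unmap_page()"); }

    int main(void)
    {
            pte_clear_valid();      /* cf. desc->func->pfn_clear() */
            pt_cache_flush();       /* cf. nvkm_vmm_flush() */
            page_dma_unmap();       /* cf. desc->func->pfn_unmap() */
            return 0;
    }
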
@@ -349,7 +361,7 @@ nvkm_vmm_ref_sptes(struct nvkm_vmm_iter *it, struct nvkm_vmm_pt *pgt,
 }
 
 static bool
-nvkm_vmm_ref_ptes(struct nvkm_vmm_iter *it, u32 ptei, u32 ptes)
+nvkm_vmm_ref_ptes(struct nvkm_vmm_iter *it, bool pfn, u32 ptei, u32 ptes)
 {
 	const struct nvkm_vmm_desc *desc = it->desc;
 	const int type = desc->type == SPT;
@@ -379,7 +391,7 @@ nvkm_vmm_sparse_ptes(const struct nvkm_vmm_desc *desc,
 }
 
 static bool
-nvkm_vmm_sparse_unref_ptes(struct nvkm_vmm_iter *it, u32 ptei, u32 ptes)
+nvkm_vmm_sparse_unref_ptes(struct nvkm_vmm_iter *it, bool pfn, u32 ptei, u32 ptes)
 {
 	struct nvkm_vmm_pt *pt = it->pt[0];
 	if (it->desc->type == PGD)
@@ -387,14 +399,14 @@ nvkm_vmm_sparse_unref_ptes(struct nvkm_vmm_iter *it, u32 ptei, u32 ptes)
 	else
 	if (it->desc->type == LPT)
 		memset(&pt->pte[ptei], 0x00, sizeof(pt->pte[0]) * ptes);
-	return nvkm_vmm_unref_ptes(it, ptei, ptes);
+	return nvkm_vmm_unref_ptes(it, pfn, ptei, ptes);
 }
 
 static bool
-nvkm_vmm_sparse_ref_ptes(struct nvkm_vmm_iter *it, u32 ptei, u32 ptes)
+nvkm_vmm_sparse_ref_ptes(struct nvkm_vmm_iter *it, bool pfn, u32 ptei, u32 ptes)
 {
 	nvkm_vmm_sparse_ptes(it->desc, it->pt[0], ptei, ptes);
-	return nvkm_vmm_ref_ptes(it, ptei, ptes);
+	return nvkm_vmm_ref_ptes(it, pfn, ptei, ptes);
 }
 
 static bool
@@ -487,8 +499,8 @@ nvkm_vmm_ref_swpt(struct nvkm_vmm_iter *it, struct nvkm_vmm_pt *pgd, u32 pdei)
 
 static inline u64
 nvkm_vmm_iter(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page,
-	      u64 addr, u64 size, const char *name, bool ref,
-	      bool (*REF_PTES)(struct nvkm_vmm_iter *, u32, u32),
+	      u64 addr, u64 size, const char *name, bool ref, bool pfn,
+	      bool (*REF_PTES)(struct nvkm_vmm_iter *, bool pfn, u32, u32),
 	      nvkm_vmm_pte_func MAP_PTES, struct nvkm_vmm_map *map,
 	      nvkm_vmm_pxe_func CLR_PTES)
 {
@@ -548,7 +560,7 @@ nvkm_vmm_iter(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page,
 		}
 
 		/* Handle PTE updates. */
-		if (!REF_PTES || REF_PTES(&it, ptei, ptes)) {
+		if (!REF_PTES || REF_PTES(&it, pfn, ptei, ptes)) {
 			struct nvkm_mmu_pt *pt = pgt->pt[type];
 			if (MAP_PTES || CLR_PTES) {
 				if (MAP_PTES)
@@ -590,7 +602,7 @@ static void
 nvkm_vmm_ptes_sparse_put(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page,
 			 u64 addr, u64 size)
 {
-	nvkm_vmm_iter(vmm, page, addr, size, "sparse unref", false,
+	nvkm_vmm_iter(vmm, page, addr, size, "sparse unref", false, false,
 		      nvkm_vmm_sparse_unref_ptes, NULL, NULL,
 		      page->desc->func->invalid ?
 		      page->desc->func->invalid : page->desc->func->unmap);
@@ -602,8 +614,8 @@ nvkm_vmm_ptes_sparse_get(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page,
 {
 	if ((page->type & NVKM_VMM_PAGE_SPARSE)) {
 		u64 fail = nvkm_vmm_iter(vmm, page, addr, size, "sparse ref",
-					 true, nvkm_vmm_sparse_ref_ptes, NULL,
-					 NULL, page->desc->func->sparse);
+					 true, false, nvkm_vmm_sparse_ref_ptes,
+					 NULL, NULL, page->desc->func->sparse);
 		if (fail != ~0ULL) {
 			if ((size = fail - addr))
 				nvkm_vmm_ptes_sparse_put(vmm, page, addr, size);
@@ -666,11 +678,11 @@ nvkm_vmm_ptes_sparse(struct nvkm_vmm *vmm, u64 addr, u64 size, bool ref)
 
 static void
 nvkm_vmm_ptes_unmap_put(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page,
-			u64 addr, u64 size, bool sparse)
+			u64 addr, u64 size, bool sparse, bool pfn)
 {
 	const struct nvkm_vmm_desc_func *func = page->desc->func;
 	nvkm_vmm_iter(vmm, page, addr, size, "unmap + unref",
-		      false, nvkm_vmm_unref_ptes, NULL, NULL,
+		      false, pfn, nvkm_vmm_unref_ptes, NULL, NULL,
 		      sparse ? func->sparse : func->invalid ? func->invalid :
 							      func->unmap);
 }
@@ -681,10 +693,10 @@ nvkm_vmm_ptes_get_map(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page,
 		      nvkm_vmm_pte_func func)
 {
 	u64 fail = nvkm_vmm_iter(vmm, page, addr, size, "ref + map", true,
-				 nvkm_vmm_ref_ptes, func, map, NULL);
+				 false, nvkm_vmm_ref_ptes, func, map, NULL);
 	if (fail != ~0ULL) {
 		if ((size = fail - addr))
-			nvkm_vmm_ptes_unmap_put(vmm, page, addr, size, false);
+			nvkm_vmm_ptes_unmap_put(vmm, page, addr, size, false, false);
 		return -ENOMEM;
 	}
 	return 0;
@@ -692,10 +704,11 @@ nvkm_vmm_ptes_get_map(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page,
 
 static void
 nvkm_vmm_ptes_unmap(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page,
-		    u64 addr, u64 size, bool sparse)
+		    u64 addr, u64 size, bool sparse, bool pfn)
 {
 	const struct nvkm_vmm_desc_func *func = page->desc->func;
-	nvkm_vmm_iter(vmm, page, addr, size, "unmap", false, NULL, NULL, NULL,
+	nvkm_vmm_iter(vmm, page, addr, size, "unmap", false, pfn,
+		      NULL, NULL, NULL,
 		      sparse ? func->sparse : func->invalid ? func->invalid :
 							      func->unmap);
 }
@@ -705,7 +718,7 @@ nvkm_vmm_ptes_map(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page,
 		  u64 addr, u64 size, struct nvkm_vmm_map *map,
 		  nvkm_vmm_pte_func func)
 {
-	nvkm_vmm_iter(vmm, page, addr, size, "map", false,
+	nvkm_vmm_iter(vmm, page, addr, size, "map", false, false,
 		      NULL, func, map, NULL);
 }
 
@@ -713,7 +726,7 @@ static void
 nvkm_vmm_ptes_put(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page,
 		  u64 addr, u64 size)
 {
-	nvkm_vmm_iter(vmm, page, addr, size, "unref", false,
+	nvkm_vmm_iter(vmm, page, addr, size, "unref", false, false,
 		      nvkm_vmm_unref_ptes, NULL, NULL, NULL);
 }
 
@@ -721,7 +734,7 @@ static int
 nvkm_vmm_ptes_get(struct nvkm_vmm *vmm, const struct nvkm_vmm_page *page,
 		  u64 addr, u64 size)
 {
-	u64 fail = nvkm_vmm_iter(vmm, page, addr, size, "ref", true,
+	u64 fail = nvkm_vmm_iter(vmm, page, addr, size, "ref", true, false,
 				 nvkm_vmm_ref_ptes, NULL, NULL, NULL);
 	if (fail != ~0ULL) {
 		if (fail != addr)
@@ -763,6 +776,7 @@ nvkm_vma_tail(struct nvkm_vma *vma, u64 tail)
 	new->part = vma->part;
 	new->user = vma->user;
 	new->busy = vma->busy;
+	new->mapped = vma->mapped;
 	list_add(&new->head, &vma->head);
 	return new;
 }
@@ -935,11 +949,40 @@ nvkm_vmm_node_split(struct nvkm_vmm *vmm,
 }
 
 static void
+nvkm_vma_dump(struct nvkm_vma *vma)
+{
+	printk(KERN_ERR "%016llx %016llx %c%c%c%c%c%c%c%c%c %p\n",
+	       vma->addr, (u64)vma->size,
+	       vma->used ? '-' : 'F',
+	       vma->mapref ? 'R' : '-',
+	       vma->sparse ? 'S' : '-',
+	       vma->page != NVKM_VMA_PAGE_NONE ? '0' + vma->page : '-',
+	       vma->refd != NVKM_VMA_PAGE_NONE ? '0' + vma->refd : '-',
+	       vma->part ? 'P' : '-',
+	       vma->user ? 'U' : '-',
+	       vma->busy ? 'B' : '-',
+	       vma->mapped ? 'M' : '-',
+	       vma->memory);
+}
+
+static void
+nvkm_vmm_dump(struct nvkm_vmm *vmm)
+{
+	struct nvkm_vma *vma;
+	list_for_each_entry(vma, &vmm->list, head) {
+		nvkm_vma_dump(vma);
+	}
+}
+
+static void
 nvkm_vmm_dtor(struct nvkm_vmm *vmm)
 {
 	struct nvkm_vma *vma;
 	struct rb_node *node;
 
+	if (0)
+		nvkm_vmm_dump(vmm);
+
 	while ((node = rb_first(&vmm->root))) {
 		struct nvkm_vma *vma = rb_entry(node, typeof(*vma), tree);
 		nvkm_vmm_put(vmm, &vma);
@@ -972,16 +1015,32 @@ nvkm_vmm_dtor(struct nvkm_vmm *vmm)
 	}
 }
 
+static int
+nvkm_vmm_ctor_managed(struct nvkm_vmm *vmm, u64 addr, u64 size)
+{
+	struct nvkm_vma *vma;
+	if (!(vma = nvkm_vma_new(addr, size)))
+		return -ENOMEM;
+	vma->mapref = true;
+	vma->sparse = false;
+	vma->used = true;
+	vma->user = true;
+	nvkm_vmm_node_insert(vmm, vma);
+	list_add_tail(&vma->head, &vmm->list);
+	return 0;
+}
+
 int
 nvkm_vmm_ctor(const struct nvkm_vmm_func *func, struct nvkm_mmu *mmu,
-	      u32 pd_header, u64 addr, u64 size, struct lock_class_key *key,
-	      const char *name, struct nvkm_vmm *vmm)
+	      u32 pd_header, bool managed, u64 addr, u64 size,
+	      struct lock_class_key *key, const char *name,
+	      struct nvkm_vmm *vmm)
 {
 	static struct lock_class_key _key;
 	const struct nvkm_vmm_page *page = func->page;
 	const struct nvkm_vmm_desc *desc;
 	struct nvkm_vma *vma;
-	int levels, bits = 0;
+	int levels, bits = 0, ret;
 
 	vmm->func = func;
 	vmm->mmu = mmu;
@@ -1009,11 +1068,6 @@ nvkm_vmm_ctor(const struct nvkm_vmm_func *func, struct nvkm_mmu *mmu,
 	if (WARN_ON(levels > NVKM_VMM_LEVELS_MAX))
 		return -EINVAL;
 
-	vmm->start = addr;
-	vmm->limit = size ? (addr + size) : (1ULL << bits);
-	if (vmm->start > vmm->limit || vmm->limit > (1ULL << bits))
-		return -EINVAL;
-
 	/* Allocate top-level page table. */
 	vmm->pd = nvkm_vmm_pt_new(desc, false, NULL);
 	if (!vmm->pd)
@@ -1036,50 +1090,273 @@ nvkm_vmm_ctor(const struct nvkm_vmm_func *func, struct nvkm_mmu *mmu,
 	vmm->free = RB_ROOT;
 	vmm->root = RB_ROOT;
 
-	if (!(vma = nvkm_vma_new(vmm->start, vmm->limit - vmm->start)))
-		return -ENOMEM;
+	if (managed) {
+		/* Address-space will be managed by the client for the most
+		 * part, except for a specified area where NVKM allocations
+		 * are allowed to be placed.
+		 */
+		vmm->start = 0;
+		vmm->limit = 1ULL << bits;
+		if (addr + size < addr || addr + size > vmm->limit)
+			return -EINVAL;
+
+		/* Client-managed area before the NVKM-managed area. */
+		if (addr && (ret = nvkm_vmm_ctor_managed(vmm, 0, addr)))
+			return ret;
+
+		/* NVKM-managed area. */
+		if (size) {
+			if (!(vma = nvkm_vma_new(addr, size)))
+				return -ENOMEM;
+			nvkm_vmm_free_insert(vmm, vma);
+			list_add_tail(&vma->head, &vmm->list);
+		}
+
+		/* Client-managed area after the NVKM-managed area. */
+		addr = addr + size;
+		size = vmm->limit - addr;
+		if (size && (ret = nvkm_vmm_ctor_managed(vmm, addr, size)))
+			return ret;
+	} else {
+		/* Address-space fully managed by NVKM, requiring calls to
+		 * nvkm_vmm_get()/nvkm_vmm_put() to allocate address-space.
+		 */
+		vmm->start = addr;
+		vmm->limit = size ? (addr + size) : (1ULL << bits);
+		if (vmm->start > vmm->limit || vmm->limit > (1ULL << bits))
+			return -EINVAL;
+
+		if (!(vma = nvkm_vma_new(vmm->start, vmm->limit - vmm->start)))
+			return -ENOMEM;
+
+		nvkm_vmm_free_insert(vmm, vma);
+		list_add(&vma->head, &vmm->list);
+	}
 
-	nvkm_vmm_free_insert(vmm, vma);
-	list_add(&vma->head, &vmm->list);
 	return 0;
 }
 
 int
 nvkm_vmm_new_(const struct nvkm_vmm_func *func, struct nvkm_mmu *mmu,
-	      u32 hdr, u64 addr, u64 size, struct lock_class_key *key,
-	      const char *name, struct nvkm_vmm **pvmm)
+	      u32 hdr, bool managed, u64 addr, u64 size,
+	      struct lock_class_key *key, const char *name,
+	      struct nvkm_vmm **pvmm)
 {
 	if (!(*pvmm = kzalloc(sizeof(**pvmm), GFP_KERNEL)))
 		return -ENOMEM;
-	return nvkm_vmm_ctor(func, mmu, hdr, addr, size, key, name, *pvmm);
+	return nvkm_vmm_ctor(func, mmu, hdr, managed, addr, size, key, name, *pvmm);
+}
+
+static struct nvkm_vma *
+nvkm_vmm_pfn_split_merge(struct nvkm_vmm *vmm, struct nvkm_vma *vma,
+			 u64 addr, u64 size, u8 page, bool map)
+{
+	struct nvkm_vma *prev = NULL;
+	struct nvkm_vma *next = NULL;
+
+	if (vma->addr == addr && vma->part && (prev = node(vma, prev))) {
+		if (prev->memory || prev->mapped != map)
+			prev = NULL;
+	}
+
+	if (vma->addr + vma->size == addr + size && (next = node(vma, next))) {
+		if (!next->part || next->memory || next->mapped != map)
+			next = NULL;
+	}
+
+	if (prev || next)
+		return nvkm_vmm_node_merge(vmm, prev, vma, next, size);
+	return nvkm_vmm_node_split(vmm, vma, addr, size);
+}
+
+int
+nvkm_vmm_pfn_unmap(struct nvkm_vmm *vmm, u64 addr, u64 size)
+{
+	struct nvkm_vma *vma = nvkm_vmm_node_search(vmm, addr);
+	struct nvkm_vma *next;
+	u64 limit = addr + size;
+	u64 start = addr;
+
+	if (!vma)
+		return -EINVAL;
+
+	do {
+		if (!vma->mapped || vma->memory)
+			continue;
+
+		size = min(limit - start, vma->size - (start - vma->addr));
+
+		nvkm_vmm_ptes_unmap_put(vmm, &vmm->func->page[vma->refd],
+					start, size, false, true);
+
+		next = nvkm_vmm_pfn_split_merge(vmm, vma, start, size, 0, false);
+		if (!WARN_ON(!next)) {
+			vma = next;
+			vma->refd = NVKM_VMA_PAGE_NONE;
+			vma->mapped = false;
+		}
+	} while ((vma = node(vma, next)) && (start = vma->addr) < limit);
+
+	return 0;
+}
+
+/*TODO:
+ * - Avoid PT readback (for dma_unmap etc.); this might end up being handled
+ *   inside HMM, which would be a lot nicer for us to deal with.
+ * - Multiple page sizes (particularly for huge page support).
+ * - Support for systems without a 4KiB page size.
+ */
+int
+nvkm_vmm_pfn_map(struct nvkm_vmm *vmm, u8 shift, u64 addr, u64 size, u64 *pfn)
+{
+	const struct nvkm_vmm_page *page = vmm->func->page;
+	struct nvkm_vma *vma, *tmp;
+	u64 limit = addr + size;
+	u64 start = addr;
+	int pm = size >> shift;
+	int pi = 0;
+
+	/* Only support mapping where the page size of the incoming page
+	 * array matches a page size available for direct mapping.
+	 */
+	while (page->shift && page->shift != shift &&
+	       page->desc->func->pfn == NULL)
+		page++;
+
+	if (!page->shift || !IS_ALIGNED(addr, 1ULL << shift) ||
+			    !IS_ALIGNED(size, 1ULL << shift) ||
+	    addr + size < addr || addr + size > vmm->limit) {
+		VMM_DEBUG(vmm, "paged map %d %d %016llx %016llx\n",
+			  shift, page->shift, addr, size);
+		return -EINVAL;
+	}
+
+	if (!(vma = nvkm_vmm_node_search(vmm, addr)))
+		return -ENOENT;
+
+	do {
+		bool map = !!(pfn[pi] & NVKM_VMM_PFN_V);
+		bool mapped = vma->mapped;
+		u64 size = limit - start;
+		u64 addr = start;
+		int pn, ret = 0;
+
+		/* Narrow the operation window to cover a single action (either
+		 * mapping or unmapping pages) within a single VMA.
+		 */
+		for (pn = 0; pi + pn < pm; pn++) {
+			if (map != !!(pfn[pi + pn] & NVKM_VMM_PFN_V))
+				break;
+		}
+		size = min_t(u64, size, pn << page->shift);
+		size = min_t(u64, size, vma->size + vma->addr - addr);
+
+		/* Reject any operation on unmanaged regions, and on areas that
+		 * already have nvkm_memory objects mapped into them.
+		 */
+		if (!vma->mapref || vma->memory) {
+			ret = -EINVAL;
+			goto next;
+		}
+
+		/* In order to both properly refcount GPU page tables, and
+		 * prevent "normal" mappings and these direct mappings from
+		 * interfering with each other, we need to track contiguous
+		 * ranges that have been mapped with this interface.
+		 *
+		 * Here we attempt to either split an existing VMA so we're
+		 * able to flag the region as either unmapped/mapped, or to
+		 * merge with adjacent VMAs that are already compatible.
+		 *
+		 * If the region is already compatible, nothing is required.
+		 */
+		if (map != mapped) {
+			tmp = nvkm_vmm_pfn_split_merge(vmm, vma, addr, size,
+						       page - vmm->func->page,
+						       map);
+			if (WARN_ON(!tmp)) {
+				ret = -ENOMEM;
+				goto next;
+			}
+
+			if ((tmp->mapped = map))
+				tmp->refd = page - vmm->func->page;
+			else
+				tmp->refd = NVKM_VMA_PAGE_NONE;
+			vma = tmp;
+		}
+
+		/* Update HW page tables. */
+		if (map) {
+			struct nvkm_vmm_map args;
+			args.page = page;
+			args.pfn = &pfn[pi];
+
+			if (!mapped) {
+				ret = nvkm_vmm_ptes_get_map(vmm, page, addr,
+							    size, &args,
+							    page->desc->func->pfn);
+			} else {
+				nvkm_vmm_ptes_map(vmm, page, addr, size, &args,
+						  page->desc->func->pfn);
+			}
+		} else {
+			if (mapped) {
+				nvkm_vmm_ptes_unmap_put(vmm, page, addr, size,
+							false, true);
+			}
+		}
+
+next:
+		/* Iterate to next operation. */
+		if (vma->addr + vma->size == addr + size)
+			vma = node(vma, next);
+		start += size;
+
+		if (ret) {
+			/* Failure is signalled by clearing the valid bit on
+			 * any PFN that couldn't be modified as requested.
+			 */
+			while (size) {
+				pfn[pi++] = NVKM_VMM_PFN_NONE;
+				size -= 1 << page->shift;
+			}
+		} else {
+			pi += size >> page->shift;
+		}
+	} while (vma && start < limit);
+
+	return 0;
 }
 
 void
 nvkm_vmm_unmap_region(struct nvkm_vmm *vmm, struct nvkm_vma *vma)
 {
-	struct nvkm_vma *next = node(vma, next);
 	struct nvkm_vma *prev = NULL;
+	struct nvkm_vma *next;
 
 	nvkm_memory_tags_put(vma->memory, vmm->mmu->subdev.device, &vma->tags);
 	nvkm_memory_unref(&vma->memory);
+	vma->mapped = false;
 
-	if (!vma->part || ((prev = node(vma, prev)), prev->memory))
+	if (vma->part && (prev = node(vma, prev)) && prev->mapped)
 		prev = NULL;
-	if (!next->part || next->memory)
+	if ((next = node(vma, next)) && (!next->part || next->mapped))
 		next = NULL;
 	nvkm_vmm_node_merge(vmm, prev, vma, next, vma->size);
 }
 
 void
-nvkm_vmm_unmap_locked(struct nvkm_vmm *vmm, struct nvkm_vma *vma)
+nvkm_vmm_unmap_locked(struct nvkm_vmm *vmm, struct nvkm_vma *vma, bool pfn)
 {
 	const struct nvkm_vmm_page *page = &vmm->func->page[vma->refd];
 
 	if (vma->mapref) {
-		nvkm_vmm_ptes_unmap_put(vmm, page, vma->addr, vma->size, vma->sparse);
+		nvkm_vmm_ptes_unmap_put(vmm, page, vma->addr, vma->size, vma->sparse, pfn);
 		vma->refd = NVKM_VMA_PAGE_NONE;
 	} else {
-		nvkm_vmm_ptes_unmap(vmm, page, vma->addr, vma->size, vma->sparse);
+		nvkm_vmm_ptes_unmap(vmm, page, vma->addr, vma->size, vma->sparse, pfn);
 	}
 
 	nvkm_vmm_unmap_region(vmm, vma);
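
The heart of nvkm_vmm_pfn_map() above is the windowing step: it walks pfn[] in runs of entries that agree on the valid bit, clamping each run to the current VMA, so every loop iteration performs exactly one map or one unmap. That run-length computation, isolated as a minimal sketch:

    #include <stdint.h>

    #define PFN_V 0x0000000000000001ULL     /* NVKM_VMM_PFN_V */

    /* Length of the run starting at pi whose entries agree with
     * pfn[pi] on the valid bit; mirrors the pn loop above. */
    static int pfn_run(const uint64_t *pfn, int pi, int pm)
    {
            int map = !!(pfn[pi] & PFN_V);
            int pn;

            for (pn = 0; pi + pn < pm; pn++) {
                    if (map != !!(pfn[pi + pn] & PFN_V))
                            break;
            }
            return pn;
    }
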
@@ -1090,7 +1367,7 @@ nvkm_vmm_unmap(struct nvkm_vmm *vmm, struct nvkm_vma *vma)
 {
 	if (vma->memory) {
 		mutex_lock(&vmm->mutex);
-		nvkm_vmm_unmap_locked(vmm, vma);
+		nvkm_vmm_unmap_locked(vmm, vma, false);
 		mutex_unlock(&vmm->mutex);
 	}
 }
@@ -1224,6 +1501,7 @@ nvkm_vmm_map_locked(struct nvkm_vmm *vmm, struct nvkm_vma *vma,
 	nvkm_memory_tags_put(vma->memory, vmm->mmu->subdev.device, &vma->tags);
 	nvkm_memory_unref(&vma->memory);
 	vma->memory = nvkm_memory_ref(map->memory);
+	vma->mapped = true;
 	vma->tags = map->tags;
 	return 0;
 }
@@ -1269,14 +1547,16 @@ nvkm_vmm_put_locked(struct nvkm_vmm *vmm, struct nvkm_vma *vma)
 
 	if (vma->mapref || !vma->sparse) {
 		do {
-			const bool map = next->memory != NULL;
+			const bool mem = next->memory != NULL;
+			const bool map = next->mapped;
 			const u8  refd = next->refd;
 			const u64 addr = next->addr;
 			u64 size = next->size;
 
 			/* Merge regions that are in the same state. */
 			while ((next = node(next, next)) && next->part &&
-			       (next->memory != NULL) == map &&
+			       (next->mapped == map) &&
+			       (next->memory != NULL) == mem &&
 			       (next->refd == refd))
 				size += next->size;
 
@@ -1286,7 +1566,8 @@ nvkm_vmm_put_locked(struct nvkm_vmm *vmm, struct nvkm_vma *vma)
 				 * the page tree.
 				 */
 				nvkm_vmm_ptes_unmap_put(vmm, &page[refd], addr,
-							size, vma->sparse);
+							size, vma->sparse,
+							!mem);
 			} else
 			if (refd != NVKM_VMA_PAGE_NONE) {
 				/* Drop allocation-time PTE references. */
@@ -1301,7 +1582,7 @@ nvkm_vmm_put_locked(struct nvkm_vmm *vmm, struct nvkm_vma *vma)
 	 */
 	next = vma;
 	do {
-		if (next->memory)
+		if (next->mapped)
 			nvkm_vmm_unmap_region(vmm, next);
 	} while ((next = node(vma, next)) && next->part);
 
@@ -1522,7 +1803,7 @@ nvkm_vmm_join(struct nvkm_vmm *vmm, struct nvkm_memory *inst)
 }
 
 static bool
-nvkm_vmm_boot_ptes(struct nvkm_vmm_iter *it, u32 ptei, u32 ptes)
+nvkm_vmm_boot_ptes(struct nvkm_vmm_iter *it, bool pfn, u32 ptei, u32 ptes)
 {
 	const struct nvkm_vmm_desc *desc = it->desc;
 	const int type = desc->type == SPT;
@@ -1544,7 +1825,7 @@ nvkm_vmm_boot(struct nvkm_vmm *vmm)
 	if (ret)
 		return ret;
 
-	nvkm_vmm_iter(vmm, page, vmm->start, limit, "bootstrap", false,
+	nvkm_vmm_iter(vmm, page, vmm->start, limit, "bootstrap", false, false,
 		      nvkm_vmm_boot_ptes, NULL, NULL, NULL);
 	vmm->bootstrapped = true;
 	return 0;
@@ -1584,7 +1865,8 @@ nvkm_vmm_new(struct nvkm_device *device, u64 addr, u64 size, void *argv,
 	struct nvkm_mmu *mmu = device->mmu;
 	struct nvkm_vmm *vmm = NULL;
 	int ret;
-	ret = mmu->func->vmm.ctor(mmu, addr, size, argv, argc, key, name, &vmm);
+	ret = mmu->func->vmm.ctor(mmu, false, addr, size, argv, argc,
+				  key, name, &vmm);
 	if (ret)
 		nvkm_vmm_unref(&vmm);
 	*pvmm = vmm;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h
index 42ad326521a3..5e55ecbd8005 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h
@@ -67,6 +67,10 @@ struct nvkm_vmm_desc_func {
 	nvkm_vmm_pte_func mem;
 	nvkm_vmm_pte_func dma;
 	nvkm_vmm_pte_func sgl;
+
+	nvkm_vmm_pte_func pfn;
+	bool (*pfn_clear)(struct nvkm_vmm *, struct nvkm_mmu_pt *, u32 ptei, u32 ptes);
+	nvkm_vmm_pxe_func pfn_unmap;
 };
 
 extern const struct nvkm_vmm_desc_func gf100_vmm_pgd;
@@ -141,6 +145,11 @@ struct nvkm_vmm_func {
 		     struct nvkm_vmm_map *);
 	void (*flush)(struct nvkm_vmm *, int depth);
 
+	int (*mthd)(struct nvkm_vmm *, struct nvkm_client *,
+		    u32 mthd, void *argv, u32 argc);
+
+	void (*invalidate_pdb)(struct nvkm_vmm *, u64 addr);
+
 	u64 page_block;
 	const struct nvkm_vmm_page page[];
 };
@@ -151,11 +160,12 @@ struct nvkm_vmm_join {
 };
 
 int nvkm_vmm_new_(const struct nvkm_vmm_func *, struct nvkm_mmu *,
-		  u32 pd_header, u64 addr, u64 size, struct lock_class_key *,
-		  const char *name, struct nvkm_vmm **);
+		  u32 pd_header, bool managed, u64 addr, u64 size,
+		  struct lock_class_key *, const char *name,
+		  struct nvkm_vmm **);
 int nvkm_vmm_ctor(const struct nvkm_vmm_func *, struct nvkm_mmu *,
-		  u32 pd_header, u64 addr, u64 size, struct lock_class_key *,
-		  const char *name, struct nvkm_vmm *);
+		  u32 pd_header, bool managed, u64 addr, u64 size,
+		  struct lock_class_key *, const char *name, struct nvkm_vmm *);
 struct nvkm_vma *nvkm_vmm_node_search(struct nvkm_vmm *, u64 addr);
 struct nvkm_vma *nvkm_vmm_node_split(struct nvkm_vmm *, struct nvkm_vma *,
 				     u64 addr, u64 size);
@@ -163,13 +173,25 @@ int nvkm_vmm_get_locked(struct nvkm_vmm *, bool getref, bool mapref,
 			bool sparse, u8 page, u8 align, u64 size,
 			struct nvkm_vma **pvma);
 void nvkm_vmm_put_locked(struct nvkm_vmm *, struct nvkm_vma *);
-void nvkm_vmm_unmap_locked(struct nvkm_vmm *, struct nvkm_vma *);
-void nvkm_vmm_unmap_region(struct nvkm_vmm *vmm, struct nvkm_vma *vma);
+void nvkm_vmm_unmap_locked(struct nvkm_vmm *, struct nvkm_vma *, bool pfn);
+void nvkm_vmm_unmap_region(struct nvkm_vmm *, struct nvkm_vma *);
+
+#define NVKM_VMM_PFN_ADDR                                 0xfffffffffffff000ULL
+#define NVKM_VMM_PFN_ADDR_SHIFT                                              12
+#define NVKM_VMM_PFN_APER                                 0x00000000000000f0ULL
+#define NVKM_VMM_PFN_HOST                                 0x0000000000000000ULL
+#define NVKM_VMM_PFN_VRAM                                 0x0000000000000010ULL
+#define NVKM_VMM_PFN_W                                    0x0000000000000002ULL
+#define NVKM_VMM_PFN_V                                    0x0000000000000001ULL
+#define NVKM_VMM_PFN_NONE                                 0x0000000000000000ULL
+
+int nvkm_vmm_pfn_map(struct nvkm_vmm *, u8 page, u64 addr, u64 size, u64 *pfn);
+int nvkm_vmm_pfn_unmap(struct nvkm_vmm *, u64 addr, u64 size);
 
 struct nvkm_vma *nvkm_vma_tail(struct nvkm_vma *, u64 tail);
 
 int nv04_vmm_new_(const struct nvkm_vmm_func *, struct nvkm_mmu *, u32,
-		  u64, u64, void *, u32, struct lock_class_key *,
+		  bool, u64, u64, void *, u32, struct lock_class_key *,
 		  const char *, struct nvkm_vmm **);
 int nv04_vmm_valid(struct nvkm_vmm *, void *, u32, struct nvkm_vmm_map *);
 
@@ -179,70 +201,76 @@ int nv50_vmm_valid(struct nvkm_vmm *, void *, u32, struct nvkm_vmm_map *);
 void nv50_vmm_flush(struct nvkm_vmm *, int);
 
 int gf100_vmm_new_(const struct nvkm_vmm_func *, const struct nvkm_vmm_func *,
-		   struct nvkm_mmu *, u64, u64, void *, u32,
+		   struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		   struct lock_class_key *, const char *, struct nvkm_vmm **);
 int gf100_vmm_join_(struct nvkm_vmm *, struct nvkm_memory *, u64 base);
 int gf100_vmm_join(struct nvkm_vmm *, struct nvkm_memory *);
 void gf100_vmm_part(struct nvkm_vmm *, struct nvkm_memory *);
 int gf100_vmm_aper(enum nvkm_memory_target);
 int gf100_vmm_valid(struct nvkm_vmm *, void *, u32, struct nvkm_vmm_map *);
-void gf100_vmm_flush_(struct nvkm_vmm *, int);
 void gf100_vmm_flush(struct nvkm_vmm *, int);
+void gf100_vmm_invalidate(struct nvkm_vmm *, u32 type);
+void gf100_vmm_invalidate_pdb(struct nvkm_vmm *, u64 addr);
 
 int gk20a_vmm_aper(enum nvkm_memory_target);
 
 int gm200_vmm_new_(const struct nvkm_vmm_func *, const struct nvkm_vmm_func *,
-		   struct nvkm_mmu *, u64, u64, void *, u32,
+		   struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		   struct lock_class_key *, const char *, struct nvkm_vmm **);
 int gm200_vmm_join_(struct nvkm_vmm *, struct nvkm_memory *, u64 base);
 int gm200_vmm_join(struct nvkm_vmm *, struct nvkm_memory *);
 
+int gp100_vmm_new_(const struct nvkm_vmm_func *,
+		   struct nvkm_mmu *, bool, u64, u64, void *, u32,
+		   struct lock_class_key *, const char *, struct nvkm_vmm **);
 int gp100_vmm_join(struct nvkm_vmm *, struct nvkm_memory *);
 int gp100_vmm_valid(struct nvkm_vmm *, void *, u32, struct nvkm_vmm_map *);
 void gp100_vmm_flush(struct nvkm_vmm *, int);
+int gp100_vmm_mthd(struct nvkm_vmm *, struct nvkm_client *, u32, void *, u32);
+void gp100_vmm_invalidate_pdb(struct nvkm_vmm *, u64 addr);
 
 int gv100_vmm_join(struct nvkm_vmm *, struct nvkm_memory *);
 
-int nv04_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
+int nv04_vmm_new(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		 struct lock_class_key *, const char *, struct nvkm_vmm **);
-int nv41_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
+int nv41_vmm_new(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		 struct lock_class_key *, const char *, struct nvkm_vmm **);
-int nv44_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
+int nv44_vmm_new(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		 struct lock_class_key *, const char *, struct nvkm_vmm **);
-int nv50_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
+int nv50_vmm_new(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		 struct lock_class_key *, const char *, struct nvkm_vmm **);
-int mcp77_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
+int mcp77_vmm_new(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		  struct lock_class_key *, const char *, struct nvkm_vmm **);
-int g84_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
+int g84_vmm_new(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		struct lock_class_key *, const char *, struct nvkm_vmm **);
-int gf100_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
+int gf100_vmm_new(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		  struct lock_class_key *, const char *, struct nvkm_vmm **);
-int gk104_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
+int gk104_vmm_new(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		  struct lock_class_key *, const char *, struct nvkm_vmm **);
-int gk20a_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
+int gk20a_vmm_new(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		  struct lock_class_key *, const char *, struct nvkm_vmm **);
-int gm200_vmm_new_fixed(struct nvkm_mmu *, u64, u64, void *, u32,
+int gm200_vmm_new_fixed(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 			struct lock_class_key *, const char *,
 			struct nvkm_vmm **);
-int gm200_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
+int gm200_vmm_new(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		  struct lock_class_key *, const char *,
 		  struct nvkm_vmm **);
-int gm20b_vmm_new_fixed(struct nvkm_mmu *, u64, u64, void *, u32,
+int gm20b_vmm_new_fixed(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 			struct lock_class_key *, const char *,
 			struct nvkm_vmm **);
-int gm20b_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
+int gm20b_vmm_new(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		  struct lock_class_key *, const char *,
 		  struct nvkm_vmm **);
-int gp100_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
+int gp100_vmm_new(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		  struct lock_class_key *, const char *,
 		  struct nvkm_vmm **);
-int gp10b_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
+int gp10b_vmm_new(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		  struct lock_class_key *, const char *,
 		  struct nvkm_vmm **);
-int gv100_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
+int gv100_vmm_new(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		  struct lock_class_key *, const char *,
 		  struct nvkm_vmm **);
-int tu104_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
+int tu102_vmm_new(struct nvkm_mmu *, bool, u64, u64, void *, u32,
 		  struct lock_class_key *, const char *,
 		  struct nvkm_vmm **);
 
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c
index faf5a7e9265e..ab6424faf84c 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgf100.c
@@ -178,15 +178,19 @@ gf100_vmm_desc_16_16[] = {
 };
 
 void
-gf100_vmm_flush_(struct nvkm_vmm *vmm, int depth)
+gf100_vmm_invalidate_pdb(struct nvkm_vmm *vmm, u64 addr)
+{
+	struct nvkm_device *device = vmm->mmu->subdev.device;
+	nvkm_wr32(device, 0x100cb8, addr);
+}
+
+void
+gf100_vmm_invalidate(struct nvkm_vmm *vmm, u32 type)
 {
 	struct nvkm_subdev *subdev = &vmm->mmu->subdev;
 	struct nvkm_device *device = subdev->device;
-	u32 type = depth << 24;
-
-	type = 0x00000001; /* PAGE_ALL */
-	if (atomic_read(&vmm->engref[NVKM_SUBDEV_BAR]))
-		type |= 0x00000004; /* HUB_ONLY */
+	struct nvkm_mmu_pt *pd = vmm->pd->pt[0];
+	u64 addr = 0;
 
 	mutex_lock(&subdev->mutex);
 	/* Looks like maybe a "free flush slots" counter, the
@@ -197,7 +201,20 @@ gf100_vmm_flush_(struct nvkm_vmm *vmm, int depth)
 			break;
 	);
 
-	nvkm_wr32(device, 0x100cb8, vmm->pd->pt[0]->addr >> 8);
+	if (!(type & 0x00000002) /* ALL_PDB. */) {
+		switch (nvkm_memory_target(pd->memory)) {
+		case NVKM_MEM_TARGET_VRAM: addr |= 0x00000000; break;
+		case NVKM_MEM_TARGET_HOST: addr |= 0x00000002; break;
+		case NVKM_MEM_TARGET_NCOH: addr |= 0x00000003; break;
+		default:
+			WARN_ON(1);
+			break;
+		}
+		addr |= (vmm->pd->pt[0]->addr >> 12) << 4;
+
+		vmm->func->invalidate_pdb(vmm, addr);
+	}
+
 	nvkm_wr32(device, 0x100cbc, 0x80000000 | type);
 
 	/* Wait for flush to be queued? */
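
Unless ALL_PDB is requested, the refactored invalidate now writes a self-describing PDB word: the page directory's address shifted into bits 4 and up, with the aperture target in the low bits. Packed as a standalone helper — values lifted from the switch above, function name made up:

    #include <stdint.h>

    enum { TGT_VRAM = 0x0, TGT_HOST = 0x2, TGT_NCOH = 0x3 };

    /* Word handed to vmm->func->invalidate_pdb() above. */
    static uint64_t pdb_word(uint64_t pd_addr, unsigned target)
    {
            return ((pd_addr >> 12) << 4) | (target & 0x3);
    }
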
@@ -211,7 +228,10 @@ gf100_vmm_flush_(struct nvkm_vmm *vmm, int depth)
 void
 gf100_vmm_flush(struct nvkm_vmm *vmm, int depth)
 {
-	gf100_vmm_flush_(vmm, 0);
+	u32 type = 0x00000001; /* PAGE_ALL */
+	if (atomic_read(&vmm->engref[NVKM_SUBDEV_BAR]))
+		type |= 0x00000004; /* HUB_ONLY */
+	gf100_vmm_invalidate(vmm, type);
 }
 
 int
@@ -354,6 +374,7 @@ gf100_vmm_17 = {
 	.aper = gf100_vmm_aper,
 	.valid = gf100_vmm_valid,
 	.flush = gf100_vmm_flush,
+	.invalidate_pdb = gf100_vmm_invalidate_pdb,
 	.page = {
 		{ 17, &gf100_vmm_desc_17_17[0], NVKM_VMM_PAGE_xVxC },
 		{ 12, &gf100_vmm_desc_17_12[0], NVKM_VMM_PAGE_xVHx },
@@ -368,6 +389,7 @@ gf100_vmm_16 = {
 	.aper = gf100_vmm_aper,
 	.valid = gf100_vmm_valid,
 	.flush = gf100_vmm_flush,
+	.invalidate_pdb = gf100_vmm_invalidate_pdb,
 	.page = {
 		{ 16, &gf100_vmm_desc_16_16[0], NVKM_VMM_PAGE_xVxC },
 		{ 12, &gf100_vmm_desc_16_12[0], NVKM_VMM_PAGE_xVHx },
@@ -378,14 +400,14 @@ gf100_vmm_16 = {
 int
 gf100_vmm_new_(const struct nvkm_vmm_func *func_16,
 	       const struct nvkm_vmm_func *func_17,
-	       struct nvkm_mmu *mmu, u64 addr, u64 size, void *argv, u32 argc,
-	       struct lock_class_key *key, const char *name,
-	       struct nvkm_vmm **pvmm)
+	       struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
+	       void *argv, u32 argc, struct lock_class_key *key,
+	       const char *name, struct nvkm_vmm **pvmm)
 {
 	switch (mmu->subdev.device->fb->page) {
-	case 16: return nv04_vmm_new_(func_16, mmu, 0, addr, size,
+	case 16: return nv04_vmm_new_(func_16, mmu, 0, managed, addr, size,
 				      argv, argc, key, name, pvmm);
-	case 17: return nv04_vmm_new_(func_17, mmu, 0, addr, size,
+	case 17: return nv04_vmm_new_(func_17, mmu, 0, managed, addr, size,
 				      argv, argc, key, name, pvmm);
 	default:
 		WARN_ON(1);
@@ -394,10 +416,10 @@ gf100_vmm_new_(const struct nvkm_vmm_func *func_16,
 }
 
 int
-gf100_vmm_new(struct nvkm_mmu *mmu, u64 addr, u64 size, void *argv, u32 argc,
-	      struct lock_class_key *key, const char *name,
-	      struct nvkm_vmm **pvmm)
+gf100_vmm_new(struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
+	      void *argv, u32 argc, struct lock_class_key *key,
+	      const char *name, struct nvkm_vmm **pvmm)
 {
-	return gf100_vmm_new_(&gf100_vmm_16, &gf100_vmm_17, mmu, addr,
+	return gf100_vmm_new_(&gf100_vmm_16, &gf100_vmm_17, mmu, managed, addr,
 			      size, argv, argc, key, name, pvmm);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk104.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk104.c
index 0ebb7bccfcd2..0b59c01fd146 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk104.c
@@ -71,6 +71,7 @@ gk104_vmm_17 = {
 	.aper = gf100_vmm_aper,
 	.valid = gf100_vmm_valid,
 	.flush = gf100_vmm_flush,
+	.invalidate_pdb = gf100_vmm_invalidate_pdb,
 	.page = {
 		{ 17, &gk104_vmm_desc_17_17[0], NVKM_VMM_PAGE_xVxC },
 		{ 12, &gk104_vmm_desc_17_12[0], NVKM_VMM_PAGE_xVHx },
@@ -85,6 +86,7 @@ gk104_vmm_16 = {
 	.aper = gf100_vmm_aper,
 	.valid = gf100_vmm_valid,
 	.flush = gf100_vmm_flush,
+	.invalidate_pdb = gf100_vmm_invalidate_pdb,
 	.page = {
 		{ 16, &gk104_vmm_desc_16_16[0], NVKM_VMM_PAGE_xVxC },
 		{ 12, &gk104_vmm_desc_16_12[0], NVKM_VMM_PAGE_xVHx },
@@ -93,10 +95,10 @@ gk104_vmm_16 = {
 };
 
 int
-gk104_vmm_new(struct nvkm_mmu *mmu, u64 addr, u64 size, void *argv, u32 argc,
-	      struct lock_class_key *key, const char *name,
-	      struct nvkm_vmm **pvmm)
+gk104_vmm_new(struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
+	      void *argv, u32 argc, struct lock_class_key *key,
+	      const char *name, struct nvkm_vmm **pvmm)
 {
-	return gf100_vmm_new_(&gk104_vmm_16, &gk104_vmm_17, mmu, addr,
+	return gf100_vmm_new_(&gk104_vmm_16, &gk104_vmm_17, mmu, managed, addr,
 			      size, argv, argc, key, name, pvmm);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c
index 8086994a0446..5a9582dce970 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgk20a.c
@@ -40,6 +40,7 @@ gk20a_vmm_17 = {
 	.aper = gf100_vmm_aper,
 	.valid = gf100_vmm_valid,
 	.flush = gf100_vmm_flush,
+	.invalidate_pdb = gf100_vmm_invalidate_pdb,
 	.page = {
 		{ 17, &gk104_vmm_desc_17_17[0], NVKM_VMM_PAGE_xxHC },
 		{ 12, &gk104_vmm_desc_17_12[0], NVKM_VMM_PAGE_xxHx },
@@ -54,6 +55,7 @@ gk20a_vmm_16 = {
 	.aper = gf100_vmm_aper,
 	.valid = gf100_vmm_valid,
 	.flush = gf100_vmm_flush,
+	.invalidate_pdb = gf100_vmm_invalidate_pdb,
 	.page = {
 		{ 16, &gk104_vmm_desc_16_16[0], NVKM_VMM_PAGE_xxHC },
 		{ 12, &gk104_vmm_desc_16_12[0], NVKM_VMM_PAGE_xxHx },
@@ -62,10 +64,10 @@ gk20a_vmm_16 = {
 };
 
 int
-gk20a_vmm_new(struct nvkm_mmu *mmu, u64 addr, u64 size, void *argv, u32 argc,
-	      struct lock_class_key *key, const char *name,
-	      struct nvkm_vmm **pvmm)
+gk20a_vmm_new(struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
+	      void *argv, u32 argc, struct lock_class_key *key,
+	      const char *name, struct nvkm_vmm **pvmm)
 {
-	return gf100_vmm_new_(&gk20a_vmm_16, &gk20a_vmm_17, mmu, addr,
+	return gf100_vmm_new_(&gk20a_vmm_16, &gk20a_vmm_17, mmu, managed, addr,
 			      size, argv, argc, key, name, pvmm);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm200.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm200.c
index a1676a4644fe..2e61af02d4d8 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm200.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm200.c
@@ -113,6 +113,7 @@ gm200_vmm_17 = {
 	.aper = gf100_vmm_aper,
 	.valid = gf100_vmm_valid,
 	.flush = gf100_vmm_flush,
+	.invalidate_pdb = gf100_vmm_invalidate_pdb,
 	.page = {
 		{ 27, &gm200_vmm_desc_17_17[1], NVKM_VMM_PAGE_Sxxx },
 		{ 17, &gm200_vmm_desc_17_17[0], NVKM_VMM_PAGE_SVxC },
@@ -128,6 +129,7 @@ gm200_vmm_16 = {
 	.aper = gf100_vmm_aper,
 	.valid = gf100_vmm_valid,
 	.flush = gf100_vmm_flush,
+	.invalidate_pdb = gf100_vmm_invalidate_pdb,
 	.page = {
 		{ 27, &gm200_vmm_desc_16_16[1], NVKM_VMM_PAGE_Sxxx },
 		{ 16, &gm200_vmm_desc_16_16[0], NVKM_VMM_PAGE_SVxC },
@@ -139,9 +141,9 @@ gm200_vmm_16 = {
 int
 gm200_vmm_new_(const struct nvkm_vmm_func *func_16,
 	       const struct nvkm_vmm_func *func_17,
-	       struct nvkm_mmu *mmu, u64 addr, u64 size, void *argv, u32 argc,
-	       struct lock_class_key *key, const char *name,
-	       struct nvkm_vmm **pvmm)
+	       struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
+	       void *argv, u32 argc, struct lock_class_key *key,
+	       const char *name, struct nvkm_vmm **pvmm)
 {
 	const struct nvkm_vmm_func *func;
 	union {
@@ -163,23 +165,23 @@ gm200_vmm_new_(const struct nvkm_vmm_func *func_16,
 	} else
 		return ret;
 
-	return nvkm_vmm_new_(func, mmu, 0, addr, size, key, name, pvmm);
+	return nvkm_vmm_new_(func, mmu, 0, managed, addr, size, key, name, pvmm);
 }
 
 int
-gm200_vmm_new(struct nvkm_mmu *mmu, u64 addr, u64 size, void *argv, u32 argc,
-	      struct lock_class_key *key, const char *name,
-	      struct nvkm_vmm **pvmm)
+gm200_vmm_new(struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
+	      void *argv, u32 argc, struct lock_class_key *key,
+	      const char *name, struct nvkm_vmm **pvmm)
 {
-	return gm200_vmm_new_(&gm200_vmm_16, &gm200_vmm_17, mmu, addr,
+	return gm200_vmm_new_(&gm200_vmm_16, &gm200_vmm_17, mmu, managed, addr,
 			      size, argv, argc, key, name, pvmm);
 }
 
 int
-gm200_vmm_new_fixed(struct nvkm_mmu *mmu, u64 addr, u64 size,
+gm200_vmm_new_fixed(struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
 		    void *argv, u32 argc, struct lock_class_key *key,
 		    const char *name, struct nvkm_vmm **pvmm)
 {
-	return gf100_vmm_new_(&gm200_vmm_16, &gm200_vmm_17, mmu, addr,
+	return gf100_vmm_new_(&gm200_vmm_16, &gm200_vmm_17, mmu, managed, addr,
 			      size, argv, argc, key, name, pvmm);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c
index 64d4b6cff8dd..96b759695dd8 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgm20b.c
@@ -28,6 +28,7 @@ gm20b_vmm_17 = {
 	.aper = gk20a_vmm_aper,
 	.valid = gf100_vmm_valid,
 	.flush = gf100_vmm_flush,
+	.invalidate_pdb = gf100_vmm_invalidate_pdb,
 	.page = {
 		{ 27, &gm200_vmm_desc_17_17[1], NVKM_VMM_PAGE_Sxxx },
 		{ 17, &gm200_vmm_desc_17_17[0], NVKM_VMM_PAGE_SxHC },
@@ -43,6 +44,7 @@ gm20b_vmm_16 = {
 	.aper = gk20a_vmm_aper,
 	.valid = gf100_vmm_valid,
 	.flush = gf100_vmm_flush,
+	.invalidate_pdb = gf100_vmm_invalidate_pdb,
 	.page = {
 		{ 27, &gm200_vmm_desc_16_16[1], NVKM_VMM_PAGE_Sxxx },
 		{ 16, &gm200_vmm_desc_16_16[0], NVKM_VMM_PAGE_SxHC },
@@ -52,19 +54,19 @@ gm20b_vmm_16 = {
 };
 
 int
-gm20b_vmm_new(struct nvkm_mmu *mmu, u64 addr, u64 size, void *argv, u32 argc,
-	      struct lock_class_key *key, const char *name,
-	      struct nvkm_vmm **pvmm)
+gm20b_vmm_new(struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
+	      void *argv, u32 argc, struct lock_class_key *key,
+	      const char *name, struct nvkm_vmm **pvmm)
 {
-	return gm200_vmm_new_(&gm20b_vmm_16, &gm20b_vmm_17, mmu, addr,
+	return gm200_vmm_new_(&gm20b_vmm_16, &gm20b_vmm_17, mmu, managed, addr,
 			      size, argv, argc, key, name, pvmm);
 }
 
 int
-gm20b_vmm_new_fixed(struct nvkm_mmu *mmu, u64 addr, u64 size,
+gm20b_vmm_new_fixed(struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
 		    void *argv, u32 argc, struct lock_class_key *key,
 		    const char *name, struct nvkm_vmm **pvmm)
 {
-	return gf100_vmm_new_(&gm20b_vmm_16, &gm20b_vmm_17, mmu, addr,
+	return gf100_vmm_new_(&gm20b_vmm_16, &gm20b_vmm_17, mmu, managed, addr,
 			      size, argv, argc, key, name, pvmm);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c
index 059fafe0e771..b4f519768d5e 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c
@@ -21,12 +21,90 @@
  */
 #include "vmm.h"
 
+#include <core/client.h>
 #include <subdev/fb.h>
 #include <subdev/ltc.h>
+#include <subdev/timer.h>
+#include <engine/gr.h>
 
 #include <nvif/ifc00d.h>
 #include <nvif/unpack.h>
 
+static void
+gp100_vmm_pfn_unmap(struct nvkm_vmm *vmm,
+		    struct nvkm_mmu_pt *pt, u32 ptei, u32 ptes)
+{
+	struct device *dev = vmm->mmu->subdev.device->dev;
+	dma_addr_t addr;
+
+	nvkm_kmap(pt->memory);
+	while (ptes--) {
+		u32 datalo = nvkm_ro32(pt->memory, pt->base + ptei * 8 + 0);
+		u32 datahi = nvkm_ro32(pt->memory, pt->base + ptei * 8 + 4);
+		u64 data   = (u64)datahi << 32 | datalo;
+		if ((data & (3ULL << 1)) != 0) {
+			addr = (data >> 8) << 12;
+			dma_unmap_page(dev, addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
+		}
+		ptei++;
+	}
+	nvkm_done(pt->memory);
+}
+
+static bool
+gp100_vmm_pfn_clear(struct nvkm_vmm *vmm,
+		    struct nvkm_mmu_pt *pt, u32 ptei, u32 ptes)
+{
+	bool dma = false;
+	nvkm_kmap(pt->memory);
+	while (ptes--) {
+		u32 datalo = nvkm_ro32(pt->memory, pt->base + ptei * 8 + 0);
+		u32 datahi = nvkm_ro32(pt->memory, pt->base + ptei * 8 + 4);
+		u64 data   = (u64)datahi << 32 | datalo;
+		if ((data & BIT_ULL(0)) && (data & (3ULL << 1)) != 0) {
+			VMM_WO064(pt, vmm, ptei * 8, data & ~BIT_ULL(0));
+			dma = true;
+		}
+		ptei++;
+	}
+	nvkm_done(pt->memory);
+	return dma;
+}
+
+static void
+gp100_vmm_pgt_pfn(struct nvkm_vmm *vmm, struct nvkm_mmu_pt *pt,
+		  u32 ptei, u32 ptes, struct nvkm_vmm_map *map)
+{
+	struct device *dev = vmm->mmu->subdev.device->dev;
+	dma_addr_t addr;
+
+	nvkm_kmap(pt->memory);
+	while (ptes--) {
+		u64 data = 0;
+		if (!(*map->pfn & NVKM_VMM_PFN_W))
+			data |= BIT_ULL(6); /* RO. */
+
+		if (!(*map->pfn & NVKM_VMM_PFN_VRAM)) {
+			addr = *map->pfn >> NVKM_VMM_PFN_ADDR_SHIFT;
+			addr = dma_map_page(dev, pfn_to_page(addr), 0,
+					    PAGE_SIZE, DMA_BIDIRECTIONAL);
+			if (!WARN_ON(dma_mapping_error(dev, addr))) {
+				data |= addr >> 4;
+				data |= 2ULL << 1; /* SYSTEM_COHERENT_MEMORY. */
+				data |= BIT_ULL(3); /* VOL. */
+				data |= BIT_ULL(0); /* VALID. */
+			}
+		} else {
+			data |= (*map->pfn & NVKM_VMM_PFN_ADDR) >> 4;
+			data |= BIT_ULL(0); /* VALID. */
+		}
+
+		VMM_WO064(pt, vmm, ptei++ * 8, data);
+		map->pfn++;
+	}
+	nvkm_done(pt->memory);
+}
+
 static inline void
 gp100_vmm_pgt_pte(struct nvkm_vmm *vmm, struct nvkm_mmu_pt *pt,
 		  u32 ptei, u32 ptes, struct nvkm_vmm_map *map, u64 addr)
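
Both new helpers above read each 64-bit PTE back and key off the same fields: bit 0 is VALID, bits 2:1 are the aperture (non-zero meaning system memory that was dma_map_page()'d), and the physical address sits at (data >> 8) << 12. A decode sketch of that layout — the struct is illustrative, the field meanings are taken from the code above:

    #include <stdbool.h>
    #include <stdint.h>

    struct gp100_pte {
            bool     valid;     /* data & BIT(0) */
            unsigned aper;      /* (data >> 1) & 3; non-zero => system RAM */
            uint64_t addr;      /* (data >> 8) << 12 */
    };

    static struct gp100_pte pte_decode(uint64_t data)
    {
            struct gp100_pte p = {
                    .valid = data & 1,
                    .aper  = (data >> 1) & 3,
                    .addr  = (data >> 8) << 12,
            };
            return p;
    }
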
@@ -89,6 +167,9 @@ gp100_vmm_desc_spt = {
 	.mem = gp100_vmm_pgt_mem,
 	.dma = gp100_vmm_pgt_dma,
 	.sgl = gp100_vmm_pgt_sgl,
+	.pfn = gp100_vmm_pgt_pfn,
+	.pfn_clear = gp100_vmm_pfn_clear,
+	.pfn_unmap = gp100_vmm_pfn_unmap,
 };
 
 static void
@@ -306,16 +387,100 @@ gp100_vmm_valid(struct nvkm_vmm *vmm, void *argv, u32 argc,
 	return 0;
 }
 
+static int
+gp100_vmm_fault_cancel(struct nvkm_vmm *vmm, void *argv, u32 argc)
+{
+	struct nvkm_device *device = vmm->mmu->subdev.device;
+	union {
+		struct gp100_vmm_fault_cancel_v0 v0;
+	} *args = argv;
+	int ret = -ENOSYS;
+	u32 inst, aper;
+
+	if ((ret = nvif_unpack(ret, &argv, &argc, args->v0, 0, 0, false)))
+		return ret;
+
+	/* Translate MaxwellFaultBufferA instance pointer to the same
+	 * format as the NV_GR_FECS_CURRENT_CTX register.
+	 */
+	aper = (args->v0.inst >> 8) & 3;
+	args->v0.inst >>= 12;
+	args->v0.inst |= aper << 28;
+	args->v0.inst |= 0x80000000;
+
+	if (!WARN_ON(nvkm_gr_ctxsw_pause(device))) {
+		if ((inst = nvkm_gr_ctxsw_inst(device)) == args->v0.inst) {
+			gf100_vmm_invalidate(vmm, 0x0000001b
+					     /* CANCEL_TARGETED. */ |
+					     (args->v0.hub    << 20) |
+					     (args->v0.gpc    << 15) |
+					     (args->v0.client << 9));
+		}
+		WARN_ON(nvkm_gr_ctxsw_resume(device));
+	}
+
+	return 0;
+}
+
+static int
+gp100_vmm_fault_replay(struct nvkm_vmm *vmm, void *argv, u32 argc)
+{
+	union {
+		struct gp100_vmm_fault_replay_vn vn;
+	} *args = argv;
+	int ret = -ENOSYS;
+
+	if (!(ret = nvif_unvers(ret, &argv, &argc, args->vn))) {
+		gf100_vmm_invalidate(vmm, 0x0000000b); /* REPLAY_GLOBAL. */
+	}
+
+	return ret;
+}
+
+int
+gp100_vmm_mthd(struct nvkm_vmm *vmm,
+	       struct nvkm_client *client, u32 mthd, void *argv, u32 argc)
+{
+	if (client->super) {
+		switch (mthd) {
+		case GP100_VMM_VN_FAULT_REPLAY:
+			return gp100_vmm_fault_replay(vmm, argv, argc);
+		case GP100_VMM_VN_FAULT_CANCEL:
+			return gp100_vmm_fault_cancel(vmm, argv, argc);
+		default:
+			break;
+		}
+	}
+	return -EINVAL;
+}
+
+void
+gp100_vmm_invalidate_pdb(struct nvkm_vmm *vmm, u64 addr)
+{
+	struct nvkm_device *device = vmm->mmu->subdev.device;
+	nvkm_wr32(device, 0x100cb8, lower_32_bits(addr));
+	nvkm_wr32(device, 0x100cec, upper_32_bits(addr));
+}
+
 void
 gp100_vmm_flush(struct nvkm_vmm *vmm, int depth)
 {
-	gf100_vmm_flush_(vmm, 5 /* CACHE_LEVEL_UP_TO_PDE3 */ - depth);
+	u32 type = (5 /* CACHE_LEVEL_UP_TO_PDE3 */ - depth) << 24;
+	type = 0; /*XXX: need to confirm stuff works with depth enabled... */
+	if (atomic_read(&vmm->engref[NVKM_SUBDEV_BAR]))
+		type |= 0x00000004; /* HUB_ONLY */
+	type |= 0x00000001; /* PAGE_ALL */
+	gf100_vmm_invalidate(vmm, type);
 }
 
 int
 gp100_vmm_join(struct nvkm_vmm *vmm, struct nvkm_memory *inst)
 {
-	const u64 base = BIT_ULL(10) /* VER2 */ | BIT_ULL(11); /* 64KiB */
+	u64 base = BIT_ULL(10) /* VER2 */ | BIT_ULL(11) /* 64KiB */;
+	if (vmm->replay) {
+		base |= BIT_ULL(4); /* FAULT_REPLAY_TEX */
+		base |= BIT_ULL(5); /* FAULT_REPLAY_GCC */
+	}
 	return gf100_vmm_join_(vmm, inst, base);
 }
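
The CANCEL_TARGETED invalidate above packs its routing into the type word: base opcode 0x1b with the hub, GPC and client IDs at bits 20, 15 and 9. As a helper — bit positions are taken from the call above, the function name is invented:

    #include <stdint.h>

    static uint32_t cancel_targeted(uint32_t hub, uint32_t gpc, uint32_t client)
    {
            return 0x0000001b | (hub << 20) | (gpc << 15) | (client << 9);
    }
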
 
@@ -326,6 +491,8 @@ gp100_vmm = {
 	.aper = gf100_vmm_aper,
 	.valid = gp100_vmm_valid,
 	.flush = gp100_vmm_flush,
+	.mthd = gp100_vmm_mthd,
+	.invalidate_pdb = gp100_vmm_invalidate_pdb,
 	.page = {
 		{ 47, &gp100_vmm_desc_16[4], NVKM_VMM_PAGE_Sxxx },
 		{ 38, &gp100_vmm_desc_16[3], NVKM_VMM_PAGE_Sxxx },
@@ -338,10 +505,39 @@ gp100_vmm = {
 };
 
 int
-gp100_vmm_new(struct nvkm_mmu *mmu, u64 addr, u64 size, void *argv, u32 argc,
-	      struct lock_class_key *key, const char *name,
-	      struct nvkm_vmm **pvmm)
+gp100_vmm_new_(const struct nvkm_vmm_func *func,
+	       struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
+	       void *argv, u32 argc, struct lock_class_key *key,
+	       const char *name, struct nvkm_vmm **pvmm)
+{
+	union {
+		struct gp100_vmm_vn vn;
+		struct gp100_vmm_v0 v0;
+	} *args = argv;
+	int ret = -ENOSYS;
+	bool replay;
+
+	if (!(ret = nvif_unpack(ret, &argv, &argc, args->v0, 0, 0, false))) {
+		replay = args->v0.fault_replay != 0;
+	} else
+	if (!(ret = nvif_unvers(ret, &argv, &argc, args->vn))) {
+		replay = false;
+	} else
+		return ret;
+
+	ret = nvkm_vmm_new_(func, mmu, 0, managed, addr, size, key, name, pvmm);
+	if (ret)
+		return ret;
+
+	(*pvmm)->replay = replay;
+	return 0;
+}
+
+int
+gp100_vmm_new(struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
+	      void *argv, u32 argc, struct lock_class_key *key,
+	      const char *name, struct nvkm_vmm **pvmm)
 {
-	return nv04_vmm_new_(&gp100_vmm, mmu, 0, addr, size,
-			     argv, argc, key, name, pvmm);
+	return gp100_vmm_new_(&gp100_vmm, mmu, managed, addr, size,
+			      argv, argc, key, name, pvmm);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c
index 3dcc6bddb32f..e081239afe58 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp10b.c
@@ -28,6 +28,8 @@ gp10b_vmm = {
 	.aper = gk20a_vmm_aper,
 	.valid = gp100_vmm_valid,
 	.flush = gp100_vmm_flush,
+	.mthd = gp100_vmm_mthd,
+	.invalidate_pdb = gp100_vmm_invalidate_pdb,
 	.page = {
 		{ 47, &gp100_vmm_desc_16[4], NVKM_VMM_PAGE_Sxxx },
 		{ 38, &gp100_vmm_desc_16[3], NVKM_VMM_PAGE_Sxxx },
@@ -40,10 +42,10 @@ gp10b_vmm = {
 };
 
 int
-gp10b_vmm_new(struct nvkm_mmu *mmu, u64 addr, u64 size, void *argv, u32 argc,
-	      struct lock_class_key *key, const char *name,
-	      struct nvkm_vmm **pvmm)
+gp10b_vmm_new(struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
+	      void *argv, u32 argc, struct lock_class_key *key,
+	      const char *name, struct nvkm_vmm **pvmm)
 {
-	return nv04_vmm_new_(&gp10b_vmm, mmu, 0, addr, size,
-			     argv, argc, key, name, pvmm);
+	return gp100_vmm_new_(&gp10b_vmm, mmu, managed, addr, size,
+			      argv, argc, key, name, pvmm);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgv100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgv100.c
index 2fa40c16e6d2..f0e21f63253a 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgv100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgv100.c
@@ -66,6 +66,8 @@ gv100_vmm = {
 	.aper = gf100_vmm_aper,
 	.valid = gp100_vmm_valid,
 	.flush = gp100_vmm_flush,
+	.mthd = gp100_vmm_mthd,
+	.invalidate_pdb = gp100_vmm_invalidate_pdb,
 	.page = {
 		{ 47, &gp100_vmm_desc_16[4], NVKM_VMM_PAGE_Sxxx },
 		{ 38, &gp100_vmm_desc_16[3], NVKM_VMM_PAGE_Sxxx },
@@ -78,10 +80,10 @@ gv100_vmm = {
 };
 
 int
-gv100_vmm_new(struct nvkm_mmu *mmu, u64 addr, u64 size, void *argv, u32 argc,
-	      struct lock_class_key *key, const char *name,
-	      struct nvkm_vmm **pvmm)
+gv100_vmm_new(struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
+	      void *argv, u32 argc, struct lock_class_key *key,
+	      const char *name, struct nvkm_vmm **pvmm)
 {
-	return nv04_vmm_new_(&gv100_vmm, mmu, 0, addr, size,
-			     argv, argc, key, name, pvmm);
+	return gp100_vmm_new_(&gv100_vmm, mmu, managed, addr, size,
+			      argv, argc, key, name, pvmm);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmmcp77.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmmcp77.c
index e63d984cbfd4..bdddd99f5877 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmmcp77.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmmcp77.c
@@ -36,10 +36,10 @@ mcp77_vmm = {
 };
 
 int
-mcp77_vmm_new(struct nvkm_mmu *mmu, u64 addr, u64 size, void *argv, u32 argc,
-	      struct lock_class_key *key, const char *name,
-	      struct nvkm_vmm **pvmm)
+mcp77_vmm_new(struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
+	      void *argv, u32 argc, struct lock_class_key *key,
+	      const char *name, struct nvkm_vmm **pvmm)
 {
-	return nv04_vmm_new_(&mcp77_vmm, mmu, 0, addr, size,
+	return nv04_vmm_new_(&mcp77_vmm, mmu, 0, managed, addr, size,
 			     argv, argc, key, name, pvmm);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv04.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv04.c
index 0cab1ffc9f64..4c6b3b7d221f 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv04.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv04.c
@@ -100,16 +100,17 @@ nv04_vmm = {
 
 int
 nv04_vmm_new_(const struct nvkm_vmm_func *func, struct nvkm_mmu *mmu,
-	      u32 pd_header, u64 addr, u64 size, void *argv, u32 argc,
-	      struct lock_class_key *key, const char *name,
-	      struct nvkm_vmm **pvmm)
+	      u32 pd_header, bool managed, u64 addr, u64 size,
+	      void *argv, u32 argc, struct lock_class_key *key,
+	      const char *name, struct nvkm_vmm **pvmm)
 {
 	union {
 		struct nv04_vmm_vn vn;
 	} *args = argv;
 	int ret;
 
-	ret = nvkm_vmm_new_(func, mmu, pd_header, addr, size, key, name, pvmm);
+	ret = nvkm_vmm_new_(func, mmu, pd_header, managed, addr, size,
+			    key, name, pvmm);
 	if (ret)
 		return ret;
 
@@ -117,15 +118,15 @@ nv04_vmm_new_(const struct nvkm_vmm_func *func, struct nvkm_mmu *mmu,
 }
 
 int
-nv04_vmm_new(struct nvkm_mmu *mmu, u64 addr, u64 size, void *argv, u32 argc,
-	     struct lock_class_key *key, const char *name,
+nv04_vmm_new(struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
+	     void *argv, u32 argc, struct lock_class_key *key, const char *name,
 	     struct nvkm_vmm **pvmm)
 {
 	struct nvkm_memory *mem;
 	struct nvkm_vmm *vmm;
 	int ret;
 
-	ret = nv04_vmm_new_(&nv04_vmm, mmu, 8, addr, size,
+	ret = nv04_vmm_new_(&nv04_vmm, mmu, 8, managed, addr, size,
 			    argv, argc, key, name, &vmm);
 	*pvmm = vmm;
 	if (ret)
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv41.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv41.c
index b595f130e573..1d3369683a21 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv41.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv41.c
@@ -104,10 +104,10 @@ nv41_vmm = {
 };
 
 int
-nv41_vmm_new(struct nvkm_mmu *mmu, u64 addr, u64 size, void *argv, u32 argc,
-	     struct lock_class_key *key, const char *name,
+nv41_vmm_new(struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
+	     void *argv, u32 argc, struct lock_class_key *key, const char *name,
 	     struct nvkm_vmm **pvmm)
 {
-	return nv04_vmm_new_(&nv41_vmm, mmu, 0, addr, size,
+	return nv04_vmm_new_(&nv41_vmm, mmu, 0, managed, addr, size,
 			     argv, argc, key, name, pvmm);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv44.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv44.c
index b834e4352334..a82936ba9890 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv44.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv44.c
@@ -205,15 +205,15 @@ nv44_vmm = {
 };
 
 int
-nv44_vmm_new(struct nvkm_mmu *mmu, u64 addr, u64 size, void *argv, u32 argc,
-	     struct lock_class_key *key, const char *name,
+nv44_vmm_new(struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
+	     void *argv, u32 argc, struct lock_class_key *key, const char *name,
 	     struct nvkm_vmm **pvmm)
 {
 	struct nvkm_subdev *subdev = &mmu->subdev;
 	struct nvkm_vmm *vmm;
 	int ret;
 
-	ret = nv04_vmm_new_(&nv44_vmm, mmu, 0, addr, size,
+	ret = nv04_vmm_new_(&nv44_vmm, mmu, 0, managed, addr, size,
 			    argv, argc, key, name, &vmm);
 	*pvmm = vmm;
 	if (ret)
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c
index 64f75d906202..c98afe3134ee 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c
@@ -376,10 +376,10 @@ nv50_vmm = {
 };
 
 int
-nv50_vmm_new(struct nvkm_mmu *mmu, u64 addr, u64 size, void *argv, u32 argc,
-	     struct lock_class_key *key, const char *name,
+nv50_vmm_new(struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
+	     void *argv, u32 argc, struct lock_class_key *key, const char *name,
 	     struct nvkm_vmm **pvmm)
 {
-	return nv04_vmm_new_(&nv50_vmm, mmu, 0, addr, size,
+	return nv04_vmm_new_(&nv50_vmm, mmu, 0, managed, addr, size,
 			     argv, argc, key, name, pvmm);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmtu104.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmtu102.c
index adaadd92110f..be91cffc3b52 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmtu104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmtu102.c
@@ -24,7 +24,7 @@
 #include <subdev/timer.h>
 
 static void
-tu104_vmm_flush(struct nvkm_vmm *vmm, int depth)
+tu102_vmm_flush(struct nvkm_vmm *vmm, int depth)
 {
 	struct nvkm_subdev *subdev = &vmm->mmu->subdev;
 	struct nvkm_device *device = subdev->device;
@@ -50,12 +50,13 @@ tu104_vmm_flush(struct nvkm_vmm *vmm, int depth)
 }
 
 static const struct nvkm_vmm_func
-tu104_vmm = {
+tu102_vmm = {
 	.join = gv100_vmm_join,
 	.part = gf100_vmm_part,
 	.aper = gf100_vmm_aper,
 	.valid = gp100_vmm_valid,
-	.flush = tu104_vmm_flush,
+	.flush = tu102_vmm_flush,
+	.mthd = gp100_vmm_mthd,
 	.page = {
 		{ 47, &gp100_vmm_desc_16[4], NVKM_VMM_PAGE_Sxxx },
 		{ 38, &gp100_vmm_desc_16[3], NVKM_VMM_PAGE_Sxxx },
@@ -68,10 +69,10 @@ tu104_vmm = {
 };
 
 int
-tu104_vmm_new(struct nvkm_mmu *mmu, u64 addr, u64 size,
+tu102_vmm_new(struct nvkm_mmu *mmu, bool managed, u64 addr, u64 size,
 	      void *argv, u32 argc, struct lock_class_key *key,
 	      const char *name, struct nvkm_vmm **pvmm)
 {
-	return nv04_vmm_new_(&tu104_vmm, mmu, 0, addr, size,
-			     argv, argc, key, name, pvmm);
+	return gp100_vmm_new_(&tu102_vmm, mmu, managed, addr, size,
+			      argv, argc, key, name, pvmm);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/memx.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/memx.c
index 11b28b086a06..7b052879af72 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/memx.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/memx.c
@@ -88,10 +88,10 @@ nvkm_memx_fini(struct nvkm_memx **pmemx, bool exec)
 	if (exec) {
 		nvkm_pmu_send(pmu, reply, PROC_MEMX, MEMX_MSG_EXEC,
 			      memx->base, finish);
+		nvkm_debug(subdev, "Exec took %uns, PMU_IN %08x\n",
+			   reply[0], reply[1]);
 	}
 
-	nvkm_debug(subdev, "Exec took %uns, PMU_IN %08x\n",
-		   reply[0], reply[1]);
 	kfree(memx);
 	return 0;
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c
index 5c14d6ac855d..1df09ed6fe6d 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c
@@ -853,7 +853,7 @@ acr_r352_shutdown(struct acr_r352 *acr, struct nvkm_secboot *sb)
 		 * and the expected behavior on RM as well
 		 */
 		if (ret && ret != 0x1d) {
-			nvkm_error(subdev, "HS unload failed, ret 0x%08x", ret);
+			nvkm_error(subdev, "HS unload failed, ret 0x%08x\n", ret);
 			return -EINVAL;
 		}
 		nvkm_debug(subdev, "HS unload blob completed\n");
@@ -922,7 +922,7 @@ acr_r352_bootstrap(struct acr_r352 *acr, struct nvkm_secboot *sb)
 	if (ret < 0) {
 		return ret;
 	} else if (ret > 0) {
-		nvkm_error(subdev, "HS load failed, ret 0x%08x", ret);
+		nvkm_error(subdev, "HS load failed, ret 0x%08x\n", ret);
 		return -EINVAL;
 	}
 	nvkm_debug(subdev, "HS load blob completed\n");
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/top/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/top/base.c
index 67ada1d9a28c..cce6e4e90ebf 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/top/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/top/base.c
@@ -41,6 +41,22 @@ nvkm_top_device_new(struct nvkm_top *top)
 }
 
 u32
+nvkm_top_addr(struct nvkm_device *device, enum nvkm_devidx index)
+{
+	struct nvkm_top *top = device->top;
+	struct nvkm_top_device *info;
+
+	if (top) {
+		list_for_each_entry(info, &top->device, head) {
+			if (info->index == index)
+				return info->addr;
+		}
+	}
+
+	return 0;
+}
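nvkm_top_addr() is a straightforward lookup over the topology list parsed at init: it returns the MMIO base of the requested unit, or 0 when the unit is absent. A hypothetical caller:

	u32 addr = nvkm_top_addr(device, NVKM_SUBDEV_GSP);
	if (!addr)
		return -ENODEV;	/* unit not present in device topology */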
+
+u32
 nvkm_top_reset(struct nvkm_device *device, enum nvkm_devidx index)
 {
 	struct nvkm_top *top = device->top;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/top/gk104.c b/drivers/gpu/drm/nouveau/nvkm/subdev/top/gk104.c
index 39081eadfd84..e01746ce9fc4 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/top/gk104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/top/gk104.c
@@ -73,6 +73,7 @@ gk104_top_oneinit(struct nvkm_top *top)
 #define A_(A) if (inst == 0) info->index = NVKM_ENGINE_##A
 #define B_(A) if (inst + NVKM_ENGINE_##A##0 < NVKM_ENGINE_##A##_LAST + 1)      \
 		info->index = NVKM_ENGINE_##A##0 + inst
+#define C_(A) if (inst == 0) info->index = NVKM_SUBDEV_##A
 		switch (type) {
 		case 0x00000000: A_(GR    ); break;
 		case 0x00000001: A_(CE0   ); break;
@@ -88,6 +89,7 @@ gk104_top_oneinit(struct nvkm_top *top)
 		case 0x0000000f: A_(NVENC1); break;
 		case 0x00000010: B_(NVDEC ); break;
 		case 0x00000013: B_(CE    ); break;
+		case 0x00000014: C_(GSP   ); break;
 			break;
 		default:
 			break;
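The new C_() macro parallels A_(), but maps a single-instance unit to a subdevice index instead of an engine index; for the GSP entry above it expands to:

	if (inst == 0)
		info->index = NVKM_SUBDEV_GSP;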
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/volt/Kbuild b/drivers/gpu/drm/nouveau/nvkm/subdev/volt/Kbuild
index bcd179ba11d0..146adcdd316a 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/volt/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/volt/Kbuild
@@ -2,6 +2,7 @@ nvkm-y += nvkm/subdev/volt/base.o
 nvkm-y += nvkm/subdev/volt/gpio.o
 nvkm-y += nvkm/subdev/volt/nv40.o
 nvkm-y += nvkm/subdev/volt/gf100.o
+nvkm-y += nvkm/subdev/volt/gf117.o
 nvkm-y += nvkm/subdev/volt/gk104.o
 nvkm-y += nvkm/subdev/volt/gk20a.o
 nvkm-y += nvkm/subdev/volt/gm20b.o
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gf117.c b/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gf117.c
new file mode 100644
index 000000000000..547a58f0aeac
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/volt/gf117.c
@@ -0,0 +1,60 @@
+/*
+ * Copyright 2019 Ilia Mirkin
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Ilia Mirkin
+ */
+#include "priv.h"
+
+#include <subdev/fuse.h>
+
+static int
+gf117_volt_speedo_read(struct nvkm_volt *volt)
+{
+	struct nvkm_device *device = volt->subdev.device;
+	struct nvkm_fuse *fuse = device->fuse;
+
+	if (!fuse)
+		return -EINVAL;
+
+	return nvkm_fuse_read(fuse, 0x3a8);
+}
+
+static const struct nvkm_volt_func
+gf117_volt = {
+	.oneinit = gf100_volt_oneinit,
+	.vid_get = nvkm_voltgpio_get,
+	.vid_set = nvkm_voltgpio_set,
+	.speedo_read = gf117_volt_speedo_read,
+};
+
+int
+gf117_volt_new(struct nvkm_device *device, int index, struct nvkm_volt **pvolt)
+{
+	struct nvkm_volt *volt;
+	int ret;
+
+	ret = nvkm_volt_new_(&gf117_volt, device, index, &volt);
+	*pvolt = volt;
+	if (ret)
+		return ret;
+
+	return nvkm_voltgpio_init(volt);
+}
diff --git a/drivers/gpu/drm/omapdrm/omap_connector.c b/drivers/gpu/drm/omapdrm/omap_connector.c
index b81302c4bf9e..9da94d10782a 100644
--- a/drivers/gpu/drm/omapdrm/omap_connector.c
+++ b/drivers/gpu/drm/omapdrm/omap_connector.c
@@ -17,7 +17,7 @@
 
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "omap_drv.h"
 
@@ -305,14 +305,9 @@ static int omap_connector_mode_valid(struct drm_connector *connector,
 	drm_mode_destroy(dev, new_mode);
 
 done:
-	DBG("connector: mode %s: "
-			"%d:\"%s\" %d %d %d %d %d %d %d %d %d %d 0x%x 0x%x",
+	DBG("connector: mode %s: " DRM_MODE_FMT,
 			(ret == MODE_OK) ? "valid" : "invalid",
-			mode->base.id, mode->name, mode->vrefresh, mode->clock,
-			mode->hdisplay, mode->hsync_start,
-			mode->hsync_end, mode->htotal,
-			mode->vdisplay, mode->vsync_start,
-			mode->vsync_end, mode->vtotal, mode->type, mode->flags);
+			DRM_MODE_ARG(mode));
 
 	return ret;
 }
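DRM_MODE_FMT and DRM_MODE_ARG (from <drm/drm_modes.h>) expand to the same id/name/timings list the hand-rolled format string spelt out, so an equivalent one-off debug print reduces to:

	DRM_DEBUG("requested mode: " DRM_MODE_FMT "\n", DRM_MODE_ARG(mode));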
diff --git a/drivers/gpu/drm/omapdrm/omap_crtc.c b/drivers/gpu/drm/omapdrm/omap_crtc.c
index caffc547ef97..d99e24dcc0bf 100644
--- a/drivers/gpu/drm/omapdrm/omap_crtc.c
+++ b/drivers/gpu/drm/omapdrm/omap_crtc.c
@@ -18,7 +18,6 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_mode.h>
 #include <drm/drm_plane_helper.h>
 #include <linux/math64.h>
@@ -427,12 +426,8 @@ static void omap_crtc_mode_set_nofb(struct drm_crtc *crtc)
 	struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
 	struct drm_display_mode *mode = &crtc->state->adjusted_mode;
 
-	DBG("%s: set mode: %d:\"%s\" %d %d %d %d %d %d %d %d %d %d 0x%x 0x%x",
-	    omap_crtc->name, mode->base.id, mode->name,
-	    mode->vrefresh, mode->clock,
-	    mode->hdisplay, mode->hsync_start, mode->hsync_end, mode->htotal,
-	    mode->vdisplay, mode->vsync_start, mode->vsync_end, mode->vtotal,
-	    mode->type, mode->flags);
+	DBG("%s: set mode: " DRM_MODE_FMT,
+	    omap_crtc->name, DRM_MODE_ARG(mode));
 
 	drm_display_mode_to_videomode(mode, &omap_crtc->vm);
 }
diff --git a/drivers/gpu/drm/omapdrm/omap_drv.c b/drivers/gpu/drm/omapdrm/omap_drv.c
index 5e67d58cbc28..f8292278f57d 100644
--- a/drivers/gpu/drm/omapdrm/omap_drv.c
+++ b/drivers/gpu/drm/omapdrm/omap_drv.c
@@ -21,8 +21,8 @@
 
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "omap_dmm_tiler.h"
 #include "omap_drv.h"
diff --git a/drivers/gpu/drm/omapdrm/omap_drv.h b/drivers/gpu/drm/omapdrm/omap_drv.h
index bd7f2c227a25..0c57d2814c51 100644
--- a/drivers/gpu/drm/omapdrm/omap_drv.h
+++ b/drivers/gpu/drm/omapdrm/omap_drv.h
@@ -23,7 +23,6 @@
 #include <linux/workqueue.h>
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_gem.h>
 #include <drm/omap_drm.h>
 
diff --git a/drivers/gpu/drm/omapdrm/omap_encoder.c b/drivers/gpu/drm/omapdrm/omap_encoder.c
index 933ebc9f9faa..0d85b3a35767 100644
--- a/drivers/gpu/drm/omapdrm/omap_encoder.c
+++ b/drivers/gpu/drm/omapdrm/omap_encoder.c
@@ -18,7 +18,7 @@
 #include <linux/list.h>
 
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_modeset_helper_vtables.h>
 #include <drm/drm_edid.h>
 
 #include "omap_drv.h"
@@ -76,8 +76,8 @@ static void omap_encoder_hdmi_mode_set(struct drm_encoder *encoder,
 		struct hdmi_avi_infoframe avi;
 		int r;
 
-		r = drm_hdmi_avi_infoframe_from_display_mode(&avi, adjusted_mode,
-							     false);
+		r = drm_hdmi_avi_infoframe_from_display_mode(&avi, connector,
+							     adjusted_mode);
 		if (r == 0)
 			dssdev->ops->hdmi.set_infoframe(dssdev, &avi);
 	}
diff --git a/drivers/gpu/drm/omapdrm/omap_fb.c b/drivers/gpu/drm/omapdrm/omap_fb.c
index 4d264fd554d8..4f8eb9d08f99 100644
--- a/drivers/gpu/drm/omapdrm/omap_fb.c
+++ b/drivers/gpu/drm/omapdrm/omap_fb.c
@@ -18,7 +18,7 @@
 #include <linux/seq_file.h>
 
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_modeset_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 
 #include "omap_dmm_tiler.h"
diff --git a/drivers/gpu/drm/omapdrm/omap_fbdev.c b/drivers/gpu/drm/omapdrm/omap_fbdev.c
index aee99194499f..851c59f07eb1 100644
--- a/drivers/gpu/drm/omapdrm/omap_fbdev.c
+++ b/drivers/gpu/drm/omapdrm/omap_fbdev.c
@@ -16,6 +16,7 @@
  */
 
 #include <drm/drm_crtc.h>
+#include <drm/drm_util.h>
 #include <drm/drm_fb_helper.h>
 
 #include "omap_drv.h"
diff --git a/drivers/gpu/drm/panel/Kconfig b/drivers/gpu/drm/panel/Kconfig
index 3f3537719beb..3e070153ef21 100644
--- a/drivers/gpu/drm/panel/Kconfig
+++ b/drivers/gpu/drm/panel/Kconfig
@@ -77,6 +77,17 @@ config DRM_PANEL_JDI_LT070ME05000
 	  The panel has a 1200(RGB)×1920 (WUXGA) resolution and uses
 	  24 bit per pixel.
 
+config DRM_PANEL_KINGDISPLAY_KD097D04
+	tristate "Kingdisplay kd097d04 panel"
+	depends on OF
+	depends on DRM_MIPI_DSI
+	depends on BACKLIGHT_CLASS_DEVICE
+	help
+	  Say Y here if you want to enable support for Kingdisplay kd097d04
+	  TFT-LCD modules. The panel has a 1536x2048 resolution and uses
+	  24-bit RGB per pixel. It provides a MIPI DSI interface to
+	  the host and has a built-in LED backlight.
+
 config DRM_PANEL_SAMSUNG_LD9040
 	tristate "Samsung LD9040 RGB/SPI panel"
 	depends on OF && SPI
@@ -196,6 +207,16 @@ config DRM_PANEL_SHARP_LS043T1LE01
 	  Say Y here if you want to enable support for Sharp LS043T1LE01 qHD
 	  (540x960) DSI panel as found on the Qualcomm APQ8074 Dragonboard
 
+config DRM_PANEL_SITRONIX_ST7701
+	tristate "Sitronix ST7701 panel driver"
+	depends on OF
+	depends on DRM_MIPI_DSI
+	depends on BACKLIGHT_CLASS_DEVICE
+	help
+	  Say Y here if you want to enable support for the Sitronix
+	  ST7701 controller for 480x864 LCD panels with MIPI/RGB/SPI
+	  system interfaces.
+
 config DRM_PANEL_SITRONIX_ST7789V
 	tristate "Sitronix ST7789V panel"
 	depends on OF && SPI
@@ -204,6 +225,15 @@ config DRM_PANEL_SITRONIX_ST7789V
 	  Say Y here if you want to enable support for the Sitronix
 	  ST7789V controller for 240x320 LCD panels
 
+config DRM_PANEL_TPO_TPG110
+	tristate "TPO TPG110 800x480 panel"
+	depends on OF && SPI && GPIOLIB
+	depends on BACKLIGHT_CLASS_DEVICE
+	help
+	  Say Y here if you want to enable support for TPO TPG110
+	  400CH LTPS TFT LCD Single Chip Digital Driver for up to
+	  800x480 LCD panels.
+
 config DRM_PANEL_TRULY_NT35597_WQXGA
 	tristate "Truly WQXGA"
 	depends on OF
diff --git a/drivers/gpu/drm/panel/Makefile b/drivers/gpu/drm/panel/Makefile
index 4396658a7996..e7ab71968bbf 100644
--- a/drivers/gpu/drm/panel/Makefile
+++ b/drivers/gpu/drm/panel/Makefile
@@ -6,6 +6,7 @@ obj-$(CONFIG_DRM_PANEL_ILITEK_IL9322) += panel-ilitek-ili9322.o
 obj-$(CONFIG_DRM_PANEL_ILITEK_ILI9881C) += panel-ilitek-ili9881c.o
 obj-$(CONFIG_DRM_PANEL_INNOLUX_P079ZCA) += panel-innolux-p079zca.o
 obj-$(CONFIG_DRM_PANEL_JDI_LT070ME05000) += panel-jdi-lt070me05000.o
+obj-$(CONFIG_DRM_PANEL_KINGDISPLAY_KD097D04) += panel-kingdisplay-kd097d04.o
 obj-$(CONFIG_DRM_PANEL_LG_LG4573) += panel-lg-lg4573.o
 obj-$(CONFIG_DRM_PANEL_OLIMEX_LCD_OLINUXINO) += panel-olimex-lcd-olinuxino.o
 obj-$(CONFIG_DRM_PANEL_ORISETECH_OTM8009A) += panel-orisetech-otm8009a.o
@@ -20,5 +21,7 @@ obj-$(CONFIG_DRM_PANEL_SAMSUNG_S6E8AA0) += panel-samsung-s6e8aa0.o
 obj-$(CONFIG_DRM_PANEL_SEIKO_43WVF1G) += panel-seiko-43wvf1g.o
 obj-$(CONFIG_DRM_PANEL_SHARP_LQ101R1SX01) += panel-sharp-lq101r1sx01.o
 obj-$(CONFIG_DRM_PANEL_SHARP_LS043T1LE01) += panel-sharp-ls043t1le01.o
+obj-$(CONFIG_DRM_PANEL_SITRONIX_ST7701) += panel-sitronix-st7701.o
 obj-$(CONFIG_DRM_PANEL_SITRONIX_ST7789V) += panel-sitronix-st7789v.o
+obj-$(CONFIG_DRM_PANEL_TPO_TPG110) += panel-tpo-tpg110.o
 obj-$(CONFIG_DRM_PANEL_TRULY_NT35597_WQXGA) += panel-truly-nt35597.o
diff --git a/drivers/gpu/drm/panel/panel-innolux-p079zca.c b/drivers/gpu/drm/panel/panel-innolux-p079zca.c
index ca4ae45dd307..8e5724b63f1f 100644
--- a/drivers/gpu/drm/panel/panel-innolux-p079zca.c
+++ b/drivers/gpu/drm/panel/panel-innolux-p079zca.c
@@ -70,18 +70,12 @@ static inline struct innolux_panel *to_innolux_panel(struct drm_panel *panel)
 static int innolux_panel_disable(struct drm_panel *panel)
 {
 	struct innolux_panel *innolux = to_innolux_panel(panel);
-	int err;
 
 	if (!innolux->enabled)
 		return 0;
 
 	backlight_disable(innolux->backlight);
 
-	err = mipi_dsi_dcs_set_display_off(innolux->link);
-	if (err < 0)
-		DRM_DEV_ERROR(panel->dev, "failed to set display off: %d\n",
-			      err);
-
 	innolux->enabled = false;
 
 	return 0;
@@ -95,6 +89,11 @@ static int innolux_panel_unprepare(struct drm_panel *panel)
 	if (!innolux->prepared)
 		return 0;
 
+	err = mipi_dsi_dcs_set_display_off(innolux->link);
+	if (err < 0)
+		DRM_DEV_ERROR(panel->dev, "failed to set display off: %d\n",
+			      err);
+
 	err = mipi_dsi_dcs_enter_sleep_mode(innolux->link);
 	if (err < 0) {
 		DRM_DEV_ERROR(panel->dev, "failed to enter sleep mode: %d\n",
diff --git a/drivers/gpu/drm/panel/panel-kingdisplay-kd097d04.c b/drivers/gpu/drm/panel/panel-kingdisplay-kd097d04.c
new file mode 100644
index 000000000000..2a25a914d09e
--- /dev/null
+++ b/drivers/gpu/drm/panel/panel-kingdisplay-kd097d04.c
@@ -0,0 +1,473 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (c) 2017, Fuzhou Rockchip Electronics Co., Ltd
+ */
+
+#include <linux/backlight.h>
+#include <linux/gpio/consumer.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/regulator/consumer.h>
+
+#include <drm/drmP.h>
+#include <drm/drm_crtc.h>
+#include <drm/drm_mipi_dsi.h>
+#include <drm/drm_panel.h>
+
+#include <video/mipi_display.h>
+
+struct kingdisplay_panel {
+	struct drm_panel base;
+	struct mipi_dsi_device *link;
+
+	struct backlight_device *backlight;
+	struct regulator *supply;
+	struct gpio_desc *enable_gpio;
+
+	bool prepared;
+	bool enabled;
+};
+
+struct kingdisplay_panel_cmd {
+	char cmd;
+	char data;
+};
+
+/*
+ * According to the discussion on
+ * https://review.coreboot.org/#/c/coreboot/+/22472/
+ * the panel init array is not part of the panel's datasheet but instead
+ * just came in this form from the panel vendor.
+ */
+static const struct kingdisplay_panel_cmd init_code[] = {
+	/* voltage setting */
+	{ 0xB0, 0x00 },
+	{ 0xB2, 0x02 },
+	{ 0xB3, 0x11 },
+	{ 0xB4, 0x00 },
+	{ 0xB6, 0x80 },
+	/* VCOM disable */
+	{ 0xB7, 0x02 },
+	{ 0xB8, 0x80 },
+	{ 0xBA, 0x43 },
+	/* VCOM setting */
+	{ 0xBB, 0x53 },
+	/* VSP setting */
+	{ 0xBC, 0x0A },
+	/* VSN setting */
+	{ 0xBD, 0x4A },
+	/* VGH setting */
+	{ 0xBE, 0x2F },
+	/* VGL setting */
+	{ 0xBF, 0x1A },
+	{ 0xF0, 0x39 },
+	{ 0xF1, 0x22 },
+	/* Gamma setting */
+	{ 0xB0, 0x02 },
+	{ 0xC0, 0x00 },
+	{ 0xC1, 0x01 },
+	{ 0xC2, 0x0B },
+	{ 0xC3, 0x15 },
+	{ 0xC4, 0x22 },
+	{ 0xC5, 0x11 },
+	{ 0xC6, 0x15 },
+	{ 0xC7, 0x19 },
+	{ 0xC8, 0x1A },
+	{ 0xC9, 0x16 },
+	{ 0xCA, 0x18 },
+	{ 0xCB, 0x13 },
+	{ 0xCC, 0x18 },
+	{ 0xCD, 0x13 },
+	{ 0xCE, 0x1C },
+	{ 0xCF, 0x19 },
+	{ 0xD0, 0x21 },
+	{ 0xD1, 0x2C },
+	{ 0xD2, 0x2F },
+	{ 0xD3, 0x30 },
+	{ 0xD4, 0x19 },
+	{ 0xD5, 0x1F },
+	{ 0xD6, 0x00 },
+	{ 0xD7, 0x01 },
+	{ 0xD8, 0x0B },
+	{ 0xD9, 0x15 },
+	{ 0xDA, 0x22 },
+	{ 0xDB, 0x11 },
+	{ 0xDC, 0x15 },
+	{ 0xDD, 0x19 },
+	{ 0xDE, 0x1A },
+	{ 0xDF, 0x16 },
+	{ 0xE0, 0x18 },
+	{ 0xE1, 0x13 },
+	{ 0xE2, 0x18 },
+	{ 0xE3, 0x13 },
+	{ 0xE4, 0x1C },
+	{ 0xE5, 0x19 },
+	{ 0xE6, 0x21 },
+	{ 0xE7, 0x2C },
+	{ 0xE8, 0x2F },
+	{ 0xE9, 0x30 },
+	{ 0xEA, 0x19 },
+	{ 0xEB, 0x1F },
+	/* GOA MUX setting */
+	{ 0xB0, 0x01 },
+	{ 0xC0, 0x10 },
+	{ 0xC1, 0x0F },
+	{ 0xC2, 0x0E },
+	{ 0xC3, 0x0D },
+	{ 0xC4, 0x0C },
+	{ 0xC5, 0x0B },
+	{ 0xC6, 0x0A },
+	{ 0xC7, 0x09 },
+	{ 0xC8, 0x08 },
+	{ 0xC9, 0x07 },
+	{ 0xCA, 0x06 },
+	{ 0xCB, 0x05 },
+	{ 0xCC, 0x00 },
+	{ 0xCD, 0x01 },
+	{ 0xCE, 0x02 },
+	{ 0xCF, 0x03 },
+	{ 0xD0, 0x04 },
+	{ 0xD6, 0x10 },
+	{ 0xD7, 0x0F },
+	{ 0xD8, 0x0E },
+	{ 0xD9, 0x0D },
+	{ 0xDA, 0x0C },
+	{ 0xDB, 0x0B },
+	{ 0xDC, 0x0A },
+	{ 0xDD, 0x09 },
+	{ 0xDE, 0x08 },
+	{ 0xDF, 0x07 },
+	{ 0xE0, 0x06 },
+	{ 0xE1, 0x05 },
+	{ 0xE2, 0x00 },
+	{ 0xE3, 0x01 },
+	{ 0xE4, 0x02 },
+	{ 0xE5, 0x03 },
+	{ 0xE6, 0x04 },
+	{ 0xE7, 0x00 },
+	{ 0xEC, 0xC0 },
+	/* GOA timing setting */
+	{ 0xB0, 0x03 },
+	{ 0xC0, 0x01 },
+	{ 0xC2, 0x6F },
+	{ 0xC3, 0x6F },
+	{ 0xC5, 0x36 },
+	{ 0xC8, 0x08 },
+	{ 0xC9, 0x04 },
+	{ 0xCA, 0x41 },
+	{ 0xCC, 0x43 },
+	{ 0xCF, 0x60 },
+	{ 0xD2, 0x04 },
+	{ 0xD3, 0x04 },
+	{ 0xD4, 0x03 },
+	{ 0xD5, 0x02 },
+	{ 0xD6, 0x01 },
+	{ 0xD7, 0x00 },
+	{ 0xDB, 0x01 },
+	{ 0xDE, 0x36 },
+	{ 0xE6, 0x6F },
+	{ 0xE7, 0x6F },
+	/* GOE setting */
+	{ 0xB0, 0x06 },
+	{ 0xB8, 0xA5 },
+	{ 0xC0, 0xA5 },
+	{ 0xD5, 0x3F },
+};
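Each entry is an opaque register/value pair; the prepare hook below streams the table out verbatim, one two-byte generic DSI write per entry, equivalent to:

	/* what the init loop amounts to for a single entry */
	u8 payload[2] = { init_code[i].cmd, init_code[i].data };

	mipi_dsi_generic_write(kingdisplay->link, payload, sizeof(payload));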
+
+static inline
+struct kingdisplay_panel *to_kingdisplay_panel(struct drm_panel *panel)
+{
+	return container_of(panel, struct kingdisplay_panel, base);
+}
+
+static int kingdisplay_panel_disable(struct drm_panel *panel)
+{
+	struct kingdisplay_panel *kingdisplay = to_kingdisplay_panel(panel);
+	int err;
+
+	if (!kingdisplay->enabled)
+		return 0;
+
+	backlight_disable(kingdisplay->backlight);
+
+	err = mipi_dsi_dcs_set_display_off(kingdisplay->link);
+	if (err < 0)
+		DRM_DEV_ERROR(panel->dev, "failed to set display off: %d\n",
+			      err);
+
+	kingdisplay->enabled = false;
+
+	return 0;
+}
+
+static int kingdisplay_panel_unprepare(struct drm_panel *panel)
+{
+	struct kingdisplay_panel *kingdisplay = to_kingdisplay_panel(panel);
+	int err;
+
+	if (!kingdisplay->prepared)
+		return 0;
+
+	err = mipi_dsi_dcs_enter_sleep_mode(kingdisplay->link);
+	if (err < 0) {
+		DRM_DEV_ERROR(panel->dev, "failed to enter sleep mode: %d\n",
+			      err);
+		return err;
+	}
+
+	/* T15: 120ms */
+	msleep(120);
+
+	gpiod_set_value_cansleep(kingdisplay->enable_gpio, 0);
+
+	err = regulator_disable(kingdisplay->supply);
+	if (err < 0)
+		return err;
+
+	kingdisplay->prepared = false;
+
+	return 0;
+}
+
+static int kingdisplay_panel_prepare(struct drm_panel *panel)
+{
+	struct kingdisplay_panel *kingdisplay = to_kingdisplay_panel(panel);
+	int err, regulator_err;
+	unsigned int i;
+
+	if (kingdisplay->prepared)
+		return 0;
+
+	gpiod_set_value_cansleep(kingdisplay->enable_gpio, 0);
+
+	err = regulator_enable(kingdisplay->supply);
+	if (err < 0)
+		return err;
+
+	/* T2: 15ms */
+	usleep_range(15000, 16000);
+
+	gpiod_set_value_cansleep(kingdisplay->enable_gpio, 1);
+
+	/* T4: 15ms */
+	usleep_range(15000, 16000);
+
+	for (i = 0; i < ARRAY_SIZE(init_code); i++) {
+		err = mipi_dsi_generic_write(kingdisplay->link, &init_code[i],
+					sizeof(struct kingdisplay_panel_cmd));
+		if (err < 0) {
+			DRM_DEV_ERROR(panel->dev, "failed to write init cmds: %d\n",
+				      err);
+			goto poweroff;
+		}
+	}
+
+	err = mipi_dsi_dcs_exit_sleep_mode(kingdisplay->link);
+	if (err < 0) {
+		DRM_DEV_ERROR(panel->dev, "failed to exit sleep mode: %d\n",
+			      err);
+		goto poweroff;
+	}
+
+	/* T6: 120ms */
+	msleep(120);
+
+	err = mipi_dsi_dcs_set_display_on(kingdisplay->link);
+	if (err < 0) {
+		DRM_DEV_ERROR(panel->dev, "failed to set display on: %d\n",
+			      err);
+		goto poweroff;
+	}
+
+	/* T7: 10ms */
+	usleep_range(10000, 11000);
+
+	kingdisplay->prepared = true;
+
+	return 0;
+
+poweroff:
+	gpiod_set_value_cansleep(kingdisplay->enable_gpio, 0);
+
+	regulator_err = regulator_disable(kingdisplay->supply);
+	if (regulator_err)
+		DRM_DEV_ERROR(panel->dev, "failed to disable regulator: %d\n",
+			      regulator_err);
+
+	return err;
+}
+
+static int kingdisplay_panel_enable(struct drm_panel *panel)
+{
+	struct kingdisplay_panel *kingdisplay = to_kingdisplay_panel(panel);
+	int ret;
+
+	if (kingdisplay->enabled)
+		return 0;
+
+	ret = backlight_enable(kingdisplay->backlight);
+	if (ret) {
+		DRM_DEV_ERROR(panel->drm->dev,
+			      "Failed to enable backlight: %d\n", ret);
+		return ret;
+	}
+
+	kingdisplay->enabled = true;
+
+	return 0;
+}
+
+static const struct drm_display_mode default_mode = {
+	.clock = 229000,
+	.hdisplay = 1536,
+	.hsync_start = 1536 + 100,
+	.hsync_end = 1536 + 100 + 24,
+	.htotal = 1536 + 100 + 24 + 100,
+	.vdisplay = 2048,
+	.vsync_start = 2048 + 95,
+	.vsync_end = 2048 + 95 + 2,
+	.vtotal = 2048 + 95 + 2 + 23,
+	.vrefresh = 60,
+};
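As a sanity check on these timings: htotal = 1536 + 100 + 24 + 100 = 1760, vtotal = 2048 + 95 + 2 + 23 = 2168, so the refresh rate is 229000 kHz / (1760 * 2168) ≈ 60.0 Hz, matching .vrefresh.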
+
+static int kingdisplay_panel_get_modes(struct drm_panel *panel)
+{
+	struct drm_display_mode *mode;
+
+	mode = drm_mode_duplicate(panel->drm, &default_mode);
+	if (!mode) {
+		DRM_DEV_ERROR(panel->drm->dev, "failed to add mode %ux%u@%u\n",
+			      default_mode.hdisplay, default_mode.vdisplay,
+			      default_mode.vrefresh);
+		return -ENOMEM;
+	}
+
+	drm_mode_set_name(mode);
+
+	drm_mode_probed_add(panel->connector, mode);
+
+	panel->connector->display_info.width_mm = 147;
+	panel->connector->display_info.height_mm = 196;
+	panel->connector->display_info.bpc = 8;
+
+	return 1;
+}
+
+static const struct drm_panel_funcs kingdisplay_panel_funcs = {
+	.disable = kingdisplay_panel_disable,
+	.unprepare = kingdisplay_panel_unprepare,
+	.prepare = kingdisplay_panel_prepare,
+	.enable = kingdisplay_panel_enable,
+	.get_modes = kingdisplay_panel_get_modes,
+};
+
+static const struct of_device_id kingdisplay_of_match[] = {
+	{ .compatible = "kingdisplay,kd097d04", },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, kingdisplay_of_match);
+
+static int kingdisplay_panel_add(struct kingdisplay_panel *kingdisplay)
+{
+	struct device *dev = &kingdisplay->link->dev;
+	int err;
+
+	kingdisplay->supply = devm_regulator_get(dev, "power");
+	if (IS_ERR(kingdisplay->supply))
+		return PTR_ERR(kingdisplay->supply);
+
+	kingdisplay->enable_gpio = devm_gpiod_get_optional(dev, "enable",
+							   GPIOD_OUT_HIGH);
+	if (IS_ERR(kingdisplay->enable_gpio)) {
+		err = PTR_ERR(kingdisplay->enable_gpio);
+		dev_dbg(dev, "failed to get enable gpio: %d\n", err);
+		kingdisplay->enable_gpio = NULL;
+	}
+
+	kingdisplay->backlight = devm_of_find_backlight(dev);
+	if (IS_ERR(kingdisplay->backlight))
+		return PTR_ERR(kingdisplay->backlight);
+
+	drm_panel_init(&kingdisplay->base);
+	kingdisplay->base.funcs = &kingdisplay_panel_funcs;
+	kingdisplay->base.dev = &kingdisplay->link->dev;
+
+	return drm_panel_add(&kingdisplay->base);
+}
+
+static void kingdisplay_panel_del(struct kingdisplay_panel *kingdisplay)
+{
+	drm_panel_remove(&kingdisplay->base);
+}
+
+static int kingdisplay_panel_probe(struct mipi_dsi_device *dsi)
+{
+	struct kingdisplay_panel *kingdisplay;
+	int err;
+
+	dsi->lanes = 4;
+	dsi->format = MIPI_DSI_FMT_RGB888;
+	dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST |
+			  MIPI_DSI_MODE_LPM;
+
+	kingdisplay = devm_kzalloc(&dsi->dev, sizeof(*kingdisplay), GFP_KERNEL);
+	if (!kingdisplay)
+		return -ENOMEM;
+
+	mipi_dsi_set_drvdata(dsi, kingdisplay);
+	kingdisplay->link = dsi;
+
+	err = kingdisplay_panel_add(kingdisplay);
+	if (err < 0)
+		return err;
+
+	return mipi_dsi_attach(dsi);
+}
+
+static int kingdisplay_panel_remove(struct mipi_dsi_device *dsi)
+{
+	struct kingdisplay_panel *kingdisplay = mipi_dsi_get_drvdata(dsi);
+	int err;
+
+	err = kingdisplay_panel_unprepare(&kingdisplay->base);
+	if (err < 0)
+		DRM_DEV_ERROR(&dsi->dev, "failed to unprepare panel: %d\n",
+			      err);
+
+	err = kingdisplay_panel_disable(&kingdisplay->base);
+	if (err < 0)
+		DRM_DEV_ERROR(&dsi->dev, "failed to disable panel: %d\n", err);
+
+	err = mipi_dsi_detach(dsi);
+	if (err < 0)
+		DRM_DEV_ERROR(&dsi->dev, "failed to detach from DSI host: %d\n",
+			      err);
+
+	kingdisplay_panel_del(kingdisplay);
+
+	return 0;
+}
+
+static void kingdisplay_panel_shutdown(struct mipi_dsi_device *dsi)
+{
+	struct kingdisplay_panel *kingdisplay = mipi_dsi_get_drvdata(dsi);
+
+	kingdisplay_panel_unprepare(&kingdisplay->base);
+	kingdisplay_panel_disable(&kingdisplay->base);
+}
+
+static struct mipi_dsi_driver kingdisplay_panel_driver = {
+	.driver = {
+		.name = "panel-kingdisplay-kd097d04",
+		.of_match_table = kingdisplay_of_match,
+	},
+	.probe = kingdisplay_panel_probe,
+	.remove = kingdisplay_panel_remove,
+	.shutdown = kingdisplay_panel_shutdown,
+};
+module_mipi_dsi_driver(kingdisplay_panel_driver);
+
+MODULE_AUTHOR("Chris Zhong <zyw@rock-chips.com>");
+MODULE_AUTHOR("Nickey Yang <nickey.yang@rock-chips.com>");
+MODULE_DESCRIPTION("kingdisplay KD097D04 panel driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c
index 9c69e739a524..9e8218f6a3f2 100644
--- a/drivers/gpu/drm/panel/panel-simple.c
+++ b/drivers/gpu/drm/panel/panel-simple.c
@@ -1597,6 +1597,30 @@ static const struct panel_desc kyo_tcg121xglp = {
 	.bus_format = MEDIA_BUS_FMT_RGB888_1X7X4_SPWG,
 };
 
+static const struct drm_display_mode lemaker_bl035_rgb_002_mode = {
+	.clock = 7000,
+	.hdisplay = 320,
+	.hsync_start = 320 + 20,
+	.hsync_end = 320 + 20 + 30,
+	.htotal = 320 + 20 + 30 + 38,
+	.vdisplay = 240,
+	.vsync_start = 240 + 4,
+	.vsync_end = 240 + 4 + 3,
+	.vtotal = 240 + 4 + 3 + 15,
+	.vrefresh = 60,
+};
+
+static const struct panel_desc lemaker_bl035_rgb_002 = {
+	.modes = &lemaker_bl035_rgb_002_mode,
+	.num_modes = 1,
+	.size = {
+		.width = 70,
+		.height = 52,
+	},
+	.bus_format = MEDIA_BUS_FMT_RGB888_1X24,
+	.bus_flags = DRM_BUS_FLAG_DE_LOW,
+};
+
 static const struct drm_display_mode lg_lb070wv8_mode = {
 	.clock = 33246,
 	.hdisplay = 800,
@@ -2008,6 +2032,30 @@ static const struct panel_desc ortustech_com43h4m85ulc = {
 	.bus_flags = DRM_BUS_FLAG_DE_HIGH | DRM_BUS_FLAG_PIXDATA_POSEDGE,
 };
 
+static const struct drm_display_mode pda_91_00156_a0_mode = {
+	.clock = 33300,
+	.hdisplay = 800,
+	.hsync_start = 800 + 1,
+	.hsync_end = 800 + 1 + 64,
+	.htotal = 800 + 1 + 64 + 64,
+	.vdisplay = 480,
+	.vsync_start = 480 + 1,
+	.vsync_end = 480 + 1 + 23,
+	.vtotal = 480 + 1 + 23 + 22,
+	.vrefresh = 60,
+};
+
+static const struct panel_desc pda_91_00156_a0 = {
+	.modes = &pda_91_00156_a0_mode,
+	.num_modes = 1,
+	.size = {
+		.width = 152,
+		.height = 91,
+	},
+	.bus_format = MEDIA_BUS_FMT_RGB888_1X24,
+};
+
 static const struct drm_display_mode qd43003c0_40_mode = {
 	.clock = 9000,
 	.hdisplay = 480,
@@ -2638,6 +2686,9 @@ static const struct of_device_id platform_of_match[] = {
 		.compatible = "kyo,tcg121xglp",
 		.data = &kyo_tcg121xglp,
 	}, {
+		.compatible = "lemaker,bl035-rgb-002",
+		.data = &lemaker_bl035_rgb_002,
+	}, {
 		.compatible = "lg,lb070wv8",
 		.data = &lg_lb070wv8,
 	}, {
@@ -2686,6 +2737,9 @@ static const struct of_device_id platform_of_match[] = {
 		.compatible = "ortustech,com43h4m85ulc",
 		.data = &ortustech_com43h4m85ulc,
 	}, {
+		.compatible = "pda,91-00156-a0",
+		.data = &pda_91_00156_a0,
+	}, {
 		.compatible = "qiaodian,qd43003c0-40",
 		.data = &qd43003c0_40,
 	}, {
diff --git a/drivers/gpu/drm/panel/panel-sitronix-st7701.c b/drivers/gpu/drm/panel/panel-sitronix-st7701.c
new file mode 100644
index 000000000000..63f9a1c7fb1b
--- /dev/null
+++ b/drivers/gpu/drm/panel/panel-sitronix-st7701.c
@@ -0,0 +1,426 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2019, Amarula Solutions.
+ * Author: Jagan Teki <jagan@amarulasolutions.com>
+ */
+
+#include <drm/drm_mipi_dsi.h>
+#include <drm/drm_modes.h>
+#include <drm/drm_panel.h>
+#include <drm/drm_print.h>
+
+#include <linux/backlight.h>
+#include <linux/gpio/consumer.h>
+#include <linux/delay.h>
+#include <linux/module.h>
+#include <linux/of_device.h>
+#include <linux/regulator/consumer.h>
+
+#include <video/mipi_display.h>
+
+/* Command2 BKx selection command */
+#define DSI_CMD2BKX_SEL			0xFF
+
+/* Command2, BK0 commands */
+#define DSI_CMD2_BK0_PVGAMCTRL		0xB0 /* Positive Voltage Gamma Control */
+#define DSI_CMD2_BK0_NVGAMCTRL		0xB1 /* Negative Voltage Gamma Control */
+#define DSI_CMD2_BK0_LNESET		0xC0 /* Display Line setting */
+#define DSI_CMD2_BK0_PORCTRL		0xC1 /* Porch control */
+#define DSI_CMD2_BK0_INVSEL		0xC2 /* Inversion selection, Frame Rate Control */
+
+/* Command2, BK1 commands */
+#define DSI_CMD2_BK1_VRHS		0xB0 /* Vop amplitude setting */
+#define DSI_CMD2_BK1_VCOM		0xB1 /* VCOM amplitude setting */
+#define DSI_CMD2_BK1_VGHSS		0xB2 /* VGH Voltage setting */
+#define DSI_CMD2_BK1_TESTCMD		0xB3 /* TEST Command Setting */
+#define DSI_CMD2_BK1_VGLS		0xB5 /* VGL Voltage setting */
+#define DSI_CMD2_BK1_PWCTLR1		0xB7 /* Power Control 1 */
+#define DSI_CMD2_BK1_PWCTLR2		0xB8 /* Power Control 2 */
+#define DSI_CMD2_BK1_SPD1		0xC1 /* Source pre_drive timing set1 */
+#define DSI_CMD2_BK1_SPD2		0xC2 /* Source EQ2 Setting */
+#define DSI_CMD2_BK1_MIPISET1		0xD0 /* MIPI Setting 1 */
+
+/*
+ * Command2 with BK function selection.
+ *
+ * BIT[4, 0]: [CN2, BKXSEL]
+ * 10 = CMD2BK0, Command2 BK0
+ * 11 = CMD2BK1, Command2 BK1
+ * 00 = Command2 disable
+ */
+#define DSI_CMD2BK1_SEL			0x11
+#define DSI_CMD2BK0_SEL			0x10
+#define DSI_CMD2BKX_SEL_NONE		0x00
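Reading the selector bytes against the layout above: DSI_CMD2BK0_SEL (0x10) sets CN2 with BKXSEL = 0 to open Command2 BK0, DSI_CMD2BK1_SEL (0x11) sets CN2 with BKXSEL = 1 for BK1, and DSI_CMD2BKX_SEL_NONE (0x00) clears CN2 to disable Command2 access again.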
+
+/* Command2, BK0 bytes */
+#define DSI_LINESET_LINE		0x69
+#define DSI_LINESET_LDE_EN		BIT(7)
+#define DSI_LINESET_LINEDELTA		GENMASK(1, 0)
+#define DSI_CMD2_BK0_LNESET_B1		DSI_LINESET_LINEDELTA
+#define DSI_CMD2_BK0_LNESET_B0		(DSI_LINESET_LDE_EN | DSI_LINESET_LINE)
+#define DSI_INVSEL_DEFAULT		GENMASK(5, 4)
+#define DSI_INVSEL_NLINV		GENMASK(2, 0)
+#define DSI_INVSEL_RTNI			GENMASK(2, 1)
+#define DSI_CMD2_BK0_INVSEL_B1		DSI_INVSEL_RTNI
+#define DSI_CMD2_BK0_INVSEL_B0		(DSI_INVSEL_DEFAULT | DSI_INVSEL_NLINV)
+#define DSI_CMD2_BK0_PORCTRL_B0(m)	((m)->vtotal - (m)->vsync_end)
+#define DSI_CMD2_BK0_PORCTRL_B1(m)	((m)->vsync_start - (m)->vdisplay)
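For the ts8550b mode defined later in this file, the PORCTRL bytes work out to the vertical back porch, vtotal - vsync_end = 884 - 866 = 18, and the vertical front porch, vsync_start - vdisplay = 858 - 854 = 4.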
+
+/* Command2, BK1 bytes */
+#define DSI_CMD2_BK1_VRHA_SET		0x45
+#define DSI_CMD2_BK1_VCOM_SET		0x13
+#define DSI_CMD2_BK1_VGHSS_SET		GENMASK(2, 0)
+#define DSI_CMD2_BK1_TESTCMD_VAL	BIT(7)
+#define DSI_VGLS_DEFAULT		BIT(6)
+#define DSI_VGLS_SEL			GENMASK(2, 0)
+#define DSI_CMD2_BK1_VGLS_SET		(DSI_VGLS_DEFAULT | DSI_VGLS_SEL)
+#define DSI_PWCTLR1_AP			BIT(7) /* Gamma OP bias, max */
+#define DSI_PWCTLR1_APIS		BIT(2) /* Source OP input bias, min */
+#define DSI_PWCTLR1_APOS		BIT(0) /* Source OP output bias, min */
+#define DSI_CMD2_BK1_PWCTLR1_SET	(DSI_PWCTLR1_AP | DSI_PWCTLR1_APIS | \
+					DSI_PWCTLR1_APOS)
+#define DSI_PWCTLR2_AVDD		BIT(5) /* AVDD 6.6v */
+#define DSI_PWCTLR2_AVCL		0x0    /* AVCL -4.4v */
+#define DSI_CMD2_BK1_PWCTLR2_SET	(DSI_PWCTLR2_AVDD | DSI_PWCTLR2_AVCL)
+#define DSI_SPD1_T2D			BIT(3)
+#define DSI_CMD2_BK1_SPD1_SET		(GENMASK(6, 4) | DSI_SPD1_T2D)
+#define DSI_CMD2_BK1_SPD2_SET		DSI_CMD2_BK1_SPD1_SET
+#define DSI_MIPISET1_EOT_EN		BIT(3)
+#define DSI_CMD2_BK1_MIPISET1_SET	(BIT(7) | DSI_MIPISET1_EOT_EN)
+
+struct st7701_panel_desc {
+	const struct drm_display_mode *mode;
+	unsigned int lanes;
+	unsigned long flags;
+	enum mipi_dsi_pixel_format format;
+	const char *const *supply_names;
+	unsigned int num_supplies;
+	unsigned int panel_sleep_delay;
+};
+
+struct st7701 {
+	struct drm_panel panel;
+	struct mipi_dsi_device *dsi;
+	const struct st7701_panel_desc *desc;
+
+	struct backlight_device *backlight;
+	struct regulator_bulk_data *supplies;
+	struct gpio_desc *reset;
+	unsigned int sleep_delay;
+};
+
+static inline struct st7701 *panel_to_st7701(struct drm_panel *panel)
+{
+	return container_of(panel, struct st7701, panel);
+}
+
+static inline int st7701_dsi_write(struct st7701 *st7701, const void *seq,
+				   size_t len)
+{
+	return mipi_dsi_dcs_write_buffer(st7701->dsi, seq, len);
+}
+
+#define ST7701_DSI(st7701, seq...)				\
+	{							\
+		const u8 d[] = { seq };				\
+		st7701_dsi_write(st7701, d, ARRAY_SIZE(d));	\
+	}
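The macro wraps each command in its own block so the variadic byte list can initialise a local array; ST7701_DSI(st7701, 0xE4, 0x44, 0x44), for example, expands to:

	{
		const u8 d[] = { 0xE4, 0x44, 0x44 };
		st7701_dsi_write(st7701, d, ARRAY_SIZE(d));
	}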
+
+static void st7701_init_sequence(struct st7701 *st7701)
+{
+	const struct drm_display_mode *mode = st7701->desc->mode;
+
+	ST7701_DSI(st7701, MIPI_DCS_SOFT_RESET, 0x00);
+
+	/* We need to wait 5ms before sending new commands */
+	msleep(5);
+
+	ST7701_DSI(st7701, MIPI_DCS_EXIT_SLEEP_MODE, 0x00);
+
+	msleep(st7701->sleep_delay);
+
+	/* Command2, BK0 */
+	ST7701_DSI(st7701, DSI_CMD2BKX_SEL,
+		   0x77, 0x01, 0x00, 0x00, DSI_CMD2BK0_SEL);
+	ST7701_DSI(st7701, DSI_CMD2_BK0_PVGAMCTRL, 0x00, 0x0E, 0x15, 0x0F,
+		   0x11, 0x08, 0x08, 0x08, 0x08, 0x23, 0x04, 0x13, 0x12,
+		   0x2B, 0x34, 0x1F);
+	ST7701_DSI(st7701, DSI_CMD2_BK0_NVGAMCTRL, 0x00, 0x0E, 0x95, 0x0F,
+		   0x13, 0x07, 0x09, 0x08, 0x08, 0x22, 0x04, 0x10, 0x0E,
+		   0x2C, 0x34, 0x1F);
+	ST7701_DSI(st7701, DSI_CMD2_BK0_LNESET,
+		   DSI_CMD2_BK0_LNESET_B0, DSI_CMD2_BK0_LNESET_B1);
+	ST7701_DSI(st7701, DSI_CMD2_BK0_PORCTRL,
+		   DSI_CMD2_BK0_PORCTRL_B0(mode),
+		   DSI_CMD2_BK0_PORCTRL_B1(mode));
+	ST7701_DSI(st7701, DSI_CMD2_BK0_INVSEL,
+		   DSI_CMD2_BK0_INVSEL_B0, DSI_CMD2_BK0_INVSEL_B1);
+
+	/* Command2, BK1 */
+	ST7701_DSI(st7701, DSI_CMD2BKX_SEL,
+			0x77, 0x01, 0x00, 0x00, DSI_CMD2BK1_SEL);
+	ST7701_DSI(st7701, DSI_CMD2_BK1_VRHS, DSI_CMD2_BK1_VRHA_SET);
+	ST7701_DSI(st7701, DSI_CMD2_BK1_VCOM, DSI_CMD2_BK1_VCOM_SET);
+	ST7701_DSI(st7701, DSI_CMD2_BK1_VGHSS, DSI_CMD2_BK1_VGHSS_SET);
+	ST7701_DSI(st7701, DSI_CMD2_BK1_TESTCMD, DSI_CMD2_BK1_TESTCMD_VAL);
+	ST7701_DSI(st7701, DSI_CMD2_BK1_VGLS, DSI_CMD2_BK1_VGLS_SET);
+	ST7701_DSI(st7701, DSI_CMD2_BK1_PWCTLR1, DSI_CMD2_BK1_PWCTLR1_SET);
+	ST7701_DSI(st7701, DSI_CMD2_BK1_PWCTLR2, DSI_CMD2_BK1_PWCTLR2_SET);
+	ST7701_DSI(st7701, DSI_CMD2_BK1_SPD1, DSI_CMD2_BK1_SPD1_SET);
+	ST7701_DSI(st7701, DSI_CMD2_BK1_SPD2, DSI_CMD2_BK1_SPD2_SET);
+	ST7701_DSI(st7701, DSI_CMD2_BK1_MIPISET1, DSI_CMD2_BK1_MIPISET1_SET);
+
+	/*
+	 * ST7701_SPEC_V1.2 does not provide enough information about this
+	 * specific command sequence, so it is taken from the vendor BSP
+	 * driver.
+	 */
+	ST7701_DSI(st7701, 0xE0, 0x00, 0x00, 0x02);
+	ST7701_DSI(st7701, 0xE1, 0x0B, 0x00, 0x0D, 0x00, 0x0C, 0x00, 0x0E,
+		   0x00, 0x00, 0x44, 0x44);
+	ST7701_DSI(st7701, 0xE2, 0x33, 0x33, 0x44, 0x44, 0x64, 0x00, 0x66,
+		   0x00, 0x65, 0x00, 0x67, 0x00, 0x00);
+	ST7701_DSI(st7701, 0xE3, 0x00, 0x00, 0x33, 0x33);
+	ST7701_DSI(st7701, 0xE4, 0x44, 0x44);
+	ST7701_DSI(st7701, 0xE5, 0x0C, 0x78, 0x3C, 0xA0, 0x0E, 0x78, 0x3C,
+		   0xA0, 0x10, 0x78, 0x3C, 0xA0, 0x12, 0x78, 0x3C, 0xA0);
+	ST7701_DSI(st7701, 0xE6, 0x00, 0x00, 0x33, 0x33);
+	ST7701_DSI(st7701, 0xE7, 0x44, 0x44);
+	ST7701_DSI(st7701, 0xE8, 0x0D, 0x78, 0x3C, 0xA0, 0x0F, 0x78, 0x3C,
+		   0xA0, 0x11, 0x78, 0x3C, 0xA0, 0x13, 0x78, 0x3C, 0xA0);
+	ST7701_DSI(st7701, 0xEB, 0x02, 0x02, 0x39, 0x39, 0xEE, 0x44, 0x00);
+	ST7701_DSI(st7701, 0xEC, 0x00, 0x00);
+	ST7701_DSI(st7701, 0xED, 0xFF, 0xF1, 0x04, 0x56, 0x72, 0x3F, 0xFF,
+		   0xFF, 0xFF, 0xFF, 0xF3, 0x27, 0x65, 0x40, 0x1F, 0xFF);
+
+	/* disable Command2 */
+	ST7701_DSI(st7701, DSI_CMD2BKX_SEL,
+		   0x77, 0x01, 0x00, 0x00, DSI_CMD2BKX_SEL_NONE);
+}
+
+static int st7701_prepare(struct drm_panel *panel)
+{
+	struct st7701 *st7701 = panel_to_st7701(panel);
+	int ret;
+
+	gpiod_set_value(st7701->reset, 0);
+
+	ret = regulator_bulk_enable(st7701->desc->num_supplies,
+				    st7701->supplies);
+	if (ret < 0)
+		return ret;
+	msleep(20);
+
+	gpiod_set_value(st7701->reset, 1);
+	msleep(150);
+
+	st7701_init_sequence(st7701);
+
+	return 0;
+}
+
+static int st7701_enable(struct drm_panel *panel)
+{
+	struct st7701 *st7701 = panel_to_st7701(panel);
+
+	ST7701_DSI(st7701, MIPI_DCS_SET_DISPLAY_ON, 0x00);
+	backlight_enable(st7701->backlight);
+
+	return 0;
+}
+
+static int st7701_disable(struct drm_panel *panel)
+{
+	struct st7701 *st7701 = panel_to_st7701(panel);
+
+	backlight_disable(st7701->backlight);
+	ST7701_DSI(st7701, MIPI_DCS_SET_DISPLAY_OFF, 0x00);
+
+	return 0;
+}
+
+static int st7701_unprepare(struct drm_panel *panel)
+{
+	struct st7701 *st7701 = panel_to_st7701(panel);
+
+	ST7701_DSI(st7701, MIPI_DCS_ENTER_SLEEP_MODE, 0x00);
+
+	msleep(st7701->sleep_delay);
+
+	gpiod_set_value(st7701->reset, 0);
+
+	/*
+	 * During the reset period the display is blanked: the blanking
+	 * sequence takes at most 120 ms when reset starts in Sleep Out
+	 * mode, and the display stays blank in Sleep In mode.  The panel
+	 * then returns to its default hardware-reset condition.
+	 *
+	 * So wait sleep_delay to make sure the reset has completed.
+	 */
+	msleep(st7701->sleep_delay);
+
+	regulator_bulk_disable(st7701->desc->num_supplies, st7701->supplies);
+
+	return 0;
+}
+
+static int st7701_get_modes(struct drm_panel *panel)
+{
+	struct st7701 *st7701 = panel_to_st7701(panel);
+	const struct drm_display_mode *desc_mode = st7701->desc->mode;
+	struct drm_display_mode *mode;
+
+	mode = drm_mode_duplicate(panel->drm, desc_mode);
+	if (!mode) {
+		DRM_DEV_ERROR(&st7701->dsi->dev,
+			      "failed to add mode %ux%u@%u\n",
+			      desc_mode->hdisplay, desc_mode->vdisplay,
+			      desc_mode->vrefresh);
+		return -ENOMEM;
+	}
+
+	drm_mode_set_name(mode);
+	drm_mode_probed_add(panel->connector, mode);
+
+	panel->connector->display_info.width_mm = desc_mode->width_mm;
+	panel->connector->display_info.height_mm = desc_mode->height_mm;
+
+	return 1;
+}
+
+static const struct drm_panel_funcs st7701_funcs = {
+	.disable	= st7701_disable,
+	.unprepare	= st7701_unprepare,
+	.prepare	= st7701_prepare,
+	.enable		= st7701_enable,
+	.get_modes	= st7701_get_modes,
+};
+
+static const struct drm_display_mode ts8550b_mode = {
+	.clock		= 27500,
+
+	.hdisplay	= 480,
+	.hsync_start	= 480 + 38,
+	.hsync_end	= 480 + 38 + 12,
+	.htotal		= 480 + 38 + 12 + 12,
+
+	.vdisplay	= 854,
+	.vsync_start	= 854 + 4,
+	.vsync_end	= 854 + 4 + 8,
+	.vtotal		= 854 + 4 + 8 + 18,
+
+	.width_mm	= 69,
+	.height_mm	= 139,
+
+	.type = DRM_MODE_TYPE_DRIVER | DRM_MODE_TYPE_PREFERRED,
+};
+
+static const char * const ts8550b_supply_names[] = {
+	"VCC",
+	"IOVCC",
+};
+
+static const struct st7701_panel_desc ts8550b_desc = {
+	.mode = &ts8550b_mode,
+	.lanes = 2,
+	.flags = MIPI_DSI_MODE_VIDEO,
+	.format = MIPI_DSI_FMT_RGB888,
+	.supply_names = ts8550b_supply_names,
+	.num_supplies = ARRAY_SIZE(ts8550b_supply_names),
+	.panel_sleep_delay = 80, /* panel needs extra 80ms for sleep out cmd */
+};
+
+static int st7701_dsi_probe(struct mipi_dsi_device *dsi)
+{
+	const struct st7701_panel_desc *desc;
+	struct st7701 *st7701;
+	int ret, i;
+
+	st7701 = devm_kzalloc(&dsi->dev, sizeof(*st7701), GFP_KERNEL);
+	if (!st7701)
+		return -ENOMEM;
+
+	desc = of_device_get_match_data(&dsi->dev);
+	dsi->mode_flags = desc->flags;
+	dsi->format = desc->format;
+	dsi->lanes = desc->lanes;
+
+	st7701->supplies = devm_kcalloc(&dsi->dev, desc->num_supplies,
+					sizeof(*st7701->supplies),
+					GFP_KERNEL);
+	if (!st7701->supplies)
+		return -ENOMEM;
+
+	for (i = 0; i < desc->num_supplies; i++)
+		st7701->supplies[i].supply = desc->supply_names[i];
+
+	ret = devm_regulator_bulk_get(&dsi->dev, desc->num_supplies,
+				      st7701->supplies);
+	if (ret < 0)
+		return ret;
+
+	st7701->reset = devm_gpiod_get(&dsi->dev, "reset", GPIOD_OUT_LOW);
+	if (IS_ERR(st7701->reset)) {
+		DRM_DEV_ERROR(&dsi->dev, "Couldn't get our reset GPIO\n");
+		return PTR_ERR(st7701->reset);
+	}
+
+	st7701->backlight = devm_of_find_backlight(&dsi->dev);
+	if (IS_ERR(st7701->backlight))
+		return PTR_ERR(st7701->backlight);
+
+	drm_panel_init(&st7701->panel);
+
+	/*
+	 * Once sleep out has been issued, the ST7701 IC requires 120ms
+	 * before it will accept new commands.
+	 *
+	 * On top of that, some panels need an extra delay, so add a
+	 * panel-specific delay for those cases.  This delay is taken
+	 * from the respective panel BSP drivers (e.g. ts8550b), as no
+	 * valid documentation exists for it.
+	 */
+	st7701->sleep_delay = 120 + desc->panel_sleep_delay;
+	st7701->panel.funcs = &st7701_funcs;
+	st7701->panel.dev = &dsi->dev;
+
+	ret = drm_panel_add(&st7701->panel);
+	if (ret < 0)
+		return ret;
+
+	mipi_dsi_set_drvdata(dsi, st7701);
+	st7701->dsi = dsi;
+	st7701->desc = desc;
+
+	return mipi_dsi_attach(dsi);
+}
+
+static int st7701_dsi_remove(struct mipi_dsi_device *dsi)
+{
+	struct st7701 *st7701 = mipi_dsi_get_drvdata(dsi);
+
+	mipi_dsi_detach(dsi);
+	drm_panel_remove(&st7701->panel);
+
+	return 0;
+}
+
+static const struct of_device_id st7701_of_match[] = {
+	{ .compatible = "techstar,ts8550b", .data = &ts8550b_desc },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, st7701_of_match);
+
+static struct mipi_dsi_driver st7701_dsi_driver = {
+	.probe		= st7701_dsi_probe,
+	.remove		= st7701_dsi_remove,
+	.driver = {
+		.name		= "st7701",
+		.of_match_table	= st7701_of_match,
+	},
+};
+module_mipi_dsi_driver(st7701_dsi_driver);
+
+MODULE_AUTHOR("Jagan Teki <jagan@amarulasolutions.com>");
+MODULE_DESCRIPTION("Sitronix ST7701 LCD Panel Driver");
+MODULE_LICENSE("GPL");
diff --git a/drivers/gpu/drm/panel/panel-tpo-tpg110.c b/drivers/gpu/drm/panel/panel-tpo-tpg110.c
new file mode 100644
index 000000000000..5a9f8f4d5d24
--- /dev/null
+++ b/drivers/gpu/drm/panel/panel-tpo-tpg110.c
@@ -0,0 +1,496 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Panel driver for the TPO TPG110 400CH LTPS TFT LCD Single Chip
+ * Digital Driver.
+ *
+ * This chip drives a TFT LCD, but it does not know what kind of
+ * display is actually connected to it, so the width and height of
+ * that display need to be supplied from the machine configuration.
+ *
+ * Author:
+ * Linus Walleij <linus.walleij@linaro.org>
+ */
+#include <drm/drm_modes.h>
+#include <drm/drm_panel.h>
+#include <drm/drm_print.h>
+
+#include <linux/backlight.h>
+#include <linux/bitops.h>
+#include <linux/delay.h>
+#include <linux/gpio/consumer.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/spi/spi.h>
+
+#define TPG110_TEST			0x00
+#define TPG110_CHIPID			0x01
+#define TPG110_CTRL1			0x02
+#define TPG110_RES_MASK			GENMASK(2, 0)
+#define TPG110_RES_800X480		0x07
+#define TPG110_RES_640X480		0x06
+#define TPG110_RES_480X272		0x05
+#define TPG110_RES_480X640		0x04
+#define TPG110_RES_480X272_D		0x01 /* Dual scan: outputs 800x480 */
+#define TPG110_RES_400X240_D		0x00 /* Dual scan: outputs 800x480 */
+#define TPG110_CTRL2			0x03
+#define TPG110_CTRL2_PM			BIT(0)
+#define TPG110_CTRL2_RES_PM_CTRL	BIT(7)
+
+/**
+ * struct tpg110_panel_mode - lookup struct for the supported modes
+ */
+struct tpg110_panel_mode {
+	/**
+	 * @name: the name of this panel
+	 */
+	const char *name;
+	/**
+	 * @magic: the magic value from the detection register
+	 */
+	u32 magic;
+	/**
+	 * @mode: the DRM display mode for this panel
+	 */
+	struct drm_display_mode mode;
+	/**
+	 * @bus_flags: the DRM bus flags for this panel e.g. inverted clock
+	 */
+	u32 bus_flags;
+};
+
+/**
+ * struct tpg110 - state container for the TPG110 panel
+ */
+struct tpg110 {
+	/**
+	 * @dev: the container device
+	 */
+	struct device *dev;
+	/**
+	 * @spi: the corresponding SPI device
+	 */
+	struct spi_device *spi;
+	/**
+	 * @panel: the DRM panel instance for this device
+	 */
+	struct drm_panel panel;
+	/**
+	 * @backlight: backlight for this panel
+	 */
+	struct backlight_device *backlight;
+	/**
+	 * @panel_type: the panel mode as detected
+	 */
+	const struct tpg110_panel_mode *panel_mode;
+	/**
+	 * @width: the width of this panel in mm
+	 */
+	u32 width;
+	/**
+	 * @height: the height of this panel in mm
+	 */
+	u32 height;
+	/**
+	 * @grestb: reset GPIO line
+	 */
+	struct gpio_desc *grestb;
+};
+
+/*
+ * TPG110 modes.  These are the simple modes; the dual-scan modes that
+ * take 400x240 or 480x272 in and display them as 800x480 are not
+ * listed.
+ */
+static const struct tpg110_panel_mode tpg110_modes[] = {
+	{
+		.name = "800x480 RGB",
+		.magic = TPG110_RES_800X480,
+		.mode = {
+			.clock = 33200,
+			.hdisplay = 800,
+			.hsync_start = 800 + 40,
+			.hsync_end = 800 + 40 + 1,
+			.htotal = 800 + 40 + 1 + 216,
+			.vdisplay = 480,
+			.vsync_start = 480 + 10,
+			.vsync_end = 480 + 10 + 1,
+			.vtotal = 480 + 10 + 1 + 35,
+			.vrefresh = 60,
+		},
+		.bus_flags = DRM_BUS_FLAG_PIXDATA_POSEDGE,
+	},
+	{
+		.name = "640x480 RGB",
+		.magic = TPG110_RES_640X480,
+		.mode = {
+			.clock = 25200,
+			.hdisplay = 640,
+			.hsync_start = 640 + 24,
+			.hsync_end = 640 + 24 + 1,
+			.htotal = 640 + 24 + 1 + 136,
+			.vdisplay = 480,
+			.vsync_start = 480 + 18,
+			.vsync_end = 480 + 18 + 1,
+			.vtotal = 480 + 18 + 1 + 27,
+			.vrefresh = 60,
+		},
+		.bus_flags = DRM_BUS_FLAG_PIXDATA_POSEDGE,
+	},
+	{
+		.name = "480x272 RGB",
+		.magic = TPG110_RES_480X272,
+		.mode = {
+			.clock = 9000,
+			.hdisplay = 480,
+			.hsync_start = 480 + 2,
+			.hsync_end = 480 + 2 + 1,
+			.htotal = 480 + 2 + 1 + 43,
+			.vdisplay = 272,
+			.vsync_start = 272 + 2,
+			.vsync_end = 272 + 2 + 1,
+			.vtotal = 272 + 2 + 1 + 12,
+			.vrefresh = 60,
+		},
+		.bus_flags = DRM_BUS_FLAG_PIXDATA_POSEDGE,
+	},
+	{
+		.name = "480x640 RGB",
+		.magic = TPG110_RES_480X640,
+		.mode = {
+			.clock = 20500,
+			.hdisplay = 480,
+			.hsync_start = 480 + 2,
+			.hsync_end = 480 + 2 + 1,
+			.htotal = 480 + 2 + 1 + 43,
+			.vdisplay = 640,
+			.vsync_start = 640 + 4,
+			.vsync_end = 640 + 4 + 1,
+			.vtotal = 640 + 4 + 1 + 8,
+			.vrefresh = 60,
+		},
+		.bus_flags = DRM_BUS_FLAG_PIXDATA_POSEDGE,
+	},
+	{
+		.name = "400x240 RGB",
+		.magic = TPG110_RES_400X240_D,
+		.mode = {
+			.clock = 8300,
+			.hdisplay = 400,
+			.hsync_start = 400 + 20,
+			.hsync_end = 400 + 20 + 1,
+			.htotal = 400 + 20 + 1 + 108,
+			.vdisplay = 240,
+			.vsync_start = 240 + 2,
+			.vsync_end = 240 + 2 + 1,
+			.vtotal = 240 + 2 + 1 + 20,
+			.vrefresh = 60,
+		},
+		.bus_flags = DRM_BUS_FLAG_PIXDATA_POSEDGE,
+	},
+};
+
+static inline struct tpg110 *
+to_tpg110(struct drm_panel *panel)
+{
+	return container_of(panel, struct tpg110, panel);
+}
+
+static u8 tpg110_readwrite_reg(struct tpg110 *tpg, bool write,
+			       u8 address, u8 outval)
+{
+	struct spi_message m;
+	struct spi_transfer t[2];
+	u8 buf[2];
+	int ret;
+
+	spi_message_init(&m);
+	memset(t, 0, sizeof(t));
+
+	if (write) {
+		/*
+		 * Clear address bits 0 and 1 when writing, just to be
+		 * sure.  The actual bit indicating a write here is bit
+		 * 1; bit 0 is just surplus to pad it up to 8 bits.
+		 */
+		buf[0] = address << 2;
+		buf[0] &= ~0x03;
+		buf[1] = outval;
+
+		t[0].bits_per_word = 8;
+		t[0].tx_buf = &buf[0];
+		t[0].len = 1;
+
+		t[1].tx_buf = &buf[1];
+		t[1].len = 1;
+		t[1].bits_per_word = 8;
+	} else {
+		/* Set address bit 0 to 1 to read */
+		buf[0] = address << 1;
+		buf[0] |= 0x01;
+
+		/*
+		 * The 8th bit/clock is the high-impedance (Hi-Z)
+		 * turnaround cycle, so only 7 bits are sent here.
+		 */
+		t[0].bits_per_word = 7;
+		t[0].tx_buf = &buf[0];
+		t[0].len = 1;
+
+		t[1].rx_buf = &buf[1];
+		t[1].len = 1;
+		t[1].bits_per_word = 8;
+	}
+
+	spi_message_add_tail(&t[0], &m);
+	spi_message_add_tail(&t[1], &m);
+	ret = spi_sync(tpg->spi, &m);
+	if (ret) {
+		DRM_DEV_ERROR(tpg->dev, "SPI message error %d\n", ret);
+		return ret;
+	}
+	if (write)
+		return 0;
+	/* Read */
+	return buf[1];
+}
+
+static u8 tpg110_read_reg(struct tpg110 *tpg, u8 address)
+{
+	return tpg110_readwrite_reg(tpg, false, address, 0);
+}
+
+static void tpg110_write_reg(struct tpg110 *tpg, u8 address, u8 outval)
+{
+	tpg110_readwrite_reg(tpg, true, address, outval);
+}
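The first byte on the wire therefore encodes both the register address and the transfer direction: writes send address << 2 with bits 1:0 clear, while reads send (address << 1) | 1 clocked out as 7 bits so the eighth clock can serve as the Hi-Z turnaround before the reply. For example (values worked out from the encoding above, not extra driver code):

	/* write CTRL2 (0x03): first byte = 0x03 << 2 = 0x0c */
	tpg110_write_reg(tpg, TPG110_CTRL2, val);

	/* read CHIPID (0x01): first byte = (0x01 << 1) | 1 = 0x03,
	 * sent as 7 bits; the reply arrives in the second transfer
	 */
	val = tpg110_read_reg(tpg, TPG110_CHIPID);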
+
+static int tpg110_startup(struct tpg110 *tpg)
+{
+	u8 val;
+	int i;
+
+	/* De-assert the reset signal */
+	gpiod_set_value_cansleep(tpg->grestb, 0);
+	usleep_range(1000, 2000);
+	DRM_DEV_DEBUG(tpg->dev, "de-asserted GRESTB\n");
+
+	/* Test display communication */
+	tpg110_write_reg(tpg, TPG110_TEST, 0x55);
+	val = tpg110_read_reg(tpg, TPG110_TEST);
+	if (val != 0x55) {
+		DRM_DEV_ERROR(tpg->dev, "failed communication test\n");
+		return -ENODEV;
+	}
+
+	val = tpg110_read_reg(tpg, TPG110_CHIPID);
+	DRM_DEV_INFO(tpg->dev, "TPG110 chip ID: %d version: %d\n",
+		 val >> 4, val & 0x0f);
+
+	/* Show display resolution */
+	val = tpg110_read_reg(tpg, TPG110_CTRL1);
+	val &= TPG110_RES_MASK;
+	switch (val) {
+	case TPG110_RES_400X240_D:
+		DRM_DEV_INFO(tpg->dev,
+			 "IN 400x240 RGB -> OUT 800x480 RGB (dual scan)\n");
+		break;
+	case TPG110_RES_480X272_D:
+		DRM_DEV_INFO(tpg->dev,
+			 "IN 480x272 RGB -> OUT 800x480 RGB (dual scan)\n");
+		break;
+	case TPG110_RES_480X640:
+		DRM_DEV_INFO(tpg->dev, "480x640 RGB\n");
+		break;
+	case TPG110_RES_480X272:
+		DRM_DEV_INFO(tpg->dev, "480x272 RGB\n");
+		break;
+	case TPG110_RES_640X480:
+		DRM_DEV_INFO(tpg->dev, "640x480 RGB\n");
+		break;
+	case TPG110_RES_800X480:
+		DRM_DEV_INFO(tpg->dev, "800x480 RGB\n");
+		break;
+	default:
+		DRM_DEV_ERROR(tpg->dev, "ILLEGAL RESOLUTION 0x%02x\n", val);
+		break;
+	}
+
+	/* From the producer side, this is the same resolution */
+	if (val == TPG110_RES_480X272_D)
+		val = TPG110_RES_480X272;
+
+	for (i = 0; i < ARRAY_SIZE(tpg110_modes); i++) {
+		const struct tpg110_panel_mode *pm;
+
+		pm = &tpg110_modes[i];
+		if (pm->magic == val) {
+			tpg->panel_mode = pm;
+			break;
+		}
+	}
+	if (i == ARRAY_SIZE(tpg110_modes)) {
+		DRM_DEV_ERROR(tpg->dev, "unsupported mode (%02x) detected\n",
+			val);
+		return -ENODEV;
+	}
+
+	val = tpg110_read_reg(tpg, TPG110_CTRL2);
+	DRM_DEV_INFO(tpg->dev, "resolution and standby are controlled by %s\n",
+		 (val & TPG110_CTRL2_RES_PM_CTRL) ? "software" : "hardware");
+	/* Take control over resolution and standby */
+	val |= TPG110_CTRL2_RES_PM_CTRL;
+	tpg110_write_reg(tpg, TPG110_CTRL2, val);
+
+	return 0;
+}
+
+static int tpg110_disable(struct drm_panel *panel)
+{
+	struct tpg110 *tpg = to_tpg110(panel);
+	u8 val;
+
+	/* Put chip into standby */
+	val = tpg110_read_reg(tpg, TPG110_CTRL2_PM);
+	val &= ~TPG110_CTRL2_PM;
+	tpg110_write_reg(tpg, TPG110_CTRL2_PM, val);
+
+	backlight_disable(tpg->backlight);
+
+	return 0;
+}
+
+static int tpg110_enable(struct drm_panel *panel)
+{
+	struct tpg110 *tpg = to_tpg110(panel);
+	u8 val;
+
+	backlight_enable(tpg->backlight);
+
+	/* Take chip out of standby */
+	val = tpg110_read_reg(tpg, TPG110_CTRL2_PM);
+	val |= TPG110_CTRL2_PM;
+	tpg110_write_reg(tpg, TPG110_CTRL2_PM, val);
+
+	return 0;
+}
+
+/**
+ * tpg110_get_modes() - return the appropriate mode
+ * @panel: the panel to get the mode for
+ *
+ * This currently does not present a forest of modes; instead it
+ * presents the single mode that is configured for the system in
+ * use, as detected by reading the registers of the display.
+ */
+static int tpg110_get_modes(struct drm_panel *panel)
+{
+	struct drm_connector *connector = panel->connector;
+	struct tpg110 *tpg = to_tpg110(panel);
+	struct drm_display_mode *mode;
+
+	strncpy(connector->display_info.name, tpg->panel_mode->name,
+		DRM_DISPLAY_INFO_LEN);
+	connector->display_info.width_mm = tpg->width;
+	connector->display_info.height_mm = tpg->height;
+	connector->display_info.bus_flags = tpg->panel_mode->bus_flags;
+
+	mode = drm_mode_duplicate(panel->drm, &tpg->panel_mode->mode);
+	if (!mode)
+		return 0;
+	drm_mode_set_name(mode);
+	mode->type = DRM_MODE_TYPE_DRIVER | DRM_MODE_TYPE_PREFERRED;
+
+	mode->width_mm = tpg->width;
+	mode->height_mm = tpg->height;
+
+	drm_mode_probed_add(connector, mode);
+
+	return 1;
+}
+
+static const struct drm_panel_funcs tpg110_drm_funcs = {
+	.disable = tpg110_disable,
+	.enable = tpg110_enable,
+	.get_modes = tpg110_get_modes,
+};
+
+static int tpg110_probe(struct spi_device *spi)
+{
+	struct device *dev = &spi->dev;
+	struct device_node *np = dev->of_node;
+	struct tpg110 *tpg;
+	int ret;
+
+	tpg = devm_kzalloc(dev, sizeof(*tpg), GFP_KERNEL);
+	if (!tpg)
+		return -ENOMEM;
+	tpg->dev = dev;
+
+	/* We get the physical display dimensions from the DT */
+	ret = of_property_read_u32(np, "width-mm", &tpg->width);
+	if (ret)
+		DRM_DEV_ERROR(dev, "no panel width specified\n");
+	ret = of_property_read_u32(np, "height-mm", &tpg->height);
+	if (ret)
+		DRM_DEV_ERROR(dev, "no panel height specified\n");
+
+	/* Look for an optional backlight */
+	tpg->backlight = devm_of_find_backlight(dev);
+	if (IS_ERR(tpg->backlight))
+		return PTR_ERR(tpg->backlight);
+
+	/* This asserts the GRESTB signal, putting the display into reset */
+	tpg->grestb = devm_gpiod_get(dev, "grestb", GPIOD_OUT_HIGH);
+	if (IS_ERR(tpg->grestb)) {
+		DRM_DEV_ERROR(dev, "no GRESTB GPIO\n");
+		return -ENODEV;
+	}
+
+	spi->bits_per_word = 8;
+	spi->mode |= SPI_3WIRE_HIZ;
+	ret = spi_setup(spi);
+	if (ret < 0) {
+		DRM_DEV_ERROR(dev, "spi setup failed.\n");
+		return ret;
+	}
+	tpg->spi = spi;
+
+	ret = tpg110_startup(tpg);
+	if (ret)
+		return ret;
+
+	drm_panel_init(&tpg->panel);
+	tpg->panel.dev = dev;
+	tpg->panel.funcs = &tpg110_drm_funcs;
+	spi_set_drvdata(spi, tpg);
+
+	return drm_panel_add(&tpg->panel);
+}
+
+static int tpg110_remove(struct spi_device *spi)
+{
+	struct tpg110 *tpg = spi_get_drvdata(spi);
+
+	drm_panel_remove(&tpg->panel);
+	return 0;
+}
+
+static const struct of_device_id tpg110_match[] = {
+	{ .compatible = "tpo,tpg110", },
+	{},
+};
+MODULE_DEVICE_TABLE(of, tpg110_match);
+
+static struct spi_driver tpg110_driver = {
+	.probe		= tpg110_probe,
+	.remove		= tpg110_remove,
+	.driver		= {
+		.name	= "tpo-tpg110-panel",
+		.of_match_table = tpg110_match,
+	},
+};
+module_spi_driver(tpg110_driver);
+
+MODULE_AUTHOR("Linus Walleij <linus.walleij@linaro.org>");
+MODULE_DESCRIPTION("TPO TPG110 panel driver");
+MODULE_LICENSE("GPL v2");
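On the consumer side, a display controller picks this panel up through the generic panel registry; roughly like this (a sketch, assuming a "panel" phandle in the controller's DT node):

#include <drm/drm_panel.h>
#include <linux/err.h>
#include <linux/of.h>

/* Resolve the remote panel node and look it up in the registry;
 * of_drm_find_panel() returns an ERR_PTR (e.g. -EPROBE_DEFER) until
 * the panel driver has probed. */
static struct drm_panel *example_find_panel(struct device *dev)
{
	struct device_node *np;
	struct drm_panel *panel;

	np = of_parse_phandle(dev->of_node, "panel", 0);
	if (!np)
		return ERR_PTR(-ENODEV);

	panel = of_drm_find_panel(np);
	of_node_put(np);
	return panel;
}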
diff --git a/drivers/gpu/drm/pl111/pl111_drv.c b/drivers/gpu/drm/pl111/pl111_drv.c
index 33e0483d62ae..a8958c201a88 100644
--- a/drivers/gpu/drm/pl111/pl111_drv.c
+++ b/drivers/gpu/drm/pl111/pl111_drv.c
@@ -64,14 +64,14 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_bridge.h>
+#include <drm/drm_fb_cma_helper.h>
+#include <drm/drm_fb_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
-#include <drm/drm_fb_helper.h>
-#include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_of.h>
-#include <drm/drm_bridge.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 
 #include "pl111_drm.h"
 #include "pl111_versatile.h"
diff --git a/drivers/gpu/drm/qxl/Makefile b/drivers/gpu/drm/qxl/Makefile
index 33a7d0c434b7..fc59d42b31af 100644
--- a/drivers/gpu/drm/qxl/Makefile
+++ b/drivers/gpu/drm/qxl/Makefile
@@ -2,6 +2,6 @@
 # Makefile for the drm device driver.  This driver provides support for the
 # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
 
-qxl-y := qxl_drv.o qxl_kms.o qxl_display.o qxl_ttm.o qxl_fb.o qxl_object.o qxl_gem.o qxl_cmd.o qxl_image.o qxl_draw.o qxl_debugfs.o qxl_irq.o qxl_dumb.o qxl_ioctl.o qxl_release.o qxl_prime.o
+qxl-y := qxl_drv.o qxl_kms.o qxl_display.o qxl_ttm.o qxl_object.o qxl_gem.o qxl_cmd.o qxl_image.o qxl_draw.o qxl_debugfs.o qxl_irq.o qxl_dumb.o qxl_ioctl.o qxl_release.o qxl_prime.o
 
 obj-$(CONFIG_DRM_QXL)+= qxl.o
diff --git a/drivers/gpu/drm/qxl/qxl_cmd.c b/drivers/gpu/drm/qxl/qxl_cmd.c
index dffc5093ff16..0a2e51af1230 100644
--- a/drivers/gpu/drm/qxl/qxl_cmd.c
+++ b/drivers/gpu/drm/qxl/qxl_cmd.c
@@ -25,6 +25,8 @@
 
 /* QXL cmd/ring handling */
 
+#include <drm/drm_util.h>
+
 #include "qxl_drv.h"
 #include "qxl_object.h"
 
@@ -372,25 +374,25 @@ void qxl_io_flush_surfaces(struct qxl_device *qdev)
 void qxl_io_destroy_primary(struct qxl_device *qdev)
 {
 	wait_for_io_cmd(qdev, 0, QXL_IO_DESTROY_PRIMARY_ASYNC);
-	qdev->primary_created = false;
+	qdev->primary_bo->is_primary = false;
+	drm_gem_object_put_unlocked(&qdev->primary_bo->gem_base);
+	qdev->primary_bo = NULL;
 }
 
-void qxl_io_create_primary(struct qxl_device *qdev,
-			   unsigned int offset, struct qxl_bo *bo)
+void qxl_io_create_primary(struct qxl_device *qdev, struct qxl_bo *bo)
 {
 	struct qxl_surface_create *create;
 
+	if (WARN_ON(qdev->primary_bo))
+		return;
+
 	DRM_DEBUG_DRIVER("qdev %p, ram_header %p\n", qdev, qdev->ram_header);
 	create = &qdev->ram_header->create_surface;
 	create->format = bo->surf.format;
 	create->width = bo->surf.width;
 	create->height = bo->surf.height;
 	create->stride = bo->surf.stride;
-	if (bo->shadow) {
-		create->mem = qxl_bo_physical_address(qdev, bo->shadow, offset);
-	} else {
-		create->mem = qxl_bo_physical_address(qdev, bo, offset);
-	}
+	create->mem = qxl_bo_physical_address(qdev, bo, 0);
 
 	DRM_DEBUG_DRIVER("mem = %llx, from %p\n", create->mem, bo->kptr);
 
@@ -398,7 +400,9 @@ void qxl_io_create_primary(struct qxl_device *qdev,
 	create->type = QXL_SURF_TYPE_PRIMARY;
 
 	wait_for_io_cmd(qdev, 0, QXL_IO_CREATE_PRIMARY_ASYNC);
-	qdev->primary_created = true;
+	qdev->primary_bo = bo;
+	qdev->primary_bo->is_primary = true;
+	drm_gem_object_get(&qdev->primary_bo->gem_base);
 }
 
 void qxl_io_memslot_add(struct qxl_device *qdev, uint8_t id)
@@ -458,8 +462,7 @@ void qxl_surface_id_dealloc(struct qxl_device *qdev,
 }
 
 int qxl_hw_surface_alloc(struct qxl_device *qdev,
-			 struct qxl_bo *surf,
-			 struct ttm_mem_reg *new_mem)
+			 struct qxl_bo *surf)
 {
 	struct qxl_surface_cmd *cmd;
 	struct qxl_release *release;
@@ -485,16 +488,7 @@ int qxl_hw_surface_alloc(struct qxl_device *qdev,
 	cmd->u.surface_create.width = surf->surf.width;
 	cmd->u.surface_create.height = surf->surf.height;
 	cmd->u.surface_create.stride = surf->surf.stride;
-	if (new_mem) {
-		int slot_id = surf->type == QXL_GEM_DOMAIN_VRAM ? qdev->main_mem_slot : qdev->surfaces_mem_slot;
-		struct qxl_memslot *slot = &(qdev->mem_slots[slot_id]);
-
-		/* TODO - need to hold one of the locks to read tbo.offset */
-		cmd->u.surface_create.data = slot->high_bits;
-
-		cmd->u.surface_create.data |= (new_mem->start << PAGE_SHIFT) + surf->tbo.bdev->man[new_mem->mem_type].gpu_offset;
-	} else
-		cmd->u.surface_create.data = qxl_bo_physical_address(qdev, surf, 0);
+	cmd->u.surface_create.data = qxl_bo_physical_address(qdev, surf, 0);
 	cmd->surface_id = surf->surface_id;
 	qxl_release_unmap(qdev, release, &cmd->release_info);
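A sketch of the invariant the create/destroy pair above maintains (illustration only, not driver code): the device owns exactly one GEM reference on whichever BO currently backs the primary surface.

/* Switching the primary surface always goes through
 * destroy-then-create, so the reference taken in create is balanced
 * by the put in destroy. */
static void example_switch_primary(struct qxl_device *qdev,
				   struct qxl_bo *bo)
{
	if (qdev->primary_bo)
		qxl_io_destroy_primary(qdev);	/* puts the old BO */
	qxl_io_create_primary(qdev, bo);	/* gets the new BO */
}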
 
diff --git a/drivers/gpu/drm/qxl/qxl_display.c b/drivers/gpu/drm/qxl/qxl_display.c
index ce0b9c40fc21..08c725544a2f 100644
--- a/drivers/gpu/drm/qxl/qxl_display.c
+++ b/drivers/gpu/drm/qxl/qxl_display.c
@@ -24,11 +24,11 @@
  */
 
 #include <linux/crc32.h>
-#include <drm/drm_crtc_helper.h>
-#include <drm/drm_plane_helper.h>
-#include <drm/drm_atomic_helper.h>
 #include <drm/drm_atomic.h>
+#include <drm/drm_atomic_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "qxl_drv.h"
 #include "qxl_object.h"
@@ -48,8 +48,8 @@ static int qxl_alloc_client_monitors_config(struct qxl_device *qdev,
 	}
 	if (!qdev->client_monitors_config) {
 		qdev->client_monitors_config = kzalloc(
-				sizeof(struct qxl_monitors_config) +
-				sizeof(struct qxl_head) * count, GFP_KERNEL);
+				struct_size(qdev->client_monitors_config,
+				heads, count), GFP_KERNEL);
 		if (!qdev->client_monitors_config)
 			return -ENOMEM;
 	}
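The kzalloc above switches to struct_size(), which folds the multiply-and-add for a trailing flexible array into one overflow-checked expression. The generic pattern, with a hypothetical struct for illustration:

#include <linux/overflow.h>
#include <linux/slab.h>

struct example_config {
	int count;
	struct { int w, h; } heads[];	/* flexible array member */
};

/* kzalloc(struct_size(...)) saturates on overflow instead of
 * silently allocating a too-small buffer. */
static struct example_config *example_alloc_config(int count)
{
	struct example_config *cfg;

	cfg = kzalloc(struct_size(cfg, heads, count), GFP_KERNEL);
	if (cfg)
		cfg->count = count;
	return cfg;
}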
@@ -80,10 +80,10 @@ static int qxl_display_copy_rom_client_monitors_config(struct qxl_device *qdev)
 		DRM_DEBUG_KMS("no client monitors configured\n");
 		return status;
 	}
-	if (num_monitors > qdev->monitors_config->max_allowed) {
+	if (num_monitors > qxl_num_crtc) {
 		DRM_DEBUG_KMS("client monitors list will be truncated: %d < %d\n",
-			      qdev->monitors_config->max_allowed, num_monitors);
-		num_monitors = qdev->monitors_config->max_allowed;
+			      qxl_num_crtc, num_monitors);
+		num_monitors = qxl_num_crtc;
 	} else {
 		num_monitors = qdev->rom->client_monitors_config.count;
 	}
@@ -96,8 +96,7 @@ static int qxl_display_copy_rom_client_monitors_config(struct qxl_device *qdev)
 		return status;
 	}
 	/* we copy max from the client but it isn't used */
-	qdev->client_monitors_config->max_allowed =
-				qdev->monitors_config->max_allowed;
+	qdev->client_monitors_config->max_allowed = qxl_num_crtc;
 	for (i = 0 ; i < qdev->client_monitors_config->count ; ++i) {
 		struct qxl_urect *c_rect =
 			&qdev->rom->client_monitors_config.heads[i];
@@ -191,20 +190,63 @@ void qxl_display_read_client_monitors_config(struct qxl_device *qdev)
 	}
 }
 
-static int qxl_add_monitors_config_modes(struct drm_connector *connector,
-                                         unsigned *pwidth,
-                                         unsigned *pheight)
+static int qxl_check_mode(struct qxl_device *qdev,
+			  unsigned int width,
+			  unsigned int height)
+{
+	unsigned int stride;
+	unsigned int size;
+
+	if (check_mul_overflow(width, 4u, &stride))
+		return -EINVAL;
+	if (check_mul_overflow(stride, height, &size))
+		return -EINVAL;
+	if (size > qdev->vram_size)
+		return -ENOMEM;
+	return 0;
+}
+
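The overflow-checked size computation above generalizes; a minimal standalone version of the same guard (assuming 32bpp, i.e. 4 bytes per pixel):

#include <linux/overflow.h>
#include <linux/types.h>

/* Reject any mode whose framebuffer size computation would wrap
 * before it can be compared against the available memory. */
static bool example_mode_fits(unsigned int width, unsigned int height,
			      unsigned int limit)
{
	unsigned int stride, size;

	if (check_mul_overflow(width, 4u, &stride))
		return false;
	if (check_mul_overflow(stride, height, &size))
		return false;
	return size <= limit;
}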
+static int qxl_check_framebuffer(struct qxl_device *qdev,
+				 struct qxl_bo *bo)
+{
+	return qxl_check_mode(qdev, bo->surf.width, bo->surf.height);
+}
+
+static int qxl_add_mode(struct drm_connector *connector,
+			unsigned int width,
+			unsigned int height,
+			bool preferred)
+{
+	struct drm_device *dev = connector->dev;
+	struct qxl_device *qdev = dev->dev_private;
+	struct drm_display_mode *mode = NULL;
+	int rc;
+
+	rc = qxl_check_mode(qdev, width, height);
+	if (rc != 0)
+		return 0;
+
+	mode = drm_cvt_mode(dev, width, height, 60, false, false, false);
+	if (!mode)
+		return 0;
+	if (preferred)
+		mode->type |= DRM_MODE_TYPE_PREFERRED;
+	mode->hdisplay = width;
+	mode->vdisplay = height;
+	drm_mode_set_name(mode);
+	drm_mode_probed_add(connector, mode);
+	return 1;
+}
+
+static int qxl_add_monitors_config_modes(struct drm_connector *connector)
 {
 	struct drm_device *dev = connector->dev;
 	struct qxl_device *qdev = dev->dev_private;
 	struct qxl_output *output = drm_connector_to_qxl_output(connector);
 	int h = output->index;
-	struct drm_display_mode *mode = NULL;
 	struct qxl_head *head;
 
 	if (!qdev->monitors_config)
 		return 0;
-	if (h >= qdev->monitors_config->max_allowed)
+	if (h >= qxl_num_crtc)
 		return 0;
 	if (!qdev->client_monitors_config)
 		return 0;
@@ -214,60 +256,28 @@ static int qxl_add_monitors_config_modes(struct drm_connector *connector,
 	head = &qdev->client_monitors_config->heads[h];
 	DRM_DEBUG_KMS("head %d is %dx%d\n", h, head->width, head->height);
 
-	mode = drm_cvt_mode(dev, head->width, head->height, 60, false, false,
-			    false);
-	mode->type |= DRM_MODE_TYPE_PREFERRED;
-	mode->hdisplay = head->width;
-	mode->vdisplay = head->height;
-	drm_mode_set_name(mode);
-	*pwidth = head->width;
-	*pheight = head->height;
-	drm_mode_probed_add(connector, mode);
-	/* remember the last custom size for mode validation */
-	qdev->monitors_config_width = mode->hdisplay;
-	qdev->monitors_config_height = mode->vdisplay;
-	return 1;
+	return qxl_add_mode(connector, head->width, head->height, true);
 }
 
 static struct mode_size {
 	int w;
 	int h;
-} common_modes[] = {
-	{ 640,  480},
+} extra_modes[] = {
 	{ 720,  480},
-	{ 800,  600},
-	{ 848,  480},
-	{1024,  768},
 	{1152,  768},
-	{1280,  720},
-	{1280,  800},
 	{1280,  854},
-	{1280,  960},
-	{1280, 1024},
-	{1440,  900},
-	{1400, 1050},
-	{1680, 1050},
-	{1600, 1200},
-	{1920, 1080},
-	{1920, 1200}
 };
 
-static int qxl_add_common_modes(struct drm_connector *connector,
-                                unsigned int pwidth,
-                                unsigned int pheight)
+static int qxl_add_extra_modes(struct drm_connector *connector)
 {
-	struct drm_device *dev = connector->dev;
-	struct drm_display_mode *mode = NULL;
-	int i;
+	int i, ret = 0;
 
-	for (i = 0; i < ARRAY_SIZE(common_modes); i++) {
-		mode = drm_cvt_mode(dev, common_modes[i].w, common_modes[i].h,
-				    60, false, false, false);
-		if (common_modes[i].w == pwidth && common_modes[i].h == pheight)
-			mode->type |= DRM_MODE_TYPE_PREFERRED;
-		drm_mode_probed_add(connector, mode);
-	}
-	return i - 1;
+	for (i = 0; i < ARRAY_SIZE(extra_modes); i++)
+		ret += qxl_add_mode(connector,
+				    extra_modes[i].w,
+				    extra_modes[i].h,
+				    false);
+	return ret;
 }
 
 static void qxl_send_monitors_config(struct qxl_device *qdev)
@@ -302,13 +312,12 @@ static void qxl_crtc_update_monitors_config(struct drm_crtc *crtc,
 	struct qxl_head head;
 	int oldcount, i = qcrtc->index;
 
-	if (!qdev->primary_created) {
+	if (!qdev->primary_bo) {
 		DRM_DEBUG_KMS("no primary surface, skip (%s)\n", reason);
 		return;
 	}
 
-	if (!qdev->monitors_config ||
-	    qdev->monitors_config->max_allowed <= i)
+	if (!qdev->monitors_config || qxl_num_crtc <= i)
 		return;
 
 	head.id = i;
@@ -323,6 +332,8 @@ static void qxl_crtc_update_monitors_config(struct drm_crtc *crtc,
 		head.y = crtc->y;
 		if (qdev->monitors_config->count < i + 1)
 			qdev->monitors_config->count = i + 1;
+		if (qdev->primary_bo == qdev->dumb_shadow_bo)
+			head.x += qdev->dumb_heads[i].x;
 	} else if (i > 0) {
 		head.width = 0;
 		head.height = 0;
@@ -348,9 +359,10 @@ static void qxl_crtc_update_monitors_config(struct drm_crtc *crtc,
 	if (oldcount != qdev->monitors_config->count)
 		DRM_DEBUG_KMS("active heads %d -> %d (%d total)\n",
 			      oldcount, qdev->monitors_config->count,
-			      qdev->monitors_config->max_allowed);
+			      qxl_num_crtc);
 
 	qdev->monitors_config->heads[i] = head;
+	qdev->monitors_config->max_allowed = qxl_num_crtc;
 	qxl_send_monitors_config(qdev);
 }
 
@@ -401,13 +413,15 @@ static int qxl_framebuffer_surface_dirty(struct drm_framebuffer *fb,
 	struct qxl_device *qdev = fb->dev->dev_private;
 	struct drm_clip_rect norect;
 	struct qxl_bo *qobj;
+	bool is_primary;
 	int inc = 1;
 
 	drm_modeset_lock_all(fb->dev);
 
 	qobj = gem_to_qxl_bo(fb->obj[0]);
 	/* if we aren't primary surface ignore this */
-	if (!qobj->is_primary) {
+	is_primary = qobj->shadow ? qobj->shadow->is_primary : qobj->is_primary;
+	if (!is_primary) {
 		drm_modeset_unlock_all(fb->dev);
 		return 0;
 	}
@@ -424,7 +438,7 @@ static int qxl_framebuffer_surface_dirty(struct drm_framebuffer *fb,
 	}
 
 	qxl_draw_dirty_fb(qdev, fb, qobj, flags, color,
-			  clips, num_clips, inc);
+			  clips, num_clips, inc, 0);
 
 	drm_modeset_unlock_all(fb->dev);
 
@@ -466,12 +480,7 @@ static int qxl_primary_atomic_check(struct drm_plane *plane,
 
 	bo = gem_to_qxl_bo(state->fb->obj[0]);
 
-	if (bo->surf.stride * bo->surf.height > qdev->vram_size) {
-		DRM_ERROR("Mode doesn't fit in vram size (vgamem)");
-		return -EINVAL;
-	}
-
-	return 0;
+	return qxl_check_framebuffer(qdev, bo);
 }
 
 static int qxl_primary_apply_cursor(struct drm_plane *plane)
@@ -526,15 +535,14 @@ static void qxl_primary_atomic_update(struct drm_plane *plane,
 {
 	struct qxl_device *qdev = plane->dev->dev_private;
 	struct qxl_bo *bo = gem_to_qxl_bo(plane->state->fb->obj[0]);
-	struct qxl_bo *bo_old;
+	struct qxl_bo *bo_old, *primary;
 	struct drm_clip_rect norect = {
 	    .x1 = 0,
 	    .y1 = 0,
 	    .x2 = plane->state->fb->width,
 	    .y2 = plane->state->fb->height
 	};
-	int ret;
-	bool same_shadow = false;
+	uint32_t dumb_shadow_offset = 0;
 
 	if (old_state->fb) {
 		bo_old = gem_to_qxl_bo(old_state->fb->obj[0]);
@@ -542,32 +550,21 @@ static void qxl_primary_atomic_update(struct drm_plane *plane,
 		bo_old = NULL;
 	}
 
-	if (bo == bo_old)
-		return;
+	primary = bo->shadow ? bo->shadow : bo;
 
-	if (bo_old && bo_old->shadow && bo->shadow &&
-	    bo_old->shadow == bo->shadow) {
-		same_shadow = true;
-	}
-
-	if (bo_old && bo_old->is_primary) {
-		if (!same_shadow)
+	if (!primary->is_primary) {
+		if (qdev->primary_bo)
 			qxl_io_destroy_primary(qdev);
-		bo_old->is_primary = false;
-
-		ret = qxl_primary_apply_cursor(plane);
-		if (ret)
-			DRM_ERROR(
-			"could not set cursor after creating primary");
+		qxl_io_create_primary(qdev, primary);
+		qxl_primary_apply_cursor(plane);
 	}
 
-	if (!bo->is_primary) {
-		if (!same_shadow)
-			qxl_io_create_primary(qdev, 0, bo);
-		bo->is_primary = true;
-	}
+	if (bo->is_dumb)
+		dumb_shadow_offset =
+			qdev->dumb_heads[plane->state->crtc->index].x;
 
-	qxl_draw_dirty_fb(qdev, plane->state->fb, bo, 0, 0, &norect, 1, 1);
+	qxl_draw_dirty_fb(qdev, plane->state->fb, bo, 0, 0, &norect, 1, 1,
+			  dumb_shadow_offset);
 }
 
 static void qxl_primary_atomic_disable(struct drm_plane *plane,
@@ -723,12 +720,68 @@ static void qxl_cursor_atomic_disable(struct drm_plane *plane,
 	qxl_release_fence_buffer_objects(release);
 }
 
+static void qxl_update_dumb_head(struct qxl_device *qdev,
+				 int index, struct qxl_bo *bo)
+{
+	uint32_t width, height;
+
+	if (index >= qdev->monitors_config->max_allowed)
+		return;
+
+	if (bo && bo->is_dumb) {
+		width = bo->surf.width;
+		height = bo->surf.height;
+	} else {
+		width = 0;
+		height = 0;
+	}
+
+	if (qdev->dumb_heads[index].width == width &&
+	    qdev->dumb_heads[index].height == height)
+		return;
+
+	DRM_DEBUG("#%d: %dx%d -> %dx%d\n", index,
+		  qdev->dumb_heads[index].width,
+		  qdev->dumb_heads[index].height,
+		  width, height);
+	qdev->dumb_heads[index].width = width;
+	qdev->dumb_heads[index].height = height;
+}
+
+static void qxl_calc_dumb_shadow(struct qxl_device *qdev,
+				 struct qxl_surface *surf)
+{
+	struct qxl_head *head;
+	int i;
+
+	memset(surf, 0, sizeof(*surf));
+	for (i = 0; i < qdev->monitors_config->max_allowed; i++) {
+		head = qdev->dumb_heads + i;
+		head->x = surf->width;
+		surf->width += head->width;
+		if (surf->height < head->height)
+			surf->height = head->height;
+	}
+	if (surf->width < 64)
+		surf->width = 64;
+	if (surf->height < 64)
+		surf->height = 64;
+	surf->format = SPICE_SURFACE_FMT_32_xRGB;
+	surf->stride = surf->width * 4;
+
+	if (!qdev->dumb_shadow_bo ||
+	    qdev->dumb_shadow_bo->surf.width != surf->width ||
+	    qdev->dumb_shadow_bo->surf.height != surf->height)
+		DRM_DEBUG("%dx%d\n", surf->width, surf->height);
+}
+
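The shadow surface is one wide buffer with all heads packed side by side; head[i].x records where head i starts. The packing in isolation (hypothetical types, illustration only):

struct example_head {
	unsigned int x, width, height;
};

/* Lay n heads out horizontally: each head's x offset is the running
 * total width, and the surface is as tall as the tallest head. */
static void example_pack_heads(struct example_head *heads, int n,
			       unsigned int *surf_w, unsigned int *surf_h)
{
	int i;

	*surf_w = 0;
	*surf_h = 0;
	for (i = 0; i < n; i++) {
		heads[i].x = *surf_w;
		*surf_w += heads[i].width;
		if (*surf_h < heads[i].height)
			*surf_h = heads[i].height;
	}
}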
 static int qxl_plane_prepare_fb(struct drm_plane *plane,
 				struct drm_plane_state *new_state)
 {
 	struct qxl_device *qdev = plane->dev->dev_private;
 	struct drm_gem_object *obj;
-	struct qxl_bo *user_bo, *old_bo = NULL;
+	struct qxl_bo *user_bo;
+	struct qxl_surface surf;
 	int ret;
 
 	if (!new_state->fb)
@@ -738,28 +791,30 @@ static int qxl_plane_prepare_fb(struct drm_plane *plane,
 	user_bo = gem_to_qxl_bo(obj);
 
 	if (plane->type == DRM_PLANE_TYPE_PRIMARY &&
-	    user_bo->is_dumb && !user_bo->shadow) {
-		if (plane->state->fb) {
-			obj = plane->state->fb->obj[0];
-			old_bo = gem_to_qxl_bo(obj);
+	    user_bo->is_dumb) {
+		qxl_update_dumb_head(qdev, new_state->crtc->index,
+				     user_bo);
+		qxl_calc_dumb_shadow(qdev, &surf);
+		if (!qdev->dumb_shadow_bo ||
+		    qdev->dumb_shadow_bo->surf.width  != surf.width ||
+		    qdev->dumb_shadow_bo->surf.height != surf.height) {
+			if (qdev->dumb_shadow_bo) {
+				drm_gem_object_put_unlocked
+					(&qdev->dumb_shadow_bo->gem_base);
+				qdev->dumb_shadow_bo = NULL;
+			}
+			qxl_bo_create(qdev, surf.height * surf.stride,
+				      true, true, QXL_GEM_DOMAIN_SURFACE, &surf,
+				      &qdev->dumb_shadow_bo);
 		}
-		if (old_bo && old_bo->shadow &&
-		    user_bo->gem_base.size == old_bo->gem_base.size &&
-		    plane->state->crtc     == new_state->crtc &&
-		    plane->state->crtc_w   == new_state->crtc_w &&
-		    plane->state->crtc_h   == new_state->crtc_h &&
-		    plane->state->src_x    == new_state->src_x &&
-		    plane->state->src_y    == new_state->src_y &&
-		    plane->state->src_w    == new_state->src_w &&
-		    plane->state->src_h    == new_state->src_h &&
-		    plane->state->rotation == new_state->rotation &&
-		    plane->state->zpos     == new_state->zpos) {
-			drm_gem_object_get(&old_bo->shadow->gem_base);
-			user_bo->shadow = old_bo->shadow;
-		} else {
-			qxl_bo_create(qdev, user_bo->gem_base.size,
-				      true, true, QXL_GEM_DOMAIN_VRAM, NULL,
-				      &user_bo->shadow);
+		if (user_bo->shadow != qdev->dumb_shadow_bo) {
+			if (user_bo->shadow) {
+				drm_gem_object_put_unlocked
+					(&user_bo->shadow->gem_base);
+				user_bo->shadow = NULL;
+			}
+			drm_gem_object_get(&qdev->dumb_shadow_bo->gem_base);
+			user_bo->shadow = qdev->dumb_shadow_bo;
 		}
 	}
 
@@ -788,7 +843,7 @@ static void qxl_plane_cleanup_fb(struct drm_plane *plane,
 	user_bo = gem_to_qxl_bo(obj);
 	qxl_bo_unpin(user_bo);
 
-	if (user_bo->shadow && !user_bo->is_primary) {
+	if (old_state->fb != plane->state->fb && user_bo->shadow) {
 		drm_gem_object_put_unlocked(&user_bo->shadow->gem_base);
 		user_bo->shadow = NULL;
 	}
@@ -925,14 +980,26 @@ free_mem:
 
 static int qxl_conn_get_modes(struct drm_connector *connector)
 {
+	struct drm_device *dev = connector->dev;
+	struct qxl_device *qdev = dev->dev_private;
+	struct qxl_output *output = drm_connector_to_qxl_output(connector);
 	unsigned int pwidth = 1024;
 	unsigned int pheight = 768;
 	int ret = 0;
 
-	ret = qxl_add_monitors_config_modes(connector, &pwidth, &pheight);
-	if (ret < 0)
-		return ret;
-	ret += qxl_add_common_modes(connector, pwidth, pheight);
+	if (qdev->client_monitors_config) {
+		struct qxl_head *head;
+		head = &qdev->client_monitors_config->heads[output->index];
+		if (head->width)
+			pwidth = head->width;
+		if (head->height)
+			pheight = head->height;
+	}
+
+	ret += drm_add_modes_noedid(connector, 8192, 8192);
+	ret += qxl_add_extra_modes(connector);
+	ret += qxl_add_monitors_config_modes(connector);
+	drm_set_preferred_mode(connector, pwidth, pheight);
 	return ret;
 }
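For virtual outputs without an EDID this is the whole mode-list recipe; stripped to its core (a sketch, assuming the preferred size is already known):

#include <drm/drm_connector.h>
#include <drm/drm_edid.h>

/* Populate the standard mode list up to a maximum size, then mark
 * the client-requested size as the preferred mode. */
static int example_get_modes(struct drm_connector *connector,
			     unsigned int pwidth, unsigned int pheight)
{
	int ret;

	ret = drm_add_modes_noedid(connector, 8192, 8192);
	drm_set_preferred_mode(connector, pwidth, pheight);
	return ret;
}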
 
@@ -941,20 +1008,11 @@ static enum drm_mode_status qxl_conn_mode_valid(struct drm_connector *connector,
 {
 	struct drm_device *ddev = connector->dev;
 	struct qxl_device *qdev = ddev->dev_private;
-	int i;
-
-	/* TODO: is this called for user defined modes? (xrandr --add-mode)
-	 * TODO: check that the mode fits in the framebuffer */
 
-	if (qdev->monitors_config_width == mode->hdisplay &&
-	    qdev->monitors_config_height == mode->vdisplay)
-		return MODE_OK;
+	if (qxl_check_mode(qdev, mode->hdisplay, mode->vdisplay) != 0)
+		return MODE_BAD;
 
-	for (i = 0; i < ARRAY_SIZE(common_modes); i++) {
-		if (common_modes[i].w == mode->hdisplay && common_modes[i].h == mode->vdisplay)
-			return MODE_OK;
-	}
-	return MODE_BAD;
+	return MODE_OK;
 }
 
 static struct drm_encoder *qxl_best_encoder(struct drm_connector *connector)
@@ -1010,7 +1068,6 @@ static void qxl_conn_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs qxl_connector_funcs = {
-	.dpms = drm_helper_connector_dpms,
 	.detect = qxl_conn_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = qxl_conn_destroy,
@@ -1097,9 +1154,8 @@ int qxl_create_monitors_object(struct qxl_device *qdev)
 {
 	int ret;
 	struct drm_gem_object *gobj;
-	int max_allowed = qxl_num_crtc;
 	int monitors_config_size = sizeof(struct qxl_monitors_config) +
-		max_allowed * sizeof(struct qxl_head);
+		qxl_num_crtc * sizeof(struct qxl_head);
 
 	ret = qxl_gem_object_create(qdev, monitors_config_size, 0,
 				    QXL_GEM_DOMAIN_VRAM,
@@ -1121,7 +1177,12 @@ int qxl_create_monitors_object(struct qxl_device *qdev)
 		qxl_bo_physical_address(qdev, qdev->monitors_config_bo, 0);
 
 	memset(qdev->monitors_config, 0, monitors_config_size);
-	qdev->monitors_config->max_allowed = max_allowed;
+	qdev->dumb_heads = kcalloc(qxl_num_crtc, sizeof(qdev->dumb_heads[0]),
+				   GFP_KERNEL);
+	if (!qdev->dumb_heads) {
+		qxl_destroy_monitors_object(qdev);
+		return -ENOMEM;
+	}
 	return 0;
 }
 
@@ -1173,18 +1234,11 @@ int qxl_modeset_init(struct qxl_device *qdev)
 	qxl_display_read_client_monitors_config(qdev);
 
 	drm_mode_config_reset(&qdev->ddev);
-
-	/* primary surface must be created by this point, to allow
-	 * issuing command queue commands and having them read by
-	 * spice server. */
-	qxl_fbdev_init(qdev);
 	return 0;
 }
 
 void qxl_modeset_fini(struct qxl_device *qdev)
 {
-	qxl_fbdev_fini(qdev);
-
 	qxl_destroy_monitors_object(qdev);
 	drm_mode_config_cleanup(&qdev->ddev);
 }
diff --git a/drivers/gpu/drm/qxl/qxl_draw.c b/drivers/gpu/drm/qxl/qxl_draw.c
index c408bb83c7a9..97c3f1a95a32 100644
--- a/drivers/gpu/drm/qxl/qxl_draw.c
+++ b/drivers/gpu/drm/qxl/qxl_draw.c
@@ -109,152 +109,6 @@ make_drawable(struct qxl_device *qdev, int surface, uint8_t type,
 	return 0;
 }
 
-static int alloc_palette_object(struct qxl_device *qdev,
-				struct qxl_release *release,
-				struct qxl_bo **palette_bo)
-{
-	return qxl_alloc_bo_reserved(qdev, release,
-				     sizeof(struct qxl_palette) + sizeof(uint32_t) * 2,
-				     palette_bo);
-}
-
-static int qxl_palette_create_1bit(struct qxl_bo *palette_bo,
-				   struct qxl_release *release,
-				   const struct qxl_fb_image *qxl_fb_image)
-{
-	const struct fb_image *fb_image = &qxl_fb_image->fb_image;
-	uint32_t visual = qxl_fb_image->visual;
-	const uint32_t *pseudo_palette = qxl_fb_image->pseudo_palette;
-	struct qxl_palette *pal;
-	int ret;
-	uint32_t fgcolor, bgcolor;
-	static uint64_t unique; /* we make no attempt to actually set this
-				 * correctly globaly, since that would require
-				 * tracking all of our palettes. */
-	ret = qxl_bo_kmap(palette_bo, (void **)&pal);
-	if (ret)
-		return ret;
-	pal->num_ents = 2;
-	pal->unique = unique++;
-	if (visual == FB_VISUAL_TRUECOLOR || visual == FB_VISUAL_DIRECTCOLOR) {
-		/* NB: this is the only used branch currently. */
-		fgcolor = pseudo_palette[fb_image->fg_color];
-		bgcolor = pseudo_palette[fb_image->bg_color];
-	} else {
-		fgcolor = fb_image->fg_color;
-		bgcolor = fb_image->bg_color;
-	}
-	pal->ents[0] = bgcolor;
-	pal->ents[1] = fgcolor;
-	qxl_bo_kunmap(palette_bo);
-	return 0;
-}
-
-void qxl_draw_opaque_fb(const struct qxl_fb_image *qxl_fb_image,
-			int stride /* filled in if 0 */)
-{
-	struct qxl_device *qdev = qxl_fb_image->qdev;
-	struct qxl_drawable *drawable;
-	struct qxl_rect rect;
-	const struct fb_image *fb_image = &qxl_fb_image->fb_image;
-	int x = fb_image->dx;
-	int y = fb_image->dy;
-	int width = fb_image->width;
-	int height = fb_image->height;
-	const char *src = fb_image->data;
-	int depth = fb_image->depth;
-	struct qxl_release *release;
-	struct qxl_image *image;
-	int ret;
-	struct qxl_drm_image *dimage;
-	struct qxl_bo *palette_bo = NULL;
-
-	if (stride == 0)
-		stride = depth * width / 8;
-
-	ret = alloc_drawable(qdev, &release);
-	if (ret)
-		return;
-
-	ret = qxl_image_alloc_objects(qdev, release,
-				      &dimage,
-				      height, stride);
-	if (ret)
-		goto out_free_drawable;
-
-	if (depth == 1) {
-		ret = alloc_palette_object(qdev, release, &palette_bo);
-		if (ret)
-			goto out_free_image;
-	}
-
-	/* do a reservation run over all the objects we just allocated */
-	ret = qxl_release_reserve_list(release, true);
-	if (ret)
-		goto out_free_palette;
-
-	rect.left = x;
-	rect.right = x + width;
-	rect.top = y;
-	rect.bottom = y + height;
-
-	ret = make_drawable(qdev, 0, QXL_DRAW_COPY, &rect, release);
-	if (ret) {
-		qxl_release_backoff_reserve_list(release);
-		goto out_free_palette;
-	}
-
-	ret = qxl_image_init(qdev, release, dimage,
-			     (const uint8_t *)src, 0, 0,
-			     width, height, depth, stride);
-	if (ret) {
-		qxl_release_backoff_reserve_list(release);
-		qxl_release_free(qdev, release);
-		return;
-	}
-
-	if (depth == 1) {
-		void *ptr;
-
-		ret = qxl_palette_create_1bit(palette_bo, release, qxl_fb_image);
-
-		ptr = qxl_bo_kmap_atomic_page(qdev, dimage->bo, 0);
-		image = ptr;
-		image->u.bitmap.palette =
-			qxl_bo_physical_address(qdev, palette_bo, 0);
-		qxl_bo_kunmap_atomic_page(qdev, dimage->bo, ptr);
-	}
-
-	drawable = (struct qxl_drawable *)qxl_release_map(qdev, release);
-
-	drawable->u.copy.src_area.top = 0;
-	drawable->u.copy.src_area.bottom = height;
-	drawable->u.copy.src_area.left = 0;
-	drawable->u.copy.src_area.right = width;
-
-	drawable->u.copy.rop_descriptor = SPICE_ROPD_OP_PUT;
-	drawable->u.copy.scale_mode = 0;
-	drawable->u.copy.mask.flags = 0;
-	drawable->u.copy.mask.pos.x = 0;
-	drawable->u.copy.mask.pos.y = 0;
-	drawable->u.copy.mask.bitmap = 0;
-
-	drawable->u.copy.src_bitmap =
-		qxl_bo_physical_address(qdev, dimage->bo, 0);
-	qxl_release_unmap(qdev, release, &drawable->release_info);
-
-	qxl_push_command_ring_release(qdev, release, QXL_CMD_DRAW, false);
-	qxl_release_fence_buffer_objects(release);
-
-out_free_palette:
-	qxl_bo_unref(&palette_bo);
-out_free_image:
-	qxl_image_free_objects(qdev, dimage);
-out_free_drawable:
-	if (ret)
-		free_drawable(qdev, release);
-}
-
 /* push a draw command using the given clipping rectangles as
  * the sources from the shadow framebuffer.
  *
@@ -267,7 +121,8 @@ void qxl_draw_dirty_fb(struct qxl_device *qdev,
 		       struct qxl_bo *bo,
 		       unsigned int flags, unsigned int color,
 		       struct drm_clip_rect *clips,
-		       unsigned int num_clips, int inc)
+		       unsigned int num_clips, int inc,
+		       uint32_t dumb_shadow_offset)
 {
 	/*
 	 * TODO: if flags & DRM_MODE_FB_DIRTY_ANNOTATE_FILL then we should
@@ -295,6 +150,9 @@ void qxl_draw_dirty_fb(struct qxl_device *qdev,
 	if (ret)
 		return;
 
+	clips->x1 += dumb_shadow_offset;
+	clips->x2 += dumb_shadow_offset;
+
 	left = clips->x1;
 	right = clips->x2;
 	top = clips->y1;
@@ -342,7 +200,8 @@ void qxl_draw_dirty_fb(struct qxl_device *qdev,
 		goto out_release_backoff;
 
 	ret = qxl_image_init(qdev, release, dimage, surface_base,
-			     left, top, width, height, depth, stride);
+			     left - dumb_shadow_offset,
+			     top, width, height, depth, stride);
 	qxl_bo_kunmap(bo);
 	if (ret)
 		goto out_release_backoff;
@@ -397,89 +256,3 @@ out_free_drawable:
 		free_drawable(qdev, release);
 
 }
-
-void qxl_draw_copyarea(struct qxl_device *qdev,
-		       u32 width, u32 height,
-		       u32 sx, u32 sy,
-		       u32 dx, u32 dy)
-{
-	struct qxl_drawable *drawable;
-	struct qxl_rect rect;
-	struct qxl_release *release;
-	int ret;
-
-	ret = alloc_drawable(qdev, &release);
-	if (ret)
-		return;
-
-	/* do a reservation run over all the objects we just allocated */
-	ret = qxl_release_reserve_list(release, true);
-	if (ret)
-		goto out_free_release;
-
-	rect.left = dx;
-	rect.top = dy;
-	rect.right = dx + width;
-	rect.bottom = dy + height;
-	ret = make_drawable(qdev, 0, QXL_COPY_BITS, &rect, release);
-	if (ret) {
-		qxl_release_backoff_reserve_list(release);
-		goto out_free_release;
-	}
-
-	drawable = (struct qxl_drawable *)qxl_release_map(qdev, release);
-	drawable->u.copy_bits.src_pos.x = sx;
-	drawable->u.copy_bits.src_pos.y = sy;
-	qxl_release_unmap(qdev, release, &drawable->release_info);
-
-	qxl_push_command_ring_release(qdev, release, QXL_CMD_DRAW, false);
-	qxl_release_fence_buffer_objects(release);
-
-out_free_release:
-	if (ret)
-		free_drawable(qdev, release);
-}
-
-void qxl_draw_fill(struct qxl_draw_fill *qxl_draw_fill_rec)
-{
-	struct qxl_device *qdev = qxl_draw_fill_rec->qdev;
-	struct qxl_rect rect = qxl_draw_fill_rec->rect;
-	uint32_t color = qxl_draw_fill_rec->color;
-	uint16_t rop = qxl_draw_fill_rec->rop;
-	struct qxl_drawable *drawable;
-	struct qxl_release *release;
-	int ret;
-
-	ret = alloc_drawable(qdev, &release);
-	if (ret)
-		return;
-
-	/* do a reservation run over all the objects we just allocated */
-	ret = qxl_release_reserve_list(release, true);
-	if (ret)
-		goto out_free_release;
-
-	ret = make_drawable(qdev, 0, QXL_DRAW_FILL, &rect, release);
-	if (ret) {
-		qxl_release_backoff_reserve_list(release);
-		goto out_free_release;
-	}
-
-	drawable = (struct qxl_drawable *)qxl_release_map(qdev, release);
-	drawable->u.fill.brush.type = SPICE_BRUSH_TYPE_SOLID;
-	drawable->u.fill.brush.u.color = color;
-	drawable->u.fill.rop_descriptor = rop;
-	drawable->u.fill.mask.flags = 0;
-	drawable->u.fill.mask.pos.x = 0;
-	drawable->u.fill.mask.pos.y = 0;
-	drawable->u.fill.mask.bitmap = 0;
-
-	qxl_release_unmap(qdev, release, &drawable->release_info);
-
-	qxl_push_command_ring_release(qdev, release, QXL_CMD_DRAW, false);
-	qxl_release_fence_buffer_objects(release);
-
-out_free_release:
-	if (ret)
-		free_drawable(qdev, release);
-}
diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
index ccb090f3ab30..bb81e310eb6d 100644
--- a/drivers/gpu/drm/qxl/qxl_drv.c
+++ b/drivers/gpu/drm/qxl/qxl_drv.c
@@ -33,7 +33,8 @@
 
 #include <drm/drmP.h>
 #include <drm/drm.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_modeset_helper.h>
+#include <drm/drm_probe_helper.h>
 #include "qxl_drv.h"
 #include "qxl_object.h"
 
@@ -93,6 +94,8 @@ qxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (ret)
 		goto modeset_cleanup;
 
+	drm_fb_helper_remove_conflicting_pci_framebuffers(pdev, 0, "qxl");
+	drm_fbdev_generic_setup(&qdev->ddev, 32);
 	return 0;
 
 modeset_cleanup:
@@ -242,7 +245,6 @@ static struct pci_driver qxl_pci_driver = {
 
 static struct drm_driver qxl_driver = {
 	.driver_features = DRIVER_GEM | DRIVER_MODESET | DRIVER_PRIME |
-			   DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED |
 			   DRIVER_ATOMIC,
 
 	.dumb_create = qxl_mode_dumb_create,
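These two calls in probe replace the entire hand-rolled fbdev layer removed below; the same minimal pattern works for any atomic PCI driver (a sketch, with a hypothetical name string):

#include <drm/drm_fb_helper.h>

/* Kick out firmware framebuffers (vesafb/efifb) claiming the same
 * PCI resource, then register the generic fbdev emulation at 32 bpp. */
static void example_setup_fbdev(struct drm_device *drm,
				struct pci_dev *pdev)
{
	drm_fb_helper_remove_conflicting_pci_framebuffers(pdev, 0,
							  "exampledrmfb");
	drm_fbdev_generic_setup(drm, 32);
}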
diff --git a/drivers/gpu/drm/qxl/qxl_drv.h b/drivers/gpu/drm/qxl/qxl_drv.h
index 13a0254b59a1..4a0331b3ff7d 100644
--- a/drivers/gpu/drm/qxl/qxl_drv.h
+++ b/drivers/gpu/drm/qxl/qxl_drv.h
@@ -84,6 +84,7 @@ struct qxl_bo {
 	struct ttm_bo_kmap_obj		kmap;
 	unsigned int pin_count;
 	void				*kptr;
+	unsigned int                    map_count;
 	int                             type;
 
 	/* Constant after initialization */
@@ -130,10 +131,13 @@ struct qxl_mman {
 };
 
 struct qxl_memslot {
+	int             index;
+	const char      *name;
 	uint8_t		generation;
 	uint64_t	start_phys_addr;
-	uint64_t	end_phys_addr;
+	uint64_t	size;
 	uint64_t	high_bits;
+	uint64_t        gpu_offset;
 };
 
 enum {
@@ -216,8 +220,6 @@ struct qxl_device {
 	struct qxl_mman		mman;
 	struct qxl_gem		gem;
 
-	struct drm_fb_helper	fb_helper;
-
 	void *ram_physical;
 
 	struct qxl_ring *release_ring;
@@ -226,16 +228,12 @@ struct qxl_device {
 
 	struct qxl_ram_header *ram_header;
 
-	unsigned int primary_created:1;
-
-	struct qxl_memslot	*mem_slots;
-	uint8_t		n_mem_slots;
+	struct qxl_bo *primary_bo;
+	struct qxl_bo *dumb_shadow_bo;
+	struct qxl_head *dumb_heads;
 
-	uint8_t		main_mem_slot;
-	uint8_t		surfaces_mem_slot;
-	uint8_t		slot_id_bits;
-	uint8_t		slot_gen_bits;
-	uint64_t	va_slot_mask;
+	struct qxl_memslot main_slot;
+	struct qxl_memslot surfaces_slot;
 
 	spinlock_t	release_lock;
 	struct idr	release_idr;
@@ -308,30 +306,20 @@ void qxl_ring_free(struct qxl_ring *ring);
 void qxl_ring_init_hdr(struct qxl_ring *ring);
 int qxl_check_idle(struct qxl_ring *ring);
 
-static inline void *
-qxl_fb_virtual_address(struct qxl_device *qdev, unsigned long physical)
-{
-	DRM_DEBUG_DRIVER("not implemented (%lu)\n", physical);
-	return 0;
-}
-
 static inline uint64_t
 qxl_bo_physical_address(struct qxl_device *qdev, struct qxl_bo *bo,
 			unsigned long offset)
 {
-	int slot_id = bo->type == QXL_GEM_DOMAIN_VRAM ? qdev->main_mem_slot : qdev->surfaces_mem_slot;
-	struct qxl_memslot *slot = &(qdev->mem_slots[slot_id]);
+	struct qxl_memslot *slot =
+		(bo->tbo.mem.mem_type == TTM_PL_VRAM)
+		? &qdev->main_slot : &qdev->surfaces_slot;
+
+	WARN_ON_ONCE((bo->tbo.offset & slot->gpu_offset) != slot->gpu_offset);
 
 	/* TODO - need to hold one of the locks to read tbo.offset */
-	return slot->high_bits | (bo->tbo.offset + offset);
+	return slot->high_bits | (bo->tbo.offset - slot->gpu_offset + offset);
 }
 
-/* qxl_fb.c */
-#define QXLFB_CONN_LIMIT 1
-
-int qxl_fbdev_init(struct qxl_device *qdev);
-void qxl_fbdev_fini(struct qxl_device *qdev);
-
 /* qxl_display.c */
 void qxl_display_read_client_monitors_config(struct qxl_device *qdev);
 int qxl_create_monitors_object(struct qxl_device *qdev);
@@ -392,7 +380,6 @@ void qxl_update_screen(struct qxl_device *qxl);
 /* qxl io operations (qxl_cmd.c) */
 
 void qxl_io_create_primary(struct qxl_device *qdev,
-			   unsigned int offset,
 			   struct qxl_bo *bo);
 void qxl_io_destroy_primary(struct qxl_device *qdev);
 void qxl_io_memslot_add(struct qxl_device *qdev, uint8_t id);
@@ -437,22 +424,13 @@ int qxl_alloc_bo_reserved(struct qxl_device *qdev,
 			  struct qxl_bo **_bo);
 /* qxl drawing commands */
 
-void qxl_draw_opaque_fb(const struct qxl_fb_image *qxl_fb_image,
-			int stride /* filled in if 0 */);
-
 void qxl_draw_dirty_fb(struct qxl_device *qdev,
 		       struct drm_framebuffer *fb,
 		       struct qxl_bo *bo,
 		       unsigned int flags, unsigned int color,
 		       struct drm_clip_rect *clips,
-		       unsigned int num_clips, int inc);
-
-void qxl_draw_fill(struct qxl_draw_fill *qxl_draw_fill_rec);
-
-void qxl_draw_copyarea(struct qxl_device *qdev,
-		       u32 width, u32 height,
-		       u32 sx, u32 sy,
-		       u32 dx, u32 dy);
+		       unsigned int num_clips, int inc,
+		       uint32_t dumb_shadow_offset);
 
 void qxl_release_free(struct qxl_device *qdev,
 		      struct qxl_release *release);
@@ -485,9 +463,6 @@ int qxl_gem_prime_mmap(struct drm_gem_object *obj,
 int qxl_irq_init(struct qxl_device *qdev);
 irqreturn_t qxl_irq_handler(int irq, void *arg);
 
-/* qxl_fb.c */
-bool qxl_fbdev_qobj_is_fb(struct qxl_device *qdev, struct qxl_bo *qobj);
-
 int qxl_debugfs_add_files(struct qxl_device *qdev,
 			  struct drm_info_list *files,
 			  unsigned int nfiles);
@@ -497,8 +472,7 @@ int qxl_surface_id_alloc(struct qxl_device *qdev,
 void qxl_surface_id_dealloc(struct qxl_device *qdev,
 			    uint32_t surface_id);
 int qxl_hw_surface_alloc(struct qxl_device *qdev,
-			 struct qxl_bo *surf,
-			 struct ttm_mem_reg *mem);
+			 struct qxl_bo *surf);
 int qxl_hw_surface_dealloc(struct qxl_device *qdev,
 			   struct qxl_bo *surf);
 
diff --git a/drivers/gpu/drm/qxl/qxl_dumb.c b/drivers/gpu/drm/qxl/qxl_dumb.c
index e3765739c396..272d19b677d8 100644
--- a/drivers/gpu/drm/qxl/qxl_dumb.c
+++ b/drivers/gpu/drm/qxl/qxl_dumb.c
@@ -59,7 +59,7 @@ int qxl_mode_dumb_create(struct drm_file *file_priv,
 	surf.stride = pitch;
 	surf.format = format;
 	r = qxl_gem_object_create_with_handle(qdev, file_priv,
-					      QXL_GEM_DOMAIN_VRAM,
+					      QXL_GEM_DOMAIN_SURFACE,
 					      args->size, &surf, &qobj,
 					      &handle);
 	if (r)
diff --git a/drivers/gpu/drm/qxl/qxl_fb.c b/drivers/gpu/drm/qxl/qxl_fb.c
deleted file mode 100644
index a819d24225d2..000000000000
--- a/drivers/gpu/drm/qxl/qxl_fb.c
+++ /dev/null
@@ -1,300 +0,0 @@
-/*
- * Copyright © 2013 Red Hat
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the next
- * paragraph) shall be included in all copies or substantial portions of the
- * Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
- * Authors:
- *     David Airlie
- */
-#include <linux/module.h>
-
-#include <drm/drmP.h>
-#include <drm/drm.h>
-#include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
-#include <drm/drm_fb_helper.h>
-#include <drm/drm_gem_framebuffer_helper.h>
-
-#include "qxl_drv.h"
-
-#include "qxl_object.h"
-
-static void qxl_fb_image_init(struct qxl_fb_image *qxl_fb_image,
-			      struct qxl_device *qdev, struct fb_info *info,
-			      const struct fb_image *image)
-{
-	qxl_fb_image->qdev = qdev;
-	if (info) {
-		qxl_fb_image->visual = info->fix.visual;
-		if (qxl_fb_image->visual == FB_VISUAL_TRUECOLOR ||
-		    qxl_fb_image->visual == FB_VISUAL_DIRECTCOLOR)
-			memcpy(&qxl_fb_image->pseudo_palette,
-			       info->pseudo_palette,
-			       sizeof(qxl_fb_image->pseudo_palette));
-	} else {
-		 /* fallback */
-		if (image->depth == 1)
-			qxl_fb_image->visual = FB_VISUAL_MONO10;
-		else
-			qxl_fb_image->visual = FB_VISUAL_DIRECTCOLOR;
-	}
-	if (image) {
-		memcpy(&qxl_fb_image->fb_image, image,
-		       sizeof(qxl_fb_image->fb_image));
-	}
-}
-
-static struct fb_ops qxlfb_ops = {
-	.owner = THIS_MODULE,
-	DRM_FB_HELPER_DEFAULT_OPS,
-	.fb_fillrect = drm_fb_helper_sys_fillrect,
-	.fb_copyarea = drm_fb_helper_sys_copyarea,
-	.fb_imageblit = drm_fb_helper_sys_imageblit,
-};
-
-static void qxlfb_destroy_pinned_object(struct drm_gem_object *gobj)
-{
-	struct qxl_bo *qbo = gem_to_qxl_bo(gobj);
-
-	qxl_bo_kunmap(qbo);
-	qxl_bo_unpin(qbo);
-
-	drm_gem_object_put_unlocked(gobj);
-}
-
-static int qxlfb_create_pinned_object(struct qxl_device *qdev,
-				      const struct drm_mode_fb_cmd2 *mode_cmd,
-				      struct drm_gem_object **gobj_p)
-{
-	struct drm_gem_object *gobj = NULL;
-	struct qxl_bo *qbo = NULL;
-	int ret;
-	int aligned_size, size;
-	int height = mode_cmd->height;
-
-	size = mode_cmd->pitches[0] * height;
-	aligned_size = ALIGN(size, PAGE_SIZE);
-	/* TODO: unallocate and reallocate surface0 for real. Hack to just
-	 * have a large enough surface0 for 1024x768 Xorg 32bpp mode */
-	ret = qxl_gem_object_create(qdev, aligned_size, 0,
-				    QXL_GEM_DOMAIN_SURFACE,
-				    false, /* is discardable */
-				    false, /* is kernel (false means device) */
-				    NULL,
-				    &gobj);
-	if (ret) {
-		pr_err("failed to allocate framebuffer (%d)\n",
-		       aligned_size);
-		return -ENOMEM;
-	}
-	qbo = gem_to_qxl_bo(gobj);
-
-	qbo->surf.width = mode_cmd->width;
-	qbo->surf.height = mode_cmd->height;
-	qbo->surf.stride = mode_cmd->pitches[0];
-	qbo->surf.format = SPICE_SURFACE_FMT_32_xRGB;
-
-	ret = qxl_bo_pin(qbo);
-	if (ret) {
-		goto out_unref;
-	}
-	ret = qxl_bo_kmap(qbo, NULL);
-
-	if (ret)
-		goto out_unref;
-
-	*gobj_p = gobj;
-	return 0;
-out_unref:
-	qxlfb_destroy_pinned_object(gobj);
-	*gobj_p = NULL;
-	return ret;
-}
-
-/*
- * FIXME
- * It should not be necessary to have a special dirty() callback for fbdev.
- */
-static int qxlfb_framebuffer_dirty(struct drm_framebuffer *fb,
-				   struct drm_file *file_priv,
-				   unsigned int flags, unsigned int color,
-				   struct drm_clip_rect *clips,
-				   unsigned int num_clips)
-{
-	struct qxl_device *qdev = fb->dev->dev_private;
-	struct fb_info *info = qdev->fb_helper.fbdev;
-	struct qxl_fb_image qxl_fb_image;
-	struct fb_image *image = &qxl_fb_image.fb_image;
-
-	/* TODO: hard coding 32 bpp */
-	int stride = fb->pitches[0];
-
-	/*
-	 * we are using a shadow draw buffer, at qdev->surface0_shadow
-	 */
-	image->dx = clips->x1;
-	image->dy = clips->y1;
-	image->width = clips->x2 - clips->x1;
-	image->height = clips->y2 - clips->y1;
-	image->fg_color = 0xffffffff; /* unused, just to avoid uninitialized
-					 warnings */
-	image->bg_color = 0;
-	image->depth = 32;	     /* TODO: take from somewhere? */
-	image->cmap.start = 0;
-	image->cmap.len = 0;
-	image->cmap.red = NULL;
-	image->cmap.green = NULL;
-	image->cmap.blue = NULL;
-	image->cmap.transp = NULL;
-	image->data = info->screen_base + (clips->x1 * 4) + (stride * clips->y1);
-
-	qxl_fb_image_init(&qxl_fb_image, qdev, info, NULL);
-	qxl_draw_opaque_fb(&qxl_fb_image, stride);
-
-	return 0;
-}
-
-static const struct drm_framebuffer_funcs qxlfb_fb_funcs = {
-	.destroy = drm_gem_fb_destroy,
-	.create_handle = drm_gem_fb_create_handle,
-	.dirty = qxlfb_framebuffer_dirty,
-};
-
-static int qxlfb_create(struct drm_fb_helper *helper,
-			struct drm_fb_helper_surface_size *sizes)
-{
-	struct qxl_device *qdev =
-		container_of(helper, struct qxl_device, fb_helper);
-	struct fb_info *info;
-	struct drm_framebuffer *fb = NULL;
-	struct drm_mode_fb_cmd2 mode_cmd;
-	struct drm_gem_object *gobj = NULL;
-	struct qxl_bo *qbo = NULL;
-	int ret;
-	int bpp = sizes->surface_bpp;
-	int depth = sizes->surface_depth;
-	void *shadow;
-
-	mode_cmd.width = sizes->surface_width;
-	mode_cmd.height = sizes->surface_height;
-
-	mode_cmd.pitches[0] = ALIGN(mode_cmd.width * ((bpp + 1) / 8), 64);
-	mode_cmd.pixel_format = drm_mode_legacy_fb_format(bpp, depth);
-
-	ret = qxlfb_create_pinned_object(qdev, &mode_cmd, &gobj);
-	if (ret < 0)
-		return ret;
-
-	qbo = gem_to_qxl_bo(gobj);
-	DRM_DEBUG_DRIVER("%dx%d %d\n", mode_cmd.width,
-			 mode_cmd.height, mode_cmd.pitches[0]);
-
-	shadow = vmalloc(array_size(mode_cmd.pitches[0], mode_cmd.height));
-	/* TODO: what's the usual response to memory allocation errors? */
-	BUG_ON(!shadow);
-	DRM_DEBUG_DRIVER("surface0 at gpu offset %lld, mmap_offset %lld (virt %p, shadow %p)\n",
-			 qxl_bo_gpu_offset(qbo), qxl_bo_mmap_offset(qbo),
-			 qbo->kptr, shadow);
-
-	info = drm_fb_helper_alloc_fbi(helper);
-	if (IS_ERR(info)) {
-		ret = PTR_ERR(info);
-		goto out_unref;
-	}
-
-	info->par = helper;
-
-	fb = drm_gem_fbdev_fb_create(&qdev->ddev, sizes, 64, gobj,
-				     &qxlfb_fb_funcs);
-	if (IS_ERR(fb)) {
-		DRM_ERROR("Failed to create framebuffer: %ld\n", PTR_ERR(fb));
-		ret = PTR_ERR(fb);
-		goto out_unref;
-	}
-
-	/* setup helper with fb data */
-	qdev->fb_helper.fb = fb;
-
-	strcpy(info->fix.id, "qxldrmfb");
-
-	drm_fb_helper_fill_fix(info, fb->pitches[0], fb->format->depth);
-
-	info->fbops = &qxlfb_ops;
-
-	/*
-	 * TODO: using gobj->size in various places in this function. Not sure
-	 * what the difference between the different sizes is.
-	 */
-	info->fix.smem_start = qdev->vram_base; /* TODO - correct? */
-	info->fix.smem_len = gobj->size;
-	info->screen_base = shadow;
-	info->screen_size = gobj->size;
-
-	drm_fb_helper_fill_var(info, &qdev->fb_helper, sizes->fb_width,
-			       sizes->fb_height);
-
-	/* setup aperture base/size for vesafb takeover */
-	info->apertures->ranges[0].base = qdev->ddev.mode_config.fb_base;
-	info->apertures->ranges[0].size = qdev->vram_size;
-
-	info->fix.mmio_start = 0;
-	info->fix.mmio_len = 0;
-
-	if (info->screen_base == NULL) {
-		ret = -ENOSPC;
-		goto out_unref;
-	}
-
-	/* XXX error handling. */
-	drm_fb_helper_defio_init(helper);
-
-	DRM_INFO("fb mappable at 0x%lX, size %lu\n",  info->fix.smem_start, (unsigned long)info->screen_size);
-	DRM_INFO("fb: depth %d, pitch %d, width %d, height %d\n",
-		 fb->format->depth, fb->pitches[0], fb->width, fb->height);
-	return 0;
-
-out_unref:
-	if (qbo) {
-		qxl_bo_kunmap(qbo);
-		qxl_bo_unpin(qbo);
-	}
-	drm_gem_object_put_unlocked(gobj);
-	return ret;
-}
-
-static const struct drm_fb_helper_funcs qxl_fb_helper_funcs = {
-	.fb_probe = qxlfb_create,
-};
-
-int qxl_fbdev_init(struct qxl_device *qdev)
-{
-	return drm_fb_helper_fbdev_setup(&qdev->ddev, &qdev->fb_helper,
-					 &qxl_fb_helper_funcs, 32,
-					 QXLFB_CONN_LIMIT);
-}
-
-void qxl_fbdev_fini(struct qxl_device *qdev)
-{
-	struct fb_info *fbi = qdev->fb_helper.fbdev;
-	void *shadow = fbi ? fbi->screen_buffer : NULL;
-
-	drm_fb_helper_fbdev_teardown(&qdev->ddev);
-	vfree(shadow);
-}
diff --git a/drivers/gpu/drm/qxl/qxl_kms.c b/drivers/gpu/drm/qxl/qxl_kms.c
index 15238a413f9d..bee61fa2c9bc 100644
--- a/drivers/gpu/drm/qxl/qxl_kms.c
+++ b/drivers/gpu/drm/qxl/qxl_kms.c
@@ -26,7 +26,7 @@
 #include "qxl_drv.h"
 #include "qxl_object.h"
 
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/io-mapping.h>
 
 int qxl_log_level;
@@ -53,40 +53,47 @@ static bool qxl_check_device(struct qxl_device *qdev)
 	return true;
 }
 
-static void setup_hw_slot(struct qxl_device *qdev, int slot_index,
-			  struct qxl_memslot *slot)
+static void setup_hw_slot(struct qxl_device *qdev, struct qxl_memslot *slot)
 {
 	qdev->ram_header->mem_slot.mem_start = slot->start_phys_addr;
-	qdev->ram_header->mem_slot.mem_end = slot->end_phys_addr;
-	qxl_io_memslot_add(qdev, slot_index);
+	qdev->ram_header->mem_slot.mem_end = slot->start_phys_addr + slot->size;
+	qxl_io_memslot_add(qdev, qdev->rom->slots_start + slot->index);
 }
 
-static uint8_t setup_slot(struct qxl_device *qdev, uint8_t slot_index_offset,
-	unsigned long start_phys_addr, unsigned long end_phys_addr)
+static void setup_slot(struct qxl_device *qdev,
+		       struct qxl_memslot *slot,
+		       unsigned int slot_index,
+		       const char *slot_name,
+		       unsigned long start_phys_addr,
+		       unsigned long size)
 {
 	uint64_t high_bits;
-	struct qxl_memslot *slot;
-	uint8_t slot_index;
 
-	slot_index = qdev->rom->slots_start + slot_index_offset;
-	slot = &qdev->mem_slots[slot_index];
+	slot->index = slot_index;
+	slot->name = slot_name;
 	slot->start_phys_addr = start_phys_addr;
-	slot->end_phys_addr = end_phys_addr;
+	slot->size = size;
 
-	setup_hw_slot(qdev, slot_index, slot);
+	setup_hw_slot(qdev, slot);
 
 	slot->generation = qdev->rom->slot_generation;
-	high_bits = slot_index << qdev->slot_gen_bits;
+	high_bits = (qdev->rom->slots_start + slot->index)
+		<< qdev->rom->slot_gen_bits;
 	high_bits |= slot->generation;
-	high_bits <<= (64 - (qdev->slot_gen_bits + qdev->slot_id_bits));
+	high_bits <<= (64 - (qdev->rom->slot_gen_bits + qdev->rom->slot_id_bits));
 	slot->high_bits = high_bits;
-	return slot_index;
+
+	DRM_INFO("slot %d (%s): base 0x%08lx, size 0x%08lx, gpu_offset 0x%lx\n",
+		 slot->index, slot->name,
+		 (unsigned long)slot->start_phys_addr,
+		 (unsigned long)slot->size,
+		 (unsigned long)slot->gpu_offset);
 }
 
 void qxl_reinit_memslots(struct qxl_device *qdev)
 {
-	setup_hw_slot(qdev, qdev->main_mem_slot, &qdev->mem_slots[qdev->main_mem_slot]);
-	setup_hw_slot(qdev, qdev->surfaces_mem_slot, &qdev->mem_slots[qdev->surfaces_mem_slot]);
+	setup_hw_slot(qdev, &qdev->main_slot);
+	setup_hw_slot(qdev, &qdev->surfaces_slot);
 }
 
 static void qxl_gc_work(struct work_struct *work)
@@ -229,23 +236,6 @@ int qxl_device_init(struct qxl_device *qdev,
 		r = -ENOMEM;
 		goto cursor_ring_free;
 	}
-	/* TODO - slot initialization should happen on reset. where is our
-	 * reset handler? */
-	qdev->n_mem_slots = qdev->rom->slots_end;
-	qdev->slot_gen_bits = qdev->rom->slot_gen_bits;
-	qdev->slot_id_bits = qdev->rom->slot_id_bits;
-	qdev->va_slot_mask =
-		(~(uint64_t)0) >> (qdev->slot_id_bits + qdev->slot_gen_bits);
-
-	qdev->mem_slots =
-		kmalloc_array(qdev->n_mem_slots, sizeof(struct qxl_memslot),
-			      GFP_KERNEL);
-
-	if (!qdev->mem_slots) {
-		DRM_ERROR("Unable to alloc mem slots\n");
-		r = -ENOMEM;
-		goto release_ring_free;
-	}
 
 	idr_init(&qdev->release_idr);
 	spin_lock_init(&qdev->release_idr_lock);
@@ -264,33 +254,24 @@ int qxl_device_init(struct qxl_device *qdev,
 	r = qxl_irq_init(qdev);
 	if (r) {
 		DRM_ERROR("Unable to init qxl irq\n");
-		goto mem_slots_free;
+		goto release_ring_free;
 	}
 
 	/*
 	 * Note that virtual is surface0. We rely on the single ioremap done
 	 * before.
 	 */
-	qdev->main_mem_slot = setup_slot(qdev, 0,
-		(unsigned long)qdev->vram_base,
-		(unsigned long)qdev->vram_base + qdev->rom->ram_header_offset);
-	qdev->surfaces_mem_slot = setup_slot(qdev, 1,
-		(unsigned long)qdev->surfaceram_base,
-		(unsigned long)qdev->surfaceram_base + qdev->surfaceram_size);
-	DRM_INFO("main mem slot %d [%lx,%x]\n",
-		 qdev->main_mem_slot,
-		 (unsigned long)qdev->vram_base, qdev->rom->ram_header_offset);
-	DRM_INFO("surface mem slot %d [%lx,%lx]\n",
-		 qdev->surfaces_mem_slot,
-		 (unsigned long)qdev->surfaceram_base,
-		 (unsigned long)qdev->surfaceram_size);
+	setup_slot(qdev, &qdev->main_slot, 0, "main",
+		   (unsigned long)qdev->vram_base,
+		   (unsigned long)qdev->rom->ram_header_offset);
+	setup_slot(qdev, &qdev->surfaces_slot, 1, "surfaces",
+		   (unsigned long)qdev->surfaceram_base,
+		   (unsigned long)qdev->surfaceram_size);
 
 	INIT_WORK(&qdev->gc_work, qxl_gc_work);
 
 	return 0;
 
-mem_slots_free:
-	kfree(qdev->mem_slots);
 release_ring_free:
 	qxl_ring_free(qdev->release_ring);
 cursor_ring_free:
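The address math in setup_slot() can be read in isolation; a standalone sketch of the high-bits computation:

#include <stdint.h>

/* Slot id and generation are packed into the uppermost
 * (slot_id_bits + slot_gen_bits) bits of the 64-bit SPICE address;
 * the low bits remain available for the slot-relative offset. */
static uint64_t example_high_bits(uint64_t slot_id, uint64_t generation,
				  unsigned int slot_gen_bits,
				  unsigned int slot_id_bits)
{
	uint64_t high_bits = (slot_id << slot_gen_bits) | generation;

	return high_bits << (64 - (slot_gen_bits + slot_id_bits));
}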
diff --git a/drivers/gpu/drm/qxl/qxl_object.c b/drivers/gpu/drm/qxl/qxl_object.c
index 91f3bbc73ecc..4928fa602944 100644
--- a/drivers/gpu/drm/qxl/qxl_object.c
+++ b/drivers/gpu/drm/qxl/qxl_object.c
@@ -36,6 +36,7 @@ static void qxl_ttm_bo_destroy(struct ttm_buffer_object *tbo)
 	qdev = (struct qxl_device *)bo->gem_base.dev->dev_private;
 
 	qxl_surface_evict(qdev, bo, false);
+	WARN_ON_ONCE(bo->map_count > 0);
 	mutex_lock(&qdev->gem.mutex);
 	list_del_init(&bo->list);
 	mutex_unlock(&qdev->gem.mutex);
@@ -60,8 +61,10 @@ void qxl_ttm_placement_from_domain(struct qxl_bo *qbo, u32 domain, bool pinned)
 	qbo->placement.busy_placement = qbo->placements;
 	if (domain == QXL_GEM_DOMAIN_VRAM)
 		qbo->placements[c++].flags = TTM_PL_FLAG_CACHED | TTM_PL_FLAG_VRAM | pflag;
-	if (domain == QXL_GEM_DOMAIN_SURFACE)
+	if (domain == QXL_GEM_DOMAIN_SURFACE) {
 		qbo->placements[c++].flags = TTM_PL_FLAG_CACHED | TTM_PL_FLAG_PRIV | pflag;
+		qbo->placements[c++].flags = TTM_PL_FLAG_CACHED | TTM_PL_FLAG_VRAM | pflag;
+	}
 	if (domain == QXL_GEM_DOMAIN_CPU)
 		qbo->placements[c++].flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_SYSTEM | pflag;
 	if (!c)
@@ -129,6 +132,7 @@ int qxl_bo_kmap(struct qxl_bo *bo, void **ptr)
 	if (bo->kptr) {
 		if (ptr)
 			*ptr = bo->kptr;
+		bo->map_count++;
 		return 0;
 	}
 	r = ttm_bo_kmap(&bo->tbo, 0, bo->tbo.num_pages, &bo->kmap);
@@ -137,6 +141,7 @@ int qxl_bo_kmap(struct qxl_bo *bo, void **ptr)
 	bo->kptr = ttm_kmap_obj_virtual(&bo->kmap, &is_iomem);
 	if (ptr)
 		*ptr = bo->kptr;
+	bo->map_count = 1;
 	return 0;
 }
 
@@ -178,6 +183,9 @@ void qxl_bo_kunmap(struct qxl_bo *bo)
 {
 	if (bo->kptr == NULL)
 		return;
+	bo->map_count--;
+	if (bo->map_count > 0)
+		return;
 	bo->kptr = NULL;
 	ttm_bo_kunmap(&bo->kmap);
 }
@@ -332,7 +340,7 @@ int qxl_bo_check_id(struct qxl_device *qdev, struct qxl_bo *bo)
 		if (ret)
 			return ret;
 
-		ret = qxl_hw_surface_alloc(qdev, bo, NULL);
+		ret = qxl_hw_surface_alloc(qdev, bo);
 		if (ret)
 			return ret;
 	}
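
The map_count added above turns qxl_bo_kmap()/qxl_bo_kunmap() into a balanced pair: the first kmap creates the kernel mapping and sets the count to 1, further kmaps of an already-mapped BO only bump it, and the mapping is torn down only when the last kunmap drops it back to zero (the WARN_ON_ONCE in qxl_ttm_bo_destroy catches leaks). A minimal sketch of that contract, with two independent users of the same BO; illustrative only, not part of the patch:

	#include "qxl_object.h"

	/* Sketch: the kernel mapping must survive until both users are done. */
	static void map_count_contract(struct qxl_bo *bo)
	{
		void *ptr_a, *ptr_b;

		if (qxl_bo_kmap(bo, &ptr_a))	/* first map: ttm_bo_kmap(), map_count = 1 */
			return;
		qxl_bo_kmap(bo, &ptr_b);	/* bo->kptr reused, map_count = 2 */

		/* ptr_a == ptr_b == bo->kptr at this point */

		qxl_bo_kunmap(bo);		/* map_count = 1, mapping kept */
		qxl_bo_kunmap(bo);		/* map_count = 0, ttm_bo_kunmap() runs */
	}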
diff --git a/drivers/gpu/drm/qxl/qxl_prime.c b/drivers/gpu/drm/qxl/qxl_prime.c
index df65d3c1a7b8..8b448eca1cd9 100644
--- a/drivers/gpu/drm/qxl/qxl_prime.c
+++ b/drivers/gpu/drm/qxl/qxl_prime.c
@@ -23,30 +23,43 @@
  */
 
 #include "qxl_drv.h"
+#include "qxl_object.h"
 
 /* Empty Implementations as there should not be any other driver for a virtual
  * device that might share buffers with qxl */
 
 int qxl_gem_prime_pin(struct drm_gem_object *obj)
 {
-	WARN_ONCE(1, "not implemented");
-	return -ENOSYS;
+	struct qxl_bo *bo = gem_to_qxl_bo(obj);
+
+	return qxl_bo_pin(bo);
 }
 
 void qxl_gem_prime_unpin(struct drm_gem_object *obj)
 {
-	WARN_ONCE(1, "not implemented");
+	struct qxl_bo *bo = gem_to_qxl_bo(obj);
+
+	qxl_bo_unpin(bo);
 }
 
 void *qxl_gem_prime_vmap(struct drm_gem_object *obj)
 {
-	WARN_ONCE(1, "not implemented");
-	return ERR_PTR(-ENOSYS);
+	struct qxl_bo *bo = gem_to_qxl_bo(obj);
+	void *ptr;
+	int ret;
+
+	ret = qxl_bo_kmap(bo, &ptr);
+	if (ret < 0)
+		return ERR_PTR(ret);
+
+	return ptr;
 }
 
 void qxl_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr)
 {
-	WARN_ONCE(1, "not implemented");
+	struct qxl_bo *bo = gem_to_qxl_bo(obj);
+
+	qxl_bo_kunmap(bo);
 }
 
 int qxl_gem_prime_mmap(struct drm_gem_object *obj,
diff --git a/drivers/gpu/drm/qxl/qxl_ttm.c b/drivers/gpu/drm/qxl/qxl_ttm.c
index 886f61e94f24..92f5db5b296f 100644
--- a/drivers/gpu/drm/qxl/qxl_ttm.c
+++ b/drivers/gpu/drm/qxl/qxl_ttm.c
@@ -100,6 +100,11 @@ static int qxl_invalidate_caches(struct ttm_bo_device *bdev, uint32_t flags)
 static int qxl_init_mem_type(struct ttm_bo_device *bdev, uint32_t type,
 			     struct ttm_mem_type_manager *man)
 {
+	struct qxl_device *qdev = qxl_get_qdev(bdev);
+	unsigned int gpu_offset_shift =
+		64 - (qdev->rom->slot_gen_bits + qdev->rom->slot_id_bits + 8);
+	struct qxl_memslot *slot;
+
 	switch (type) {
 	case TTM_PL_SYSTEM:
 		/* System memory */
@@ -110,8 +115,11 @@ static int qxl_init_mem_type(struct ttm_bo_device *bdev, uint32_t type,
 	case TTM_PL_VRAM:
 	case TTM_PL_PRIV:
 		/* "On-card" video ram */
+		slot = (type == TTM_PL_VRAM) ?
+			&qdev->main_slot : &qdev->surfaces_slot;
+		slot->gpu_offset = (uint64_t)type << gpu_offset_shift;
 		man->func = &ttm_bo_manager_func;
-		man->gpu_offset = 0;
+		man->gpu_offset = slot->gpu_offset;
 		man->flags = TTM_MEMTYPE_FLAG_FIXED |
 			     TTM_MEMTYPE_FLAG_MAPPABLE;
 		man->available_caching = TTM_PL_MASK_CACHING;
@@ -196,7 +204,7 @@ static void qxl_ttm_io_mem_free(struct ttm_bo_device *bdev,
  * TTM backend functions.
  */
 struct qxl_ttm_tt {
-	struct ttm_dma_tt		ttm;
+	struct ttm_tt			ttm;
 	struct qxl_device		*qdev;
 	u64				offset;
 };
@@ -225,7 +233,7 @@ static void qxl_ttm_backend_destroy(struct ttm_tt *ttm)
 {
 	struct qxl_ttm_tt *gtt = (void *)ttm;
 
-	ttm_dma_tt_fini(&gtt->ttm);
+	ttm_tt_fini(&gtt->ttm);
 	kfree(gtt);
 }
 
@@ -245,13 +253,13 @@ static struct ttm_tt *qxl_ttm_tt_create(struct ttm_buffer_object *bo,
 	gtt = kzalloc(sizeof(struct qxl_ttm_tt), GFP_KERNEL);
 	if (gtt == NULL)
 		return NULL;
-	gtt->ttm.ttm.func = &qxl_backend_func;
+	gtt->ttm.func = &qxl_backend_func;
 	gtt->qdev = qdev;
-	if (ttm_dma_tt_init(&gtt->ttm, bo, page_flags)) {
+	if (ttm_tt_init(&gtt->ttm, bo, page_flags)) {
 		kfree(gtt);
 		return NULL;
 	}
-	return &gtt->ttm.ttm;
+	return &gtt->ttm;
 }
 
 static void qxl_move_null(struct ttm_buffer_object *bo,
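
qxl_init_mem_type() above now derives a per-memory-type GPU offset so that TTM addresses carry the memslot identifier in their high bits: gpu_offset_shift is 64 minus the slot generation bits, the slot id bits, and a further 8 bits, all advertised by the ROM, and the placement type is shifted up by that amount. A standalone arithmetic sketch of the result; the 8/8 bit widths are an assumption matching QEMU's usual ROM configuration, while the driver reads the real values from qdev->rom:

	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		const unsigned int slot_gen_bits = 8;	/* assumed rom->slot_gen_bits */
		const unsigned int slot_id_bits = 8;	/* assumed rom->slot_id_bits */
		const unsigned int shift = 64 - (slot_gen_bits + slot_id_bits + 8);

		/* TTM_PL_VRAM == 2 and TTM_PL_PRIV == 3 in this kernel generation. */
		const uint64_t vram = (uint64_t)2 << shift;
		const uint64_t priv = (uint64_t)3 << shift;

		/* prints: shift=40 vram=0x20000000000 priv=0x30000000000 */
		printf("shift=%u vram=0x%llx priv=0x%llx\n", shift,
		       (unsigned long long)vram, (unsigned long long)priv);
		return 0;
	}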
diff --git a/drivers/gpu/drm/r128/r128_cce.c b/drivers/gpu/drm/r128/r128_cce.c
index c9890afe69d6..b91af1bf531b 100644
--- a/drivers/gpu/drm/r128/r128_cce.c
+++ b/drivers/gpu/drm/r128/r128_cce.c
@@ -560,11 +560,12 @@ static int r128_do_init_cce(struct drm_device *dev, drm_r128_init_t *init)
 		dev_priv->gart_info.addr = NULL;
 		dev_priv->gart_info.bus_addr = 0;
 		dev_priv->gart_info.gart_reg_if = DRM_ATI_GART_PCI;
-		if (!drm_ati_pcigart_init(dev, &dev_priv->gart_info)) {
+		rc = drm_ati_pcigart_init(dev, &dev_priv->gart_info);
+		if (rc) {
 			DRM_ERROR("failed to init PCI GART!\n");
 			dev->dev_private = (void *)dev_priv;
 			r128_do_cleanup_cce(dev);
-			return -ENOMEM;
+			return rc;
 		}
 		R128_WRITE(R128_PCI_GART_PAGE, dev_priv->gart_info.bus_addr);
 #if IS_ENABLED(CONFIG_AGP)
diff --git a/drivers/gpu/drm/r128/r128_drv.c b/drivers/gpu/drm/r128/r128_drv.c
index 0d2b7e42b3a7..4b1a505ab353 100644
--- a/drivers/gpu/drm/r128/r128_drv.c
+++ b/drivers/gpu/drm/r128/r128_drv.c
@@ -57,7 +57,7 @@ static const struct file_operations r128_driver_fops = {
 static struct drm_driver driver = {
 	.driver_features =
 	    DRIVER_USE_AGP | DRIVER_PCI_DMA | DRIVER_SG | DRIVER_LEGACY |
-	    DRIVER_HAVE_DMA | DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED,
+	    DRIVER_HAVE_DMA | DRIVER_HAVE_IRQ,
 	.dev_priv_size = sizeof(drm_r128_buf_priv_t),
 	.load = r128_driver_load,
 	.preclose = r128_driver_preclose,
diff --git a/drivers/gpu/drm/radeon/atom.c b/drivers/gpu/drm/radeon/atom.c
index e55cbeee7a53..ac98ad561870 100644
--- a/drivers/gpu/drm/radeon/atom.c
+++ b/drivers/gpu/drm/radeon/atom.c
@@ -27,6 +27,8 @@
 #include <linux/slab.h>
 #include <asm/unaligned.h>
 
+#include <drm/drm_util.h>
+
 #define ATOM_DEBUG
 
 #include "atom.h"
diff --git a/drivers/gpu/drm/radeon/ci_dpm.c b/drivers/gpu/drm/radeon/ci_dpm.c
index a97294ac96d5..a12439266bb0 100644
--- a/drivers/gpu/drm/radeon/ci_dpm.c
+++ b/drivers/gpu/drm/radeon/ci_dpm.c
@@ -4869,10 +4869,12 @@ static void ci_request_link_speed_change_before_state_change(struct radeon_devic
 			pi->force_pcie_gen = RADEON_PCIE_GEN2;
 			if (current_link_speed == RADEON_PCIE_GEN2)
 				break;
+			/* fall through */
 		case RADEON_PCIE_GEN2:
 			if (radeon_acpi_pcie_performance_request(rdev, PCIE_PERF_REQ_PECI_GEN2, false) == 0)
 				break;
 #endif
+			/* fall through */
 		default:
 			pi->force_pcie_gen = ci_get_current_pcie_speed(rdev);
 			break;
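
The two comments added above mark the fall-through as intentional; at this point in the kernel's history, GCC's -Wimplicit-fallthrough is silenced by a literal /* fall through */ comment rather than the later fallthrough pseudo-keyword. A compressed sketch of the same cascade, with hypothetical try_gen* helpers standing in for the ACPI performance requests:

	#include <stdbool.h>

	/* Hypothetical helpers, illustrative only. */
	static bool try_gen3(void) { return false; }
	static bool try_gen2(void) { return false; }

	static const char *pick_pcie_gen(int wanted)
	{
		switch (wanted) {
		case 3:
			if (try_gen3())
				return "gen3";
			/* fall through */
		case 2:
			if (try_gen2())
				return "gen2";
			/* fall through */
		default:
			return "current";
		}
	}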
diff --git a/drivers/gpu/drm/radeon/evergreen_cs.c b/drivers/gpu/drm/radeon/evergreen_cs.c
index f471537c852f..1e14c6921454 100644
--- a/drivers/gpu/drm/radeon/evergreen_cs.c
+++ b/drivers/gpu/drm/radeon/evergreen_cs.c
@@ -1299,6 +1299,7 @@ static int evergreen_cs_handle_reg(struct radeon_cs_parser *p, u32 reg, u32 idx)
 			return -EINVAL;
 		}
 		ib[idx] += (u32)((reloc->gpu_offset >> 8) & 0xffffffff);
+		break;
 	case CB_TARGET_MASK:
 		track->cb_target_mask = radeon_get_ib_value(p, idx);
 		track->cb_dirty = true;
diff --git a/drivers/gpu/drm/radeon/radeon_acpi.c b/drivers/gpu/drm/radeon/radeon_acpi.c
index 8d3251a10cd4..224cc21bbe38 100644
--- a/drivers/gpu/drm/radeon/radeon_acpi.c
+++ b/drivers/gpu/drm/radeon/radeon_acpi.c
@@ -29,6 +29,7 @@
 #include <acpi/video.h>
 #include <drm/drmP.h>
 #include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include "radeon.h"
 #include "radeon_acpi.h"
 #include "atom.h"
diff --git a/drivers/gpu/drm/radeon/radeon_audio.c b/drivers/gpu/drm/radeon/radeon_audio.c
index 770e31f5fd1b..96f71114237a 100644
--- a/drivers/gpu/drm/radeon/radeon_audio.c
+++ b/drivers/gpu/drm/radeon/radeon_audio.c
@@ -516,21 +516,17 @@ static int radeon_audio_set_avi_packet(struct drm_encoder *encoder,
 	if (!connector)
 		return -EINVAL;
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, connector, mode);
 	if (err < 0) {
 		DRM_ERROR("failed to setup AVI infoframe: %d\n", err);
 		return err;
 	}
 
 	if (radeon_encoder->output_csc != RADEON_OUTPUT_CSC_BYPASS) {
-		if (drm_rgb_quant_range_selectable(radeon_connector_edid(connector))) {
-			if (radeon_encoder->output_csc == RADEON_OUTPUT_CSC_TVRGB)
-				frame.quantization_range = HDMI_QUANTIZATION_RANGE_LIMITED;
-			else
-				frame.quantization_range = HDMI_QUANTIZATION_RANGE_FULL;
-		} else {
-			frame.quantization_range = HDMI_QUANTIZATION_RANGE_DEFAULT;
-		}
+		drm_hdmi_avi_infoframe_quant_range(&frame, connector, mode,
+						   radeon_encoder->output_csc == RADEON_OUTPUT_CSC_TVRGB ?
+						   HDMI_QUANTIZATION_RANGE_LIMITED :
+						   HDMI_QUANTIZATION_RANGE_FULL);
 	}
 
 	err = hdmi_avi_infoframe_pack(&frame, buffer, sizeof(buffer));
diff --git a/drivers/gpu/drm/radeon/radeon_connectors.c b/drivers/gpu/drm/radeon/radeon_connectors.c
index 414642e5b7a3..de1745adcccc 100644
--- a/drivers/gpu/drm/radeon/radeon_connectors.c
+++ b/drivers/gpu/drm/radeon/radeon_connectors.c
@@ -28,6 +28,7 @@
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_dp_mst_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/radeon_drm.h>
 #include "radeon.h"
 #include "radeon_audio.h"
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index 59c8a6647ff2..53f29a115104 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -29,6 +29,7 @@
 #include <linux/slab.h>
 #include <drm/drmP.h>
 #include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_cache.h>
 #include <drm/radeon_drm.h>
 #include <linux/pm_runtime.h>
diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c
index 9d3ac8b981da..aa898c699101 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -35,6 +35,7 @@
 #include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_edid.h>
 
 #include <linux/gcd.h>
@@ -1646,7 +1647,7 @@ void radeon_modeset_fini(struct radeon_device *rdev)
 	if (rdev->mode_info.mode_config_initialized) {
 		drm_kms_helper_poll_fini(rdev->ddev);
 		radeon_hpd_fini(rdev);
-		drm_crtc_force_disable_all(rdev->ddev);
+		drm_helper_force_disable_all(rdev->ddev);
 		radeon_fbdev_fini(rdev);
 		radeon_afmt_fini(rdev);
 		drm_mode_config_cleanup(rdev->ddev);
diff --git a/drivers/gpu/drm/radeon/radeon_dp_mst.c b/drivers/gpu/drm/radeon/radeon_dp_mst.c
index 84b3ad2172a3..8d85540bbb43 100644
--- a/drivers/gpu/drm/radeon/radeon_dp_mst.c
+++ b/drivers/gpu/drm/radeon/radeon_dp_mst.c
@@ -3,6 +3,7 @@
 #include <drm/drmP.h>
 #include <drm/drm_dp_mst_helper.h>
 #include <drm/drm_fb_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "radeon.h"
 #include "atom.h"
@@ -320,19 +321,10 @@ static void radeon_dp_destroy_mst_connector(struct drm_dp_mst_topology_mgr *mgr,
 	DRM_DEBUG_KMS("\n");
 }
 
-static void radeon_dp_mst_hotplug(struct drm_dp_mst_topology_mgr *mgr)
-{
-	struct radeon_connector *master = container_of(mgr, struct radeon_connector, mst_mgr);
-	struct drm_device *dev = master->base.dev;
-
-	drm_kms_helper_hotplug_event(dev);
-}
-
 static const struct drm_dp_mst_topology_cbs mst_cbs = {
 	.add_connector = radeon_dp_add_mst_connector,
 	.register_connector = radeon_dp_register_mst_connector,
 	.destroy_connector = radeon_dp_destroy_mst_connector,
-	.hotplug = radeon_dp_mst_hotplug,
 };
 
 static struct
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index 99c63eeb2866..2e96c886392b 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -43,6 +43,7 @@
 #include <drm/drm_fb_helper.h>
 
 #include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 /*
  * KMS wrapper.
@@ -533,9 +534,7 @@ radeon_get_crtc_scanout_position(struct drm_device *dev, unsigned int pipe,
 
 static struct drm_driver kms_driver = {
 	.driver_features =
-	    DRIVER_USE_AGP |
-	    DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED | DRIVER_GEM |
-	    DRIVER_PRIME | DRIVER_RENDER,
+	    DRIVER_USE_AGP | DRIVER_GEM | DRIVER_PRIME | DRIVER_RENDER,
 	.load = radeon_driver_load_kms,
 	.open = radeon_driver_open_kms,
 	.postclose = radeon_driver_postclose_kms,
diff --git a/drivers/gpu/drm/radeon/radeon_irq_kms.c b/drivers/gpu/drm/radeon/radeon_irq_kms.c
index afaf10db47cc..1d5e3ba7383e 100644
--- a/drivers/gpu/drm/radeon/radeon_irq_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_irq_kms.c
@@ -27,6 +27,7 @@
  */
 #include <drm/drmP.h>
 #include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/radeon_drm.h>
 #include "radeon_reg.h"
 #include "radeon.h"
diff --git a/drivers/gpu/drm/radeon/radeon_legacy_encoders.c b/drivers/gpu/drm/radeon/radeon_legacy_encoders.c
index 222a1fa41d7c..7e3257e8fd56 100644
--- a/drivers/gpu/drm/radeon/radeon_legacy_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_legacy_encoders.c
@@ -24,6 +24,7 @@
  *          Alex Deucher
  */
 #include <drm/drmP.h>
+#include <drm/drm_util.h>
 #include <drm/drm_crtc_helper.h>
 #include <drm/radeon_drm.h>
 #include "radeon.h"
diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
index 0a785ef0ab66..c9f6cb77e857 100644
--- a/drivers/gpu/drm/radeon/si_dpm.c
+++ b/drivers/gpu/drm/radeon/si_dpm.c
@@ -5762,10 +5762,12 @@ static void si_request_link_speed_change_before_state_change(struct radeon_devic
 			si_pi->force_pcie_gen = RADEON_PCIE_GEN2;
 			if (current_link_speed == RADEON_PCIE_GEN2)
 				break;
+			/* fall through */
 		case RADEON_PCIE_GEN2:
 			if (radeon_acpi_pcie_performance_request(rdev, PCIE_PERF_REQ_PECI_GEN2, false) == 0)
 				break;
 #endif
+			/* fall through */
 		default:
 			si_pi->force_pcie_gen = si_get_current_pcie_speed(rdev);
 			break;
diff --git a/drivers/gpu/drm/rcar-du/Kconfig b/drivers/gpu/drm/rcar-du/Kconfig
index 225141656e19..7c36e2777a15 100644
--- a/drivers/gpu/drm/rcar-du/Kconfig
+++ b/drivers/gpu/drm/rcar-du/Kconfig
@@ -4,6 +4,7 @@ config DRM_RCAR_DU
 	depends on DRM && OF
 	depends on ARM || ARM64
 	depends on ARCH_RENESAS || COMPILE_TEST
+	imply DRM_RCAR_LVDS
 	select DRM_KMS_HELPER
 	select DRM_KMS_CMA_HELPER
 	select DRM_GEM_CMA_HELPER
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
index 90dacab67be5..4cdea14d552f 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
@@ -9,23 +9,26 @@
 
 #include <linux/clk.h>
 #include <linux/mutex.h>
+#include <linux/platform_device.h>
 #include <linux/sys_soc.h>
 
-#include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_device.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_vblank.h>
 
 #include "rcar_du_crtc.h"
 #include "rcar_du_drv.h"
+#include "rcar_du_encoder.h"
 #include "rcar_du_kms.h"
 #include "rcar_du_plane.h"
 #include "rcar_du_regs.h"
 #include "rcar_du_vsp.h"
+#include "rcar_lvds.h"
 
 static u32 rcar_du_crtc_read(struct rcar_du_crtc *rcrtc, u32 reg)
 {
@@ -316,26 +319,6 @@ static void rcar_du_crtc_set_display_timing(struct rcar_du_crtc *rcrtc)
 	rcar_du_crtc_write(rcrtc, DEWR,  mode->hdisplay);
 }
 
-void rcar_du_crtc_route_output(struct drm_crtc *crtc,
-			       enum rcar_du_output output)
-{
-	struct rcar_du_crtc *rcrtc = to_rcar_crtc(crtc);
-	struct rcar_du_device *rcdu = rcrtc->group->dev;
-
-	/*
-	 * Store the route from the CRTC output to the DU output. The DU will be
-	 * configured when starting the CRTC.
-	 */
-	rcrtc->outputs |= BIT(output);
-
-	/*
-	 * Store RGB routing to DPAD0, the hardware will be configured when
-	 * starting the CRTC.
-	 */
-	if (output == RCAR_DU_OUTPUT_DPAD0)
-		rcdu->dpad0_source = rcrtc->index;
-}
-
 static unsigned int plane_zpos(struct rcar_du_plane *plane)
 {
 	return plane->plane.state->normalized_zpos;
@@ -655,12 +638,49 @@ static void rcar_du_crtc_stop(struct rcar_du_crtc *rcrtc)
  * CRTC Functions
  */
 
+static int rcar_du_crtc_atomic_check(struct drm_crtc *crtc,
+				     struct drm_crtc_state *state)
+{
+	struct rcar_du_crtc_state *rstate = to_rcar_crtc_state(state);
+	struct drm_encoder *encoder;
+
+	/* Store the routes from the CRTC output to the DU outputs. */
+	rstate->outputs = 0;
+
+	drm_for_each_encoder_mask(encoder, crtc->dev, state->encoder_mask) {
+		struct rcar_du_encoder *renc = to_rcar_encoder(encoder);
+
+		rstate->outputs |= BIT(renc->output);
+	}
+
+	return 0;
+}
+
 static void rcar_du_crtc_atomic_enable(struct drm_crtc *crtc,
 				       struct drm_crtc_state *old_state)
 {
 	struct rcar_du_crtc *rcrtc = to_rcar_crtc(crtc);
+	struct rcar_du_crtc_state *rstate = to_rcar_crtc_state(crtc->state);
+	struct rcar_du_device *rcdu = rcrtc->group->dev;
 
 	rcar_du_crtc_get(rcrtc);
+
+	/*
+	 * On D3/E3 the dot clock is provided by the LVDS encoder attached to
+	 * the DU channel. We need to enable its clock output explicitly if
+	 * the LVDS output is disabled.
+	 */
+	if (rcdu->info->lvds_clk_mask & BIT(rcrtc->index) &&
+	    rstate->outputs == BIT(RCAR_DU_OUTPUT_DPAD0)) {
+		struct rcar_du_encoder *encoder =
+			rcdu->encoders[RCAR_DU_OUTPUT_LVDS0 + rcrtc->index];
+		const struct drm_display_mode *mode =
+			&crtc->state->adjusted_mode;
+
+		rcar_lvds_clk_enable(encoder->base.bridge,
+				     mode->clock * 1000);
+	}
+
 	rcar_du_crtc_start(rcrtc);
 }
 
@@ -668,18 +688,30 @@ static void rcar_du_crtc_atomic_disable(struct drm_crtc *crtc,
 					struct drm_crtc_state *old_state)
 {
 	struct rcar_du_crtc *rcrtc = to_rcar_crtc(crtc);
+	struct rcar_du_crtc_state *rstate = to_rcar_crtc_state(old_state);
+	struct rcar_du_device *rcdu = rcrtc->group->dev;
 
 	rcar_du_crtc_stop(rcrtc);
 	rcar_du_crtc_put(rcrtc);
 
+	if (rcdu->info->lvds_clk_mask & BIT(rcrtc->index) &&
+	    rstate->outputs == BIT(RCAR_DU_OUTPUT_DPAD0)) {
+		struct rcar_du_encoder *encoder =
+			rcdu->encoders[RCAR_DU_OUTPUT_LVDS0 + rcrtc->index];
+
+		/*
+		 * Disable the LVDS clock output, see
+		 * rcar_du_crtc_atomic_enable().
+		 */
+		rcar_lvds_clk_disable(encoder->base.bridge);
+	}
+
 	spin_lock_irq(&crtc->dev->event_lock);
 	if (crtc->state->event) {
 		drm_crtc_send_vblank_event(crtc, crtc->state->event);
 		crtc->state->event = NULL;
 	}
 	spin_unlock_irq(&crtc->dev->event_lock);
-
-	rcrtc->outputs = 0;
 }
 
 static void rcar_du_crtc_atomic_begin(struct drm_crtc *crtc,
@@ -755,6 +787,7 @@ enum drm_mode_status rcar_du_crtc_mode_valid(struct drm_crtc *crtc,
 }
 
 static const struct drm_crtc_helper_funcs crtc_helper_funcs = {
+	.atomic_check = rcar_du_crtc_atomic_check,
 	.atomic_begin = rcar_du_crtc_atomic_begin,
 	.atomic_flush = rcar_du_crtc_atomic_flush,
 	.atomic_enable = rcar_du_crtc_atomic_enable,
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
index 59ac6e7d22c9..bcb35b0b7612 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
@@ -14,7 +14,6 @@
 #include <linux/spinlock.h>
 #include <linux/wait.h>
 
-#include <drm/drmP.h>
 #include <drm/drm_crtc.h>
 
 #include <media/vsp1.h>
@@ -37,7 +36,6 @@ struct rcar_du_vsp;
  * @vblank_lock: protects vblank_wait and vblank_count
  * @vblank_wait: wait queue used to signal vertical blanking
  * @vblank_count: number of vertical blanking interrupts to wait for
- * @outputs: bitmask of the outputs (enum rcar_du_output) driven by this CRTC
  * @group: CRTC group this CRTC belongs to
  * @vsp: VSP feeding video to this CRTC
  * @vsp_pipe: index of the VSP pipeline feeding video to this CRTC
@@ -61,8 +59,6 @@ struct rcar_du_crtc {
 	wait_queue_head_t vblank_wait;
 	unsigned int vblank_count;
 
-	unsigned int outputs;
-
 	struct rcar_du_group *group;
 	struct rcar_du_vsp *vsp;
 	unsigned int vsp_pipe;
@@ -77,11 +73,13 @@ struct rcar_du_crtc {
  * struct rcar_du_crtc_state - Driver-specific CRTC state
  * @state: base DRM CRTC state
  * @crc: CRC computation configuration
+ * @outputs: bitmask of the outputs (enum rcar_du_output) driven by this CRTC
  */
 struct rcar_du_crtc_state {
 	struct drm_crtc_state state;
 
 	struct vsp1_du_crc_config crc;
+	unsigned int outputs;
 };
 
 #define to_rcar_crtc_state(s) container_of(s, struct rcar_du_crtc_state, state)
@@ -102,8 +100,6 @@ int rcar_du_crtc_create(struct rcar_du_group *rgrp, unsigned int swindex,
 void rcar_du_crtc_suspend(struct rcar_du_crtc *rcrtc);
 void rcar_du_crtc_resume(struct rcar_du_crtc *rcrtc);
 
-void rcar_du_crtc_route_output(struct drm_crtc *crtc,
-			       enum rcar_du_output output);
 void rcar_du_crtc_finish_page_flip(struct rcar_du_crtc *rcrtc);
 
 void rcar_du_crtc_dsysr_clr_set(struct rcar_du_crtc *rcrtc, u32 clr, u32 set);
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.c b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
index f50a3b1864bb..75ab17af13a9 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_drv.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
@@ -17,12 +17,12 @@
 #include <linux/slab.h>
 #include <linux/wait.h>
 
-#include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_fb_helper.h>
+#include <drm/drm_drv.h>
 #include <drm/drm_gem_cma_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "rcar_du_drv.h"
 #include "rcar_du_kms.h"
@@ -36,7 +36,6 @@
 static const struct rcar_du_device_info rzg1_du_r8a7743_info = {
 	.gen = 2,
 	.features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
-		  | RCAR_DU_FEATURE_EXT_CTRL_REGS
 		  | RCAR_DU_FEATURE_INTERLACED
 		  | RCAR_DU_FEATURE_TVM_SYNC,
 	.channels_mask = BIT(1) | BIT(0),
@@ -59,7 +58,6 @@ static const struct rcar_du_device_info rzg1_du_r8a7743_info = {
 static const struct rcar_du_device_info rzg1_du_r8a7745_info = {
 	.gen = 2,
 	.features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
-		  | RCAR_DU_FEATURE_EXT_CTRL_REGS
 		  | RCAR_DU_FEATURE_INTERLACED
 		  | RCAR_DU_FEATURE_TVM_SYNC,
 	.channels_mask = BIT(1) | BIT(0),
@@ -81,7 +79,6 @@ static const struct rcar_du_device_info rzg1_du_r8a7745_info = {
 static const struct rcar_du_device_info rzg1_du_r8a77470_info = {
 	.gen = 2,
 	.features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
-		  | RCAR_DU_FEATURE_EXT_CTRL_REGS
 		  | RCAR_DU_FEATURE_INTERLACED
 		  | RCAR_DU_FEATURE_TVM_SYNC,
 	.channels_mask = BIT(1) | BIT(0),
@@ -105,8 +102,34 @@ static const struct rcar_du_device_info rzg1_du_r8a77470_info = {
 	},
 };
 
+static const struct rcar_du_device_info rcar_du_r8a774c0_info = {
+	.gen = 3,
+	.features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
+		  | RCAR_DU_FEATURE_VSP1_SOURCE,
+	.channels_mask = BIT(1) | BIT(0),
+	.routes = {
+		/*
+		 * R8A774C0 has one RGB output and two LVDS outputs.
+		 */
+		[RCAR_DU_OUTPUT_DPAD0] = {
+			.possible_crtcs = BIT(0) | BIT(1),
+			.port = 0,
+		},
+		[RCAR_DU_OUTPUT_LVDS0] = {
+			.possible_crtcs = BIT(0),
+			.port = 1,
+		},
+		[RCAR_DU_OUTPUT_LVDS1] = {
+			.possible_crtcs = BIT(1),
+			.port = 2,
+		},
+	},
+	.num_lvds = 2,
+	.lvds_clk_mask = BIT(1) | BIT(0),
+};
+
 static const struct rcar_du_device_info rcar_du_r8a7779_info = {
-	.gen = 2,
+	.gen = 1,
 	.features = RCAR_DU_FEATURE_INTERLACED
 		  | RCAR_DU_FEATURE_TVM_SYNC,
 	.channels_mask = BIT(1) | BIT(0),
@@ -129,7 +152,6 @@ static const struct rcar_du_device_info rcar_du_r8a7779_info = {
 static const struct rcar_du_device_info rcar_du_r8a7790_info = {
 	.gen = 2,
 	.features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
-		  | RCAR_DU_FEATURE_EXT_CTRL_REGS
 		  | RCAR_DU_FEATURE_INTERLACED
 		  | RCAR_DU_FEATURE_TVM_SYNC,
 	.quirks = RCAR_DU_QUIRK_ALIGN_128B,
@@ -159,7 +181,6 @@ static const struct rcar_du_device_info rcar_du_r8a7790_info = {
 static const struct rcar_du_device_info rcar_du_r8a7791_info = {
 	.gen = 2,
 	.features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
-		  | RCAR_DU_FEATURE_EXT_CTRL_REGS
 		  | RCAR_DU_FEATURE_INTERLACED
 		  | RCAR_DU_FEATURE_TVM_SYNC,
 	.channels_mask = BIT(1) | BIT(0),
@@ -183,7 +204,6 @@ static const struct rcar_du_device_info rcar_du_r8a7791_info = {
 static const struct rcar_du_device_info rcar_du_r8a7792_info = {
 	.gen = 2,
 	.features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
-		  | RCAR_DU_FEATURE_EXT_CTRL_REGS
 		  | RCAR_DU_FEATURE_INTERLACED
 		  | RCAR_DU_FEATURE_TVM_SYNC,
 	.channels_mask = BIT(1) | BIT(0),
@@ -203,7 +223,6 @@ static const struct rcar_du_device_info rcar_du_r8a7792_info = {
 static const struct rcar_du_device_info rcar_du_r8a7794_info = {
 	.gen = 2,
 	.features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
-		  | RCAR_DU_FEATURE_EXT_CTRL_REGS
 		  | RCAR_DU_FEATURE_INTERLACED
 		  | RCAR_DU_FEATURE_TVM_SYNC,
 	.channels_mask = BIT(1) | BIT(0),
@@ -226,7 +245,6 @@ static const struct rcar_du_device_info rcar_du_r8a7794_info = {
 static const struct rcar_du_device_info rcar_du_r8a7795_info = {
 	.gen = 3,
 	.features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
-		  | RCAR_DU_FEATURE_EXT_CTRL_REGS
 		  | RCAR_DU_FEATURE_VSP1_SOURCE
 		  | RCAR_DU_FEATURE_INTERLACED
 		  | RCAR_DU_FEATURE_TVM_SYNC,
@@ -260,7 +278,6 @@ static const struct rcar_du_device_info rcar_du_r8a7795_info = {
 static const struct rcar_du_device_info rcar_du_r8a7796_info = {
 	.gen = 3,
 	.features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
-		  | RCAR_DU_FEATURE_EXT_CTRL_REGS
 		  | RCAR_DU_FEATURE_VSP1_SOURCE
 		  | RCAR_DU_FEATURE_INTERLACED
 		  | RCAR_DU_FEATURE_TVM_SYNC,
@@ -290,7 +307,6 @@ static const struct rcar_du_device_info rcar_du_r8a7796_info = {
 static const struct rcar_du_device_info rcar_du_r8a77965_info = {
 	.gen = 3,
 	.features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
-		  | RCAR_DU_FEATURE_EXT_CTRL_REGS
 		  | RCAR_DU_FEATURE_VSP1_SOURCE
 		  | RCAR_DU_FEATURE_INTERLACED
 		  | RCAR_DU_FEATURE_TVM_SYNC,
@@ -320,7 +336,6 @@ static const struct rcar_du_device_info rcar_du_r8a77965_info = {
 static const struct rcar_du_device_info rcar_du_r8a77970_info = {
 	.gen = 3,
 	.features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
-		  | RCAR_DU_FEATURE_EXT_CTRL_REGS
 		  | RCAR_DU_FEATURE_VSP1_SOURCE
 		  | RCAR_DU_FEATURE_INTERLACED
 		  | RCAR_DU_FEATURE_TVM_SYNC,
@@ -342,7 +357,6 @@ static const struct rcar_du_device_info rcar_du_r8a77970_info = {
 static const struct rcar_du_device_info rcar_du_r8a7799x_info = {
 	.gen = 3,
 	.features = RCAR_DU_FEATURE_CRTC_IRQ_CLOCK
-		  | RCAR_DU_FEATURE_EXT_CTRL_REGS
 		  | RCAR_DU_FEATURE_VSP1_SOURCE,
 	.channels_mask = BIT(1) | BIT(0),
 	.routes = {
@@ -372,6 +386,7 @@ static const struct of_device_id rcar_du_of_table[] = {
 	{ .compatible = "renesas,du-r8a7744", .data = &rzg1_du_r8a7743_info },
 	{ .compatible = "renesas,du-r8a7745", .data = &rzg1_du_r8a7745_info },
 	{ .compatible = "renesas,du-r8a77470", .data = &rzg1_du_r8a77470_info },
+	{ .compatible = "renesas,du-r8a774c0", .data = &rcar_du_r8a774c0_info },
 	{ .compatible = "renesas,du-r8a7779", .data = &rcar_du_r8a7779_info },
 	{ .compatible = "renesas,du-r8a7790", .data = &rcar_du_r8a7790_info },
 	{ .compatible = "renesas,du-r8a7791", .data = &rcar_du_r8a7791_info },
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.h b/drivers/gpu/drm/rcar-du/rcar_du_drv.h
index a68da79b424e..1327cd0df90a 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_drv.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.h
@@ -20,13 +20,14 @@
 struct clk;
 struct device;
 struct drm_device;
+struct drm_property;
 struct rcar_du_device;
+struct rcar_du_encoder;
 
 #define RCAR_DU_FEATURE_CRTC_IRQ_CLOCK	BIT(0)	/* Per-CRTC IRQ and clock */
-#define RCAR_DU_FEATURE_EXT_CTRL_REGS	BIT(1)	/* Has extended control registers */
-#define RCAR_DU_FEATURE_VSP1_SOURCE	BIT(2)	/* Has inputs from VSP1 */
-#define RCAR_DU_FEATURE_INTERLACED	BIT(3)	/* HW supports interlaced */
-#define RCAR_DU_FEATURE_TVM_SYNC	BIT(4)	/* Has TV switch/sync modes */
+#define RCAR_DU_FEATURE_VSP1_SOURCE	BIT(1)	/* Has inputs from VSP1 */
+#define RCAR_DU_FEATURE_INTERLACED	BIT(2)	/* HW supports interlaced */
+#define RCAR_DU_FEATURE_TVM_SYNC	BIT(3)	/* Has TV switch/sync modes */
 
 #define RCAR_DU_QUIRK_ALIGN_128B	BIT(0)	/* Align pitches to 128 bytes */
 
@@ -81,6 +82,8 @@ struct rcar_du_device {
 	struct rcar_du_crtc crtcs[RCAR_DU_MAX_CRTCS];
 	unsigned int num_crtcs;
 
+	struct rcar_du_encoder *encoders[RCAR_DU_OUTPUT_MAX];
+
 	struct rcar_du_group groups[RCAR_DU_MAX_GROUPS];
 	struct rcar_du_vsp vsps[RCAR_DU_MAX_VSPS];
 
@@ -89,6 +92,7 @@ struct rcar_du_device {
 	} props;
 
 	unsigned int dpad0_source;
+	unsigned int dpad1_source;
 	unsigned int vspd1_sink;
 };
 
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_encoder.c b/drivers/gpu/drm/rcar-du/rcar_du_encoder.c
index 1877764bd6d9..8ee4e762f4e5 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_encoder.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_encoder.c
@@ -9,9 +9,8 @@
 
 #include <linux/export.h>
 
-#include <drm/drmP.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_modeset_helper_vtables.h>
 #include <drm/drm_panel.h>
 
 #include "rcar_du_drv.h"
@@ -22,17 +21,7 @@
  * Encoder
  */
 
-static void rcar_du_encoder_mode_set(struct drm_encoder *encoder,
-				     struct drm_crtc_state *crtc_state,
-				     struct drm_connector_state *conn_state)
-{
-	struct rcar_du_encoder *renc = to_rcar_encoder(encoder);
-
-	rcar_du_crtc_route_output(crtc_state->crtc, renc->output);
-}
-
 static const struct drm_encoder_helper_funcs encoder_helper_funcs = {
-	.atomic_mode_set = rcar_du_encoder_mode_set,
 };
 
 static const struct drm_encoder_funcs encoder_funcs = {
@@ -41,8 +30,7 @@ static const struct drm_encoder_funcs encoder_funcs = {
 
 int rcar_du_encoder_init(struct rcar_du_device *rcdu,
 			 enum rcar_du_output output,
-			 struct device_node *enc_node,
-			 struct device_node *con_node)
+			 struct device_node *enc_node)
 {
 	struct rcar_du_encoder *renc;
 	struct drm_encoder *encoder;
@@ -53,6 +41,7 @@ int rcar_du_encoder_init(struct rcar_du_device *rcdu,
 	if (renc == NULL)
 		return -ENOMEM;
 
+	rcdu->encoders[output] = renc;
 	renc->output = output;
 	encoder = rcar_encoder_to_drm_encoder(renc);
 
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_encoder.h b/drivers/gpu/drm/rcar-du/rcar_du_encoder.h
index ce3cbc85695e..df9be4524301 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_encoder.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_encoder.h
@@ -10,10 +10,8 @@
 #ifndef __RCAR_DU_ENCODER_H__
 #define __RCAR_DU_ENCODER_H__
 
-#include <drm/drm_crtc.h>
 #include <drm/drm_encoder.h>
 
-struct drm_panel;
 struct rcar_du_device;
 
 struct rcar_du_encoder {
@@ -28,7 +26,6 @@ struct rcar_du_encoder {
 
 int rcar_du_encoder_init(struct rcar_du_device *rcdu,
 			 enum rcar_du_output output,
-			 struct device_node *enc_node,
-			 struct device_node *con_node);
+			 struct device_node *enc_node);
 
 #endif /* __RCAR_DU_ENCODER_H__ */
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_group.c b/drivers/gpu/drm/rcar-du/rcar_du_group.c
index cebf313c6e1f..9eee47969e77 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_group.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_group.c
@@ -147,7 +147,7 @@ static void rcar_du_group_setup(struct rcar_du_group *rgrp)
 
 	rcar_du_group_setup_pins(rgrp);
 
-	if (rcar_du_has(rgrp->dev, RCAR_DU_FEATURE_EXT_CTRL_REGS)) {
+	if (rcdu->info->gen >= 2) {
 		rcar_du_group_setup_defr8(rgrp);
 		rcar_du_group_setup_didsr(rgrp);
 	}
@@ -262,7 +262,7 @@ int rcar_du_set_dpad0_vsp1_routing(struct rcar_du_device *rcdu)
 	unsigned int index;
 	int ret;
 
-	if (!rcar_du_has(rcdu, RCAR_DU_FEATURE_EXT_CTRL_REGS))
+	if (rcdu->info->gen < 2)
 		return 0;
 
 	/*
@@ -287,9 +287,50 @@ int rcar_du_set_dpad0_vsp1_routing(struct rcar_du_device *rcdu)
 	return 0;
 }
 
+static void rcar_du_group_set_dpad_levels(struct rcar_du_group *rgrp)
+{
+	static const u32 doflr_values[2] = {
+		DOFLR_HSYCFL0 | DOFLR_VSYCFL0 | DOFLR_ODDFL0 |
+		DOFLR_DISPFL0 | DOFLR_CDEFL0  | DOFLR_RGBFL0,
+		DOFLR_HSYCFL1 | DOFLR_VSYCFL1 | DOFLR_ODDFL1 |
+		DOFLR_DISPFL1 | DOFLR_CDEFL1  | DOFLR_RGBFL1,
+	};
+	static const u32 dpad_mask = BIT(RCAR_DU_OUTPUT_DPAD1)
+				   | BIT(RCAR_DU_OUTPUT_DPAD0);
+	struct rcar_du_device *rcdu = rgrp->dev;
+	u32 doflr = DOFLR_CODE;
+	unsigned int i;
+
+	if (rcdu->info->gen < 2)
+		return;
+
+	/*
+	 * The DPAD outputs can't be controlled directly. However, the parallel
+	 * output of the DU channels routed to DPAD can be set to fixed levels
+	 * through the DOFLR group register. Use this to turn the DPAD on or off
+	 * by driving fixed low-level signals at the output of any DU channel
+	 * not routed to a DPAD output. This doesn't affect the DU output
+	 * signals going to other outputs, such as the internal LVDS and HDMI
+	 * encoders.
+	 */
+
+	for (i = 0; i < rgrp->num_crtcs; ++i) {
+		struct rcar_du_crtc_state *rstate;
+		struct rcar_du_crtc *rcrtc;
+
+		rcrtc = &rcdu->crtcs[rgrp->index * 2 + i];
+		rstate = to_rcar_crtc_state(rcrtc->crtc.state);
+
+		if (!(rstate->outputs & dpad_mask))
+			doflr |= doflr_values[i];
+	}
+
+	rcar_du_group_write(rgrp, DOFLR, doflr);
+}
+
 int rcar_du_group_set_routing(struct rcar_du_group *rgrp)
 {
-	struct rcar_du_crtc *crtc0 = &rgrp->dev->crtcs[rgrp->index * 2];
+	struct rcar_du_device *rcdu = rgrp->dev;
 	u32 dorcr = rcar_du_group_read(rgrp, DORCR);
 
 	dorcr &= ~(DORCR_PG2T | DORCR_DK2S | DORCR_PG2D_MASK);
@@ -299,12 +340,14 @@ int rcar_du_group_set_routing(struct rcar_du_group *rgrp)
 	 * CRTC 1 in all other cases to avoid cloning CRTC 0 to DPAD0 and DPAD1
 	 * by default.
 	 */
-	if (crtc0->outputs & BIT(RCAR_DU_OUTPUT_DPAD1))
+	if (rcdu->dpad1_source == rgrp->index * 2)
 		dorcr |= DORCR_PG2D_DS1;
 	else
 		dorcr |= DORCR_PG2T | DORCR_DK2S | DORCR_PG2D_DS2;
 
 	rcar_du_group_write(rgrp, DORCR, dorcr);
 
+	rcar_du_group_set_dpad_levels(rgrp);
+
 	return rcar_du_set_dpad0_vsp1_routing(rgrp->dev);
 }
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_kms.c b/drivers/gpu/drm/rcar-du/rcar_du_kms.c
index 9c7007d45408..3b7d50a8fb9b 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_kms.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_kms.c
@@ -7,14 +7,15 @@
  * Contact: Laurent Pinchart (laurent.pinchart@ideasonboard.com)
  */
 
-#include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_device.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_probe_helper.h>
+#include <drm/drm_vblank.h>
 
 #include <linux/of_graph.h>
 #include <linux/wait.h>
@@ -278,6 +279,28 @@ static int rcar_du_atomic_check(struct drm_device *dev,
 static void rcar_du_atomic_commit_tail(struct drm_atomic_state *old_state)
 {
 	struct drm_device *dev = old_state->dev;
+	struct rcar_du_device *rcdu = dev->dev_private;
+	struct drm_crtc_state *crtc_state;
+	struct drm_crtc *crtc;
+	unsigned int i;
+
+	/*
+	 * Store RGB routing to DPAD0 and DPAD1; the hardware will be configured
+	 * when starting the CRTCs.
+	 */
+	rcdu->dpad1_source = -1;
+
+	for_each_new_crtc_in_state(old_state, crtc, crtc_state, i) {
+		struct rcar_du_crtc_state *rcrtc_state =
+			to_rcar_crtc_state(crtc_state);
+		struct rcar_du_crtc *rcrtc = to_rcar_crtc(crtc);
+
+		if (rcrtc_state->outputs & BIT(RCAR_DU_OUTPUT_DPAD0))
+			rcdu->dpad0_source = rcrtc->index;
+
+		if (rcrtc_state->outputs & BIT(RCAR_DU_OUTPUT_DPAD1))
+			rcdu->dpad1_source = rcrtc->index;
+	}
 
 	/* Apply the atomic update. */
 	drm_atomic_helper_commit_modeset_disables(dev, old_state);
@@ -309,17 +332,10 @@ static int rcar_du_encoders_init_one(struct rcar_du_device *rcdu,
 				     enum rcar_du_output output,
 				     struct of_endpoint *ep)
 {
-	struct device_node *connector = NULL;
-	struct device_node *encoder = NULL;
-	struct device_node *ep_node = NULL;
-	struct device_node *entity_ep_node;
 	struct device_node *entity;
 	int ret;
 
-	/*
-	 * Locate the connected entity and infer its type from the number of
-	 * endpoints.
-	 */
+	/* Locate the connected entity and initialize the encoder. */
 	entity = of_graph_get_remote_port_parent(ep->local_node);
 	if (!entity) {
 		dev_dbg(rcdu->dev, "unconnected endpoint %pOF, skipping\n",
@@ -331,52 +347,17 @@ static int rcar_du_encoders_init_one(struct rcar_du_device *rcdu,
 		dev_dbg(rcdu->dev,
 			"connected entity %pOF is disabled, skipping\n",
 			entity);
+		of_node_put(entity);
 		return -ENODEV;
 	}
 
-	entity_ep_node = of_graph_get_remote_endpoint(ep->local_node);
-
-	for_each_endpoint_of_node(entity, ep_node) {
-		if (ep_node == entity_ep_node)
-			continue;
-
-		/*
-		 * We've found one endpoint other than the input, this must
-		 * be an encoder. Locate the connector.
-		 */
-		encoder = entity;
-		connector = of_graph_get_remote_port_parent(ep_node);
-		of_node_put(ep_node);
-
-		if (!connector) {
-			dev_warn(rcdu->dev,
-				 "no connector for encoder %pOF, skipping\n",
-				 encoder);
-			of_node_put(entity_ep_node);
-			of_node_put(encoder);
-			return -ENODEV;
-		}
-
-		break;
-	}
-
-	of_node_put(entity_ep_node);
-
-	if (!encoder) {
-		dev_warn(rcdu->dev,
-			 "no encoder found for endpoint %pOF, skipping\n",
-			 ep->local_node);
-		return -ENODEV;
-	}
-
-	ret = rcar_du_encoder_init(rcdu, output, encoder, connector);
+	ret = rcar_du_encoder_init(rcdu, output, entity);
 	if (ret && ret != -EPROBE_DEFER)
 		dev_warn(rcdu->dev,
 			 "failed to initialize encoder %pOF on output %u (%d), skipping\n",
-			 encoder, output, ret);
+			 entity, output, ret);
 
-	of_node_put(encoder);
-	of_node_put(connector);
+	of_node_put(entity);
 
 	return ret;
 }
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7790.dts b/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7790.dts
index 579753e04f3b..8bee4e787a0a 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7790.dts
+++ b/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7790.dts
@@ -7,70 +7,63 @@
 
 /dts-v1/;
 /plugin/;
-/ {
-	fragment@0 {
-		target-path = "/";
-		__overlay__ {
-			#address-cells = <2>;
-			#size-cells = <2>;
 
-			lvds@feb90000 {
-				compatible = "renesas,r8a7790-lvds";
-				reg = <0 0xfeb90000 0 0x1c>;
+&{/} {
+	#address-cells = <2>;
+	#size-cells = <2>;
 
-				ports {
-					#address-cells = <1>;
-					#size-cells = <0>;
+	lvds@feb90000 {
+		compatible = "renesas,r8a7790-lvds";
+		reg = <0 0xfeb90000 0 0x1c>;
 
-					port@0 {
-						reg = <0>;
-						lvds0_input: endpoint {
-						};
-					};
-					port@1 {
-						reg = <1>;
-						lvds0_out: endpoint {
-						};
-					};
+		ports {
+			#address-cells = <1>;
+			#size-cells = <0>;
+
+			port@0 {
+				reg = <0>;
+				lvds0_input: endpoint {
 				};
 			};
-
-			lvds@feb94000 {
-				compatible = "renesas,r8a7790-lvds";
-				reg = <0 0xfeb94000 0 0x1c>;
-
-				ports {
-					#address-cells = <1>;
-					#size-cells = <0>;
-
-					port@0 {
-						reg = <0>;
-						lvds1_input: endpoint {
-						};
-					};
-					port@1 {
-						reg = <1>;
-						lvds1_out: endpoint {
-						};
-					};
+			port@1 {
+				reg = <1>;
+				lvds0_out: endpoint {
 				};
 			};
 		};
 	};
 
-	fragment@1 {
-		target-path = "/display@feb00000/ports";
-		__overlay__ {
-			port@1 {
-				endpoint {
-					remote-endpoint = <&lvds0_input>;
+	lvds@feb94000 {
+		compatible = "renesas,r8a7790-lvds";
+		reg = <0 0xfeb94000 0 0x1c>;
+
+		ports {
+			#address-cells = <1>;
+			#size-cells = <0>;
+
+			port@0 {
+				reg = <0>;
+				lvds1_input: endpoint {
 				};
 			};
-			port@2 {
-				endpoint {
-					remote-endpoint = <&lvds1_input>;
+			port@1 {
+				reg = <1>;
+				lvds1_out: endpoint {
 				};
 			};
 		};
 	};
 };
+
+&{/display@feb00000/ports} {
+	port@1 {
+		endpoint {
+			remote-endpoint = <&lvds0_input>;
+		};
+	};
+	port@2 {
+		endpoint {
+			remote-endpoint = <&lvds1_input>;
+		};
+	};
+};
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7791.dts b/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7791.dts
index cb9da1f3942b..92c0509971ec 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7791.dts
+++ b/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7791.dts
@@ -7,44 +7,37 @@
 
 /dts-v1/;
 /plugin/;
-/ {
-	fragment@0 {
-		target-path = "/";
-		__overlay__ {
-			#address-cells = <2>;
-			#size-cells = <2>;
 
-			lvds@feb90000 {
-				compatible = "renesas,r8a7791-lvds";
-				reg = <0 0xfeb90000 0 0x1c>;
+&{/} {
+	#address-cells = <2>;
+	#size-cells = <2>;
 
-				ports {
-					#address-cells = <1>;
-					#size-cells = <0>;
+	lvds@feb90000 {
+		compatible = "renesas,r8a7791-lvds";
+		reg = <0 0xfeb90000 0 0x1c>;
 
-					port@0 {
-						reg = <0>;
-						lvds0_input: endpoint {
-						};
-					};
-					port@1 {
-						reg = <1>;
-						lvds0_out: endpoint {
-						};
-					};
+		ports {
+			#address-cells = <1>;
+			#size-cells = <0>;
+
+			port@0 {
+				reg = <0>;
+				lvds0_input: endpoint {
 				};
 			};
-		};
-	};
-
-	fragment@1 {
-		target-path = "/display@feb00000/ports";
-		__overlay__ {
 			port@1 {
-				endpoint {
-					remote-endpoint = <&lvds0_input>;
+				reg = <1>;
+				lvds0_out: endpoint {
 				};
 			};
 		};
 	};
 };
+
+&{/display@feb00000/ports} {
+	port@1 {
+		endpoint {
+			remote-endpoint = <&lvds0_input>;
+		};
+	};
+};
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7793.dts b/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7793.dts
index e7b8804dc3c1..c8b93f21de0f 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7793.dts
+++ b/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7793.dts
@@ -7,44 +7,37 @@
 
 /dts-v1/;
 /plugin/;
-/ {
-	fragment@0 {
-		target-path = "/";
-		__overlay__ {
-			#address-cells = <2>;
-			#size-cells = <2>;
 
-			lvds@feb90000 {
-				compatible = "renesas,r8a7793-lvds";
-				reg = <0 0xfeb90000 0 0x1c>;
+&{/} {
+	#address-cells = <2>;
+	#size-cells = <2>;
 
-				ports {
-					#address-cells = <1>;
-					#size-cells = <0>;
+	lvds@feb90000 {
+		compatible = "renesas,r8a7793-lvds";
+		reg = <0 0xfeb90000 0 0x1c>;
 
-					port@0 {
-						reg = <0>;
-						lvds0_input: endpoint {
-						};
-					};
-					port@1 {
-						reg = <1>;
-						lvds0_out: endpoint {
-						};
-					};
+		ports {
+			#address-cells = <1>;
+			#size-cells = <0>;
+
+			port@0 {
+				reg = <0>;
+				lvds0_input: endpoint {
 				};
 			};
-		};
-	};
-
-	fragment@1 {
-		target-path = "/display@feb00000/ports";
-		__overlay__ {
 			port@1 {
-				endpoint {
-					remote-endpoint = <&lvds0_input>;
+				reg = <1>;
+				lvds0_out: endpoint {
 				};
 			};
 		};
 	};
 };
+
+&{/display@feb00000/ports} {
+	port@1 {
+		endpoint {
+			remote-endpoint = <&lvds0_input>;
+		};
+	};
+};
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7795.dts b/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7795.dts
index a1327443e6fa..16c2d03cb016 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7795.dts
+++ b/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7795.dts
@@ -7,44 +7,37 @@
 
 /dts-v1/;
 /plugin/;
-/ {
-	fragment@0 {
-		target-path = "/soc";
-		__overlay__ {
-			#address-cells = <2>;
-			#size-cells = <2>;
 
-			lvds@feb90000 {
-				compatible = "renesas,r8a7795-lvds";
-				reg = <0 0xfeb90000 0 0x14>;
+&{/soc} {
+	#address-cells = <2>;
+	#size-cells = <2>;
 
-				ports {
-					#address-cells = <1>;
-					#size-cells = <0>;
+	lvds@feb90000 {
+		compatible = "renesas,r8a7795-lvds";
+		reg = <0 0xfeb90000 0 0x14>;
 
-					port@0 {
-						reg = <0>;
-						lvds0_input: endpoint {
-						};
-					};
-					port@1 {
-						reg = <1>;
-						lvds0_out: endpoint {
-						};
-					};
+		ports {
+			#address-cells = <1>;
+			#size-cells = <0>;
+
+			port@0 {
+				reg = <0>;
+				lvds0_input: endpoint {
+				};
+			};
+			port@1 {
+				reg = <1>;
+				lvds0_out: endpoint {
 				};
 			};
 		};
 	};
+};
 
-	fragment@1 {
-		target-path = "/soc/display@feb00000/ports";
-		__overlay__ {
-			port@3 {
-				endpoint {
-					remote-endpoint = <&lvds0_input>;
-				};
-			};
+&{/soc/display@feb00000/ports} {
+	port@3 {
+		endpoint {
+			remote-endpoint = <&lvds0_input>;
 		};
 	};
 };
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7796.dts b/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7796.dts
index b23d6466c415..680e923ac036 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7796.dts
+++ b/drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7796.dts
@@ -7,44 +7,37 @@
 
 /dts-v1/;
 /plugin/;
-/ {
-	fragment@0 {
-		target-path = "/soc";
-		__overlay__ {
-			#address-cells = <2>;
-			#size-cells = <2>;
 
-			lvds@feb90000 {
-				compatible = "renesas,r8a7796-lvds";
-				reg = <0 0xfeb90000 0 0x14>;
+&{/soc} {
+	#address-cells = <2>;
+	#size-cells = <2>;
 
-				ports {
-					#address-cells = <1>;
-					#size-cells = <0>;
+	lvds@feb90000 {
+		compatible = "renesas,r8a7796-lvds";
+		reg = <0 0xfeb90000 0 0x14>;
 
-					port@0 {
-						reg = <0>;
-						lvds0_input: endpoint {
-						};
-					};
-					port@1 {
-						reg = <1>;
-						lvds0_out: endpoint {
-						};
-					};
+		ports {
+			#address-cells = <1>;
+			#size-cells = <0>;
+
+			port@0 {
+				reg = <0>;
+				lvds0_input: endpoint {
+				};
+			};
+			port@1 {
+				reg = <1>;
+				lvds0_out: endpoint {
 				};
 			};
 		};
 	};
+};
 
-	fragment@1 {
-		target-path = "/soc/display@feb00000/ports";
-		__overlay__ {
-			port@3 {
-				endpoint {
-					remote-endpoint = <&lvds0_input>;
-				};
-			};
+&{/soc/display@feb00000/ports} {
+	port@3 {
+		endpoint {
+			remote-endpoint = <&lvds0_input>;
 		};
 	};
 };
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_plane.c b/drivers/gpu/drm/rcar-du/rcar_du_plane.c
index 39d5ae3fdf72..c6430027169f 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_plane.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_plane.c
@@ -7,12 +7,12 @@
  * Contact: Laurent Pinchart (laurent.pinchart@ideasonboard.com)
  */
 
-#include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_device.h>
 #include <drm/drm_fb_cma_helper.h>
+#include <drm/drm_fourcc.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_plane_helper.h>
 
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_plane.h b/drivers/gpu/drm/rcar-du/rcar_du_plane.h
index 2f223a4c1d33..81bbf207ad0e 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_plane.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_plane.h
@@ -10,8 +10,7 @@
 #ifndef __RCAR_DU_PLANE_H__
 #define __RCAR_DU_PLANE_H__
 
-#include <drm/drmP.h>
-#include <drm/drm_crtc.h>
+#include <drm/drm_plane.h>
 
 struct rcar_du_format_info;
 struct rcar_du_group;
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_vsp.c b/drivers/gpu/drm/rcar-du/rcar_du_vsp.c
index 4576119e7777..0878accbd134 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_vsp.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_vsp.c
@@ -7,14 +7,13 @@
  * Contact: Laurent Pinchart (laurent.pinchart@ideasonboard.com)
  */
 
-#include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_vblank.h>
 
 #include <linux/bitops.h>
 #include <linux/dma-mapping.h>
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_vsp.h b/drivers/gpu/drm/rcar-du/rcar_du_vsp.h
index e8c14dc5cb93..db232037f24a 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_vsp.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_vsp.h
@@ -10,8 +10,7 @@
 #ifndef __RCAR_DU_VSP_H__
 #define __RCAR_DU_VSP_H__
 
-#include <drm/drmP.h>
-#include <drm/drm_crtc.h>
+#include <drm/drm_plane.h>
 
 struct rcar_du_format_info;
 struct rcar_du_vsp;
diff --git a/drivers/gpu/drm/rcar-du/rcar_dw_hdmi.c b/drivers/gpu/drm/rcar-du/rcar_dw_hdmi.c
index 75490a3e0a2a..452461dc96f2 100644
--- a/drivers/gpu/drm/rcar-du/rcar_dw_hdmi.c
+++ b/drivers/gpu/drm/rcar-du/rcar_dw_hdmi.c
@@ -7,10 +7,12 @@
  * Contact: Laurent Pinchart (laurent.pinchart@ideasonboard.com)
  */
 
+#include <linux/mod_devicetable.h>
 #include <linux/module.h>
 #include <linux/platform_device.h>
 
 #include <drm/bridge/dw_hdmi.h>
+#include <drm/drm_modes.h>
 
 #define RCAR_HDMI_PHY_OPMODE_PLLCFG	0x06	/* Mode of operation and PLL dividers */
 #define RCAR_HDMI_PHY_PLLCURRGMPCTRL	0x10	/* PLL current and Gmp (conductance) */
@@ -35,6 +37,20 @@ static const struct rcar_hdmi_phy_params rcar_hdmi_phy_params[] = {
 	{ ~0UL,      0x0000, 0x0000, 0x0000 },
 };
 
+static enum drm_mode_status
+rcar_hdmi_mode_valid(struct drm_connector *connector,
+		     const struct drm_display_mode *mode)
+{
+	/*
+	 * The maximum supported clock frequency is 297 MHz, as shown in the PHY
+	 * parameters table.
+	 */
+	if (mode->clock > 297000)
+		return MODE_CLOCK_HIGH;
+
+	return MODE_OK;
+}
+
 static int rcar_hdmi_phy_configure(struct dw_hdmi *hdmi,
 				   const struct dw_hdmi_plat_data *pdata,
 				   unsigned long mpixelclock)
@@ -59,6 +75,7 @@ static int rcar_hdmi_phy_configure(struct dw_hdmi *hdmi,
 }
 
 static const struct dw_hdmi_plat_data rcar_dw_hdmi_plat_data = {
+	.mode_valid = rcar_hdmi_mode_valid,
 	.configure_phy	= rcar_hdmi_phy_configure,
 };
 
diff --git a/drivers/gpu/drm/rcar-du/rcar_lvds.c b/drivers/gpu/drm/rcar-du/rcar_lvds.c
index 534a128a869d..7ef97b2a6eda 100644
--- a/drivers/gpu/drm/rcar-du/rcar_lvds.c
+++ b/drivers/gpu/drm/rcar-du/rcar_lvds.c
@@ -10,6 +10,7 @@
 #include <linux/clk.h>
 #include <linux/delay.h>
 #include <linux/io.h>
+#include <linux/module.h>
 #include <linux/of.h>
 #include <linux/of_device.h>
 #include <linux/of_graph.h>
@@ -19,9 +20,10 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_bridge.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 
+#include "rcar_lvds.h"
 #include "rcar_lvds_regs.h"
 
 struct rcar_lvds;
@@ -182,8 +184,9 @@ struct pll_info {
 
 static void rcar_lvds_d3_e3_pll_calc(struct rcar_lvds *lvds, struct clk *clk,
 				     unsigned long target, struct pll_info *pll,
-				     u32 clksel)
+				     u32 clksel, bool dot_clock_only)
 {
+	unsigned int div7 = dot_clock_only ? 1 : 7;
 	unsigned long output;
 	unsigned long fin;
 	unsigned int m_min;
@@ -217,9 +220,9 @@ static void rcar_lvds_d3_e3_pll_calc(struct rcar_lvds *lvds, struct clk *clk,
 	 *                     `------------> | |
 	 *                                    |/
 	 *
-	 * The /7 divider is optional when the LVDS PLL is used to generate a
-	 * dot clock for the DU RGB output, without using the LVDS encoder. We
-	 * don't support this configuration yet.
+	 * The /7 divider is optional: it is enabled when the LVDS PLL is used
+	 * to drive the LVDS encoder, and disabled when used to generate a dot
+	 * clock for the DU RGB output, without using the LVDS encoder.
 	 *
 	 * The PLL allowed input frequency range is 12 MHz to 192 MHz.
 	 */
@@ -279,7 +282,7 @@ static void rcar_lvds_d3_e3_pll_calc(struct rcar_lvds *lvds, struct clk *clk,
 				 * the PLL, followed by an optional fixed /7
 				 * divider.
 				 */
-				fout = fvco / (1 << e) / 7;
+				fout = fvco / (1 << e) / div7;
 				div = DIV_ROUND_CLOSEST(fout, target);
 				diff = abs(fout / div - target);
 
@@ -300,7 +303,7 @@ static void rcar_lvds_d3_e3_pll_calc(struct rcar_lvds *lvds, struct clk *clk,
 
 done:
 	output = fin * pll->pll_n / pll->pll_m / (1 << pll->pll_e)
-	       / 7 / pll->div;
+	       / div7 / pll->div;
 	error = (long)(output - target) * 10000 / (long)target;
 
 	dev_dbg(lvds->dev,
@@ -310,17 +313,18 @@ done:
 		pll->pll_m, pll->pll_n, pll->pll_e, pll->div);
 }
 
-static void rcar_lvds_pll_setup_d3_e3(struct rcar_lvds *lvds, unsigned int freq)
+static void __rcar_lvds_pll_setup_d3_e3(struct rcar_lvds *lvds,
+					unsigned int freq, bool dot_clock_only)
 {
 	struct pll_info pll = { .diff = (unsigned long)-1 };
 	u32 lvdpllcr;
 
 	rcar_lvds_d3_e3_pll_calc(lvds, lvds->clocks.dotclkin[0], freq, &pll,
-				 LVDPLLCR_CKSEL_DU_DOTCLKIN(0));
+				 LVDPLLCR_CKSEL_DU_DOTCLKIN(0), dot_clock_only);
 	rcar_lvds_d3_e3_pll_calc(lvds, lvds->clocks.dotclkin[1], freq, &pll,
-				 LVDPLLCR_CKSEL_DU_DOTCLKIN(1));
+				 LVDPLLCR_CKSEL_DU_DOTCLKIN(1), dot_clock_only);
 	rcar_lvds_d3_e3_pll_calc(lvds, lvds->clocks.extal, freq, &pll,
-				 LVDPLLCR_CKSEL_EXTAL);
+				 LVDPLLCR_CKSEL_EXTAL, dot_clock_only);
 
 	lvdpllcr = LVDPLLCR_PLLON | pll.clksel | LVDPLLCR_CLKOUT
 		 | LVDPLLCR_PLLN(pll.pll_n - 1) | LVDPLLCR_PLLM(pll.pll_m - 1);
@@ -329,6 +333,9 @@ static void rcar_lvds_pll_setup_d3_e3(struct rcar_lvds *lvds, unsigned int freq)
 		lvdpllcr |= LVDPLLCR_STP_CLKOUTE | LVDPLLCR_OUTCLKSEL
 			 |  LVDPLLCR_PLLE(pll.pll_e - 1);
 
+	if (dot_clock_only)
+		lvdpllcr |= LVDPLLCR_OCKSEL;
+
 	rcar_lvds_write(lvds, LVDPLLCR, lvdpllcr);
 
 	if (pll.div > 1)
@@ -342,6 +349,57 @@ static void rcar_lvds_pll_setup_d3_e3(struct rcar_lvds *lvds, unsigned int freq)
 		rcar_lvds_write(lvds, LVDDIV, 0);
 }
 
+static void rcar_lvds_pll_setup_d3_e3(struct rcar_lvds *lvds, unsigned int freq)
+{
+	__rcar_lvds_pll_setup_d3_e3(lvds, freq, false);
+}
+
+/* -----------------------------------------------------------------------------
+ * Clock - D3/E3 only
+ */
+
+int rcar_lvds_clk_enable(struct drm_bridge *bridge, unsigned long freq)
+{
+	struct rcar_lvds *lvds = bridge_to_rcar_lvds(bridge);
+	int ret;
+
+	if (WARN_ON(!(lvds->info->quirks & RCAR_LVDS_QUIRK_EXT_PLL)))
+		return -ENODEV;
+
+	dev_dbg(lvds->dev, "enabling LVDS PLL, freq=%luHz\n", freq);
+
+	WARN_ON(lvds->enabled);
+
+	ret = clk_prepare_enable(lvds->clocks.mod);
+	if (ret < 0)
+		return ret;
+
+	__rcar_lvds_pll_setup_d3_e3(lvds, freq, true);
+
+	lvds->enabled = true;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(rcar_lvds_clk_enable);
+
+void rcar_lvds_clk_disable(struct drm_bridge *bridge)
+{
+	struct rcar_lvds *lvds = bridge_to_rcar_lvds(bridge);
+
+	if (WARN_ON(!(lvds->info->quirks & RCAR_LVDS_QUIRK_EXT_PLL)))
+		return;
+
+	dev_dbg(lvds->dev, "disabling LVDS PLL\n");
+
+	WARN_ON(!lvds->enabled);
+
+	rcar_lvds_write(lvds, LVDPLLCR, 0);
+
+	clk_disable_unprepare(lvds->clocks.mod);
+
+	lvds->enabled = false;
+}
+EXPORT_SYMBOL_GPL(rcar_lvds_clk_disable);
+
 /* -----------------------------------------------------------------------------
  * Bridge
  */
@@ -520,8 +578,8 @@ static void rcar_lvds_get_lvds_mode(struct rcar_lvds *lvds)
 }
 
 static void rcar_lvds_mode_set(struct drm_bridge *bridge,
-			       struct drm_display_mode *mode,
-			       struct drm_display_mode *adjusted_mode)
+			       const struct drm_display_mode *mode,
+			       const struct drm_display_mode *adjusted_mode)
 {
 	struct rcar_lvds *lvds = bridge_to_rcar_lvds(bridge);
 
@@ -544,7 +602,10 @@ static int rcar_lvds_attach(struct drm_bridge *bridge)
 		return drm_bridge_attach(bridge->encoder, lvds->next_bridge,
 					 bridge);
 
-	/* Otherwise we have a panel, create a connector. */
+	/* Otherwise, if we have a panel, create a connector. */
+	if (!lvds->panel)
+		return 0;
+
 	ret = drm_connector_init(bridge->dev, connector, &rcar_lvds_conn_funcs,
 				 DRM_MODE_CONNECTOR_LVDS);
 	if (ret < 0)
@@ -592,7 +653,8 @@ static int rcar_lvds_parse_dt(struct rcar_lvds *lvds)
 	local_output = of_graph_get_endpoint_by_regs(lvds->dev->of_node, 1, 0);
 	if (!local_output) {
 		dev_dbg(lvds->dev, "unconnected port@1\n");
-		return -ENODEV;
+		ret = -ENODEV;
+		goto done;
 	}
 
 	/*
@@ -642,6 +704,15 @@ done:
 	of_node_put(remote_input);
 	of_node_put(remote);
 
+	/*
+	 * On D3/E3 the LVDS encoder provides a clock to the DU, which can be
+	 * used for the DPAD output even when the LVDS output is not connected.
+	 * Don't fail probe in that case as the DU will need the bridge to
+	 * control the clock.
+	 */
+	if (lvds->info->quirks & RCAR_LVDS_QUIRK_EXT_PLL)
+		return ret == -ENODEV ? 0 : ret;
+
 	return ret;
 }
 
@@ -785,6 +856,8 @@ static const struct rcar_lvds_device_info rcar_lvds_r8a77995_info = {
 
 static const struct of_device_id rcar_lvds_of_table[] = {
 	{ .compatible = "renesas,r8a7743-lvds", .data = &rcar_lvds_gen2_info },
+	{ .compatible = "renesas,r8a7744-lvds", .data = &rcar_lvds_gen2_info },
+	{ .compatible = "renesas,r8a774c0-lvds", .data = &rcar_lvds_r8a77990_info },
 	{ .compatible = "renesas,r8a7790-lvds", .data = &rcar_lvds_r8a7790_info },
 	{ .compatible = "renesas,r8a7791-lvds", .data = &rcar_lvds_gen2_info },
 	{ .compatible = "renesas,r8a7793-lvds", .data = &rcar_lvds_gen2_info },
diff --git a/drivers/gpu/drm/rcar-du/rcar_lvds.h b/drivers/gpu/drm/rcar-du/rcar_lvds.h
new file mode 100644
index 000000000000..a709cae1bc32
--- /dev/null
+++ b/drivers/gpu/drm/rcar-du/rcar_lvds.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * rcar_lvds.h  --  R-Car LVDS Encoder
+ *
+ * Copyright (C) 2013-2018 Renesas Electronics Corporation
+ *
+ * Contact: Laurent Pinchart (laurent.pinchart@ideasonboard.com)
+ */
+
+#ifndef __RCAR_LVDS_H__
+#define __RCAR_LVDS_H__
+
+struct drm_bridge;
+
+#if IS_ENABLED(CONFIG_DRM_RCAR_LVDS)
+int rcar_lvds_clk_enable(struct drm_bridge *bridge, unsigned long freq);
+void rcar_lvds_clk_disable(struct drm_bridge *bridge);
+#else
+static inline int rcar_lvds_clk_enable(struct drm_bridge *bridge,
+				       unsigned long freq)
+{
+	return -ENOSYS;
+}
+static inline void rcar_lvds_clk_disable(struct drm_bridge *bridge) { }
+#endif /* CONFIG_DRM_RCAR_LVDS */
+
+#endif /* __RCAR_LVDS_H__ */
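
The new header gives the DU driver a small clock-control contract: on D3/E3
the LVDS PLL can feed the DPAD output, so the DU must be able to switch the
PLL on and off even without an LVDS panel. A minimal sketch of a caller,
assuming "lvds_bridge" was resolved from the LVDS encoder's DT node at bind
time (the helper and its error handling are illustrative, not part of this
patch):

	#include <linux/errno.h>

	#include "rcar_lvds.h"

	/* Hypothetical DU-side helper driving the external LVDS PLL. */
	static int du_set_dpad_dotclock(struct drm_bridge *lvds_bridge,
					unsigned long freq)
	{
		int ret;

		/* Returns -ENOSYS when CONFIG_DRM_RCAR_LVDS is disabled. */
		ret = rcar_lvds_clk_enable(lvds_bridge, freq);
		if (ret < 0)
			return ret;

		/* ... program the DU dot clock, and on teardown: */
		rcar_lvds_clk_disable(lvds_bridge);
		return 0;
	}
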
diff --git a/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c b/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c
index 080f05352195..bc4423624209 100644
--- a/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c
+++ b/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c
@@ -21,10 +21,10 @@
 #include <linux/clk.h>
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_dp_helper.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 
 #include <video/of_videomode.h>
 #include <video/videomode.h>
diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.c b/drivers/gpu/drm/rockchip/cdn-dp-core.c
index 8ad0d773dc33..f7b9d45aa1d6 100644
--- a/drivers/gpu/drm/rockchip/cdn-dp-core.c
+++ b/drivers/gpu/drm/rockchip/cdn-dp-core.c
@@ -14,10 +14,10 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_dp_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_of.h>
+#include <drm/drm_probe_helper.h>
 
 #include <linux/clk.h>
 #include <linux/component.h>
diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.h b/drivers/gpu/drm/rockchip/cdn-dp-core.h
index f57e296401b8..48fef95cb3c6 100644
--- a/drivers/gpu/drm/rockchip/cdn-dp-core.h
+++ b/drivers/gpu/drm/rockchip/cdn-dp-core.h
@@ -16,9 +16,9 @@
 #define _CDN_DP_CORE_H
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_dp_helper.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 #include "rockchip_drm_drv.h"
 
 #define MAX_PHY		2
diff --git a/drivers/gpu/drm/rockchip/cdn-dp-reg.c b/drivers/gpu/drm/rockchip/cdn-dp-reg.c
index 5a485489a1e2..6c8b14fb1d2f 100644
--- a/drivers/gpu/drm/rockchip/cdn-dp-reg.c
+++ b/drivers/gpu/drm/rockchip/cdn-dp-reg.c
@@ -113,7 +113,7 @@ static int cdp_dp_mailbox_write(struct cdn_dp_device *dp, u8 val)
 
 static int cdn_dp_mailbox_validate_receive(struct cdn_dp_device *dp,
 					   u8 module_id, u8 opcode,
-					   u8 req_size)
+					   u16 req_size)
 {
 	u32 mbox_size, i;
 	u8 header[4];
diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c
index 7ee359bcee62..ef8486e5e2cd 100644
--- a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c
+++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c
@@ -467,7 +467,7 @@ static int dw_mipi_dsi_phy_init(void *priv_data)
 }
 
 static int
-dw_mipi_dsi_get_lane_mbps(void *priv_data, struct drm_display_mode *mode,
+dw_mipi_dsi_get_lane_mbps(void *priv_data, const struct drm_display_mode *mode,
 			  unsigned long mode_flags, u32 lanes, u32 format,
 			  unsigned int *lane_mbps)
 {
diff --git a/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c b/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c
index 89c63cfde5c8..4cdc9f86c2e5 100644
--- a/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c
+++ b/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c
@@ -16,8 +16,8 @@
 
 #include <drm/drm_of.h>
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/bridge/dw_hdmi.h>
 
 #include "rockchip_drm_drv.h"
diff --git a/drivers/gpu/drm/rockchip/inno_hdmi.c b/drivers/gpu/drm/rockchip/inno_hdmi.c
index 1c02b3e61299..ce1545862b6c 100644
--- a/drivers/gpu/drm/rockchip/inno_hdmi.c
+++ b/drivers/gpu/drm/rockchip/inno_hdmi.c
@@ -26,8 +26,8 @@
 #include <drm/drm_of.h>
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
+#include <drm/drm_probe_helper.h>
 
 #include "rockchip_drm_drv.h"
 #include "rockchip_drm_vop.h"
@@ -295,7 +295,9 @@ static int inno_hdmi_config_video_avi(struct inno_hdmi *hdmi,
 	union hdmi_infoframe frame;
 	int rc;
 
-	rc = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi, mode, false);
+	rc = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi,
+						      &hdmi->connector,
+						      mode);
 
 	if (hdmi->hdmi_data.enc_out_format == HDMI_COLORSPACE_YUV444)
 		frame.avi.colorspace = HDMI_COLORSPACE_YUV444;
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
index be6c2573039a..d7fa17f12769 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
@@ -15,10 +15,10 @@
  */
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_of.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/dma-mapping.h>
 #include <linux/dma-iommu.h>
 #include <linux/pm_runtime.h>
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
index ea18cb2a76c0..97438bbbe389 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
@@ -17,8 +17,8 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_fb_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "rockchip_drm_drv.h"
 #include "rockchip_drm_fb.h"
@@ -128,42 +128,6 @@ err_gem_object_unreference:
 }
 
 static void
-rockchip_drm_psr_inhibit_get_state(struct drm_atomic_state *state)
-{
-	struct drm_crtc *crtc;
-	struct drm_crtc_state *crtc_state;
-	struct drm_encoder *encoder;
-	u32 encoder_mask = 0;
-	int i;
-
-	for_each_old_crtc_in_state(state, crtc, crtc_state, i) {
-		encoder_mask |= crtc_state->encoder_mask;
-		encoder_mask |= crtc->state->encoder_mask;
-	}
-
-	drm_for_each_encoder_mask(encoder, state->dev, encoder_mask)
-		rockchip_drm_psr_inhibit_get(encoder);
-}
-
-static void
-rockchip_drm_psr_inhibit_put_state(struct drm_atomic_state *state)
-{
-	struct drm_crtc *crtc;
-	struct drm_crtc_state *crtc_state;
-	struct drm_encoder *encoder;
-	u32 encoder_mask = 0;
-	int i;
-
-	for_each_old_crtc_in_state(state, crtc, crtc_state, i) {
-		encoder_mask |= crtc_state->encoder_mask;
-		encoder_mask |= crtc->state->encoder_mask;
-	}
-
-	drm_for_each_encoder_mask(encoder, state->dev, encoder_mask)
-		rockchip_drm_psr_inhibit_put(encoder);
-}
-
-static void
 rockchip_atomic_helper_commit_tail_rpm(struct drm_atomic_state *old_state)
 {
 	struct drm_device *dev = old_state->dev;
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_fbdev.c b/drivers/gpu/drm/rockchip/rockchip_drm_fbdev.c
index e6650553f5d6..8ce68bd508be 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_fbdev.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_fbdev.c
@@ -15,7 +15,7 @@
 #include <drm/drm.h>
 #include <drm/drmP.h>
 #include <drm/drm_fb_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "rockchip_drm_drv.h"
 #include "rockchip_drm_gem.h"
@@ -91,7 +91,6 @@ static int rockchip_drm_fbdev_create(struct drm_fb_helper *helper,
 	}
 
 	fbi->par = helper;
-	fbi->flags = FBINFO_FLAG_DEFAULT;
 	fbi->fbops = &rockchip_drm_fbdev_ops;
 
 	fb = helper->fb;
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_psr.c b/drivers/gpu/drm/rockchip/rockchip_drm_psr.c
index 01ff3c858875..a0c8bd235b67 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_psr.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_psr.c
@@ -13,7 +13,8 @@
  */
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_atomic.h>
+#include <drm/drm_probe_helper.h>
 
 #include "rockchip_drm_drv.h"
 #include "rockchip_drm_psr.h"
@@ -109,6 +110,42 @@ int rockchip_drm_psr_inhibit_put(struct drm_encoder *encoder)
 }
 EXPORT_SYMBOL(rockchip_drm_psr_inhibit_put);
 
+void rockchip_drm_psr_inhibit_get_state(struct drm_atomic_state *state)
+{
+	struct drm_crtc *crtc;
+	struct drm_crtc_state *crtc_state;
+	struct drm_encoder *encoder;
+	u32 encoder_mask = 0;
+	int i;
+
+	for_each_old_crtc_in_state(state, crtc, crtc_state, i) {
+		encoder_mask |= crtc_state->encoder_mask;
+		encoder_mask |= crtc->state->encoder_mask;
+	}
+
+	drm_for_each_encoder_mask(encoder, state->dev, encoder_mask)
+		rockchip_drm_psr_inhibit_get(encoder);
+}
+EXPORT_SYMBOL(rockchip_drm_psr_inhibit_get_state);
+
+void rockchip_drm_psr_inhibit_put_state(struct drm_atomic_state *state)
+{
+	struct drm_crtc *crtc;
+	struct drm_crtc_state *crtc_state;
+	struct drm_encoder *encoder;
+	u32 encoder_mask = 0;
+	int i;
+
+	for_each_old_crtc_in_state(state, crtc, crtc_state, i) {
+		encoder_mask |= crtc_state->encoder_mask;
+		encoder_mask |= crtc->state->encoder_mask;
+	}
+
+	drm_for_each_encoder_mask(encoder, state->dev, encoder_mask)
+		rockchip_drm_psr_inhibit_put(encoder);
+}
+EXPORT_SYMBOL(rockchip_drm_psr_inhibit_put_state);
+
 /**
  * rockchip_drm_psr_inhibit_get - acquire PSR inhibit on given encoder
  * @encoder: encoder to obtain the PSR encoder
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_psr.h b/drivers/gpu/drm/rockchip/rockchip_drm_psr.h
index 860c62494496..25350ba3237b 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_psr.h
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_psr.h
@@ -20,6 +20,9 @@ void rockchip_drm_psr_flush_all(struct drm_device *dev);
 int rockchip_drm_psr_inhibit_put(struct drm_encoder *encoder);
 int rockchip_drm_psr_inhibit_get(struct drm_encoder *encoder);
 
+void rockchip_drm_psr_inhibit_get_state(struct drm_atomic_state *state);
+void rockchip_drm_psr_inhibit_put_state(struct drm_atomic_state *state);
+
 int rockchip_drm_psr_register(struct drm_encoder *encoder,
 			int (*psr_set)(struct drm_encoder *, bool enable));
 void rockchip_drm_psr_unregister(struct drm_encoder *encoder);
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
index fb70fb486fbf..c7d4c6073ea5 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
@@ -15,10 +15,12 @@
 #include <drm/drm.h>
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
+#include <drm/drm_atomic_uapi.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_flip_work.h>
+#include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #ifdef CONFIG_DRM_ANALOGIX_DP
 #include <drm/bridge/analogix_dp.h>
 #endif
@@ -44,14 +46,26 @@
 #include "rockchip_drm_vop.h"
 #include "rockchip_rgb.h"
 
-#define VOP_WIN_SET(x, win, name, v) \
+#define VOP_WIN_SET(vop, win, name, v) \
 		vop_reg_set(vop, &win->phy->name, win->base, ~0, v, #name)
-#define VOP_SCL_SET(x, win, name, v) \
+#define VOP_SCL_SET(vop, win, name, v) \
 		vop_reg_set(vop, &win->phy->scl->name, win->base, ~0, v, #name)
-#define VOP_SCL_SET_EXT(x, win, name, v) \
+#define VOP_SCL_SET_EXT(vop, win, name, v) \
 		vop_reg_set(vop, &win->phy->scl->ext->name, \
 			    win->base, ~0, v, #name)
 
+#define VOP_WIN_YUV2YUV_SET(vop, win_yuv2yuv, name, v) \
+	do { \
+		if (win_yuv2yuv && win_yuv2yuv->name.mask) \
+			vop_reg_set(vop, &win_yuv2yuv->name, 0, ~0, v, #name); \
+	} while (0)
+
+#define VOP_WIN_YUV2YUV_COEFFICIENT_SET(vop, win_yuv2yuv, name, v) \
+	do { \
+		if (win_yuv2yuv && win_yuv2yuv->phy->name.mask) \
+			vop_reg_set(vop, &win_yuv2yuv->phy->name, win_yuv2yuv->base, ~0, v, #name); \
+	} while (0)
+
 #define VOP_INTR_SET_MASK(vop, name, mask, v) \
 		vop_reg_set(vop, &vop->data->intr->name, 0, mask, v, #name)
 
@@ -72,8 +86,11 @@
 #define VOP_INTR_GET_TYPE(vop, name, type) \
 		vop_get_intr_type(vop, &vop->data->intr->name, type)
 
-#define VOP_WIN_GET(x, win, name) \
-		vop_read_reg(x, win->offset, win->phy->name)
+#define VOP_WIN_GET(vop, win, name) \
+		vop_read_reg(vop, win->offset, win->phy->name)
+
+#define VOP_WIN_HAS_REG(win, name) \
+	(!!(win->phy->name.mask))
 
 #define VOP_WIN_GET_YRGBADDR(vop, win) \
 		vop_readl(vop, win->base + win->phy->yrgb_mst.offset)
@@ -84,6 +101,18 @@
 #define to_vop(x) container_of(x, struct vop, crtc)
 #define to_vop_win(x) container_of(x, struct vop_win, base)
 
+/*
+ * The coefficients of the following matrix are all fixed-point values.
+ * The format is S2.10 for the 3x3 part of the matrix, and S9.12 for the offsets.
+ * They are all represented in two's complement.
+ */
+static const uint32_t bt601_yuv2rgb[] = {
+	0x4A8, 0x0,    0x662,
+	0x4A8, 0x1E6F, 0x1CBF,
+	0x4A8, 0x812,  0x0,
+	0x321168, 0x0877CF, 0x2EB127
+};
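+
+/*
+ * Worked example (illustration, not part of the original comment): in
+ * S2.10 two's complement, 0x4A8 is 1192 / 2^10 ~= 1.164, the BT.601 luma
+ * scale, and 0x1E6F is (0x1E6F - 0x2000) / 2^10 ~= -0.392; in S9.12 the
+ * offset 0x321168 decodes to (0x321168 - 0x400000) / 2^12 ~= -222.9.
+ */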
+
 enum vop_pending {
 	VOP_PENDING_FB_UNREF,
 };
@@ -91,6 +120,7 @@ enum vop_pending {
 struct vop_win {
 	struct drm_plane base;
 	const struct vop_win_data *data;
+	const struct vop_win_yuv2yuv_data *yuv2yuv_data;
 	struct vop *vop;
 };
 
@@ -685,6 +715,11 @@ static int vop_plane_atomic_check(struct drm_plane *plane,
 		return -EINVAL;
 	}
 
+	if (fb->format->is_yuv && state->rotation & DRM_MODE_REFLECT_Y) {
+		DRM_ERROR("Invalid Source: Yuv format does not support this rotation\n");
+		return -EINVAL;
+	}
+
 	return 0;
 }
 
@@ -712,6 +747,7 @@ static void vop_plane_atomic_update(struct drm_plane *plane,
 	struct drm_crtc *crtc = state->crtc;
 	struct vop_win *vop_win = to_vop_win(plane);
 	const struct vop_win_data *win = vop_win->data;
+	const struct vop_win_yuv2yuv_data *win_yuv2yuv = vop_win->yuv2yuv_data;
 	struct vop *vop = to_vop(state->crtc);
 	struct drm_framebuffer *fb = state->fb;
 	unsigned int actual_w, actual_h;
@@ -727,6 +763,8 @@ static void vop_plane_atomic_update(struct drm_plane *plane,
 	bool rb_swap;
 	int win_index = VOP_WIN_TO_INDEX(vop_win);
 	int format;
+	int is_yuv = fb->format->is_yuv;
+	int i;
 
 	/*
 	 * can't update plane when vop is disabled.
@@ -760,6 +798,13 @@ static void vop_plane_atomic_update(struct drm_plane *plane,
 	offset += (src->y1 >> 16) * fb->pitches[0];
 	dma_addr = rk_obj->dma_addr + offset + fb->offsets[0];
 
+	/*
+	 * For y-mirroring we need to move the address
+	 * to the beginning of the last line.
+	 */
+	if (state->rotation & DRM_MODE_REFLECT_Y)
+		dma_addr += (actual_h - 1) * fb->pitches[0];
+
 	format = vop_convert_format(fb->format->format);
 
 	spin_lock(&vop->reg_lock);
@@ -767,7 +812,13 @@ static void vop_plane_atomic_update(struct drm_plane *plane,
 	VOP_WIN_SET(vop, win, format, format);
 	VOP_WIN_SET(vop, win, yrgb_vir, DIV_ROUND_UP(fb->pitches[0], 4));
 	VOP_WIN_SET(vop, win, yrgb_mst, dma_addr);
-	if (fb->format->is_yuv) {
+	VOP_WIN_YUV2YUV_SET(vop, win_yuv2yuv, y2r_en, is_yuv);
+	VOP_WIN_SET(vop, win, y_mir_en,
+		    (state->rotation & DRM_MODE_REFLECT_Y) ? 1 : 0);
+	VOP_WIN_SET(vop, win, x_mir_en,
+		    (state->rotation & DRM_MODE_REFLECT_X) ? 1 : 0);
+
+	if (is_yuv) {
 		int hsub = drm_format_horz_chroma_subsampling(fb->format->format);
 		int vsub = drm_format_vert_chroma_subsampling(fb->format->format);
 		int bpp = fb->format->cpp[1];
@@ -781,6 +832,13 @@ static void vop_plane_atomic_update(struct drm_plane *plane,
 		dma_addr = rk_uv_obj->dma_addr + offset + fb->offsets[1];
 		VOP_WIN_SET(vop, win, uv_vir, DIV_ROUND_UP(fb->pitches[1], 4));
 		VOP_WIN_SET(vop, win, uv_mst, dma_addr);
+
+		for (i = 0; i < NUM_YUV2YUV_COEFFICIENTS; i++) {
+			VOP_WIN_YUV2YUV_COEFFICIENT_SET(vop,
+							win_yuv2yuv,
+							y2r_coefficients[i],
+							bt601_yuv2rgb[i]);
+		}
 	}
 
 	if (win->phy->scl)
@@ -819,10 +877,84 @@ static void vop_plane_atomic_update(struct drm_plane *plane,
 	spin_unlock(&vop->reg_lock);
 }
 
+static int vop_plane_atomic_async_check(struct drm_plane *plane,
+					struct drm_plane_state *state)
+{
+	struct vop_win *vop_win = to_vop_win(plane);
+	const struct vop_win_data *win = vop_win->data;
+	int min_scale = win->phy->scl ? FRAC_16_16(1, 8) :
+					DRM_PLANE_HELPER_NO_SCALING;
+	int max_scale = win->phy->scl ? FRAC_16_16(8, 1) :
+					DRM_PLANE_HELPER_NO_SCALING;
+	struct drm_crtc_state *crtc_state;
+
+	if (plane != state->crtc->cursor)
+		return -EINVAL;
+
+	if (!plane->state)
+		return -EINVAL;
+
+	if (!plane->state->fb)
+		return -EINVAL;
+
+	if (state->state)
+		crtc_state = drm_atomic_get_existing_crtc_state(state->state,
+								state->crtc);
+	else /* Special case for asynchronous cursor updates. */
+		crtc_state = plane->crtc->state;
+
+	return drm_atomic_helper_check_plane_state(plane->state, crtc_state,
+						   min_scale, max_scale,
+						   true, true);
+}
+
+static void vop_plane_atomic_async_update(struct drm_plane *plane,
+					  struct drm_plane_state *new_state)
+{
+	struct vop *vop = to_vop(plane->state->crtc);
+	struct drm_plane_state *plane_state;
+
+	plane_state = plane->funcs->atomic_duplicate_state(plane);
+	plane_state->crtc_x = new_state->crtc_x;
+	plane_state->crtc_y = new_state->crtc_y;
+	plane_state->crtc_h = new_state->crtc_h;
+	plane_state->crtc_w = new_state->crtc_w;
+	plane_state->src_x = new_state->src_x;
+	plane_state->src_y = new_state->src_y;
+	plane_state->src_h = new_state->src_h;
+	plane_state->src_w = new_state->src_w;
+
+	if (plane_state->fb != new_state->fb)
+		drm_atomic_set_fb_for_plane(plane_state, new_state->fb);
+
+	swap(plane_state, plane->state);
+
+	if (plane->state->fb && plane->state->fb != new_state->fb) {
+		drm_framebuffer_get(plane->state->fb);
+		WARN_ON(drm_crtc_vblank_get(plane->state->crtc) != 0);
+		drm_flip_work_queue(&vop->fb_unref_work, plane->state->fb);
+		set_bit(VOP_PENDING_FB_UNREF, &vop->pending);
+	}
+
+	if (vop->is_enabled) {
+		rockchip_drm_psr_inhibit_get_state(new_state->state);
+		vop_plane_atomic_update(plane, plane->state);
+		spin_lock(&vop->reg_lock);
+		vop_cfg_done(vop);
+		spin_unlock(&vop->reg_lock);
+		rockchip_drm_psr_inhibit_put_state(new_state->state);
+	}
+
+	plane->funcs->atomic_destroy_state(plane, plane_state);
+}
+
 static const struct drm_plane_helper_funcs plane_helper_funcs = {
 	.atomic_check = vop_plane_atomic_check,
 	.atomic_update = vop_plane_atomic_update,
 	.atomic_disable = vop_plane_atomic_disable,
+	.atomic_async_check = vop_plane_atomic_async_check,
+	.atomic_async_update = vop_plane_atomic_async_update,
+	.prepare_fb = drm_gem_fb_prepare_fb,
 };
 
 static const struct drm_plane_funcs vop_plane_funcs = {
@@ -1272,6 +1404,18 @@ out:
 	return ret;
 }
 
+static void vop_plane_add_properties(struct drm_plane *plane,
+				     const struct vop_win_data *win_data)
+{
+	unsigned int flags = 0;
+
+	flags |= VOP_WIN_HAS_REG(win_data, x_mir_en) ? DRM_MODE_REFLECT_X : 0;
+	flags |= VOP_WIN_HAS_REG(win_data, y_mir_en) ? DRM_MODE_REFLECT_Y : 0;
+	if (flags)
+		drm_plane_create_rotation_property(plane, DRM_MODE_ROTATE_0,
+						   DRM_MODE_ROTATE_0 | flags);
+}
+
 static int vop_create_crtc(struct vop *vop)
 {
 	const struct vop_data *vop_data = vop->data;
@@ -1309,6 +1453,7 @@ static int vop_create_crtc(struct vop *vop)
 
 		plane = &vop_win->base;
 		drm_plane_helper_add(plane, &plane_helper_funcs);
+		vop_plane_add_properties(plane, win_data);
 		if (plane->type == DRM_PLANE_TYPE_PRIMARY)
 			primary = plane;
 		else if (plane->type == DRM_PLANE_TYPE_CURSOR)
@@ -1346,6 +1491,7 @@ static int vop_create_crtc(struct vop *vop)
 			goto err_cleanup_crtc;
 		}
 		drm_plane_helper_add(&vop_win->base, &plane_helper_funcs);
+		vop_plane_add_properties(&vop_win->base, win_data);
 	}
 
 	port = of_get_child_by_name(dev->of_node, "port");
@@ -1529,6 +1675,9 @@ static void vop_win_init(struct vop *vop)
 
 		vop_win->data = win_data;
 		vop_win->vop = vop;
+
+		if (vop_data->win_yuv2yuv)
+			vop_win->yuv2yuv_data = &vop_data->win_yuv2yuv[i];
 	}
 }
 
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.h b/drivers/gpu/drm/rockchip/rockchip_drm_vop.h
index 0fe40e1983d9..04ed401d2325 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.h
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.h
@@ -23,6 +23,8 @@
 #define VOP_MAJOR(version)		((version) >> 8)
 #define VOP_MINOR(version)		((version) & 0xff)
 
+#define NUM_YUV2YUV_COEFFICIENTS 12
+
 enum vop_data_format {
 	VOP_FMT_ARGB8888 = 0,
 	VOP_FMT_RGB888,
@@ -124,6 +126,10 @@ struct vop_scl_regs {
 	struct vop_reg scale_cbcr_y;
 };
 
+struct vop_yuv2yuv_phy {
+	struct vop_reg y2r_coefficients[NUM_YUV2YUV_COEFFICIENTS];
+};
+
 struct vop_win_phy {
 	const struct vop_scl_regs *scl;
 	const uint32_t *data_formats;
@@ -140,12 +146,20 @@ struct vop_win_phy {
 	struct vop_reg uv_mst;
 	struct vop_reg yrgb_vir;
 	struct vop_reg uv_vir;
+	struct vop_reg y_mir_en;
+	struct vop_reg x_mir_en;
 
 	struct vop_reg dst_alpha_ctl;
 	struct vop_reg src_alpha_ctl;
 	struct vop_reg channel;
 };
 
+struct vop_win_yuv2yuv_data {
+	uint32_t base;
+	const struct vop_yuv2yuv_phy *phy;
+	struct vop_reg y2r_en;
+};
+
 struct vop_win_data {
 	uint32_t base;
 	const struct vop_win_phy *phy;
@@ -159,6 +173,7 @@ struct vop_data {
 	const struct vop_misc *misc;
 	const struct vop_modeset *modeset;
 	const struct vop_output *output;
+	const struct vop_win_yuv2yuv_data *win_yuv2yuv;
 	const struct vop_win_data *win;
 	unsigned int win_size;
 
diff --git a/drivers/gpu/drm/rockchip/rockchip_lvds.c b/drivers/gpu/drm/rockchip/rockchip_lvds.c
index 456bd9f13bae..e52dd5a8529e 100644
--- a/drivers/gpu/drm/rockchip/rockchip_lvds.c
+++ b/drivers/gpu/drm/rockchip/rockchip_lvds.c
@@ -16,10 +16,10 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_dp_helper.h>
 #include <drm/drm_panel.h>
 #include <drm/drm_of.h>
+#include <drm/drm_probe_helper.h>
 
 #include <linux/component.h>
 #include <linux/clk.h>
diff --git a/drivers/gpu/drm/rockchip/rockchip_rgb.c b/drivers/gpu/drm/rockchip/rockchip_rgb.c
index c0351abf83a3..ce4d82d293e4 100644
--- a/drivers/gpu/drm/rockchip/rockchip_rgb.c
+++ b/drivers/gpu/drm/rockchip/rockchip_rgb.c
@@ -7,10 +7,10 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_dp_helper.h>
 #include <drm/drm_panel.h>
 #include <drm/drm_of.h>
+#include <drm/drm_probe_helper.h>
 
 #include <linux/component.h>
 #include <linux/of_graph.h>
diff --git a/drivers/gpu/drm/rockchip/rockchip_vop_reg.c b/drivers/gpu/drm/rockchip/rockchip_vop_reg.c
index 08fc40af52c8..bd76328c0fdb 100644
--- a/drivers/gpu/drm/rockchip/rockchip_vop_reg.c
+++ b/drivers/gpu/drm/rockchip/rockchip_vop_reg.c
@@ -299,6 +299,114 @@ static const struct vop_data px30_vop_lit = {
 	.win_size = ARRAY_SIZE(px30_vop_lit_win_data),
 };
 
+static const struct vop_scl_regs rk3066_win_scl = {
+	.scale_yrgb_x = VOP_REG(RK3066_WIN0_SCL_FACTOR_YRGB, 0xffff, 0x0),
+	.scale_yrgb_y = VOP_REG(RK3066_WIN0_SCL_FACTOR_YRGB, 0xffff, 16),
+	.scale_cbcr_x = VOP_REG(RK3066_WIN0_SCL_FACTOR_CBR, 0xffff, 0x0),
+	.scale_cbcr_y = VOP_REG(RK3066_WIN0_SCL_FACTOR_CBR, 0xffff, 16),
+};
+
+static const struct vop_win_phy rk3066_win0_data = {
+	.scl = &rk3066_win_scl,
+	.data_formats = formats_win_full,
+	.nformats = ARRAY_SIZE(formats_win_full),
+	.enable = VOP_REG(RK3066_SYS_CTRL1, 0x1, 0),
+	.format = VOP_REG(RK3066_SYS_CTRL0, 0x7, 4),
+	.rb_swap = VOP_REG(RK3066_SYS_CTRL0, 0x1, 19),
+	.act_info = VOP_REG(RK3066_WIN0_ACT_INFO, 0x1fff1fff, 0),
+	.dsp_info = VOP_REG(RK3066_WIN0_DSP_INFO, 0x0fff0fff, 0),
+	.dsp_st = VOP_REG(RK3066_WIN0_DSP_ST, 0x1fff1fff, 0),
+	.yrgb_mst = VOP_REG(RK3066_WIN0_YRGB_MST0, 0xffffffff, 0),
+	.uv_mst = VOP_REG(RK3066_WIN0_CBR_MST0, 0xffffffff, 0),
+	.yrgb_vir = VOP_REG(RK3066_WIN0_VIR, 0xffff, 0),
+	.uv_vir = VOP_REG(RK3066_WIN0_VIR, 0x1fff, 16),
+};
+
+static const struct vop_win_phy rk3066_win1_data = {
+	.scl = &rk3066_win_scl,
+	.data_formats = formats_win_full,
+	.nformats = ARRAY_SIZE(formats_win_full),
+	.enable = VOP_REG(RK3066_SYS_CTRL1, 0x1, 1),
+	.format = VOP_REG(RK3066_SYS_CTRL0, 0x7, 7),
+	.rb_swap = VOP_REG(RK3066_SYS_CTRL0, 0x1, 23),
+	.act_info = VOP_REG(RK3066_WIN1_ACT_INFO, 0x1fff1fff, 0),
+	.dsp_info = VOP_REG(RK3066_WIN1_DSP_INFO, 0x0fff0fff, 0),
+	.dsp_st = VOP_REG(RK3066_WIN1_DSP_ST, 0x1fff1fff, 0),
+	.yrgb_mst = VOP_REG(RK3066_WIN1_YRGB_MST, 0xffffffff, 0),
+	.uv_mst = VOP_REG(RK3066_WIN1_CBR_MST, 0xffffffff, 0),
+	.yrgb_vir = VOP_REG(RK3066_WIN1_VIR, 0xffff, 0),
+	.uv_vir = VOP_REG(RK3066_WIN1_VIR, 0x1fff, 16),
+};
+
+static const struct vop_win_phy rk3066_win2_data = {
+	.data_formats = formats_win_lite,
+	.nformats = ARRAY_SIZE(formats_win_lite),
+	.enable = VOP_REG(RK3066_SYS_CTRL1, 0x1, 2),
+	.format = VOP_REG(RK3066_SYS_CTRL0, 0x7, 10),
+	.rb_swap = VOP_REG(RK3066_SYS_CTRL0, 0x1, 27),
+	.dsp_info = VOP_REG(RK3066_WIN2_DSP_INFO, 0x0fff0fff, 0),
+	.dsp_st = VOP_REG(RK3066_WIN2_DSP_ST, 0x1fff1fff, 0),
+	.yrgb_mst = VOP_REG(RK3066_WIN2_MST, 0xffffffff, 0),
+	.yrgb_vir = VOP_REG(RK3066_WIN2_VIR, 0xffff, 0),
+};
+
+static const struct vop_modeset rk3066_modeset = {
+	.htotal_pw = VOP_REG(RK3066_DSP_HTOTAL_HS_END, 0x1fff1fff, 0),
+	.hact_st_end = VOP_REG(RK3066_DSP_HACT_ST_END, 0x1fff1fff, 0),
+	.vtotal_pw = VOP_REG(RK3066_DSP_VTOTAL_VS_END, 0x1fff1fff, 0),
+	.vact_st_end = VOP_REG(RK3066_DSP_VACT_ST_END, 0x1fff1fff, 0),
+};
+
+static const struct vop_output rk3066_output = {
+	.pin_pol = VOP_REG(RK3066_DSP_CTRL0, 0x7, 4),
+};
+
+static const struct vop_common rk3066_common = {
+	.standby = VOP_REG(RK3066_SYS_CTRL0, 0x1, 1),
+	.out_mode = VOP_REG(RK3066_DSP_CTRL0, 0xf, 0),
+	.cfg_done = VOP_REG(RK3066_REG_CFG_DONE, 0x1, 0),
+	.dsp_blank = VOP_REG(RK3066_DSP_CTRL1, 0x1, 24),
+};
+
+static const struct vop_win_data rk3066_vop_win_data[] = {
+	{ .base = 0x00, .phy = &rk3066_win0_data,
+	  .type = DRM_PLANE_TYPE_PRIMARY },
+	{ .base = 0x00, .phy = &rk3066_win1_data,
+	  .type = DRM_PLANE_TYPE_OVERLAY },
+	{ .base = 0x00, .phy = &rk3066_win2_data,
+	  .type = DRM_PLANE_TYPE_CURSOR },
+};
+
+static const int rk3066_vop_intrs[] = {
+	/*
+	 * hs_start interrupt fires at frame-start, so it serves
+	 * the same purpose as dsp_hold in the driver.
+	 */
+	DSP_HOLD_VALID_INTR,
+	FS_INTR,
+	LINE_FLAG_INTR,
+	BUS_ERROR_INTR,
+};
+
+static const struct vop_intr rk3066_intr = {
+	.intrs = rk3066_vop_intrs,
+	.nintrs = ARRAY_SIZE(rk3066_vop_intrs),
+	.line_flag_num[0] = VOP_REG(RK3066_INT_STATUS, 0xfff, 12),
+	.status = VOP_REG(RK3066_INT_STATUS, 0xf, 0),
+	.enable = VOP_REG(RK3066_INT_STATUS, 0xf, 4),
+	.clear = VOP_REG(RK3066_INT_STATUS, 0xf, 8),
+};
+
+static const struct vop_data rk3066_vop = {
+	.version = VOP_VERSION(2, 1),
+	.intr = &rk3066_intr,
+	.common = &rk3066_common,
+	.modeset = &rk3066_modeset,
+	.output = &rk3066_output,
+	.win = rk3066_vop_win_data,
+	.win_size = ARRAY_SIZE(rk3066_vop_win_data),
+};
+
 static const struct vop_scl_regs rk3188_win_scl = {
 	.scale_yrgb_x = VOP_REG(RK3188_WIN0_SCL_FACTOR_YRGB, 0xffff, 0x0),
 	.scale_yrgb_y = VOP_REG(RK3188_WIN0_SCL_FACTOR_YRGB, 0xffff, 16),
@@ -550,6 +658,27 @@ static const struct vop_intr rk3368_vop_intr = {
 	.clear = VOP_REG_MASK_SYNC(RK3368_INTR_CLEAR, 0x3fff, 0),
 };
 
+static const struct vop_win_phy rk3368_win01_data = {
+	.scl = &rk3288_win_full_scl,
+	.data_formats = formats_win_full,
+	.nformats = ARRAY_SIZE(formats_win_full),
+	.enable = VOP_REG(RK3368_WIN0_CTRL0, 0x1, 0),
+	.format = VOP_REG(RK3368_WIN0_CTRL0, 0x7, 1),
+	.rb_swap = VOP_REG(RK3368_WIN0_CTRL0, 0x1, 12),
+	.x_mir_en = VOP_REG(RK3368_WIN0_CTRL0, 0x1, 21),
+	.y_mir_en = VOP_REG(RK3368_WIN0_CTRL0, 0x1, 22),
+	.act_info = VOP_REG(RK3368_WIN0_ACT_INFO, 0x1fff1fff, 0),
+	.dsp_info = VOP_REG(RK3368_WIN0_DSP_INFO, 0x0fff0fff, 0),
+	.dsp_st = VOP_REG(RK3368_WIN0_DSP_ST, 0x1fff1fff, 0),
+	.yrgb_mst = VOP_REG(RK3368_WIN0_YRGB_MST, 0xffffffff, 0),
+	.uv_mst = VOP_REG(RK3368_WIN0_CBR_MST, 0xffffffff, 0),
+	.yrgb_vir = VOP_REG(RK3368_WIN0_VIR, 0x3fff, 0),
+	.uv_vir = VOP_REG(RK3368_WIN0_VIR, 0x3fff, 16),
+	.src_alpha_ctl = VOP_REG(RK3368_WIN0_SRC_ALPHA_CTRL, 0xff, 0),
+	.dst_alpha_ctl = VOP_REG(RK3368_WIN0_DST_ALPHA_CTRL, 0xff, 0),
+	.channel = VOP_REG(RK3368_WIN0_CTRL2, 0xff, 0),
+};
+
 static const struct vop_win_phy rk3368_win23_data = {
 	.data_formats = formats_win_lite,
 	.nformats = ARRAY_SIZE(formats_win_lite),
@@ -557,6 +686,7 @@ static const struct vop_win_phy rk3368_win23_data = {
 	.enable = VOP_REG(RK3368_WIN2_CTRL0, 0x1, 4),
 	.format = VOP_REG(RK3368_WIN2_CTRL0, 0x3, 5),
 	.rb_swap = VOP_REG(RK3368_WIN2_CTRL0, 0x1, 20),
+	.y_mir_en = VOP_REG(RK3368_WIN2_CTRL1, 0x1, 15),
 	.dsp_info = VOP_REG(RK3368_WIN2_DSP_INFO0, 0x0fff0fff, 0),
 	.dsp_st = VOP_REG(RK3368_WIN2_DSP_ST0, 0x1fff1fff, 0),
 	.yrgb_mst = VOP_REG(RK3368_WIN2_MST0, 0xffffffff, 0),
@@ -566,9 +696,9 @@ static const struct vop_win_phy rk3368_win23_data = {
 };
 
 static const struct vop_win_data rk3368_vop_win_data[] = {
-	{ .base = 0x00, .phy = &rk3288_win01_data,
+	{ .base = 0x00, .phy = &rk3368_win01_data,
 	  .type = DRM_PLANE_TYPE_PRIMARY },
-	{ .base = 0x40, .phy = &rk3288_win01_data,
+	{ .base = 0x40, .phy = &rk3368_win01_data,
 	  .type = DRM_PLANE_TYPE_OVERLAY },
 	{ .base = 0x00, .phy = &rk3368_win23_data,
 	  .type = DRM_PLANE_TYPE_OVERLAY },
@@ -637,6 +767,34 @@ static const struct vop_output rk3399_output = {
 	.mipi_dual_channel_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 3),
 };
 
+static const struct vop_yuv2yuv_phy rk3399_yuv2yuv_win01_data = {
+	.y2r_coefficients = {
+		VOP_REG(RK3399_WIN0_YUV2YUV_Y2R + 0, 0xffff, 0),
+		VOP_REG(RK3399_WIN0_YUV2YUV_Y2R + 0, 0xffff, 16),
+		VOP_REG(RK3399_WIN0_YUV2YUV_Y2R + 4, 0xffff, 0),
+		VOP_REG(RK3399_WIN0_YUV2YUV_Y2R + 4, 0xffff, 16),
+		VOP_REG(RK3399_WIN0_YUV2YUV_Y2R + 8, 0xffff, 0),
+		VOP_REG(RK3399_WIN0_YUV2YUV_Y2R + 8, 0xffff, 16),
+		VOP_REG(RK3399_WIN0_YUV2YUV_Y2R + 12, 0xffff, 0),
+		VOP_REG(RK3399_WIN0_YUV2YUV_Y2R + 12, 0xffff, 16),
+		VOP_REG(RK3399_WIN0_YUV2YUV_Y2R + 16, 0xffff, 0),
+		VOP_REG(RK3399_WIN0_YUV2YUV_Y2R + 20, 0xffffffff, 0),
+		VOP_REG(RK3399_WIN0_YUV2YUV_Y2R + 24, 0xffffffff, 0),
+		VOP_REG(RK3399_WIN0_YUV2YUV_Y2R + 28, 0xffffffff, 0),
+	},
+};
+
+static const struct vop_yuv2yuv_phy rk3399_yuv2yuv_win23_data = { };
+
+static const struct vop_win_yuv2yuv_data rk3399_vop_big_win_yuv2yuv_data[] = {
+	{ .base = 0x00, .phy = &rk3399_yuv2yuv_win01_data,
+	  .y2r_en = VOP_REG(RK3399_YUV2YUV_WIN, 0x1, 1) },
+	{ .base = 0x60, .phy = &rk3399_yuv2yuv_win01_data,
+	  .y2r_en = VOP_REG(RK3399_YUV2YUV_WIN, 0x1, 9) },
+	{ .base = 0xC0, .phy = &rk3399_yuv2yuv_win23_data },
+	{ .base = 0x120, .phy = &rk3399_yuv2yuv_win23_data },
+};
+
 static const struct vop_data rk3399_vop_big = {
 	.version = VOP_VERSION(3, 5),
 	.feature = VOP_FEATURE_OUTPUT_RGB10,
@@ -647,15 +805,22 @@ static const struct vop_data rk3399_vop_big = {
 	.misc = &rk3368_misc,
 	.win = rk3368_vop_win_data,
 	.win_size = ARRAY_SIZE(rk3368_vop_win_data),
+	.win_yuv2yuv = rk3399_vop_big_win_yuv2yuv_data,
 };
 
 static const struct vop_win_data rk3399_vop_lit_win_data[] = {
-	{ .base = 0x00, .phy = &rk3288_win01_data,
+	{ .base = 0x00, .phy = &rk3368_win01_data,
 	  .type = DRM_PLANE_TYPE_PRIMARY },
 	{ .base = 0x00, .phy = &rk3368_win23_data,
 	  .type = DRM_PLANE_TYPE_CURSOR},
 };
 
+static const struct vop_win_yuv2yuv_data rk3399_vop_lit_win_yuv2yuv_data[] = {
+	{ .base = 0x00, .phy = &rk3399_yuv2yuv_win01_data,
+	  .y2r_en = VOP_REG(RK3399_YUV2YUV_WIN, 0x1, 1)},
+	{ .base = 0x60, .phy = &rk3399_yuv2yuv_win23_data },
+};
+
 static const struct vop_data rk3399_vop_lit = {
 	.version = VOP_VERSION(3, 6),
 	.intr = &rk3366_vop_intr,
@@ -665,6 +830,7 @@ static const struct vop_data rk3399_vop_lit = {
 	.misc = &rk3368_misc,
 	.win = rk3399_vop_lit_win_data,
 	.win_size = ARRAY_SIZE(rk3399_vop_lit_win_data),
+	.win_yuv2yuv = rk3399_vop_lit_win_yuv2yuv_data,
 };
 
 static const struct vop_win_data rk3228_vop_win_data[] = {
@@ -730,11 +896,11 @@ static const struct vop_intr rk3328_vop_intr = {
 };
 
 static const struct vop_win_data rk3328_vop_win_data[] = {
-	{ .base = 0xd0, .phy = &rk3288_win01_data,
+	{ .base = 0xd0, .phy = &rk3368_win01_data,
 	  .type = DRM_PLANE_TYPE_PRIMARY },
-	{ .base = 0x1d0, .phy = &rk3288_win01_data,
+	{ .base = 0x1d0, .phy = &rk3368_win01_data,
 	  .type = DRM_PLANE_TYPE_OVERLAY },
-	{ .base = 0x2d0, .phy = &rk3288_win01_data,
+	{ .base = 0x2d0, .phy = &rk3368_win01_data,
 	  .type = DRM_PLANE_TYPE_CURSOR },
 };
 
@@ -759,6 +925,8 @@ static const struct of_device_id vop_driver_dt_match[] = {
 	  .data = &px30_vop_big },
 	{ .compatible = "rockchip,px30-vop-lit",
 	  .data = &px30_vop_lit },
+	{ .compatible = "rockchip,rk3066-vop",
+	  .data = &rk3066_vop },
 	{ .compatible = "rockchip,rk3188-vop",
 	  .data = &rk3188_vop },
 	{ .compatible = "rockchip,rk3288-vop",
diff --git a/drivers/gpu/drm/rockchip/rockchip_vop_reg.h b/drivers/gpu/drm/rockchip/rockchip_vop_reg.h
index 7348c68352ed..d837d4a7df4a 100644
--- a/drivers/gpu/drm/rockchip/rockchip_vop_reg.h
+++ b/drivers/gpu/drm/rockchip/rockchip_vop_reg.h
@@ -983,4 +983,57 @@
 #define RK3188_REG_CFG_DONE		0x90
 /* rk3188 register definition end */
 
+/* rk3066 register definition */
+#define RK3066_SYS_CTRL0		0x00
+#define RK3066_SYS_CTRL1		0x04
+#define RK3066_DSP_CTRL0		0x08
+#define RK3066_DSP_CTRL1		0x0c
+#define RK3066_INT_STATUS		0x10
+#define RK3066_MCU_CTRL			0x14
+#define RK3066_BLEND_CTRL		0x18
+#define RK3066_WIN0_COLOR_KEY_CTRL	0x1c
+#define RK3066_WIN1_COLOR_KEY_CTRL	0x20
+#define RK3066_WIN2_COLOR_KEY_CTRL	0x24
+#define RK3066_WIN0_YRGB_MST0		0x28
+#define RK3066_WIN0_CBR_MST0		0x2c
+#define RK3066_WIN0_YRGB_MST1		0x30
+#define RK3066_WIN0_CBR_MST1		0x34
+#define RK3066_WIN0_VIR			0x38
+#define RK3066_WIN0_ACT_INFO		0x3c
+#define RK3066_WIN0_DSP_INFO		0x40
+#define RK3066_WIN0_DSP_ST		0x44
+#define RK3066_WIN0_SCL_FACTOR_YRGB	0x48
+#define RK3066_WIN0_SCL_FACTOR_CBR	0x4c
+#define RK3066_WIN0_SCL_OFFSET		0x50
+#define RK3066_WIN1_YRGB_MST		0x54
+#define RK3066_WIN1_CBR_MST		0x58
+#define RK3066_WIN1_VIR			0x5c
+#define RK3066_WIN1_ACT_INFO		0x60
+#define RK3066_WIN1_DSP_INFO		0x64
+#define RK3066_WIN1_DSP_ST		0x68
+#define RK3066_WIN1_SCL_FACTOR_YRGB	0x6c
+#define RK3066_WIN1_SCL_FACTOR_CBR	0x70
+#define RK3066_WIN1_SCL_OFFSET		0x74
+#define RK3066_WIN2_MST			0x78
+#define RK3066_WIN2_VIR			0x7c
+#define RK3066_WIN2_DSP_INFO		0x80
+#define RK3066_WIN2_DSP_ST		0x84
+#define RK3066_HWC_MST			0x88
+#define RK3066_HWC_DSP_ST		0x8c
+#define RK3066_HWC_COLOR_LUT0		0x90
+#define RK3066_HWC_COLOR_LUT1		0x94
+#define RK3066_HWC_COLOR_LUT2		0x98
+#define RK3066_DSP_HTOTAL_HS_END	0x9c
+#define RK3066_DSP_HACT_ST_END		0xa0
+#define RK3066_DSP_VTOTAL_VS_END	0xa4
+#define RK3066_DSP_VACT_ST_END		0xa8
+#define RK3066_DSP_VS_ST_END_F1		0xac
+#define RK3066_DSP_VACT_ST_END_F1	0xb0
+#define RK3066_REG_CFG_DONE		0xc0
+#define RK3066_MCU_BYPASS_WPORT		0x100
+#define RK3066_MCU_BYPASS_RPORT		0x200
+#define RK3066_WIN2_LUT_ADDR		0x400
+#define RK3066_DSP_LUT_ADDR		0x800
+/* rk3066 register definition end */
+
 #endif /* _ROCKCHIP_VOP_REG_H */
diff --git a/drivers/gpu/drm/savage/savage_state.c b/drivers/gpu/drm/savage/savage_state.c
index 7559a820bd43..ebb8b7d32b33 100644
--- a/drivers/gpu/drm/savage/savage_state.c
+++ b/drivers/gpu/drm/savage/savage_state.c
@@ -299,6 +299,7 @@ static int savage_dispatch_dma_prim(drm_savage_private_t * dev_priv,
 	case SAVAGE_PRIM_TRILIST_201:
 		reorder = 1;
 		prim = SAVAGE_PRIM_TRILIST;
+		/* fall through */
 	case SAVAGE_PRIM_TRILIST:
 		if (n % 3 != 0) {
 			DRM_ERROR("wrong number of vertices %u in TRILIST\n",
@@ -436,6 +437,7 @@ static int savage_dispatch_vb_prim(drm_savage_private_t * dev_priv,
 	case SAVAGE_PRIM_TRILIST_201:
 		reorder = 1;
 		prim = SAVAGE_PRIM_TRILIST;
+		/* fall through */
 	case SAVAGE_PRIM_TRILIST:
 		if (n % 3 != 0) {
 			DRM_ERROR("wrong number of vertices %u in TRILIST\n",
@@ -557,6 +559,7 @@ static int savage_dispatch_dma_idx(drm_savage_private_t * dev_priv,
 	case SAVAGE_PRIM_TRILIST_201:
 		reorder = 1;
 		prim = SAVAGE_PRIM_TRILIST;
+		/* fall through */
 	case SAVAGE_PRIM_TRILIST:
 		if (n % 3 != 0) {
 			DRM_ERROR("wrong number of indices %u in TRILIST\n", n);
@@ -695,6 +698,7 @@ static int savage_dispatch_vb_idx(drm_savage_private_t * dev_priv,
 	case SAVAGE_PRIM_TRILIST_201:
 		reorder = 1;
 		prim = SAVAGE_PRIM_TRILIST;
+		/* fall through */
 	case SAVAGE_PRIM_TRILIST:
 		if (n % 3 != 0) {
 			DRM_ERROR("wrong number of indices %u in TRILIST\n", n);
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index e2942c9a11a7..35ddbec1375a 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -52,12 +52,12 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
 {
 	int i;
 
-	if (!(entity && rq_list && num_rq_list > 0 && rq_list[0]))
+	if (!(entity && rq_list && (num_rq_list == 0 || rq_list[0])))
 		return -EINVAL;
 
 	memset(entity, 0, sizeof(struct drm_sched_entity));
 	INIT_LIST_HEAD(&entity->list);
-	entity->rq = rq_list[0];
+	entity->rq = NULL;
 	entity->guilty = guilty;
 	entity->num_rq_list = num_rq_list;
 	entity->rq_list = kcalloc(num_rq_list, sizeof(struct drm_sched_rq *),
@@ -67,6 +67,10 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
 
 	for (i = 0; i < num_rq_list; ++i)
 		entity->rq_list[i] = rq_list[i];
+
+	if (num_rq_list)
+		entity->rq = rq_list[0];
+
 	entity->last_scheduled = NULL;
 
 	spin_lock_init(&entity->rq_lock);
@@ -165,6 +169,9 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
 	struct task_struct *last_user;
 	long ret = timeout;
 
+	if (!entity->rq)
+		return 0;
+
 	sched = entity->rq->sched;
 	/**
 	 * The client will not queue more IBs during this fini, consume existing
@@ -264,20 +271,24 @@ static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
  */
 void drm_sched_entity_fini(struct drm_sched_entity *entity)
 {
-	struct drm_gpu_scheduler *sched;
+	struct drm_gpu_scheduler *sched = NULL;
 
-	sched = entity->rq->sched;
-	drm_sched_rq_remove_entity(entity->rq, entity);
+	if (entity->rq) {
+		sched = entity->rq->sched;
+		drm_sched_rq_remove_entity(entity->rq, entity);
+	}
 
 	/* Consumption of existing IBs wasn't completed. Forcefully
 	 * remove them here.
 	 */
 	if (spsc_queue_peek(&entity->job_queue)) {
-		/* Park the kernel for a moment to make sure it isn't processing
-		 * our enity.
-		 */
-		kthread_park(sched->thread);
-		kthread_unpark(sched->thread);
+		if (sched) {
+			/* Park the kernel thread for a moment to make sure it
+			 * isn't processing our entity.
+			 */
+			kthread_park(sched->thread);
+			kthread_unpark(sched->thread);
+		}
 		if (entity->dependency) {
 			dma_fence_remove_callback(entity->dependency,
 						  &entity->cb);
@@ -362,9 +373,11 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
 	for (i = 0; i < entity->num_rq_list; ++i)
 		drm_sched_entity_set_rq_priority(&entity->rq_list[i], priority);
 
-	drm_sched_rq_remove_entity(entity->rq, entity);
-	drm_sched_entity_set_rq_priority(&entity->rq, priority);
-	drm_sched_rq_add_entity(entity->rq, entity);
+	if (entity->rq) {
+		drm_sched_rq_remove_entity(entity->rq, entity);
+		drm_sched_entity_set_rq_priority(&entity->rq, priority);
+		drm_sched_rq_add_entity(entity->rq, entity);
+	}
 
 	spin_unlock(&entity->rq_lock);
 }
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index dbb69063b3d5..19fc601c9eeb 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -60,8 +60,6 @@
 
 static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb);
 
-static void drm_sched_expel_job_unlocked(struct drm_sched_job *s_job);
-
 /**
  * drm_sched_rq_init - initialize a given run queue struct
  *
@@ -286,8 +284,6 @@ static void drm_sched_job_finish(struct work_struct *work)
 	cancel_delayed_work_sync(&sched->work_tdr);
 
 	spin_lock_irqsave(&sched->job_list_lock, flags);
-	/* remove job from ring_mirror_list */
-	list_del_init(&s_job->node);
 	/* queue TDR for next job */
 	drm_sched_start_timeout(sched);
 	spin_unlock_irqrestore(&sched->job_list_lock, flags);
@@ -295,22 +291,11 @@ static void drm_sched_job_finish(struct work_struct *work)
 	sched->ops->free_job(s_job);
 }
 
-static void drm_sched_job_finish_cb(struct dma_fence *f,
-				    struct dma_fence_cb *cb)
-{
-	struct drm_sched_job *job = container_of(cb, struct drm_sched_job,
-						 finish_cb);
-	schedule_work(&job->finish_work);
-}
-
 static void drm_sched_job_begin(struct drm_sched_job *s_job)
 {
 	struct drm_gpu_scheduler *sched = s_job->sched;
 	unsigned long flags;
 
-	dma_fence_add_callback(&s_job->s_fence->finished, &s_job->finish_cb,
-			       drm_sched_job_finish_cb);
-
 	spin_lock_irqsave(&sched->job_list_lock, flags);
 	list_add_tail(&s_job->node, &sched->ring_mirror_list);
 	drm_sched_start_timeout(sched);
@@ -335,6 +320,51 @@ static void drm_sched_job_timedout(struct work_struct *work)
 	spin_unlock_irqrestore(&sched->job_list_lock, flags);
 }
 
+/**
+ * drm_sched_increase_karma - Update sched_entity guilty flag
+ *
+ * @bad: The job guilty of the timeout
+ *
+ * Increment on every hang caused by the 'bad' job. If this exceeds the hang
+ * limit of the scheduler then the respective sched entity is marked guilty
+ * and jobs from it will not be scheduled further.
+ */
+void drm_sched_increase_karma(struct drm_sched_job *bad)
+{
+	int i;
+	struct drm_sched_entity *tmp;
+	struct drm_sched_entity *entity;
+	struct drm_gpu_scheduler *sched = bad->sched;
+
+	/* Don't increase @bad's karma if it's from the KERNEL RQ, because a GPU
+	 * hang can corrupt kernel jobs (like VM updating jobs), but keep in
+	 * mind that kernel jobs are always considered good.
+	 */
+	if (bad->s_priority != DRM_SCHED_PRIORITY_KERNEL) {
+		atomic_inc(&bad->karma);
+		for (i = DRM_SCHED_PRIORITY_MIN; i < DRM_SCHED_PRIORITY_KERNEL;
+		     i++) {
+			struct drm_sched_rq *rq = &sched->sched_rq[i];
+
+			spin_lock(&rq->lock);
+			list_for_each_entry_safe(entity, tmp, &rq->entities, list) {
+				if (bad->s_fence->scheduled.context ==
+				    entity->fence_context) {
+					if (atomic_read(&bad->karma) >
+					    bad->sched->hang_limit)
+						if (entity->guilty)
+							atomic_set(entity->guilty, 1);
+					break;
+				}
+			}
+			spin_unlock(&rq->lock);
+			if (&entity->list != &rq->entities)
+				break;
+		}
+	}
+}
+EXPORT_SYMBOL(drm_sched_increase_karma);
+
 /**
  * drm_sched_hw_job_reset - stop the scheduler if it contains the bad job
  *
@@ -342,50 +372,42 @@ static void drm_sched_job_timedout(struct work_struct *work)
  * @bad: bad scheduler job
  *
  */
-void drm_sched_hw_job_reset(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
+void drm_sched_stop(struct drm_gpu_scheduler *sched)
 {
 	struct drm_sched_job *s_job;
-	struct drm_sched_entity *entity, *tmp;
 	unsigned long flags;
-	int i;
+	struct dma_fence *last_fence = NULL;
 
+	kthread_park(sched->thread);
+
+	/*
+	 * Verify all the signaled jobs in the mirror list are removed from the
+	 * ring by waiting for the latest job to enter the list. This should
+	 * ensure that all the previous jobs that were in flight have also
+	 * already signaled and been removed from the list.
+	 */
 	spin_lock_irqsave(&sched->job_list_lock, flags);
 	list_for_each_entry_reverse(s_job, &sched->ring_mirror_list, node) {
 		if (s_job->s_fence->parent &&
 		    dma_fence_remove_callback(s_job->s_fence->parent,
-					      &s_job->s_fence->cb)) {
+					      &s_job->cb)) {
 			dma_fence_put(s_job->s_fence->parent);
 			s_job->s_fence->parent = NULL;
 			atomic_dec(&sched->hw_rq_count);
+		} else {
+			last_fence = dma_fence_get(&s_job->s_fence->finished);
+			break;
 		}
 	}
 	spin_unlock_irqrestore(&sched->job_list_lock, flags);
 
-	if (bad && bad->s_priority != DRM_SCHED_PRIORITY_KERNEL) {
-		atomic_inc(&bad->karma);
-		/* don't increase @bad's karma if it's from KERNEL RQ,
-		 * becuase sometimes GPU hang would cause kernel jobs (like VM updating jobs)
-		 * corrupt but keep in mind that kernel jobs always considered good.
-		 */
-		for (i = DRM_SCHED_PRIORITY_MIN; i < DRM_SCHED_PRIORITY_KERNEL; i++ ) {
-			struct drm_sched_rq *rq = &sched->sched_rq[i];
-
-			spin_lock(&rq->lock);
-			list_for_each_entry_safe(entity, tmp, &rq->entities, list) {
-				if (bad->s_fence->scheduled.context == entity->fence_context) {
-				    if (atomic_read(&bad->karma) > bad->sched->hang_limit)
-						if (entity->guilty)
-							atomic_set(entity->guilty, 1);
-					break;
-				}
-			}
-			spin_unlock(&rq->lock);
-			if (&entity->list != &rq->entities)
-				break;
-		}
+	if (last_fence) {
+		dma_fence_wait(last_fence, false);
+		dma_fence_put(last_fence);
 	}
 }
-EXPORT_SYMBOL(drm_sched_hw_job_reset);
+EXPORT_SYMBOL(drm_sched_stop);
 
 /**
  * drm_sched_job_recovery - recover jobs after a reset
@@ -393,18 +415,58 @@ EXPORT_SYMBOL(drm_sched_hw_job_reset);
  * @sched: scheduler instance
  *
  */
-void drm_sched_job_recovery(struct drm_gpu_scheduler *sched)
+void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
 {
 	struct drm_sched_job *s_job, *tmp;
-	bool found_guilty = false;
-	unsigned long flags;
 	int r;
 
-	spin_lock_irqsave(&sched->job_list_lock, flags);
+	if (!full_recovery)
+		goto unpark;
+
+	/*
+	 * Locking the list is not required here as the sched thread is parked,
+	 * so no new jobs are being pushed to the HW, and in drm_sched_stop we
+	 * flushed all the jobs that were still in the mirror list but had
+	 * already signaled and removed themselves from the list. Also,
+	 * concurrent GPU recoveries can't run in parallel.
+	 */
+	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
+		struct dma_fence *fence = s_job->s_fence->parent;
+
+		if (fence) {
+			r = dma_fence_add_callback(fence, &s_job->cb,
+						   drm_sched_process_job);
+			if (r == -ENOENT)
+				drm_sched_process_job(fence, &s_job->cb);
+			else if (r)
+				DRM_ERROR("fence add callback failed (%d)\n",
+					  r);
+		} else
+			drm_sched_process_job(NULL, &s_job->cb);
+	}
+
+	drm_sched_start_timeout(sched);
+
+unpark:
+	kthread_unpark(sched->thread);
+}
+EXPORT_SYMBOL(drm_sched_start);
+
+/**
+ * drm_sched_resubmit_jobs - helper to relaunch jobs from the mirror ring list
+ *
+ * @sched: scheduler instance
+ *
+ */
+void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched)
+{
+	struct drm_sched_job *s_job, *tmp;
+	uint64_t guilty_context;
+	bool found_guilty = false;
+
+	/* TODO: do we need a spinlock here? */
 	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
 		struct drm_sched_fence *s_fence = s_job->s_fence;
-		struct dma_fence *fence;
-		uint64_t guilty_context;
 
 		if (!found_guilty && atomic_read(&s_job->karma) > sched->hang_limit) {
 			found_guilty = true;
@@ -414,31 +476,11 @@ void drm_sched_job_recovery(struct drm_gpu_scheduler *sched)
 		if (found_guilty && s_job->s_fence->scheduled.context == guilty_context)
 			dma_fence_set_error(&s_fence->finished, -ECANCELED);
 
-		spin_unlock_irqrestore(&sched->job_list_lock, flags);
-		fence = sched->ops->run_job(s_job);
+		s_job->s_fence->parent = sched->ops->run_job(s_job);
 		atomic_inc(&sched->hw_rq_count);
-
-		if (fence) {
-			s_fence->parent = dma_fence_get(fence);
-			r = dma_fence_add_callback(fence, &s_fence->cb,
-						   drm_sched_process_job);
-			if (r == -ENOENT)
-				drm_sched_process_job(fence, &s_fence->cb);
-			else if (r)
-				DRM_ERROR("fence add callback failed (%d)\n",
-					  r);
-			dma_fence_put(fence);
-		} else {
-			if (s_fence->finished.error < 0)
-				drm_sched_expel_job_unlocked(s_job);
-			drm_sched_process_job(NULL, &s_fence->cb);
-		}
-		spin_lock_irqsave(&sched->job_list_lock, flags);
 	}
-	drm_sched_start_timeout(sched);
-	spin_unlock_irqrestore(&sched->job_list_lock, flags);
 }
-EXPORT_SYMBOL(drm_sched_job_recovery);
+EXPORT_SYMBOL(drm_sched_resubmit_jobs);
 
 /**
  * drm_sched_job_init - init a scheduler job
@@ -552,18 +594,27 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
  */
 static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
 {
-	struct drm_sched_fence *s_fence =
-		container_of(cb, struct drm_sched_fence, cb);
+	struct drm_sched_job *s_job = container_of(cb, struct drm_sched_job, cb);
+	struct drm_sched_fence *s_fence = s_job->s_fence;
 	struct drm_gpu_scheduler *sched = s_fence->sched;
+	unsigned long flags;
+
+	cancel_delayed_work(&sched->work_tdr);
 
-	dma_fence_get(&s_fence->finished);
 	atomic_dec(&sched->hw_rq_count);
 	atomic_dec(&sched->num_jobs);
+
+	spin_lock_irqsave(&sched->job_list_lock, flags);
+	/* remove job from ring_mirror_list */
+	list_del_init(&s_job->node);
+	spin_unlock_irqrestore(&sched->job_list_lock, flags);
+
 	drm_sched_fence_finished(s_fence);
 
 	trace_drm_sched_process_job(s_fence);
-	dma_fence_put(&s_fence->finished);
 	wake_up_interruptible(&sched->wake_up_worker);
+
+	schedule_work(&s_job->finish_work);
 }
 
 /**
@@ -626,34 +677,22 @@ static int drm_sched_main(void *param)
 
 		if (fence) {
 			s_fence->parent = dma_fence_get(fence);
-			r = dma_fence_add_callback(fence, &s_fence->cb,
+			r = dma_fence_add_callback(fence, &sched_job->cb,
 						   drm_sched_process_job);
 			if (r == -ENOENT)
-				drm_sched_process_job(fence, &s_fence->cb);
+				drm_sched_process_job(fence, &sched_job->cb);
 			else if (r)
 				DRM_ERROR("fence add callback failed (%d)\n",
 					  r);
 			dma_fence_put(fence);
-		} else {
-			if (s_fence->finished.error < 0)
-				drm_sched_expel_job_unlocked(sched_job);
-			drm_sched_process_job(NULL, &s_fence->cb);
-		}
+		} else
+			drm_sched_process_job(NULL, &sched_job->cb);
 
 		wake_up(&sched->job_scheduled);
 	}
 	return 0;
 }
 
-static void drm_sched_expel_job_unlocked(struct drm_sched_job *s_job)
-{
-	struct drm_gpu_scheduler *sched = s_job->sched;
-
-	spin_lock(&sched->job_list_lock);
-	list_del_init(&s_job->node);
-	spin_unlock(&sched->job_list_lock);
-}
-
 /**
  * drm_sched_init - Init a gpu scheduler instance
  *
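
Taken together, the scheduler rework above replaces the old
drm_sched_hw_job_reset()/drm_sched_job_recovery() pair with a four-step
recovery flow driven from the driver's timeout handler. A minimal sketch of
the expected call sequence, assuming a driver-provided my_hw_reset() hook
(illustrative, not part of this patch):

	#include <drm/gpu_scheduler.h>

	void my_hw_reset(void);	/* assumed driver-specific engine reset */

	/* Hypothetical timeout handler built on the new API. */
	static void my_timedout_job(struct drm_sched_job *bad)
	{
		struct drm_gpu_scheduler *sched = bad->sched;

		drm_sched_stop(sched);		/* park thread, detach fence callbacks */
		drm_sched_increase_karma(bad);	/* mark the offending entity */

		my_hw_reset();			/* bring the engine back to life */

		drm_sched_resubmit_jobs(sched);	/* re-run the mirror list jobs */
		drm_sched_start(sched, true);	/* reinstall callbacks and unpark */
	}
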
diff --git a/drivers/gpu/drm/shmobile/shmob_drm_crtc.c b/drivers/gpu/drm/shmobile/shmob_drm_crtc.c
index 499b5fdb869f..b6988a6d698e 100644
--- a/drivers/gpu/drm/shmobile/shmob_drm_crtc.c
+++ b/drivers/gpu/drm/shmobile/shmob_drm_crtc.c
@@ -16,6 +16,7 @@
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "shmob_drm_backlight.h"
 #include "shmob_drm_crtc.h"
diff --git a/drivers/gpu/drm/shmobile/shmob_drm_drv.c b/drivers/gpu/drm/shmobile/shmob_drm_drv.c
index 8554102a6ead..cb821adfc321 100644
--- a/drivers/gpu/drm/shmobile/shmob_drm_drv.c
+++ b/drivers/gpu/drm/shmobile/shmob_drm_drv.c
@@ -18,6 +18,7 @@
 #include <drm/drmP.h>
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_gem_cma_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "shmob_drm_drv.h"
 #include "shmob_drm_kms.h"
@@ -126,7 +127,7 @@ static irqreturn_t shmob_drm_irq(int irq, void *arg)
 DEFINE_DRM_GEM_CMA_FOPS(shmob_drm_fops);
 
 static struct drm_driver shmob_drm_driver = {
-	.driver_features	= DRIVER_HAVE_IRQ | DRIVER_GEM | DRIVER_MODESET
+	.driver_features	= DRIVER_GEM | DRIVER_MODESET
 				| DRIVER_PRIME,
 	.irq_handler		= shmob_drm_irq,
 	.gem_free_object_unlocked = drm_gem_cma_free_object,
@@ -229,8 +230,8 @@ static int shmob_drm_probe(struct platform_device *pdev)
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	sdev->mmio = devm_ioremap_resource(&pdev->dev, res);
-	if (sdev->mmio == NULL)
-		return -ENOMEM;
+	if (IS_ERR(sdev->mmio))
+		return PTR_ERR(sdev->mmio);
 
 	ret = shmob_drm_setup_clocks(sdev, pdata->clk_source);
 	if (ret < 0)
diff --git a/drivers/gpu/drm/shmobile/shmob_drm_kms.c b/drivers/gpu/drm/shmobile/shmob_drm_kms.c
index a17268444c6d..2e08bc203bf9 100644
--- a/drivers/gpu/drm/shmobile/shmob_drm_kms.c
+++ b/drivers/gpu/drm/shmobile/shmob_drm_kms.c
@@ -13,6 +13,7 @@
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "shmob_drm_crtc.h"
 #include "shmob_drm_drv.h"
diff --git a/drivers/gpu/drm/sti/sti_crtc.c b/drivers/gpu/drm/sti/sti_crtc.c
index ed76e52eb213..387f0bed6c1c 100644
--- a/drivers/gpu/drm/sti/sti_crtc.c
+++ b/drivers/gpu/drm/sti/sti_crtc.c
@@ -11,8 +11,8 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "sti_compositor.h"
 #include "sti_crtc.h"
@@ -53,18 +53,10 @@ sti_crtc_mode_set(struct drm_crtc *crtc, struct drm_display_mode *mode)
 	struct clk *compo_clk, *pix_clk;
 	int rate = mode->clock * 1000;
 
-	DRM_DEBUG_KMS("CRTC:%d (%s) mode:%d (%s)\n",
-		      crtc->base.id, sti_mixer_to_str(mixer),
-		      mode->base.id, mode->name);
-
-	DRM_DEBUG_KMS("%d %d %d %d %d %d %d %d %d %d 0x%x 0x%x\n",
-		      mode->vrefresh, mode->clock,
-		      mode->hdisplay,
-		      mode->hsync_start, mode->hsync_end,
-		      mode->htotal,
-		      mode->vdisplay,
-		      mode->vsync_start, mode->vsync_end,
-		      mode->vtotal, mode->type, mode->flags);
+	DRM_DEBUG_KMS("CRTC:%d (%s) mode: (%s)\n",
+		      crtc->base.id, sti_mixer_to_str(mixer), mode->name);
+
+	DRM_DEBUG_KMS(DRM_MODE_FMT "\n", DRM_MODE_ARG(mode));
 
 	if (mixer->id == STI_MIXER_MAIN) {
 		compo_clk = compo->clk_compo_main;
diff --git a/drivers/gpu/drm/sti/sti_drv.c b/drivers/gpu/drm/sti/sti_drv.c
index ac54e0f9caea..a525fd899f68 100644
--- a/drivers/gpu/drm/sti/sti_drv.c
+++ b/drivers/gpu/drm/sti/sti_drv.c
@@ -14,12 +14,12 @@
 
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_of.h>
+#include <drm/drm_probe_helper.h>
 
 #include "sti_crtc.h"
 #include "sti_drv.h"
diff --git a/drivers/gpu/drm/sti/sti_dvo.c b/drivers/gpu/drm/sti/sti_dvo.c
index b08376b7611b..b31cc2672d36 100644
--- a/drivers/gpu/drm/sti/sti_dvo.c
+++ b/drivers/gpu/drm/sti/sti_dvo.c
@@ -13,8 +13,8 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 
 #include "sti_awg_utils.h"
 #include "sti_drv.h"
@@ -277,8 +277,8 @@ static void sti_dvo_pre_enable(struct drm_bridge *bridge)
 }
 
 static void sti_dvo_set_mode(struct drm_bridge *bridge,
-			     struct drm_display_mode *mode,
-			     struct drm_display_mode *adjusted_mode)
+			     const struct drm_display_mode *mode,
+			     const struct drm_display_mode *adjusted_mode)
 {
 	struct sti_dvo *dvo = bridge->driver_private;
 	struct sti_mixer *mixer = to_sti_mixer(dvo->encoder->crtc);
diff --git a/drivers/gpu/drm/sti/sti_hda.c b/drivers/gpu/drm/sti/sti_hda.c
index 19b9b5ed1297..ff9256673fc8 100644
--- a/drivers/gpu/drm/sti/sti_hda.c
+++ b/drivers/gpu/drm/sti/sti_hda.c
@@ -12,7 +12,7 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 /* HDformatter registers */
 #define HDA_ANA_CFG                     0x0000
@@ -508,8 +508,8 @@ static void sti_hda_pre_enable(struct drm_bridge *bridge)
 }
 
 static void sti_hda_set_mode(struct drm_bridge *bridge,
-		struct drm_display_mode *mode,
-		struct drm_display_mode *adjusted_mode)
+			     const struct drm_display_mode *mode,
+			     const struct drm_display_mode *adjusted_mode)
 {
 	struct sti_hda *hda = bridge->driver_private;
 	u32 mode_idx;
diff --git a/drivers/gpu/drm/sti/sti_hdmi.c b/drivers/gpu/drm/sti/sti_hdmi.c
index ccf718404a1c..6000df624980 100644
--- a/drivers/gpu/drm/sti/sti_hdmi.c
+++ b/drivers/gpu/drm/sti/sti_hdmi.c
@@ -15,8 +15,8 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
+#include <drm/drm_probe_helper.h>
 
 #include <sound/hdmi-codec.h>
 
@@ -434,7 +434,8 @@ static int hdmi_avi_infoframe_config(struct sti_hdmi *hdmi)
 
 	DRM_DEBUG_DRIVER("\n");
 
-	ret = drm_hdmi_avi_infoframe_from_display_mode(&infoframe, mode, false);
+	ret = drm_hdmi_avi_infoframe_from_display_mode(&infoframe,
+						       hdmi->drm_connector, mode);
 	if (ret < 0) {
 		DRM_ERROR("failed to setup AVI infoframe: %d\n", ret);
 		return ret;
@@ -917,8 +918,8 @@ static void sti_hdmi_pre_enable(struct drm_bridge *bridge)
 }
 
 static void sti_hdmi_set_mode(struct drm_bridge *bridge,
-		struct drm_display_mode *mode,
-		struct drm_display_mode *adjusted_mode)
+			      const struct drm_display_mode *mode,
+			      const struct drm_display_mode *adjusted_mode)
 {
 	struct sti_hdmi *hdmi = bridge->driver_private;
 	int ret;
diff --git a/drivers/gpu/drm/sti/sti_tvout.c b/drivers/gpu/drm/sti/sti_tvout.c
index ea4a3b87fa55..c42f2fa7053c 100644
--- a/drivers/gpu/drm/sti/sti_tvout.c
+++ b/drivers/gpu/drm/sti/sti_tvout.c
@@ -15,7 +15,7 @@
 #include <linux/seq_file.h>
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_atomic_helper.h>
 
 #include "sti_crtc.h"
 #include "sti_drv.h"
diff --git a/drivers/gpu/drm/stm/drv.c b/drivers/gpu/drm/stm/drv.c
index 8dec001b9d37..0a7f933ab007 100644
--- a/drivers/gpu/drm/stm/drv.c
+++ b/drivers/gpu/drm/stm/drv.c
@@ -9,15 +9,19 @@
  */
 
 #include <linux/component.h>
+#include <linux/dma-mapping.h>
+#include <linux/module.h>
 #include <linux/of_platform.h>
 
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
-#include <drm/drm_fb_helper.h>
+#include <drm/drm_drv.h>
 #include <drm/drm_fb_cma_helper.h>
+#include <drm/drm_fb_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_probe_helper.h>
+#include <drm/drm_vblank.h>
 
 #include "ltdc.h"
 
diff --git a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
index a514b593f37c..a672b59a2226 100644
--- a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
+++ b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
@@ -215,7 +215,7 @@ static int dw_mipi_dsi_phy_init(void *priv_data)
 }
 
 static int
-dw_mipi_dsi_get_lane_mbps(void *priv_data, struct drm_display_mode *mode,
+dw_mipi_dsi_get_lane_mbps(void *priv_data, const struct drm_display_mode *mode,
 			  unsigned long mode_flags, u32 lanes, u32 format,
 			  unsigned int *lane_mbps)
 {
diff --git a/drivers/gpu/drm/stm/ltdc.c b/drivers/gpu/drm/stm/ltdc.c
index 61dd661aa0ac..b1741a9d5be2 100644
--- a/drivers/gpu/drm/stm/ltdc.c
+++ b/drivers/gpu/drm/stm/ltdc.c
@@ -10,18 +10,25 @@
 
 #include <linux/clk.h>
 #include <linux/component.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/module.h>
 #include <linux/of_address.h>
 #include <linux/of_graph.h>
+#include <linux/platform_device.h>
 #include <linux/reset.h>
 
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_bridge.h>
+#include <drm/drm_device.h>
 #include <drm/drm_fb_cma_helper.h>
+#include <drm/drm_fourcc.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_of.h>
-#include <drm/drm_bridge.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
+#include <drm/drm_vblank.h>
 
 #include <video/videomode.h>
 
@@ -691,7 +698,7 @@ static int ltdc_plane_atomic_check(struct drm_plane *plane,
 				   struct drm_plane_state *state)
 {
 	struct drm_framebuffer *fb = state->fb;
-	u32 src_x, src_y, src_w, src_h;
+	u32 src_w, src_h;
 
 	DRM_DEBUG_DRIVER("\n");
 
@@ -699,8 +706,6 @@ static int ltdc_plane_atomic_check(struct drm_plane *plane,
 		return 0;
 
 	/* convert src_ from 16:16 format */
-	src_x = state->src_x >> 16;
-	src_y = state->src_y >> 16;
 	src_w = state->src_w >> 16;
 	src_h = state->src_h >> 16;
 
diff --git a/drivers/gpu/drm/sun4i/Kconfig b/drivers/gpu/drm/sun4i/Kconfig
index c2c042287c19..1dbbc3a1b763 100644
--- a/drivers/gpu/drm/sun4i/Kconfig
+++ b/drivers/gpu/drm/sun4i/Kconfig
@@ -45,10 +45,11 @@ config DRM_SUN6I_DSI
 	default MACH_SUN8I
 	select CRC_CCITT
 	select DRM_MIPI_DSI
+	select PHY_SUN6I_MIPI_DPHY
 	help
 	  Choose this option if you want to have an Allwinner SoC with
 	  MIPI-DSI support. If M is selected the module will be called
-	  sun6i-dsi
+	  sun6i_mipi_dsi.
 
 config DRM_SUN8I_DW_HDMI
 	tristate "Support for Allwinner version of DesignWare HDMI"
diff --git a/drivers/gpu/drm/sun4i/Makefile b/drivers/gpu/drm/sun4i/Makefile
index 0eb38ac8e86e..0d04f2447b01 100644
--- a/drivers/gpu/drm/sun4i/Makefile
+++ b/drivers/gpu/drm/sun4i/Makefile
@@ -24,9 +24,6 @@ sun4i-tcon-y			+= sun4i_lvds.o
 sun4i-tcon-y			+= sun4i_tcon.o
 sun4i-tcon-y			+= sun4i_rgb.o
 
-sun6i-dsi-y			+= sun6i_mipi_dphy.o
-sun6i-dsi-y			+= sun6i_mipi_dsi.o
-
 obj-$(CONFIG_DRM_SUN4I)		+= sun4i-drm.o
 obj-$(CONFIG_DRM_SUN4I)		+= sun4i-tcon.o
 obj-$(CONFIG_DRM_SUN4I)		+= sun4i_tv.o
@@ -37,7 +34,7 @@ ifdef CONFIG_DRM_SUN4I_BACKEND
 obj-$(CONFIG_DRM_SUN4I)		+= sun4i-frontend.o
 endif
 obj-$(CONFIG_DRM_SUN4I_HDMI)	+= sun4i-drm-hdmi.o
-obj-$(CONFIG_DRM_SUN6I_DSI)	+= sun6i-dsi.o
+obj-$(CONFIG_DRM_SUN6I_DSI)	+= sun6i_mipi_dsi.o
 obj-$(CONFIG_DRM_SUN8I_DW_HDMI)	+= sun8i-drm-hdmi.o
 obj-$(CONFIG_DRM_SUN8I_MIXER)	+= sun8i-mixer.o
 obj-$(CONFIG_DRM_SUN8I_TCON_TOP) += sun8i_tcon_top.o
diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c b/drivers/gpu/drm/sun4i/sun4i_backend.c
index a021bab11a4f..4c0d51f73237 100644
--- a/drivers/gpu/drm/sun4i/sun4i_backend.c
+++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
@@ -14,10 +14,10 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include <linux/component.h>
 #include <linux/list.h>
@@ -45,28 +45,6 @@ static const u32 sunxi_rgb2yuv_coef[12] = {
 	0x000001c1, 0x00003e88, 0x00003fb8, 0x00000808
 };
 
-/*
- * These coefficients are taken from the A33 BSP from Allwinner.
- *
- * The first three values of each row are coded as 13-bit signed fixed-point
- * numbers, with 10 bits for the fractional part. The fourth value is a
- * constant coded as a 14-bit signed fixed-point number with 4 bits for the
- * fractional part.
- *
- * The values in table order give the following colorspace translation:
- * G = 1.164 * Y - 0.391 * U - 0.813 * V + 135
- * R = 1.164 * Y + 1.596 * V - 222
- * B = 1.164 * Y + 2.018 * U - 276
- *
- * This seems to be a conversion from Y[16:235] UV[16:240] to RGB[0:255],
- * following the BT601 spec.
- */
-static const u32 sunxi_bt601_yuv2rgb_coef[12] = {
-	0x000004a7, 0x00001e6f, 0x00001cbf, 0x00000877,
-	0x000004a7, 0x00000000, 0x00000662, 0x00003211,
-	0x000004a7, 0x00000812, 0x00000000, 0x00002eb1,
-};
-
 static void sun4i_backend_apply_color_correction(struct sunxi_engine *engine)
 {
 	int i;
@@ -163,7 +141,6 @@ static const uint32_t sun4i_backend_formats[] = {
 	DRM_FORMAT_ARGB1555,
 	DRM_FORMAT_ARGB4444,
 	DRM_FORMAT_ARGB8888,
-	DRM_FORMAT_BGRX8888,
 	DRM_FORMAT_RGB565,
 	DRM_FORMAT_RGB888,
 	DRM_FORMAT_RGBA4444,
@@ -245,7 +222,8 @@ static int sun4i_backend_update_yuv_format(struct sun4i_backend *backend,
 			   SUN4I_BACKEND_ATTCTL_REG0_LAY_YUVEN);
 
 	/* TODO: Add support for the multi-planar YUV formats */
-	if (format->num_planes == 1)
+	if (drm_format_info_is_yuv_packed(format) &&
+	    drm_format_info_is_yuv_sampling_422(format))
 		val |= SUN4I_BACKEND_IYUVCTL_FBFMT_PACKED_YUV422;
 	else
 		DRM_DEBUG_DRIVER("Unsupported YUV format (0x%x)\n", fmt);
@@ -1035,6 +1013,10 @@ static const struct of_device_id sun4i_backend_of_table[] = {
 		.data = &sun7i_backend_quirks,
 	},
 	{
+		.compatible = "allwinner,sun8i-a23-display-backend",
+		.data = &sun8i_a33_backend_quirks,
+	},
+	{
 		.compatible = "allwinner,sun8i-a33-display-backend",
 		.data = &sun8i_a33_backend_quirks,
 	},
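The YUV hunk above stops inferring "packed" from a plane count of one and instead asks the new drm_format_info helpers, which classify a format by its layout and subsampling. A hedged sketch of the same test in isolation:

	#include <drm/drm_fourcc.h>

	/* True for single-plane (packed) 4:2:2 YUV such as YUYV or UYVY. */
	static bool example_is_packed_yuv422(u32 fourcc)
	{
		const struct drm_format_info *info = drm_format_info(fourcc);

		return info && info->is_yuv &&
		       drm_format_info_is_yuv_packed(info) &&
		       drm_format_info_is_yuv_sampling_422(info);
	}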
diff --git a/drivers/gpu/drm/sun4i/sun4i_crtc.c b/drivers/gpu/drm/sun4i/sun4i_crtc.c
index 3eedf335a935..cdb881e34470 100644
--- a/drivers/gpu/drm/sun4i/sun4i_crtc.c
+++ b/drivers/gpu/drm/sun4i/sun4i_crtc.c
@@ -13,8 +13,8 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_modes.h>
+#include <drm/drm_probe_helper.h>
 
 #include <linux/clk-provider.h>
 #include <linux/ioport.h>
diff --git a/drivers/gpu/drm/sun4i/sun4i_drv.c b/drivers/gpu/drm/sun4i/sun4i_drv.c
index 9e4c375ccc96..3ebd9f5e2719 100644
--- a/drivers/gpu/drm/sun4i/sun4i_drv.c
+++ b/drivers/gpu/drm/sun4i/sun4i_drv.c
@@ -16,11 +16,11 @@
 #include <linux/of_reserved_mem.h>
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_cma_helper.h>
-#include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_fb_helper.h>
+#include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_of.h>
+#include <drm/drm_probe_helper.h>
 
 #include "sun4i_drv.h"
 #include "sun4i_frontend.h"
@@ -97,6 +97,7 @@ static int sun4i_drv_bind(struct device *dev)
 	}
 
 	drm_mode_config_init(drm);
+	drm->mode_config.allow_fb_modifiers = true;
 
 	ret = component_bind_all(drm->dev, drm);
 	if (ret) {
@@ -164,6 +165,7 @@ static bool sun4i_drv_node_is_frontend(struct device_node *node)
 		of_device_is_compatible(node, "allwinner,sun5i-a13-display-frontend") ||
 		of_device_is_compatible(node, "allwinner,sun6i-a31-display-frontend") ||
 		of_device_is_compatible(node, "allwinner,sun7i-a20-display-frontend") ||
+		of_device_is_compatible(node, "allwinner,sun8i-a23-display-frontend") ||
 		of_device_is_compatible(node, "allwinner,sun8i-a33-display-frontend") ||
 		of_device_is_compatible(node, "allwinner,sun9i-a80-display-frontend");
 }
@@ -403,6 +405,7 @@ static const struct of_device_id sun4i_drv_of_table[] = {
 	{ .compatible = "allwinner,sun6i-a31-display-engine" },
 	{ .compatible = "allwinner,sun6i-a31s-display-engine" },
 	{ .compatible = "allwinner,sun7i-a20-display-engine" },
+	{ .compatible = "allwinner,sun8i-a23-display-engine" },
 	{ .compatible = "allwinner,sun8i-a33-display-engine" },
 	{ .compatible = "allwinner,sun8i-a83t-display-engine" },
 	{ .compatible = "allwinner,sun8i-h3-display-engine" },
diff --git a/drivers/gpu/drm/sun4i/sun4i_frontend.c b/drivers/gpu/drm/sun4i/sun4i_frontend.c
index 1a7ebc45747e..346c8071bd38 100644
--- a/drivers/gpu/drm/sun4i/sun4i_frontend.c
+++ b/drivers/gpu/drm/sun4i/sun4i_frontend.c
@@ -10,6 +10,7 @@
 #include <linux/clk.h>
 #include <linux/component.h>
 #include <linux/module.h>
+#include <linux/of_device.h>
 #include <linux/platform_device.h>
 #include <linux/pm_runtime.h>
 #include <linux/regmap.h>
@@ -48,10 +49,38 @@ static const u32 sun4i_frontend_horz_coef[64] = {
 	0x03ff0000, 0x0000fd41, 0x01ff0000, 0x0000fe42,
 };
 
+/*
+ * These coefficients are taken from the A33 BSP from Allwinner.
+ *
+ * The first three values of each row are coded as 13-bit signed fixed-point
+ * numbers, with 10 bits for the fractional part. The fourth value is a
+ * constant coded as a 14-bit signed fixed-point number with 4 bits for the
+ * fractional part.
+ *
+ * The values in table order give the following colorspace translation:
+ * G = 1.164 * Y - 0.391 * U - 0.813 * V + 135
+ * R = 1.164 * Y + 1.596 * V - 222
+ * B = 1.164 * Y + 2.018 * U + 276
+ *
+ * This seems to be a conversion from Y[16:235] UV[16:240] to RGB[0:255],
+ * following the BT601 spec.
+ */
+const u32 sunxi_bt601_yuv2rgb_coef[12] = {
+	0x000004a7, 0x00001e6f, 0x00001cbf, 0x00000877,
+	0x000004a7, 0x00000000, 0x00000662, 0x00003211,
+	0x000004a7, 0x00000812, 0x00000000, 0x00002eb1,
+};
+EXPORT_SYMBOL(sunxi_bt601_yuv2rgb_coef);
+
 static void sun4i_frontend_scaler_init(struct sun4i_frontend *frontend)
 {
 	int i;
 
+	if (frontend->data->has_coef_access_ctrl)
+		regmap_write_bits(frontend->regs, SUN4I_FRONTEND_FRM_CTRL_REG,
+				  SUN4I_FRONTEND_FRM_CTRL_COEF_ACCESS_CTRL,
+				  SUN4I_FRONTEND_FRM_CTRL_COEF_ACCESS_CTRL);
+
 	for (i = 0; i < 32; i++) {
 		regmap_write(frontend->regs, SUN4I_FRONTEND_CH0_HORZCOEF0_REG(i),
 			     sun4i_frontend_horz_coef[2 * i]);
@@ -67,9 +96,11 @@ static void sun4i_frontend_scaler_init(struct sun4i_frontend *frontend)
 			     sun4i_frontend_vert_coef[i]);
 	}
 
-	regmap_update_bits(frontend->regs, SUN4I_FRONTEND_FRM_CTRL_REG,
-			   SUN4I_FRONTEND_FRM_CTRL_COEF_ACCESS_CTRL,
-			   SUN4I_FRONTEND_FRM_CTRL_COEF_ACCESS_CTRL);
+	if (frontend->data->has_coef_rdy)
+		regmap_write_bits(frontend->regs,
+				  SUN4I_FRONTEND_FRM_CTRL_REG,
+				  SUN4I_FRONTEND_FRM_CTRL_COEF_RDY,
+				  SUN4I_FRONTEND_FRM_CTRL_COEF_RDY);
 }
 
 int sun4i_frontend_init(struct sun4i_frontend *frontend)
@@ -84,59 +115,228 @@ void sun4i_frontend_exit(struct sun4i_frontend *frontend)
 }
 EXPORT_SYMBOL(sun4i_frontend_exit);
 
+static bool sun4i_frontend_format_chroma_requires_swap(uint32_t fmt)
+{
+	switch (fmt) {
+	case DRM_FORMAT_YVU411:
+	case DRM_FORMAT_YVU420:
+	case DRM_FORMAT_YVU422:
+	case DRM_FORMAT_YVU444:
+		return true;
+
+	default:
+		return false;
+	}
+}
+
+static bool sun4i_frontend_format_supports_tiling(uint32_t fmt)
+{
+	switch (fmt) {
+	case DRM_FORMAT_NV12:
+	case DRM_FORMAT_NV16:
+	case DRM_FORMAT_NV21:
+	case DRM_FORMAT_NV61:
+	case DRM_FORMAT_YUV411:
+	case DRM_FORMAT_YUV420:
+	case DRM_FORMAT_YUV422:
+	case DRM_FORMAT_YVU420:
+	case DRM_FORMAT_YVU422:
+	case DRM_FORMAT_YVU411:
+		return true;
+
+	default:
+		return false;
+	}
+}
+
 void sun4i_frontend_update_buffer(struct sun4i_frontend *frontend,
 				  struct drm_plane *plane)
 {
 	struct drm_plane_state *state = plane->state;
 	struct drm_framebuffer *fb = state->fb;
+	unsigned int strides[3] = {};
+
 	dma_addr_t paddr;
+	bool swap;
+
+	if (fb->modifier == DRM_FORMAT_MOD_ALLWINNER_TILED) {
+		unsigned int width = state->src_w >> 16;
+		unsigned int offset;
+
+		strides[0] = SUN4I_FRONTEND_LINESTRD_TILED(fb->pitches[0]);
+
+		/*
+		 * The X1 offset is the offset of the bottom-right point in
+		 * the end tile: the final pixel of the line (at offset
+		 * width - 1), masked to the 32-pixel tile width.
+		 */
+		offset = (width - 1) & (32 - 1);
+
+		regmap_write(frontend->regs, SUN4I_FRONTEND_TB_OFF0_REG,
+			     SUN4I_FRONTEND_TB_OFF_X1(offset));
+
+		if (fb->format->num_planes > 1) {
+			strides[1] =
+				SUN4I_FRONTEND_LINESTRD_TILED(fb->pitches[1]);
+
+			regmap_write(frontend->regs, SUN4I_FRONTEND_TB_OFF1_REG,
+				     SUN4I_FRONTEND_TB_OFF_X1(offset));
+		}
+
+		if (fb->format->num_planes > 2) {
+			strides[2] =
+				SUN4I_FRONTEND_LINESTRD_TILED(fb->pitches[2]);
+
+			regmap_write(frontend->regs, SUN4I_FRONTEND_TB_OFF2_REG,
+				     SUN4I_FRONTEND_TB_OFF_X1(offset));
+		}
+	} else {
+		strides[0] = fb->pitches[0];
+
+		if (fb->format->num_planes > 1)
+			strides[1] = fb->pitches[1];
+
+		if (fb->format->num_planes > 2)
+			strides[2] = fb->pitches[2];
+	}
 
 	/* Set the line width */
 	DRM_DEBUG_DRIVER("Frontend stride: %d bytes\n", fb->pitches[0]);
 	regmap_write(frontend->regs, SUN4I_FRONTEND_LINESTRD0_REG,
-		     fb->pitches[0]);
+		     strides[0]);
+
+	if (fb->format->num_planes > 1)
+		regmap_write(frontend->regs, SUN4I_FRONTEND_LINESTRD1_REG,
+			     strides[1]);
+
+	if (fb->format->num_planes > 2)
+		regmap_write(frontend->regs, SUN4I_FRONTEND_LINESTRD2_REG,
+			     strides[2]);
+
+	/* Some planar formats require chroma channel swapping by hand. */
+	swap = sun4i_frontend_format_chroma_requires_swap(fb->format->format);
 
 	/* Set the physical address of the buffer in memory */
 	paddr = drm_fb_cma_get_gem_addr(fb, state, 0);
 	paddr -= PHYS_OFFSET;
-	DRM_DEBUG_DRIVER("Setting buffer address to %pad\n", &paddr);
+	DRM_DEBUG_DRIVER("Setting buffer #0 address to %pad\n", &paddr);
 	regmap_write(frontend->regs, SUN4I_FRONTEND_BUF_ADDR0_REG, paddr);
+
+	if (fb->format->num_planes > 1) {
+		paddr = drm_fb_cma_get_gem_addr(fb, state, swap ? 2 : 1);
+		paddr -= PHYS_OFFSET;
+		DRM_DEBUG_DRIVER("Setting buffer #1 address to %pad\n", &paddr);
+		regmap_write(frontend->regs, SUN4I_FRONTEND_BUF_ADDR1_REG,
+			     paddr);
+	}
+
+	if (fb->format->num_planes > 2) {
+		paddr = drm_fb_cma_get_gem_addr(fb, state, swap ? 1 : 2);
+		paddr -= PHYS_OFFSET;
+		DRM_DEBUG_DRIVER("Setting buffer #2 address to %pad\n", &paddr);
+		regmap_write(frontend->regs, SUN4I_FRONTEND_BUF_ADDR2_REG,
+			     paddr);
+	}
 }
 EXPORT_SYMBOL(sun4i_frontend_update_buffer);
 
-static int sun4i_frontend_drm_format_to_input_fmt(uint32_t fmt, u32 *val)
+static int
+sun4i_frontend_drm_format_to_input_fmt(const struct drm_format_info *format,
+				       u32 *val)
 {
-	switch (fmt) {
-	case DRM_FORMAT_XRGB8888:
+	if (!format->is_yuv)
 		*val = SUN4I_FRONTEND_INPUT_FMT_DATA_FMT_RGB;
-		return 0;
-
-	default:
+	else if (drm_format_info_is_yuv_sampling_411(format))
+		*val = SUN4I_FRONTEND_INPUT_FMT_DATA_FMT_YUV411;
+	else if (drm_format_info_is_yuv_sampling_420(format))
+		*val = SUN4I_FRONTEND_INPUT_FMT_DATA_FMT_YUV420;
+	else if (drm_format_info_is_yuv_sampling_422(format))
+		*val = SUN4I_FRONTEND_INPUT_FMT_DATA_FMT_YUV422;
+	else if (drm_format_info_is_yuv_sampling_444(format))
+		*val = SUN4I_FRONTEND_INPUT_FMT_DATA_FMT_YUV444;
+	else
 		return -EINVAL;
-	}
+
+	return 0;
 }
 
-static int sun4i_frontend_drm_format_to_input_mode(uint32_t fmt, u32 *val)
+static int
+sun4i_frontend_drm_format_to_input_mode(const struct drm_format_info *format,
+					uint64_t modifier, u32 *val)
 {
-	if (drm_format_num_planes(fmt) == 1)
+	bool tiled = (modifier == DRM_FORMAT_MOD_ALLWINNER_TILED);
+
+	switch (format->num_planes) {
+	case 1:
 		*val = SUN4I_FRONTEND_INPUT_FMT_DATA_MOD_PACKED;
-	else
-		return -EINVAL;
+		return 0;
 
-	return 0;
+	case 2:
+		*val = tiled ? SUN4I_FRONTEND_INPUT_FMT_DATA_MOD_MB32_SEMIPLANAR
+			     : SUN4I_FRONTEND_INPUT_FMT_DATA_MOD_SEMIPLANAR;
+		return 0;
+
+	case 3:
+		*val = tiled ? SUN4I_FRONTEND_INPUT_FMT_DATA_MOD_MB32_PLANAR
+			     : SUN4I_FRONTEND_INPUT_FMT_DATA_MOD_PLANAR;
+		return 0;
+
+	default:
+		return -EINVAL;
+	}
 }
 
-static int sun4i_frontend_drm_format_to_input_sequence(uint32_t fmt, u32 *val)
+static int
+sun4i_frontend_drm_format_to_input_sequence(const struct drm_format_info *format,
+					    u32 *val)
 {
-	switch (fmt) {
+	/* Planar formats have an explicit input sequence. */
+	if (drm_format_info_is_yuv_planar(format)) {
+		*val = 0;
+		return 0;
+	}
+
+	switch (format->format) {
 	case DRM_FORMAT_BGRX8888:
 		*val = SUN4I_FRONTEND_INPUT_FMT_DATA_PS_BGRX;
 		return 0;
 
+	case DRM_FORMAT_NV12:
+		*val = SUN4I_FRONTEND_INPUT_FMT_DATA_PS_UV;
+		return 0;
+
+	case DRM_FORMAT_NV16:
+		*val = SUN4I_FRONTEND_INPUT_FMT_DATA_PS_UV;
+		return 0;
+
+	case DRM_FORMAT_NV21:
+		*val = SUN4I_FRONTEND_INPUT_FMT_DATA_PS_VU;
+		return 0;
+
+	case DRM_FORMAT_NV61:
+		*val = SUN4I_FRONTEND_INPUT_FMT_DATA_PS_VU;
+		return 0;
+
+	case DRM_FORMAT_UYVY:
+		*val = SUN4I_FRONTEND_INPUT_FMT_DATA_PS_UYVY;
+		return 0;
+
+	case DRM_FORMAT_VYUY:
+		*val = SUN4I_FRONTEND_INPUT_FMT_DATA_PS_VYUY;
+		return 0;
+
 	case DRM_FORMAT_XRGB8888:
 		*val = SUN4I_FRONTEND_INPUT_FMT_DATA_PS_XRGB;
 		return 0;
 
+	case DRM_FORMAT_YUYV:
+		*val = SUN4I_FRONTEND_INPUT_FMT_DATA_PS_YUYV;
+		return 0;
+
+	case DRM_FORMAT_YVYU:
+		*val = SUN4I_FRONTEND_INPUT_FMT_DATA_PS_YVYU;
+		return 0;
+
 	default:
 		return -EINVAL;
 	}
@@ -160,14 +360,32 @@ static int sun4i_frontend_drm_format_to_output_fmt(uint32_t fmt, u32 *val)
 
 static const uint32_t sun4i_frontend_formats[] = {
 	DRM_FORMAT_BGRX8888,
+	DRM_FORMAT_NV12,
+	DRM_FORMAT_NV16,
+	DRM_FORMAT_NV21,
+	DRM_FORMAT_NV61,
+	DRM_FORMAT_UYVY,
+	DRM_FORMAT_VYUY,
 	DRM_FORMAT_XRGB8888,
+	DRM_FORMAT_YUV411,
+	DRM_FORMAT_YUV420,
+	DRM_FORMAT_YUV422,
+	DRM_FORMAT_YUV444,
+	DRM_FORMAT_YUYV,
+	DRM_FORMAT_YVU411,
+	DRM_FORMAT_YVU420,
+	DRM_FORMAT_YVU422,
+	DRM_FORMAT_YVU444,
+	DRM_FORMAT_YVYU,
 };
 
 bool sun4i_frontend_format_is_supported(uint32_t fmt, uint64_t modifier)
 {
 	unsigned int i;
 
-	if (modifier != DRM_FORMAT_MOD_LINEAR)
+	if (modifier == DRM_FORMAT_MOD_ALLWINNER_TILED)
+		return sun4i_frontend_format_supports_tiling(fmt);
+	else if (modifier != DRM_FORMAT_MOD_LINEAR)
 		return false;
 
 	for (i = 0; i < ARRAY_SIZE(sun4i_frontend_formats); i++)
@@ -183,9 +401,12 @@ int sun4i_frontend_update_formats(struct sun4i_frontend *frontend,
 {
 	struct drm_plane_state *state = plane->state;
 	struct drm_framebuffer *fb = state->fb;
-	uint32_t format = fb->format->format;
+	const struct drm_format_info *format = fb->format;
+	uint64_t modifier = fb->modifier;
 	u32 out_fmt_val;
 	u32 in_fmt_val, in_mod_val, in_ps_val;
+	unsigned int i;
+	u32 bypass;
 	int ret;
 
 	ret = sun4i_frontend_drm_format_to_input_fmt(format, &in_fmt_val);
@@ -194,7 +415,8 @@ int sun4i_frontend_update_formats(struct sun4i_frontend *frontend,
 		return ret;
 	}
 
-	ret = sun4i_frontend_drm_format_to_input_mode(format, &in_mod_val);
+	ret = sun4i_frontend_drm_format_to_input_mode(format, modifier,
+						      &in_mod_val);
 	if (ret) {
 		DRM_DEBUG_DRIVER("Invalid input mode\n");
 		return ret;
@@ -216,16 +438,39 @@ int sun4i_frontend_update_formats(struct sun4i_frontend *frontend,
 	 * I have no idea what this does exactly, but it seems to be
 	 * related to the scaler FIR filter phase parameters.
 	 */
-	regmap_write(frontend->regs, SUN4I_FRONTEND_CH0_HORZPHASE_REG, 0x400);
-	regmap_write(frontend->regs, SUN4I_FRONTEND_CH1_HORZPHASE_REG, 0x400);
-	regmap_write(frontend->regs, SUN4I_FRONTEND_CH0_VERTPHASE0_REG, 0x400);
-	regmap_write(frontend->regs, SUN4I_FRONTEND_CH1_VERTPHASE0_REG, 0x400);
-	regmap_write(frontend->regs, SUN4I_FRONTEND_CH0_VERTPHASE1_REG, 0x400);
-	regmap_write(frontend->regs, SUN4I_FRONTEND_CH1_VERTPHASE1_REG, 0x400);
+	regmap_write(frontend->regs, SUN4I_FRONTEND_CH0_HORZPHASE_REG,
+		     frontend->data->ch_phase[0].horzphase);
+	regmap_write(frontend->regs, SUN4I_FRONTEND_CH1_HORZPHASE_REG,
+		     frontend->data->ch_phase[1].horzphase);
+	regmap_write(frontend->regs, SUN4I_FRONTEND_CH0_VERTPHASE0_REG,
+		     frontend->data->ch_phase[0].vertphase[0]);
+	regmap_write(frontend->regs, SUN4I_FRONTEND_CH1_VERTPHASE0_REG,
+		     frontend->data->ch_phase[1].vertphase[0]);
+	regmap_write(frontend->regs, SUN4I_FRONTEND_CH0_VERTPHASE1_REG,
+		     frontend->data->ch_phase[0].vertphase[1]);
+	regmap_write(frontend->regs, SUN4I_FRONTEND_CH1_VERTPHASE1_REG,
+		     frontend->data->ch_phase[1].vertphase[1]);
+
+	/*
+	 * Checking the input format is sufficient since we currently only
+	 * support RGB output formats to the backend. If YUV output formats
+	 * ever get supported, an YUV input and output would require bypassing
+	 * the CSC engine too.
+	 */
+	if (format->is_yuv) {
+		/* Setup the CSC engine for YUV to RGB conversion. */
+		bypass = 0;
+
+		for (i = 0; i < ARRAY_SIZE(sunxi_bt601_yuv2rgb_coef); i++)
+			regmap_write(frontend->regs,
+				     SUN4I_FRONTEND_CSC_COEF_REG(i),
+				     sunxi_bt601_yuv2rgb_coef[i]);
+	} else {
+		bypass = SUN4I_FRONTEND_BYPASS_CSC_EN;
+	}
 
 	regmap_update_bits(frontend->regs, SUN4I_FRONTEND_BYPASS_REG,
-			   SUN4I_FRONTEND_BYPASS_CSC_EN,
-			   SUN4I_FRONTEND_BYPASS_CSC_EN);
+			   SUN4I_FRONTEND_BYPASS_CSC_EN, bypass);
 
 	regmap_write(frontend->regs, SUN4I_FRONTEND_INPUT_FMT_REG,
 		     in_mod_val | in_fmt_val | in_ps_val);
@@ -321,6 +566,10 @@ static int sun4i_frontend_bind(struct device *dev, struct device *master,
 	frontend->dev = dev;
 	frontend->node = dev->of_node;
 
+	frontend->data = of_device_get_match_data(dev);
+	if (!frontend->data)
+		return -ENODEV;
+
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	regs = devm_ioremap_resource(dev, res);
 	if (IS_ERR(regs))
@@ -433,8 +682,51 @@ static const struct dev_pm_ops sun4i_frontend_pm_ops = {
 	.runtime_suspend	= sun4i_frontend_runtime_suspend,
 };
 
+static const struct sun4i_frontend_data sun4i_a10_frontend = {
+	.ch_phase		= {
+		{
+			.horzphase = 0,
+			.vertphase = { 0, 0 },
+		},
+		{
+			.horzphase = 0xfc000,
+			.vertphase = { 0xfc000, 0xfc000 },
+		},
+	},
+	.has_coef_rdy		= true,
+};
+
+static const struct sun4i_frontend_data sun8i_a33_frontend = {
+	.ch_phase		= {
+		{
+			.horzphase = 0x400,
+			.vertphase = { 0x400, 0x400 },
+		},
+		{
+			.horzphase = 0x400,
+			.vertphase = { 0x400, 0x400 },
+		},
+	},
+	.has_coef_access_ctrl	= true,
+};
+
 const struct of_device_id sun4i_frontend_of_table[] = {
-	{ .compatible = "allwinner,sun8i-a33-display-frontend" },
+	{
+		.compatible = "allwinner,sun4i-a10-display-frontend",
+		.data = &sun4i_a10_frontend
+	},
+	{
+		.compatible = "allwinner,sun7i-a20-display-frontend",
+		.data = &sun4i_a10_frontend
+	},
+	{
+		.compatible = "allwinner,sun8i-a23-display-frontend",
+		.data = &sun8i_a33_frontend
+	},
+	{
+		.compatible = "allwinner,sun8i-a33-display-frontend",
+		.data = &sun8i_a33_frontend
+	},
 	{ }
 };
 EXPORT_SYMBOL(sun4i_frontend_of_table);
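One subtlety in sun4i_frontend_update_buffer() above is the chroma swap: DRM's YVU formats carry Cr in plane 1 and Cb in plane 2, while the hardware expects Cb in its second buffer, so the two chroma plane indices are exchanged before BUF_ADDR1/BUF_ADDR2 are programmed. A hedged sketch of just that index mapping:

	/*
	 * Map a hardware buffer slot (0 = luma, 1 = first chroma, 2 = second
	 * chroma) to the DRM framebuffer plane that should feed it.
	 */
	static unsigned int example_fb_plane_for_slot(bool swap, unsigned int slot)
	{
		if (!swap || slot == 0)
			return slot;		/* luma never moves */

		return (slot == 1) ? 2 : 1;	/* exchange Cb and Cr */
	}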
diff --git a/drivers/gpu/drm/sun4i/sun4i_frontend.h b/drivers/gpu/drm/sun4i/sun4i_frontend.h
index ad146e8d8d70..0c382c1ddb0f 100644
--- a/drivers/gpu/drm/sun4i/sun4i_frontend.h
+++ b/drivers/gpu/drm/sun4i/sun4i_frontend.h
@@ -22,12 +22,49 @@
 #define SUN4I_FRONTEND_BYPASS_CSC_EN			BIT(1)
 
 #define SUN4I_FRONTEND_BUF_ADDR0_REG		0x020
+#define SUN4I_FRONTEND_BUF_ADDR1_REG		0x024
+#define SUN4I_FRONTEND_BUF_ADDR2_REG		0x028
+
+#define SUN4I_FRONTEND_TB_OFF0_REG		0x030
+#define SUN4I_FRONTEND_TB_OFF1_REG		0x034
+#define SUN4I_FRONTEND_TB_OFF2_REG		0x038
+#define SUN4I_FRONTEND_TB_OFF_X1(x1)			((x1) << 16)
+#define SUN4I_FRONTEND_TB_OFF_Y0(y0)			((y0) << 8)
+#define SUN4I_FRONTEND_TB_OFF_X0(x0)			(x0)
 
 #define SUN4I_FRONTEND_LINESTRD0_REG		0x040
+#define SUN4I_FRONTEND_LINESTRD1_REG		0x044
+#define SUN4I_FRONTEND_LINESTRD2_REG		0x048
+
+/*
+ * In tiled mode, the stride is defined as the distance between the start of
+ * the last line of the current tile and the start of the first line in the
+ * next vertical tile.
+ *
+ * Tiles are represented in row-major order, so the last line of the current
+ * tile starts at 31 * 32 (31 lines of 32 columns), the next vertical tile
+ * starts at 32-bit-aligned-width * 32, and the distance is
+ * 32 * (32-bit-aligned-width - 31).
+ */
 
 #define SUN4I_FRONTEND_INPUT_FMT_REG		0x04c
+#define SUN4I_FRONTEND_INPUT_FMT_DATA_MOD_PLANAR	(0 << 8)
 #define SUN4I_FRONTEND_INPUT_FMT_DATA_MOD_PACKED	(1 << 8)
+#define SUN4I_FRONTEND_INPUT_FMT_DATA_MOD_SEMIPLANAR	(2 << 8)
+#define SUN4I_FRONTEND_INPUT_FMT_DATA_MOD_MB32_PLANAR	(4 << 8)
+#define SUN4I_FRONTEND_INPUT_FMT_DATA_MOD_MB32_SEMIPLANAR (6 << 8)
+#define SUN4I_FRONTEND_INPUT_FMT_DATA_FMT_YUV444	(0 << 4)
+#define SUN4I_FRONTEND_INPUT_FMT_DATA_FMT_YUV422	(1 << 4)
+#define SUN4I_FRONTEND_INPUT_FMT_DATA_FMT_YUV420	(2 << 4)
+#define SUN4I_FRONTEND_INPUT_FMT_DATA_FMT_YUV411	(3 << 4)
 #define SUN4I_FRONTEND_INPUT_FMT_DATA_FMT_RGB		(5 << 4)
+#define SUN4I_FRONTEND_INPUT_FMT_DATA_PS_UYVY		0
+#define SUN4I_FRONTEND_INPUT_FMT_DATA_PS_YUYV		1
+#define SUN4I_FRONTEND_INPUT_FMT_DATA_PS_VYUY		2
+#define SUN4I_FRONTEND_INPUT_FMT_DATA_PS_YVYU		3
+#define SUN4I_FRONTEND_INPUT_FMT_DATA_PS_UV		0
+#define SUN4I_FRONTEND_INPUT_FMT_DATA_PS_VU		1
 #define SUN4I_FRONTEND_INPUT_FMT_DATA_PS_BGRX		0
 #define SUN4I_FRONTEND_INPUT_FMT_DATA_PS_XRGB		1
 
@@ -35,6 +72,8 @@
 #define SUN4I_FRONTEND_OUTPUT_FMT_DATA_FMT_BGRX8888	1
 #define SUN4I_FRONTEND_OUTPUT_FMT_DATA_FMT_XRGB8888	2
 
+#define SUN4I_FRONTEND_CSC_COEF_REG(c)		(0x070 + (0x4 * (c)))
+
 #define SUN4I_FRONTEND_CH0_INSIZE_REG		0x100
 #define SUN4I_FRONTEND_INSIZE(h, w)			((((h) - 1) << 16) | (((w) - 1)))
 
@@ -73,6 +112,16 @@ struct drm_plane;
 struct regmap;
 struct reset_control;
 
+struct sun4i_frontend_data {
+	bool	has_coef_access_ctrl;
+	bool	has_coef_rdy;
+
+	struct {
+		u32	horzphase;
+		u32	vertphase[2];
+	} ch_phase[2];
+};
+
 struct sun4i_frontend {
 	struct list_head	list;
 	struct device		*dev;
@@ -83,9 +132,12 @@ struct sun4i_frontend {
 	struct clk		*ram_clk;
 	struct regmap		*regs;
 	struct reset_control	*reset;
+
+	const struct sun4i_frontend_data	*data;
 };
 
 extern const struct of_device_id sun4i_frontend_of_table[];
+extern const u32 sunxi_bt601_yuv2rgb_coef[12];
 
 int sun4i_frontend_init(struct sun4i_frontend *frontend);
 void sun4i_frontend_exit(struct sun4i_frontend *frontend);
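The tiled-stride macro above is plain arithmetic once the pitch is known; a compile-time check of one assumed case, a plane whose tiled pitch is 256 (already 32-aligned):

	#define EXAMPLE_LINESTRD_TILED(stride)	(((stride) - 31) * 32)

	/* (256 - 31) * 32 = 225 * 32 = 7200 units to the next vertical tile. */
	_Static_assert(EXAMPLE_LINESTRD_TILED(256) == 7200,
		       "worked example of the tiled stride");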
diff --git a/drivers/gpu/drm/sun4i/sun4i_hdmi_enc.c b/drivers/gpu/drm/sun4i/sun4i_hdmi_enc.c
index 416da5376701..d18862629301 100644
--- a/drivers/gpu/drm/sun4i/sun4i_hdmi_enc.c
+++ b/drivers/gpu/drm/sun4i/sun4i_hdmi_enc.c
@@ -11,7 +11,7 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_encoder.h>
 #include <drm/drm_of.h>
@@ -52,7 +52,8 @@ static int sun4i_hdmi_setup_avi_infoframes(struct sun4i_hdmi *hdmi,
 	u8 buffer[17];
 	int i, ret;
 
-	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
+	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame,
+						       &hdmi->connector, mode);
 	if (ret < 0) {
 		DRM_ERROR("Failed to get infoframes from mode\n");
 		return ret;
diff --git a/drivers/gpu/drm/sun4i/sun4i_layer.c b/drivers/gpu/drm/sun4i/sun4i_layer.c
index 29631e0efde3..a514fe88d441 100644
--- a/drivers/gpu/drm/sun4i/sun4i_layer.c
+++ b/drivers/gpu/drm/sun4i/sun4i_layer.c
@@ -114,6 +114,18 @@ static void sun4i_backend_layer_atomic_update(struct drm_plane *plane,
 	sun4i_backend_layer_enable(backend, layer->id, true);
 }
 
+static bool sun4i_layer_format_mod_supported(struct drm_plane *plane,
+					     uint32_t format, uint64_t modifier)
+{
+	struct sun4i_layer *layer = plane_to_sun4i_layer(plane);
+
+	if (IS_ERR_OR_NULL(layer->backend->frontend))
+		return sun4i_backend_format_is_supported(format, modifier);
+
+	return sun4i_backend_format_is_supported(format, modifier) ||
+	       sun4i_frontend_format_is_supported(format, modifier);
+}
+
 static const struct drm_plane_helper_funcs sun4i_backend_layer_helper_funcs = {
 	.prepare_fb	= drm_gem_fb_prepare_fb,
 	.atomic_disable	= sun4i_backend_layer_atomic_disable,
@@ -127,6 +139,7 @@ static const struct drm_plane_funcs sun4i_backend_layer_funcs = {
 	.disable_plane		= drm_atomic_helper_disable_plane,
 	.reset			= sun4i_backend_layer_reset,
 	.update_plane		= drm_atomic_helper_update_plane,
+	.format_mod_supported	= sun4i_layer_format_mod_supported,
 };
 
 static const uint32_t sun4i_layer_formats[] = {
@@ -138,17 +151,53 @@ static const uint32_t sun4i_layer_formats[] = {
 	DRM_FORMAT_RGBA4444,
 	DRM_FORMAT_RGB888,
 	DRM_FORMAT_RGB565,
+	DRM_FORMAT_NV12,
+	DRM_FORMAT_NV16,
+	DRM_FORMAT_NV21,
+	DRM_FORMAT_NV61,
 	DRM_FORMAT_UYVY,
 	DRM_FORMAT_VYUY,
 	DRM_FORMAT_XRGB8888,
+	DRM_FORMAT_YUV411,
+	DRM_FORMAT_YUV420,
+	DRM_FORMAT_YUV422,
+	DRM_FORMAT_YUV444,
 	DRM_FORMAT_YUYV,
+	DRM_FORMAT_YVU411,
+	DRM_FORMAT_YVU420,
+	DRM_FORMAT_YVU422,
+	DRM_FORMAT_YVU444,
 	DRM_FORMAT_YVYU,
 };
 
+static const uint32_t sun4i_backend_layer_formats[] = {
+	DRM_FORMAT_ARGB8888,
+	DRM_FORMAT_ARGB4444,
+	DRM_FORMAT_ARGB1555,
+	DRM_FORMAT_RGBA5551,
+	DRM_FORMAT_RGBA4444,
+	DRM_FORMAT_RGB888,
+	DRM_FORMAT_RGB565,
+	DRM_FORMAT_UYVY,
+	DRM_FORMAT_VYUY,
+	DRM_FORMAT_XRGB8888,
+	DRM_FORMAT_YUYV,
+	DRM_FORMAT_YVYU,
+};
+
+static const uint64_t sun4i_layer_modifiers[] = {
+	DRM_FORMAT_MOD_LINEAR,
+	DRM_FORMAT_MOD_ALLWINNER_TILED,
+	DRM_FORMAT_MOD_INVALID
+};
+
 static struct sun4i_layer *sun4i_layer_init_one(struct drm_device *drm,
 						struct sun4i_backend *backend,
 						enum drm_plane_type type)
 {
+	const uint64_t *modifiers = sun4i_layer_modifiers;
+	const uint32_t *formats = sun4i_layer_formats;
+	unsigned int formats_len = ARRAY_SIZE(sun4i_layer_formats);
 	struct sun4i_layer *layer;
 	int ret;
 
@@ -156,12 +205,19 @@ static struct sun4i_layer *sun4i_layer_init_one(struct drm_device *drm,
 	if (!layer)
 		return ERR_PTR(-ENOMEM);
 
+	layer->backend = backend;
+
+	if (IS_ERR_OR_NULL(backend->frontend)) {
+		formats = sun4i_backend_layer_formats;
+		formats_len = ARRAY_SIZE(sun4i_backend_layer_formats);
+		modifiers = NULL;
+	}
+
 	/* possible crtcs are set later */
 	ret = drm_universal_plane_init(drm, &layer->plane, 0,
 				       &sun4i_backend_layer_funcs,
-				       sun4i_layer_formats,
-				       ARRAY_SIZE(sun4i_layer_formats),
-				       NULL, type, NULL);
+				       formats, formats_len,
+				       modifiers, type, NULL);
 	if (ret) {
 		dev_err(drm->dev, "Couldn't initialize layer\n");
 		return ERR_PTR(ret);
@@ -169,7 +225,6 @@ static struct sun4i_layer *sun4i_layer_init_one(struct drm_device *drm,
 
 	drm_plane_helper_add(&layer->plane,
 			     &sun4i_backend_layer_helper_funcs);
-	layer->backend = backend;
 
 	drm_plane_create_alpha_property(&layer->plane);
 	drm_plane_create_zpos_property(&layer->plane, 0, 0,
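With this change, planes that can be serviced by the frontend advertise both modifiers at init time and filter the (format, modifier) pairs through .format_mod_supported, which is what fills the IN_FORMATS blob userspace queries. A hedged sketch of the pattern, simplified relative to the driver (which also consults the backend for linear formats):

	static const uint64_t example_modifiers[] = {
		DRM_FORMAT_MOD_LINEAR,
		DRM_FORMAT_MOD_ALLWINNER_TILED,
		DRM_FORMAT_MOD_INVALID,		/* sentinel terminating the list */
	};

	static bool example_format_mod_supported(struct drm_plane *plane,
						 uint32_t format, uint64_t modifier)
	{
		/* Tiled scanout is only possible via the frontend. */
		if (modifier == DRM_FORMAT_MOD_ALLWINNER_TILED)
			return sun4i_frontend_format_is_supported(format, modifier);

		return modifier == DRM_FORMAT_MOD_LINEAR;
	}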
diff --git a/drivers/gpu/drm/sun4i/sun4i_lvds.c b/drivers/gpu/drm/sun4i/sun4i_lvds.c
index e7eb0d1e17be..147b97ed1a09 100644
--- a/drivers/gpu/drm/sun4i/sun4i_lvds.c
+++ b/drivers/gpu/drm/sun4i/sun4i_lvds.c
@@ -8,9 +8,9 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 
 #include "sun4i_crtc.h"
 #include "sun4i_tcon.h"
diff --git a/drivers/gpu/drm/sun4i/sun4i_rgb.c b/drivers/gpu/drm/sun4i/sun4i_rgb.c
index f4a22689eb54..cae19e7bbeaa 100644
--- a/drivers/gpu/drm/sun4i/sun4i_rgb.c
+++ b/drivers/gpu/drm/sun4i/sun4i_rgb.c
@@ -14,9 +14,9 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 
 #include "sun4i_crtc.h"
 #include "sun4i_tcon.h"
diff --git a/drivers/gpu/drm/sun4i/sun4i_tcon.c b/drivers/gpu/drm/sun4i/sun4i_tcon.c
index cf45d0f940f9..7136fc91c603 100644
--- a/drivers/gpu/drm/sun4i/sun4i_tcon.c
+++ b/drivers/gpu/drm/sun4i/sun4i_tcon.c
@@ -14,11 +14,11 @@
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_connector.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_encoder.h>
 #include <drm/drm_modes.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 
 #include <uapi/drm/drm_mode.h>
 
@@ -1496,6 +1496,7 @@ const struct of_device_id sun4i_tcon_of_table[] = {
 	{ .compatible = "allwinner,sun6i-a31-tcon", .data = &sun6i_a31_quirks },
 	{ .compatible = "allwinner,sun6i-a31s-tcon", .data = &sun6i_a31s_quirks },
 	{ .compatible = "allwinner,sun7i-a20-tcon", .data = &sun7i_a20_quirks },
+	{ .compatible = "allwinner,sun8i-a23-tcon", .data = &sun8i_a33_quirks },
 	{ .compatible = "allwinner,sun8i-a33-tcon", .data = &sun8i_a33_quirks },
 	{ .compatible = "allwinner,sun8i-a83t-tcon-lcd", .data = &sun8i_a83t_lcd_quirks },
 	{ .compatible = "allwinner,sun8i-a83t-tcon-tv", .data = &sun8i_a83t_tv_quirks },
diff --git a/drivers/gpu/drm/sun4i/sun4i_tv.c b/drivers/gpu/drm/sun4i/sun4i_tv.c
index 1a838d208211..e8700a362064 100644
--- a/drivers/gpu/drm/sun4i/sun4i_tv.c
+++ b/drivers/gpu/drm/sun4i/sun4i_tv.c
@@ -18,9 +18,9 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 
 #include "sun4i_crtc.h"
 #include "sun4i_drv.h"
diff --git a/drivers/gpu/drm/sun4i/sun6i_drc.c b/drivers/gpu/drm/sun4i/sun6i_drc.c
index 88eb268fdf73..442094a4af7a 100644
--- a/drivers/gpu/drm/sun4i/sun6i_drc.c
+++ b/drivers/gpu/drm/sun4i/sun6i_drc.c
@@ -101,6 +101,7 @@ static int sun6i_drc_remove(struct platform_device *pdev)
 static const struct of_device_id sun6i_drc_of_table[] = {
 	{ .compatible = "allwinner,sun6i-a31-drc" },
 	{ .compatible = "allwinner,sun6i-a31s-drc" },
+	{ .compatible = "allwinner,sun8i-a23-drc" },
 	{ .compatible = "allwinner,sun8i-a33-drc" },
 	{ .compatible = "allwinner,sun9i-a80-drc" },
 	{ }
diff --git a/drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c b/drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c
index e3b34a345546..318994cd1b85 100644
--- a/drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c
+++ b/drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c
@@ -16,12 +16,13 @@
 #include <linux/slab.h>
 
 #include <linux/phy/phy.h>
+#include <linux/phy/phy-mipi-dphy.h>
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_mipi_dsi.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 
 #include "sun4i_drv.h"
 #include "sun6i_mipi_dsi.h"
@@ -616,6 +617,8 @@ static void sun6i_dsi_encoder_enable(struct drm_encoder *encoder)
 	struct drm_display_mode *mode = &encoder->crtc->state->adjusted_mode;
 	struct sun6i_dsi *dsi = encoder_to_sun6i_dsi(encoder);
 	struct mipi_dsi_device *device = dsi->device;
+	union phy_configure_opts opts = { 0 };
+	struct phy_configure_opts_mipi_dphy *cfg = &opts.mipi_dphy;
 	u16 delay;
 
 	DRM_DEBUG_DRIVER("Enabling DSI output\n");
@@ -634,8 +637,15 @@ static void sun6i_dsi_encoder_enable(struct drm_encoder *encoder)
 	sun6i_dsi_setup_format(dsi, mode);
 	sun6i_dsi_setup_timings(dsi, mode);
 
-	sun6i_dphy_init(dsi->dphy, device->lanes);
-	sun6i_dphy_power_on(dsi->dphy, device->lanes);
+	phy_init(dsi->dphy);
+
+	phy_mipi_dphy_get_default_config(mode->clock * 1000,
+					 mipi_dsi_pixel_format_to_bpp(device->format),
+					 device->lanes, cfg);
+
+	phy_set_mode(dsi->dphy, PHY_MODE_MIPI_DPHY);
+	phy_configure(dsi->dphy, &opts);
+	phy_power_on(dsi->dphy);
 
 	if (!IS_ERR(dsi->panel))
 		drm_panel_prepare(dsi->panel);
@@ -673,8 +683,8 @@ static void sun6i_dsi_encoder_disable(struct drm_encoder *encoder)
 		drm_panel_unprepare(dsi->panel);
 	}
 
-	sun6i_dphy_power_off(dsi->dphy);
-	sun6i_dphy_exit(dsi->dphy);
+	phy_power_off(dsi->dphy);
+	phy_exit(dsi->dphy);
 
 	pm_runtime_put(dsi->dev);
 }
@@ -967,7 +977,6 @@ static const struct component_ops sun6i_dsi_ops = {
 static int sun6i_dsi_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
-	struct device_node *dphy_node;
 	struct sun6i_dsi *dsi;
 	struct resource *res;
 	void __iomem *base;
@@ -1013,11 +1022,10 @@ static int sun6i_dsi_probe(struct platform_device *pdev)
 	 */
 	clk_set_rate_exclusive(dsi->mod_clk, 297000000);
 
-	dphy_node = of_parse_phandle(dev->of_node, "phys", 0);
-	ret = sun6i_dphy_probe(dsi, dphy_node);
-	of_node_put(dphy_node);
-	if (ret) {
+	dsi->dphy = devm_phy_get(dev, "dphy");
+	if (IS_ERR(dsi->dphy)) {
 		dev_err(dev, "Couldn't get the MIPI D-PHY\n");
+		ret = PTR_ERR(dsi->dphy);
 		goto err_unprotect_clk;
 	}
 
@@ -1026,7 +1034,7 @@ static int sun6i_dsi_probe(struct platform_device *pdev)
 	ret = mipi_dsi_host_register(&dsi->host);
 	if (ret) {
 		dev_err(dev, "Couldn't register MIPI-DSI host\n");
-		goto err_remove_phy;
+		goto err_pm_disable;
 	}
 
 	ret = component_add(&pdev->dev, &sun6i_dsi_ops);
@@ -1039,9 +1047,8 @@ static int sun6i_dsi_probe(struct platform_device *pdev)
 
 err_remove_dsi_host:
 	mipi_dsi_host_unregister(&dsi->host);
-err_remove_phy:
+err_pm_disable:
 	pm_runtime_disable(dev);
-	sun6i_dphy_remove(dsi);
 err_unprotect_clk:
 	clk_rate_exclusive_put(dsi->mod_clk);
 	return ret;
@@ -1055,7 +1062,6 @@ static int sun6i_dsi_remove(struct platform_device *pdev)
 	component_del(&pdev->dev, &sun6i_dsi_ops);
 	mipi_dsi_host_unregister(&dsi->host);
 	pm_runtime_disable(dev);
-	sun6i_dphy_remove(dsi);
 	clk_rate_exclusive_put(dsi->mod_clk);
 
 	return 0;
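The encoder now drives the D-PHY through the generic PHY framework (hence the select PHY_SUN6I_MIPI_DPHY added to Kconfig earlier). The enable path above discards the return values for brevity; a hedged sketch of the same sequence with the error handling spelled out:

	#include <linux/phy/phy.h>
	#include <linux/phy/phy-mipi-dphy.h>

	static int example_dphy_enable(struct phy *dphy, unsigned long px_clk_hz,
				       unsigned int bpp, unsigned int lanes)
	{
		union phy_configure_opts opts = { };
		int ret;

		ret = phy_init(dphy);
		if (ret)
			return ret;

		ret = phy_mipi_dphy_get_default_config(px_clk_hz, bpp, lanes,
						       &opts.mipi_dphy);
		if (!ret)
			ret = phy_set_mode(dphy, PHY_MODE_MIPI_DPHY);
		if (!ret)
			ret = phy_configure(dphy, &opts);
		if (!ret)
			ret = phy_power_on(dphy);
		if (ret)
			phy_exit(dphy);

		return ret;
	}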
diff --git a/drivers/gpu/drm/sun4i/sun6i_mipi_dsi.h b/drivers/gpu/drm/sun4i/sun6i_mipi_dsi.h
index dbbc5b3ecbda..a07090579f84 100644
--- a/drivers/gpu/drm/sun4i/sun6i_mipi_dsi.h
+++ b/drivers/gpu/drm/sun4i/sun6i_mipi_dsi.h
@@ -13,13 +13,6 @@
 #include <drm/drm_encoder.h>
 #include <drm/drm_mipi_dsi.h>
 
-struct sun6i_dphy {
-	struct clk		*bus_clk;
-	struct clk		*mod_clk;
-	struct regmap		*regs;
-	struct reset_control	*reset;
-};
-
 struct sun6i_dsi {
 	struct drm_connector	connector;
 	struct drm_encoder	encoder;
@@ -29,7 +22,7 @@ struct sun6i_dsi {
 	struct clk		*mod_clk;
 	struct regmap		*regs;
 	struct reset_control	*reset;
-	struct sun6i_dphy	*dphy;
+	struct phy		*dphy;
 
 	struct device		*dev;
 	struct sun4i_drv	*drv;
@@ -52,12 +45,4 @@ static inline struct sun6i_dsi *encoder_to_sun6i_dsi(const struct drm_encoder *e
 	return container_of(encoder, struct sun6i_dsi, encoder);
 };
 
-int sun6i_dphy_probe(struct sun6i_dsi *dsi, struct device_node *node);
-int sun6i_dphy_remove(struct sun6i_dsi *dsi);
-
-int sun6i_dphy_init(struct sun6i_dphy *dphy, unsigned int lanes);
-int sun6i_dphy_power_on(struct sun6i_dphy *dphy, unsigned int lanes);
-int sun6i_dphy_power_off(struct sun6i_dphy *dphy);
-int sun6i_dphy_exit(struct sun6i_dphy *dphy);
-
 #endif /* _SUN6I_MIPI_DSI_H_ */
diff --git a/drivers/gpu/drm/sun4i/sun8i_mixer.c b/drivers/gpu/drm/sun4i/sun8i_mixer.c
index 44a9ba7d8433..30a2eff55687 100644
--- a/drivers/gpu/drm/sun4i/sun8i_mixer.c
+++ b/drivers/gpu/drm/sun4i/sun8i_mixer.c
@@ -14,10 +14,10 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include <linux/component.h>
 #include <linux/dma-mapping.h>
diff --git a/drivers/gpu/drm/sun4i/sun8i_ui_layer.c b/drivers/gpu/drm/sun4i/sun8i_ui_layer.c
index 18534263a05d..a342ec8b131e 100644
--- a/drivers/gpu/drm/sun4i/sun8i_ui_layer.c
+++ b/drivers/gpu/drm/sun4i/sun8i_ui_layer.c
@@ -16,11 +16,11 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drmP.h>
 
 #include "sun8i_ui_layer.h"
diff --git a/drivers/gpu/drm/sun4i/sun8i_vi_layer.c b/drivers/gpu/drm/sun4i/sun8i_vi_layer.c
index 87be898f9b7a..8a0616238467 100644
--- a/drivers/gpu/drm/sun4i/sun8i_vi_layer.c
+++ b/drivers/gpu/drm/sun4i/sun8i_vi_layer.c
@@ -10,11 +10,11 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drmP.h>
 
 #include "sun8i_vi_layer.h"
diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile
index 2e0d6213f6bc..33c463e8d49f 100644
--- a/drivers/gpu/drm/tegra/Makefile
+++ b/drivers/gpu/drm/tegra/Makefile
@@ -10,6 +10,7 @@ tegra-drm-y := \
 	dc.o \
 	output.o \
 	rgb.o \
+	hda.o \
 	hdmi.o \
 	mipi-phy.o \
 	dsi.o \
diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index 4b70ce664c41..0c5f1e6a0446 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -92,10 +92,6 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
 		return -ENOMEM;
 
 	if (iommu_present(&platform_bus_type)) {
-		u64 carveout_start, carveout_end, gem_start, gem_end;
-		struct iommu_domain_geometry *geometry;
-		unsigned long order;
-
 		tegra->domain = iommu_domain_alloc(&platform_bus_type);
 		if (!tegra->domain) {
 			err = -ENOMEM;
@@ -105,27 +101,6 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
 		err = iova_cache_get();
 		if (err < 0)
 			goto domain;
-
-		geometry = &tegra->domain->geometry;
-		gem_start = geometry->aperture_start;
-		gem_end = geometry->aperture_end - CARVEOUT_SZ;
-		carveout_start = gem_end + 1;
-		carveout_end = geometry->aperture_end;
-
-		order = __ffs(tegra->domain->pgsize_bitmap);
-		init_iova_domain(&tegra->carveout.domain, 1UL << order,
-				 carveout_start >> order);
-
-		tegra->carveout.shift = iova_shift(&tegra->carveout.domain);
-		tegra->carveout.limit = carveout_end >> tegra->carveout.shift;
-
-		drm_mm_init(&tegra->mm, gem_start, gem_end - gem_start + 1);
-		mutex_init(&tegra->mm_lock);
-
-		DRM_DEBUG("IOMMU apertures:\n");
-		DRM_DEBUG("  GEM: %#llx-%#llx\n", gem_start, gem_end);
-		DRM_DEBUG("  Carveout: %#llx-%#llx\n", carveout_start,
-			  carveout_end);
 	}
 
 	mutex_init(&tegra->clients_lock);
@@ -159,6 +134,36 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
 	if (err < 0)
 		goto fbdev;
 
+	if (tegra->domain) {
+		u64 carveout_start, carveout_end, gem_start, gem_end;
+		u64 dma_mask = dma_get_mask(&device->dev);
+		dma_addr_t start, end;
+		unsigned long order;
+
+		start = tegra->domain->geometry.aperture_start & dma_mask;
+		end = tegra->domain->geometry.aperture_end & dma_mask;
+
+		gem_start = start;
+		gem_end = end - CARVEOUT_SZ;
+		carveout_start = gem_end + 1;
+		carveout_end = end;
+
+		order = __ffs(tegra->domain->pgsize_bitmap);
+		init_iova_domain(&tegra->carveout.domain, 1UL << order,
+				 carveout_start >> order);
+
+		tegra->carveout.shift = iova_shift(&tegra->carveout.domain);
+		tegra->carveout.limit = carveout_end >> tegra->carveout.shift;
+
+		drm_mm_init(&tegra->mm, gem_start, gem_end - gem_start + 1);
+		mutex_init(&tegra->mm_lock);
+
+		DRM_DEBUG("IOMMU apertures:\n");
+		DRM_DEBUG("  GEM: %#llx-%#llx\n", gem_start, gem_end);
+		DRM_DEBUG("  Carveout: %#llx-%#llx\n", carveout_start,
+			  carveout_end);
+	}
+
 	if (tegra->hub) {
 		err = tegra_display_hub_prepare(tegra->hub);
 		if (err < 0)
@@ -1041,6 +1046,7 @@ int tegra_drm_register_client(struct tegra_drm *tegra,
 {
 	mutex_lock(&tegra->clients_lock);
 	list_add_tail(&client->list, &tegra->clients);
+	client->drm = tegra;
 	mutex_unlock(&tegra->clients_lock);
 
 	return 0;
@@ -1051,6 +1057,7 @@ int tegra_drm_unregister_client(struct tegra_drm *tegra,
 {
 	mutex_lock(&tegra->clients_lock);
 	list_del_init(&client->list);
+	client->drm = NULL;
 	mutex_unlock(&tegra->clients_lock);
 
 	return 0;
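Deferring the aperture setup until later in the load sequence means dma_get_mask() reflects the device's final DMA mask, and clamping both aperture ends with it keeps every GEM and carveout address inside what the device can actually reach. A compile-time worked example under assumed values (32-bit mask, unrestricted IOMMU aperture, 64 MiB carveout; none of these are taken from the driver):

	#include <linux/sizes.h>

	#define EXAMPLE_DMA_MASK	0xffffffffULL		/* 32-bit device */
	#define EXAMPLE_APERTURE_END	0xffffffffffffffffULL	/* unrestricted IOMMU */
	#define EXAMPLE_CARVEOUT_SZ	SZ_64M

	/* gem_end = (aperture_end & dma_mask) - CARVEOUT_SZ */
	_Static_assert((EXAMPLE_APERTURE_END & EXAMPLE_DMA_MASK) -
		       EXAMPLE_CARVEOUT_SZ == 0xfbffffffULL,
		       "aperture clamped to the DMA mask, minus the carveout");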
diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
index 1012335bb489..70154c253d45 100644
--- a/drivers/gpu/drm/tegra/drm.h
+++ b/drivers/gpu/drm/tegra/drm.h
@@ -17,11 +17,11 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_encoder.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_fixed.h>
+#include <drm/drm_probe_helper.h>
 
 #include "gem.h"
 #include "hub.h"
@@ -88,6 +88,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 struct tegra_drm_client {
 	struct host1x_client base;
 	struct list_head list;
+	struct tegra_drm *drm;
 
 	unsigned int version;
 	const struct tegra_drm_client_ops *ops;
@@ -124,7 +125,7 @@ struct tegra_output {
 	struct drm_panel *panel;
 	struct i2c_adapter *ddc;
 	const struct edid *edid;
-	struct cec_notifier *notifier;
+	struct cec_notifier *cec;
 	unsigned int hpd_irq;
 	int hpd_gpio;
 	enum of_gpio_flags hpd_gpio_flags;
diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
index b947e82bbeb1..0a4ce05e00ab 100644
--- a/drivers/gpu/drm/tegra/fb.c
+++ b/drivers/gpu/drm/tegra/fb.c
@@ -15,6 +15,7 @@
 #include "drm.h"
 #include "gem.h"
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_modeset_helper.h>
 
 #ifdef CONFIG_DRM_FBDEV_EMULATION
 static inline struct tegra_fbdev *to_tegra_fbdev(struct drm_fb_helper *helper)
@@ -255,7 +256,6 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
 	helper->fbdev = info;
 
 	info->par = helper;
-	info->flags = FBINFO_FLAG_DEFAULT;
 	info->fbops = &tegra_fb_ops;
 
 	drm_fb_helper_fill_fix(info, fb->pitches[0], fb->format->depth);
diff --git a/drivers/gpu/drm/tegra/hda.c b/drivers/gpu/drm/tegra/hda.c
new file mode 100644
index 000000000000..94245a18a043
--- /dev/null
+++ b/drivers/gpu/drm/tegra/hda.c
@@ -0,0 +1,63 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright (C) 2019 NVIDIA Corporation
+ */
+
+#include <linux/bug.h>
+
+#include <sound/hda_verbs.h>
+
+#include "hda.h"
+
+void tegra_hda_parse_format(unsigned int format, struct tegra_hda_format *fmt)
+{
+	unsigned int mul, div, bits, channels;
+
+	if (format & AC_FMT_TYPE_NON_PCM)
+		fmt->pcm = false;
+	else
+		fmt->pcm = true;
+
+	if (format & AC_FMT_BASE_44K)
+		fmt->sample_rate = 44100;
+	else
+		fmt->sample_rate = 48000;
+
+	mul = (format & AC_FMT_MULT_MASK) >> AC_FMT_MULT_SHIFT;
+	div = (format & AC_FMT_DIV_MASK) >> AC_FMT_DIV_SHIFT;
+
+	fmt->sample_rate = fmt->sample_rate * (mul + 1) / (div + 1);
+
+	switch (format & AC_FMT_BITS_MASK) {
+	case AC_FMT_BITS_8:
+		fmt->bits = 8;
+		break;
+
+	case AC_FMT_BITS_16:
+		fmt->bits = 16;
+		break;
+
+	case AC_FMT_BITS_20:
+		fmt->bits = 20;
+		break;
+
+	case AC_FMT_BITS_24:
+		fmt->bits = 24;
+		break;
+
+	case AC_FMT_BITS_32:
+		fmt->bits = 32;
+		break;
+
+	default:
+		bits = (format & AC_FMT_BITS_MASK) >> AC_FMT_BITS_SHIFT;
+		WARN(1, "invalid number of bits: %#x\n", bits);
+		fmt->bits = 8;
+		break;
+	}
+
+	channels = (format & AC_FMT_CHAN_MASK) >> AC_FMT_CHAN_SHIFT;
+
+	/* channels are encoded as n - 1 */
+	fmt->channels = channels + 1;
+}
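The new helper consolidates the stream-format parsing previously open-coded in the HDMI driver. A hedged usage sketch decoding one assumed format word, 0x4011 = 44.1 kHz base | 16-bit samples | 2 channels, PCM:

	#include "hda.h"

	static void example_decode(void)
	{
		struct tegra_hda_format fmt;

		tegra_hda_parse_format(0x4011, &fmt);

		/*
		 * fmt is now { .sample_rate = 44100, .channels = 2,
		 *              .bits = 16, .pcm = true }.
		 */
	}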
diff --git a/drivers/gpu/drm/tegra/hda.h b/drivers/gpu/drm/tegra/hda.h
new file mode 100644
index 000000000000..77269955a4f2
--- /dev/null
+++ b/drivers/gpu/drm/tegra/hda.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright (C) 2019 NVIDIA Corporation
+ */
+
+#ifndef DRM_TEGRA_HDA_H
+#define DRM_TEGRA_HDA_H 1
+
+#include <linux/types.h>
+
+struct tegra_hda_format {
+	unsigned int sample_rate;
+	unsigned int channels;
+	unsigned int bits;
+	bool pcm;
+};
+
+void tegra_hda_parse_format(unsigned int format, struct tegra_hda_format *fmt);
+
+#endif
diff --git a/drivers/gpu/drm/tegra/hdmi.c b/drivers/gpu/drm/tegra/hdmi.c
index 0082468f703c..47c55974756d 100644
--- a/drivers/gpu/drm/tegra/hdmi.c
+++ b/drivers/gpu/drm/tegra/hdmi.c
@@ -11,6 +11,7 @@
 #include <linux/debugfs.h>
 #include <linux/gpio.h>
 #include <linux/hdmi.h>
+#include <linux/math64.h>
 #include <linux/of_device.h>
 #include <linux/pm_runtime.h>
 #include <linux/regulator/consumer.h>
@@ -18,12 +19,9 @@
 
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
-
-#include <sound/hda_verbs.h>
-
-#include <media/cec-notifier.h>
+#include <drm/drm_probe_helper.h>
 
+#include "hda.h"
 #include "hdmi.h"
 #include "drm.h"
 #include "dc.h"
@@ -71,8 +69,7 @@ struct tegra_hdmi {
 	const struct tegra_hdmi_config *config;
 
 	unsigned int audio_source;
-	unsigned int audio_sample_rate;
-	unsigned int audio_channels;
+	struct tegra_hda_format format;
 
 	unsigned int pixel_clock;
 	bool stereo;
@@ -119,68 +116,11 @@ static inline void tegra_hdmi_writel(struct tegra_hdmi *hdmi, u32 value,
 }
 
 struct tegra_hdmi_audio_config {
-	unsigned int pclk;
 	unsigned int n;
 	unsigned int cts;
 	unsigned int aval;
 };
 
-static const struct tegra_hdmi_audio_config tegra_hdmi_audio_32k[] = {
-	{  25200000, 4096,  25200, 24000 },
-	{  27000000, 4096,  27000, 24000 },
-	{  74250000, 4096,  74250, 24000 },
-	{ 148500000, 4096, 148500, 24000 },
-	{         0,    0,      0,     0 },
-};
-
-static const struct tegra_hdmi_audio_config tegra_hdmi_audio_44_1k[] = {
-	{  25200000, 5880,  26250, 25000 },
-	{  27000000, 5880,  28125, 25000 },
-	{  74250000, 4704,  61875, 20000 },
-	{ 148500000, 4704, 123750, 20000 },
-	{         0,    0,      0,     0 },
-};
-
-static const struct tegra_hdmi_audio_config tegra_hdmi_audio_48k[] = {
-	{  25200000, 6144,  25200, 24000 },
-	{  27000000, 6144,  27000, 24000 },
-	{  74250000, 6144,  74250, 24000 },
-	{ 148500000, 6144, 148500, 24000 },
-	{         0,    0,      0,     0 },
-};
-
-static const struct tegra_hdmi_audio_config tegra_hdmi_audio_88_2k[] = {
-	{  25200000, 11760,  26250, 25000 },
-	{  27000000, 11760,  28125, 25000 },
-	{  74250000,  9408,  61875, 20000 },
-	{ 148500000,  9408, 123750, 20000 },
-	{         0,     0,      0,     0 },
-};
-
-static const struct tegra_hdmi_audio_config tegra_hdmi_audio_96k[] = {
-	{  25200000, 12288,  25200, 24000 },
-	{  27000000, 12288,  27000, 24000 },
-	{  74250000, 12288,  74250, 24000 },
-	{ 148500000, 12288, 148500, 24000 },
-	{         0,     0,      0,     0 },
-};
-
-static const struct tegra_hdmi_audio_config tegra_hdmi_audio_176_4k[] = {
-	{  25200000, 23520,  26250, 25000 },
-	{  27000000, 23520,  28125, 25000 },
-	{  74250000, 18816,  61875, 20000 },
-	{ 148500000, 18816, 123750, 20000 },
-	{         0,     0,      0,     0 },
-};
-
-static const struct tegra_hdmi_audio_config tegra_hdmi_audio_192k[] = {
-	{  25200000, 24576,  25200, 24000 },
-	{  27000000, 24576,  27000, 24000 },
-	{  74250000, 24576,  74250, 24000 },
-	{ 148500000, 24576, 148500, 24000 },
-	{         0,     0,      0,     0 },
-};
-
 static const struct tmds_config tegra20_tmds_config[] = {
 	{ /* slow pixel clock modes */
 		.pclk = 27000000,
@@ -418,52 +358,53 @@ static const struct tmds_config tegra124_tmds_config[] = {
 	},
 };
 
-static const struct tegra_hdmi_audio_config *
-tegra_hdmi_get_audio_config(unsigned int sample_rate, unsigned int pclk)
+static int
+tegra_hdmi_get_audio_config(unsigned int audio_freq, unsigned int pix_clock,
+			    struct tegra_hdmi_audio_config *config)
 {
-	const struct tegra_hdmi_audio_config *table;
-
-	switch (sample_rate) {
-	case 32000:
-		table = tegra_hdmi_audio_32k;
-		break;
-
-	case 44100:
-		table = tegra_hdmi_audio_44_1k;
-		break;
-
-	case 48000:
-		table = tegra_hdmi_audio_48k;
-		break;
-
-	case 88200:
-		table = tegra_hdmi_audio_88_2k;
-		break;
-
-	case 96000:
-		table = tegra_hdmi_audio_96k;
-		break;
-
-	case 176400:
-		table = tegra_hdmi_audio_176_4k;
-		break;
-
-	case 192000:
-		table = tegra_hdmi_audio_192k;
-		break;
-
-	default:
-		return NULL;
-	}
-
-	while (table->pclk) {
-		if (table->pclk == pclk)
-			return table;
-
-		table++;
+	const unsigned int afreq = 128 * audio_freq;
+	const unsigned int min_n = afreq / 1500;
+	const unsigned int max_n = afreq / 300;
+	const unsigned int ideal_n = afreq / 1000;
+	int64_t min_err = (uint64_t)-1 >> 1;
+	unsigned int min_delta = -1;
+	int n;
+
+	memset(config, 0, sizeof(*config));
+	config->n = -1;
+
+	for (n = min_n; n <= max_n; n++) {
+		uint64_t cts_f, aval_f;
+		unsigned int delta;
+		int64_t cts, err;
+
+		/* compute aval in 48.16 fixed point */
+		aval_f = ((int64_t)24000000 << 16) * n;
+		do_div(aval_f, afreq);
+		/* It must divide evenly, with no remainder */
+		if (aval_f & 0xFFFF)
+			continue;
+
+		/* Compute cts in 48.16 fixed point */
+		cts_f = ((int64_t)pix_clock << 16) * n;
+		do_div(cts_f, afreq);
+		/* Round it to the nearest integer */
+		cts = (cts_f & ~0xFFFF) + ((cts_f & BIT(15)) << 1);
+
+		delta = abs(n - ideal_n);
+
+		/* Compute the absolute error */
+		err = abs((int64_t)cts_f - cts);
+		if (err < min_err || (err == min_err && delta < min_delta)) {
+			config->n = n;
+			config->cts = cts >> 16;
+			config->aval = aval_f >> 16;
+			min_delta = delta;
+			min_err = err;
+		}
 	}
 
-	return NULL;
+	return config->n != -1 ? 0 : -EINVAL;
 }
 
 static void tegra_hdmi_setup_audio_fs_tables(struct tegra_hdmi *hdmi)
@@ -510,7 +451,7 @@ static void tegra_hdmi_write_aval(struct tegra_hdmi *hdmi, u32 value)
 	unsigned int i;
 
 	for (i = 0; i < ARRAY_SIZE(regs); i++) {
-		if (regs[i].sample_rate == hdmi->audio_sample_rate) {
+		if (regs[i].sample_rate == hdmi->format.sample_rate) {
 			tegra_hdmi_writel(hdmi, value, regs[i].offset);
 			break;
 		}
@@ -519,8 +460,9 @@ static void tegra_hdmi_write_aval(struct tegra_hdmi *hdmi, u32 value)
 
 static int tegra_hdmi_setup_audio(struct tegra_hdmi *hdmi)
 {
-	const struct tegra_hdmi_audio_config *config;
+	struct tegra_hdmi_audio_config config;
 	u32 source, value;
+	int err;
 
 	switch (hdmi->audio_source) {
 	case HDA:
@@ -564,7 +506,7 @@ static int tegra_hdmi_setup_audio(struct tegra_hdmi *hdmi)
 		 * play back system startup sounds early. It is possibly not
 		 * needed on Linux at all.
 		 */
-		if (hdmi->audio_channels == 2)
+		if (hdmi->format.channels == 2)
 			value = SOR_AUDIO_CNTRL0_INJECT_NULLSMPL;
 		else
 			value = 0;
@@ -595,25 +537,28 @@ static int tegra_hdmi_setup_audio(struct tegra_hdmi *hdmi)
 		tegra_hdmi_writel(hdmi, value, HDMI_NV_PDISP_SOR_AUDIO_SPARE0);
 	}
 
-	config = tegra_hdmi_get_audio_config(hdmi->audio_sample_rate,
-					     hdmi->pixel_clock);
-	if (!config) {
+	err = tegra_hdmi_get_audio_config(hdmi->format.sample_rate,
+					  hdmi->pixel_clock, &config);
+	if (err < 0) {
 		dev_err(hdmi->dev,
 			"cannot set audio to %u Hz at %u Hz pixel clock\n",
-			hdmi->audio_sample_rate, hdmi->pixel_clock);
-		return -EINVAL;
+			hdmi->format.sample_rate, hdmi->pixel_clock);
+		return err;
 	}
 
+	dev_dbg(hdmi->dev, "audio: pixclk=%u, n=%u, cts=%u, aval=%u\n",
+		hdmi->pixel_clock, config.n, config.cts, config.aval);
+
 	tegra_hdmi_writel(hdmi, 0, HDMI_NV_PDISP_HDMI_ACR_CTRL);
 
 	value = AUDIO_N_RESETF | AUDIO_N_GENERATE_ALTERNATE |
-		AUDIO_N_VALUE(config->n - 1);
+		AUDIO_N_VALUE(config.n - 1);
 	tegra_hdmi_writel(hdmi, value, HDMI_NV_PDISP_AUDIO_N);
 
-	tegra_hdmi_writel(hdmi, ACR_SUBPACK_N(config->n) | ACR_ENABLE,
+	tegra_hdmi_writel(hdmi, ACR_SUBPACK_N(config.n) | ACR_ENABLE,
 			  HDMI_NV_PDISP_HDMI_ACR_0441_SUBPACK_HIGH);
 
-	tegra_hdmi_writel(hdmi, ACR_SUBPACK_CTS(config->cts),
+	tegra_hdmi_writel(hdmi, ACR_SUBPACK_CTS(config.cts),
 			  HDMI_NV_PDISP_HDMI_ACR_0441_SUBPACK_LOW);
 
 	value = SPARE_HW_CTS | SPARE_FORCE_SW_CTS | SPARE_CTS_RESET_VAL(1);
@@ -624,7 +569,7 @@ static int tegra_hdmi_setup_audio(struct tegra_hdmi *hdmi)
 	tegra_hdmi_writel(hdmi, value, HDMI_NV_PDISP_AUDIO_N);
 
 	if (hdmi->config->has_hda)
-		tegra_hdmi_write_aval(hdmi, config->aval);
+		tegra_hdmi_write_aval(hdmi, config.aval);
 
 	tegra_hdmi_setup_audio_fs_tables(hdmi);
 
@@ -741,7 +686,8 @@ static void tegra_hdmi_setup_avi_infoframe(struct tegra_hdmi *hdmi,
 	u8 buffer[17];
 	ssize_t err;
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame,
+						       &hdmi->output.connector, mode);
 	if (err < 0) {
 		dev_err(hdmi->dev, "failed to setup AVI infoframe: %zd\n", err);
 		return;
@@ -787,7 +733,7 @@ static void tegra_hdmi_setup_audio_infoframe(struct tegra_hdmi *hdmi)
 		return;
 	}
 
-	frame.channels = hdmi->audio_channels;
+	frame.channels = hdmi->format.channels;
 
 	err = hdmi_audio_infoframe_pack(&frame, buffer, sizeof(buffer));
 	if (err < 0) {
@@ -1589,24 +1535,6 @@ static const struct of_device_id tegra_hdmi_of_match[] = {
 };
 MODULE_DEVICE_TABLE(of, tegra_hdmi_of_match);
 
-static void hda_format_parse(unsigned int format, unsigned int *rate,
-			     unsigned int *channels)
-{
-	unsigned int mul, div;
-
-	if (format & AC_FMT_BASE_44K)
-		*rate = 44100;
-	else
-		*rate = 48000;
-
-	mul = (format & AC_FMT_MULT_MASK) >> AC_FMT_MULT_SHIFT;
-	div = (format & AC_FMT_DIV_MASK) >> AC_FMT_DIV_SHIFT;
-
-	*rate = *rate * (mul + 1) / (div + 1);
-
-	*channels = (format & AC_FMT_CHAN_MASK) >> AC_FMT_CHAN_SHIFT;
-}
-
 static irqreturn_t tegra_hdmi_irq(int irq, void *data)
 {
 	struct tegra_hdmi *hdmi = data;
@@ -1623,14 +1551,9 @@ static irqreturn_t tegra_hdmi_irq(int irq, void *data)
 		value = tegra_hdmi_readl(hdmi, HDMI_NV_PDISP_SOR_AUDIO_HDA_CODEC_SCRATCH0);
 
 		if (value & SOR_AUDIO_HDA_CODEC_SCRATCH0_VALID) {
-			unsigned int sample_rate, channels;
-
 			format = value & SOR_AUDIO_HDA_CODEC_SCRATCH0_FMT_MASK;
 
-			hda_format_parse(format, &sample_rate, &channels);
-
-			hdmi->audio_sample_rate = sample_rate;
-			hdmi->audio_channels = channels;
+			tegra_hda_parse_format(format, &hdmi->format);
 
 			err = tegra_hdmi_setup_audio(hdmi);
 			if (err < 0) {
@@ -1664,8 +1587,6 @@ static int tegra_hdmi_probe(struct platform_device *pdev)
 	hdmi->dev = &pdev->dev;
 
 	hdmi->audio_source = AUTO;
-	hdmi->audio_sample_rate = 48000;
-	hdmi->audio_channels = 2;
 	hdmi->stereo = false;
 	hdmi->dvi = false;
 
@@ -1709,10 +1630,6 @@ static int tegra_hdmi_probe(struct platform_device *pdev)
 		return PTR_ERR(hdmi->vdd);
 	}
 
-	hdmi->output.notifier = cec_notifier_get(&pdev->dev);
-	if (hdmi->output.notifier == NULL)
-		return -ENOMEM;
-
 	hdmi->output.dev = &pdev->dev;
 
 	err = tegra_output_probe(&hdmi->output);
@@ -1771,9 +1688,6 @@ static int tegra_hdmi_remove(struct platform_device *pdev)
 
 	tegra_output_remove(&hdmi->output);
 
-	if (hdmi->output.notifier)
-		cec_notifier_put(hdmi->output.notifier);
-
 	return 0;
 }
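
The rewritten tegra_hdmi_get_audio_config() above searches (N, CTS, AVAL)
triples for the best fit instead of returning a static table entry. The
relationship it approximates is the HDMI Audio Clock Regeneration formula,
128 * fs = f_TMDS * N / CTS. A minimal standalone sketch of the ideal-CTS
computation (illustrative userspace code, not the driver function):

#include <stdio.h>

/*
 * Ideal CTS for a candidate N: CTS = f_TMDS * N / (128 * fs), rounded to
 * nearest. The driver searches nearby integers because the division is
 * rarely exact.
 */
static unsigned long long ideal_cts(unsigned long long tmds_hz,
				    unsigned long long fs_hz,
				    unsigned long long n)
{
	return (tmds_hz * n + 64 * fs_hz) / (128 * fs_hz);
}

int main(void)
{
	/* 1080p pixel clock, 48 kHz audio, spec-recommended N = 6144 */
	printf("CTS = %llu\n", ideal_cts(148500000ULL, 48000, 6144));
	return 0;
}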
 
diff --git a/drivers/gpu/drm/tegra/hub.c b/drivers/gpu/drm/tegra/hub.c
index 922a48d5a483..ba9b3cfb8c3d 100644
--- a/drivers/gpu/drm/tegra/hub.c
+++ b/drivers/gpu/drm/tegra/hub.c
@@ -19,7 +19,7 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "drm.h"
 #include "dc.h"
@@ -716,7 +716,7 @@ static int tegra_display_hub_init(struct host1x_client *client)
 	if (!state)
 		return -ENOMEM;
 
-	drm_atomic_private_obj_init(&hub->base, &state->base,
+	drm_atomic_private_obj_init(drm, &hub->base, &state->base,
 				    &tegra_display_hub_state_funcs);
 
 	tegra->hub = hub;
diff --git a/drivers/gpu/drm/tegra/output.c b/drivers/gpu/drm/tegra/output.c
index c662efc7e413..9c2b9dad55c3 100644
--- a/drivers/gpu/drm/tegra/output.c
+++ b/drivers/gpu/drm/tegra/output.c
@@ -36,7 +36,7 @@ int tegra_output_connector_get_modes(struct drm_connector *connector)
 	else if (output->ddc)
 		edid = drm_get_edid(connector, output->ddc);
 
-	cec_notifier_set_phys_addr_from_edid(output->notifier, edid);
+	cec_notifier_set_phys_addr_from_edid(output->cec, edid);
 	drm_connector_update_edid_property(connector, edid);
 
 	if (edid) {
@@ -73,7 +73,7 @@ tegra_output_connector_detect(struct drm_connector *connector, bool force)
 	}
 
 	if (status != connector_status_connected)
-		cec_notifier_phys_addr_invalidate(output->notifier);
+		cec_notifier_phys_addr_invalidate(output->cec);
 
 	return status;
 }
@@ -174,11 +174,18 @@ int tegra_output_probe(struct tegra_output *output)
 		disable_irq(output->hpd_irq);
 	}
 
+	output->cec = cec_notifier_get(output->dev);
+	if (!output->cec)
+		return -ENOMEM;
+
 	return 0;
 }
 
 void tegra_output_remove(struct tegra_output *output)
 {
+	if (output->cec)
+		cec_notifier_put(output->cec);
+
 	if (gpio_is_valid(output->hpd_gpio)) {
 		free_irq(output->hpd_irq, output);
 		gpio_free(output->hpd_gpio);
diff --git a/drivers/gpu/drm/tegra/sor.c b/drivers/gpu/drm/tegra/sor.c
index ef8692b7075a..40057106f5f3 100644
--- a/drivers/gpu/drm/tegra/sor.c
+++ b/drivers/gpu/drm/tegra/sor.c
@@ -19,8 +19,6 @@
 
 #include <soc/tegra/pmc.h>
 
-#include <sound/hda_verbs.h>
-
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_dp_helper.h>
 #include <drm/drm_panel.h>
@@ -28,6 +26,7 @@
 
 #include "dc.h"
 #include "drm.h"
+#include "hda.h"
 #include "sor.h"
 #include "trace.h"
 
@@ -411,6 +410,8 @@ struct tegra_sor {
 	struct clk *clk_dp;
 	struct clk *clk;
 
+	u8 xbar_cfg[5];
+
 	struct drm_dp_aux *aux;
 
 	struct drm_info_list *debugfs_files;
@@ -429,10 +430,7 @@ struct tegra_sor {
 	struct delayed_work scdc;
 	bool scdc_enabled;
 
-	struct {
-		unsigned int sample_rate;
-		unsigned int channels;
-	} audio;
+	struct tegra_hda_format format;
 };
 
 struct tegra_sor_state {
@@ -1818,7 +1816,7 @@ static void tegra_sor_edp_enable(struct drm_encoder *encoder)
 
 	/* XXX not in TRM */
 	for (value = 0, i = 0; i < 5; i++)
-		value |= SOR_XBAR_CTRL_LINK0_XSEL(i, sor->soc->xbar_cfg[i]) |
+		value |= SOR_XBAR_CTRL_LINK0_XSEL(i, sor->xbar_cfg[i]) |
 			 SOR_XBAR_CTRL_LINK1_XSEL(i, i);
 
 	tegra_sor_writel(sor, 0x00000000, SOR_XBAR_POL);
@@ -2116,7 +2114,8 @@ tegra_sor_hdmi_setup_avi_infoframe(struct tegra_sor *sor,
 	value &= ~INFOFRAME_CTRL_ENABLE;
 	tegra_sor_writel(sor, value, SOR_HDMI_AVI_INFOFRAME_CTRL);
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame,
+						       &sor->output.connector, mode);
 	if (err < 0) {
 		dev_err(sor->dev, "failed to setup AVI infoframe: %d\n", err);
 		return err;
@@ -2185,7 +2184,7 @@ static int tegra_sor_hdmi_enable_audio_infoframe(struct tegra_sor *sor)
 		return err;
 	}
 
-	frame.channels = sor->audio.channels;
+	frame.channels = sor->format.channels;
 
 	err = hdmi_audio_infoframe_pack(&frame, buffer, sizeof(buffer));
 	if (err < 0) {
@@ -2214,7 +2213,7 @@ static void tegra_sor_hdmi_audio_enable(struct tegra_sor *sor)
 	value |= SOR_AUDIO_CNTRL_SOURCE_SELECT(SOURCE_SELECT_HDA);
 
 	/* inject null samples */
-	if (sor->audio.channels != 2)
+	if (sor->format.channels != 2)
 		value &= ~SOR_AUDIO_CNTRL_INJECT_NULLSMPL;
 	else
 		value |= SOR_AUDIO_CNTRL_INJECT_NULLSMPL;
@@ -2245,7 +2244,7 @@ static void tegra_sor_hdmi_audio_enable(struct tegra_sor *sor)
 	value = SOR_HDMI_AUDIO_N_RESET | SOR_HDMI_AUDIO_N_LOOKUP;
 	tegra_sor_writel(sor, value, SOR_HDMI_AUDIO_N);
 
-	value = (24000 * 4096) / (128 * sor->audio.sample_rate / 1000);
+	value = (24000 * 4096) / (128 * sor->format.sample_rate / 1000);
 	tegra_sor_writel(sor, value, SOR_AUDIO_AVAL_0320);
 	tegra_sor_writel(sor, 4096, SOR_AUDIO_NVAL_0320);
 
@@ -2258,15 +2257,15 @@ static void tegra_sor_hdmi_audio_enable(struct tegra_sor *sor)
 	tegra_sor_writel(sor, 20000, SOR_AUDIO_AVAL_1764);
 	tegra_sor_writel(sor, 18816, SOR_AUDIO_NVAL_1764);
 
-	value = (24000 * 6144) / (128 * sor->audio.sample_rate / 1000);
+	value = (24000 * 6144) / (128 * sor->format.sample_rate / 1000);
 	tegra_sor_writel(sor, value, SOR_AUDIO_AVAL_0480);
 	tegra_sor_writel(sor, 6144, SOR_AUDIO_NVAL_0480);
 
-	value = (24000 * 12288) / (128 * sor->audio.sample_rate / 1000);
+	value = (24000 * 12288) / (128 * sor->format.sample_rate / 1000);
 	tegra_sor_writel(sor, value, SOR_AUDIO_AVAL_0960);
 	tegra_sor_writel(sor, 12288, SOR_AUDIO_NVAL_0960);
 
-	value = (24000 * 24576) / (128 * sor->audio.sample_rate / 1000);
+	value = (24000 * 24576) / (128 * sor->format.sample_rate / 1000);
 	tegra_sor_writel(sor, value, SOR_AUDIO_AVAL_1920);
 	tegra_sor_writel(sor, 24576, SOR_AUDIO_NVAL_1920);
 
@@ -2554,7 +2553,7 @@ static void tegra_sor_hdmi_enable(struct drm_encoder *encoder)
 
 	/* XXX not in TRM */
 	for (value = 0, i = 0; i < 5; i++)
-		value |= SOR_XBAR_CTRL_LINK0_XSEL(i, sor->soc->xbar_cfg[i]) |
+		value |= SOR_XBAR_CTRL_LINK0_XSEL(i, sor->xbar_cfg[i]) |
 			 SOR_XBAR_CTRL_LINK1_XSEL(i, i);
 
 	tegra_sor_writel(sor, 0x00000000, SOR_XBAR_POL);
@@ -3175,6 +3174,8 @@ MODULE_DEVICE_TABLE(of, tegra_sor_of_match);
 static int tegra_sor_parse_dt(struct tegra_sor *sor)
 {
 	struct device_node *np = sor->dev->of_node;
+	u32 xbar_cfg[5];
+	unsigned int i;
 	u32 value;
 	int err;
 
@@ -3192,25 +3193,18 @@ static int tegra_sor_parse_dt(struct tegra_sor *sor)
 		sor->pad = TEGRA_IO_PAD_HDMI_DP0 + sor->index;
 	}
 
-	return 0;
-}
-
-static void tegra_hda_parse_format(unsigned int format, unsigned int *rate,
-				   unsigned int *channels)
-{
-	unsigned int mul, div;
-
-	if (format & AC_FMT_BASE_44K)
-		*rate = 44100;
-	else
-		*rate = 48000;
-
-	mul = (format & AC_FMT_MULT_MASK) >> AC_FMT_MULT_SHIFT;
-	div = (format & AC_FMT_DIV_MASK) >> AC_FMT_DIV_SHIFT;
-
-	*rate = *rate * (mul + 1) / (div + 1);
+	err = of_property_read_u32_array(np, "nvidia,xbar-cfg", xbar_cfg, 5);
+	if (err < 0) {
+		/* fall back to default per-SoC XBAR configuration */
+		for (i = 0; i < 5; i++)
+			sor->xbar_cfg[i] = sor->soc->xbar_cfg[i];
+	} else {
+		/* copy cells to SOR XBAR configuration */
+		for (i = 0; i < 5; i++)
+			sor->xbar_cfg[i] = xbar_cfg[i];
+	}
 
-	*channels = (format & AC_FMT_CHAN_MASK) >> AC_FMT_CHAN_SHIFT;
+	return 0;
 }
 
 static irqreturn_t tegra_sor_irq(int irq, void *data)
@@ -3225,14 +3219,11 @@ static irqreturn_t tegra_sor_irq(int irq, void *data)
 		value = tegra_sor_readl(sor, SOR_AUDIO_HDA_CODEC_SCRATCH0);
 
 		if (value & SOR_AUDIO_HDA_CODEC_SCRATCH0_VALID) {
-			unsigned int format, sample_rate, channels;
+			unsigned int format;
 
 			format = value & SOR_AUDIO_HDA_CODEC_SCRATCH0_FMT_MASK;
 
-			tegra_hda_parse_format(format, &sample_rate, &channels);
-
-			sor->audio.sample_rate = sample_rate;
-			sor->audio.channels = channels;
+			tegra_hda_parse_format(format, &sor->format);
 
 			tegra_sor_hdmi_audio_enable(sor);
 		} else {
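
The computed SOR_AUDIO_AVAL_* writes above all share one expression, with
only the per-rate NVAL constant changing (4096 for 32 kHz, 6144 for 48 kHz,
12288 for 96 kHz, 24576 for 192 kHz; the 44.1 kHz family keeps fixed
constants in the context shown). Factored out for clarity (a hypothetical
helper; the driver deliberately open-codes it per register):

#include <linux/types.h>

static u32 tegra_sor_aval(unsigned int nval, unsigned int sample_rate)
{
	/* same arithmetic as the open-coded writes above */
	return (24000 * nval) / (128 * sample_rate / 1000);
}

With this, the 48 kHz write would read
tegra_sor_writel(sor, tegra_sor_aval(6144, sor->format.sample_rate),
SOR_AUDIO_AVAL_0480).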
diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c
index d47983deb1cf..39bfed9623de 100644
--- a/drivers/gpu/drm/tegra/vic.c
+++ b/drivers/gpu/drm/tegra/vic.c
@@ -26,6 +26,7 @@
 struct vic_config {
 	const char *firmware;
 	unsigned int version;
+	bool supports_sid;
 };
 
 struct vic {
@@ -105,6 +106,22 @@ static int vic_boot(struct vic *vic)
 	if (vic->booted)
 		return 0;
 
+	if (vic->config->supports_sid) {
+		struct iommu_fwspec *spec = dev_iommu_fwspec_get(vic->dev);
+		u32 value;
+
+		value = TRANSCFG_ATT(1, TRANSCFG_SID_FALCON) |
+			TRANSCFG_ATT(0, TRANSCFG_SID_HW);
+		vic_writel(vic, value, VIC_TFBIF_TRANSCFG);
+
+		if (spec && spec->num_ids > 0) {
+			value = spec->ids[0] & 0xffff;
+
+			vic_writel(vic, value, VIC_THI_STREAMID0);
+			vic_writel(vic, value, VIC_THI_STREAMID1);
+		}
+	}
+
 	/* setup clockgating registers */
 	vic_writel(vic, CG_IDLE_CG_DLY_CNT(4) |
 			CG_IDLE_CG_EN |
@@ -181,13 +198,6 @@ static int vic_init(struct host1x_client *client)
 		vic->domain = tegra->domain;
 	}
 
-	if (!vic->falcon.data) {
-		vic->falcon.data = tegra;
-		err = falcon_load_firmware(&vic->falcon);
-		if (err < 0)
-			goto detach;
-	}
-
 	vic->channel = host1x_channel_request(client->dev);
 	if (!vic->channel) {
 		err = -ENOMEM;
@@ -246,6 +256,30 @@ static const struct host1x_client_ops vic_client_ops = {
 	.exit = vic_exit,
 };
 
+static int vic_load_firmware(struct vic *vic)
+{
+	int err;
+
+	if (vic->falcon.data)
+		return 0;
+
+	vic->falcon.data = vic->client.drm;
+
+	err = falcon_read_firmware(&vic->falcon, vic->config->firmware);
+	if (err < 0)
+		goto cleanup;
+
+	err = falcon_load_firmware(&vic->falcon);
+	if (err < 0)
+		goto cleanup;
+
+	return 0;
+
+cleanup:
+	vic->falcon.data = NULL;
+	return err;
+}
+
 static int vic_open_channel(struct tegra_drm_client *client,
 			    struct tegra_drm_context *context)
 {
@@ -256,19 +290,25 @@ static int vic_open_channel(struct tegra_drm_client *client,
 	if (err < 0)
 		return err;
 
+	err = vic_load_firmware(vic);
+	if (err < 0)
+		goto rpm_put;
+
 	err = vic_boot(vic);
-	if (err < 0) {
-		pm_runtime_put(vic->dev);
-		return err;
-	}
+	if (err < 0)
+		goto rpm_put;
 
 	context->channel = host1x_channel_get(vic->channel);
 	if (!context->channel) {
-		pm_runtime_put(vic->dev);
-		return -ENOMEM;
+		err = -ENOMEM;
+		goto rpm_put;
 	}
 
 	return 0;
+
+rpm_put:
+	pm_runtime_put(vic->dev);
+	return err;
 }
 
 static void vic_close_channel(struct tegra_drm_context *context)
@@ -291,6 +331,7 @@ static const struct tegra_drm_client_ops vic_ops = {
 static const struct vic_config vic_t124_config = {
 	.firmware = NVIDIA_TEGRA_124_VIC_FIRMWARE,
 	.version = 0x40,
+	.supports_sid = false,
 };
 
 #define NVIDIA_TEGRA_210_VIC_FIRMWARE "nvidia/tegra210/vic04_ucode.bin"
@@ -298,6 +339,7 @@ static const struct vic_config vic_t124_config = {
 static const struct vic_config vic_t210_config = {
 	.firmware = NVIDIA_TEGRA_210_VIC_FIRMWARE,
 	.version = 0x21,
+	.supports_sid = false,
 };
 
 #define NVIDIA_TEGRA_186_VIC_FIRMWARE "nvidia/tegra186/vic04_ucode.bin"
@@ -305,6 +347,7 @@ static const struct vic_config vic_t210_config = {
 static const struct vic_config vic_t186_config = {
 	.firmware = NVIDIA_TEGRA_186_VIC_FIRMWARE,
 	.version = 0x18,
+	.supports_sid = true,
 };
 
 #define NVIDIA_TEGRA_194_VIC_FIRMWARE "nvidia/tegra194/vic.bin"
@@ -312,6 +355,7 @@ static const struct vic_config vic_t186_config = {
 static const struct vic_config vic_t194_config = {
 	.firmware = NVIDIA_TEGRA_194_VIC_FIRMWARE,
 	.version = 0x19,
+	.supports_sid = true,
 };
 
 static const struct of_device_id vic_match[] = {
@@ -372,10 +416,6 @@ static int vic_probe(struct platform_device *pdev)
 	if (err < 0)
 		return err;
 
-	err = falcon_read_firmware(&vic->falcon, vic->config->firmware);
-	if (err < 0)
-		goto exit_falcon;
-
 	platform_set_drvdata(pdev, vic);
 
 	INIT_LIST_HEAD(&vic->client.base.list);
@@ -393,7 +433,6 @@ static int vic_probe(struct platform_device *pdev)
 	err = host1x_client_register(&vic->client.base);
 	if (err < 0) {
 		dev_err(dev, "failed to register host1x client: %d\n", err);
-		platform_set_drvdata(pdev, NULL);
 		goto exit_falcon;
 	}
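
vic_load_firmware() above defers falcon_read_firmware() from probe to the
first channel open, and vic_open_channel() now funnels every failure
through a single rpm_put label. The unwinding shape in isolation (a
skeleton; the do_*() steps are hypothetical stand-ins, not driver API):

#include <linux/pm_runtime.h>

int do_load_firmware(struct device *dev);	/* hypothetical step */
int do_boot_engine(struct device *dev);		/* hypothetical step */

static int open_channel_skeleton(struct device *dev)
{
	int err;

	err = pm_runtime_get_sync(dev);
	if (err < 0)
		return err;

	err = do_load_firmware(dev);
	if (err < 0)
		goto rpm_put;

	err = do_boot_engine(dev);
	if (err < 0)
		goto rpm_put;

	return 0;

rpm_put:
	/* one exit path drops the runtime PM reference for all failures */
	pm_runtime_put(dev);
	return err;
}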
 
diff --git a/drivers/gpu/drm/tegra/vic.h b/drivers/gpu/drm/tegra/vic.h
index 21844817a7e1..017584340dd6 100644
--- a/drivers/gpu/drm/tegra/vic.h
+++ b/drivers/gpu/drm/tegra/vic.h
@@ -17,11 +17,20 @@
 
 /* VIC registers */
 
+#define VIC_THI_STREAMID0	0x00000030
+#define VIC_THI_STREAMID1	0x00000034
+
 #define NV_PVIC_MISC_PRI_VIC_CG			0x000016d0
 #define CG_IDLE_CG_DLY_CNT(val)			((val & 0x3f) << 0)
 #define CG_IDLE_CG_EN				(1 << 6)
 #define CG_WAKEUP_DLY_CNT(val)			((val & 0xf) << 16)
 
+#define VIC_TFBIF_TRANSCFG	0x00002044
+#define  TRANSCFG_ATT(i, v)	(((v) & 0x3) << (i * 4))
+#define  TRANSCFG_SID_HW	0
+#define  TRANSCFG_SID_PHY	1
+#define  TRANSCFG_SID_FALCON	2
+
 /* Firmware offsets */
 
 #define VIC_UCODE_FCE_HEADER_OFFSET		(6*4)
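
For reference, the TRANSCFG value that vic_boot() writes with these macros
works out as follows (standalone check, illustrative only):

#include <stdio.h>

#define TRANSCFG_ATT(i, v)	(((v) & 0x3) << ((i) * 4))
#define TRANSCFG_SID_HW		0
#define TRANSCFG_SID_FALCON	2

int main(void)
{
	/* slot 1 carries the falcon stream ID, slot 0 the hardware one */
	unsigned int value = TRANSCFG_ATT(1, TRANSCFG_SID_FALCON) |
			     TRANSCFG_ATT(0, TRANSCFG_SID_HW);

	printf("TRANSCFG = 0x%02x\n", value);	/* prints 0x20 */
	return 0;
}
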
diff --git a/drivers/gpu/drm/tilcdc/tilcdc_drv.c b/drivers/gpu/drm/tilcdc/tilcdc_drv.c
index 3dac08b24140..3030af9e7b35 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_drv.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_drv.c
@@ -24,6 +24,7 @@
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "tilcdc_drv.h"
 #include "tilcdc_regs.h"
@@ -183,6 +184,12 @@ static void tilcdc_fini(struct drm_device *dev)
 {
 	struct tilcdc_drm_private *priv = dev->dev_private;
 
+#ifdef CONFIG_CPU_FREQ
+	if (priv->freq_transition.notifier_call)
+		cpufreq_unregister_notifier(&priv->freq_transition,
+					    CPUFREQ_TRANSITION_NOTIFIER);
+#endif
+
 	if (priv->crtc)
 		tilcdc_crtc_shutdown(priv->crtc);
 
@@ -194,12 +201,6 @@ static void tilcdc_fini(struct drm_device *dev)
 	drm_mode_config_cleanup(dev);
 	tilcdc_remove_external_device(dev);
 
-#ifdef CONFIG_CPU_FREQ
-	if (priv->freq_transition.notifier_call)
-		cpufreq_unregister_notifier(&priv->freq_transition,
-					    CPUFREQ_TRANSITION_NOTIFIER);
-#endif
-
 	if (priv->clk)
 		clk_put(priv->clk);
 
@@ -270,17 +271,6 @@ static int tilcdc_init(struct drm_driver *ddrv, struct device *dev)
 		goto init_failed;
 	}
 
-#ifdef CONFIG_CPU_FREQ
-	priv->freq_transition.notifier_call = cpufreq_transition;
-	ret = cpufreq_register_notifier(&priv->freq_transition,
-			CPUFREQ_TRANSITION_NOTIFIER);
-	if (ret) {
-		dev_err(dev, "failed to register cpufreq notifier\n");
-		priv->freq_transition.notifier_call = NULL;
-		goto init_failed;
-	}
-#endif
-
 	if (of_property_read_u32(node, "max-bandwidth", &priv->max_bandwidth))
 		priv->max_bandwidth = TILCDC_DEFAULT_MAX_BANDWIDTH;
 
@@ -357,6 +347,17 @@ static int tilcdc_init(struct drm_driver *ddrv, struct device *dev)
 	}
 	modeset_init(ddev);
 
+#ifdef CONFIG_CPU_FREQ
+	priv->freq_transition.notifier_call = cpufreq_transition;
+	ret = cpufreq_register_notifier(&priv->freq_transition,
+			CPUFREQ_TRANSITION_NOTIFIER);
+	if (ret) {
+		dev_err(dev, "failed to register cpufreq notifier\n");
+		priv->freq_transition.notifier_call = NULL;
+		goto init_failed;
+	}
+#endif
+
 	if (priv->is_componentized) {
 		ret = component_bind_all(dev, ddev);
 		if (ret < 0)
@@ -511,7 +512,7 @@ static int tilcdc_debugfs_init(struct drm_minor *minor)
 DEFINE_DRM_GEM_CMA_FOPS(fops);
 
 static struct drm_driver tilcdc_driver = {
-	.driver_features    = (DRIVER_HAVE_IRQ | DRIVER_GEM | DRIVER_MODESET |
+	.driver_features    = (DRIVER_GEM | DRIVER_MODESET |
 			       DRIVER_PRIME | DRIVER_ATOMIC),
 	.irq_handler        = tilcdc_irq,
 	.gem_free_object_unlocked = drm_gem_cma_free_object,
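
The tilcdc hunks move cpufreq notifier registration to after modeset_init()
and its unregistration to the top of tilcdc_fini(), so the callback can
never observe a half-initialized or half-torn-down CRTC. The notifier API
in minimal form (a sketch; the callback body is a placeholder):

#include <linux/cpufreq.h>
#include <linux/notifier.h>

static int freq_transition(struct notifier_block *nb, unsigned long val,
			   void *data)
{
	if (val == CPUFREQ_POSTCHANGE) {
		/* CPU clock changed: revalidate any derived rates here */
	}

	return 0;
}

static struct notifier_block freq_nb = {
	.notifier_call = freq_transition,
};

/* register only once everything the callback touches exists:
 *	cpufreq_register_notifier(&freq_nb, CPUFREQ_TRANSITION_NOTIFIER);
 * and unregister before any of it is torn down:
 *	cpufreq_unregister_notifier(&freq_nb, CPUFREQ_TRANSITION_NOTIFIER);
 */
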
diff --git a/drivers/gpu/drm/tilcdc/tilcdc_drv.h b/drivers/gpu/drm/tilcdc/tilcdc_drv.h
index 62cea5ff5558..d86397da12a9 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_drv.h
+++ b/drivers/gpu/drm/tilcdc/tilcdc_drv.h
@@ -30,10 +30,9 @@
 #include <linux/list.h>
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
-#include <drm/drm_gem_cma_helper.h>
-#include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_bridge.h>
+#include <drm/drm_fb_cma_helper.h>
+#include <drm/drm_gem_cma_helper.h>
 
 /* Defaulting to pixel clock defined on AM335x */
 #define TILCDC_DEFAULT_MAX_PIXELCLOCK  126000
diff --git a/drivers/gpu/drm/tilcdc/tilcdc_external.c b/drivers/gpu/drm/tilcdc/tilcdc_external.c
index b4eaf9bc87f8..e9969cd36610 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_external.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_external.c
@@ -10,6 +10,7 @@
 
 #include <linux/component.h>
 #include <linux/of_graph.h>
+#include <drm/drm_atomic_helper.h>
 #include <drm/drm_of.h>
 
 #include "tilcdc_drv.h"
diff --git a/drivers/gpu/drm/tilcdc/tilcdc_panel.c b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
index a1acab39d87f..5d532a596e1e 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_panel.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
@@ -23,6 +23,7 @@
 #include <video/of_display_timing.h>
 #include <video/videomode.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "tilcdc_drv.h"
 #include "tilcdc_panel.h"
diff --git a/drivers/gpu/drm/tilcdc/tilcdc_tfp410.c b/drivers/gpu/drm/tilcdc/tilcdc_tfp410.c
index daebf1aa6b0a..fe59fbfdde69 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_tfp410.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_tfp410.c
@@ -21,6 +21,7 @@
 #include <linux/pinctrl/pinmux.h>
 #include <linux/pinctrl/consumer.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "tilcdc_drv.h"
 #include "tilcdc_tfp410.h"
diff --git a/drivers/gpu/drm/tinydrm/core/tinydrm-core.c b/drivers/gpu/drm/tinydrm/core/tinydrm-core.c
index 01a6f2d42440..554abd5d3b53 100644
--- a/drivers/gpu/drm/tinydrm/core/tinydrm-core.c
+++ b/drivers/gpu/drm/tinydrm/core/tinydrm-core.c
@@ -9,12 +9,15 @@
 
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_drv.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_probe_helper.h>
+#include <drm/drm_print.h>
 #include <drm/tinydrm/tinydrm.h>
 #include <linux/device.h>
 #include <linux/dma-buf.h>
+#include <linux/module.h>
 
 /**
  * DOC: overview
@@ -36,31 +39,17 @@
  * and registers the DRM device using devm_tinydrm_register().
  */
 
-static struct drm_framebuffer *
-tinydrm_fb_create(struct drm_device *drm, struct drm_file *file_priv,
-		  const struct drm_mode_fb_cmd2 *mode_cmd)
-{
-	struct tinydrm_device *tdev = drm->dev_private;
-
-	return drm_gem_fb_create_with_funcs(drm, file_priv, mode_cmd,
-					    tdev->fb_funcs);
-}
-
 static const struct drm_mode_config_funcs tinydrm_mode_config_funcs = {
-	.fb_create = tinydrm_fb_create,
+	.fb_create = drm_gem_fb_create_with_dirty,
 	.atomic_check = drm_atomic_helper_check,
 	.atomic_commit = drm_atomic_helper_commit,
 };
 
 static int tinydrm_init(struct device *parent, struct tinydrm_device *tdev,
-			const struct drm_framebuffer_funcs *fb_funcs,
 			struct drm_driver *driver)
 {
 	struct drm_device *drm;
 
-	mutex_init(&tdev->dirty_lock);
-	tdev->fb_funcs = fb_funcs;
-
 	/*
 	 * We don't embed drm_device, because that prevents us from using
 	 * devm_kzalloc() to allocate tinydrm_device in the driver since
@@ -83,7 +72,6 @@ static int tinydrm_init(struct device *parent, struct tinydrm_device *tdev,
 static void tinydrm_fini(struct tinydrm_device *tdev)
 {
 	drm_mode_config_cleanup(tdev->drm);
-	mutex_destroy(&tdev->dirty_lock);
 	tdev->drm->dev_private = NULL;
 	drm_dev_put(tdev->drm);
 }
@@ -97,7 +85,6 @@ static void devm_tinydrm_release(void *data)
  * devm_tinydrm_init - Initialize tinydrm device
  * @parent: Parent device object
  * @tdev: tinydrm device
- * @fb_funcs: Framebuffer functions
  * @driver: DRM driver
  *
  * This function initializes @tdev, the underlying DRM device and its
@@ -108,12 +95,11 @@ static void devm_tinydrm_release(void *data)
  * Zero on success, negative error code on failure.
  */
 int devm_tinydrm_init(struct device *parent, struct tinydrm_device *tdev,
-		      const struct drm_framebuffer_funcs *fb_funcs,
 		      struct drm_driver *driver)
 {
 	int ret;
 
-	ret = tinydrm_init(parent, tdev, fb_funcs, driver);
+	ret = tinydrm_init(parent, tdev, driver);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c b/drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c
index bf6bfbc5d412..2737b6fdadc8 100644
--- a/drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c
+++ b/drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c
@@ -17,104 +17,16 @@
 #include <drm/drm_device.h>
 #include <drm/drm_drv.h>
 #include <drm/drm_fourcc.h>
+#include <drm/drm_framebuffer.h>
 #include <drm/drm_print.h>
-#include <drm/tinydrm/tinydrm.h>
+#include <drm/drm_rect.h>
 #include <drm/tinydrm/tinydrm-helpers.h>
-#include <uapi/drm/drm.h>
 
 static unsigned int spi_max;
 module_param(spi_max, uint, 0400);
 MODULE_PARM_DESC(spi_max, "Set a lower SPI max transfer size");
 
 /**
- * tinydrm_merge_clips - Merge clip rectangles
- * @dst: Destination clip rectangle
- * @src: Source clip rectangle(s)
- * @num_clips: Number of @src clip rectangles
- * @flags: Dirty fb ioctl flags
- * @max_width: Maximum width of @dst
- * @max_height: Maximum height of @dst
- *
- * This function merges @src clip rectangle(s) into @dst. If @src is NULL,
- * @max_width and @min_width is used to set a full @dst clip rectangle.
- *
- * Returns:
- * true if it's a full clip, false otherwise
- */
-bool tinydrm_merge_clips(struct drm_clip_rect *dst,
-			 struct drm_clip_rect *src, unsigned int num_clips,
-			 unsigned int flags, u32 max_width, u32 max_height)
-{
-	unsigned int i;
-
-	if (!src || !num_clips) {
-		dst->x1 = 0;
-		dst->x2 = max_width;
-		dst->y1 = 0;
-		dst->y2 = max_height;
-		return true;
-	}
-
-	dst->x1 = ~0;
-	dst->y1 = ~0;
-	dst->x2 = 0;
-	dst->y2 = 0;
-
-	for (i = 0; i < num_clips; i++) {
-		if (flags & DRM_MODE_FB_DIRTY_ANNOTATE_COPY)
-			i++;
-		dst->x1 = min(dst->x1, src[i].x1);
-		dst->x2 = max(dst->x2, src[i].x2);
-		dst->y1 = min(dst->y1, src[i].y1);
-		dst->y2 = max(dst->y2, src[i].y2);
-	}
-
-	if (dst->x2 > max_width || dst->y2 > max_height ||
-	    dst->x1 >= dst->x2 || dst->y1 >= dst->y2) {
-		DRM_DEBUG_KMS("Illegal clip: x1=%u, x2=%u, y1=%u, y2=%u\n",
-			      dst->x1, dst->x2, dst->y1, dst->y2);
-		dst->x1 = 0;
-		dst->y1 = 0;
-		dst->x2 = max_width;
-		dst->y2 = max_height;
-	}
-
-	return (dst->x2 - dst->x1) == max_width &&
-	       (dst->y2 - dst->y1) == max_height;
-}
-EXPORT_SYMBOL(tinydrm_merge_clips);
-
-int tinydrm_fb_dirty(struct drm_framebuffer *fb,
-		     struct drm_file *file_priv,
-		     unsigned int flags, unsigned int color,
-		     struct drm_clip_rect *clips,
-		     unsigned int num_clips)
-{
-	struct tinydrm_device *tdev = fb->dev->dev_private;
-	struct drm_plane *plane = &tdev->pipe.plane;
-	int ret = 0;
-
-	drm_modeset_lock(&plane->mutex, NULL);
-
-	/* fbdev can flush even when we're not interested */
-	if (plane->state->fb == fb) {
-		mutex_lock(&tdev->dirty_lock);
-		ret = tdev->fb_dirty(fb, file_priv, flags,
-				     color, clips, num_clips);
-		mutex_unlock(&tdev->dirty_lock);
-	}
-
-	drm_modeset_unlock(&plane->mutex);
-
-	if (ret)
-		dev_err_once(fb->dev->dev,
-			     "Failed to update display %d\n", ret);
-
-	return ret;
-}
-EXPORT_SYMBOL(tinydrm_fb_dirty);
-
-/**
  * tinydrm_memcpy - Copy clip buffer
  * @dst: Destination buffer
  * @vaddr: Source buffer
@@ -122,7 +34,7 @@ EXPORT_SYMBOL(tinydrm_fb_dirty);
  * @clip: Clip rectangle area to copy
  */
 void tinydrm_memcpy(void *dst, void *vaddr, struct drm_framebuffer *fb,
-		    struct drm_clip_rect *clip)
+		    struct drm_rect *clip)
 {
 	unsigned int cpp = drm_format_plane_cpp(fb->format->format, 0);
 	unsigned int pitch = fb->pitches[0];
@@ -146,7 +58,7 @@ EXPORT_SYMBOL(tinydrm_memcpy);
  * @clip: Clip rectangle area to copy
  */
 void tinydrm_swab16(u16 *dst, void *vaddr, struct drm_framebuffer *fb,
-		    struct drm_clip_rect *clip)
+		    struct drm_rect *clip)
 {
 	size_t len = (clip->x2 - clip->x1) * sizeof(u16);
 	unsigned int x, y;
@@ -186,7 +98,7 @@ EXPORT_SYMBOL(tinydrm_swab16);
  */
 void tinydrm_xrgb8888_to_rgb565(u16 *dst, void *vaddr,
 				struct drm_framebuffer *fb,
-				struct drm_clip_rect *clip, bool swap)
+				struct drm_rect *clip, bool swap)
 {
 	size_t len = (clip->x2 - clip->x1) * sizeof(u32);
 	unsigned int x, y;
@@ -235,7 +147,7 @@ EXPORT_SYMBOL(tinydrm_xrgb8888_to_rgb565);
  * ITU BT.601 is used for the RGB -> luma (brightness) conversion.
  */
 void tinydrm_xrgb8888_to_gray8(u8 *dst, void *vaddr, struct drm_framebuffer *fb,
-			       struct drm_clip_rect *clip)
+			       struct drm_rect *clip)
 {
 	unsigned int len = (clip->x2 - clip->x1) * sizeof(u32);
 	unsigned int x, y;
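
The helpers above only change their clip type from drm_clip_rect to
drm_rect; the per-pixel math is untouched. For reference, the packing that
tinydrm_xrgb8888_to_rgb565() applies to each pixel (standalone sketch):

#include <stdint.h>

static uint16_t xrgb8888_to_rgb565(uint32_t pix)
{
	return ((pix & 0x00f80000) >> 8) |	/* R: bits 23:19 -> 15:11 */
	       ((pix & 0x0000fc00) >> 5) |	/* G: bits 15:10 -> 10:5  */
	       ((pix & 0x000000f8) >> 3);	/* B: bits  7:3  ->  4:0  */
}
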
diff --git a/drivers/gpu/drm/tinydrm/core/tinydrm-pipe.c b/drivers/gpu/drm/tinydrm/core/tinydrm-pipe.c
index eacfc0ec8ff1..bb5b1c1e21ba 100644
--- a/drivers/gpu/drm/tinydrm/core/tinydrm-pipe.c
+++ b/drivers/gpu/drm/tinydrm/core/tinydrm-pipe.c
@@ -8,9 +8,11 @@
  */
 
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_drv.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_modes.h>
+#include <drm/drm_probe_helper.h>
+#include <drm/drm_print.h>
 #include <drm/tinydrm/tinydrm.h>
 
 struct tinydrm_connector {
@@ -108,36 +110,6 @@ tinydrm_connector_create(struct drm_device *drm,
 	return connector;
 }
 
-/**
- * tinydrm_display_pipe_update - Display pipe update helper
- * @pipe: Simple display pipe
- * @old_state: Old plane state
- *
- * This function does a full framebuffer flush if the plane framebuffer
- * has changed. It also handles vblank events. Drivers can use this as their
- * &drm_simple_display_pipe_funcs->update callback.
- */
-void tinydrm_display_pipe_update(struct drm_simple_display_pipe *pipe,
-				 struct drm_plane_state *old_state)
-{
-	struct tinydrm_device *tdev = pipe_to_tinydrm(pipe);
-	struct drm_framebuffer *fb = pipe->plane.state->fb;
-	struct drm_crtc *crtc = &tdev->pipe.crtc;
-
-	if (fb && (fb != old_state->fb)) {
-		if (tdev->fb_dirty)
-			tdev->fb_dirty(fb, NULL, 0, 0, NULL, 0);
-	}
-
-	if (crtc->state->event) {
-		spin_lock_irq(&crtc->dev->event_lock);
-		drm_crtc_send_vblank_event(crtc, crtc->state->event);
-		spin_unlock_irq(&crtc->dev->event_lock);
-		crtc->state->event = NULL;
-	}
-}
-EXPORT_SYMBOL(tinydrm_display_pipe_update);
-
 static int tinydrm_rotate_mode(struct drm_display_mode *mode,
 			       unsigned int rotation)
 {
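
With tinydrm_display_pipe_update() removed, each driver now provides its
own .update callback built on drm_atomic_helper_damage_merged(). The shape
repeated in the ili9225, mipi-dbi, repaper and st7586 hunks below is
(my_fb_dirty() standing in for the driver's flush routine):

#include <drm/drm_damage_helper.h>
#include <drm/drm_rect.h>
#include <drm/drm_simple_kms_helper.h>
#include <drm/drm_vblank.h>

static void my_fb_dirty(struct drm_framebuffer *fb, struct drm_rect *rect);

static void my_pipe_update(struct drm_simple_display_pipe *pipe,
			   struct drm_plane_state *old_state)
{
	struct drm_plane_state *state = pipe->plane.state;
	struct drm_crtc *crtc = &pipe->crtc;
	struct drm_rect rect;

	/* merge all damage since old_state into one bounding rectangle */
	if (drm_atomic_helper_damage_merged(old_state, state, &rect))
		my_fb_dirty(state->fb, &rect);

	/* complete any pending vblank event exactly once */
	if (crtc->state->event) {
		spin_lock_irq(&crtc->dev->event_lock);
		drm_crtc_send_vblank_event(crtc, crtc->state->event);
		spin_unlock_irq(&crtc->dev->event_lock);
		crtc->state->event = NULL;
	}
}
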
diff --git a/drivers/gpu/drm/tinydrm/hx8357d.c b/drivers/gpu/drm/tinydrm/hx8357d.c
index 81a2bbeb25d4..8bbd0beafc6a 100644
--- a/drivers/gpu/drm/tinydrm/hx8357d.c
+++ b/drivers/gpu/drm/tinydrm/hx8357d.c
@@ -16,6 +16,7 @@
 #include <linux/property.h>
 #include <linux/spi/spi.h>
 
+#include <drm/drm_drv.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_modeset_helper.h>
@@ -175,7 +176,7 @@ out_enable:
 static const struct drm_simple_display_pipe_funcs hx8357d_pipe_funcs = {
 	.enable = yx240qv29_enable,
 	.disable = mipi_dbi_pipe_disable,
-	.update = tinydrm_display_pipe_update,
+	.update = mipi_dbi_pipe_update,
 	.prepare_fb = drm_gem_fb_simple_display_pipe_prepare_fb,
 };
 
diff --git a/drivers/gpu/drm/tinydrm/ili9225.c b/drivers/gpu/drm/tinydrm/ili9225.c
index 78f7c2d1b449..43a3b68d90a2 100644
--- a/drivers/gpu/drm/tinydrm/ili9225.c
+++ b/drivers/gpu/drm/tinydrm/ili9225.c
@@ -20,9 +20,14 @@
 #include <linux/spi/spi.h>
 #include <video/mipi_display.h>
 
+#include <drm/drm_damage_helper.h>
+#include <drm/drm_drv.h>
 #include <drm/drm_fb_cma_helper.h>
+#include <drm/drm_fourcc.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_rect.h>
+#include <drm/drm_vblank.h>
 #include <drm/tinydrm/mipi-dbi.h>
 #include <drm/tinydrm/tinydrm-helpers.h>
 
@@ -73,16 +78,14 @@ static inline int ili9225_command(struct mipi_dbi *mipi, u8 cmd, u16 data)
 	return mipi_dbi_command_buf(mipi, cmd, par, 2);
 }
 
-static int ili9225_fb_dirty(struct drm_framebuffer *fb,
-			    struct drm_file *file_priv, unsigned int flags,
-			    unsigned int color, struct drm_clip_rect *clips,
-			    unsigned int num_clips)
+static void ili9225_fb_dirty(struct drm_framebuffer *fb, struct drm_rect *rect)
 {
 	struct drm_gem_cma_object *cma_obj = drm_fb_cma_get_gem_obj(fb, 0);
 	struct tinydrm_device *tdev = fb->dev->dev_private;
 	struct mipi_dbi *mipi = mipi_dbi_from_tinydrm(tdev);
+	unsigned int height = rect->y2 - rect->y1;
+	unsigned int width = rect->x2 - rect->x1;
 	bool swap = mipi->swap_bytes;
-	struct drm_clip_rect clip;
 	u16 x_start, y_start;
 	u16 x1, x2, y1, y2;
 	int ret = 0;
@@ -90,54 +93,52 @@ static int ili9225_fb_dirty(struct drm_framebuffer *fb,
 	void *tr;
 
 	if (!mipi->enabled)
-		return 0;
+		return;
 
-	full = tinydrm_merge_clips(&clip, clips, num_clips, flags,
-				   fb->width, fb->height);
+	full = width == fb->width && height == fb->height;
 
-	DRM_DEBUG("Flushing [FB:%d] x1=%u, x2=%u, y1=%u, y2=%u\n", fb->base.id,
-		  clip.x1, clip.x2, clip.y1, clip.y2);
+	DRM_DEBUG_KMS("Flushing [FB:%d] " DRM_RECT_FMT "\n", fb->base.id, DRM_RECT_ARG(rect));
 
 	if (!mipi->dc || !full || swap ||
 	    fb->format->format == DRM_FORMAT_XRGB8888) {
 		tr = mipi->tx_buf;
-		ret = mipi_dbi_buf_copy(mipi->tx_buf, fb, &clip, swap);
+		ret = mipi_dbi_buf_copy(mipi->tx_buf, fb, rect, swap);
 		if (ret)
-			return ret;
+			goto err_msg;
 	} else {
 		tr = cma_obj->vaddr;
 	}
 
 	switch (mipi->rotation) {
 	default:
-		x1 = clip.x1;
-		x2 = clip.x2 - 1;
-		y1 = clip.y1;
-		y2 = clip.y2 - 1;
+		x1 = rect->x1;
+		x2 = rect->x2 - 1;
+		y1 = rect->y1;
+		y2 = rect->y2 - 1;
 		x_start = x1;
 		y_start = y1;
 		break;
 	case 90:
-		x1 = clip.y1;
-		x2 = clip.y2 - 1;
-		y1 = fb->width - clip.x2;
-		y2 = fb->width - clip.x1 - 1;
+		x1 = rect->y1;
+		x2 = rect->y2 - 1;
+		y1 = fb->width - rect->x2;
+		y2 = fb->width - rect->x1 - 1;
 		x_start = x1;
 		y_start = y2;
 		break;
 	case 180:
-		x1 = fb->width - clip.x2;
-		x2 = fb->width - clip.x1 - 1;
-		y1 = fb->height - clip.y2;
-		y2 = fb->height - clip.y1 - 1;
+		x1 = fb->width - rect->x2;
+		x2 = fb->width - rect->x1 - 1;
+		y1 = fb->height - rect->y2;
+		y2 = fb->height - rect->y1 - 1;
 		x_start = x2;
 		y_start = y2;
 		break;
 	case 270:
-		x1 = fb->height - clip.y2;
-		x2 = fb->height - clip.y1 - 1;
-		y1 = clip.x1;
-		y2 = clip.x2 - 1;
+		x1 = fb->height - rect->y2;
+		x2 = fb->height - rect->y1 - 1;
+		y1 = rect->x1;
+		y2 = rect->x2 - 1;
 		x_start = x2;
 		y_start = y1;
 		break;
@@ -152,16 +153,29 @@ static int ili9225_fb_dirty(struct drm_framebuffer *fb,
 	ili9225_command(mipi, ILI9225_RAM_ADDRESS_SET_2, y_start);
 
 	ret = mipi_dbi_command_buf(mipi, ILI9225_WRITE_DATA_TO_GRAM, tr,
-				(clip.x2 - clip.x1) * (clip.y2 - clip.y1) * 2);
-
-	return ret;
+				   width * height * 2);
+err_msg:
+	if (ret)
+		dev_err_once(fb->dev->dev, "Failed to update display %d\n", ret);
 }
 
-static const struct drm_framebuffer_funcs ili9225_fb_funcs = {
-	.destroy	= drm_gem_fb_destroy,
-	.create_handle	= drm_gem_fb_create_handle,
-	.dirty		= tinydrm_fb_dirty,
-};
+static void ili9225_pipe_update(struct drm_simple_display_pipe *pipe,
+				struct drm_plane_state *old_state)
+{
+	struct drm_plane_state *state = pipe->plane.state;
+	struct drm_crtc *crtc = &pipe->crtc;
+	struct drm_rect rect;
+
+	if (drm_atomic_helper_damage_merged(old_state, state, &rect))
+		ili9225_fb_dirty(state->fb, &rect);
+
+	if (crtc->state->event) {
+		spin_lock_irq(&crtc->dev->event_lock);
+		drm_crtc_send_vblank_event(crtc, crtc->state->event);
+		spin_unlock_irq(&crtc->dev->event_lock);
+		crtc->state->event = NULL;
+	}
+}
 
 static void ili9225_pipe_enable(struct drm_simple_display_pipe *pipe,
 				struct drm_crtc_state *crtc_state,
@@ -169,7 +183,14 @@ static void ili9225_pipe_enable(struct drm_simple_display_pipe *pipe,
 {
 	struct tinydrm_device *tdev = pipe_to_tinydrm(pipe);
 	struct mipi_dbi *mipi = mipi_dbi_from_tinydrm(tdev);
+	struct drm_framebuffer *fb = plane_state->fb;
 	struct device *dev = tdev->drm->dev;
+	struct drm_rect rect = {
+		.x1 = 0,
+		.x2 = fb->width,
+		.y1 = 0,
+		.y2 = fb->height,
+	};
 	int ret;
 	u8 am_id;
 
@@ -257,7 +278,8 @@ static void ili9225_pipe_enable(struct drm_simple_display_pipe *pipe,
 
 	ili9225_command(mipi, ILI9225_DISPLAY_CONTROL_1, 0x1017);
 
-	mipi_dbi_enable_flush(mipi, crtc_state, plane_state);
+	mipi->enabled = true;
+	ili9225_fb_dirty(fb, &rect);
 }
 
 static void ili9225_pipe_disable(struct drm_simple_display_pipe *pipe)
@@ -302,59 +324,10 @@ static int ili9225_dbi_command(struct mipi_dbi *mipi, u8 cmd, u8 *par,
 	return tinydrm_spi_transfer(spi, speed_hz, NULL, bpw, par, num);
 }
 
-static const u32 ili9225_formats[] = {
-	DRM_FORMAT_RGB565,
-	DRM_FORMAT_XRGB8888,
-};
-
-static int ili9225_init(struct device *dev, struct mipi_dbi *mipi,
-			const struct drm_simple_display_pipe_funcs *pipe_funcs,
-			struct drm_driver *driver,
-			const struct drm_display_mode *mode,
-			unsigned int rotation)
-{
-	size_t bufsize = mode->vdisplay * mode->hdisplay * sizeof(u16);
-	struct tinydrm_device *tdev = &mipi->tinydrm;
-	int ret;
-
-	if (!mipi->command)
-		return -EINVAL;
-
-	mutex_init(&mipi->cmdlock);
-
-	mipi->tx_buf = devm_kmalloc(dev, bufsize, GFP_KERNEL);
-	if (!mipi->tx_buf)
-		return -ENOMEM;
-
-	ret = devm_tinydrm_init(dev, tdev, &ili9225_fb_funcs, driver);
-	if (ret)
-		return ret;
-
-	tdev->fb_dirty = ili9225_fb_dirty;
-
-	ret = tinydrm_display_pipe_init(tdev, pipe_funcs,
-					DRM_MODE_CONNECTOR_VIRTUAL,
-					ili9225_formats,
-					ARRAY_SIZE(ili9225_formats), mode,
-					rotation);
-	if (ret)
-		return ret;
-
-	tdev->drm->mode_config.preferred_depth = 16;
-	mipi->rotation = rotation;
-
-	drm_mode_config_reset(tdev->drm);
-
-	DRM_DEBUG_KMS("preferred_depth=%u, rotation = %u\n",
-		      tdev->drm->mode_config.preferred_depth, rotation);
-
-	return 0;
-}
-
 static const struct drm_simple_display_pipe_funcs ili9225_pipe_funcs = {
 	.enable		= ili9225_pipe_enable,
 	.disable	= ili9225_pipe_disable,
-	.update		= tinydrm_display_pipe_update,
+	.update		= ili9225_pipe_update,
 	.prepare_fb	= drm_gem_fb_simple_display_pipe_prepare_fb,
 };
 
@@ -421,8 +394,8 @@ static int ili9225_probe(struct spi_device *spi)
 	/* override the command function set in mipi_dbi_spi_init() */
 	mipi->command = ili9225_dbi_command;
 
-	ret = ili9225_init(&spi->dev, mipi, &ili9225_pipe_funcs,
-			   &ili9225_driver, &ili9225_mode, rotation);
+	ret = mipi_dbi_init(&spi->dev, mipi, &ili9225_pipe_funcs,
+			    &ili9225_driver, &ili9225_mode, rotation);
 	if (ret)
 		return ret;
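
The rotation switch in ili9225_fb_dirty() above remaps the damage rect from
framebuffer coordinates into panel RAM coordinates. A worked instance of
the 90-degree branch (illustrative numbers, e.g. a 176-pixel-wide
framebuffer):

#include <stdio.h>

int main(void)
{
	unsigned int fb_width = 176;
	unsigned int rx1 = 10, rx2 = 20, ry1 = 30, ry2 = 40;	/* fb damage */
	unsigned int x1 = ry1, x2 = ry2 - 1;			/* panel cols */
	unsigned int y1 = fb_width - rx2;			/* panel rows */
	unsigned int y2 = fb_width - rx1 - 1;

	/* prints x=[30,39] y=[156,165], RAM write starts at (30,165) */
	printf("x=[%u,%u] y=[%u,%u], start=(%u,%u)\n",
	       x1, x2, y1, y2, x1, y2);
	return 0;
}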
 
diff --git a/drivers/gpu/drm/tinydrm/ili9341.c b/drivers/gpu/drm/tinydrm/ili9341.c
index 51395bdc6ca2..713bb2dd7e04 100644
--- a/drivers/gpu/drm/tinydrm/ili9341.c
+++ b/drivers/gpu/drm/tinydrm/ili9341.c
@@ -15,6 +15,7 @@
 #include <linux/property.h>
 #include <linux/spi/spi.h>
 
+#include <drm/drm_drv.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_modeset_helper.h>
@@ -131,7 +132,7 @@ out_enable:
 static const struct drm_simple_display_pipe_funcs ili9341_pipe_funcs = {
 	.enable = yx240qv29_enable,
 	.disable = mipi_dbi_pipe_disable,
-	.update = tinydrm_display_pipe_update,
+	.update = mipi_dbi_pipe_update,
 	.prepare_fb = drm_gem_fb_simple_display_pipe_prepare_fb,
 };
 
diff --git a/drivers/gpu/drm/tinydrm/mi0283qt.c b/drivers/gpu/drm/tinydrm/mi0283qt.c
index 3fa62e77c30b..82a92ec9ae3c 100644
--- a/drivers/gpu/drm/tinydrm/mi0283qt.c
+++ b/drivers/gpu/drm/tinydrm/mi0283qt.c
@@ -17,6 +17,7 @@
 #include <linux/regulator/consumer.h>
 #include <linux/spi/spi.h>
 
+#include <drm/drm_drv.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_modeset_helper.h>
@@ -139,7 +140,7 @@ out_enable:
 static const struct drm_simple_display_pipe_funcs mi0283qt_pipe_funcs = {
 	.enable = mi0283qt_enable,
 	.disable = mipi_dbi_pipe_disable,
-	.update = tinydrm_display_pipe_update,
+	.update = mipi_dbi_pipe_update,
 	.prepare_fb = drm_gem_fb_simple_display_pipe_prepare_fb,
 };
 
diff --git a/drivers/gpu/drm/tinydrm/mipi-dbi.c b/drivers/gpu/drm/tinydrm/mipi-dbi.c
index 3a05e56f9b0d..918f77c7de34 100644
--- a/drivers/gpu/drm/tinydrm/mipi-dbi.c
+++ b/drivers/gpu/drm/tinydrm/mipi-dbi.c
@@ -10,18 +10,23 @@
  */
 
 #include <linux/debugfs.h>
+#include <linux/delay.h>
 #include <linux/dma-buf.h>
 #include <linux/gpio/consumer.h>
 #include <linux/module.h>
 #include <linux/regulator/consumer.h>
 #include <linux/spi/spi.h>
 
+#include <drm/drm_damage_helper.h>
+#include <drm/drm_drv.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
+#include <drm/drm_fourcc.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_vblank.h>
+#include <drm/drm_rect.h>
 #include <drm/tinydrm/mipi-dbi.h>
 #include <drm/tinydrm/tinydrm-helpers.h>
-#include <uapi/drm/drm.h>
 #include <video/mipi_display.h>
 
 #define MIPI_DBI_MAX_SPI_READ_SPEED 2000000 /* 2MHz */
@@ -169,7 +174,7 @@ EXPORT_SYMBOL(mipi_dbi_command_buf);
  * Zero on success, negative error code on failure.
  */
 int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer *fb,
-		      struct drm_clip_rect *clip, bool swap)
+		      struct drm_rect *clip, bool swap)
 {
 	struct drm_gem_cma_object *cma_obj = drm_fb_cma_get_gem_obj(fb, 0);
 	struct dma_buf_attachment *import_attach = cma_obj->base.import_attach;
@@ -208,58 +213,75 @@ int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer *fb,
 }
 EXPORT_SYMBOL(mipi_dbi_buf_copy);
 
-static int mipi_dbi_fb_dirty(struct drm_framebuffer *fb,
-			     struct drm_file *file_priv,
-			     unsigned int flags, unsigned int color,
-			     struct drm_clip_rect *clips,
-			     unsigned int num_clips)
+static void mipi_dbi_fb_dirty(struct drm_framebuffer *fb, struct drm_rect *rect)
 {
 	struct drm_gem_cma_object *cma_obj = drm_fb_cma_get_gem_obj(fb, 0);
 	struct tinydrm_device *tdev = fb->dev->dev_private;
 	struct mipi_dbi *mipi = mipi_dbi_from_tinydrm(tdev);
+	unsigned int height = rect->y2 - rect->y1;
+	unsigned int width = rect->x2 - rect->x1;
 	bool swap = mipi->swap_bytes;
-	struct drm_clip_rect clip;
 	int ret = 0;
 	bool full;
 	void *tr;
 
 	if (!mipi->enabled)
-		return 0;
+		return;
 
-	full = tinydrm_merge_clips(&clip, clips, num_clips, flags,
-				   fb->width, fb->height);
+	full = width == fb->width && height == fb->height;
 
-	DRM_DEBUG("Flushing [FB:%d] x1=%u, x2=%u, y1=%u, y2=%u\n", fb->base.id,
-		  clip.x1, clip.x2, clip.y1, clip.y2);
+	DRM_DEBUG_KMS("Flushing [FB:%d] " DRM_RECT_FMT "\n", fb->base.id, DRM_RECT_ARG(rect));
 
 	if (!mipi->dc || !full || swap ||
 	    fb->format->format == DRM_FORMAT_XRGB8888) {
 		tr = mipi->tx_buf;
-		ret = mipi_dbi_buf_copy(mipi->tx_buf, fb, &clip, swap);
+		ret = mipi_dbi_buf_copy(mipi->tx_buf, fb, rect, swap);
 		if (ret)
-			return ret;
+			goto err_msg;
 	} else {
 		tr = cma_obj->vaddr;
 	}
 
 	mipi_dbi_command(mipi, MIPI_DCS_SET_COLUMN_ADDRESS,
-			 (clip.x1 >> 8) & 0xFF, clip.x1 & 0xFF,
-			 ((clip.x2 - 1) >> 8) & 0xFF, (clip.x2 - 1) & 0xFF);
+			 (rect->x1 >> 8) & 0xff, rect->x1 & 0xff,
+			 ((rect->x2 - 1) >> 8) & 0xff, (rect->x2 - 1) & 0xff);
 	mipi_dbi_command(mipi, MIPI_DCS_SET_PAGE_ADDRESS,
-			 (clip.y1 >> 8) & 0xFF, clip.y1 & 0xFF,
-			 ((clip.y2 - 1) >> 8) & 0xFF, (clip.y2 - 1) & 0xFF);
+			 (rect->y1 >> 8) & 0xff, rect->y1 & 0xff,
+			 ((rect->y2 - 1) >> 8) & 0xff, (rect->y2 - 1) & 0xff);
 
 	ret = mipi_dbi_command_buf(mipi, MIPI_DCS_WRITE_MEMORY_START, tr,
-				(clip.x2 - clip.x1) * (clip.y2 - clip.y1) * 2);
-
-	return ret;
+				   width * height * 2);
+err_msg:
+	if (ret)
+		dev_err_once(fb->dev->dev, "Failed to update display %d\n", ret);
 }
 
-static const struct drm_framebuffer_funcs mipi_dbi_fb_funcs = {
-	.destroy	= drm_gem_fb_destroy,
-	.create_handle	= drm_gem_fb_create_handle,
-	.dirty		= tinydrm_fb_dirty,
-};
+/**
+ * mipi_dbi_pipe_update - Display pipe update helper
+ * @pipe: Simple display pipe
+ * @old_state: Old plane state
+ *
+ * This function handles framebuffer flushing and vblank events. Drivers can use
+ * this as their &drm_simple_display_pipe_funcs->update callback.
+ */
+void mipi_dbi_pipe_update(struct drm_simple_display_pipe *pipe,
+			  struct drm_plane_state *old_state)
+{
+	struct drm_plane_state *state = pipe->plane.state;
+	struct drm_crtc *crtc = &pipe->crtc;
+	struct drm_rect rect;
+
+	if (drm_atomic_helper_damage_merged(old_state, state, &rect))
+		mipi_dbi_fb_dirty(state->fb, &rect);
+
+	if (crtc->state->event) {
+		spin_lock_irq(&crtc->dev->event_lock);
+		drm_crtc_send_vblank_event(crtc, crtc->state->event);
+		spin_unlock_irq(&crtc->dev->event_lock);
+		crtc->state->event = NULL;
+	}
+}
+EXPORT_SYMBOL(mipi_dbi_pipe_update);
 
 /**
  * mipi_dbi_enable_flush - MIPI DBI enable helper
@@ -270,18 +292,25 @@ static const struct drm_framebuffer_funcs mipi_dbi_fb_funcs = {
  * This function sets &mipi_dbi->enabled, flushes the whole framebuffer and
  * enables the backlight. Drivers can use this in their
  * &drm_simple_display_pipe_funcs->enable callback.
+ *
+ * Note: Drivers that don't use mipi_dbi_pipe_update() because they have
+ * custom framebuffer flushing can't use this function either, since both
+ * helpers share the same flushing code.
  */
 void mipi_dbi_enable_flush(struct mipi_dbi *mipi,
 			   struct drm_crtc_state *crtc_state,
 			   struct drm_plane_state *plane_state)
 {
-	struct tinydrm_device *tdev = &mipi->tinydrm;
 	struct drm_framebuffer *fb = plane_state->fb;
+	struct drm_rect rect = {
+		.x1 = 0,
+		.x2 = fb->width,
+		.y1 = 0,
+		.y2 = fb->height,
+	};
 
 	mipi->enabled = true;
-	if (fb)
-		tdev->fb_dirty(fb, NULL, 0, 0, NULL, 0);
-
+	mipi_dbi_fb_dirty(fb, &rect);
 	backlight_enable(mipi->backlight);
 }
 EXPORT_SYMBOL(mipi_dbi_enable_flush);
@@ -373,12 +402,10 @@ int mipi_dbi_init(struct device *dev, struct mipi_dbi *mipi,
 	if (!mipi->tx_buf)
 		return -ENOMEM;
 
-	ret = devm_tinydrm_init(dev, tdev, &mipi_dbi_fb_funcs, driver);
+	ret = devm_tinydrm_init(dev, tdev, driver);
 	if (ret)
 		return ret;
 
-	tdev->fb_dirty = mipi_dbi_fb_dirty;
-
 	/* TODO: Maybe add DRM_MODE_CONNECTOR_SPI */
 	ret = tinydrm_display_pipe_init(tdev, pipe_funcs,
 					DRM_MODE_CONNECTOR_VIRTUAL,
@@ -388,6 +415,8 @@ int mipi_dbi_init(struct device *dev, struct mipi_dbi *mipi,
 	if (ret)
 		return ret;
 
+	drm_plane_enable_fb_damage_clips(&tdev->pipe.plane);
+
 	tdev->drm->mode_config.preferred_depth = 16;
 	mipi->rotation = rotation;
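
mipi_dbi_fb_dirty() programs the write window with
MIPI_DCS_SET_COLUMN_ADDRESS and MIPI_DCS_SET_PAGE_ADDRESS, whose four-byte
payloads are big-endian start and inclusive end coordinates. The packing,
isolated (a sketch; the driver passes the bytes straight to
mipi_dbi_command()):

#include <stdint.h>

static void pack_column_address(uint8_t out[4], uint16_t x1, uint16_t x2)
{
	uint16_t end = x2 - 1;	/* rect->x2 is exclusive, DCS end is inclusive */

	out[0] = (x1 >> 8) & 0xff;
	out[1] = x1 & 0xff;
	out[2] = (end >> 8) & 0xff;
	out[3] = end & 0xff;
}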
 
diff --git a/drivers/gpu/drm/tinydrm/repaper.c b/drivers/gpu/drm/tinydrm/repaper.c
index 54d6fe0f37ce..b037c6540cf3 100644
--- a/drivers/gpu/drm/tinydrm/repaper.c
+++ b/drivers/gpu/drm/tinydrm/repaper.c
@@ -26,9 +26,13 @@
 #include <linux/spi/spi.h>
 #include <linux/thermal.h>
 
+#include <drm/drm_damage_helper.h>
+#include <drm/drm_drv.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_rect.h>
+#include <drm/drm_vblank.h>
 #include <drm/tinydrm/tinydrm.h>
 #include <drm/tinydrm/tinydrm-helpers.h>
 
@@ -521,17 +525,13 @@ static void repaper_gray8_to_mono_reversed(u8 *buf, u32 width, u32 height)
 		}
 }
 
-static int repaper_fb_dirty(struct drm_framebuffer *fb,
-			    struct drm_file *file_priv,
-			    unsigned int flags, unsigned int color,
-			    struct drm_clip_rect *clips,
-			    unsigned int num_clips)
+static int repaper_fb_dirty(struct drm_framebuffer *fb)
 {
 	struct drm_gem_cma_object *cma_obj = drm_fb_cma_get_gem_obj(fb, 0);
 	struct dma_buf_attachment *import_attach = cma_obj->base.import_attach;
 	struct tinydrm_device *tdev = fb->dev->dev_private;
 	struct repaper_epd *epd = epd_from_tinydrm(tdev);
-	struct drm_clip_rect clip;
+	struct drm_rect clip;
 	u8 *buf = NULL;
 	int ret = 0;
 
@@ -624,12 +624,6 @@ out_free:
 	return ret;
 }
 
-static const struct drm_framebuffer_funcs repaper_fb_funcs = {
-	.destroy	= drm_gem_fb_destroy,
-	.create_handle	= drm_gem_fb_create_handle,
-	.dirty		= tinydrm_fb_dirty,
-};
-
 static void power_off(struct repaper_epd *epd)
 {
 	/* Turn off power and all signals */
@@ -793,9 +787,7 @@ static void repaper_pipe_disable(struct drm_simple_display_pipe *pipe)
 
 	DRM_DEBUG_DRIVER("\n");
 
-	mutex_lock(&tdev->dirty_lock);
 	epd->enabled = false;
-	mutex_unlock(&tdev->dirty_lock);
 
 	/* Nothing frame */
 	for (line = 0; line < epd->height; line++)
@@ -838,10 +830,28 @@ static void repaper_pipe_disable(struct drm_simple_display_pipe *pipe)
 	power_off(epd);
 }
 
+static void repaper_pipe_update(struct drm_simple_display_pipe *pipe,
+				struct drm_plane_state *old_state)
+{
+	struct drm_plane_state *state = pipe->plane.state;
+	struct drm_crtc *crtc = &pipe->crtc;
+	struct drm_rect rect;
+
+	if (drm_atomic_helper_damage_merged(old_state, state, &rect))
+		repaper_fb_dirty(state->fb);
+
+	if (crtc->state->event) {
+		spin_lock_irq(&crtc->dev->event_lock);
+		drm_crtc_send_vblank_event(crtc, crtc->state->event);
+		spin_unlock_irq(&crtc->dev->event_lock);
+		crtc->state->event = NULL;
+	}
+}
+
 static const struct drm_simple_display_pipe_funcs repaper_pipe_funcs = {
 	.enable = repaper_pipe_enable,
 	.disable = repaper_pipe_disable,
-	.update = tinydrm_display_pipe_update,
+	.update = repaper_pipe_update,
 	.prepare_fb = drm_gem_fb_simple_display_pipe_prepare_fb,
 };
 
@@ -1055,12 +1065,10 @@ static int repaper_probe(struct spi_device *spi)
 
 	tdev = &epd->tinydrm;
 
-	ret = devm_tinydrm_init(dev, tdev, &repaper_fb_funcs, &repaper_driver);
+	ret = devm_tinydrm_init(dev, tdev, &repaper_driver);
 	if (ret)
 		return ret;
 
-	tdev->fb_dirty = repaper_fb_dirty;
-
 	ret = tinydrm_display_pipe_init(tdev, &repaper_pipe_funcs,
 					DRM_MODE_CONNECTOR_VIRTUAL,
 					repaper_formats,
diff --git a/drivers/gpu/drm/tinydrm/st7586.c b/drivers/gpu/drm/tinydrm/st7586.c
index a6a8a1081b73..01a8077954b3 100644
--- a/drivers/gpu/drm/tinydrm/st7586.c
+++ b/drivers/gpu/drm/tinydrm/st7586.c
@@ -17,9 +17,13 @@
 #include <linux/spi/spi.h>
 #include <video/mipi_display.h>
 
+#include <drm/drm_damage_helper.h>
+#include <drm/drm_drv.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_rect.h>
+#include <drm/drm_vblank.h>
 #include <drm/tinydrm/mipi-dbi.h>
 #include <drm/tinydrm/tinydrm-helpers.h>
 
@@ -61,7 +65,7 @@ static const u8 st7586_lookup[] = { 0x7, 0x4, 0x2, 0x0 };
 
 static void st7586_xrgb8888_to_gray332(u8 *dst, void *vaddr,
 				       struct drm_framebuffer *fb,
-				       struct drm_clip_rect *clip)
+				       struct drm_rect *clip)
 {
 	size_t len = (clip->x2 - clip->x1) * (clip->y2 - clip->y1);
 	unsigned int x, y;
@@ -87,7 +91,7 @@ static void st7586_xrgb8888_to_gray332(u8 *dst, void *vaddr,
 }
 
 static int st7586_buf_copy(void *dst, struct drm_framebuffer *fb,
-			   struct drm_clip_rect *clip)
+			   struct drm_rect *clip)
 {
 	struct drm_gem_cma_object *cma_obj = drm_fb_cma_get_gem_obj(fb, 0);
 	struct dma_buf_attachment *import_attach = cma_obj->base.import_attach;
@@ -110,57 +114,62 @@ static int st7586_buf_copy(void *dst, struct drm_framebuffer *fb,
 	return ret;
 }
 
-static int st7586_fb_dirty(struct drm_framebuffer *fb,
-			   struct drm_file *file_priv, unsigned int flags,
-			   unsigned int color, struct drm_clip_rect *clips,
-			   unsigned int num_clips)
+static void st7586_fb_dirty(struct drm_framebuffer *fb, struct drm_rect *rect)
 {
 	struct tinydrm_device *tdev = fb->dev->dev_private;
 	struct mipi_dbi *mipi = mipi_dbi_from_tinydrm(tdev);
-	struct drm_clip_rect clip;
 	int start, end;
 	int ret = 0;
 
 	if (!mipi->enabled)
-		return 0;
-
-	tinydrm_merge_clips(&clip, clips, num_clips, flags, fb->width,
-			    fb->height);
+		return;
 
 	/* 3 pixels per byte, so grow clip to nearest multiple of 3 */
-	clip.x1 = rounddown(clip.x1, 3);
-	clip.x2 = roundup(clip.x2, 3);
+	rect->x1 = rounddown(rect->x1, 3);
+	rect->x2 = roundup(rect->x2, 3);
 
-	DRM_DEBUG("Flushing [FB:%d] x1=%u, x2=%u, y1=%u, y2=%u\n", fb->base.id,
-		  clip.x1, clip.x2, clip.y1, clip.y2);
+	DRM_DEBUG_KMS("Flushing [FB:%d] " DRM_RECT_FMT "\n", fb->base.id, DRM_RECT_ARG(rect));
 
-	ret = st7586_buf_copy(mipi->tx_buf, fb, &clip);
+	ret = st7586_buf_copy(mipi->tx_buf, fb, rect);
 	if (ret)
-		return ret;
+		goto err_msg;
 
 	/* Pixels are packed 3 per byte */
-	start = clip.x1 / 3;
-	end = clip.x2 / 3;
+	start = rect->x1 / 3;
+	end = rect->x2 / 3;
 
 	mipi_dbi_command(mipi, MIPI_DCS_SET_COLUMN_ADDRESS,
 			 (start >> 8) & 0xFF, start & 0xFF,
 			 (end >> 8) & 0xFF, (end - 1) & 0xFF);
 	mipi_dbi_command(mipi, MIPI_DCS_SET_PAGE_ADDRESS,
-			 (clip.y1 >> 8) & 0xFF, clip.y1 & 0xFF,
-			 (clip.y2 >> 8) & 0xFF, (clip.y2 - 1) & 0xFF);
+			 (rect->y1 >> 8) & 0xFF, rect->y1 & 0xFF,
+			 (rect->y2 >> 8) & 0xFF, (rect->y2 - 1) & 0xFF);
 
 	ret = mipi_dbi_command_buf(mipi, MIPI_DCS_WRITE_MEMORY_START,
 				   (u8 *)mipi->tx_buf,
-				   (end - start) * (clip.y2 - clip.y1));
-
-	return ret;
+				   (end - start) * (rect->y2 - rect->y1));
+err_msg:
+	if (ret)
+		dev_err_once(fb->dev->dev, "Failed to update display %d\n", ret);
 }
 
-static const struct drm_framebuffer_funcs st7586_fb_funcs = {
-	.destroy	= drm_gem_fb_destroy,
-	.create_handle	= drm_gem_fb_create_handle,
-	.dirty		= tinydrm_fb_dirty,
-};
+static void st7586_pipe_update(struct drm_simple_display_pipe *pipe,
+			       struct drm_plane_state *old_state)
+{
+	struct drm_plane_state *state = pipe->plane.state;
+	struct drm_crtc *crtc = &pipe->crtc;
+	struct drm_rect rect;
+
+	if (drm_atomic_helper_damage_merged(old_state, state, &rect))
+		st7586_fb_dirty(state->fb, &rect);
+
+	if (crtc->state->event) {
+		spin_lock_irq(&crtc->dev->event_lock);
+		drm_crtc_send_vblank_event(crtc, crtc->state->event);
+		spin_unlock_irq(&crtc->dev->event_lock);
+		crtc->state->event = NULL;
+	}
+}
 
 static void st7586_pipe_enable(struct drm_simple_display_pipe *pipe,
 			       struct drm_crtc_state *crtc_state,
@@ -168,6 +177,13 @@ static void st7586_pipe_enable(struct drm_simple_display_pipe *pipe,
 {
 	struct tinydrm_device *tdev = pipe_to_tinydrm(pipe);
 	struct mipi_dbi *mipi = mipi_dbi_from_tinydrm(tdev);
+	struct drm_framebuffer *fb = plane_state->fb;
+	struct drm_rect rect = {
+		.x1 = 0,
+		.x2 = fb->width,
+		.y1 = 0,
+		.y2 = fb->height,
+	};
 	int ret;
 	u8 addr_mode;
 
@@ -224,9 +240,10 @@ static void st7586_pipe_enable(struct drm_simple_display_pipe *pipe,
 
 	msleep(100);
 
-	mipi_dbi_command(mipi, MIPI_DCS_SET_DISPLAY_ON);
+	mipi->enabled = true;
+	st7586_fb_dirty(fb, &rect);
 
-	mipi_dbi_enable_flush(mipi, crtc_state, plane_state);
+	mipi_dbi_command(mipi, MIPI_DCS_SET_DISPLAY_ON);
 }
 
 static void st7586_pipe_disable(struct drm_simple_display_pipe *pipe)
@@ -262,12 +279,10 @@ static int st7586_init(struct device *dev, struct mipi_dbi *mipi,
 	if (!mipi->tx_buf)
 		return -ENOMEM;
 
-	ret = devm_tinydrm_init(dev, tdev, &st7586_fb_funcs, driver);
+	ret = devm_tinydrm_init(dev, tdev, driver);
 	if (ret)
 		return ret;
 
-	tdev->fb_dirty = st7586_fb_dirty;
-
 	ret = tinydrm_display_pipe_init(tdev, pipe_funcs,
 					DRM_MODE_CONNECTOR_VIRTUAL,
 					st7586_formats,
@@ -276,6 +291,8 @@ static int st7586_init(struct device *dev, struct mipi_dbi *mipi,
 	if (ret)
 		return ret;
 
+	drm_plane_enable_fb_damage_clips(&tdev->pipe.plane);
+
 	tdev->drm->mode_config.preferred_depth = 32;
 	mipi->rotation = rotation;
 
@@ -290,7 +307,7 @@ static int st7586_init(struct device *dev, struct mipi_dbi *mipi,
 static const struct drm_simple_display_pipe_funcs st7586_pipe_funcs = {
 	.enable		= st7586_pipe_enable,
 	.disable	= st7586_pipe_disable,
-	.update		= tinydrm_display_pipe_update,
+	.update		= st7586_pipe_update,
 	.prepare_fb	= drm_gem_fb_simple_display_pipe_prepare_fb,
 };
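
Because the ST7586 stores three pixels per byte, st7586_fb_dirty() above
first widens the damage rect to multiples of 3 and then converts x
coordinates to byte columns for the column-address command. The alignment
in isolation (hypothetical helper):

static void st7586_align(unsigned int *x1, unsigned int *x2,
			 unsigned int *start, unsigned int *end)
{
	*x1 = (*x1 / 3) * 3;		/* rounddown(x1, 3) */
	*x2 = ((*x2 + 2) / 3) * 3;	/* roundup(x2, 3) */
	*start = *x1 / 3;		/* first byte column */
	*end = *x2 / 3;			/* one past the last byte column */
}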
 
diff --git a/drivers/gpu/drm/tinydrm/st7735r.c b/drivers/gpu/drm/tinydrm/st7735r.c
index b39779e0dcd8..3bab9a9569a6 100644
--- a/drivers/gpu/drm/tinydrm/st7735r.c
+++ b/drivers/gpu/drm/tinydrm/st7735r.c
@@ -14,6 +14,7 @@
 #include <linux/spi/spi.h>
 #include <video/mipi_display.h>
 
+#include <drm/drm_drv.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/tinydrm/mipi-dbi.h>
@@ -105,7 +106,7 @@ static void jd_t18003_t01_pipe_enable(struct drm_simple_display_pipe *pipe,
 static const struct drm_simple_display_pipe_funcs jd_t18003_t01_pipe_funcs = {
 	.enable		= jd_t18003_t01_pipe_enable,
 	.disable	= mipi_dbi_pipe_disable,
-	.update		= tinydrm_display_pipe_update,
+	.update		= mipi_dbi_pipe_update,
 	.prepare_fb	= drm_gem_fb_simple_display_pipe_prepare_fb,
 };
 
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 0ec08394e17a..3f56647cdb35 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -198,19 +198,22 @@ static void ttm_bo_ref_bug(struct kref *list_kref)
 
 void ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
 {
+	struct ttm_bo_device *bdev = bo->bdev;
+	bool notify = false;
+
 	if (!list_empty(&bo->swap)) {
 		list_del_init(&bo->swap);
 		kref_put(&bo->list_kref, ttm_bo_ref_bug);
+		notify = true;
 	}
 	if (!list_empty(&bo->lru)) {
 		list_del_init(&bo->lru);
 		kref_put(&bo->list_kref, ttm_bo_ref_bug);
+		notify = true;
 	}
 
-	/*
-	 * TODO: Add a driver hook to delete from
-	 * driver-specific LRU's here.
-	 */
+	if (notify && bdev->driver->del_from_lru_notify)
+		bdev->driver->del_from_lru_notify(bo);
 }
 
 void ttm_bo_del_sub_from_lru(struct ttm_buffer_object *bo)
@@ -676,15 +679,6 @@ void ttm_bo_put(struct ttm_buffer_object *bo)
 }
 EXPORT_SYMBOL(ttm_bo_put);
 
-void ttm_bo_unref(struct ttm_buffer_object **p_bo)
-{
-	struct ttm_buffer_object *bo = *p_bo;
-
-	*p_bo = NULL;
-	ttm_bo_put(bo);
-}
-EXPORT_SYMBOL(ttm_bo_unref);
-
 int ttm_bo_lock_delayed_workqueue(struct ttm_bo_device *bdev)
 {
 	return cancel_delayed_work_sync(&bdev->wq);
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index a1d977fbade5..e86a29a1e51f 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -71,7 +71,7 @@ static vm_fault_t ttm_bo_vm_fault_idle(struct ttm_buffer_object *bo,
 		ttm_bo_get(bo);
 		up_read(&vmf->vma->vm_mm->mmap_sem);
 		(void) dma_fence_wait(bo->moving, true);
-		ttm_bo_unreserve(bo);
+		reservation_object_unlock(bo->resv);
 		ttm_bo_put(bo);
 		goto out_unlock;
 	}
@@ -131,11 +131,7 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
 	 * for reserve, and if it fails, retry the fault after waiting
 	 * for the buffer to become unreserved.
 	 */
-	err = ttm_bo_reserve(bo, true, true, NULL);
-	if (unlikely(err != 0)) {
-		if (err != -EBUSY)
-			return VM_FAULT_NOPAGE;
-
+	if (unlikely(!reservation_object_trylock(bo->resv))) {
 		if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
 			if (!(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) {
 				ttm_bo_get(bo);
@@ -165,6 +161,8 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
 	}
 
 	if (bdev->driver->fault_reserve_notify) {
+		struct dma_fence *moving = dma_fence_get(bo->moving);
+
 		err = bdev->driver->fault_reserve_notify(bo);
 		switch (err) {
 		case 0:
@@ -177,6 +175,13 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
 			ret = VM_FAULT_SIGBUS;
 			goto out_unlock;
 		}
+
+		if (bo->moving != moving) {
+			spin_lock(&bdev->glob->lru_lock);
+			ttm_bo_move_to_lru_tail(bo, NULL);
+			spin_unlock(&bdev->glob->lru_lock);
+		}
+		dma_fence_put(moving);
 	}
 
 	/*
@@ -291,7 +296,7 @@ static vm_fault_t ttm_bo_vm_fault(struct vm_fault *vmf)
 out_io_unlock:
 	ttm_mem_io_unlock(man);
 out_unlock:
-	ttm_bo_unreserve(bo);
+	reservation_object_unlock(bo->resv);
 	return ret;
 }
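
The conversion above trades the ttm_bo_reserve() wrapper for a bare
reservation_object trylock so the fault handler never sleeps on the lock while
holding mmap_sem. A condensed sketch of the idiom under those assumptions
(example_fault() and the elided fault work are hypothetical):

	static vm_fault_t example_fault(struct vm_fault *vmf,
					struct ttm_buffer_object *bo)
	{
		vm_fault_t ret = VM_FAULT_NOPAGE;

		if (!reservation_object_trylock(bo->resv)) {
			if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
				/* Wait outside mmap_sem, then retry. */
				ttm_bo_get(bo);
				up_read(&vmf->vma->vm_mm->mmap_sem);
				if (!reservation_object_lock_interruptible(bo->resv,
									   NULL))
					reservation_object_unlock(bo->resv);
				ttm_bo_put(bo);
				return VM_FAULT_RETRY;
			}
			return VM_FAULT_NOPAGE;
		}

		/* ... actual fault handling elided ... */

		reservation_object_unlock(bo->resv);
		return ret;
	}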
 
diff --git a/drivers/gpu/drm/tve200/tve200_drv.c b/drivers/gpu/drm/tve200/tve200_drv.c
index 28e2d03c0ccf..d5c6a7ecf232 100644
--- a/drivers/gpu/drm/tve200/tve200_drv.c
+++ b/drivers/gpu/drm/tve200/tve200_drv.c
@@ -43,14 +43,14 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_bridge.h>
+#include <drm/drm_fb_cma_helper.h>
+#include <drm/drm_fb_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
-#include <drm/drm_fb_helper.h>
-#include <drm/drm_fb_cma_helper.h>
-#include <drm/drm_panel.h>
 #include <drm/drm_of.h>
-#include <drm/drm_bridge.h>
+#include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 
 #include "tve200_drm.h"
 
diff --git a/drivers/gpu/drm/udl/udl_connector.c b/drivers/gpu/drm/udl/udl_connector.c
index 68e88bed77ca..66885c24590f 100644
--- a/drivers/gpu/drm/udl/udl_connector.c
+++ b/drivers/gpu/drm/udl/udl_connector.c
@@ -14,6 +14,7 @@
 #include <drm/drm_crtc.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include "udl_connector.h"
 #include "udl_drv.h"
 
diff --git a/drivers/gpu/drm/udl/udl_drv.c b/drivers/gpu/drm/udl/udl_drv.c
index a63e3011e971..22cd2d13e272 100644
--- a/drivers/gpu/drm/udl/udl_drv.c
+++ b/drivers/gpu/drm/udl/udl_drv.c
@@ -9,6 +9,7 @@
 #include <linux/module.h>
 #include <drm/drmP.h>
 #include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include "udl_drv.h"
 
 static int udl_usb_suspend(struct usb_interface *interface,
diff --git a/drivers/gpu/drm/udl/udl_main.c b/drivers/gpu/drm/udl/udl_main.c
index 1b014d92855b..9086d0d1b880 100644
--- a/drivers/gpu/drm/udl/udl_main.c
+++ b/drivers/gpu/drm/udl/udl_main.c
@@ -12,6 +12,7 @@
  */
 #include <drm/drmP.h>
 #include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include "udl_drv.h"
 
 /* -BULK_SIZE as per usb-skeleton. Can we get full page and avoid overhead? */
diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index dcb772a19191..fdda3037f7af 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -308,7 +308,6 @@ void v3d_exec_put(struct v3d_exec_info *exec);
 void v3d_tfu_job_put(struct v3d_tfu_job *exec);
 void v3d_reset(struct v3d_dev *v3d);
 void v3d_invalidate_caches(struct v3d_dev *v3d);
-void v3d_flush_caches(struct v3d_dev *v3d);
 
 /* v3d_irq.c */
 void v3d_irq_init(struct v3d_dev *v3d);
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 05ca6319065e..803f31467ec1 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -130,38 +130,31 @@ v3d_flush_l3(struct v3d_dev *v3d)
 	}
 }
 
-/* Invalidates the (read-only) L2 cache. */
+/* Invalidates the (read-only) L2C cache.  This was the L2 cache for
+ * uniforms and instructions on V3D 3.2.
+ */
 static void
-v3d_invalidate_l2(struct v3d_dev *v3d, int core)
+v3d_invalidate_l2c(struct v3d_dev *v3d, int core)
 {
+	if (v3d->ver > 32)
+		return;
+
 	V3D_CORE_WRITE(core, V3D_CTL_L2CACTL,
 		       V3D_L2CACTL_L2CCLR |
 		       V3D_L2CACTL_L2CENA);
 }
 
-static void
-v3d_invalidate_l1td(struct v3d_dev *v3d, int core)
-{
-	V3D_CORE_WRITE(core, V3D_CTL_L2TCACTL, V3D_L2TCACTL_TMUWCF);
-	if (wait_for(!(V3D_CORE_READ(core, V3D_CTL_L2TCACTL) &
-		       V3D_L2TCACTL_L2TFLS), 100)) {
-		DRM_ERROR("Timeout waiting for L1T write combiner flush\n");
-	}
-}
-
 /* Invalidates texture L2 cachelines */
 static void
 v3d_flush_l2t(struct v3d_dev *v3d, int core)
 {
-	v3d_invalidate_l1td(v3d, core);
-
+	/* While there is a busy bit (V3D_L2TCACTL_L2TFLS), we don't
+	 * need to wait for completion before dispatching the job --
+	 * L2T accesses will be stalled until the flush has completed.
+	 */
 	V3D_CORE_WRITE(core, V3D_CTL_L2TCACTL,
 		       V3D_L2TCACTL_L2TFLS |
 		       V3D_SET_FIELD(V3D_L2TCACTL_FLM_FLUSH, V3D_L2TCACTL_FLM));
-	if (wait_for(!(V3D_CORE_READ(core, V3D_CTL_L2TCACTL) &
-		       V3D_L2TCACTL_L2TFLS), 100)) {
-		DRM_ERROR("Timeout waiting for L2T flush\n");
-	}
 }
 
 /* Invalidates the slice caches.  These are read-only caches. */
@@ -175,35 +168,18 @@ v3d_invalidate_slices(struct v3d_dev *v3d, int core)
 		       V3D_SET_FIELD(0xf, V3D_SLCACTL_ICC));
 }
 
-/* Invalidates texture L2 cachelines */
-static void
-v3d_invalidate_l2t(struct v3d_dev *v3d, int core)
-{
-	V3D_CORE_WRITE(core,
-		       V3D_CTL_L2TCACTL,
-		       V3D_L2TCACTL_L2TFLS |
-		       V3D_SET_FIELD(V3D_L2TCACTL_FLM_CLEAR, V3D_L2TCACTL_FLM));
-	if (wait_for(!(V3D_CORE_READ(core, V3D_CTL_L2TCACTL) &
-		       V3D_L2TCACTL_L2TFLS), 100)) {
-		DRM_ERROR("Timeout waiting for L2T invalidate\n");
-	}
-}
-
 void
 v3d_invalidate_caches(struct v3d_dev *v3d)
 {
+	/* Invalidate the caches from the outside in.  That way if
+	 * another CL's concurrent use of nearby memory were to pull
+	 * an invalidated cacheline back in, we wouldn't leave stale
+	 * data in the inner cache.
+	 */
 	v3d_flush_l3(v3d);
-
-	v3d_invalidate_l2(v3d, 0);
-	v3d_invalidate_slices(v3d, 0);
+	v3d_invalidate_l2c(v3d, 0);
 	v3d_flush_l2t(v3d, 0);
-}
-
-void
-v3d_flush_caches(struct v3d_dev *v3d)
-{
-	v3d_invalidate_l1td(v3d, 0);
-	v3d_invalidate_l2t(v3d, 0);
+	v3d_invalidate_slices(v3d, 0);
 }
 
 static void
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index f7508e907536..4704b2df3688 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -234,18 +234,21 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job)
 	for (q = 0; q < V3D_MAX_QUEUES; q++) {
 		struct drm_gpu_scheduler *sched = &v3d->queue[q].sched;
 
-		kthread_park(sched->thread);
-		drm_sched_hw_job_reset(sched, (sched_job->sched == sched ?
-					       sched_job : NULL));
+		drm_sched_stop(sched);
+
+		if (sched_job)
+			drm_sched_increase_karma(sched_job);
 	}
 
 	/* get the GPU back into the init state */
 	v3d_reset(v3d);
 
+	for (q = 0; q < V3D_MAX_QUEUES; q++)
+		drm_sched_resubmit_jobs(&v3d->queue[q].sched);
+
 	/* Unblock schedulers and restart their jobs. */
 	for (q = 0; q < V3D_MAX_QUEUES; q++) {
-		drm_sched_job_recovery(&v3d->queue[q].sched);
-		kthread_unpark(v3d->queue[q].sched.thread);
+		drm_sched_start(&v3d->queue[q].sched, true);
 	}
 
 	mutex_unlock(&v3d->reset_lock);
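
The hunk above follows the recovery sequence introduced by the scheduler
rework: stop, assign karma, reset, resubmit, restart. For a hypothetical
single-queue driver the timeout handler reduces to (mydrv_* names invented):

	static void mydrv_timedout_job(struct drm_sched_job *bad)
	{
		struct drm_gpu_scheduler *sched = bad->sched;

		drm_sched_stop(sched);		/* park the scheduler */
		drm_sched_increase_karma(bad);	/* blame the hung job */
		mydrv_gpu_reset();		/* bring HW back to init state */
		drm_sched_resubmit_jobs(sched);	/* requeue surviving jobs */
		drm_sched_start(sched, true);	/* unpark, full recovery */
	}
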
diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index 3ce136ba8791..730008d3da76 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -34,8 +34,8 @@
 
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic_uapi.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/clk.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <linux/component.h>
@@ -49,6 +49,13 @@ struct vc4_crtc_state {
 	struct drm_mm_node mm;
 	bool feed_txp;
 	bool txp_armed;
+
+	struct {
+		unsigned int left;
+		unsigned int right;
+		unsigned int top;
+		unsigned int bottom;
+	} margins;
 };
 
 static inline struct vc4_crtc_state *
@@ -624,6 +631,37 @@ static enum drm_mode_status vc4_crtc_mode_valid(struct drm_crtc *crtc,
 	return MODE_OK;
 }
 
+void vc4_crtc_get_margins(struct drm_crtc_state *state,
+			  unsigned int *left, unsigned int *right,
+			  unsigned int *top, unsigned int *bottom)
+{
+	struct vc4_crtc_state *vc4_state = to_vc4_crtc_state(state);
+	struct drm_connector_state *conn_state;
+	struct drm_connector *conn;
+	int i;
+
+	*left = vc4_state->margins.left;
+	*right = vc4_state->margins.right;
+	*top = vc4_state->margins.top;
+	*bottom = vc4_state->margins.bottom;
+
+	/* We have to iterate over all new connector states because
+	 * vc4_crtc_get_margins() might be called before
+	 * vc4_crtc_atomic_check(), which means the margins info in
+	 * vc4_crtc_state might be outdated.
+	 */
+	for_each_new_connector_in_state(state->state, conn, conn_state, i) {
+		if (conn_state->crtc != state->crtc)
+			continue;
+
+		*left = conn_state->tv.margins.left;
+		*right = conn_state->tv.margins.right;
+		*top = conn_state->tv.margins.top;
+		*bottom = conn_state->tv.margins.bottom;
+		break;
+	}
+}
+
 static int vc4_crtc_atomic_check(struct drm_crtc *crtc,
 				 struct drm_crtc_state *state)
 {
@@ -671,6 +709,10 @@ static int vc4_crtc_atomic_check(struct drm_crtc *crtc,
 			vc4_state->feed_txp = false;
 		}
 
+		vc4_state->margins.left = conn_state->tv.margins.left;
+		vc4_state->margins.right = conn_state->tv.margins.right;
+		vc4_state->margins.top = conn_state->tv.margins.top;
+		vc4_state->margins.bottom = conn_state->tv.margins.bottom;
 		break;
 	}
 
@@ -972,6 +1014,7 @@ static struct drm_crtc_state *vc4_crtc_duplicate_state(struct drm_crtc *crtc)
 
 	old_vc4_state = to_vc4_crtc_state(crtc->state);
 	vc4_state->feed_txp = old_vc4_state->feed_txp;
+	vc4_state->margins = old_vc4_state->margins;
 
 	__drm_atomic_helper_crtc_duplicate_state(crtc, &vc4_state->base);
 	return &vc4_state->base;
diff --git a/drivers/gpu/drm/vc4/vc4_dpi.c b/drivers/gpu/drm/vc4/vc4_dpi.c
index f185812970da..169521e547ba 100644
--- a/drivers/gpu/drm/vc4/vc4_dpi.c
+++ b/drivers/gpu/drm/vc4/vc4_dpi.c
@@ -24,10 +24,10 @@
 
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_bridge.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/clk.h>
 #include <linux/component.h>
 #include <linux/of_graph.h>
diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index f6f5cd80c04d..5fcd2f0da7f7 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -175,7 +175,6 @@ static struct drm_driver vc4_drm_driver = {
 	.driver_features = (DRIVER_MODESET |
 			    DRIVER_ATOMIC |
 			    DRIVER_GEM |
-			    DRIVER_HAVE_IRQ |
 			    DRIVER_RENDER |
 			    DRIVER_PRIME |
 			    DRIVER_SYNCOBJ),
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index 4f87b03f837d..2c635f001c71 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -9,6 +9,7 @@
 #include <linux/mm_types.h>
 #include <linux/reservation.h>
 #include <drm/drmP.h>
+#include <drm/drm_util.h>
 #include <drm/drm_encoder.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_atomic.h>
@@ -707,6 +708,9 @@ bool vc4_crtc_get_scanoutpos(struct drm_device *dev, unsigned int crtc_id,
 			     const struct drm_display_mode *mode);
 void vc4_crtc_handle_vblank(struct vc4_crtc *crtc);
 void vc4_crtc_txp_armed(struct drm_crtc_state *state);
+void vc4_crtc_get_margins(struct drm_crtc_state *state,
+			  unsigned int *left, unsigned int *right,
+			  unsigned int *top, unsigned int *bottom);
 
 /* vc4_debugfs.c */
 int vc4_debugfs_init(struct drm_minor *minor);
diff --git a/drivers/gpu/drm/vc4/vc4_dsi.c b/drivers/gpu/drm/vc4/vc4_dsi.c
index 0c607eb33d7e..11702e1d9011 100644
--- a/drivers/gpu/drm/vc4/vc4_dsi.c
+++ b/drivers/gpu/drm/vc4/vc4_dsi.c
@@ -30,11 +30,11 @@
  */
 
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_mipi_dsi.h>
 #include <drm/drm_of.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/clk.h>
 #include <linux/clk-provider.h>
 #include <linux/completion.h>
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index fd5522fd179e..88fd5df7e7dc 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -43,8 +43,8 @@
  */
 
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/clk.h>
 #include <linux/component.h>
 #include <linux/i2c.h>
@@ -109,7 +109,6 @@ struct vc4_hdmi_encoder {
 	struct vc4_encoder base;
 	bool hdmi_monitor;
 	bool limited_rgb_range;
-	bool rgb_range_selectable;
 };
 
 static inline struct vc4_hdmi_encoder *
@@ -280,11 +279,6 @@ static int vc4_hdmi_connector_get_modes(struct drm_connector *connector)
 
 	vc4_encoder->hdmi_monitor = drm_detect_hdmi_monitor(edid);
 
-	if (edid && edid->input & DRM_EDID_INPUT_DIGITAL) {
-		vc4_encoder->rgb_range_selectable =
-			drm_rgb_quant_range_selectable(edid);
-	}
-
 	drm_connector_update_edid_property(connector, edid);
 	ret = drm_add_edid_modes(connector, edid);
 	kfree(edid);
@@ -310,6 +304,7 @@ static struct drm_connector *vc4_hdmi_connector_init(struct drm_device *dev,
 {
 	struct drm_connector *connector;
 	struct vc4_hdmi_connector *hdmi_connector;
+	int ret;
 
 	hdmi_connector = devm_kzalloc(dev->dev, sizeof(*hdmi_connector),
 				      GFP_KERNEL);
@@ -323,6 +318,13 @@ static struct drm_connector *vc4_hdmi_connector_init(struct drm_device *dev,
 			   DRM_MODE_CONNECTOR_HDMIA);
 	drm_connector_helper_add(connector, &vc4_hdmi_connector_helper_funcs);
 
+	/* Create and attach TV margin props to this connector. */
+	ret = drm_mode_create_tv_margin_properties(dev);
+	if (ret)
+		return ERR_PTR(ret);
+
+	drm_connector_attach_tv_margin_properties(connector);
+
 	connector->polled = (DRM_CONNECTOR_POLL_CONNECT |
 			     DRM_CONNECTOR_POLL_DISCONNECT);
 
@@ -408,23 +410,31 @@ static void vc4_hdmi_write_infoframe(struct drm_encoder *encoder,
 static void vc4_hdmi_set_avi_infoframe(struct drm_encoder *encoder)
 {
 	struct vc4_hdmi_encoder *vc4_encoder = to_vc4_hdmi_encoder(encoder);
+	struct vc4_dev *vc4 = encoder->dev->dev_private;
+	struct vc4_hdmi *hdmi = vc4->hdmi;
+	struct drm_connector_state *cstate = hdmi->connector->state;
 	struct drm_crtc *crtc = encoder->crtc;
 	const struct drm_display_mode *mode = &crtc->state->adjusted_mode;
 	union hdmi_infoframe frame;
 	int ret;
 
-	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi, mode, false);
+	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi,
+						       hdmi->connector, mode);
 	if (ret < 0) {
 		DRM_ERROR("couldn't fill AVI infoframe\n");
 		return;
 	}
 
-	drm_hdmi_avi_infoframe_quant_range(&frame.avi, mode,
+	drm_hdmi_avi_infoframe_quant_range(&frame.avi,
+					   hdmi->connector, mode,
 					   vc4_encoder->limited_rgb_range ?
 					   HDMI_QUANTIZATION_RANGE_LIMITED :
-					   HDMI_QUANTIZATION_RANGE_FULL,
-					   vc4_encoder->rgb_range_selectable,
-					   false);
+					   HDMI_QUANTIZATION_RANGE_FULL);
+
+	frame.avi.right_bar = cstate->tv.margins.right;
+	frame.avi.left_bar = cstate->tv.margins.left;
+	frame.avi.top_bar = cstate->tv.margins.top;
+	frame.avi.bottom_bar = cstate->tv.margins.bottom;
 
 	vc4_hdmi_write_infoframe(encoder, &frame);
 }
diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
index 1f94b9affe4b..91b8c72ff361 100644
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -17,9 +17,9 @@
 #include <drm/drm_crtc.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
-#include <drm/drm_plane_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include "vc4_drv.h"
 #include "vc4_regs.h"
 
@@ -432,7 +432,8 @@ int vc4_kms_load(struct drm_device *dev)
 	ctm_state = kzalloc(sizeof(*ctm_state), GFP_KERNEL);
 	if (!ctm_state)
 		return -ENOMEM;
-	drm_atomic_private_obj_init(&vc4->ctm_manager, &ctm_state->base,
+
+	drm_atomic_private_obj_init(dev, &vc4->ctm_manager, &ctm_state->base,
 				    &vc4_ctm_state_funcs);
 
 	drm_mode_config_reset(dev);
diff --git a/drivers/gpu/drm/vc4/vc4_perfmon.c b/drivers/gpu/drm/vc4/vc4_perfmon.c
index 437e7a27f21d..495150415020 100644
--- a/drivers/gpu/drm/vc4/vc4_perfmon.c
+++ b/drivers/gpu/drm/vc4/vc4_perfmon.c
@@ -117,7 +117,7 @@ int vc4_perfmon_create_ioctl(struct drm_device *dev, void *data,
 			return -EINVAL;
 	}
 
-	perfmon = kzalloc(sizeof(*perfmon) + (req->ncounters * sizeof(u64)),
+	perfmon = kzalloc(struct_size(perfmon, counters, req->ncounters),
 			  GFP_KERNEL);
 	if (!perfmon)
 		return -ENOMEM;
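
struct_size() from <linux/overflow.h> replaces the open-coded size arithmetic;
it saturates to SIZE_MAX on overflow, so an oversized ncounters makes the
allocation fail cleanly instead of under-allocating. A self-contained sketch of
the pattern (example_* names invented; the real vc4_perfmon ends in a trailing
u64 counters[] flexible array, which is what the helper keys off):

	#include <linux/overflow.h>
	#include <linux/slab.h>

	struct example_perfmon {
		unsigned int ncounters;
		u64 counters[];	/* trailing flexible array */
	};

	static struct example_perfmon *example_perfmon_alloc(unsigned int n)
	{
		struct example_perfmon *pm;

		/* sizeof(*pm) + n * sizeof(pm->counters[0]), overflow-checked */
		pm = kzalloc(struct_size(pm, counters, n), GFP_KERNEL);
		if (pm)
			pm->ncounters = n;
		return pm;
	}
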
diff --git a/drivers/gpu/drm/vc4/vc4_plane.c b/drivers/gpu/drm/vc4/vc4_plane.c
index 75db62cbe468..d098337c10e9 100644
--- a/drivers/gpu/drm/vc4/vc4_plane.c
+++ b/drivers/gpu/drm/vc4/vc4_plane.c
@@ -258,6 +258,52 @@ static u32 vc4_get_scl_field(struct drm_plane_state *state, int plane)
 	}
 }
 
+static int vc4_plane_margins_adj(struct drm_plane_state *pstate)
+{
+	struct vc4_plane_state *vc4_pstate = to_vc4_plane_state(pstate);
+	unsigned int left, right, top, bottom, adjhdisplay, adjvdisplay;
+	struct drm_crtc_state *crtc_state;
+
+	crtc_state = drm_atomic_get_new_crtc_state(pstate->state,
+						   pstate->crtc);
+
+	vc4_crtc_get_margins(crtc_state, &left, &right, &top, &bottom);
+	if (!left && !right && !top && !bottom)
+		return 0;
+
+	if (left + right >= crtc_state->mode.hdisplay ||
+	    top + bottom >= crtc_state->mode.vdisplay)
+		return -EINVAL;
+
+	adjhdisplay = crtc_state->mode.hdisplay - (left + right);
+	vc4_pstate->crtc_x = DIV_ROUND_CLOSEST(vc4_pstate->crtc_x *
+					       adjhdisplay,
+					       crtc_state->mode.hdisplay);
+	vc4_pstate->crtc_x += left;
+	if (vc4_pstate->crtc_x > crtc_state->mode.hdisplay - left)
+		vc4_pstate->crtc_x = crtc_state->mode.hdisplay - left;
+
+	adjvdisplay = crtc_state->mode.vdisplay - (top + bottom);
+	vc4_pstate->crtc_y = DIV_ROUND_CLOSEST(vc4_pstate->crtc_y *
+					       adjvdisplay,
+					       crtc_state->mode.vdisplay);
+	vc4_pstate->crtc_y += top;
+	if (vc4_pstate->crtc_y > crtc_state->mode.vdisplay - top)
+		vc4_pstate->crtc_y = crtc_state->mode.vdisplay - top;
+
+	vc4_pstate->crtc_w = DIV_ROUND_CLOSEST(vc4_pstate->crtc_w *
+					       adjhdisplay,
+					       crtc_state->mode.hdisplay);
+	vc4_pstate->crtc_h = DIV_ROUND_CLOSEST(vc4_pstate->crtc_h *
+					       adjvdisplay,
+					       crtc_state->mode.vdisplay);
+
+	if (!vc4_pstate->crtc_w || !vc4_pstate->crtc_h)
+		return -EINVAL;
+
+	return 0;
+}
+
 static int vc4_plane_setup_clipping_and_scaling(struct drm_plane_state *state)
 {
 	struct vc4_plane_state *vc4_state = to_vc4_plane_state(state);
@@ -306,6 +352,10 @@ static int vc4_plane_setup_clipping_and_scaling(struct drm_plane_state *state)
 	vc4_state->crtc_w = state->dst.x2 - state->dst.x1;
 	vc4_state->crtc_h = state->dst.y2 - state->dst.y1;
 
+	ret = vc4_plane_margins_adj(state);
+	if (ret)
+		return ret;
+
 	vc4_state->x_scaling[0] = vc4_get_scaling_mode(vc4_state->src_w[0],
 						       vc4_state->crtc_w);
 	vc4_state->y_scaling[0] = vc4_get_scaling_mode(vc4_state->src_h[0],
@@ -492,8 +542,9 @@ static int vc4_plane_mode_set(struct drm_plane *plane,
 	bool mix_plane_alpha;
 	bool covers_screen;
 	u32 scl0, scl1, pitch0;
-	u32 tiling;
+	u32 tiling, src_y;
 	u32 hvs_format = format->hvs;
+	unsigned int rotation;
 	int ret, i;
 
 	if (vc4_state->dlist_initialized)
@@ -520,6 +571,16 @@ static int vc4_plane_mode_set(struct drm_plane *plane,
 	h_subsample = drm_format_horz_chroma_subsampling(format->drm);
 	v_subsample = drm_format_vert_chroma_subsampling(format->drm);
 
+	rotation = drm_rotation_simplify(state->rotation,
+					 DRM_MODE_ROTATE_0 |
+					 DRM_MODE_REFLECT_X |
+					 DRM_MODE_REFLECT_Y);
+
+	/* We must point to the last line when Y reflection is enabled. */
+	src_y = vc4_state->src_y;
+	if (rotation & DRM_MODE_REFLECT_Y)
+		src_y += vc4_state->src_h[0] - 1;
+
 	switch (base_format_mod) {
 	case DRM_FORMAT_MOD_LINEAR:
 		tiling = SCALER_CTL0_TILING_LINEAR;
@@ -529,9 +590,10 @@ static int vc4_plane_mode_set(struct drm_plane *plane,
 		 * out.
 		 */
 		for (i = 0; i < num_planes; i++) {
-			vc4_state->offsets[i] += vc4_state->src_y /
+			vc4_state->offsets[i] += src_y /
 						 (i ? v_subsample : 1) *
 						 fb->pitches[i];
+
 			vc4_state->offsets[i] += vc4_state->src_x /
 						 (i ? h_subsample : 1) *
 						 fb->format->cpp[i];
@@ -557,22 +619,38 @@ static int vc4_plane_mode_set(struct drm_plane *plane,
 		u32 tiles_w = fb->pitches[0] >> (tile_size_shift - tile_h_shift);
 		u32 tiles_l = vc4_state->src_x >> tile_w_shift;
 		u32 tiles_r = tiles_w - tiles_l;
-		u32 tiles_t = vc4_state->src_y >> tile_h_shift;
+		u32 tiles_t = src_y >> tile_h_shift;
 		/* Intra-tile offsets, which modify the base address (the
 		 * SCALER_PITCH0_TILE_Y_OFFSET tells HVS how to walk from that
 		 * base address).
 		 */
-		u32 tile_y = (vc4_state->src_y >> 4) & 1;
-		u32 subtile_y = (vc4_state->src_y >> 2) & 3;
-		u32 utile_y = vc4_state->src_y & 3;
+		u32 tile_y = (src_y >> 4) & 1;
+		u32 subtile_y = (src_y >> 2) & 3;
+		u32 utile_y = src_y & 3;
 		u32 x_off = vc4_state->src_x & tile_w_mask;
-		u32 y_off = vc4_state->src_y & tile_h_mask;
+		u32 y_off = src_y & tile_h_mask;
+
+		/* When Y reflection is requested we must set the
+		 * SCALER_PITCH0_TILE_LINE_DIR flag to tell HVS that all lines
+		 * after the initial one should be fetched in descending order,
+		 * which makes sense since we start from the last line and go
+		 * backward.
+		 * It's not clear why y_off must be inverted here
+		 * (y_off = max_y_off - y_off), but it is definitely
+		 * required, presumably another artifact of the reversed walk.
+		 */
+		if (rotation & DRM_MODE_REFLECT_Y) {
+			y_off = tile_h_mask - y_off;
+			pitch0 = SCALER_PITCH0_TILE_LINE_DIR;
+		} else {
+			pitch0 = 0;
+		}
 
 		tiling = SCALER_CTL0_TILING_256B_OR_T;
-		pitch0 = (VC4_SET_FIELD(x_off, SCALER_PITCH0_SINK_PIX) |
-			  VC4_SET_FIELD(y_off, SCALER_PITCH0_TILE_Y_OFFSET) |
-			  VC4_SET_FIELD(tiles_l, SCALER_PITCH0_TILE_WIDTH_L) |
-			  VC4_SET_FIELD(tiles_r, SCALER_PITCH0_TILE_WIDTH_R));
+		pitch0 |= (VC4_SET_FIELD(x_off, SCALER_PITCH0_SINK_PIX) |
+			   VC4_SET_FIELD(y_off, SCALER_PITCH0_TILE_Y_OFFSET) |
+			   VC4_SET_FIELD(tiles_l, SCALER_PITCH0_TILE_WIDTH_L) |
+			   VC4_SET_FIELD(tiles_r, SCALER_PITCH0_TILE_WIDTH_R));
 		vc4_state->offsets[0] += tiles_t * (tiles_w << tile_size_shift);
 		vc4_state->offsets[0] += subtile_y << 8;
 		vc4_state->offsets[0] += utile_y << 4;
@@ -595,31 +673,22 @@ static int vc4_plane_mode_set(struct drm_plane *plane,
 	case DRM_FORMAT_MOD_BROADCOM_SAND128:
 	case DRM_FORMAT_MOD_BROADCOM_SAND256: {
 		uint32_t param = fourcc_mod_broadcom_param(fb->modifier);
+		u32 tile_w, tile, x_off, pix_per_tile;
 
-		/* Column-based NV12 or RGBA.
-		 */
-		if (fb->format->num_planes > 1) {
-			if (hvs_format != HVS_PIXEL_FORMAT_YCBCR_YUV420_2PLANE) {
-				DRM_DEBUG_KMS("SAND format only valid for NV12/21");
-				return -EINVAL;
-			}
-			hvs_format = HVS_PIXEL_FORMAT_H264;
-		} else {
-			if (base_format_mod == DRM_FORMAT_MOD_BROADCOM_SAND256) {
-				DRM_DEBUG_KMS("SAND256 format only valid for H.264");
-				return -EINVAL;
-			}
-		}
+		hvs_format = HVS_PIXEL_FORMAT_H264;
 
 		switch (base_format_mod) {
 		case DRM_FORMAT_MOD_BROADCOM_SAND64:
 			tiling = SCALER_CTL0_TILING_64B;
+			tile_w = 64;
 			break;
 		case DRM_FORMAT_MOD_BROADCOM_SAND128:
 			tiling = SCALER_CTL0_TILING_128B;
+			tile_w = 128;
 			break;
 		case DRM_FORMAT_MOD_BROADCOM_SAND256:
 			tiling = SCALER_CTL0_TILING_256B_OR_T;
+			tile_w = 256;
 			break;
 		default:
 			break;
@@ -630,6 +699,23 @@ static int vc4_plane_mode_set(struct drm_plane *plane,
 			return -EINVAL;
 		}
 
+		pix_per_tile = tile_w / fb->format->cpp[0];
+		tile = vc4_state->src_x / pix_per_tile;
+		x_off = vc4_state->src_x % pix_per_tile;
+
+		/* Adjust the base pointer to the first pixel to be scanned
+		 * out.
+		 */
+		for (i = 0; i < num_planes; i++) {
+			vc4_state->offsets[i] += param * tile_w * tile;
+			vc4_state->offsets[i] += src_y /
+						 (i ? v_subsample : 1) *
+						 tile_w;
+			vc4_state->offsets[i] += x_off /
+						 (i ? h_subsample : 1) *
+						 fb->format->cpp[i];
+		}
+
 		pitch0 = VC4_SET_FIELD(param, SCALER_TILE_HEIGHT);
 		break;
 	}
@@ -643,6 +729,8 @@ static int vc4_plane_mode_set(struct drm_plane *plane,
 	/* Control word */
 	vc4_dlist_write(vc4_state,
 			SCALER_CTL0_VALID |
+			(rotation & DRM_MODE_REFLECT_X ? SCALER_CTL0_HFLIP : 0) |
+			(rotation & DRM_MODE_REFLECT_Y ? SCALER_CTL0_VFLIP : 0) |
 			VC4_SET_FIELD(SCALER_CTL0_RGBA_EXPAND_ROUND, SCALER_CTL0_RGBA_EXPAND) |
 			(format->pixel_order << SCALER_CTL0_ORDER_SHIFT) |
 			(hvs_format << SCALER_CTL0_PIXEL_FORMAT_SHIFT) |
@@ -1050,8 +1138,6 @@ static bool vc4_format_mod_supported(struct drm_plane *plane,
 		switch (fourcc_mod_broadcom_mod(modifier)) {
 		case DRM_FORMAT_MOD_LINEAR:
 		case DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED:
-		case DRM_FORMAT_MOD_BROADCOM_SAND64:
-		case DRM_FORMAT_MOD_BROADCOM_SAND128:
 			return true;
 		default:
 			return false;
@@ -1123,6 +1209,11 @@ struct drm_plane *vc4_plane_init(struct drm_device *dev,
 	drm_plane_helper_add(plane, &vc4_plane_helper_funcs);
 
 	drm_plane_create_alpha_property(plane);
+	drm_plane_create_rotation_property(plane, DRM_MODE_ROTATE_0,
+					   DRM_MODE_ROTATE_0 |
+					   DRM_MODE_ROTATE_180 |
+					   DRM_MODE_REFLECT_X |
+					   DRM_MODE_REFLECT_Y);
 
 	return plane;
 }
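
A worked example of the margin math in vc4_plane_margins_adj() above
(illustrative numbers only): on a 1920-wide mode with 48-pixel left and right
margins, a centered 480-pixel-wide plane at crtc_x = 720 is rescaled as

	adjhdisplay = 1920 - (48 + 48)              = 1824
	crtc_x      = round(720 * 1824 / 1920) + 48 = 684 + 48 = 732
	crtc_w      = round(480 * 1824 / 1920)      = 456

and since (1920 - 456) / 2 = 732, centered content stays centered while
shrinking proportionally into the margin-reduced active area.
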
diff --git a/drivers/gpu/drm/vc4/vc4_txp.c b/drivers/gpu/drm/vc4/vc4_txp.c
index 6e23c50168f9..aa279b5b0de7 100644
--- a/drivers/gpu/drm/vc4/vc4_txp.c
+++ b/drivers/gpu/drm/vc4/vc4_txp.c
@@ -9,9 +9,9 @@
 
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_fb_cma_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_writeback.h>
 #include <linux/clk.h>
 #include <linux/component.h>
diff --git a/drivers/gpu/drm/vc4/vc4_vec.c b/drivers/gpu/drm/vc4/vc4_vec.c
index 8e7facb6514e..858c3a483229 100644
--- a/drivers/gpu/drm/vc4/vc4_vec.c
+++ b/drivers/gpu/drm/vc4/vc4_vec.c
@@ -25,9 +25,9 @@
  */
 
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_panel.h>
+#include <drm/drm_probe_helper.h>
 #include <linux/clk.h>
 #include <linux/component.h>
 #include <linux/of_graph.h>
diff --git a/drivers/gpu/drm/vgem/vgem_fence.c b/drivers/gpu/drm/vgem/vgem_fence.c
index c1c420afe2dd..eb17c0cd3727 100644
--- a/drivers/gpu/drm/vgem/vgem_fence.c
+++ b/drivers/gpu/drm/vgem/vgem_fence.c
@@ -53,13 +53,13 @@ static void vgem_fence_release(struct dma_fence *base)
 
 static void vgem_fence_value_str(struct dma_fence *fence, char *str, int size)
 {
-	snprintf(str, size, "%u", fence->seqno);
+	snprintf(str, size, "%llu", fence->seqno);
 }
 
 static void vgem_fence_timeline_value_str(struct dma_fence *fence, char *str,
 					  int size)
 {
-	snprintf(str, size, "%u",
+	snprintf(str, size, "%llu",
 		 dma_fence_is_signaled(fence) ? fence->seqno : 0);
 }
 
diff --git a/drivers/gpu/drm/via/via_dmablit.c b/drivers/gpu/drm/via/via_dmablit.c
index 345bda4494e1..8bf3a7c23ed3 100644
--- a/drivers/gpu/drm/via/via_dmablit.c
+++ b/drivers/gpu/drm/via/via_dmablit.c
@@ -177,12 +177,14 @@ via_free_sg_info(struct pci_dev *pdev, drm_via_sg_info_t *vsg)
 	switch (vsg->state) {
 	case dr_via_device_mapped:
 		via_unmap_blit_from_device(pdev, vsg);
+		/* fall through */
 	case dr_via_desc_pages_alloc:
 		for (i = 0; i < vsg->num_desc_pages; ++i) {
 			if (vsg->desc_pages[i] != NULL)
 				free_page((unsigned long)vsg->desc_pages[i]);
 		}
 		kfree(vsg->desc_pages);
+		/* fall through */
 	case dr_via_pages_locked:
 		for (i = 0; i < vsg->num_pages; ++i) {
 			if (NULL != (page = vsg->pages[i])) {
@@ -191,8 +193,10 @@ via_free_sg_info(struct pci_dev *pdev, drm_via_sg_info_t *vsg)
 				put_page(page);
 			}
 		}
+		/* fall through */
 	case dr_via_pages_alloc:
 		vfree(vsg->pages);
+		/* fall through */
 	default:
 		vsg->state = dr_via_sg_init;
 	}
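
The /* fall through */ comments document intentional case cascading for GCC's
-Wimplicit-fallthrough, which the kernel was enabling tree-wide at the time;
teardown switches like the one above rely on each state falling into the
cleanup for the earlier stages. The general shape (example_* names invented):

	enum example_state { STATE_INIT, STATE_ALLOCATED, STATE_MAPPED };

	static void example_teardown(enum example_state *state)
	{
		switch (*state) {
		case STATE_MAPPED:
			example_unmap();	/* undo the newest stage first */
			/* fall through */
		case STATE_ALLOCATED:
			example_free();
			/* fall through */
		default:
			*state = STATE_INIT;
		}
	}
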
diff --git a/drivers/gpu/drm/via/via_drv.c b/drivers/gpu/drm/via/via_drv.c
index aaf766f7cca2..af6a12d3c058 100644
--- a/drivers/gpu/drm/via/via_drv.c
+++ b/drivers/gpu/drm/via/via_drv.c
@@ -70,8 +70,7 @@ static const struct file_operations via_driver_fops = {
 
 static struct drm_driver driver = {
 	.driver_features =
-	    DRIVER_USE_AGP | DRIVER_HAVE_IRQ | DRIVER_LEGACY |
-	    DRIVER_IRQ_SHARED,
+	    DRIVER_USE_AGP | DRIVER_HAVE_IRQ | DRIVER_LEGACY,
 	.load = via_driver_load,
 	.unload = via_driver_unload,
 	.open = via_driver_open,
diff --git a/drivers/gpu/drm/virtio/Makefile b/drivers/gpu/drm/virtio/Makefile
index f29deec83d1f..4e90cc8fa651 100644
--- a/drivers/gpu/drm/virtio/Makefile
+++ b/drivers/gpu/drm/virtio/Makefile
@@ -3,7 +3,7 @@
 # Makefile for the drm device driver.  This driver provides support for the
 # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
 
-virtio-gpu-y := virtgpu_drv.o virtgpu_kms.o virtgpu_drm_bus.o virtgpu_gem.o \
+virtio-gpu-y := virtgpu_drv.o virtgpu_kms.o virtgpu_gem.o \
 	virtgpu_fb.o virtgpu_display.o virtgpu_vq.o virtgpu_ttm.o \
 	virtgpu_fence.o virtgpu_object.o virtgpu_debugfs.o virtgpu_plane.o \
 	virtgpu_ioctl.o virtgpu_prime.o
diff --git a/drivers/gpu/drm/virtio/virtgpu_display.c b/drivers/gpu/drm/virtio/virtgpu_display.c
index b5580b11a063..653ec7d0bf4d 100644
--- a/drivers/gpu/drm/virtio/virtgpu_display.c
+++ b/drivers/gpu/drm/virtio/virtgpu_display.c
@@ -26,9 +26,9 @@
  */
 
 #include "virtgpu_drv.h"
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #define XRES_MIN    32
 #define YRES_MIN    32
@@ -243,12 +243,8 @@ static enum drm_connector_status virtio_gpu_conn_detect(
 
 static void virtio_gpu_conn_destroy(struct drm_connector *connector)
 {
-	struct virtio_gpu_output *virtio_gpu_output =
-		drm_connector_to_virtio_gpu_output(connector);
-
 	drm_connector_unregister(connector);
 	drm_connector_cleanup(connector);
-	kfree(virtio_gpu_output);
 }
 
 static const struct drm_connector_funcs virtio_gpu_connector_funcs = {
@@ -362,7 +358,7 @@ static const struct drm_mode_config_funcs virtio_gpu_mode_funcs = {
 	.atomic_commit = drm_atomic_helper_commit,
 };
 
-int virtio_gpu_modeset_init(struct virtio_gpu_device *vgdev)
+void virtio_gpu_modeset_init(struct virtio_gpu_device *vgdev)
 {
 	int i;
 
@@ -381,7 +377,6 @@ int virtio_gpu_modeset_init(struct virtio_gpu_device *vgdev)
 		vgdev_output_init(vgdev, i);
 
 	drm_mode_config_reset(vgdev->ddev);
-	return 0;
 }
 
 void virtio_gpu_modeset_fini(struct virtio_gpu_device *vgdev)
@@ -390,6 +385,5 @@ void virtio_gpu_modeset_fini(struct virtio_gpu_device *vgdev)
 
 	for (i = 0 ; i < vgdev->num_scanouts; ++i)
 		kfree(vgdev->outputs[i].edid);
-	virtio_gpu_fbdev_fini(vgdev);
 	drm_mode_config_cleanup(vgdev->ddev);
 }
diff --git a/drivers/gpu/drm/virtio/virtgpu_drm_bus.c b/drivers/gpu/drm/virtio/virtgpu_drm_bus.c
deleted file mode 100644
index 0887e0b64b9c..000000000000
--- a/drivers/gpu/drm/virtio/virtgpu_drm_bus.c
+++ /dev/null
@@ -1,103 +0,0 @@
-/*
- * Copyright (C) 2015 Red Hat, Inc.
- * All Rights Reserved.
- *
- * Permission is hereby granted, free of charge, to any person obtaining
- * a copy of this software and associated documentation files (the
- * "Software"), to deal in the Software without restriction, including
- * without limitation the rights to use, copy, modify, merge, publish,
- * distribute, sublicense, and/or sell copies of the Software, and to
- * permit persons to whom the Software is furnished to do so, subject to
- * the following conditions:
- *
- * The above copyright notice and this permission notice (including the
- * next paragraph) shall be included in all copies or substantial
- * portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
- * IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
- * LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
- * OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
- * WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- */
-
-#include <linux/pci.h>
-#include <drm/drm_fb_helper.h>
-
-#include "virtgpu_drv.h"
-
-int drm_virtio_init(struct drm_driver *driver, struct virtio_device *vdev)
-{
-	struct drm_device *dev;
-	int ret;
-
-	dev = drm_dev_alloc(driver, &vdev->dev);
-	if (IS_ERR(dev))
-		return PTR_ERR(dev);
-	vdev->priv = dev;
-
-	if (strcmp(vdev->dev.parent->bus->name, "pci") == 0) {
-		struct pci_dev *pdev = to_pci_dev(vdev->dev.parent);
-		const char *pname = dev_name(&pdev->dev);
-		bool vga = (pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA;
-		char unique[20];
-
-		DRM_INFO("pci: %s detected at %s\n",
-			 vga ? "virtio-vga" : "virtio-gpu-pci",
-			 pname);
-		dev->pdev = pdev;
-		if (vga)
-			drm_fb_helper_remove_conflicting_pci_framebuffers(pdev,
-									  0,
-									  "virtiodrmfb");
-
-		/*
-		 * Normally the drm_dev_set_unique() call is done by core DRM.
-		 * The following comment covers, why virtio cannot rely on it.
-		 *
-		 * Unlike the other virtual GPU drivers, virtio abstracts the
-		 * underlying bus type by using struct virtio_device.
-		 *
-		 * Hence the dev_is_pci() check, used in core DRM, will fail
-		 * and the unique returned will be the virtio_device "virtio0",
-		 * while a "pci:..." one is required.
-		 *
-		 * A few other ideas were considered:
-		 * - Extend the dev_is_pci() check [in drm_set_busid] to
-		 *   consider virtio.
-		 *   Seems like a bigger hack than what we have already.
-		 *
-		 * - Point drm_device::dev to the parent of the virtio_device
-		 *   Semantic changes:
-		 *   * Using the wrong device for i2c, framebuffer_alloc and
-		 *     prime import.
-		 *   Visual changes:
-		 *   * Helpers such as DRM_DEV_ERROR, dev_info, drm_printer,
-		 *     will print the wrong information.
-		 *
-		 * We could address the latter issues, by introducing
-		 * drm_device::bus_dev, ... which would be used solely for this.
-		 *
-		 * So for the moment keep things as-is, with a bulky comment
-		 * for the next person who feels like removing this
-		 * drm_dev_set_unique() quirk.
-		 */
-		snprintf(unique, sizeof(unique), "pci:%s", pname);
-		ret = drm_dev_set_unique(dev, unique);
-		if (ret)
-			goto err_free;
-
-	}
-
-	ret = drm_dev_register(dev, 0);
-	if (ret)
-		goto err_free;
-
-	return 0;
-
-err_free:
-	drm_dev_put(dev);
-	return ret;
-}
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c b/drivers/gpu/drm/virtio/virtgpu_drv.c
index 2d1aaca49105..b996ac1d4fcc 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.c
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.c
@@ -40,21 +40,101 @@ static int virtio_gpu_modeset = -1;
 MODULE_PARM_DESC(modeset, "Disable/Enable modesetting");
 module_param_named(modeset, virtio_gpu_modeset, int, 0400);
 
+static int virtio_gpu_pci_quirk(struct drm_device *dev, struct virtio_device *vdev)
+{
+	struct pci_dev *pdev = to_pci_dev(vdev->dev.parent);
+	const char *pname = dev_name(&pdev->dev);
+	bool vga = (pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA;
+	char unique[20];
+
+	DRM_INFO("pci: %s detected at %s\n",
+		 vga ? "virtio-vga" : "virtio-gpu-pci",
+		 pname);
+	dev->pdev = pdev;
+	if (vga)
+		drm_fb_helper_remove_conflicting_pci_framebuffers(pdev,
+								  0,
+								  "virtiodrmfb");
+
+	/*
+	 * Normally the drm_dev_set_unique() call is done by core DRM.
+	 * The following comment covers why virtio cannot rely on it.
+	 *
+	 * Unlike the other virtual GPU drivers, virtio abstracts the
+	 * underlying bus type by using struct virtio_device.
+	 *
+	 * Hence the dev_is_pci() check, used in core DRM, will fail
+	 * and the unique returned will be the virtio_device "virtio0",
+	 * while a "pci:..." one is required.
+	 *
+	 * A few other ideas were considered:
+	 * - Extend the dev_is_pci() check [in drm_set_busid] to
+	 *   consider virtio.
+	 *   Seems like a bigger hack than what we have already.
+	 *
+	 * - Point drm_device::dev to the parent of the virtio_device
+	 *   Semantic changes:
+	 *   * Using the wrong device for i2c, framebuffer_alloc and
+	 *     prime import.
+	 *   Visual changes:
+	 *   * Helpers such as DRM_DEV_ERROR, dev_info, drm_printer,
+	 *     will print the wrong information.
+	 *
+	 * We could address the latter issues by introducing
+	 * drm_device::bus_dev, ... which would be used solely for this.
+	 *
+	 * So for the moment keep things as-is, with a bulky comment
+	 * for the next person who feels like removing this
+	 * drm_dev_set_unique() quirk.
+	 */
+	snprintf(unique, sizeof(unique), "pci:%s", pname);
+	return drm_dev_set_unique(dev, unique);
+}
+
 static int virtio_gpu_probe(struct virtio_device *vdev)
 {
+	struct drm_device *dev;
+	int ret;
+
 	if (vgacon_text_force() && virtio_gpu_modeset == -1)
 		return -EINVAL;
 
 	if (virtio_gpu_modeset == 0)
 		return -EINVAL;
 
-	return drm_virtio_init(&driver, vdev);
+	dev = drm_dev_alloc(&driver, &vdev->dev);
+	if (IS_ERR(dev))
+		return PTR_ERR(dev);
+	vdev->priv = dev;
+
+	if (!strcmp(vdev->dev.parent->bus->name, "pci")) {
+		ret = virtio_gpu_pci_quirk(dev, vdev);
+		if (ret)
+			goto err_free;
+	}
+
+	ret = virtio_gpu_init(dev);
+	if (ret)
+		goto err_free;
+
+	ret = drm_dev_register(dev, 0);
+	if (ret)
+		goto err_free;
+
+	drm_fbdev_generic_setup(vdev->priv, 32);
+	return 0;
+
+err_free:
+	drm_dev_put(dev);
+	return ret;
 }
 
 static void virtio_gpu_remove(struct virtio_device *vdev)
 {
 	struct drm_device *dev = vdev->priv;
 
+	drm_dev_unregister(dev);
+	virtio_gpu_deinit(dev);
 	drm_put_dev(dev);
 }
 
@@ -116,8 +196,6 @@ static const struct file_operations virtio_gpu_driver_fops = {
 
 static struct drm_driver driver = {
 	.driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_PRIME | DRIVER_RENDER | DRIVER_ATOMIC,
-	.load = virtio_gpu_driver_load,
-	.unload = virtio_gpu_driver_unload,
 	.open = virtio_gpu_driver_open,
 	.postclose = virtio_gpu_driver_postclose,
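
With .load/.unload gone, the driver now follows the recommended explicit
bring-up order: allocate, initialize, register, then set up fbdev emulation
last. Condensed to its skeleton (mydrv_* names invented, error handling kept
as in the patch):

	static int mydrv_probe(struct device *parent)
	{
		struct drm_device *dev;
		int ret;

		dev = drm_dev_alloc(&mydrv_driver, parent);
		if (IS_ERR(dev))
			return PTR_ERR(dev);

		ret = mydrv_init(dev);		/* what .load used to do */
		if (ret)
			goto err_free;

		ret = drm_dev_register(dev, 0);	/* only once fully set up */
		if (ret)
			goto err_free;

		drm_fbdev_generic_setup(dev, 32);
		return 0;

	err_free:
		drm_dev_put(dev);
		return ret;
	}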
 
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
index 0c15000f926e..3238fdf58eb4 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -34,9 +34,9 @@
 #include <drm/drmP.h>
 #include <drm/drm_gem.h>
 #include <drm/drm_atomic.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_encoder.h>
 #include <drm/drm_fb_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/ttm/ttm_bo_api.h>
 #include <drm/ttm/ttm_bo_driver.h>
 #include <drm/ttm/ttm_placement.h>
@@ -50,9 +50,6 @@
 #define DRIVER_MINOR 1
 #define DRIVER_PATCHLEVEL 0
 
-/* virtgpu_drm_bus.c */
-int drm_virtio_init(struct drm_driver *driver, struct virtio_device *vdev);
-
 struct virtio_gpu_object {
 	struct drm_gem_object gem_base;
 	uint32_t hw_res_handle;
@@ -137,19 +134,10 @@ struct virtio_gpu_framebuffer {
 #define to_virtio_gpu_framebuffer(x) \
 	container_of(x, struct virtio_gpu_framebuffer, base)
 
-struct virtio_gpu_fbdev {
-	struct drm_fb_helper           helper;
-	struct virtio_gpu_framebuffer  vgfb;
-	struct virtio_gpu_device       *vgdev;
-	struct delayed_work            work;
-};
-
 struct virtio_gpu_mman {
 	struct ttm_bo_device		bdev;
 };
 
-struct virtio_gpu_fbdev;
-
 struct virtio_gpu_queue {
 	struct virtqueue *vq;
 	spinlock_t qlock;
@@ -180,8 +168,6 @@ struct virtio_gpu_device {
 
 	struct virtio_gpu_mman mman;
 
-	/* pointer to fbdev info structure */
-	struct virtio_gpu_fbdev *vgfbdev;
 	struct virtio_gpu_output outputs[VIRTIO_GPU_MAX_SCANOUTS];
 	uint32_t num_scanouts;
 
@@ -220,8 +206,8 @@ struct virtio_gpu_fpriv {
 extern struct drm_ioctl_desc virtio_gpu_ioctls[DRM_VIRTIO_NUM_IOCTLS];
 
 /* virtio_kms.c */
-int virtio_gpu_driver_load(struct drm_device *dev, unsigned long flags);
-void virtio_gpu_driver_unload(struct drm_device *dev);
+int virtio_gpu_init(struct drm_device *dev);
+void virtio_gpu_deinit(struct drm_device *dev);
 int virtio_gpu_driver_open(struct drm_device *dev, struct drm_file *file);
 void virtio_gpu_driver_postclose(struct drm_device *dev, struct drm_file *file);
 
@@ -249,9 +235,6 @@ int virtio_gpu_mode_dumb_mmap(struct drm_file *file_priv,
 			      uint32_t handle, uint64_t *offset_p);
 
 /* virtio_fb */
-#define VIRTIO_GPUFB_CONN_LIMIT 1
-int virtio_gpu_fbdev_init(struct virtio_gpu_device *vgdev);
-void virtio_gpu_fbdev_fini(struct virtio_gpu_device *vgdev);
 int virtio_gpu_surface_dirty(struct virtio_gpu_framebuffer *qfb,
 			     struct drm_clip_rect *clips,
 			     unsigned int num_clips);
@@ -334,7 +317,7 @@ int virtio_gpu_framebuffer_init(struct drm_device *dev,
 				struct virtio_gpu_framebuffer *vgfb,
 				const struct drm_mode_fb_cmd2 *mode_cmd,
 				struct drm_gem_object *obj);
-int virtio_gpu_modeset_init(struct virtio_gpu_device *vgdev);
+void virtio_gpu_modeset_init(struct virtio_gpu_device *vgdev);
 void virtio_gpu_modeset_fini(struct virtio_gpu_device *vgdev);
 
 /* virtio_gpu_plane.c */
@@ -351,7 +334,6 @@ int virtio_gpu_mmap(struct file *filp, struct vm_area_struct *vma);
 /* virtio_gpu_fence.c */
 struct virtio_gpu_fence *virtio_gpu_fence_alloc(
 	struct virtio_gpu_device *vgdev);
-void virtio_gpu_fence_cleanup(struct virtio_gpu_fence *fence);
 int virtio_gpu_fence_emit(struct virtio_gpu_device *vgdev,
 			  struct virtio_gpu_ctrl_hdr *cmd_hdr,
 			  struct virtio_gpu_fence *fence);
diff --git a/drivers/gpu/drm/virtio/virtgpu_fb.c b/drivers/gpu/drm/virtio/virtgpu_fb.c
index fb1cc8b2f119..b07584b1c2bf 100644
--- a/drivers/gpu/drm/virtio/virtgpu_fb.c
+++ b/drivers/gpu/drm/virtio/virtgpu_fb.c
@@ -27,8 +27,6 @@
 #include <drm/drm_fb_helper.h>
 #include "virtgpu_drv.h"
 
-#define VIRTIO_GPU_FBCON_POLL_PERIOD (HZ / 60)
-
 static int virtio_gpu_dirty_update(struct virtio_gpu_framebuffer *fb,
 				   bool store, int x, int y,
 				   int width, int height)
@@ -150,192 +148,3 @@ int virtio_gpu_surface_dirty(struct virtio_gpu_framebuffer *vgfb,
 				      left, top, right - left, bottom - top);
 	return 0;
 }
-
-static void virtio_gpu_fb_dirty_work(struct work_struct *work)
-{
-	struct delayed_work *delayed_work = to_delayed_work(work);
-	struct virtio_gpu_fbdev *vfbdev =
-		container_of(delayed_work, struct virtio_gpu_fbdev, work);
-	struct virtio_gpu_framebuffer *vgfb = &vfbdev->vgfb;
-
-	virtio_gpu_dirty_update(&vfbdev->vgfb, false, vgfb->x1, vgfb->y1,
-				vgfb->x2 - vgfb->x1, vgfb->y2 - vgfb->y1);
-}
-
-static void virtio_gpu_3d_fillrect(struct fb_info *info,
-				   const struct fb_fillrect *rect)
-{
-	struct virtio_gpu_fbdev *vfbdev = info->par;
-
-	drm_fb_helper_sys_fillrect(info, rect);
-	virtio_gpu_dirty_update(&vfbdev->vgfb, true, rect->dx, rect->dy,
-			     rect->width, rect->height);
-	schedule_delayed_work(&vfbdev->work, VIRTIO_GPU_FBCON_POLL_PERIOD);
-}
-
-static void virtio_gpu_3d_copyarea(struct fb_info *info,
-				   const struct fb_copyarea *area)
-{
-	struct virtio_gpu_fbdev *vfbdev = info->par;
-
-	drm_fb_helper_sys_copyarea(info, area);
-	virtio_gpu_dirty_update(&vfbdev->vgfb, true, area->dx, area->dy,
-			   area->width, area->height);
-	schedule_delayed_work(&vfbdev->work, VIRTIO_GPU_FBCON_POLL_PERIOD);
-}
-
-static void virtio_gpu_3d_imageblit(struct fb_info *info,
-				    const struct fb_image *image)
-{
-	struct virtio_gpu_fbdev *vfbdev = info->par;
-
-	drm_fb_helper_sys_imageblit(info, image);
-	virtio_gpu_dirty_update(&vfbdev->vgfb, true, image->dx, image->dy,
-			     image->width, image->height);
-	schedule_delayed_work(&vfbdev->work, VIRTIO_GPU_FBCON_POLL_PERIOD);
-}
-
-static struct fb_ops virtio_gpufb_ops = {
-	.owner = THIS_MODULE,
-	DRM_FB_HELPER_DEFAULT_OPS,
-	.fb_fillrect = virtio_gpu_3d_fillrect,
-	.fb_copyarea = virtio_gpu_3d_copyarea,
-	.fb_imageblit = virtio_gpu_3d_imageblit,
-};
-
-static int virtio_gpufb_create(struct drm_fb_helper *helper,
-			       struct drm_fb_helper_surface_size *sizes)
-{
-	struct virtio_gpu_fbdev *vfbdev =
-		container_of(helper, struct virtio_gpu_fbdev, helper);
-	struct drm_device *dev = helper->dev;
-	struct virtio_gpu_device *vgdev = dev->dev_private;
-	struct fb_info *info;
-	struct drm_framebuffer *fb;
-	struct drm_mode_fb_cmd2 mode_cmd = {};
-	struct virtio_gpu_object *obj;
-	uint32_t format, size;
-	int ret;
-
-	mode_cmd.width = sizes->surface_width;
-	mode_cmd.height = sizes->surface_height;
-	mode_cmd.pitches[0] = mode_cmd.width * 4;
-	mode_cmd.pixel_format = DRM_FORMAT_HOST_XRGB8888;
-
-	format = virtio_gpu_translate_format(mode_cmd.pixel_format);
-	if (format == 0)
-		return -EINVAL;
-
-	size = mode_cmd.pitches[0] * mode_cmd.height;
-	obj = virtio_gpu_alloc_object(dev, size, false, true);
-	if (IS_ERR(obj))
-		return PTR_ERR(obj);
-
-	virtio_gpu_cmd_create_resource(vgdev, obj, format,
-				       mode_cmd.width, mode_cmd.height);
-
-	ret = virtio_gpu_object_kmap(obj);
-	if (ret) {
-		DRM_ERROR("failed to kmap fb %d\n", ret);
-		goto err_obj_vmap;
-	}
-
-	/* attach the object to the resource */
-	ret = virtio_gpu_object_attach(vgdev, obj, NULL);
-	if (ret)
-		goto err_obj_attach;
-
-	info = drm_fb_helper_alloc_fbi(helper);
-	if (IS_ERR(info)) {
-		ret = PTR_ERR(info);
-		goto err_fb_alloc;
-	}
-
-	info->par = helper;
-
-	ret = virtio_gpu_framebuffer_init(dev, &vfbdev->vgfb,
-					  &mode_cmd, &obj->gem_base);
-	if (ret)
-		goto err_fb_alloc;
-
-	fb = &vfbdev->vgfb.base;
-
-	vfbdev->helper.fb = fb;
-
-	strcpy(info->fix.id, "virtiodrmfb");
-	info->fbops = &virtio_gpufb_ops;
-	info->pixmap.flags = FB_PIXMAP_SYSTEM;
-
-	info->screen_buffer = obj->vmap;
-	info->screen_size = obj->gem_base.size;
-	drm_fb_helper_fill_fix(info, fb->pitches[0], fb->format->depth);
-	drm_fb_helper_fill_var(info, &vfbdev->helper,
-			       sizes->fb_width, sizes->fb_height);
-
-	info->fix.mmio_start = 0;
-	info->fix.mmio_len = 0;
-	return 0;
-
-err_fb_alloc:
-	virtio_gpu_object_detach(vgdev, obj);
-err_obj_attach:
-err_obj_vmap:
-	virtio_gpu_gem_free_object(&obj->gem_base);
-	return ret;
-}
-
-static int virtio_gpu_fbdev_destroy(struct drm_device *dev,
-				    struct virtio_gpu_fbdev *vgfbdev)
-{
-	struct virtio_gpu_framebuffer *vgfb = &vgfbdev->vgfb;
-
-	drm_fb_helper_unregister_fbi(&vgfbdev->helper);
-
-	if (vgfb->base.obj[0])
-		vgfb->base.obj[0] = NULL;
-	drm_fb_helper_fini(&vgfbdev->helper);
-	drm_framebuffer_cleanup(&vgfb->base);
-
-	return 0;
-}
-static const struct drm_fb_helper_funcs virtio_gpu_fb_helper_funcs = {
-	.fb_probe = virtio_gpufb_create,
-};
-
-int virtio_gpu_fbdev_init(struct virtio_gpu_device *vgdev)
-{
-	struct virtio_gpu_fbdev *vgfbdev;
-	int bpp_sel = 32; /* TODO: parameter from somewhere? */
-	int ret;
-
-	vgfbdev = kzalloc(sizeof(struct virtio_gpu_fbdev), GFP_KERNEL);
-	if (!vgfbdev)
-		return -ENOMEM;
-
-	vgfbdev->vgdev = vgdev;
-	vgdev->vgfbdev = vgfbdev;
-	INIT_DELAYED_WORK(&vgfbdev->work, virtio_gpu_fb_dirty_work);
-
-	drm_fb_helper_prepare(vgdev->ddev, &vgfbdev->helper,
-			      &virtio_gpu_fb_helper_funcs);
-	ret = drm_fb_helper_init(vgdev->ddev, &vgfbdev->helper,
-				 VIRTIO_GPUFB_CONN_LIMIT);
-	if (ret) {
-		kfree(vgfbdev);
-		return ret;
-	}
-
-	drm_fb_helper_single_add_all_connectors(&vgfbdev->helper);
-	drm_fb_helper_initial_config(&vgfbdev->helper, bpp_sel);
-	return 0;
-}
-
-void virtio_gpu_fbdev_fini(struct virtio_gpu_device *vgdev)
-{
-	if (!vgdev->vgfbdev)
-		return;
-
-	virtio_gpu_fbdev_destroy(vgdev->ddev, vgdev->vgfbdev);
-	kfree(vgdev->vgfbdev);
-	vgdev->vgfbdev = NULL;
-}
diff --git a/drivers/gpu/drm/virtio/virtgpu_fence.c b/drivers/gpu/drm/virtio/virtgpu_fence.c
index 4d6826b27814..21bd4c4a32d1 100644
--- a/drivers/gpu/drm/virtio/virtgpu_fence.c
+++ b/drivers/gpu/drm/virtio/virtgpu_fence.c
@@ -81,14 +81,6 @@ struct virtio_gpu_fence *virtio_gpu_fence_alloc(struct virtio_gpu_device *vgdev)
 	return fence;
 }
 
-void virtio_gpu_fence_cleanup(struct virtio_gpu_fence *fence)
-{
-	if (!fence)
-		return;
-
-	dma_fence_put(&fence->f);
-}
-
 int virtio_gpu_fence_emit(struct virtio_gpu_device *vgdev,
 			  struct virtio_gpu_ctrl_hdr *cmd_hdr,
 			  struct virtio_gpu_fence *fence)
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
index 161b80fee492..14ce8188c052 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
@@ -351,7 +351,7 @@ static int virtio_gpu_resource_create_ioctl(struct drm_device *dev, void *data,
 		virtio_gpu_cmd_resource_create_3d(vgdev, qobj, &rc_3d);
 		ret = virtio_gpu_object_attach(vgdev, qobj, fence);
 		if (ret) {
-			virtio_gpu_fence_cleanup(fence);
+			dma_fence_put(&fence->f);
 			goto fail_backoff;
 		}
 		ttm_eu_fence_buffer_objects(&ticket, &validate_list, &fence->f);
diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c b/drivers/gpu/drm/virtio/virtgpu_kms.c
index 3af6181c05a8..84b6a6bf00c6 100644
--- a/drivers/gpu/drm/virtio/virtgpu_kms.c
+++ b/drivers/gpu/drm/virtio/virtgpu_kms.c
@@ -28,11 +28,6 @@
 #include <drm/drmP.h>
 #include "virtgpu_drv.h"
 
-static int virtio_gpu_fbdev = 1;
-
-MODULE_PARM_DESC(fbdev, "Disable/Enable framebuffer device & console");
-module_param_named(fbdev, virtio_gpu_fbdev, int, 0400);
-
 static void virtio_gpu_config_changed_work_func(struct work_struct *work)
 {
 	struct virtio_gpu_device *vgdev =
@@ -111,7 +106,7 @@ static void virtio_gpu_get_capsets(struct virtio_gpu_device *vgdev,
 	vgdev->num_capsets = num_capsets;
 }
 
-int virtio_gpu_driver_load(struct drm_device *dev, unsigned long flags)
+int virtio_gpu_init(struct drm_device *dev)
 {
 	static vq_callback_t *callbacks[] = {
 		virtio_gpu_ctrl_ack, virtio_gpu_cursor_ack
@@ -198,9 +193,7 @@ int virtio_gpu_driver_load(struct drm_device *dev, unsigned long flags)
 		     num_capsets, &num_capsets);
 	DRM_INFO("number of cap sets: %d\n", num_capsets);
 
-	ret = virtio_gpu_modeset_init(vgdev);
-	if (ret)
-		goto err_modeset;
+	virtio_gpu_modeset_init(vgdev);
 
 	virtio_device_ready(vgdev->vdev);
 	vgdev->vqs_ready = true;
@@ -212,12 +205,8 @@ int virtio_gpu_driver_load(struct drm_device *dev, unsigned long flags)
 	virtio_gpu_cmd_get_display_info(vgdev);
 	wait_event_timeout(vgdev->resp_wq, !vgdev->display_info_pending,
 			   5 * HZ);
-	if (virtio_gpu_fbdev)
-		virtio_gpu_fbdev_init(vgdev);
-
 	return 0;
 
-err_modeset:
 err_scanouts:
 	virtio_gpu_ttm_fini(vgdev);
 err_ttm:
@@ -239,7 +228,7 @@ static void virtio_gpu_cleanup_cap_cache(struct virtio_gpu_device *vgdev)
 	}
 }
 
-void virtio_gpu_driver_unload(struct drm_device *dev)
+void virtio_gpu_deinit(struct drm_device *dev)
 {
 	struct virtio_gpu_device *vgdev = dev->dev_private;
 
@@ -247,6 +236,7 @@ void virtio_gpu_driver_unload(struct drm_device *dev)
 	flush_work(&vgdev->ctrlq.dequeue_work);
 	flush_work(&vgdev->cursorq.dequeue_work);
 	flush_work(&vgdev->config_changed_work);
+	vgdev->vdev->config->reset(vgdev->vdev);
 	vgdev->vdev->config->del_vqs(vgdev->vdev);
 
 	virtio_gpu_modeset_fini(vgdev);
diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c b/drivers/gpu/drm/virtio/virtgpu_object.c
index f39a183d59c2..e7e946035027 100644
--- a/drivers/gpu/drm/virtio/virtgpu_object.c
+++ b/drivers/gpu/drm/virtio/virtgpu_object.c
@@ -28,10 +28,21 @@
 static int virtio_gpu_resource_id_get(struct virtio_gpu_device *vgdev,
 				       uint32_t *resid)
 {
+#if 0
 	int handle = ida_alloc(&vgdev->resource_ida, GFP_KERNEL);
 
 	if (handle < 0)
 		return handle;
+#else
+	static int handle;
+
+	/*
+	 * FIXME: dirty hack to avoid re-using IDs, which virglrenderer
+	 * can't cope with.  This needs fixing in virglrenderer; we
+	 * should also find a better way to handle it in the guest.
+	 */
+	handle++;
+#endif
 
 	*resid = handle + 1;
 	return 0;
@@ -39,7 +50,9 @@ static int virtio_gpu_resource_id_get(struct virtio_gpu_device *vgdev,
 
 static void virtio_gpu_resource_id_put(struct virtio_gpu_device *vgdev, uint32_t id)
 {
+#if 0
 	ida_free(&vgdev->resource_ida, id - 1);
+#endif
 }
 
 static void virtio_gpu_ttm_bo_destroy(struct ttm_buffer_object *tbo)
diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c
index ead5c53d4e21..024c2aa0c929 100644
--- a/drivers/gpu/drm/virtio/virtgpu_plane.c
+++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
@@ -130,11 +130,12 @@ static void virtio_gpu_primary_plane_update(struct drm_plane *plane,
 				   plane->state->src_h >> 16,
 				   plane->state->src_x >> 16,
 				   plane->state->src_y >> 16);
-	virtio_gpu_cmd_resource_flush(vgdev, handle,
-				      plane->state->src_x >> 16,
-				      plane->state->src_y >> 16,
-				      plane->state->src_w >> 16,
-				      plane->state->src_h >> 16);
+	if (handle)
+		virtio_gpu_cmd_resource_flush(vgdev, handle,
+					      plane->state->src_x >> 16,
+					      plane->state->src_y >> 16,
+					      plane->state->src_w >> 16,
+					      plane->state->src_h >> 16);
 }
 
 static int virtio_gpu_cursor_prepare_fb(struct drm_plane *plane,
@@ -168,8 +169,10 @@ static void virtio_gpu_cursor_cleanup_fb(struct drm_plane *plane,
 		return;
 
 	vgfb = to_virtio_gpu_framebuffer(plane->state->fb);
-	if (vgfb->fence)
-		virtio_gpu_fence_cleanup(vgfb->fence);
+	if (vgfb->fence) {
+		dma_fence_put(&vgfb->fence->f);
+		vgfb->fence = NULL;
+	}
 }
 
 static void virtio_gpu_cursor_plane_update(struct drm_plane *plane,
diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c b/drivers/gpu/drm/virtio/virtgpu_vq.c
index e27c4aedb809..6bc2008b0d0d 100644
--- a/drivers/gpu/drm/virtio/virtgpu_vq.c
+++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
@@ -192,8 +192,18 @@ void virtio_gpu_dequeue_ctrl_func(struct work_struct *work)
 
 	list_for_each_entry_safe(entry, tmp, &reclaim_list, list) {
 		resp = (struct virtio_gpu_ctrl_hdr *)entry->resp_buf;
-		if (resp->type != cpu_to_le32(VIRTIO_GPU_RESP_OK_NODATA))
-			DRM_DEBUG("response 0x%x\n", le32_to_cpu(resp->type));
+		if (resp->type != cpu_to_le32(VIRTIO_GPU_RESP_OK_NODATA)) {
+			if (resp->type >= cpu_to_le32(VIRTIO_GPU_RESP_ERR_UNSPEC)) {
+			if (resp->type >= cpu_to_le32(VIRTIO_GPU_RESP_ERR_UNSPEC)) {
+				struct virtio_gpu_ctrl_hdr *cmd;
+
+				cmd = (struct virtio_gpu_ctrl_hdr *)entry->buf;
+				DRM_ERROR("response 0x%x (command 0x%x)\n",
+					  le32_to_cpu(resp->type),
+					  le32_to_cpu(cmd->type));
+			} else {
+				DRM_DEBUG("response 0x%x\n", le32_to_cpu(resp->type));
+			}
+		}
 		if (resp->flags & cpu_to_le32(VIRTIO_GPU_FLAG_FENCE)) {
 			u64 f = le64_to_cpu(resp->fence_id);
 
diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
index eb56ee893761..8a9aeb0a9ea8 100644
--- a/drivers/gpu/drm/vkms/vkms_crtc.c
+++ b/drivers/gpu/drm/vkms/vkms_crtc.c
@@ -2,15 +2,19 @@
 
 #include "vkms_drv.h"
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
-static void _vblank_handle(struct vkms_output *output)
+static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
 {
+	struct vkms_output *output = container_of(timer, struct vkms_output,
+						  vblank_hrtimer);
 	struct drm_crtc *crtc = &output->crtc;
 	struct vkms_crtc_state *state = to_vkms_crtc_state(crtc->state);
+	u64 ret_overrun;
 	bool ret;
 
 	spin_lock(&output->lock);
+
 	ret = drm_crtc_handle_vblank(crtc);
 	if (!ret)
 		DRM_ERROR("vkms failure on handling vblank");
@@ -31,19 +35,11 @@ static void _vblank_handle(struct vkms_output *output)
 			DRM_WARN("failed to queue vkms_crc_work_handle");
 	}
 
-	spin_unlock(&output->lock);
-}
-
-static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
-{
-	struct vkms_output *output = container_of(timer, struct vkms_output,
-						  vblank_hrtimer);
-	int ret_overrun;
-
-	_vblank_handle(output);
-
 	ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer,
 					  output->period_ns);
+	WARN_ON(ret_overrun != 1);
+
+	spin_unlock(&output->lock);
 
 	return HRTIMER_RESTART;
 }
@@ -81,6 +77,9 @@ bool vkms_get_vblank_timestamp(struct drm_device *dev, unsigned int pipe,
 
 	*vblank_time = output->vblank_hrtimer.node.expires;
 
+	if (!in_vblank_irq)
+		*vblank_time -= output->period_ns;
+
 	return true;
 }
 
@@ -98,6 +97,7 @@ static void vkms_atomic_crtc_reset(struct drm_crtc *crtc)
 	vkms_state = kzalloc(sizeof(*vkms_state), GFP_KERNEL);
 	if (!vkms_state)
 		return;
+	INIT_WORK(&vkms_state->crc_work, vkms_crc_work_handle);
 
 	crtc->state = &vkms_state->base;
 	crtc->state->crtc = crtc;
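
Folding the vblank work into the timer callback also pins down the
hrtimer_forward_now() contract: it returns the number of elapsed periods, and
exactly one is expected when the handler keeps pace with the frame time. The
bare-bones shape of such a self-rearming timer (example_* names invented):

	static enum hrtimer_restart example_vblank(struct hrtimer *timer)
	{
		struct example_output *out =
			container_of(timer, struct example_output, vblank_hrtimer);

		drm_crtc_handle_vblank(&out->crtc);

		/* Advance the expiry by one frame period; anything other
		 * than one overrun means we fell behind.
		 */
		WARN_ON(hrtimer_forward_now(timer, out->period_ns) != 1);

		return HRTIMER_RESTART;
	}
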
diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
index 7dcbecb5fac2..738dd6206d85 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.c
+++ b/drivers/gpu/drm/vkms/vkms_drv.c
@@ -11,10 +11,10 @@
 
 #include <linux/module.h>
 #include <drm/drm_gem.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_fb_helper.h>
+#include <drm/drm_probe_helper.h>
 #include "vkms_drv.h"
 
 #define DRIVER_NAME	"vkms"
@@ -90,6 +90,7 @@ static int vkms_modeset_init(struct vkms_device *vkmsdev)
 	dev->mode_config.min_height = YRES_MIN;
 	dev->mode_config.max_width = XRES_MAX;
 	dev->mode_config.max_height = YRES_MAX;
+	dev->mode_config.preferred_depth = 24;
 
 	return vkms_output_init(vkmsdev);
 }
diff --git a/drivers/gpu/drm/vkms/vkms_output.c b/drivers/gpu/drm/vkms/vkms_output.c
index 4173e4f48334..3b162b25312e 100644
--- a/drivers/gpu/drm/vkms/vkms_output.c
+++ b/drivers/gpu/drm/vkms/vkms_output.c
@@ -1,8 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0+
 
 #include "vkms_drv.h"
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_probe_helper.h>
 
 static void vkms_connector_destroy(struct drm_connector *connector)
 {
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
index 7ce1c2f87d9a..5d5c2bce01f3 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
@@ -534,7 +534,6 @@ static void vmw_user_bo_release(struct ttm_base_object **p_base)
 {
 	struct vmw_user_buffer_object *vmw_user_bo;
 	struct ttm_base_object *base = *p_base;
-	struct ttm_buffer_object *bo;
 
 	*p_base = NULL;
 
@@ -543,8 +542,7 @@ static void vmw_user_bo_release(struct ttm_base_object **p_base)
 
 	vmw_user_bo = container_of(base, struct vmw_user_buffer_object,
 				   prime.base);
-	bo = &vmw_user_bo->vbo.base;
-	ttm_bo_unref(&bo);
+	ttm_bo_put(&vmw_user_bo->vbo.base);
 }
 
 
@@ -597,7 +595,6 @@ int vmw_user_bo_alloc(struct vmw_private *dev_priv,
 		      struct ttm_base_object **p_base)
 {
 	struct vmw_user_buffer_object *user_bo;
-	struct ttm_buffer_object *tmp;
 	int ret;
 
 	user_bo = kzalloc(sizeof(*user_bo), GFP_KERNEL);
@@ -614,7 +611,7 @@ int vmw_user_bo_alloc(struct vmw_private *dev_priv,
 	if (unlikely(ret != 0))
 		return ret;
 
-	tmp = ttm_bo_reference(&user_bo->vbo.base);
+	ttm_bo_get(&user_bo->vbo.base);
 	ret = ttm_prime_object_init(tfile,
 				    size,
 				    &user_bo->prime,
@@ -623,7 +620,7 @@ int vmw_user_bo_alloc(struct vmw_private *dev_priv,
 				    &vmw_user_bo_release,
 				    &vmw_user_bo_ref_obj_release);
 	if (unlikely(ret != 0)) {
-		ttm_bo_unref(&tmp);
+		ttm_bo_put(&user_bo->vbo.base);
 		goto out_no_base_object;
 	}
 
@@ -911,7 +908,7 @@ int vmw_user_bo_lookup(struct ttm_object_file *tfile,
 
 	vmw_user_bo = container_of(base, struct vmw_user_buffer_object,
 				   prime.base);
-	(void)ttm_bo_reference(&vmw_user_bo->vbo.base);
+	ttm_bo_get(&vmw_user_bo->vbo.base);
 	if (p_base)
 		*p_base = base;
 	else
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c
index 48d1380a952e..70dab55e7888 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c
@@ -765,7 +765,7 @@ static bool vmw_cmdbuf_try_alloc(struct vmw_cmdbuf_man *man,
 
 	if (info->done)
 		return true;
- 
+
 	memset(info->node, 0, sizeof(*info->node));
 	spin_lock(&man->lock);
 	ret = drm_mm_insert_node(&man->mm, info->node, info->page_size);
@@ -1276,8 +1276,10 @@ int vmw_cmdbuf_set_pool_size(struct vmw_cmdbuf_man *man,
 	return 0;
 
 out_no_map:
-	if (man->using_mob)
-		ttm_bo_unref(&man->cmd_space);
+	if (man->using_mob) {
+		ttm_bo_put(man->cmd_space);
+		man->cmd_space = NULL;
+	}
 
 	return ret;
 }
@@ -1380,7 +1382,8 @@ void vmw_cmdbuf_remove_pool(struct vmw_cmdbuf_man *man)
 	(void) vmw_cmdbuf_idle(man, false, 10*HZ);
 	if (man->using_mob) {
 		(void) ttm_bo_kunmap(&man->map_obj);
-		ttm_bo_unref(&man->cmd_space);
+		ttm_bo_put(man->cmd_space);
+		man->cmd_space = NULL;
 	} else {
 		dma_free_coherent(&man->dev_priv->dev->pdev->dev,
 				  man->size, man->map, man->handle);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index 7ef5dcb06104..6165fe2c4504 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -1565,7 +1565,7 @@ static const struct file_operations vmwgfx_driver_fops = {
 };
 
 static struct drm_driver driver = {
-	.driver_features = DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED |
+	.driver_features =
 	DRIVER_MODESET | DRIVER_PRIME | DRIVER_RENDER | DRIVER_ATOMIC,
 	.load = vmw_driver_load,
 	.unload = vmw_driver_unload,
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
index cd607ba9c2fe..accb2fafe2f1 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
@@ -1337,18 +1337,15 @@ static inline void vmw_bo_unreference(struct vmw_buffer_object **buf)
 
 	*buf = NULL;
 	if (tmp_buf != NULL) {
-		struct ttm_buffer_object *bo = &tmp_buf->base;
-
-		ttm_bo_unref(&bo);
+		ttm_bo_put(&tmp_buf->base);
 	}
 }
 
 static inline struct vmw_buffer_object *
 vmw_bo_reference(struct vmw_buffer_object *buf)
 {
-	if (ttm_bo_reference(&buf->base))
-		return buf;
-	return NULL;
+	ttm_bo_get(&buf->base);
+	return buf;
 }
 
 static inline struct ttm_mem_global *vmw_mem_glob(struct vmw_private *dev_priv)
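Editor's note: the vmwgfx hunks in this stretch are one mechanical conversion: ttm_bo_reference()/ttm_bo_unref() become ttm_bo_get()/ttm_bo_put(), following the kernel-wide get/put naming for reference counting. The behavioural wrinkle is that ttm_bo_unref() cleared the caller's pointer through its pointer-to-pointer argument, whereas ttm_bo_put() does not, hence the explicit NULL assignments added throughout. A hedged sketch of the pattern (the helper name is hypothetical):

#include <drm/ttm/ttm_bo_api.h>

/* Drop a buffer-object reference and clear the caller's pointer,
 * reproducing what ttm_bo_unref() used to do implicitly. */
static void release_bo(struct ttm_buffer_object **bo)
{
	ttm_bo_put(*bo);	/* was: ttm_bo_unref(bo) */
	*bo = NULL;		/* ttm_bo_put() no longer does this */
}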
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
index 655abbcd4058..535b03599e55 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
@@ -29,8 +29,8 @@
 #define VMWGFX_KMS_H_
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_encoder.h>
+#include <drm/drm_probe_helper.h>
 #include "vmwgfx_drv.h"
 
 /**
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_mob.c b/drivers/gpu/drm/vmwgfx/vmwgfx_mob.c
index 7ed179d30ec5..d83cc66e1210 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_mob.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_mob.c
@@ -300,7 +300,8 @@ out_no_setup:
 						 &batch->otables[i]);
 	}
 
-	ttm_bo_unref(&batch->otable_bo);
+	ttm_bo_put(batch->otable_bo);
+	batch->otable_bo = NULL;
 out_no_bo:
 	return ret;
 }
@@ -365,7 +366,8 @@ static void vmw_otable_batch_takedown(struct vmw_private *dev_priv,
 	vmw_bo_fence_single(bo, NULL);
 	ttm_bo_unreserve(bo);
 
-	ttm_bo_unref(&batch->otable_bo);
+	ttm_bo_put(batch->otable_bo);
+	batch->otable_bo = NULL;
 }
 
 /*
@@ -463,7 +465,8 @@ static int vmw_mob_pt_populate(struct vmw_private *dev_priv,
 
 out_unreserve:
 	ttm_bo_unreserve(mob->pt_bo);
-	ttm_bo_unref(&mob->pt_bo);
+	ttm_bo_put(mob->pt_bo);
+	mob->pt_bo = NULL;
 
 	return ret;
 }
@@ -580,8 +583,10 @@ static void vmw_mob_pt_setup(struct vmw_mob *mob,
  */
 void vmw_mob_destroy(struct vmw_mob *mob)
 {
-	if (mob->pt_bo)
-		ttm_bo_unref(&mob->pt_bo);
+	if (mob->pt_bo) {
+		ttm_bo_put(mob->pt_bo);
+		mob->pt_bo = NULL;
+	}
 	kfree(mob);
 }
 
@@ -698,8 +703,10 @@ int vmw_mob_bind(struct vmw_private *dev_priv,
 
 out_no_cmd_space:
 	vmw_fifo_resource_dec(dev_priv);
-	if (pt_set_up)
-		ttm_bo_unref(&mob->pt_bo);
+	if (pt_set_up) {
+		ttm_bo_put(mob->pt_bo);
+		mob->pt_bo = NULL;
+	}
 
 	return -ENOMEM;
 }
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
index 3025bfc001a1..a7c30e567f09 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -461,7 +461,8 @@ vmw_resource_check_buffer(struct ww_acquire_ctx *ticket,
 	}
 
 	INIT_LIST_HEAD(&val_list);
-	val_buf->bo = ttm_bo_reference(&res->backup->base);
+	ttm_bo_get(&res->backup->base);
+	val_buf->bo = &res->backup->base;
 	val_buf->num_shared = 0;
 	list_add_tail(&val_buf->head, &val_list);
 	ret = ttm_eu_reserve_buffers(ticket, &val_list, interruptible, NULL);
@@ -484,7 +485,8 @@ vmw_resource_check_buffer(struct ww_acquire_ctx *ticket,
 out_no_validate:
 	ttm_eu_backoff_reservation(ticket, &val_list);
 out_no_reserve:
-	ttm_bo_unref(&val_buf->bo);
+	ttm_bo_put(val_buf->bo);
+	val_buf->bo = NULL;
 	if (backup_dirty)
 		vmw_bo_unreference(&res->backup);
 
@@ -544,7 +546,8 @@ vmw_resource_backoff_reservation(struct ww_acquire_ctx *ticket,
 	INIT_LIST_HEAD(&val_list);
 	list_add_tail(&val_buf->head, &val_list);
 	ttm_eu_backoff_reservation(ticket, &val_list);
-	ttm_bo_unref(&val_buf->bo);
+	ttm_bo_put(val_buf->bo);
+	val_buf->bo = NULL;
 }
 
 /**
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
index b3f547fc5d3d..e9944ac2e057 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
@@ -628,8 +628,10 @@ void vmw_validation_unref_lists(struct vmw_validation_context *ctx)
 	struct vmw_validation_bo_node *entry;
 	struct vmw_validation_res_node *val;
 
-	list_for_each_entry(entry, &ctx->bo_list, base.head)
-		ttm_bo_unref(&entry->base.bo);
+	list_for_each_entry(entry, &ctx->bo_list, base.head) {
+		ttm_bo_put(entry->base.bo);
+		entry->base.bo = NULL;
+	}
 
 	list_splice_init(&ctx->resource_ctx_list, &ctx->resource_list);
 	list_for_each_entry(val, &ctx->resource_list, head)
diff --git a/drivers/gpu/drm/xen/xen_drm_front.c b/drivers/gpu/drm/xen/xen_drm_front.c
index 4d3d36fc3a5d..3e78a832d7f9 100644
--- a/drivers/gpu/drm/xen/xen_drm_front.c
+++ b/drivers/gpu/drm/xen/xen_drm_front.c
@@ -10,7 +10,7 @@
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drm_gem.h>
 
 #include <linux/of_device.h>
diff --git a/drivers/gpu/drm/xen/xen_drm_front_conn.c b/drivers/gpu/drm/xen/xen_drm_front_conn.c
index c91ae532fa55..9f5f31f77f1e 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_conn.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_conn.c
@@ -9,7 +9,7 @@
  */
 
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include <video/videomode.h>
 
@@ -89,7 +89,6 @@ static const struct drm_connector_helper_funcs connector_helper_funcs = {
 };
 
 static const struct drm_connector_funcs connector_funcs = {
-	.dpms = drm_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = drm_connector_cleanup,
 	.reset = drm_atomic_helper_connector_reset,
diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c b/drivers/gpu/drm/xen/xen_drm_front_gem.c
index 28bc501af450..53c376d55fcf 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_gem.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c
@@ -11,9 +11,9 @@
 #include "xen_drm_front_gem.h"
 
 #include <drm/drmP.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_gem.h>
+#include <drm/drm_probe_helper.h>
 
 #include <linux/dma-buf.h>
 #include <linux/scatterlist.h>
@@ -235,8 +235,14 @@ static int gem_mmap_obj(struct xen_gem_object *xen_obj,
 	vma->vm_flags &= ~VM_PFNMAP;
 	vma->vm_flags |= VM_MIXEDMAP;
 	vma->vm_pgoff = 0;
-	vma->vm_page_prot =
-			pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+	/*
+	 * According to Xen on ARM ABI (xen/include/public/arch-arm.h):
+	 * all memory which is shared with other entities in the system
+	 * (including the hypervisor and other guests) must reside in memory
+	 * which is mapped as Normal Inner Write-Back Outer Write-Back
+	 * Inner-Shareable.
+	 */
+	vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
 
 	/*
 	 * vm_operations_struct.fault handler will be called if CPU access
@@ -282,8 +288,9 @@ void *xen_drm_front_gem_prime_vmap(struct drm_gem_object *gem_obj)
 	if (!xen_obj->pages)
 		return NULL;
 
+	/* Please see comment in gem_mmap_obj on mapping and attributes. */
 	return vmap(xen_obj->pages, xen_obj->num_pages,
-		    VM_MAP, pgprot_writecombine(PAGE_KERNEL));
+		    VM_MAP, PAGE_KERNEL);
 }
 
 void xen_drm_front_gem_prime_vunmap(struct drm_gem_object *gem_obj,
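Editor's note: the change from pgprot_writecombine() to the default protection bits is a correctness fix, not a tuning. Per the Xen-on-ARM ABI quoted in the new comment, pages shared with the hypervisor or other guests must be mapped Normal Write-Back Inner-Shareable, so mapping them write-combining (non-cacheable) from the guest would create mismatched attributes. A hedged sketch contrasting the two mappings (helper name hypothetical):

#include <linux/mm.h>

/* Choose mapping attributes for a GEM object's userspace mapping. */
static void set_mmap_prot(struct vm_area_struct *vma, bool xen_shared)
{
	if (xen_shared)
		/* cacheable write-back, matching the hypervisor's view */
		vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
	else
		/* the usual uncached write-combine scanout mapping */
		vma->vm_page_prot =
			pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
}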
diff --git a/drivers/gpu/drm/xen/xen_drm_front_kms.c b/drivers/gpu/drm/xen/xen_drm_front_kms.c
index a3479eb72d79..c2955d375394 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_kms.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_kms.c
@@ -13,9 +13,9 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_gem.h>
 #include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_probe_helper.h>
 
 #include "xen_drm_front.h"
 #include "xen_drm_front_conn.h"
@@ -54,7 +54,7 @@ fb_create(struct drm_device *dev, struct drm_file *filp,
 	  const struct drm_mode_fb_cmd2 *mode_cmd)
 {
 	struct xen_drm_front_drm_info *drm_info = dev->dev_private;
-	static struct drm_framebuffer *fb;
+	struct drm_framebuffer *fb;
 	struct drm_gem_object *gem_obj;
 	int ret;
 
diff --git a/drivers/gpu/drm/zte/zx_drm_drv.c b/drivers/gpu/drm/zte/zx_drm_drv.c
index f5ea32ae8600..28e8d6072910 100644
--- a/drivers/gpu/drm/zte/zx_drm_drv.c
+++ b/drivers/gpu/drm/zte/zx_drm_drv.c
@@ -18,12 +18,12 @@
 
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_of.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drmP.h>
 
 #include "zx_drm_drv.h"
diff --git a/drivers/gpu/drm/zte/zx_hdmi.c b/drivers/gpu/drm/zte/zx_hdmi.c
index 78655269d843..df522d74bebf 100644
--- a/drivers/gpu/drm/zte/zx_hdmi.c
+++ b/drivers/gpu/drm/zte/zx_hdmi.c
@@ -20,9 +20,9 @@
 #include <linux/of_device.h>
 
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_of.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drmP.h>
 
 #include <sound/hdmi-codec.h>
@@ -125,7 +125,9 @@ static int zx_hdmi_config_video_avi(struct zx_hdmi *hdmi,
 	union hdmi_infoframe frame;
 	int ret;
 
-	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi, mode, false);
+	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi,
+						       &hdmi->connector,
+						       mode);
 	if (ret) {
 		DRM_DEV_ERROR(hdmi->dev, "failed to get avi infoframe: %d\n",
 			      ret);
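Editor's note: drm_hdmi_avi_infoframe_from_display_mode() changed signature this cycle, taking the connector (and a const mode) instead of an is_hdmi2_sink flag so the helper can derive sink capabilities from the connector itself. A hedged sketch of a caller using the new form (function name hypothetical):

#include <linux/hdmi.h>
#include <drm/drm_edid.h>

static int fill_avi_infoframe(struct drm_connector *connector,
			      const struct drm_display_mode *mode)
{
	union hdmi_infoframe frame;

	/* the connector argument replaces the old is_hdmi2_sink bool */
	return drm_hdmi_avi_infoframe_from_display_mode(&frame.avi,
							connector, mode);
}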
diff --git a/drivers/gpu/drm/zte/zx_tvenc.c b/drivers/gpu/drm/zte/zx_tvenc.c
index b73afb212fb2..87b5d86413d2 100644
--- a/drivers/gpu/drm/zte/zx_tvenc.c
+++ b/drivers/gpu/drm/zte/zx_tvenc.c
@@ -14,7 +14,7 @@
 #include <linux/regmap.h>
 
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drmP.h>
 
 #include "zx_drm_drv.h"
diff --git a/drivers/gpu/drm/zte/zx_vga.c b/drivers/gpu/drm/zte/zx_vga.c
index 23d1ff4355a0..e14c1d709740 100644
--- a/drivers/gpu/drm/zte/zx_vga.c
+++ b/drivers/gpu/drm/zte/zx_vga.c
@@ -13,7 +13,7 @@
 #include <linux/regmap.h>
 
 #include <drm/drm_atomic_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drmP.h>
 
 #include "zx_drm_drv.h"
diff --git a/drivers/gpu/drm/zte/zx_vou.c b/drivers/gpu/drm/zte/zx_vou.c
index 442311d31110..15400ffb1d22 100644
--- a/drivers/gpu/drm/zte/zx_vou.c
+++ b/drivers/gpu/drm/zte/zx_vou.c
@@ -15,12 +15,12 @@
 
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
-#include <drm/drm_crtc_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_of.h>
 #include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
 #include <drm/drmP.h>
 
 #include "zx_common_regs.h"
diff --git a/drivers/gpu/host1x/bus.c b/drivers/gpu/host1x/bus.c
index b4c385d4a6af..103fffc1904b 100644
--- a/drivers/gpu/host1x/bus.c
+++ b/drivers/gpu/host1x/bus.c
@@ -15,8 +15,10 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/debugfs.h>
 #include <linux/host1x.h>
 #include <linux/of.h>
+#include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <linux/of_device.h>
 
@@ -500,6 +502,36 @@ static void host1x_detach_driver(struct host1x *host1x,
 	mutex_unlock(&host1x->devices_lock);
 }
 
+static int host1x_devices_show(struct seq_file *s, void *data)
+{
+	struct host1x *host1x = s->private;
+	struct host1x_device *device;
+
+	mutex_lock(&host1x->devices_lock);
+
+	list_for_each_entry(device, &host1x->devices, list) {
+		struct host1x_subdev *subdev;
+
+		seq_printf(s, "%s\n", dev_name(&device->dev));
+
+		mutex_lock(&device->subdevs_lock);
+
+		list_for_each_entry(subdev, &device->active, list)
+			seq_printf(s, "  %pOFf: %s\n", subdev->np,
+				   dev_name(subdev->client->dev));
+
+		list_for_each_entry(subdev, &device->subdevs, list)
+			seq_printf(s, "  %pOFf:\n", subdev->np);
+
+		mutex_unlock(&device->subdevs_lock);
+	}
+
+	mutex_unlock(&host1x->devices_lock);
+
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(host1x_devices);
+
 /**
  * host1x_register() - register a host1x controller
  * @host1x: host1x controller
@@ -523,6 +555,9 @@ int host1x_register(struct host1x *host1x)
 
 	mutex_unlock(&drivers_lock);
 
+	debugfs_create_file("devices", S_IRUGO, host1x->debugfs, host1x,
+			    &host1x_devices_fops);
+
 	return 0;
 }
 
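Editor's note: the new "devices" file relies on DEFINE_SHOW_ATTRIBUTE(), which expands to the single_open()-based file_operations boilerplate around one seq_file show callback; the last argument of debugfs_create_file() arrives in s->private. A minimal sketch of the pattern with hypothetical names:

#include <linux/debugfs.h>
#include <linux/seq_file.h>

static int things_show(struct seq_file *s, void *data)
{
	void *priv = s->private;	/* pointer passed to create_file */

	seq_printf(s, "private data at %p\n", priv);
	return 0;
}
DEFINE_SHOW_ATTRIBUTE(things);		/* generates things_fops */

static void things_debugfs_init(struct dentry *root, void *priv)
{
	debugfs_create_file("things", 0444, root, priv, &things_fops);
}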
diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c
index 91df51e631b2..f45b7c69b694 100644
--- a/drivers/gpu/host1x/cdma.c
+++ b/drivers/gpu/host1x/cdma.c
@@ -41,7 +41,17 @@
  * means that the push buffer is full, not empty.
  */
 
-#define HOST1X_PUSHBUFFER_SLOTS	512
+/*
+ * Typically the commands written into the push buffer are a pair of words. We
+ * use slots to represent each of these pairs and to simplify things. Note the
+ * strange number of slots allocated here. 512 slots will fit exactly within a
+ * single memory page. We also need one additional word at the end of the push
+ * buffer for the RESTART opcode that will instruct the CDMA to jump back to
+ * the beginning of the push buffer. With 512 slots, this means that we'll use
+ * 2 memory pages and waste 4092 bytes of the second page that will never be
+ * used.
+ */
+#define HOST1X_PUSHBUFFER_SLOTS	511
 
 /*
  * Clean up push buffer resources
@@ -143,7 +153,10 @@ static void host1x_pushbuffer_push(struct push_buffer *pb, u32 op1, u32 op2)
 	WARN_ON(pb->pos == pb->fence);
 	*(p++) = op1;
 	*(p++) = op2;
-	pb->pos = (pb->pos + 8) & (pb->size - 1);
+	pb->pos += 8;
+
+	if (pb->pos >= pb->size)
+		pb->pos -= pb->size;
 }
 
 /*
@@ -153,7 +166,10 @@ static void host1x_pushbuffer_push(struct push_buffer *pb, u32 op1, u32 op2)
 static void host1x_pushbuffer_pop(struct push_buffer *pb, unsigned int slots)
 {
 	/* Advance the next write position */
-	pb->fence = (pb->fence + slots * 8) & (pb->size - 1);
+	pb->fence += slots * 8;
+
+	if (pb->fence >= pb->size)
+		pb->fence -= pb->size;
 }
 
 /*
@@ -161,7 +177,12 @@ static void host1x_pushbuffer_pop(struct push_buffer *pb, unsigned int slots)
  */
 static u32 host1x_pushbuffer_space(struct push_buffer *pb)
 {
-	return ((pb->fence - pb->pos) & (pb->size - 1)) / 8;
+	unsigned int fence = pb->fence;
+
+	if (pb->fence < pb->pos)
+		fence += pb->size;
+
+	return (fence - pb->pos) / 8;
 }
 
 /*
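Editor's note: because 511 slots is not a power of two, the `& (pb->size - 1)` masking trick for wraparound no longer works, and the hunks above switch to explicit compare-and-subtract. A standalone sketch of the same ring arithmetic (8 bytes per slot, sizes hypothetical):

#include <assert.h>
#include <stdint.h>

#define SLOT_SIZE 8u

/* Advance an offset around a ring whose size need not be a power of
 * two; valid as long as bytes < size, which holds here (bytes <= 16). */
static uint32_t ring_advance(uint32_t pos, uint32_t bytes, uint32_t size)
{
	pos += bytes;
	if (pos >= size)
		pos -= size;
	return pos;
}

/* Free slots between the write pointer (pos) and read pointer (fence). */
static uint32_t ring_space(uint32_t pos, uint32_t fence, uint32_t size)
{
	if (fence < pos)
		fence += size;
	return (fence - pos) / SLOT_SIZE;
}

int main(void)
{
	uint32_t size = 511 * SLOT_SIZE;	/* 4088 bytes: not 2^n */

	assert(ring_advance(size - SLOT_SIZE, SLOT_SIZE, size) == 0);
	assert(ring_space(0, size - SLOT_SIZE, size) == 510);
	return 0;
}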
@@ -210,7 +231,7 @@ unsigned int host1x_cdma_wait_locked(struct host1x_cdma *cdma,
 		cdma->event = event;
 
 		mutex_unlock(&cdma->lock);
-		down(&cdma->sem);
+		wait_for_completion(&cdma->complete);
 		mutex_lock(&cdma->lock);
 	}
 
@@ -218,6 +239,45 @@ unsigned int host1x_cdma_wait_locked(struct host1x_cdma *cdma,
 }
 
 /*
+ * Sleep (if necessary) until the push buffer has enough free space.
+ *
+ * Must be called with the cdma lock held.
+ */
+int host1x_cdma_wait_pushbuffer_space(struct host1x *host1x,
+				      struct host1x_cdma *cdma,
+				      unsigned int needed)
+{
+	while (true) {
+		struct push_buffer *pb = &cdma->push_buffer;
+		unsigned int space;
+
+		space = host1x_pushbuffer_space(pb);
+		if (space >= needed)
+			break;
+
+		trace_host1x_wait_cdma(dev_name(cdma_to_channel(cdma)->dev),
+				       CDMA_EVENT_PUSH_BUFFER_SPACE);
+
+		host1x_hw_cdma_flush(host1x, cdma);
+
+		/* If somebody has managed to already start waiting, yield */
+		if (cdma->event != CDMA_EVENT_NONE) {
+			mutex_unlock(&cdma->lock);
+			schedule();
+			mutex_lock(&cdma->lock);
+			continue;
+		}
+
+		cdma->event = CDMA_EVENT_PUSH_BUFFER_SPACE;
+
+		mutex_unlock(&cdma->lock);
+		wait_for_completion(&cdma->complete);
+		mutex_lock(&cdma->lock);
+	}
+
+	return 0;
+}
+
+/*
  * Start timer that tracks the time spent by the job.
  * Must be called with the cdma lock held.
  */
@@ -314,7 +374,7 @@ static void update_cdma_locked(struct host1x_cdma *cdma)
 
 	if (signal) {
 		cdma->event = CDMA_EVENT_NONE;
-		up(&cdma->sem);
+		complete(&cdma->complete);
 	}
 }
 
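Editor's note: swapping the semaphore for a struct completion matches the actual usage, where a single waiter sleeps until the interrupt path reports that the awaited CDMA event happened. A hedged kernel-style sketch of the wait/signal pair:

#include <linux/completion.h>

struct event_waiter {
	struct completion complete;
};

static void event_waiter_init(struct event_waiter *w)
{
	init_completion(&w->complete);
}

/* waiting side, as in host1x_cdma_wait_locked() */
static void event_wait(struct event_waiter *w)
{
	wait_for_completion(&w->complete);
}

/* signalling side, as in update_cdma_locked() */
static void event_signal(struct event_waiter *w)
{
	complete(&w->complete);
}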
@@ -323,7 +383,7 @@ void host1x_cdma_update_sync_queue(struct host1x_cdma *cdma,
 {
 	struct host1x *host1x = cdma_to_host1x(cdma);
 	u32 restart_addr, syncpt_incrs, syncpt_val;
-	struct host1x_job *job = NULL;
+	struct host1x_job *job, *next_job = NULL;
 
 	syncpt_val = host1x_syncpt_load(cdma->timeout.syncpt);
 
@@ -341,40 +401,37 @@ void host1x_cdma_update_sync_queue(struct host1x_cdma *cdma,
 		__func__);
 
 	list_for_each_entry(job, &cdma->sync_queue, list) {
-		if (syncpt_val < job->syncpt_end)
-			break;
+		if (syncpt_val < job->syncpt_end) {
+
+			if (!list_is_last(&job->list, &cdma->sync_queue))
+				next_job = list_next_entry(job, list);
+
+			goto syncpt_incr;
+		}
 
 		host1x_job_dump(dev, job);
 	}
 
+	/* all jobs have been completed */
+	job = NULL;
+
+syncpt_incr:
+
 	/*
-	 * Walk the sync_queue, first incrementing with the CPU syncpts that
-	 * are partially executed (the first buffer) or fully skipped while
-	 * still in the current context (slots are also NOP-ed).
+	 * Increment with CPU the remaining syncpts of a partially executed job.
 	 *
-	 * At the point contexts are interleaved, syncpt increments must be
-	 * done inline with the pushbuffer from a GATHER buffer to maintain
-	 * the order (slots are modified to be a GATHER of syncpt incrs).
-	 *
-	 * Note: save in restart_addr the location where the timed out buffer
-	 * started in the PB, so we can start the refetch from there (with the
-	 * modified NOP-ed PB slots). This lets things appear to have completed
-	 * properly for this buffer and resources are freed.
+	 * CDMA will continue execution starting with the next job or will get
+	 * into idle state.
 	 */
-
-	dev_dbg(dev, "%s: perform CPU incr on pending same ctx buffers\n",
-		__func__);
-
-	if (!list_empty(&cdma->sync_queue))
-		restart_addr = job->first_get;
+	if (next_job)
+		restart_addr = next_job->first_get;
 	else
 		restart_addr = cdma->last_pos;
 
-	/* do CPU increments as long as this context continues */
-	list_for_each_entry_from(job, &cdma->sync_queue, list) {
-		/* different context, gets us out of this loop */
-		if (job->client != cdma->timeout.client)
-			break;
+	/* do CPU increments for the remaining syncpts */
+	if (job) {
+		dev_dbg(dev, "%s: perform CPU incr on pending buffers\n",
+			__func__);
 
 		/* won't need a timeout when replayed */
 		job->timeout = 0;
@@ -389,21 +446,10 @@ void host1x_cdma_update_sync_queue(struct host1x_cdma *cdma,
 						syncpt_incrs, job->syncpt_end,
 						job->num_slots);
 
-		syncpt_val += syncpt_incrs;
+		dev_dbg(dev, "%s: finished sync_queue modification\n",
+			__func__);
 	}
 
-	/*
-	 * The following sumbits from the same client may be dependent on the
-	 * failed submit and therefore they may fail. Force a small timeout
-	 * to make the queue cleanup faster.
-	 */
-
-	list_for_each_entry_from(job, &cdma->sync_queue, list)
-		if (job->client == cdma->timeout.client)
-			job->timeout = min_t(unsigned int, job->timeout, 500);
-
-	dev_dbg(dev, "%s: finished sync_queue modification\n", __func__);
-
 	/* roll back DMAGET and start up channel again */
 	host1x_hw_cdma_resume(host1x, cdma, restart_addr);
 }
@@ -416,7 +462,7 @@ int host1x_cdma_init(struct host1x_cdma *cdma)
 	int err;
 
 	mutex_init(&cdma->lock);
-	sema_init(&cdma->sem, 0);
+	init_completion(&cdma->complete);
 
 	INIT_LIST_HEAD(&cdma->sync_queue);
 
@@ -510,6 +556,59 @@ void host1x_cdma_push(struct host1x_cdma *cdma, u32 op1, u32 op2)
 }
 
 /*
+ * Push four words into two consecutive push buffer slots. Note that extra
+ * care needs to be taken not to split the two slots across the end of the
+ * push buffer. Otherwise the RESTART opcode at the end of the push buffer
+ * that ensures processing will restart at the beginning will break up the
+ * four words.
+ *
+ * Blocks as necessary if the push buffer is full.
+ */
+void host1x_cdma_push_wide(struct host1x_cdma *cdma, u32 op1, u32 op2,
+			   u32 op3, u32 op4)
+{
+	struct host1x_channel *channel = cdma_to_channel(cdma);
+	struct host1x *host1x = cdma_to_host1x(cdma);
+	struct push_buffer *pb = &cdma->push_buffer;
+	unsigned int needed = 2, extra = 0, i;
+	unsigned int space = cdma->slots_free;
+
+	if (host1x_debug_trace_cmdbuf)
+		trace_host1x_cdma_push_wide(dev_name(channel->dev), op1, op2,
+					    op3, op4);
+
+	/* compute number of extra slots needed for padding */
+	if (pb->pos + 16 > pb->size) {
+		extra = (pb->size - pb->pos) / 8;
+		needed += extra;
+	}
+
+	host1x_cdma_wait_pushbuffer_space(host1x, cdma, needed);
+	space = host1x_pushbuffer_space(pb);
+
+	cdma->slots_free = space - needed;
+	cdma->slots_used += needed;
+
+	/*
+	 * Note that we rely on the fact that this is only used to submit wide
+	 * gather opcodes, which consist of 3 words, and they are padded with
+	 * a NOP to avoid having to deal with fractional slots (a slot always
+	 * represents 2 words). The fourth opcode passed to this function will
+	 * therefore always be a NOP.
+	 *
+	 * This works around a slight ambiguity when it comes to opcodes. For
+	 * all current host1x incarnations the NOP opcode uses the exact same
+	 * encoding (0x20000000), so we could hard-code the value here, but a
+	 * new incarnation may change it and break that assumption.
+	 */
+	for (i = 0; i < extra; i++)
+		host1x_pushbuffer_push(pb, op4, op4);
+
+	host1x_pushbuffer_push(pb, op1, op2);
+	host1x_pushbuffer_push(pb, op3, op4);
+}
+
+/*
  * End a cdma submit
  * Kick off DMA, add job to the sync queue, and a number of slots to be freed
  * from the pushbuffer. The handles for a submit must all be pinned at the same
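Editor's note: the four words pushed by host1x_cdma_push_wide() must stay physically contiguous, so when fewer than 16 bytes remain before the end of the ring the function first fills the tail with NOP slots. A standalone sketch of the padding computation (helper name hypothetical):

#include <assert.h>
#include <stdint.h>

/* NOP slots needed so a 16-byte, two-slot write never straddles the
 * end of a ring of `size` bytes, as in host1x_cdma_push_wide(). */
static uint32_t wide_padding_slots(uint32_t pos, uint32_t size)
{
	if (pos + 16 > size)
		return (size - pos) / 8;

	return 0;
}

int main(void)
{
	uint32_t size = 511 * 8;			 /* 4088 bytes */

	assert(wide_padding_slots(0, size) == 0);	  /* plenty of room */
	assert(wide_padding_slots(size - 16, size) == 0); /* exact fit */
	assert(wide_padding_slots(size - 8, size) == 1);  /* pad one slot */
	return 0;
}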
diff --git a/drivers/gpu/host1x/cdma.h b/drivers/gpu/host1x/cdma.h
index e97e17b82370..3a5e0408b8d1 100644
--- a/drivers/gpu/host1x/cdma.h
+++ b/drivers/gpu/host1x/cdma.h
@@ -20,7 +20,7 @@
 #define __HOST1X_CDMA_H
 
 #include <linux/sched.h>
-#include <linux/semaphore.h>
+#include <linux/completion.h>
 #include <linux/list.h>
 
 struct host1x_syncpt;
@@ -69,8 +69,8 @@ enum cdma_event {
 
 struct host1x_cdma {
 	struct mutex lock;		/* controls access to shared state */
-	struct semaphore sem;		/* signalled when event occurs */
-	enum cdma_event event;		/* event that sem is waiting for */
+	struct completion complete;	/* signalled when event occurs */
+	enum cdma_event event;		/* event that complete is waiting for */
 	unsigned int slots_used;	/* pb slots used in current submit */
 	unsigned int slots_free;	/* pb slots free in current submit */
 	unsigned int first_get;		/* DMAGET value, where submit begins */
@@ -90,6 +90,8 @@ int host1x_cdma_init(struct host1x_cdma *cdma);
 int host1x_cdma_deinit(struct host1x_cdma *cdma);
 int host1x_cdma_begin(struct host1x_cdma *cdma, struct host1x_job *job);
 void host1x_cdma_push(struct host1x_cdma *cdma, u32 op1, u32 op2);
+void host1x_cdma_push_wide(struct host1x_cdma *cdma, u32 op1, u32 op2,
+			   u32 op3, u32 op4);
 void host1x_cdma_end(struct host1x_cdma *cdma, struct host1x_job *job);
 void host1x_cdma_update(struct host1x_cdma *cdma);
 void host1x_cdma_peek(struct host1x_cdma *cdma, u32 dmaget, int slot,
diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
index 419d8929a98f..ee3c7b81a29d 100644
--- a/drivers/gpu/host1x/dev.c
+++ b/drivers/gpu/host1x/dev.c
@@ -120,6 +120,15 @@ static const struct host1x_info host1x05_info = {
 	.dma_mask = DMA_BIT_MASK(34),
 };
 
+static const struct host1x_sid_entry tegra186_sid_table[] = {
+	{
+		/* VIC */
+		.base = 0x1af0,
+		.offset = 0x30,
+		.limit = 0x34
+	},
+};
+
 static const struct host1x_info host1x06_info = {
 	.nb_channels = 63,
 	.nb_pts = 576,
@@ -127,8 +136,19 @@ static const struct host1x_info host1x06_info = {
 	.nb_bases = 16,
 	.init = host1x06_init,
 	.sync_offset = 0x0,
-	.dma_mask = DMA_BIT_MASK(34),
+	.dma_mask = DMA_BIT_MASK(40),
 	.has_hypervisor = true,
+	.num_sid_entries = ARRAY_SIZE(tegra186_sid_table),
+	.sid_table = tegra186_sid_table,
+};
+
+static const struct host1x_sid_entry tegra194_sid_table[] = {
+	{
+		/* VIC */
+		.base = 0x1af0,
+		.offset = 0x30,
+		.limit = 0x34
+	},
 };
 
 static const struct host1x_info host1x07_info = {
@@ -140,6 +160,8 @@ static const struct host1x_info host1x07_info = {
 	.sync_offset = 0x0,
 	.dma_mask = DMA_BIT_MASK(40),
 	.has_hypervisor = true,
+	.num_sid_entries = ARRAY_SIZE(tegra194_sid_table),
+	.sid_table = tegra194_sid_table,
 };
 
 static const struct of_device_id host1x_of_match[] = {
@@ -154,6 +176,19 @@ static const struct of_device_id host1x_of_match[] = {
 };
 MODULE_DEVICE_TABLE(of, host1x_of_match);
 
+static void host1x_setup_sid_table(struct host1x *host)
+{
+	const struct host1x_info *info = host->info;
+	unsigned int i;
+
+	for (i = 0; i < info->num_sid_entries; i++) {
+		const struct host1x_sid_entry *entry = &info->sid_table[i];
+
+		host1x_hypervisor_writel(host, entry->offset, entry->base);
+		host1x_hypervisor_writel(host, entry->limit, entry->base + 4);
+	}
+}
+
 static int host1x_probe(struct platform_device *pdev)
 {
 	struct host1x *host;
@@ -248,6 +283,8 @@ static int host1x_probe(struct platform_device *pdev)
 	host->group = iommu_group_get(&pdev->dev);
 	if (host->group) {
 		struct iommu_domain_geometry *geometry;
+		u64 mask = dma_get_mask(host->dev);
+		dma_addr_t start, end;
 		unsigned long order;
 
 		err = iova_cache_get();
@@ -275,11 +312,12 @@ static int host1x_probe(struct platform_device *pdev)
 		}
 
 		geometry = &host->domain->geometry;
+		start = geometry->aperture_start & mask;
+		end = geometry->aperture_end & mask;
 
 		order = __ffs(host->domain->pgsize_bitmap);
-		init_iova_domain(&host->iova, 1UL << order,
-				 geometry->aperture_start >> order);
-		host->iova_end = geometry->aperture_end;
+		init_iova_domain(&host->iova, 1UL << order, start >> order);
+		host->iova_end = end;
 	}
 
 skip_iommu:
@@ -316,6 +354,9 @@ skip_iommu:
 
 	host1x_debug_init(host);
 
+	if (host->info->has_hypervisor)
+		host1x_setup_sid_table(host);
+
 	err = host1x_register(host);
 	if (err < 0)
 		goto fail_deinit_intr;
diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
index 36f44ffebe73..05216a7e4830 100644
--- a/drivers/gpu/host1x/dev.h
+++ b/drivers/gpu/host1x/dev.h
@@ -94,6 +94,12 @@ struct host1x_intr_ops {
 	int (*free_syncpt_irq)(struct host1x *host);
 };
 
+struct host1x_sid_entry {
+	unsigned int base;
+	unsigned int offset;
+	unsigned int limit;
+};
+
 struct host1x_info {
 	unsigned int nb_channels; /* host1x: number of channels supported */
 	unsigned int nb_pts; /* host1x: number of syncpoints supported */
@@ -103,6 +109,8 @@ struct host1x_info {
 	unsigned int sync_offset; /* offset of syncpoint registers */
 	u64 dma_mask; /* mask of addressable memory */
 	bool has_hypervisor; /* has hypervisor registers */
+	unsigned int num_sid_entries;
+	const struct host1x_sid_entry *sid_table;
 };
 
 struct host1x {
diff --git a/drivers/gpu/host1x/hw/cdma_hw.c b/drivers/gpu/host1x/hw/cdma_hw.c
index ce320534cbed..5d61088db2bb 100644
--- a/drivers/gpu/host1x/hw/cdma_hw.c
+++ b/drivers/gpu/host1x/hw/cdma_hw.c
@@ -39,8 +39,6 @@ static void push_buffer_init(struct push_buffer *pb)
 static void cdma_timeout_cpu_incr(struct host1x_cdma *cdma, u32 getptr,
 				u32 syncpt_incrs, u32 syncval, u32 nr_slots)
 {
-	struct host1x *host1x = cdma_to_host1x(cdma);
-	struct push_buffer *pb = &cdma->push_buffer;
 	unsigned int i;
 
 	for (i = 0; i < syncpt_incrs; i++)
@@ -48,18 +46,6 @@ static void cdma_timeout_cpu_incr(struct host1x_cdma *cdma, u32 getptr,
 
 	/* after CPU incr, ensure shadow is up to date */
 	host1x_syncpt_load(cdma->timeout.syncpt);
-
-	/* NOP all the PB slots */
-	while (nr_slots--) {
-		u32 *p = (u32 *)(pb->mapped + getptr);
-		*(p++) = HOST1X_OPCODE_NOP;
-		*(p++) = HOST1X_OPCODE_NOP;
-		dev_dbg(host1x->dev, "%s: NOP at %pad+%#x\n", __func__,
-			&pb->dma, getptr);
-		getptr = (getptr + 8) & (pb->size - 1);
-	}
-
-	wmb();
 }
 
 /*
@@ -68,20 +54,31 @@ static void cdma_timeout_cpu_incr(struct host1x_cdma *cdma, u32 getptr,
 static void cdma_start(struct host1x_cdma *cdma)
 {
 	struct host1x_channel *ch = cdma_to_channel(cdma);
+	u64 start, end;
 
 	if (cdma->running)
 		return;
 
 	cdma->last_pos = cdma->push_buffer.pos;
+	start = cdma->push_buffer.dma;
+	end = cdma->push_buffer.size + 4;
 
 	host1x_ch_writel(ch, HOST1X_CHANNEL_DMACTRL_DMASTOP,
 			 HOST1X_CHANNEL_DMACTRL);
 
 	/* set base, put and end pointer */
-	host1x_ch_writel(ch, cdma->push_buffer.dma, HOST1X_CHANNEL_DMASTART);
+	host1x_ch_writel(ch, lower_32_bits(start), HOST1X_CHANNEL_DMASTART);
+#if HOST1X_HW >= 6
+	host1x_ch_writel(ch, upper_32_bits(start), HOST1X_CHANNEL_DMASTART_HI);
+#endif
 	host1x_ch_writel(ch, cdma->push_buffer.pos, HOST1X_CHANNEL_DMAPUT);
-	host1x_ch_writel(ch, cdma->push_buffer.dma + cdma->push_buffer.size + 4,
-			 HOST1X_CHANNEL_DMAEND);
+#if HOST1X_HW >= 6
+	host1x_ch_writel(ch, 0, HOST1X_CHANNEL_DMAPUT_HI);
+#endif
+	host1x_ch_writel(ch, lower_32_bits(end), HOST1X_CHANNEL_DMAEND);
+#if HOST1X_HW >= 6
+	host1x_ch_writel(ch, upper_32_bits(end), HOST1X_CHANNEL_DMAEND_HI);
+#endif
 
 	/* reset GET */
 	host1x_ch_writel(ch, HOST1X_CHANNEL_DMACTRL_DMASTOP |
@@ -104,6 +101,7 @@ static void cdma_timeout_restart(struct host1x_cdma *cdma, u32 getptr)
 {
 	struct host1x *host1x = cdma_to_host1x(cdma);
 	struct host1x_channel *ch = cdma_to_channel(cdma);
+	u64 start, end;
 
 	if (cdma->running)
 		return;
@@ -113,10 +111,18 @@ static void cdma_timeout_restart(struct host1x_cdma *cdma, u32 getptr)
 	host1x_ch_writel(ch, HOST1X_CHANNEL_DMACTRL_DMASTOP,
 			 HOST1X_CHANNEL_DMACTRL);
 
+	start = cdma->push_buffer.dma;
+	end = cdma->push_buffer.size + 4;
+
 	/* set base, end pointer (all of memory) */
-	host1x_ch_writel(ch, cdma->push_buffer.dma, HOST1X_CHANNEL_DMASTART);
-	host1x_ch_writel(ch, cdma->push_buffer.dma + cdma->push_buffer.size,
-			 HOST1X_CHANNEL_DMAEND);
+	host1x_ch_writel(ch, lower_32_bits(start), HOST1X_CHANNEL_DMASTART);
+#if HOST1X_HW >= 6
+	host1x_ch_writel(ch, upper_32_bits(start), HOST1X_CHANNEL_DMASTART_HI);
+#endif
+	host1x_ch_writel(ch, lower_32_bits(end), HOST1X_CHANNEL_DMAEND);
+#if HOST1X_HW >= 6
+	host1x_ch_writel(ch, upper_32_bits(end), HOST1X_CHANNEL_DMAEND_HI);
+#endif
 
 	/* set GET, by loading the value in PUT (then reset GET) */
 	host1x_ch_writel(ch, getptr, HOST1X_CHANNEL_DMAPUT);
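Editor's note: with the Tegra186/194 DMA mask widened to 40 bits, push-buffer addresses can exceed 32 bits, so host1x hardware revision 6 and later programs the new *_HI registers with the upper half of each pointer. The kernel's lower_32_bits()/upper_32_bits() helpers do the split; a standalone sketch with local reimplementations:

#include <assert.h>
#include <stdint.h>

static uint32_t lower_32_bits(uint64_t v) { return (uint32_t)v; }
static uint32_t upper_32_bits(uint64_t v) { return (uint32_t)(v >> 32); }

int main(void)
{
	uint64_t addr = 0x12ffff0000ull;	/* a 40-bit DMA address */

	assert(lower_32_bits(addr) == 0xffff0000u);
	assert(upper_32_bits(addr) == 0x12u);
	return 0;
}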
diff --git a/drivers/gpu/host1x/hw/channel_hw.c b/drivers/gpu/host1x/hw/channel_hw.c
index 95ea81172a83..27101c04a827 100644
--- a/drivers/gpu/host1x/hw/channel_hw.c
+++ b/drivers/gpu/host1x/hw/channel_hw.c
@@ -17,6 +17,7 @@
  */
 
 #include <linux/host1x.h>
+#include <linux/iommu.h>
 #include <linux/slab.h>
 
 #include <trace/events/host1x.h>
@@ -60,15 +61,37 @@ static void trace_write_gather(struct host1x_cdma *cdma, struct host1x_bo *bo,
 static void submit_gathers(struct host1x_job *job)
 {
 	struct host1x_cdma *cdma = &job->channel->cdma;
+#if HOST1X_HW < 6
+	struct device *dev = job->channel->dev;
+#endif
 	unsigned int i;
 
 	for (i = 0; i < job->num_gathers; i++) {
 		struct host1x_job_gather *g = &job->gathers[i];
-		u32 op1 = host1x_opcode_gather(g->words);
-		u32 op2 = g->base + g->offset;
+		dma_addr_t addr = g->base + g->offset;
+		u32 op2, op3;
+
+		op2 = lower_32_bits(addr);
+		op3 = upper_32_bits(addr);
+
+		trace_write_gather(cdma, g->bo, g->offset, g->words);
+
+		if (op3 != 0) {
+#if HOST1X_HW >= 6
+			u32 op1 = host1x_opcode_gather_wide(g->words);
+			u32 op4 = HOST1X_OPCODE_NOP;
+
+			host1x_cdma_push_wide(cdma, op1, op2, op3, op4);
+#else
+			dev_err(dev, "invalid gather for push buffer %pad\n",
+				&addr);
+			continue;
+#endif
+		} else {
+			u32 op1 = host1x_opcode_gather(g->words);
 
-		trace_write_gather(cdma, g->bo, g->offset, op1 & 0xffff);
-		host1x_cdma_push(cdma, op1, op2);
+			host1x_cdma_push(cdma, op1, op2);
+		}
 	}
 }
 
@@ -89,6 +112,16 @@ static inline void synchronize_syncpt_base(struct host1x_job *job)
 			 HOST1X_UCLASS_LOAD_SYNCPT_BASE_VALUE_F(value));
 }
 
+static void host1x_channel_set_streamid(struct host1x_channel *channel)
+{
+#if HOST1X_HW >= 6
+	struct iommu_fwspec *spec = dev_iommu_fwspec_get(channel->dev->parent);
+	u32 sid = spec ? spec->ids[0] & 0xffff : 0x7f;
+
+	host1x_ch_writel(channel, sid, HOST1X_CHANNEL_SMMU_STREAMID);
+#endif
+}
+
 static int channel_submit(struct host1x_job *job)
 {
 	struct host1x_channel *ch = job->channel;
@@ -120,6 +153,8 @@ static int channel_submit(struct host1x_job *job)
 		goto error;
 	}
 
+	host1x_channel_set_streamid(ch);
+
 	/* begin a CDMA submit */
 	err = host1x_cdma_begin(&ch->cdma, job);
 	if (err) {
diff --git a/drivers/gpu/host1x/hw/host1x06_hardware.h b/drivers/gpu/host1x/hw/host1x06_hardware.h
index 3039c92ea605..dd37b10c8d04 100644
--- a/drivers/gpu/host1x/hw/host1x06_hardware.h
+++ b/drivers/gpu/host1x/hw/host1x06_hardware.h
@@ -22,6 +22,7 @@
 #include <linux/types.h>
 #include <linux/bitops.h>
 
+#include "hw_host1x06_channel.h"
 #include "hw_host1x06_uclass.h"
 #include "hw_host1x06_vm.h"
 #include "hw_host1x06_hypervisor.h"
@@ -137,6 +138,11 @@ static inline u32 host1x_opcode_gather_incr(unsigned offset, unsigned count)
 	return (6 << 28) | (offset << 16) | BIT(15) | BIT(14) | count;
 }
 
+static inline u32 host1x_opcode_gather_wide(unsigned count)
+{
+	return (12 << 28) | count;
+}
+
 #define HOST1X_OPCODE_NOP host1x_opcode_nonincr(0, 0)
 
 #endif
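Editor's note: GATHER_WIDE is opcode class 12 in the top nibble with the word count in the low bits; the 64-bit base address follows as two separate words, padded with a NOP so the sequence occupies exactly two slots. A standalone sketch of the encoding (values taken from the header above):

#include <assert.h>
#include <stdint.h>

static uint32_t opcode_gather_wide(unsigned int count)
{
	return (12u << 28) | count;	/* as in host1x_opcode_gather_wide() */
}

#define OPCODE_NOP (2u << 28)	/* nonincr(0, 0), i.e. 0x20000000 */

int main(void)
{
	/* a 128-word wide gather encodes as 0xc0000080 */
	assert(opcode_gather_wide(128) == 0xc0000080u);
	assert(OPCODE_NOP == 0x20000000u);
	return 0;
}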
diff --git a/drivers/gpu/host1x/hw/host1x07_hardware.h b/drivers/gpu/host1x/hw/host1x07_hardware.h
index 1353e7ab71dd..9f6da4ee5443 100644
--- a/drivers/gpu/host1x/hw/host1x07_hardware.h
+++ b/drivers/gpu/host1x/hw/host1x07_hardware.h
@@ -22,6 +22,7 @@
 #include <linux/types.h>
 #include <linux/bitops.h>
 
+#include "hw_host1x07_channel.h"
 #include "hw_host1x07_uclass.h"
 #include "hw_host1x07_vm.h"
 #include "hw_host1x07_hypervisor.h"
@@ -137,6 +138,11 @@ static inline u32 host1x_opcode_gather_incr(unsigned offset, unsigned count)
 	return (6 << 28) | (offset << 16) | BIT(15) | BIT(14) | count;
 }
 
+static inline u32 host1x_opcode_gather_wide(unsigned count)
+{
+	return (12 << 28) | count;
+}
+
 #define HOST1X_OPCODE_NOP host1x_opcode_nonincr(0, 0)
 
 #endif
diff --git a/drivers/gpu/host1x/hw/hw_host1x06_channel.h b/drivers/gpu/host1x/hw/hw_host1x06_channel.h
new file mode 100644
index 000000000000..18ae1c57bbea
--- /dev/null
+++ b/drivers/gpu/host1x/hw/hw_host1x06_channel.h
@@ -0,0 +1,11 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2019 NVIDIA Corporation.
+ */
+
+#ifndef HOST1X_HW_HOST1X06_CHANNEL_H
+#define HOST1X_HW_HOST1X06_CHANNEL_H
+
+#define HOST1X_CHANNEL_SMMU_STREAMID 0x084
+
+#endif
diff --git a/drivers/gpu/host1x/hw/hw_host1x07_channel.h b/drivers/gpu/host1x/hw/hw_host1x07_channel.h
new file mode 100644
index 000000000000..96fa72bbd7ab
--- /dev/null
+++ b/drivers/gpu/host1x/hw/hw_host1x07_channel.h
@@ -0,0 +1,11 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2019 NVIDIA Corporation.
+ */
+
+#ifndef HOST1X_HW_HOST1X07_CHANNEL_H
+#define HOST1X_HW_HOST1X07_CHANNEL_H
+
+#define HOST1X_CHANNEL_SMMU_STREAMID 0x084
+
+#endif
diff --git a/drivers/gpu/ipu-v3/ipu-pre.c b/drivers/gpu/ipu-v3/ipu-pre.c
index 4a28f3fbb0a2..6cacfd61d984 100644
--- a/drivers/gpu/ipu-v3/ipu-pre.c
+++ b/drivers/gpu/ipu-v3/ipu-pre.c
@@ -265,6 +265,12 @@ void ipu_pre_update(struct ipu_pre *pre, unsigned int bufaddr)
 	writel(IPU_PRE_CTRL_SDW_UPDATE, pre->regs + IPU_PRE_CTRL_SET);
 }
 
+bool ipu_pre_update_pending(struct ipu_pre *pre)
+{
+	return !!(readl_relaxed(pre->regs + IPU_PRE_CTRL) &
+		  IPU_PRE_CTRL_SDW_UPDATE);
+}
+
 u32 ipu_pre_get_baddr(struct ipu_pre *pre)
 {
 	return (u32)pre->buffer_paddr;
diff --git a/drivers/gpu/ipu-v3/ipu-prg.c b/drivers/gpu/ipu-v3/ipu-prg.c
index 38a3a9764e49..94b76badf677 100644
--- a/drivers/gpu/ipu-v3/ipu-prg.c
+++ b/drivers/gpu/ipu-v3/ipu-prg.c
@@ -347,6 +347,22 @@ int ipu_prg_channel_configure(struct ipuv3_channel *ipu_chan,
 }
 EXPORT_SYMBOL_GPL(ipu_prg_channel_configure);
 
+bool ipu_prg_channel_configure_pending(struct ipuv3_channel *ipu_chan)
+{
+	int prg_chan = ipu_prg_ipu_to_prg_chan(ipu_chan->num);
+	struct ipu_prg *prg = ipu_chan->ipu->prg_priv;
+	struct ipu_prg_channel *chan;
+
+	if (prg_chan < 0)
+		return false;
+
+	chan = &prg->chan[prg_chan];
+	WARN_ON(!chan->enabled);
+
+	return ipu_pre_update_pending(prg->pres[chan->used_pre]);
+}
+EXPORT_SYMBOL_GPL(ipu_prg_channel_configure_pending);
+
 static int ipu_prg_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
diff --git a/drivers/gpu/ipu-v3/ipu-prv.h b/drivers/gpu/ipu-v3/ipu-prv.h
index d6beee99b6b8..38622e835e95 100644
--- a/drivers/gpu/ipu-v3/ipu-prv.h
+++ b/drivers/gpu/ipu-v3/ipu-prv.h
@@ -272,6 +272,7 @@ void ipu_pre_configure(struct ipu_pre *pre, unsigned int width,
 		       unsigned int height, unsigned int stride, u32 format,
 		       uint64_t modifier, unsigned int bufaddr);
 void ipu_pre_update(struct ipu_pre *pre, unsigned int bufaddr);
+bool ipu_pre_update_pending(struct ipu_pre *pre);
 
 struct ipu_prg *ipu_prg_lookup_by_phandle(struct device *dev, const char *name,
 					  int ipu_id);
diff --git a/drivers/phy/allwinner/Kconfig b/drivers/phy/allwinner/Kconfig
index cdc1e745ba47..fb1204bcc454 100644
--- a/drivers/phy/allwinner/Kconfig
+++ b/drivers/phy/allwinner/Kconfig
@@ -17,6 +17,18 @@ config PHY_SUN4I_USB
 	  This driver controls the entire USB PHY block, both the USB OTG
 	  parts, as well as the 2 regular USB 2 host PHYs.
 
+config PHY_SUN6I_MIPI_DPHY
+	tristate "Allwinner A31 MIPI D-PHY Support"
+	depends on ARCH_SUNXI && HAS_IOMEM && OF
+	depends on RESET_CONTROLLER
+	select GENERIC_PHY
+	select GENERIC_PHY_MIPI_DPHY
+	select REGMAP_MMIO
+	help
+	  Choose this option if you have an Allwinner SoC with
+	  MIPI-DSI support. If M is selected, the module will be
+	  called sun6i_mipi_dphy.
+
 config PHY_SUN9I_USB
 	tristate "Allwinner sun9i SoC USB PHY driver"
 	depends on ARCH_SUNXI && HAS_IOMEM && OF
diff --git a/drivers/phy/allwinner/Makefile b/drivers/phy/allwinner/Makefile
index 8605529c01a1..7d0053efbfaa 100644
--- a/drivers/phy/allwinner/Makefile
+++ b/drivers/phy/allwinner/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_PHY_SUN4I_USB)		+= phy-sun4i-usb.o
+obj-$(CONFIG_PHY_SUN6I_MIPI_DPHY)	+= phy-sun6i-mipi-dphy.o
 obj-$(CONFIG_PHY_SUN9I_USB)		+= phy-sun9i-usb.o
diff --git a/drivers/gpu/drm/sun4i/sun6i_mipi_dphy.c b/drivers/phy/allwinner/phy-sun6i-mipi-dphy.c
index e4d19431fa0e..79c8af5c7c1d 100644
--- a/drivers/gpu/drm/sun4i/sun6i_mipi_dphy.c
+++ b/drivers/phy/allwinner/phy-sun6i-mipi-dphy.c
@@ -8,11 +8,14 @@
 
 #include <linux/bitops.h>
 #include <linux/clk.h>
+#include <linux/module.h>
 #include <linux/of_address.h>
+#include <linux/platform_device.h>
 #include <linux/regmap.h>
 #include <linux/reset.h>
 
-#include "sun6i_mipi_dsi.h"
+#include <linux/phy/phy.h>
+#include <linux/phy/phy-mipi-dphy.h>
 
 #define SUN6I_DPHY_GCTL_REG		0x00
 #define SUN6I_DPHY_GCTL_LANE_NUM(n)		((((n) - 1) & 3) << 4)
@@ -81,12 +84,46 @@
 
 #define SUN6I_DPHY_DBG5_REG		0xf4
 
-int sun6i_dphy_init(struct sun6i_dphy *dphy, unsigned int lanes)
+struct sun6i_dphy {
+	struct clk				*bus_clk;
+	struct clk				*mod_clk;
+	struct regmap				*regs;
+	struct reset_control			*reset;
+
+	struct phy				*phy;
+	struct phy_configure_opts_mipi_dphy	config;
+};
+
+static int sun6i_dphy_init(struct phy *phy)
 {
+	struct sun6i_dphy *dphy = phy_get_drvdata(phy);
+
 	reset_control_deassert(dphy->reset);
 	clk_prepare_enable(dphy->mod_clk);
 	clk_set_rate_exclusive(dphy->mod_clk, 150000000);
 
+	return 0;
+}
+
+static int sun6i_dphy_configure(struct phy *phy, union phy_configure_opts *opts)
+{
+	struct sun6i_dphy *dphy = phy_get_drvdata(phy);
+	int ret;
+
+	ret = phy_mipi_dphy_config_validate(&opts->mipi_dphy);
+	if (ret)
+		return ret;
+
+	memcpy(&dphy->config, opts, sizeof(dphy->config));
+
+	return 0;
+}
+
+static int sun6i_dphy_power_on(struct phy *phy)
+{
+	struct sun6i_dphy *dphy = phy_get_drvdata(phy);
+	u8 lanes_mask = GENMASK(dphy->config.lanes - 1, 0);
+
 	regmap_write(dphy->regs, SUN6I_DPHY_TX_CTL_REG,
 		     SUN6I_DPHY_TX_CTL_HS_TX_CLK_CONT);
 
@@ -111,16 +148,9 @@ int sun6i_dphy_init(struct sun6i_dphy *dphy, unsigned int lanes)
 		     SUN6I_DPHY_TX_TIME4_HS_TX_ANA1(3));
 
 	regmap_write(dphy->regs, SUN6I_DPHY_GCTL_REG,
-		     SUN6I_DPHY_GCTL_LANE_NUM(lanes) |
+		     SUN6I_DPHY_GCTL_LANE_NUM(dphy->config.lanes) |
 		     SUN6I_DPHY_GCTL_EN);
 
-	return 0;
-}
-
-int sun6i_dphy_power_on(struct sun6i_dphy *dphy, unsigned int lanes)
-{
-	u8 lanes_mask = GENMASK(lanes - 1, 0);
-
 	regmap_write(dphy->regs, SUN6I_DPHY_ANA0_REG,
 		     SUN6I_DPHY_ANA0_REG_PWS |
 		     SUN6I_DPHY_ANA0_REG_DMPC |
@@ -181,16 +211,20 @@ int sun6i_dphy_power_on(struct sun6i_dphy *dphy, unsigned int lanes)
 	return 0;
 }
 
-int sun6i_dphy_power_off(struct sun6i_dphy *dphy)
+static int sun6i_dphy_power_off(struct phy *phy)
 {
+	struct sun6i_dphy *dphy = phy_get_drvdata(phy);
+
 	regmap_update_bits(dphy->regs, SUN6I_DPHY_ANA1_REG,
 			   SUN6I_DPHY_ANA1_REG_VTTMODE, 0);
 
 	return 0;
 }
 
-int sun6i_dphy_exit(struct sun6i_dphy *dphy)
+static int sun6i_dphy_exit(struct phy *phy)
 {
+	struct sun6i_dphy *dphy = phy_get_drvdata(phy);
+
 	clk_rate_exclusive_put(dphy->mod_clk);
 	clk_disable_unprepare(dphy->mod_clk);
 	reset_control_assert(dphy->reset);
@@ -198,6 +232,15 @@ int sun6i_dphy_exit(struct sun6i_dphy *dphy)
 	return 0;
 }
 
+static struct phy_ops sun6i_dphy_ops = {
+	.configure	= sun6i_dphy_configure,
+	.power_on	= sun6i_dphy_power_on,
+	.power_off	= sun6i_dphy_power_off,
+	.init		= sun6i_dphy_init,
+	.exit		= sun6i_dphy_exit,
+};
+
 static struct regmap_config sun6i_dphy_regmap_config = {
 	.reg_bits	= 32,
 	.val_bits	= 32,
@@ -206,87 +249,70 @@ static struct regmap_config sun6i_dphy_regmap_config = {
 	.name		= "mipi-dphy",
 };
 
-static const struct of_device_id sun6i_dphy_of_table[] = {
-	{ .compatible = "allwinner,sun6i-a31-mipi-dphy" },
-	{ }
-};
-
-int sun6i_dphy_probe(struct sun6i_dsi *dsi, struct device_node *node)
+static int sun6i_dphy_probe(struct platform_device *pdev)
 {
+	struct phy_provider *phy_provider;
 	struct sun6i_dphy *dphy;
-	struct resource res;
+	struct resource *res;
 	void __iomem *regs;
-	int ret;
-
-	if (!of_match_node(sun6i_dphy_of_table, node)) {
-		dev_err(dsi->dev, "Incompatible D-PHY\n");
-		return -EINVAL;
-	}
 
-	dphy = devm_kzalloc(dsi->dev, sizeof(*dphy), GFP_KERNEL);
+	dphy = devm_kzalloc(&pdev->dev, sizeof(*dphy), GFP_KERNEL);
 	if (!dphy)
 		return -ENOMEM;
 
-	ret = of_address_to_resource(node, 0, &res);
-	if (ret) {
-		dev_err(dsi->dev, "phy: Couldn't get our resources\n");
-		return ret;
-	}
-
-	regs = devm_ioremap_resource(dsi->dev, &res);
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	regs = devm_ioremap_resource(&pdev->dev, res);
 	if (IS_ERR(regs)) {
-		dev_err(dsi->dev, "Couldn't map the DPHY encoder registers\n");
+		dev_err(&pdev->dev, "Couldn't map the DPHY encoder registers\n");
 		return PTR_ERR(regs);
 	}
 
-	dphy->regs = devm_regmap_init_mmio(dsi->dev, regs,
-					   &sun6i_dphy_regmap_config);
+	dphy->regs = devm_regmap_init_mmio_clk(&pdev->dev, "bus",
+					       regs, &sun6i_dphy_regmap_config);
 	if (IS_ERR(dphy->regs)) {
-		dev_err(dsi->dev, "Couldn't create the DPHY encoder regmap\n");
+		dev_err(&pdev->dev, "Couldn't create the DPHY encoder regmap\n");
 		return PTR_ERR(dphy->regs);
 	}
 
-	dphy->reset = of_reset_control_get_shared(node, NULL);
+	dphy->reset = devm_reset_control_get_shared(&pdev->dev, NULL);
 	if (IS_ERR(dphy->reset)) {
-		dev_err(dsi->dev, "Couldn't get our reset line\n");
+		dev_err(&pdev->dev, "Couldn't get our reset line\n");
 		return PTR_ERR(dphy->reset);
 	}
 
-	dphy->bus_clk = of_clk_get_by_name(node, "bus");
-	if (IS_ERR(dphy->bus_clk)) {
-		dev_err(dsi->dev, "Couldn't get the DPHY bus clock\n");
-		ret = PTR_ERR(dphy->bus_clk);
-		goto err_free_reset;
-	}
-	regmap_mmio_attach_clk(dphy->regs, dphy->bus_clk);
-
-	dphy->mod_clk = of_clk_get_by_name(node, "mod");
+	dphy->mod_clk = devm_clk_get(&pdev->dev, "mod");
 	if (IS_ERR(dphy->mod_clk)) {
-		dev_err(dsi->dev, "Couldn't get the DPHY mod clock\n");
-		ret = PTR_ERR(dphy->mod_clk);
-		goto err_free_bus;
+		dev_err(&pdev->dev, "Couldn't get the DPHY mod clock\n");
+		return PTR_ERR(dphy->mod_clk);
 	}
 
-	dsi->dphy = dphy;
+	dphy->phy = devm_phy_create(&pdev->dev, NULL, &sun6i_dphy_ops);
+	if (IS_ERR(dphy->phy)) {
+		dev_err(&pdev->dev, "failed to create PHY\n");
+		return PTR_ERR(dphy->phy);
+	}
 
-	return 0;
+	phy_set_drvdata(dphy->phy, dphy);
+	phy_provider = devm_of_phy_provider_register(&pdev->dev, of_phy_simple_xlate);
 
-err_free_bus:
-	regmap_mmio_detach_clk(dphy->regs);
-	clk_put(dphy->bus_clk);
-err_free_reset:
-	reset_control_put(dphy->reset);
-	return ret;
+	return PTR_ERR_OR_ZERO(phy_provider);
 }
 
-int sun6i_dphy_remove(struct sun6i_dsi *dsi)
-{
-	struct sun6i_dphy *dphy = dsi->dphy;
-
-	regmap_mmio_detach_clk(dphy->regs);
-	clk_put(dphy->mod_clk);
-	clk_put(dphy->bus_clk);
-	reset_control_put(dphy->reset);
+static const struct of_device_id sun6i_dphy_of_table[] = {
+	{ .compatible = "allwinner,sun6i-a31-mipi-dphy" },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, sun6i_dphy_of_table);
+
+static struct platform_driver sun6i_dphy_platform_driver = {
+	.probe		= sun6i_dphy_probe,
+	.driver		= {
+		.name		= "sun6i-mipi-dphy",
+		.of_match_table	= sun6i_dphy_of_table,
+	},
+};
+module_platform_driver(sun6i_dphy_platform_driver);
 
-	return 0;
-}
+MODULE_AUTHOR("Maxime Ripard <maxime.ripard@bootlin>");
+MODULE_DESCRIPTION("Allwinner A31 MIPI D-PHY Driver");
+MODULE_LICENSE("GPL");
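Editor's note: with the D-PHY moved under drivers/phy and registered as a generic PHY provider, the DSI controller now consumes it through the phy_* API rather than the old sun6i_dphy_* entry points. A hedged sketch of the consumer side (names hypothetical, error unwinding trimmed; phy_mipi_dphy_get_default_config() fills the timing parameters that the configure hook validates):

#include <linux/phy/phy.h>
#include <linux/phy/phy-mipi-dphy.h>

/* Hypothetical DSI enable path driving the new PHY provider. */
static int dsi_phy_enable(struct phy *dphy, unsigned long pixel_clock,
			  unsigned int lanes)
{
	union phy_configure_opts opts = { };
	int ret;

	ret = phy_init(dphy);			/* sun6i_dphy_init() */
	if (ret)
		return ret;

	ret = phy_mipi_dphy_get_default_config(pixel_clock, 24, lanes,
					       &opts.mipi_dphy);
	if (ret)
		return ret;

	ret = phy_configure(dphy, &opts);	/* sun6i_dphy_configure() */
	if (ret)
		return ret;

	return phy_power_on(dphy);		/* sun6i_dphy_power_on() */
}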
diff --git a/drivers/staging/vboxvideo/TODO b/drivers/staging/vboxvideo/TODO
index 2e0f99c3f10c..7f97c47a4042 100644
--- a/drivers/staging/vboxvideo/TODO
+++ b/drivers/staging/vboxvideo/TODO
@@ -1,5 +1,8 @@
 TODO:
 -Get a full review from the drm-maintainers on dri-devel done on this driver
+-Drop all the logic around initial_mode_queried, the master_set and
+ master_drop callbacks and everything related to this. KMS clients can handle
+ hotplugs.
 -Extend this TODO with the results of that review
 
 Please send any patches to Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
diff --git a/drivers/staging/vboxvideo/vbox_drv.c b/drivers/staging/vboxvideo/vbox_drv.c
index cc6532d8c2fa..e7755a179850 100644
--- a/drivers/staging/vboxvideo/vbox_drv.c
+++ b/drivers/staging/vboxvideo/vbox_drv.c
@@ -7,11 +7,15 @@
  *          Michael Thayer <michael.thayer@oracle.com,
  *          Hans de Goede <hdegoede@redhat.com>
  */
-#include <linux/module.h>
 #include <linux/console.h>
+#include <linux/module.h>
+#include <linux/pci.h>
 #include <linux/vt_kern.h>
 
 #include <drm/drm_crtc_helper.h>
+#include <drm/drm_drv.h>
+#include <drm/drm_file.h>
+#include <drm/drm_ioctl.h>
 
 #include "vbox_drv.h"
 
@@ -221,9 +225,7 @@ static void vbox_master_drop(struct drm_device *dev, struct drm_file *file_priv)
 
 static struct drm_driver driver = {
 	.driver_features =
-	    DRIVER_MODESET | DRIVER_GEM | DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED |
-	    DRIVER_PRIME | DRIVER_ATOMIC,
-	.dev_priv_size = 0,
+	    DRIVER_MODESET | DRIVER_GEM | DRIVER_PRIME | DRIVER_ATOMIC,
 
 	.lastclose = drm_fb_helper_lastclose,
 	.master_set = vbox_master_set,
diff --git a/drivers/staging/vboxvideo/vbox_fb.c b/drivers/staging/vboxvideo/vbox_fb.c
index 6b7aa23dfc0a..83a04afd1766 100644
--- a/drivers/staging/vboxvideo/vbox_fb.c
+++ b/drivers/staging/vboxvideo/vbox_fb.c
@@ -6,20 +6,22 @@
  * Authors: Dave Airlie <airlied@redhat.com>
  *          Michael Thayer <michael.thayer@oracle.com,
  */
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/errno.h>
-#include <linux/string.h>
-#include <linux/mm.h>
-#include <linux/tty.h>
-#include <linux/sysrq.h>
 #include <linux/delay.h>
+#include <linux/errno.h>
 #include <linux/fb.h>
 #include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/string.h>
+#include <linux/sysrq.h>
+#include <linux/tty.h>
 
 #include <drm/drm_crtc.h>
-#include <drm/drm_fb_helper.h>
 #include <drm/drm_crtc_helper.h>
+#include <drm/drm_fb_helper.h>
+#include <drm/drm_fourcc.h>
 
 #include "vbox_drv.h"
 #include "vboxvideo.h"
@@ -95,11 +97,6 @@ int vboxfb_create(struct drm_fb_helper *helper,
 
 	strcpy(info->fix.id, "vboxdrmfb");
 
-	/*
-	 * The last flag forces a mode set on VT switches even if the kernel
-	 * does not think it is needed.
-	 */
-	info->flags = FBINFO_DEFAULT | FBINFO_MISC_ALWAYS_SETPAR;
 	info->fbops = &vboxfb_ops;
 
 	/*
diff --git a/drivers/staging/vboxvideo/vbox_irq.c b/drivers/staging/vboxvideo/vbox_irq.c
index f3d9895c79d8..195484713365 100644
--- a/drivers/staging/vboxvideo/vbox_irq.c
+++ b/drivers/staging/vboxvideo/vbox_irq.c
@@ -9,7 +9,9 @@
  *          Hans de Goede <hdegoede@redhat.com>
  */
 
-#include <drm/drm_crtc_helper.h>
+#include <linux/pci.h>
+#include <drm/drm_irq.h>
+#include <drm/drm_probe_helper.h>
 
 #include "vbox_drv.h"
 #include "vboxvideo.h"
diff --git a/drivers/staging/vboxvideo/vbox_mode.c b/drivers/staging/vboxvideo/vbox_mode.c
index c43bec4628ae..213551394495 100644
--- a/drivers/staging/vboxvideo/vbox_mode.c
+++ b/drivers/staging/vboxvideo/vbox_mode.c
@@ -10,14 +10,17 @@
  *          Hans de Goede <hdegoede@redhat.com>
  */
 #include <linux/export.h>
+
 #include <drm/drm_atomic.h>
-#include <drm/drm_crtc_helper.h>
-#include <drm/drm_plane_helper.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_fourcc.h>
+#include <drm/drm_plane_helper.h>
+#include <drm/drm_probe_helper.h>
+#include <drm/drm_vblank.h>
 
+#include "hgsmi_channels.h"
 #include "vbox_drv.h"
 #include "vboxvideo.h"
-#include "hgsmi_channels.h"
 
 /*
  * Set a graphics mode.  Poke any required values into registers, do an HGSMI
diff --git a/include/drm/bridge/dw_hdmi.h b/include/drm/bridge/dw_hdmi.h
index 9c56412bb2cf..66e70770cce5 100644
--- a/include/drm/bridge/dw_hdmi.h
+++ b/include/drm/bridge/dw_hdmi.h
@@ -10,9 +10,11 @@
 #ifndef __DW_HDMI__
 #define __DW_HDMI__
 
-#include <drm/drmP.h>
-
+struct drm_connector;
+struct drm_display_mode;
+struct drm_encoder;
 struct dw_hdmi;
+struct platform_device;
 
 /**
  * DOC: Supported input formats and encodings
@@ -157,6 +159,7 @@ void dw_hdmi_setup_rx_sense(struct dw_hdmi *hdmi, bool hpd, bool rx_sense);
 void dw_hdmi_set_sample_rate(struct dw_hdmi *hdmi, unsigned int rate);
 void dw_hdmi_audio_enable(struct dw_hdmi *hdmi);
 void dw_hdmi_audio_disable(struct dw_hdmi *hdmi);
+void dw_hdmi_set_high_tmds_clock_ratio(struct dw_hdmi *hdmi);
 
 /* PHY configuration */
 void dw_hdmi_phy_i2c_set_addr(struct dw_hdmi *hdmi, u8 address);
diff --git a/include/drm/bridge/dw_mipi_dsi.h b/include/drm/bridge/dw_mipi_dsi.h
index 48a671e782ca..7d3dd69a5caa 100644
--- a/include/drm/bridge/dw_mipi_dsi.h
+++ b/include/drm/bridge/dw_mipi_dsi.h
@@ -14,7 +14,8 @@ struct dw_mipi_dsi;
 
 struct dw_mipi_dsi_phy_ops {
 	int (*init)(void *priv_data);
-	int (*get_lane_mbps)(void *priv_data, struct drm_display_mode *mode,
+	int (*get_lane_mbps)(void *priv_data,
+			     const struct drm_display_mode *mode,
 			     unsigned long mode_flags, u32 lanes, u32 format,
 			     unsigned int *lane_mbps);
 };
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index a3184416ddc5..94aae87b1138 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -93,25 +93,11 @@ struct dma_buf_attachment;
 struct pci_dev;
 struct pci_controller;
 
-#define DRM_IF_VERSION(maj, min) (maj << 16 | min)
-
-#define DRM_SWITCH_POWER_ON 0
-#define DRM_SWITCH_POWER_OFF 1
-#define DRM_SWITCH_POWER_CHANGING 2
-#define DRM_SWITCH_POWER_DYNAMIC_OFF 3
-
-/* returns true if currently okay to sleep */
-static inline bool drm_can_sleep(void)
-{
-	if (in_atomic() || in_dbg_master() || irqs_disabled())
-		return false;
-	return true;
-}
-
-#if defined(CONFIG_DRM_DEBUG_SELFTEST_MODULE)
-#define EXPORT_SYMBOL_FOR_TESTS_ONLY(x) EXPORT_SYMBOL(x)
-#else
-#define EXPORT_SYMBOL_FOR_TESTS_ONLY(x)
-#endif
+/*
+ * NOTE: drmP.h is obsolete - do NOT add anything to this file
+ *
+ * Do not include drmP.h in new files.
+ * Work is ongoing to remove drmP.h includes from existing files
+ */
 
 #endif
diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
index f9b35834c45d..824a5ed4e216 100644
--- a/include/drm/drm_atomic.h
+++ b/include/drm/drm_atomic.h
@@ -139,9 +139,9 @@ struct drm_crtc_commit {
 	/**
 	 * @abort_completion:
 	 *
-	 * A flag that's set after drm_atomic_helper_setup_commit takes a second
-	 * reference for the completion of $drm_crtc_state.event. It's used by
-	 * the free code to remove the second reference if commit fails.
+	 * A flag that's set after drm_atomic_helper_setup_commit() takes a
+	 * second reference for the completion of $drm_crtc_state.event. It's
+	 * used by the free code to remove the second reference if commit fails.
 	 */
 	bool abort_completion;
 };
@@ -192,7 +192,7 @@ struct drm_private_state;
  * private objects. The structure itself is used as a vtable to identify the
  * associated private object type. Each private object type that needs to be
  * added to the atomic states is expected to have an implementation of these
- * hooks and pass a pointer to it's drm_private_state_funcs struct to
+ * hooks and pass a pointer to its drm_private_state_funcs struct to
  * drm_atomic_get_private_obj_state().
  */
 struct drm_private_state_funcs {
@@ -228,9 +228,31 @@ struct drm_private_state_funcs {
  * Currently only tracks the state update functions and the opaque driver
  * private state itself, but in the future might also track which
  * &drm_modeset_lock is required to duplicate and update this object's state.
+ *
+ * All private objects must be initialized before the DRM device they are
+ * attached to is registered to the DRM subsystem (call to drm_dev_register())
+ * and should stay around until this DRM device is unregistered (call to
+ * drm_dev_unregister()). In other words, a private object's lifetime is
+ * tied to the lifetime of the DRM device. This implies that:
+ *
+ * 1/ all calls to drm_atomic_private_obj_init() must be done before calling
+ *    drm_dev_register()
+ * 2/ all calls to drm_atomic_private_obj_fini() must be done after calling
+ *    drm_dev_unregister()
  */
 struct drm_private_obj {
 	/**
+	 * @head: List entry used to attach a private object to a &drm_device
+	 * (queued to &drm_mode_config.privobj_list).
+	 */
+	struct list_head head;
+
+	/**
+	 * @lock: Modeset lock to protect the state object.
+	 */
+	struct drm_modeset_lock lock;
+
+	/**
 	 * @state: Current atomic state for this driver private object.
 	 */
 	struct drm_private_state *state;
@@ -245,6 +267,18 @@ struct drm_private_obj {
 };
 
 /**
+ * drm_for_each_privobj() - private object iterator
+ *
+ * @privobj: pointer to the current private object. Updated after each
+ *	     iteration
+ * @dev: the DRM device we want to get private objects from
+ *
+ * Allows one to iterate over all private objects attached to @dev
+ */
+#define drm_for_each_privobj(privobj, dev) \
+	list_for_each_entry(privobj, &(dev)->mode_config.privobj_list, head)
+
+/**
  * struct drm_private_state - base struct for driver private object state
  * @state: backpointer to global drm_atomic_state
  *
@@ -295,6 +329,15 @@ struct drm_atomic_state {
 	bool allow_modeset : 1;
 	bool legacy_cursor_update : 1;
 	bool async_update : 1;
+	/**
+	 * @duplicated:
+	 *
+	 * Indicates whether or not this atomic state was duplicated using
+	 * drm_atomic_helper_duplicate_state(). Drivers and atomic helpers
+	 * should use this to fix up normal inconsistencies in duplicated
+	 * states.
+	 */
+	bool duplicated : 1;
 	struct __drm_planes_state *planes;
 	struct __drm_crtcs_state *crtcs;
 	int num_connector;
@@ -400,7 +443,8 @@ struct drm_connector_state * __must_check
 drm_atomic_get_connector_state(struct drm_atomic_state *state,
 			       struct drm_connector *connector);
 
-void drm_atomic_private_obj_init(struct drm_private_obj *obj,
+void drm_atomic_private_obj_init(struct drm_device *dev,
+				 struct drm_private_obj *obj,
 				 struct drm_private_state *state,
 				 const struct drm_private_state_funcs *funcs);
 void drm_atomic_private_obj_fini(struct drm_private_obj *obj);
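The private-object API is now device-scoped. A minimal sketch of the updated wiring, assuming a hypothetical driver state that embeds struct drm_private_state as its first member:

    struct my_state {
            struct drm_private_state base;
            int bandwidth;
    };

    static struct drm_private_state *
    my_duplicate_state(struct drm_private_obj *obj)
    {
            /* Valid because base is the first member of struct my_state */
            struct my_state *s = kmemdup(obj->state, sizeof(*s), GFP_KERNEL);

            if (!s)
                    return NULL;

            __drm_atomic_helper_private_obj_duplicate_state(obj, &s->base);
            return &s->base;
    }

    static void my_destroy_state(struct drm_private_obj *obj,
                                 struct drm_private_state *state)
    {
            kfree(container_of(state, struct my_state, base));
    }

    static const struct drm_private_state_funcs my_state_funcs = {
            .atomic_duplicate_state = my_duplicate_state,
            .atomic_destroy_state = my_destroy_state,
    };

    static int my_init_privobj(struct drm_device *dev,
                               struct drm_private_obj *obj)
    {
            struct my_state *s = kzalloc(sizeof(*s), GFP_KERNEL);

            if (!s)
                    return -ENOMEM;

            /* Per the new lifetime rules: init before drm_dev_register(),
             * fini only after drm_dev_unregister().
             */
            drm_atomic_private_obj_init(dev, obj, &s->base, &my_state_funcs);
            return 0;
    }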
diff --git a/include/drm/drm_bridge.h b/include/drm/drm_bridge.h
index bd850747ce54..9da8c93f7976 100644
--- a/include/drm/drm_bridge.h
+++ b/include/drm/drm_bridge.h
@@ -196,8 +196,8 @@ struct drm_bridge_funcs {
 	 * the DRM framework will have to be extended with DRM bridge states.
 	 */
 	void (*mode_set)(struct drm_bridge *bridge,
-			 struct drm_display_mode *mode,
-			 struct drm_display_mode *adjusted_mode);
+			 const struct drm_display_mode *mode,
+			 const struct drm_display_mode *adjusted_mode);
 	/**
 	 * @pre_enable:
 	 *
@@ -310,8 +310,8 @@ enum drm_mode_status drm_bridge_mode_valid(struct drm_bridge *bridge,
 void drm_bridge_disable(struct drm_bridge *bridge);
 void drm_bridge_post_disable(struct drm_bridge *bridge);
 void drm_bridge_mode_set(struct drm_bridge *bridge,
-			 struct drm_display_mode *mode,
-			 struct drm_display_mode *adjusted_mode);
+			 const struct drm_display_mode *mode,
+			 const struct drm_display_mode *adjusted_mode);
 void drm_bridge_pre_enable(struct drm_bridge *bridge);
 void drm_bridge_enable(struct drm_bridge *bridge);
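With both modes now const, bridge drivers may only read them from this hook. A short sketch under that constraint, using a hypothetical struct my_bridge:

    struct my_bridge {
            struct drm_bridge bridge;
            unsigned int pixel_clock_khz;
    };

    static void my_bridge_mode_set(struct drm_bridge *bridge,
                                   const struct drm_display_mode *mode,
                                   const struct drm_display_mode *adjusted_mode)
    {
            struct my_bridge *mb = container_of(bridge, struct my_bridge,
                                                bridge);

            /* The modes can be inspected but no longer modified here. */
            mb->pixel_clock_khz = adjusted_mode->clock;
    }

    static const struct drm_bridge_funcs my_bridge_funcs = {
            .mode_set = my_bridge_mode_set,
    };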
 
diff --git a/include/drm/drm_cache.h b/include/drm/drm_cache.h
index bfe1639df02d..97fc498dc767 100644
--- a/include/drm/drm_cache.h
+++ b/include/drm/drm_cache.h
@@ -47,6 +47,24 @@ static inline bool drm_arch_can_wc_memory(void)
 	return false;
 #elif defined(CONFIG_MIPS) && defined(CONFIG_CPU_LOONGSON3)
 	return false;
+#elif defined(CONFIG_ARM) || defined(CONFIG_ARM64)
+	/*
+	 * The DRM driver stack is designed to work with cache coherent devices
+	 * only, but permits an optimization to be enabled in some cases, where
+	 * for some buffers, both the CPU and the GPU use uncached mappings,
+	 * removing the need for DMA snooping and allocation in the CPU caches.
+	 *
+	 * The use of uncached GPU mappings relies on the correct implementation
+	 * of the PCIe NoSnoop TLP attribute by the platform, otherwise the GPU
+	 * will use cached mappings nonetheless. On x86 platforms, this does not
+	 * seem to matter, as uncached CPU mappings will snoop the caches in any
+	 * case. However, on ARM and arm64, enabling this optimization on a
+	 * platform where NoSnoop is ignored results in loss of coherency, which
+	 * breaks correct operation of the device. Since we have no way of
+	 * detecting whether NoSnoop works or not, just disable this
+	 * optimization entirely for ARM and arm64.
+	 */
+	return false;
 #else
 	return true;
 #endif
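A driver consuming this check might gate its write-combine fast path as sketched below; MY_ALLOC_WC is a made-up driver-local flag used purely for illustration.

    #define MY_ALLOC_WC BIT(0)	/* hypothetical driver-local flag */

    static u32 my_bo_alloc_flags(void)
    {
            /* After this change this always evaluates false on ARM/arm64 */
            return drm_arch_can_wc_memory() ? MY_ALLOC_WC : 0;
    }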
diff --git a/include/drm/drm_client.h b/include/drm/drm_client.h
index 971bb7853776..8b552b1a6ce9 100644
--- a/include/drm/drm_client.h
+++ b/include/drm/drm_client.h
@@ -26,7 +26,7 @@ struct drm_client_funcs {
 	 * @unregister:
 	 *
 	 * Called when &drm_device is unregistered. The client should respond by
-	 * releasing it's resources using drm_client_release().
+	 * releasing its resources using drm_client_release().
 	 *
 	 * This callback is optional.
 	 */
diff --git a/include/drm/drm_color_mgmt.h b/include/drm/drm_color_mgmt.h
index 90ef9996d9a4..d1c662d92ab7 100644
--- a/include/drm/drm_color_mgmt.h
+++ b/include/drm/drm_color_mgmt.h
@@ -69,4 +69,32 @@ int drm_plane_create_color_properties(struct drm_plane *plane,
 				      u32 supported_ranges,
 				      enum drm_color_encoding default_encoding,
 				      enum drm_color_range default_range);
+
+/**
+ * enum drm_color_lut_tests - hw-specific LUT tests to perform
+ *
+ * The drm_color_lut_check() function takes a bitmask of the values here to
+ * determine which tests to apply to a userspace-provided LUT.
+ */
+enum drm_color_lut_tests {
+	/**
+	 * @DRM_COLOR_LUT_EQUAL_CHANNELS:
+	 *
+	 * Checks whether the entries of a LUT all have equal values for the
+	 * red, green, and blue channels.  Intended for hardware that only
+	 * accepts a single value per LUT entry and assumes that value applies
+	 * to all three color components.
+	 */
+	DRM_COLOR_LUT_EQUAL_CHANNELS = BIT(0),
+
+	/**
+	 * @DRM_COLOR_LUT_NON_DECREASING:
+	 *
+	 * Checks whether the entries of a LUT are always flat or increasing
+	 * (never decreasing).
+	 */
+	DRM_COLOR_LUT_NON_DECREASING = BIT(1),
+};
+
+int drm_color_lut_check(const struct drm_property_blob *lut, u32 tests);
 #endif
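A hedged sketch of a CRTC atomic check combining both tests, for hardware that stores one value per LUT entry and needs a monotonic ramp; the hook signature is the standard helper one of this era.

    static int my_crtc_atomic_check(struct drm_crtc *crtc,
                                    struct drm_crtc_state *state)
    {
            if (state->gamma_lut)
                    return drm_color_lut_check(state->gamma_lut,
                                               DRM_COLOR_LUT_EQUAL_CHANNELS |
                                               DRM_COLOR_LUT_NON_DECREASING);

            return 0;
    }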
diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
index 9be2181b3ed7..8fe22abb1e10 100644
--- a/include/drm/drm_connector.h
+++ b/include/drm/drm_connector.h
@@ -366,6 +366,12 @@ struct drm_display_info {
 	bool has_hdmi_infoframe;
 
 	/**
+	 * @rgb_quant_range_selectable: Does the sink support selecting
+	 * the RGB quantization range?
+	 */
+	bool rgb_quant_range_selectable;
+
+	/**
 	 * @edid_hdmi_dc_modes: Mask of supported hdmi deep color modes. Even
 	 * more stuff redundant with @bus_formats.
 	 */
@@ -394,7 +400,7 @@ int drm_display_info_set_bus_formats(struct drm_display_info *info,
 /**
  * struct drm_tv_connector_state - TV connector related states
  * @subconnector: selected subconnector
- * @margins: margins
+ * @margins: margins (all margins are expressed in pixels)
  * @margins.left: left margin
  * @margins.right: right margin
  * @margins.top: top margin
@@ -906,7 +912,7 @@ struct drm_connector {
 	/**
 	 * @ycbcr_420_allowed : This bool indicates if this connector is
 	 * capable of handling YCBCR 420 output. While parsing the EDID
-	 * blocks, its very helpful to know, if the source is capable of
+	 * blocks it's very helpful to know if the source is capable of
 	 * handling YCBCR 420 outputs.
 	 */
 	bool ycbcr_420_allowed;
@@ -1249,9 +1255,11 @@ const char *drm_get_tv_select_name(int val);
 const char *drm_get_content_protection_name(int val);
 
 int drm_mode_create_dvi_i_properties(struct drm_device *dev);
+int drm_mode_create_tv_margin_properties(struct drm_device *dev);
 int drm_mode_create_tv_properties(struct drm_device *dev,
 				  unsigned int num_modes,
 				  const char * const modes[]);
+void drm_connector_attach_tv_margin_properties(struct drm_connector *conn);
 int drm_mode_create_scaling_mode_property(struct drm_device *dev);
 int drm_connector_attach_content_type_property(struct drm_connector *dev);
 int drm_connector_attach_scaling_mode_property(struct drm_connector *connector,
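A minimal sketch of wiring up the new standalone margin properties from a hypothetical connector-init path; creation must succeed before attaching.

    static int my_connector_add_margins(struct drm_device *dev,
                                        struct drm_connector *connector)
    {
            int ret;

            ret = drm_mode_create_tv_margin_properties(dev);
            if (ret)
                    return ret;

            drm_connector_attach_tv_margin_properties(connector);
            return 0;
    }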
diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h
index 39c3900aab3c..85abd3fe9e83 100644
--- a/include/drm/drm_crtc.h
+++ b/include/drm/drm_crtc.h
@@ -1149,9 +1149,6 @@ static inline uint32_t drm_crtc_mask(const struct drm_crtc *crtc)
 	return 1 << drm_crtc_index(crtc);
 }
 
-int drm_crtc_force_disable(struct drm_crtc *crtc);
-int drm_crtc_force_disable_all(struct drm_device *dev);
-
 int drm_mode_set_config_internal(struct drm_mode_set *set);
 struct drm_crtc *drm_crtc_from_index(struct drm_device *dev, int idx);
 
diff --git a/include/drm/drm_crtc_helper.h b/include/drm/drm_crtc_helper.h
index d65f034843ce..a6d520d5b6ca 100644
--- a/include/drm/drm_crtc_helper.h
+++ b/include/drm/drm_crtc_helper.h
@@ -56,21 +56,6 @@ bool drm_helper_encoder_in_use(struct drm_encoder *encoder);
 int drm_helper_connector_dpms(struct drm_connector *connector, int mode);
 
 void drm_helper_resume_force_mode(struct drm_device *dev);
-
-/* drm_probe_helper.c */
-int drm_helper_probe_single_connector_modes(struct drm_connector
-					    *connector, uint32_t maxX,
-					    uint32_t maxY);
-int drm_helper_probe_detect(struct drm_connector *connector,
-			    struct drm_modeset_acquire_ctx *ctx,
-			    bool force);
-void drm_kms_helper_poll_init(struct drm_device *dev);
-void drm_kms_helper_poll_fini(struct drm_device *dev);
-bool drm_helper_hpd_irq_event(struct drm_device *dev);
-void drm_kms_helper_hotplug_event(struct drm_device *dev);
-
-void drm_kms_helper_poll_disable(struct drm_device *dev);
-void drm_kms_helper_poll_enable(struct drm_device *dev);
-bool drm_kms_helper_is_poll_worker(void);
+int drm_helper_force_disable_all(struct drm_device *dev);
 
 #endif
diff --git a/include/drm/drm_damage_helper.h b/include/drm/drm_damage_helper.h
index 4487660b26b8..40c34a5bf149 100644
--- a/include/drm/drm_damage_helper.h
+++ b/include/drm/drm_damage_helper.h
@@ -78,6 +78,9 @@ drm_atomic_helper_damage_iter_init(struct drm_atomic_helper_damage_iter *iter,
 bool
 drm_atomic_helper_damage_iter_next(struct drm_atomic_helper_damage_iter *iter,
 				   struct drm_rect *rect);
+bool drm_atomic_helper_damage_merged(const struct drm_plane_state *old_state,
+				     struct drm_plane_state *state,
+				     struct drm_rect *rect);
 
 /**
  * drm_helper_get_plane_damage_clips - Returns damage clips in &drm_rect.
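drm_atomic_helper_damage_merged() collapses all damage clips into a single bounding rectangle, which suits hardware that can only upload one region per flush. A sketch, where my_hw_flush_rect() stands in for hypothetical device I/O:

    static void my_plane_atomic_update(struct drm_plane *plane,
                                       struct drm_plane_state *old_state)
    {
            struct drm_plane_state *state = plane->state;
            struct drm_rect rect;

            /* Returns true if there is damage to process */
            if (drm_atomic_helper_damage_merged(old_state, state, &rect))
                    my_hw_flush_rect(plane, &rect); /* hypothetical */
    }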
diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
index 42411b3ea0c8..d5e092dccf3e 100644
--- a/include/drm/drm_device.h
+++ b/include/drm/drm_device.h
@@ -24,25 +24,79 @@ struct inode;
 struct pci_dev;
 struct pci_controller;
 
+
 /**
- * DRM device structure. This structure represent a complete card that
+ * enum switch_power_state - power state of drm device
+ */
+
+enum switch_power_state {
+	/** @DRM_SWITCH_POWER_ON: Power state is ON */
+	DRM_SWITCH_POWER_ON = 0,
+
+	/** @DRM_SWITCH_POWER_OFF: Power state is OFF */
+	DRM_SWITCH_POWER_OFF = 1,
+
+	/** @DRM_SWITCH_POWER_CHANGING: Power state is changing */
+	DRM_SWITCH_POWER_CHANGING = 2,
+
+	/** @DRM_SWITCH_POWER_DYNAMIC_OFF: Suspended */
+	DRM_SWITCH_POWER_DYNAMIC_OFF = 3,
+};
+
+/**
+ * struct drm_device - DRM device structure
+ *
+ * This structure represents a complete card that
  * may contain multiple heads.
  */
 struct drm_device {
-	struct list_head legacy_dev_list;/**< list of devices per driver for stealth attach cleanup */
-	int if_version;			/**< Highest interface version set */
-
-	/** \name Lifetime Management */
-	/*@{ */
-	struct kref ref;		/**< Object ref-count */
-	struct device *dev;		/**< Device structure of bus-device */
-	struct drm_driver *driver;	/**< DRM driver managing the device */
-	void *dev_private;		/**< DRM driver private data */
-	struct drm_minor *primary;		/**< Primary node */
-	struct drm_minor *render;		/**< Render node */
+	/**
+	 * @legacy_dev_list:
+	 *
+	 * List of devices per driver for stealth attach cleanup
+	 */
+	struct list_head legacy_dev_list;
+
+	/** @if_version: Highest interface version set */
+	int if_version;
+
+	/** @ref: Object ref-count */
+	struct kref ref;
+
+	/** @dev: Device structure of bus-device */
+	struct device *dev;
+
+	/** @driver: DRM driver managing the device */
+	struct drm_driver *driver;
+
+	/**
+	 * @dev_private:
+	 *
+	 * DRM driver private data. Instead of using this pointer it is
+	 * recommended that drivers use drm_dev_init() and embed struct
+	 * &drm_device in their larger per-device structure.
+	 */
+	void *dev_private;
+
+	/** @primary: Primary node */
+	struct drm_minor *primary;
+
+	/** @render: Render node */
+	struct drm_minor *render;
+
+	/**
+	 * @registered:
+	 *
+	 * Internally used by drm_dev_register() and drm_connector_register().
+	 */
 	bool registered;
 
-	/* currently active master for this device. Protected by master_mutex */
+	/**
+	 * @master:
+	 *
+	 * Currently active master for this device.
+	 * Protected by &master_mutex
+	 */
 	struct drm_master *master;
 
 	/**
@@ -63,76 +117,65 @@ struct drm_device {
 	 */
 	bool unplugged;
 
-	struct inode *anon_inode;		/**< inode for private address-space */
-	char *unique;				/**< unique name of the device */
-	/*@} */
+	/** @anon_inode: inode for private address-space */
+	struct inode *anon_inode;
+
+	/** @unique: Unique name of the device */
+	char *unique;
 
-	/** \name Locks */
-	/*@{ */
-	struct mutex struct_mutex;	/**< For others */
-	struct mutex master_mutex;      /**< For drm_minor::master and drm_file::is_master */
-	/*@} */
+	/**
+	 * @struct_mutex:
+	 *
+	 * Lock for others (not &drm_minor.master and &drm_file.is_master)
+	 */
+	struct mutex struct_mutex;
 
-	/** \name Usage Counters */
-	/*@{ */
-	int open_count;			/**< Outstanding files open, protected by drm_global_mutex. */
-	spinlock_t buf_lock;		/**< For drm_device::buf_use and a few other things. */
-	int buf_use;			/**< Buffers in use -- cannot alloc */
-	atomic_t buf_alloc;		/**< Buffer allocation in progress */
-	/*@} */
+	/**
+	 * @master_mutex:
+	 *
+	 * Lock for &drm_minor.master and &drm_file.is_master
+	 */
+	struct mutex master_mutex;
+
+	/**
+	 * @open_count:
+	 *
+	 * Usage counter for outstanding files open,
+	 * protected by drm_global_mutex
+	 */
+	int open_count;
 
+	/** @filelist_mutex: Protects @filelist. */
 	struct mutex filelist_mutex;
+	/**
+	 * @filelist:
+	 *
+	 * List of userspace clients, linked through &drm_file.lhead.
+	 */
 	struct list_head filelist;
 
 	/**
 	 * @filelist_internal:
 	 *
-	 * List of open DRM files for in-kernel clients. Protected by @filelist_mutex.
+	 * List of open DRM files for in-kernel clients.
+	 * Protected by &filelist_mutex.
 	 */
 	struct list_head filelist_internal;
 
 	/**
 	 * @clientlist_mutex:
 	 *
-	 * Protects @clientlist access.
+	 * Protects &clientlist access.
 	 */
 	struct mutex clientlist_mutex;
 
 	/**
 	 * @clientlist:
 	 *
-	 * List of in-kernel clients. Protected by @clientlist_mutex.
+	 * List of in-kernel clients. Protected by &clientlist_mutex.
 	 */
 	struct list_head clientlist;
 
-	/** \name Memory management */
-	/*@{ */
-	struct list_head maplist;	/**< Linked list of regions */
-	struct drm_open_hash map_hash;	/**< User token hash table for maps */
-
-	/** \name Context handle management */
-	/*@{ */
-	struct list_head ctxlist;	/**< Linked list of context handles */
-	struct mutex ctxlist_mutex;	/**< For ctxlist */
-
-	struct idr ctx_idr;
-
-	struct list_head vmalist;	/**< List of vmas (for debugging) */
-
-	/*@} */
-
-	/** \name DMA support */
-	/*@{ */
-	struct drm_device_dma *dma;		/**< Optional pointer for DMA support */
-	/*@} */
-
-	/** \name Context support */
-	/*@{ */
-
-	__volatile__ long context_flag;	/**< Context swapping flag */
-	int last_context;		/**< Last current context */
-	/*@} */
-
 	/**
 	 * @irq_enabled:
 	 *
@@ -141,6 +184,10 @@ struct drm_device {
 	 * to true manually.
 	 */
 	bool irq_enabled;
+
+	/**
+	 * @irq: Used by the drm_irq_install() and drm_irq_uninstall() helpers.
+	 */
 	int irq;
 
 	/**
@@ -168,7 +215,16 @@ struct drm_device {
 	 */
 	struct drm_vblank_crtc *vblank;
 
-	spinlock_t vblank_time_lock;    /**< Protects vblank count and time updates during vblank enable/disable */
+	/**
+	 * @vblank_time_lock:
+	 *
+	 * Protects vblank count and time updates during vblank enable/disable
+	 */
+	spinlock_t vblank_time_lock;
+	/**
+	 * @vbl_lock: Top-level vblank references lock, wraps the low-level
+	 * @vblank_time_lock.
+	 */
 	spinlock_t vbl_lock;
 
 	/**
@@ -184,45 +240,61 @@ struct drm_device {
 	 * races and imprecision over longer time periods, hence exposing a
 	 * hardware vblank counter is always recommended.
 	 *
-	 * If non-zeor, &drm_crtc_funcs.get_vblank_counter must be set.
+	 * This is the statically configured device-wide maximum. The driver
+	 * can instead choose to use a runtime configurable per-crtc value
+	 * &drm_vblank_crtc.max_vblank_count, in which case @max_vblank_count
+	 * must be left at zero. See drm_crtc_set_max_vblank_count() on how
+	 * to use the per-crtc value.
+	 *
+	 * If non-zero, &drm_crtc_funcs.get_vblank_counter must be set.
 	 */
-	u32 max_vblank_count;           /**< size of vblank counter register */
+	u32 max_vblank_count;
+
+	/** @vblank_event_list: List of vblank events */
+	struct list_head vblank_event_list;
 
 	/**
-	 * List of events
+	 * @event_lock:
+	 *
+	 * Protects @vblank_event_list and event delivery in
+	 * general. See drm_send_event() and drm_send_event_locked().
 	 */
-	struct list_head vblank_event_list;
 	spinlock_t event_lock;
 
-	/*@} */
+	/** @agp: AGP data */
+	struct drm_agp_head *agp;
 
-	struct drm_agp_head *agp;	/**< AGP data */
+	/** @pdev: PCI device structure */
+	struct pci_dev *pdev;
 
-	struct pci_dev *pdev;		/**< PCI device structure */
 #ifdef __alpha__
+	/** @hose: PCI hose, only used on ALPHA platforms. */
 	struct pci_controller *hose;
 #endif
+	/** @num_crtcs: Number of CRTCs on this device */
+	unsigned int num_crtcs;
 
-	struct drm_sg_mem *sg;	/**< Scatter gather memory */
-	unsigned int num_crtcs;                  /**< Number of CRTCs on this device */
+	/** @mode_config: Current mode config */
+	struct drm_mode_config mode_config;
 
-	struct {
-		int context;
-		struct drm_hw_lock *lock;
-	} sigdata;
-
-	struct drm_local_map *agp_buffer_map;
-	unsigned int agp_buffer_token;
-
-	struct drm_mode_config mode_config;	/**< Current mode config */
-
-	/** \name GEM information */
-	/*@{ */
+	/** @object_name_lock: GEM information */
 	struct mutex object_name_lock;
+
+	/** @object_name_idr: GEM information */
 	struct idr object_name_idr;
+
+	/** @vma_offset_manager: GEM information */
 	struct drm_vma_offset_manager *vma_offset_manager;
-	/*@} */
-	int switch_power_state;
+
+	/**
+	 * @switch_power_state:
+	 *
+	 * Power state of the client.
+	 * Used by drivers supporting the switcheroo driver.
+	 * The state is maintained in the
+	 * &vga_switcheroo_client_ops.set_gpu_state callback
+	 */
+	enum switch_power_state switch_power_state;
 
 	/**
 	 * @fb_helper:
@@ -231,6 +303,56 @@ struct drm_device {
 	 * Set by drm_fb_helper_init() and cleared by drm_fb_helper_fini().
 	 */
 	struct drm_fb_helper *fb_helper;
+
+	/* Everything below here is for legacy drivers, never use! */
+	/* private: */
+
+	/* Context handle management - linked list of context handles */
+	struct list_head ctxlist;
+
+	/* Context handle management - mutex for &ctxlist */
+	struct mutex ctxlist_mutex;
+
+	/* Context handle management */
+	struct idr ctx_idr;
+
+	/* Memory management - linked list of regions */
+	struct list_head maplist;
+
+	/* Memory management - user token hash table for maps */
+	struct drm_open_hash map_hash;
+
+	/* Context handle management - list of vmas (for debugging) */
+	struct list_head vmalist;
+
+	/* Optional pointer for DMA support */
+	struct drm_device_dma *dma;
+
+	/* Context swapping flag */
+	__volatile__ long context_flag;
+
+	/* Last current context */
+	int last_context;
+
+	/* Lock for &buf_use and a few other things. */
+	spinlock_t buf_lock;
+
+	/* Usage counter for buffers in use -- cannot alloc */
+	int buf_use;
+
+	/* Buffer allocation in progress */
+	atomic_t buf_alloc;
+
+	struct {
+		int context;
+		struct drm_hw_lock *lock;
+	} sigdata;
+
+	struct drm_local_map *agp_buffer_map;
+	unsigned int agp_buffer_token;
+
+	/* Scatter gather memory */
+	struct drm_sg_mem *sg;
 };
 
 #endif
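The reworked @dev_private comment recommends embedding struct drm_device rather than using the private pointer. A minimal sketch of that pattern with hypothetical names; drm_dev_init() is then called on &my_device.drm during probe and dev_private stays NULL.

    struct my_device {
            struct drm_device drm;
            void __iomem *mmio;
    };

    static inline struct my_device *to_my_device(struct drm_device *dev)
    {
            return container_of(dev, struct my_device, drm);
    }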
diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
index 2d4fc2d33810..97ce790a5b5a 100644
--- a/include/drm/drm_dp_helper.h
+++ b/include/drm/drm_dp_helper.h
@@ -314,6 +314,10 @@
 # define DP_PSR_SETUP_TIME_SHIFT            1
 # define DP_PSR2_SU_Y_COORDINATE_REQUIRED   (1 << 4)  /* eDP 1.4a */
 # define DP_PSR2_SU_GRANULARITY_REQUIRED    (1 << 5)  /* eDP 1.4b */
+
+#define DP_PSR2_SU_X_GRANULARITY	    0x072 /* eDP 1.4b */
+#define DP_PSR2_SU_Y_GRANULARITY	    0x074 /* eDP 1.4b */
+
 /*
  * 0x80-0x8f describe downstream port capabilities, but there are two layouts
  * based on whether DP_DETAILED_CAP_INFO_AVAILABLE was set.  If it was not,
@@ -556,6 +560,8 @@
 # define DP_TEST_LINK_EDID_READ		    (1 << 2)
 # define DP_TEST_LINK_PHY_TEST_PATTERN	    (1 << 3) /* DPCD >= 1.1 */
 # define DP_TEST_LINK_FAUX_PATTERN	    (1 << 4) /* DPCD >= 1.2 */
+# define DP_TEST_LINK_AUDIO_PATTERN         (1 << 5) /* DPCD >= 1.2 */
+# define DP_TEST_LINK_AUDIO_DISABLED_VIDEO  (1 << 6) /* DPCD >= 1.2 */
 
 #define DP_TEST_LINK_RATE		    0x219
 # define DP_LINK_RATE_162		    (0x6)
@@ -604,6 +610,7 @@
 # define DP_COLOR_FORMAT_RGB                (0 << 1)
 # define DP_COLOR_FORMAT_YCbCr422           (1 << 1)
 # define DP_COLOR_FORMAT_YCbCr444           (2 << 1)
+# define DP_TEST_DYNAMIC_RANGE_VESA         (0 << 3)
 # define DP_TEST_DYNAMIC_RANGE_CEA          (1 << 3)
 # define DP_TEST_YCBCR_COEFFICIENTS         (1 << 4)
 # define DP_YCBCR_COEFFICIENTS_ITU601       (0 << 4)
@@ -653,6 +660,16 @@
 
 #define DP_TEST_SINK			    0x270
 # define DP_TEST_SINK_START		    (1 << 0)
+#define DP_TEST_AUDIO_MODE		    0x271
+#define DP_TEST_AUDIO_PATTERN_TYPE	    0x272
+#define DP_TEST_AUDIO_PERIOD_CH1	    0x273
+#define DP_TEST_AUDIO_PERIOD_CH2	    0x274
+#define DP_TEST_AUDIO_PERIOD_CH3	    0x275
+#define DP_TEST_AUDIO_PERIOD_CH4	    0x276
+#define DP_TEST_AUDIO_PERIOD_CH5	    0x277
+#define DP_TEST_AUDIO_PERIOD_CH6	    0x278
+#define DP_TEST_AUDIO_PERIOD_CH7	    0x279
+#define DP_TEST_AUDIO_PERIOD_CH8	    0x27A
 
 #define DP_FEC_STATUS			    0x280    /* 1.4 */
 # define DP_FEC_DECODE_EN_DETECTED	    (1 << 0)
@@ -972,6 +989,7 @@
 #define DP_PEER_DEVICE_DP_LEGACY_CONV	0x4
 
 /* DP 1.2 MST sideband request names DP 1.2a Table 2-80 */
+#define DP_GET_MSG_TRANSACTION_VERSION	0x00 /* DP 1.3 */
 #define DP_LINK_ADDRESS			0x01
 #define DP_CONNECTION_STATUS_NOTIFY	0x02
 #define DP_ENUM_PATH_RESOURCES		0x10
@@ -988,6 +1006,10 @@
 #define DP_SINK_EVENT_NOTIFY		0x30
 #define DP_QUERY_STREAM_ENC_STATUS	0x38
 
+/* DP 1.2 MST sideband reply types */
+#define DP_SIDEBAND_REPLY_ACK		0x00
+#define DP_SIDEBAND_REPLY_NAK		0x01
+
 /* DP 1.2 MST sideband nak reasons - table 2.84 */
 #define DP_NAK_WRITE_FAILURE		0x01
 #define DP_NAK_INVALID_READ		0x02
@@ -1043,11 +1065,18 @@ int drm_dp_bw_code_to_link_rate(u8 link_bw);
 #define DP_SDP_VSC_EXT_CEA		0x21 /* DP 1.4 */
 /* 0x80+ CEA-861 infoframe types */
 
+/**
+ * struct dp_sdp_header - DP secondary data packet header
+ * @HB0: Secondary Data Packet ID
+ * @HB1: Secondary Data Packet Type
+ * @HB2: Secondary Data Packet Specific header, Byte 0
+ * @HB3: Secondary Data packet Specific header, Byte 1
+ */
 struct dp_sdp_header {
-	u8 HB0; /* Secondary Data Packet ID */
-	u8 HB1; /* Secondary Data Packet Type */
-	u8 HB2; /* Secondary Data Packet Specific header, Byte 0 */
-	u8 HB3; /* Secondary Data packet Specific header, Byte 1 */
+	u8 HB0;
+	u8 HB1;
+	u8 HB2;
+	u8 HB3;
 } __packed;
 
 #define EDP_SDP_HEADER_REVISION_MASK		0x1F
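As a sketch of consuming the new audio test bits: DP_TEST_REQUEST (0x218) is defined earlier in this header and drm_dp_dpcd_readb() is the existing single-byte DPCD read helper; programming the pattern generator is device specific and left as a comment.

    static void my_handle_test_request(struct drm_dp_aux *aux)
    {
            u8 request = 0;

            if (drm_dp_dpcd_readb(aux, DP_TEST_REQUEST, &request) < 0)
                    return;

            if (request & DP_TEST_LINK_AUDIO_PATTERN) {
                    u8 mode = 0;

                    drm_dp_dpcd_readb(aux, DP_TEST_AUDIO_MODE, &mode);
                    /* program the audio pattern generator from 'mode' */
            }
    }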
diff --git a/include/drm/drm_dp_mst_helper.h b/include/drm/drm_dp_mst_helper.h
index 727af08e5ea6..8c97a5f92c47 100644
--- a/include/drm/drm_dp_mst_helper.h
+++ b/include/drm/drm_dp_mst_helper.h
@@ -44,7 +44,6 @@ struct drm_dp_vcpi {
 
 /**
  * struct drm_dp_mst_port - MST port
- * @kref: reference count for this port.
  * @port_num: port number
  * @input: if this port is an input port.
  * @mcs: message capability status - DP 1.2 spec.
@@ -67,7 +66,18 @@ struct drm_dp_vcpi {
  * in the MST topology.
  */
 struct drm_dp_mst_port {
-	struct kref kref;
+	/**
+	 * @topology_kref: refcount for this port's lifetime in the topology,
+	 * only the DP MST helpers should need to touch this
+	 */
+	struct kref topology_kref;
+
+	/**
+	 * @malloc_kref: refcount for the memory allocation containing this
+	 * structure. See drm_dp_mst_get_port_malloc() and
+	 * drm_dp_mst_put_port_malloc().
+	 */
+	struct kref malloc_kref;
 
 	u8 port_num;
 	bool input;
@@ -102,7 +112,6 @@ struct drm_dp_mst_port {
 
 /**
  * struct drm_dp_mst_branch - MST branch device.
- * @kref: reference count for this port.
  * @rad: Relative Address to talk to this branch device.
  * @lct: Link count total to talk to this branch device.
  * @num_ports: number of ports on the branch.
@@ -121,7 +130,19 @@ struct drm_dp_mst_port {
  * to downstream port of parent branches.
  */
 struct drm_dp_mst_branch {
-	struct kref kref;
+	/**
+	 * @topology_kref: refcount for this branch device's lifetime in the
+	 * topology, only the DP MST helpers should need to touch this
+	 */
+	struct kref topology_kref;
+
+	/**
+	 * @malloc_kref: refcount for the memory allocation containing this
+	 * structure. See drm_dp_mst_get_mstb_malloc() and
+	 * drm_dp_mst_put_mstb_malloc().
+	 */
+	struct kref malloc_kref;
+
 	u8 rad[8];
 	u8 lct;
 	int num_ports;
@@ -387,8 +408,6 @@ struct drm_dp_mst_topology_cbs {
 	void (*register_connector)(struct drm_connector *connector);
 	void (*destroy_connector)(struct drm_dp_mst_topology_mgr *mgr,
 				  struct drm_connector *connector);
-	void (*hotplug)(struct drm_dp_mst_topology_mgr *mgr);
-
 };
 
 #define DP_MAX_PAYLOAD (sizeof(unsigned long) * 8)
@@ -406,9 +425,15 @@ struct drm_dp_payload {
 
 #define to_dp_mst_topology_state(x) container_of(x, struct drm_dp_mst_topology_state, base)
 
+struct drm_dp_vcpi_allocation {
+	struct drm_dp_mst_port *port;
+	int vcpi;
+	struct list_head next;
+};
+
 struct drm_dp_mst_topology_state {
 	struct drm_private_state base;
-	int avail_slots;
+	struct list_head vcpis;
 	struct drm_dp_mst_topology_mgr *mgr;
 };
 
@@ -620,13 +645,115 @@ int __must_check
 drm_dp_mst_topology_mgr_resume(struct drm_dp_mst_topology_mgr *mgr);
 struct drm_dp_mst_topology_state *drm_atomic_get_mst_topology_state(struct drm_atomic_state *state,
 								    struct drm_dp_mst_topology_mgr *mgr);
-int drm_dp_atomic_find_vcpi_slots(struct drm_atomic_state *state,
-				  struct drm_dp_mst_topology_mgr *mgr,
-				  struct drm_dp_mst_port *port, int pbn);
-int drm_dp_atomic_release_vcpi_slots(struct drm_atomic_state *state,
-				     struct drm_dp_mst_topology_mgr *mgr,
-				     int slots);
+int __must_check
+drm_dp_atomic_find_vcpi_slots(struct drm_atomic_state *state,
+			      struct drm_dp_mst_topology_mgr *mgr,
+			      struct drm_dp_mst_port *port, int pbn);
+int __must_check
+drm_dp_atomic_release_vcpi_slots(struct drm_atomic_state *state,
+				 struct drm_dp_mst_topology_mgr *mgr,
+				 struct drm_dp_mst_port *port);
 int drm_dp_send_power_updown_phy(struct drm_dp_mst_topology_mgr *mgr,
 				 struct drm_dp_mst_port *port, bool power_up);
+int __must_check drm_dp_mst_atomic_check(struct drm_atomic_state *state);
+
+void drm_dp_mst_get_port_malloc(struct drm_dp_mst_port *port);
+void drm_dp_mst_put_port_malloc(struct drm_dp_mst_port *port);
+
+extern const struct drm_private_state_funcs drm_dp_mst_topology_state_funcs;
+
+/**
+ * __drm_dp_mst_state_iter_get - private atomic state iterator function for
+ * macro-internal use
+ * @state: &struct drm_atomic_state pointer
+ * @mgr: pointer to the &struct drm_dp_mst_topology_mgr iteration cursor
+ * @old_state: optional pointer to the old &struct drm_dp_mst_topology_state
+ * iteration cursor
+ * @new_state: optional pointer to the new &struct drm_dp_mst_topology_state
+ * iteration cursor
+ * @i: int iteration cursor, for macro-internal use
+ *
+ * Used by for_each_oldnew_mst_mgr_in_state(),
+ * for_each_old_mst_mgr_in_state(), and for_each_new_mst_mgr_in_state(). Don't
+ * call this directly.
+ *
+ * Returns:
+ * True if the current &struct drm_private_obj is a &struct
+ * drm_dp_mst_topology_mgr, false otherwise.
+ */
+static inline bool
+__drm_dp_mst_state_iter_get(struct drm_atomic_state *state,
+			    struct drm_dp_mst_topology_mgr **mgr,
+			    struct drm_dp_mst_topology_state **old_state,
+			    struct drm_dp_mst_topology_state **new_state,
+			    int i)
+{
+	struct __drm_private_objs_state *objs_state = &state->private_objs[i];
+
+	if (objs_state->ptr->funcs != &drm_dp_mst_topology_state_funcs)
+		return false;
+
+	*mgr = to_dp_mst_topology_mgr(objs_state->ptr);
+	if (old_state)
+		*old_state = to_dp_mst_topology_state(objs_state->old_state);
+	if (new_state)
+		*new_state = to_dp_mst_topology_state(objs_state->new_state);
+
+	return true;
+}
+
+/**
+ * for_each_oldnew_mst_mgr_in_state - iterate over all DP MST topology
+ * managers in an atomic update
+ * @__state: &struct drm_atomic_state pointer
+ * @mgr: &struct drm_dp_mst_topology_mgr iteration cursor
+ * @old_state: &struct drm_dp_mst_topology_state iteration cursor for the old
+ * state
+ * @new_state: &struct drm_dp_mst_topology_state iteration cursor for the new
+ * state
+ * @__i: int iteration cursor, for macro-internal use
+ *
+ * This iterates over all DRM DP MST topology managers in an atomic update,
+ * tracking both old and new state. This is useful in places where the state
+ * delta needs to be considered, for example in atomic check functions.
+ */
+#define for_each_oldnew_mst_mgr_in_state(__state, mgr, old_state, new_state, __i) \
+	for ((__i) = 0; (__i) < (__state)->num_private_objs; (__i)++) \
+		for_each_if(__drm_dp_mst_state_iter_get((__state), &(mgr), &(old_state), &(new_state), (__i)))
+
+/**
+ * for_each_old_mst_mgr_in_state - iterate over all DP MST topology managers
+ * in an atomic update
+ * @__state: &struct drm_atomic_state pointer
+ * @mgr: &struct drm_dp_mst_topology_mgr iteration cursor
+ * @old_state: &struct drm_dp_mst_topology_state iteration cursor for the old
+ * state
+ * @__i: int iteration cursor, for macro-internal use
+ *
+ * This iterates over all DRM DP MST topology managers in an atomic update,
+ * tracking only the old state. This is useful in disable functions, where we
+ * need the old state the hardware is still in.
+ */
+#define for_each_old_mst_mgr_in_state(__state, mgr, old_state, __i) \
+	for ((__i) = 0; (__i) < (__state)->num_private_objs; (__i)++) \
+		for_each_if(__drm_dp_mst_state_iter_get((__state), &(mgr), &(old_state), NULL, (__i)))
+
+/**
+ * for_each_new_mst_mgr_in_state - iterate over all DP MST topology managers
+ * in an atomic update
+ * @__state: &struct drm_atomic_state pointer
+ * @mgr: &struct drm_dp_mst_topology_mgr iteration cursor
+ * @new_state: &struct drm_dp_mst_topology_state iteration cursor for the new
+ * state
+ * @__i: int iteration cursor, for macro-internal use
+ *
+ * This iterates over all DRM DP MST topology managers in an atomic update,
+ * tracking only the new state. This is useful in enable functions, where we
+ * need the new state the hardware should be in when the atomic commit
+ * operation has completed.
+ */
+#define for_each_new_mst_mgr_in_state(__state, mgr, new_state, __i) \
+	for ((__i) = 0; (__i) < (__state)->num_private_objs; (__i)++) \
+		for_each_if(__drm_dp_mst_state_iter_get((__state), &(mgr), NULL, &(new_state), (__i)))
 
 #endif
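Tying the new pieces together, a driver-wide atomic check might iterate the topology states and then run the global bandwidth validation. A hedged sketch assuming the standard helper-based check:

    static int my_atomic_check(struct drm_device *dev,
                               struct drm_atomic_state *state)
    {
            struct drm_dp_mst_topology_mgr *mgr;
            struct drm_dp_mst_topology_state *mst_state;
            int i, ret;

            ret = drm_atomic_helper_check(dev, state);
            if (ret)
                    return ret;

            for_each_new_mst_mgr_in_state(state, mgr, mst_state, i) {
                    /* inspect mst_state->vcpis here if needed */
            }

            /* Fails if any topology's VCPI allocations exceed its limits */
            return drm_dp_mst_atomic_check(state);
    }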
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index 35af23f5fa0d..570f9d03b2eb 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -41,21 +41,113 @@ struct drm_display_mode;
 struct drm_mode_create_dumb;
 struct drm_printer;
 
-/* driver capabilities and requirements mask */
-#define DRIVER_USE_AGP			0x1
-#define DRIVER_LEGACY			0x2
-#define DRIVER_PCI_DMA			0x8
-#define DRIVER_SG			0x10
-#define DRIVER_HAVE_DMA			0x20
-#define DRIVER_HAVE_IRQ			0x40
-#define DRIVER_IRQ_SHARED		0x80
-#define DRIVER_GEM			0x1000
-#define DRIVER_MODESET			0x2000
-#define DRIVER_PRIME			0x4000
-#define DRIVER_RENDER			0x8000
-#define DRIVER_ATOMIC			0x10000
-#define DRIVER_KMS_LEGACY_CONTEXT	0x20000
-#define DRIVER_SYNCOBJ                  0x40000
+/**
+ * enum drm_driver_feature - feature flags
+ *
+ * See &drm_driver.driver_features, drm_device.driver_features and
+ * drm_core_check_feature().
+ */
+enum drm_driver_feature {
+	/**
+	 * @DRIVER_GEM:
+	 *
+	 * Driver uses the GEM memory manager. This should be set for all modern
+	 * drivers.
+	 */
+	DRIVER_GEM			= BIT(0),
+	/**
+	 * @DRIVER_MODESET:
+	 *
+	 * Driver supports mode setting interfaces (KMS).
+	 */
+	DRIVER_MODESET			= BIT(1),
+	/**
+	 * @DRIVER_PRIME:
+	 *
+	 * Driver implements DRM PRIME buffer sharing.
+	 */
+	DRIVER_PRIME			= BIT(2),
+	/**
+	 * @DRIVER_RENDER:
+	 *
+	 * Driver supports dedicated render nodes. See also the :ref:`section on
+	 * render nodes <drm_render_node>` for details.
+	 */
+	DRIVER_RENDER			= BIT(3),
+	/**
+	 * @DRIVER_ATOMIC:
+	 *
+	 * Driver supports the full atomic modesetting userspace API. Drivers
+	 * which only use atomic internally, but do not support the full
+	 * userspace API (e.g. not all properties converted to atomic, or
+	 * multi-plane updates are not guaranteed to be tear-free) should not
+	 * set this flag.
+	 */
+	DRIVER_ATOMIC			= BIT(4),
+	/**
+	 * @DRIVER_SYNCOBJ:
+	 *
+	 * Driver supports &drm_syncobj for explicit synchronization of command
+	 * submission.
+	 */
+	DRIVER_SYNCOBJ                  = BIT(5),
+
+	/* IMPORTANT: Below are all the legacy flags, add new ones above. */
+
+	/**
+	 * @DRIVER_USE_AGP:
+	 *
+	 * Set up DRM AGP support, see drm_agp_init(), the DRM core will manage
+	 * AGP resources. New drivers don't need this.
+	 */
+	DRIVER_USE_AGP			= BIT(25),
+	/**
+	 * @DRIVER_LEGACY:
+	 *
+	 * Denote a legacy driver using shadow attach. Do not use.
+	 */
+	DRIVER_LEGACY			= BIT(26),
+	/**
+	 * @DRIVER_PCI_DMA:
+	 *
+	 * Driver is capable of PCI DMA, mapping of PCI DMA buffers to userspace
+	 * will be enabled. Only for legacy drivers. Do not use.
+	 */
+	DRIVER_PCI_DMA			= BIT(27),
+	/**
+	 * @DRIVER_SG:
+	 *
+	 * Driver can perform scatter/gather DMA, allocation and mapping of
+	 * scatter/gather buffers will be enabled. Only for legacy drivers. Do
+	 * not use.
+	 */
+	DRIVER_SG			= BIT(28),
+
+	/**
+	 * @DRIVER_HAVE_DMA:
+	 *
+	 * Driver supports DMA, the userspace DMA API will be supported. Only
+	 * for legacy drivers. Do not use.
+	 */
+	DRIVER_HAVE_DMA			= BIT(29),
+	/**
+	 * @DRIVER_HAVE_IRQ:
+	 *
+	 * Legacy irq support. Only for legacy drivers. Do not use.
+	 *
+	 * New drivers can either use the drm_irq_install() and
+	 * drm_irq_uninstall() helper functions, or roll their own irq support
+	 * code by calling request_irq() directly.
+	 */
+	DRIVER_HAVE_IRQ			= BIT(30),
+	/**
+	 * @DRIVER_KMS_LEGACY_CONTEXT:
+	 *
+	 * Used only by nouveau for backwards compatibility with existing
+	 * userspace.  Do not use.
+	 */
+	DRIVER_KMS_LEGACY_CONTEXT	= BIT(31),
+};
 
 /**
  * struct drm_driver - DRM driver structure
@@ -579,7 +671,12 @@ struct drm_driver {
 	/** @date: driver date */
 	char *date;
 
-	/** @driver_features: driver features */
+	/**
+	 * @driver_features:
+	 * Driver features, see &enum drm_driver_feature. Drivers can disable
+	 * some features on a per-instance basis using
+	 * &drm_device.driver_features.
+	 */
 	u32 driver_features;
 
 	/**
@@ -643,6 +740,10 @@ void drm_dev_unplug(struct drm_device *dev);
  * Unplugging itself is signalled through drm_dev_unplug(). If a device is
  * unplugged, these two functions guarantee that any store before calling
  * drm_dev_unplug() is visible to callers of this function after it completes.
+ *
+ * WARNING: This function fundamentally races against drm_dev_unplug(). It is
+ * recommended that drivers instead use the underlying drm_dev_enter() and
+ * drm_dev_exit() function pairs.
  */
 static inline bool drm_dev_is_unplugged(struct drm_device *dev)
 {
@@ -662,11 +763,11 @@ static inline bool drm_dev_is_unplugged(struct drm_device *dev)
  * @feature: feature flag
  *
  * This checks @dev for driver features, see &drm_driver.driver_features,
- * &drm_device.driver_features, and the various DRIVER_\* flags.
+ * &drm_device.driver_features, and the various &enum drm_driver_feature flags.
  *
  * Returns true if the @feature is supported, false otherwise.
  */
-static inline bool drm_core_check_feature(struct drm_device *dev, u32 feature)
+static inline bool drm_core_check_feature(const struct drm_device *dev, u32 feature)
 {
 	return dev->driver->driver_features & dev->driver_features & feature;
 }
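A short sketch of the enum in use: a modern driver declares only the non-legacy flags and can test them per instance through drm_core_check_feature(). The driver name is hypothetical.

    static struct drm_driver my_driver = {
            .driver_features = DRIVER_GEM | DRIVER_MODESET |
                               DRIVER_PRIME | DRIVER_ATOMIC,
            .name = "mydrm",	/* hypothetical */
    };

    static bool my_supports_atomic(struct drm_device *dev)
    {
            return drm_core_check_feature(dev, DRIVER_ATOMIC);
    }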
diff --git a/include/drm/drm_dsc.h b/include/drm/drm_dsc.h
index d03f1b83421a..9c26f083c70f 100644
--- a/include/drm/drm_dsc.h
+++ b/include/drm/drm_dsc.h
@@ -44,111 +44,231 @@
 #define DSC_1_2_MAX_LINEBUF_DEPTH_VAL		0
 #define DSC_1_1_MAX_LINEBUF_DEPTH_BITS		13
 
-/* Configuration for a single Rate Control model range */
+/**
+ * struct drm_dsc_rc_range_parameters - DSC Rate Control range parameters
+ *
+ * This defines different rate control parameters used by the DSC engine
+ * to compress the frame.
+ */
 struct drm_dsc_rc_range_parameters {
-	/* Min Quantization Parameters allowed for this range */
+	/**
+	 * @range_min_qp: Min Quantization Parameters allowed for this range
+	 */
 	u8 range_min_qp;
-	/* Max Quantization Parameters allowed for this range */
+	/**
+	 * @range_max_qp: Max Quantization Parameters allowed for this range
+	 */
 	u8 range_max_qp;
-	/* Bits/group offset to apply to target for this group */
+	/**
+	 * @range_bpg_offset:
+	 * Bits/group offset to apply to target for this group
+	 */
 	u8 range_bpg_offset;
 };
 
+/**
+ * struct drm_dsc_config - Parameters required to configure DSC
+ *
+ * Driver populates this structure with all the parameters required
+ * to configure the display stream compression on the source.
+ */
 struct drm_dsc_config {
-	/* Bits / component for previous reconstructed line buffer */
+	/**
+	 * @line_buf_depth:
+	 * Bits per component for previous reconstructed line buffer
+	 */
 	u8 line_buf_depth;
-	/* Bits per component to code (must be 8, 10, or 12) */
+	/**
+	 * @bits_per_component: Bits per component to code (8/10/12)
+	 */
 	u8 bits_per_component;
-	/*
-	 * Flag indicating to do RGB - YCoCg conversion
-	 * and back (should be 1 for RGB input)
+	/**
+	 * @convert_rgb:
+	 * Flag to indicate if RGB - YCoCg conversion is needed
+	 * True if RGB input, False if YCoCg input
 	 */
 	bool convert_rgb;
+	/**
+	 * @slice_count: Number of slices per line used by the DSC encoder
+	 */
 	u8 slice_count;
-	/* Slice Width */
+	/**
+	 * @slice_width: Width of each slice in pixels
+	 */
 	u16 slice_width;
-	/* Slice Height */
+	/**
+	 * @slice_height: Slice height in pixels
+	 */
 	u16 slice_height;
-	/*
-	 * 4:2:2 enable mode (from PPS, 4:2:2 conversion happens
-	 * outside of DSC encode/decode algorithm)
+	/**
+	 * @enable422: True for 4_2_2 sampling, false for 4_4_4 sampling
 	 */
 	bool enable422;
-	/* Picture Width */
+	/**
+	 * @pic_width: Width of the input display frame in pixels
+	 */
 	u16 pic_width;
-	/* Picture Height */
+	/**
+	 * @pic_height: Vertical height of the input display frame
+	 */
 	u16 pic_height;
-	/* Offset to bits/group used by RC to determine QP adjustment */
+	/**
+	 * @rc_tgt_offset_high:
+	 * Offset to bits/group used by RC to determine QP adjustment
+	 */
 	u8 rc_tgt_offset_high;
-	/* Offset to bits/group used by RC to determine QP adjustment */
+	/**
+	 * @rc_tgt_offset_low:
+	 * Offset to bits/group used by RC to determine QP adjustment
+	 */
 	u8 rc_tgt_offset_low;
-	/* Bits/pixel target << 4 (ie., 4 fractional bits) */
+	/**
+	 * @bits_per_pixel:
+	 * Target bits per pixel with 4 fractional bits, bits_per_pixel << 4
+	 */
 	u16 bits_per_pixel;
-	/*
-	 * Factor to determine if an edge is present based
-	 * on the bits produced
+	/**
+	 * @rc_edge_factor:
+	 * Factor to determine if an edge is present based on the bits produced
 	 */
 	u8 rc_edge_factor;
-	/* Slow down incrementing once the range reaches this value */
+	/**
+	 * @rc_quant_incr_limit1:
+	 * Slow down incrementing once the range reaches this value
+	 */
 	u8 rc_quant_incr_limit1;
-	/* Slow down incrementing once the range reaches this value */
+	/**
+	 * @rc_quant_incr_limit0:
+	 * Slow down incrementing once the range reaches this value
+	 */
 	u8 rc_quant_incr_limit0;
-	/* Number of pixels to delay the initial transmission */
+	/**
+	 * @initial_xmit_delay:
+	 * Number of pixels to delay the initial transmission
+	 */
 	u16 initial_xmit_delay;
-	/* Number of pixels to delay the VLD on the decoder,not including SSM */
+	/**
+	 * @initial_dec_delay:
+	 * Initial decoder delay, number of pixel times that the decoder
+	 * accumulates data in its rate buffer before starting to decode
+	 * and output pixels.
+	 */
 	u16  initial_dec_delay;
-	/* Block prediction enable */
+	/**
+	 * @block_pred_enable:
+	 * True if block prediction is used to code any groups within the
+	 * picture. False if BP not used
+	 */
 	bool block_pred_enable;
-	/* Bits/group offset to use for first line of the slice */
+	/**
+	 * @first_line_bpg_offset:
+	 * Number of additional bits allocated for each group on the first
+	 * line of slice.
+	 */
 	u8 first_line_bpg_offset;
-	/* Value to use for RC model offset at slice start */
+	/**
+	 * @initial_offset: Value to use for RC model offset at slice start
+	 */
 	u16 initial_offset;
-	/* Thresholds defining each of the buffer ranges */
+	/**
+	 * @rc_buf_thresh: Thresholds defining each of the buffer ranges
+	 */
 	u16 rc_buf_thresh[DSC_NUM_BUF_RANGES - 1];
-	/* Parameters for each of the RC ranges */
+	/**
+	 * @rc_range_params:
+	 * Parameters for each of the RC ranges defined in
+	 * &struct drm_dsc_rc_range_parameters
+	 */
 	struct drm_dsc_rc_range_parameters rc_range_params[DSC_NUM_BUF_RANGES];
-	/* Total size of RC model */
+	/**
+	 * @rc_model_size: Total size of RC model
+	 */
 	u16 rc_model_size;
-	/* Minimum QP where flatness information is sent */
+	/**
+	 * @flatness_min_qp: Minimum QP where flatness information is sent
+	 */
 	u8 flatness_min_qp;
-	/* Maximum QP where flatness information is sent */
+	/**
+	 * @flatness_max_qp: Maximum QP where flatness information is sent
+	 */
 	u8 flatness_max_qp;
-	/* Initial value for scale factor */
+	/**
+	 * @initial_scale_value: Initial value for the scale factor
+	 */
 	u8 initial_scale_value;
-	/* Decrement scale factor every scale_decrement_interval groups */
+	/**
+	 * @scale_decrement_interval:
+	 * Specifies the number of group times between decrementing the scale
+	 * factor at the beginning of a slice.
+	 */
 	u16 scale_decrement_interval;
-	/* Increment scale factor every scale_increment_interval groups */
+	/**
+	 * @scale_increment_interval:
+	 * Number of group times between incrementing the scale factor value
+	 * used at the beginning of a slice.
+	 */
 	u16 scale_increment_interval;
-	/* Non-first line BPG offset to use */
+	/**
+	 * @nfl_bpg_offset: Non first line BPG offset to be used
+	 */
 	u16 nfl_bpg_offset;
-	/* BPG offset used to enforce slice bit */
+	/**
+	 * @slice_bpg_offset: BPG offset used to enforce slice bit
+	 */
 	u16 slice_bpg_offset;
-	/* Final RC linear transformation offset value */
+	/**
+	 * @final_offset: Final RC linear transformation offset value
+	 */
 	u16 final_offset;
-	/* Enable on-off VBR (ie., disable stuffing bits) */
+	/**
+	 * @vbr_enable: True if VBR mode is enabled, false if disabled
+	 */
 	bool vbr_enable;
-	/* Mux word size (in bits) for SSM mode */
+	/**
+	 * @mux_word_size: Mux word size (in bits) for SSM mode
+	 */
 	u8 mux_word_size;
-	/*
-	 * The (max) size in bytes of the "chunks" that are
-	 * used in slice multiplexing
+	/**
+	 * @slice_chunk_size:
+	 * The (max) size in bytes of the "chunks" that are used in slice
+	 * multiplexing.
 	 */
 	u16 slice_chunk_size;
-	/* Rate Control buffer siz in bits */
+	/**
+	 * @rc_bits: Rate control buffer size in bits
+	 */
 	u16 rc_bits;
-	/* DSC Minor Version */
+	/**
+	 * @dsc_version_minor: DSC minor version
+	 */
 	u8 dsc_version_minor;
-	/* DSC Major version */
+	/**
+	 * @dsc_version_major: DSC major version
+	 */
 	u8 dsc_version_major;
-	/* Native 4:2:2 support */
+	/**
+	 * @native_422: True if Native 4:2:2 supported, else false
+	 */
 	bool native_422;
-	/* Native 4:2:0 support */
+	/**
+	 * @native_420: True if Native 4:2:0 supported else false.
+	 */
 	bool native_420;
-	/* Additional bits/grp for seconnd line of slice for native 4:2:0 */
+	/**
+	 * @second_line_bpg_offset:
+	 * Additional bits/grp for second line of slice for native 4:2:0
+	 */
 	u8 second_line_bpg_offset;
-	/* Num of bits deallocated for each grp that is not in second line of slice */
+	/**
+	 * @nsl_bpg_offset:
+	 * Number of bits deallocated for each group that is not in the second
+	 * line of a slice
+	 */
 	u16 nsl_bpg_offset;
-	/* Offset adj fr second line in Native 4:2:0 mode */
+	/**
+	 * @second_line_offset_adj:
+	 * Offset adjustment for second line in Native 4:2:0 mode
+	 */
 	u16 second_line_offset_adj;
 };
 
@@ -468,10 +588,13 @@ struct drm_dsc_picture_parameter_set {
  * This structure represents the DSC PPS infoframe required to send the Picture
  * Parameter Set metadata required before enabling VESA Display Stream
  * Compression. This is based on the DP Secondary Data Packet structure and
- * comprises of SDP Header as defined in drm_dp_helper.h and PPS payload.
+ * comprises the SDP header as defined in &struct dp_sdp_header in drm_dp_helper.h
+ * and PPS payload defined in &struct drm_dsc_picture_parameter_set.
  *
- * @pps_header: Header for PPS as per DP SDP header format
+ * @pps_header: Header for PPS as per DP SDP header format of type
+ *              &struct dp_sdp_header
  * @pps_payload: PPS payload fields as per DSC specification Table 4-1
+ *               as represented in &struct drm_dsc_picture_parameter_set
  */
 struct drm_dsc_pps_infoframe {
 	struct dp_sdp_header pps_header;
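As the new @bits_per_pixel comment notes, the field carries 4 fractional bits. A sketch of filling a few fields accordingly; the values are illustrative, not a validated configuration.

    static void my_fill_dsc_config(struct drm_dsc_config *cfg)
    {
            cfg->dsc_version_major = 1;
            cfg->dsc_version_minor = 1;
            cfg->bits_per_component = 8;
            cfg->convert_rgb = true;	/* RGB input */
            cfg->bits_per_pixel = 8 << 4;	/* 8.0 bpp, 4 fractional bits */
            cfg->slice_count = 2;
    }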
diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index e3c404833115..8dc1a081fb36 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -352,18 +352,17 @@ drm_load_edid_firmware(struct drm_connector *connector)
 
 int
 drm_hdmi_avi_infoframe_from_display_mode(struct hdmi_avi_infoframe *frame,
-					 const struct drm_display_mode *mode,
-					 bool is_hdmi2_sink);
+					 struct drm_connector *connector,
+					 const struct drm_display_mode *mode);
 int
 drm_hdmi_vendor_infoframe_from_display_mode(struct hdmi_vendor_infoframe *frame,
 					    struct drm_connector *connector,
 					    const struct drm_display_mode *mode);
 void
 drm_hdmi_avi_infoframe_quant_range(struct hdmi_avi_infoframe *frame,
+				   struct drm_connector *connector,
 				   const struct drm_display_mode *mode,
-				   enum hdmi_quantization_range rgb_quant_range,
-				   bool rgb_quant_range_selectable,
-				   bool is_hdmi2_sink);
+				   enum hdmi_quantization_range rgb_quant_range);
 
 /**
  * drm_eld_mnl - Get ELD monitor name length in bytes.
@@ -471,7 +470,6 @@ u8 drm_match_cea_mode(const struct drm_display_mode *to_match);
 enum hdmi_picture_aspect drm_get_cea_aspect_ratio(const u8 video_code);
 bool drm_detect_hdmi_monitor(struct edid *edid);
 bool drm_detect_monitor_audio(struct edid *edid);
-bool drm_rgb_quant_range_selectable(struct edid *edid);
 enum hdmi_quantization_range
 drm_default_rgb_quant_range(const struct drm_display_mode *mode);
 int drm_add_modes_noedid(struct drm_connector *connector,
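The AVI infoframe helpers now take the connector, so per-sink data such as the new rgb_quant_range_selectable flag is looked up internally. A sketch of the updated calling sequence; choosing limited range here is an illustrative assumption.

    static void my_set_avi_infoframe(struct drm_connector *connector,
                                     const struct drm_display_mode *mode)
    {
            struct hdmi_avi_infoframe frame;

            if (drm_hdmi_avi_infoframe_from_display_mode(&frame, connector,
                                                         mode))
                    return;

            drm_hdmi_avi_infoframe_quant_range(&frame, connector, mode,
                                               HDMI_QUANTIZATION_RANGE_LIMITED);

            /* pack and write the infoframe to hardware here */
    }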
diff --git a/include/drm/drm_encoder_slave.h b/include/drm/drm_encoder_slave.h
index 1107b4b1c599..a09864f6d684 100644
--- a/include/drm/drm_encoder_slave.h
+++ b/include/drm/drm_encoder_slave.h
@@ -27,7 +27,6 @@
 #ifndef __DRM_ENCODER_SLAVE_H__
 #define __DRM_ENCODER_SLAVE_H__
 
-#include <drm/drmP.h>
 #include <drm/drm_crtc.h>
 #include <drm/drm_encoder.h>
 
diff --git a/include/drm/drm_fb_cma_helper.h b/include/drm/drm_fb_cma_helper.h
index 8dbbe1eece1b..4becb09975a4 100644
--- a/include/drm/drm_fb_cma_helper.h
+++ b/include/drm/drm_fb_cma_helper.h
@@ -2,31 +2,9 @@
 #ifndef __DRM_FB_CMA_HELPER_H__
 #define __DRM_FB_CMA_HELPER_H__
 
-struct drm_fbdev_cma;
-struct drm_gem_cma_object;
-
-struct drm_fb_helper_surface_size;
-struct drm_framebuffer_funcs;
-struct drm_fb_helper_funcs;
 struct drm_framebuffer;
-struct drm_fb_helper;
-struct drm_device;
-struct drm_file;
-struct drm_mode_fb_cmd2;
-struct drm_plane;
 struct drm_plane_state;
 
-int drm_fb_cma_fbdev_init(struct drm_device *dev, unsigned int preferred_bpp,
-			  unsigned int max_conn_count);
-void drm_fb_cma_fbdev_fini(struct drm_device *dev);
-
-struct drm_fbdev_cma *drm_fbdev_cma_init(struct drm_device *dev,
-	unsigned int preferred_bpp, unsigned int max_conn_count);
-void drm_fbdev_cma_fini(struct drm_fbdev_cma *fbdev_cma);
-
-void drm_fbdev_cma_restore_mode(struct drm_fbdev_cma *fbdev_cma);
-void drm_fbdev_cma_hotplug_event(struct drm_fbdev_cma *fbdev_cma);
-
 struct drm_gem_cma_object *drm_fb_cma_get_gem_obj(struct drm_framebuffer *fb,
 	unsigned int plane);
 
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index 84ac79219e4c..6710b612e2f6 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -32,6 +32,7 @@
 
 #include <linux/types.h>
 #include <linux/completion.h>
+#include <linux/idr.h>
 
 #include <uapi/drm/drm.h>
 
diff --git a/include/drm/drm_fourcc.h b/include/drm/drm_fourcc.h
index bcb389f04618..b3d9d88ab290 100644
--- a/include/drm/drm_fourcc.h
+++ b/include/drm/drm_fourcc.h
@@ -143,6 +143,123 @@ struct drm_format_name_buf {
 	char str[32];
 };
 
+/**
+ * drm_format_info_is_yuv_packed - check that the format info matches a YUV
+ * format with data laid in a single plane
+ * @info: format info
+ *
+ * Returns:
+ * A boolean indicating whether the format info matches a packed YUV format.
+ */
+static inline bool
+drm_format_info_is_yuv_packed(const struct drm_format_info *info)
+{
+	return info->is_yuv && info->num_planes == 1;
+}
+
+/**
+ * drm_format_info_is_yuv_semiplanar - check that the format info matches a YUV
+ * format with data laid in two planes (luminance and chrominance)
+ * @info: format info
+ *
+ * Returns:
+ * A boolean indicating whether the format info matches a semiplanar YUV format.
+ */
+static inline bool
+drm_format_info_is_yuv_semiplanar(const struct drm_format_info *info)
+{
+	return info->is_yuv && info->num_planes == 2;
+}
+
+/**
+ * drm_format_info_is_yuv_planar - check that the format info matches a YUV
+ * format with data laid in three planes (one for each YUV component)
+ * @info: format info
+ *
+ * Returns:
+ * A boolean indicating whether the format info matches a planar YUV format.
+ */
+static inline bool
+drm_format_info_is_yuv_planar(const struct drm_format_info *info)
+{
+	return info->is_yuv && info->num_planes == 3;
+}
+
+/**
+ * drm_format_info_is_yuv_sampling_410 - check that the format info matches a
+ * YUV format with 4:1:0 sub-sampling
+ * @info: format info
+ *
+ * Returns:
+ * A boolean indicating whether the format info matches a YUV format with 4:1:0
+ * sub-sampling.
+ */
+static inline bool
+drm_format_info_is_yuv_sampling_410(const struct drm_format_info *info)
+{
+	return info->is_yuv && info->hsub == 4 && info->vsub == 4;
+}
+
+/**
+ * drm_format_info_is_yuv_sampling_411 - check that the format info matches a
+ * YUV format with 4:1:1 sub-sampling
+ * @info: format info
+ *
+ * Returns:
+ * A boolean indicating whether the format info matches a YUV format with 4:1:1
+ * sub-sampling.
+ */
+static inline bool
+drm_format_info_is_yuv_sampling_411(const struct drm_format_info *info)
+{
+	return info->is_yuv && info->hsub == 4 && info->vsub == 1;
+}
+
+/**
+ * drm_format_info_is_yuv_sampling_420 - check that the format info matches a
+ * YUV format with 4:2:0 sub-sampling
+ * @info: format info
+ *
+ * Returns:
+ * A boolean indicating whether the format info matches a YUV format with 4:2:0
+ * sub-sampling.
+ */
+static inline bool
+drm_format_info_is_yuv_sampling_420(const struct drm_format_info *info)
+{
+	return info->is_yuv && info->hsub == 2 && info->vsub == 2;
+}
+
+/**
+ * drm_format_info_is_yuv_sampling_422 - check that the format info matches a
+ * YUV format with 4:2:2 sub-sampling
+ * @info: format info
+ *
+ * Returns:
+ * A boolean indicating whether the format info matches a YUV format with 4:2:2
+ * sub-sampling.
+ */
+static inline bool
+drm_format_info_is_yuv_sampling_422(const struct drm_format_info *info)
+{
+	return info->is_yuv && info->hsub == 2 && info->vsub == 1;
+}
+
+/**
+ * drm_format_info_is_yuv_sampling_444 - check that the format info matches a
+ * YUV format with 4:4:4 sub-sampling
+ * @info: format info
+ *
+ * Returns:
+ * A boolean indicating whether the format info matches a YUV format with 4:4:4
+ * sub-sampling.
+ */
+static inline bool
+drm_format_info_is_yuv_sampling_444(const struct drm_format_info *info)
+{
+	return info->is_yuv && info->hsub == 1 && info->vsub == 1;
+}
+
 const struct drm_format_info *__drm_format_info(u32 format);
 const struct drm_format_info *drm_format_info(u32 format);
 const struct drm_format_info *
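These helpers classify a format purely from its drm_format_info. A small sketch: NV12 (two planes, is_yuv set) reports true below, while packed YUYV does not.

    static bool my_needs_chroma_plane(u32 fourcc)
    {
            const struct drm_format_info *info = drm_format_info(fourcc);

            /* e.g. true for DRM_FORMAT_NV12, false for DRM_FORMAT_YUYV */
            return info && drm_format_info_is_yuv_semiplanar(info);
    }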
diff --git a/include/drm/drm_framebuffer.h b/include/drm/drm_framebuffer.h
index c94acedfb08e..f0b34c977ec5 100644
--- a/include/drm/drm_framebuffer.h
+++ b/include/drm/drm_framebuffer.h
@@ -23,13 +23,17 @@
 #ifndef __DRM_FRAMEBUFFER_H__
 #define __DRM_FRAMEBUFFER_H__
 
-#include <linux/list.h>
 #include <linux/ctype.h>
+#include <linux/list.h>
+#include <linux/sched.h>
+
 #include <drm/drm_mode_object.h>
 
-struct drm_framebuffer;
-struct drm_file;
+struct drm_clip_rect;
 struct drm_device;
+struct drm_file;
+struct drm_framebuffer;
+struct drm_gem_object;
 
 /**
  * struct drm_framebuffer_funcs - framebuffer hooks
diff --git a/include/drm/drm_gem_cma_helper.h b/include/drm/drm_gem_cma_helper.h
index 07c504940ba1..947ac95eb24a 100644
--- a/include/drm/drm_gem_cma_helper.h
+++ b/include/drm/drm_gem_cma_helper.h
@@ -2,9 +2,12 @@
 #ifndef __DRM_GEM_CMA_HELPER_H__
 #define __DRM_GEM_CMA_HELPER_H__
 
-#include <drm/drmP.h>
+#include <drm/drm_file.h>
+#include <drm/drm_ioctl.h>
 #include <drm/drm_gem.h>
 
+struct drm_mode_create_dumb;
+
 /**
  * struct drm_gem_cma_object - GEM object backed by CMA memory allocations
  * @base: base GEM object
diff --git a/include/drm/drm_gem_framebuffer_helper.h b/include/drm/drm_gem_framebuffer_helper.h
index a38de7eb55b4..7f307e834eef 100644
--- a/include/drm/drm_gem_framebuffer_helper.h
+++ b/include/drm/drm_gem_framebuffer_helper.h
@@ -25,6 +25,9 @@ drm_gem_fb_create_with_funcs(struct drm_device *dev, struct drm_file *file,
 struct drm_framebuffer *
 drm_gem_fb_create(struct drm_device *dev, struct drm_file *file,
 		  const struct drm_mode_fb_cmd2 *mode_cmd);
+struct drm_framebuffer *
+drm_gem_fb_create_with_dirty(struct drm_device *dev, struct drm_file *file,
+			     const struct drm_mode_fb_cmd2 *mode_cmd);
 
 int drm_gem_fb_prepare_fb(struct drm_plane *plane,
 			  struct drm_plane_state *state);
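A sketch of opting into the new constructor from a driver's mode-config funcs; the atomic helper entries are the usual defaults, shown only for context.

    static const struct drm_mode_config_funcs my_mode_config_funcs = {
            .fb_create = drm_gem_fb_create_with_dirty,
            .atomic_check = drm_atomic_helper_check,
            .atomic_commit = drm_atomic_helper_commit,
    };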
diff --git a/include/drm/drm_hdcp.h b/include/drm/drm_hdcp.h
index c21682f76cd3..7260b31af276 100644
--- a/include/drm/drm_hdcp.h
+++ b/include/drm/drm_hdcp.h
@@ -9,6 +9,8 @@
 #ifndef _DRM_HDCP_H_INCLUDED_
 #define _DRM_HDCP_H_INCLUDED_
 
+#include <linux/types.h>
+
 /* Period of hdcp checks (to ensure we're still authenticated) */
 #define DRM_HDCP_CHECK_PERIOD_MS		(128 * 16)
 
diff --git a/include/drm/drm_legacy.h b/include/drm/drm_legacy.h
index 8fad66f88e4f..3e99ab69c122 100644
--- a/include/drm/drm_legacy.h
+++ b/include/drm/drm_legacy.h
@@ -2,6 +2,9 @@
 #define __DRM_DRM_LEGACY_H__
 
 #include <drm/drm_auth.h>
+#include <drm/drm_hashtab.h>
+
+struct drm_device;
 
 /*
  * Legacy driver interfaces for the Direct Rendering Manager
@@ -156,6 +159,7 @@ struct drm_map_list {
 int drm_legacy_addmap(struct drm_device *d, resource_size_t offset,
 		      unsigned int size, enum drm_map_type type,
 		      enum drm_map_flags flags, struct drm_local_map **map_p);
+struct drm_local_map *drm_legacy_findmap(struct drm_device *dev, unsigned int token);
 void drm_legacy_rmmap(struct drm_device *d, struct drm_local_map *map);
 int drm_legacy_rmmap_locked(struct drm_device *d, struct drm_local_map *map);
 void drm_legacy_master_rmmaps(struct drm_device *dev,
@@ -194,14 +198,4 @@ void drm_legacy_ioremap(struct drm_local_map *map, struct drm_device *dev);
 void drm_legacy_ioremap_wc(struct drm_local_map *map, struct drm_device *dev);
 void drm_legacy_ioremapfree(struct drm_local_map *map, struct drm_device *dev);
 
-static inline struct drm_local_map *drm_legacy_findmap(struct drm_device *dev,
-						       unsigned int token)
-{
-	struct drm_map_list *_entry;
-	list_for_each_entry(_entry, &dev->maplist, head)
-	    if (_entry->user_token == token)
-		return _entry->map;
-	return NULL;
-}
-
 #endif /* __DRM_DRM_LEGACY_H__ */
diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
index 572274ccbec7..7f60e8eb269a 100644
--- a/include/drm/drm_mode_config.h
+++ b/include/drm/drm_mode_config.h
@@ -361,7 +361,7 @@ struct drm_mode_config {
 	 *
 	 * This is the big scary modeset BKL which protects everything that
 	 * isn't protected otherwise. Scope is unclear and fuzzy, try to remove
-	 * anything from under it's protection and move it into more well-scoped
+	 * anything from under its protection and move it into more well-scoped
 	 * locks.
 	 *
 	 * The one important thing this protects is the use of @acquire_ctx.
@@ -391,18 +391,18 @@ struct drm_mode_config {
 	/**
 	 * @idr_mutex:
 	 *
-	 * Mutex for KMS ID allocation and management. Protects both @crtc_idr
+	 * Mutex for KMS ID allocation and management. Protects both @object_idr
 	 * and @tile_idr.
 	 */
 	struct mutex idr_mutex;
 
 	/**
-	 * @crtc_idr:
+	 * @object_idr:
 	 *
 	 * Main KMS ID tracking object. Use this idr for all IDs, fb, crtc,
 	 * connector, modes - just makes life easier to have only one.
 	 */
-	struct idr crtc_idr;
+	struct idr object_idr;
 
 	/**
 	 * @tile_idr:
@@ -512,6 +512,15 @@ struct drm_mode_config {
 	 */
 	struct list_head property_list;
 
+	/**
+	 * @privobj_list:
+	 *
+	 * List of private objects linked with &drm_private_obj.head. This is
+	 * invariant over the lifetime of a device and hence doesn't need any
+	 * locks.
+	 */
+	struct list_head privobj_list;
+
 	int min_width, min_height;
 	int max_width, max_height;
 	const struct drm_mode_config_funcs *funcs;
@@ -688,22 +697,22 @@ struct drm_mode_config {
 	struct drm_property *tv_mode_property;
 	/**
 	 * @tv_left_margin_property: Optional TV property to set the left
-	 * margin.
+	 * margin (expressed in pixels).
 	 */
 	struct drm_property *tv_left_margin_property;
 	/**
 	 * @tv_right_margin_property: Optional TV property to set the right
-	 * margin.
+	 * margin (expressed in pixels).
 	 */
 	struct drm_property *tv_right_margin_property;
 	/**
 	 * @tv_top_margin_property: Optional TV property to set the top
-	 * margin.
+	 * margin (expressed in pixels).
 	 */
 	struct drm_property *tv_top_margin_property;
 	/**
 	 * @tv_bottom_margin_property: Optional TV property to set the bottom
-	 * margin.
+	 * margin (expressed in pixels).
 	 */
 	struct drm_property *tv_bottom_margin_property;
 	/**
diff --git a/include/drm/drm_modes.h b/include/drm/drm_modes.h
index baded6514456..be4fed97e727 100644
--- a/include/drm/drm_modes.h
+++ b/include/drm/drm_modes.h
@@ -136,8 +136,7 @@ enum drm_mode_status {
 	.hdisplay = (hd), .hsync_start = (hss), .hsync_end = (hse), \
 	.htotal = (ht), .hskew = (hsk), .vdisplay = (vd), \
 	.vsync_start = (vss), .vsync_end = (vse), .vtotal = (vt), \
-	.vscan = (vs), .flags = (f), \
-	.base.type = DRM_MODE_OBJECT_MODE
+	.vscan = (vs), .flags = (f)
 
 #define CRTC_INTERLACE_HALVE_V	(1 << 0) /* halve V values for interlacing */
 #define CRTC_STEREO_DOUBLE	(1 << 1) /* adjust timings for stereo modes */
@@ -214,20 +213,6 @@ struct drm_display_mode {
 	struct list_head head;
 
 	/**
-	 * @base:
-	 *
-	 * A display mode is a normal modeset object, possibly including public
-	 * userspace id.
-	 *
-	 * FIXME:
-	 *
-	 * This can probably be removed since the entire concept of userspace
-	 * managing modes explicitly has never landed in upstream kernel mode
-	 * setting support.
-	 */
-	struct drm_mode_object base;
-
-	/**
 	 * @name:
 	 *
 	 * Human-readable name of the mode, filled out with drm_mode_set_name().
@@ -429,14 +414,14 @@ struct drm_display_mode {
 /**
  * DRM_MODE_FMT - printf string for &struct drm_display_mode
  */
-#define DRM_MODE_FMT    "%d:\"%s\" %d %d %d %d %d %d %d %d %d %d 0x%x 0x%x"
+#define DRM_MODE_FMT    "\"%s\": %d %d %d %d %d %d %d %d %d %d 0x%x 0x%x"
 
 /**
  * DRM_MODE_ARG - printf arguments for &struct drm_display_mode
  * @m: display mode
  */
 #define DRM_MODE_ARG(m) \
-	(m)->base.id, (m)->name, (m)->vrefresh, (m)->clock, \
+	(m)->name, (m)->vrefresh, (m)->clock, \
 	(m)->hdisplay, (m)->hsync_start, (m)->hsync_end, (m)->htotal, \
 	(m)->vdisplay, (m)->vsync_start, (m)->vsync_end, (m)->vtotal, \
 	(m)->type, (m)->flags
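With the mode object id gone, logging call sites keep working unchanged; a minimal sketch (mode is assumed to be a struct drm_display_mode *):

/* The format string no longer prints a mode object id. */
DRM_DEBUG_KMS("requested mode " DRM_MODE_FMT "\n", DRM_MODE_ARG(mode));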
diff --git a/include/drm/drm_modeset_helper.h b/include/drm/drm_modeset_helper.h
index efa337f03129..995fd981cab0 100644
--- a/include/drm/drm_modeset_helper.h
+++ b/include/drm/drm_modeset_helper.h
@@ -23,7 +23,11 @@
 #ifndef __DRM_KMS_HELPER_H__
 #define __DRM_KMS_HELPER_H__
 
-#include <drm/drmP.h>
+struct drm_crtc;
+struct drm_crtc_funcs;
+struct drm_device;
+struct drm_framebuffer;
+struct drm_mode_fb_cmd2;
 
 void drm_helper_move_panel_connectors_to_head(struct drm_device *);
 
diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h
index 61142aa0ab23..cfb7be40bed7 100644
--- a/include/drm/drm_modeset_helper_vtables.h
+++ b/include/drm/drm_modeset_helper_vtables.h
@@ -1013,7 +1013,7 @@ struct drm_plane_helper_funcs {
 	 * @prepare_fb:
 	 *
 	 * This hook is to prepare a framebuffer for scanout by e.g. pinning
-	 * it's backing storage or relocating it into a contiguous block of
+	 * its backing storage or relocating it into a contiguous block of
 	 * VRAM. Other possible preparatory work includes flushing caches.
 	 *
 	 * This function must not block for outstanding rendering, since it is
diff --git a/include/drm/drm_modeset_lock.h b/include/drm/drm_modeset_lock.h
index a308f2d6496f..7b8841065b11 100644
--- a/include/drm/drm_modeset_lock.h
+++ b/include/drm/drm_modeset_lock.h
@@ -68,7 +68,7 @@ struct drm_modeset_acquire_ctx {
 /**
  * struct drm_modeset_lock - used for locking modeset resources.
  * @mutex: resource locking
- * @head: used to hold it's place on &drm_atomi_state.locked list when
+ * @head: used to hold its place on &drm_atomic_state.locked list when
  *    part of an atomic update
  *
  * Used for locking CRTCs and other modeset resources.
diff --git a/include/drm/drm_probe_helper.h b/include/drm/drm_probe_helper.h
new file mode 100644
index 000000000000..8d3ed2834d34
--- /dev/null
+++ b/include/drm/drm_probe_helper.h
@@ -0,0 +1,27 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+
+#ifndef __DRM_PROBE_HELPER_H__
+#define __DRM_PROBE_HELPER_H__
+
+#include <linux/types.h>
+
+struct drm_connector;
+struct drm_device;
+struct drm_modeset_acquire_ctx;
+
+int drm_helper_probe_single_connector_modes(struct drm_connector
+					    *connector, uint32_t maxX,
+					    uint32_t maxY);
+int drm_helper_probe_detect(struct drm_connector *connector,
+			    struct drm_modeset_acquire_ctx *ctx,
+			    bool force);
+void drm_kms_helper_poll_init(struct drm_device *dev);
+void drm_kms_helper_poll_fini(struct drm_device *dev);
+bool drm_helper_hpd_irq_event(struct drm_device *dev);
+void drm_kms_helper_hotplug_event(struct drm_device *dev);
+
+void drm_kms_helper_poll_disable(struct drm_device *dev);
+void drm_kms_helper_poll_enable(struct drm_device *dev);
+bool drm_kms_helper_is_poll_worker(void);
+
+#endif
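For context, a hedged sketch of the typical poll-helper usage these declarations cover; my_driver_suspend/my_driver_resume are hypothetical driver hooks:

static int my_driver_suspend(struct drm_device *dev)
{
	/* Stop the output poll worker before powering down. */
	drm_kms_helper_poll_disable(dev);
	return 0;
}

static int my_driver_resume(struct drm_device *dev)
{
	drm_kms_helper_poll_enable(dev);
	/* Outputs may have changed while asleep; force a re-probe. */
	drm_helper_hpd_irq_event(dev);
	return 0;
}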
diff --git a/include/drm/drm_rect.h b/include/drm/drm_rect.h
index 6c54544a4be7..6195820aa5c5 100644
--- a/include/drm/drm_rect.h
+++ b/include/drm/drm_rect.h
@@ -182,12 +182,6 @@ int drm_rect_calc_hscale(const struct drm_rect *src,
 int drm_rect_calc_vscale(const struct drm_rect *src,
 			 const struct drm_rect *dst,
 			 int min_vscale, int max_vscale);
-int drm_rect_calc_hscale_relaxed(struct drm_rect *src,
-				 struct drm_rect *dst,
-				 int min_hscale, int max_hscale);
-int drm_rect_calc_vscale_relaxed(struct drm_rect *src,
-				 struct drm_rect *dst,
-				 int min_vscale, int max_vscale);
 void drm_rect_debug_print(const char *prefix,
 			  const struct drm_rect *r, bool fixed_point);
 void drm_rect_rotate(struct drm_rect *r,
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index b1fe921f8e8f..0311c9fdbd2f 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -26,9 +26,9 @@
 #ifndef __DRM_SYNCOBJ_H__
 #define __DRM_SYNCOBJ_H__
 
-#include "linux/dma-fence.h"
+#include <linux/dma-fence.h>
 
-struct drm_syncobj_cb;
+struct drm_file;
 
 /**
  * struct drm_syncobj - sync object.
@@ -62,25 +62,6 @@ struct drm_syncobj {
 	struct file *file;
 };
 
-typedef void (*drm_syncobj_func_t)(struct drm_syncobj *syncobj,
-				   struct drm_syncobj_cb *cb);
-
-/**
- * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
- * @node: used by drm_syncob_add_callback to append this struct to
- *	  &drm_syncobj.cb_list
- * @func: drm_syncobj_func_t to call
- *
- * This struct will be initialized by drm_syncobj_add_callback, additional
- * data can be passed along by embedding drm_syncobj_cb in another struct.
- * The callback will get called the next time drm_syncobj_replace_fence is
- * called.
- */
-struct drm_syncobj_cb {
-	struct list_head node;
-	drm_syncobj_func_t func;
-};
-
 void drm_syncobj_free(struct kref *kref);
 
 /**
diff --git a/include/drm/drm_util.h b/include/drm/drm_util.h
index 88abdca89baa..07b8e9f04599 100644
--- a/include/drm/drm_util.h
+++ b/include/drm/drm_util.h
@@ -26,7 +26,58 @@
 #ifndef _DRM_UTIL_H_
 #define _DRM_UTIL_H_
 
-/* helper for handling conditionals in various for_each macros */
+/**
+ * DOC: drm utils
+ *
+ * Macros and inline functions that do not naturally belong in other places
+ */
+
+#include <linux/interrupt.h>
+#include <linux/kgdb.h>
+#include <linux/preempt.h>
+#include <linux/smp.h>
+
+/*
+ * Use EXPORT_SYMBOL_FOR_TESTS_ONLY() for functions that shall
+ * only be visible to the drm selftests.
+ */
+#if defined(CONFIG_DRM_DEBUG_SELFTEST_MODULE)
+#define EXPORT_SYMBOL_FOR_TESTS_ONLY(x) EXPORT_SYMBOL(x)
+#else
+#define EXPORT_SYMBOL_FOR_TESTS_ONLY(x)
+#endif
+
+/**
+ * for_each_if - helper for handling conditionals in various for_each macros
+ * @condition: The condition to check
+ *
+ * Typical use::
+ *
+ *	#define for_each_foo_bar(x, y) \'
+ *		list_for_each_entry(x, y->list, head) \'
+ *			for_each_if(x->something == SOMETHING)
+ *
+ * The for_each_if() macro makes the use of for_each_foo_bar() less error
+ * prone.
+ */
 #define for_each_if(condition) if (!(condition)) {} else
 
+/**
+ * drm_can_sleep - returns true if currently okay to sleep
+ *
+ * This function shall not be used in new code.
+ * The check for running in atomic context may not work - see linux/preempt.h.
+ *
+ * FIXME: All users of drm_can_sleep should be removed (see todo.rst)
+ *
+ * Returns:
+ * False if kgdb is active, we are in atomic context or irqs are disabled.
+ */
+static inline bool drm_can_sleep(void)
+{
+	if (in_atomic() || in_dbg_master() || irqs_disabled())
+		return false;
+	return true;
+}
+
 #endif
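As a concrete illustration of the for_each_if() pattern documented above (a hypothetical iterator macro, assuming the existing drm_crtc 'enabled' flag):

/* Iterate only over enabled CRTCs; the else-arm trick in for_each_if()
 * keeps a trailing "else" in the caller from binding to the wrong "if".
 */
#define for_each_enabled_crtc(crtc, dev) \
	list_for_each_entry(crtc, &(dev)->mode_config.crtc_list, head) \
		for_each_if((crtc)->enabled)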
diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h
index 6ad9630d4f48..e528bb2f659d 100644
--- a/include/drm/drm_vblank.h
+++ b/include/drm/drm_vblank.h
@@ -129,6 +129,26 @@ struct drm_vblank_crtc {
 	 */
 	u32 last;
 	/**
+	 * @max_vblank_count:
+	 *
+	 * Maximum value of the vblank registers for this crtc. This value +1
+	 * will result in a wrap-around of the vblank register. It is used
+	 * by the vblank core to handle wrap-arounds.
+	 *
+	 * If set to zero the vblank core will try to guess the elapsed vblanks
+	 * between times when the vblank interrupt is disabled through
+	 * high-precision timestamps. That approach suffers from small
+	 * races and imprecision over longer time periods, hence exposing a
+	 * hardware vblank counter is always recommended.
+	 *
+	 * This is the runtime configurable per-crtc maximum set through
+	 * drm_crtc_set_max_vblank_count(). If this is used the driver
+	 * must leave the device wide &drm_device.max_vblank_count at zero.
+	 *
+	 * If non-zero, &drm_crtc_funcs.get_vblank_counter must be set.
+	 */
+	u32 max_vblank_count;
+	/**
 	 * @inmodeset: Tracks whether the vblank is disabled due to a modeset.
 	 * For legacy driver bit 2 additionally tracks whether an additional
 	 * temporary vblank reference has been acquired to paper over the
@@ -206,4 +226,6 @@ bool drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev,
 void drm_calc_timestamping_constants(struct drm_crtc *crtc,
 				     const struct drm_display_mode *mode);
 wait_queue_head_t *drm_crtc_vblank_waitqueue(struct drm_crtc *crtc);
+void drm_crtc_set_max_vblank_count(struct drm_crtc *crtc,
+				   u32 max_vblank_count);
 #endif
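A short sketch of the new per-CRTC API, assuming hypothetical hardware with a 24-bit vblank counter; note that the device-wide drm_device.max_vblank_count must remain zero in this case:

static void my_crtc_init_vblank(struct drm_crtc *crtc)
{
	/* HW vblank counter wraps after 0xffffff + 1 frames. */
	drm_crtc_set_max_vblank_count(crtc, 0xffffff);
}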
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 47e19796c450..0daca4d8dad9 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -138,10 +138,6 @@ struct drm_sched_fence {
 	struct dma_fence		finished;
 
         /**
-         * @cb: the callback for the parent fence below.
-         */
-	struct dma_fence_cb		cb;
-        /**
          * @parent: the fence returned by &drm_sched_backend_ops.run_job
          * when scheduling the job on hardware. We signal the
          * &drm_sched_fence.finished fence once parent is signalled.
@@ -181,6 +177,7 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
  *         be scheduled further.
  * @s_priority: the priority of the job.
  * @entity: the entity to which this job belongs.
+ * @cb: the callback for the parent fence in s_fence.
  *
  * A job is created by the driver using drm_sched_job_init(), and
  * should call drm_sched_entity_push_job() once it wants the scheduler
@@ -197,6 +194,7 @@ struct drm_sched_job {
 	atomic_t			karma;
 	enum drm_sched_priority		s_priority;
 	struct drm_sched_entity  *entity;
+	struct dma_fence_cb		cb;
 };
 
 static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
@@ -298,9 +296,10 @@ int drm_sched_job_init(struct drm_sched_job *job,
 		       void *owner);
 void drm_sched_job_cleanup(struct drm_sched_job *job);
 void drm_sched_wakeup(struct drm_gpu_scheduler *sched);
-void drm_sched_hw_job_reset(struct drm_gpu_scheduler *sched,
-			    struct drm_sched_job *job);
-void drm_sched_job_recovery(struct drm_gpu_scheduler *sched);
+void drm_sched_stop(struct drm_gpu_scheduler *sched);
+void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery);
+void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched);
+void drm_sched_increase_karma(struct drm_sched_job *bad);
 bool drm_sched_dependency_optimized(struct dma_fence* fence,
 				    struct drm_sched_entity *entity);
 void drm_sched_fault(struct drm_gpu_scheduler *sched);
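To illustrate the reworked recovery entry points, a hedged sketch of a driver timeout handler in the style drivers of this series use; my_hw_reset is a hypothetical HW-specific hook:

static void my_hw_reset(void);	/* hypothetical HW-specific reset */

static void my_timedout_job(struct drm_sched_job *bad)
{
	struct drm_gpu_scheduler *sched = bad->sched;

	drm_sched_stop(sched);			/* park the scheduler */
	drm_sched_increase_karma(bad);		/* blame the hanging job */

	my_hw_reset();

	drm_sched_resubmit_jobs(sched);		/* re-run unfinished jobs */
	drm_sched_start(sched, true);		/* full recovery */
}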
diff --git a/include/drm/i915_pciids.h b/include/drm/i915_pciids.h
index 192667144693..d2fad7b0fcf6 100644
--- a/include/drm/i915_pciids.h
+++ b/include/drm/i915_pciids.h
@@ -394,6 +394,9 @@
 	INTEL_VGA_DEVICE(0x3E9A, info)  /* SRV GT2 */
 
 /* CFL H */
+#define INTEL_CFL_H_GT1_IDS(info) \
+	INTEL_VGA_DEVICE(0x3E9C, info)
+
 #define INTEL_CFL_H_GT2_IDS(info) \
 	INTEL_VGA_DEVICE(0x3E9B, info), /* Halo GT2 */ \
 	INTEL_VGA_DEVICE(0x3E94, info)  /* Halo GT2 */
@@ -426,6 +429,7 @@
 #define INTEL_CFL_IDS(info)	   \
 	INTEL_CFL_S_GT1_IDS(info), \
 	INTEL_CFL_S_GT2_IDS(info), \
+	INTEL_CFL_H_GT1_IDS(info), \
 	INTEL_CFL_H_GT2_IDS(info), \
 	INTEL_CFL_U_GT2_IDS(info), \
 	INTEL_CFL_U_GT3_IDS(info), \
@@ -457,9 +461,13 @@
 	INTEL_VGA_DEVICE(0x8A51, info), \
 	INTEL_VGA_DEVICE(0x8A5C, info), \
 	INTEL_VGA_DEVICE(0x8A5D, info), \
+	INTEL_VGA_DEVICE(0x8A59, info),	\
+	INTEL_VGA_DEVICE(0x8A58, info),	\
 	INTEL_VGA_DEVICE(0x8A52, info), \
 	INTEL_VGA_DEVICE(0x8A5A, info), \
 	INTEL_VGA_DEVICE(0x8A5B, info), \
+	INTEL_VGA_DEVICE(0x8A57, info), \
+	INTEL_VGA_DEVICE(0x8A56, info), \
 	INTEL_VGA_DEVICE(0x8A71, info), \
 	INTEL_VGA_DEVICE(0x8A70, info)
 
diff --git a/include/drm/intel-gtt.h b/include/drm/intel-gtt.h
index 2324c84a25c0..71d81923e6b0 100644
--- a/include/drm/intel-gtt.h
+++ b/include/drm/intel-gtt.h
@@ -4,6 +4,9 @@
 #ifndef _DRM_INTEL_GTT_H
 #define	_DRM_INTEL_GTT_H
 
+#include <linux/agp_backend.h>
+#include <linux/kernel.h>
+
 void intel_gtt_get(u64 *gtt_total,
 		   phys_addr_t *mappable_base,
 		   resource_size_t *mappable_end);
diff --git a/include/drm/tinydrm/mipi-dbi.h b/include/drm/tinydrm/mipi-dbi.h
index b8ba58861986..f4ec2834bc22 100644
--- a/include/drm/tinydrm/mipi-dbi.h
+++ b/include/drm/tinydrm/mipi-dbi.h
@@ -14,6 +14,7 @@
 
 #include <drm/tinydrm/tinydrm.h>
 
+struct drm_rect;
 struct spi_device;
 struct gpio_desc;
 struct regulator;
@@ -67,6 +68,8 @@ int mipi_dbi_init(struct device *dev, struct mipi_dbi *mipi,
 		  const struct drm_simple_display_pipe_funcs *pipe_funcs,
 		  struct drm_driver *driver,
 		  const struct drm_display_mode *mode, unsigned int rotation);
+void mipi_dbi_pipe_update(struct drm_simple_display_pipe *pipe,
+			  struct drm_plane_state *old_state);
 void mipi_dbi_enable_flush(struct mipi_dbi *mipi,
 			   struct drm_crtc_state *crtc_state,
 			   struct drm_plane_state *plan_state);
@@ -80,7 +83,7 @@ u32 mipi_dbi_spi_cmd_max_speed(struct spi_device *spi, size_t len);
 int mipi_dbi_command_read(struct mipi_dbi *mipi, u8 cmd, u8 *val);
 int mipi_dbi_command_buf(struct mipi_dbi *mipi, u8 cmd, u8 *data, size_t len);
 int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer *fb,
-		      struct drm_clip_rect *clip, bool swap);
+		      struct drm_rect *clip, bool swap);
 /**
  * mipi_dbi_command - MIPI DCS command with optional parameter(s)
  * @mipi: MIPI structure
diff --git a/include/drm/tinydrm/tinydrm-helpers.h b/include/drm/tinydrm/tinydrm-helpers.h
index 5b96f0b12c8c..f0d598789e4d 100644
--- a/include/drm/tinydrm/tinydrm-helpers.h
+++ b/include/drm/tinydrm/tinydrm-helpers.h
@@ -11,8 +11,8 @@
 #define __LINUX_TINYDRM_HELPERS_H
 
 struct backlight_device;
-struct tinydrm_device;
-struct drm_clip_rect;
+struct drm_framebuffer;
+struct drm_rect;
 struct spi_transfer;
 struct spi_message;
 struct spi_device;
@@ -33,23 +33,15 @@ static inline bool tinydrm_machine_little_endian(void)
 #endif
 }
 
-bool tinydrm_merge_clips(struct drm_clip_rect *dst,
-			 struct drm_clip_rect *src, unsigned int num_clips,
-			 unsigned int flags, u32 max_width, u32 max_height);
-int tinydrm_fb_dirty(struct drm_framebuffer *fb,
-		     struct drm_file *file_priv,
-		     unsigned int flags, unsigned int color,
-		     struct drm_clip_rect *clips,
-		     unsigned int num_clips);
 void tinydrm_memcpy(void *dst, void *vaddr, struct drm_framebuffer *fb,
-		    struct drm_clip_rect *clip);
+		    struct drm_rect *clip);
 void tinydrm_swab16(u16 *dst, void *vaddr, struct drm_framebuffer *fb,
-		    struct drm_clip_rect *clip);
+		    struct drm_rect *clip);
 void tinydrm_xrgb8888_to_rgb565(u16 *dst, void *vaddr,
 				struct drm_framebuffer *fb,
-				struct drm_clip_rect *clip, bool swap);
+				struct drm_rect *clip, bool swap);
 void tinydrm_xrgb8888_to_gray8(u8 *dst, void *vaddr, struct drm_framebuffer *fb,
-			       struct drm_clip_rect *clip);
+			       struct drm_rect *clip);
 
 size_t tinydrm_spi_max_transfer_size(struct spi_device *spi, size_t max_len);
 bool tinydrm_spi_bpw_supported(struct spi_device *spi, u8 bpw);
diff --git a/include/drm/tinydrm/tinydrm.h b/include/drm/tinydrm/tinydrm.h
index 448aa5ea4722..5621688edcc0 100644
--- a/include/drm/tinydrm/tinydrm.h
+++ b/include/drm/tinydrm/tinydrm.h
@@ -10,14 +10,9 @@
 #ifndef __LINUX_TINYDRM_H
 #define __LINUX_TINYDRM_H
 
-#include <linux/mutex.h>
 #include <drm/drm_simple_kms_helper.h>
 
-struct drm_clip_rect;
 struct drm_driver;
-struct drm_file;
-struct drm_framebuffer;
-struct drm_framebuffer_funcs;
 
 /**
  * struct tinydrm_device - tinydrm device
@@ -32,24 +27,6 @@ struct tinydrm_device {
 	 * @pipe: Display pipe structure
 	 */
 	struct drm_simple_display_pipe pipe;
-
-	/**
-	 * @dirty_lock: Serializes framebuffer flushing
-	 */
-	struct mutex dirty_lock;
-
-	/**
-	 * @fb_funcs: Framebuffer functions used when creating framebuffers
-	 */
-	const struct drm_framebuffer_funcs *fb_funcs;
-
-	/**
-	 * @fb_dirty: Framebuffer dirty callback
-	 */
-	int (*fb_dirty)(struct drm_framebuffer *framebuffer,
-			struct drm_file *file_priv, unsigned flags,
-			unsigned color, struct drm_clip_rect *clips,
-			unsigned num_clips);
 };
 
 static inline struct tinydrm_device *
@@ -82,13 +59,10 @@ pipe_to_tinydrm(struct drm_simple_display_pipe *pipe)
 	.clock = 1 /* pass validation */
 
 int devm_tinydrm_init(struct device *parent, struct tinydrm_device *tdev,
-		      const struct drm_framebuffer_funcs *fb_funcs,
 		      struct drm_driver *driver);
 int devm_tinydrm_register(struct tinydrm_device *tdev);
 void tinydrm_shutdown(struct tinydrm_device *tdev);
 
-void tinydrm_display_pipe_update(struct drm_simple_display_pipe *pipe,
-				 struct drm_plane_state *old_state);
 int
 tinydrm_display_pipe_init(struct tinydrm_device *tdev,
 			  const struct drm_simple_display_pipe_funcs *funcs,
diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index 3fc4854dce49..49d9cdfc58f2 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -296,23 +296,6 @@ static inline void ttm_bo_get(struct ttm_buffer_object *bo)
 }
 
 /**
- * ttm_bo_reference - reference a struct ttm_buffer_object
- *
- * @bo: The buffer object.
- *
- * Returns a refcounted pointer to a buffer object.
- *
- * This function is deprecated. Use @ttm_bo_get instead.
- */
-
-static inline struct ttm_buffer_object *
-ttm_bo_reference(struct ttm_buffer_object *bo)
-{
-	ttm_bo_get(bo);
-	return bo;
-}
-
-/**
  * ttm_bo_get_unless_zero - reference a struct ttm_buffer_object unless
  * its refcount has already reached zero.
  * @bo: The buffer object.
@@ -387,17 +370,6 @@ int ttm_bo_validate(struct ttm_buffer_object *bo,
 void ttm_bo_put(struct ttm_buffer_object *bo);
 
 /**
- * ttm_bo_unref
- *
- * @bo: The buffer object.
- *
- * Unreference and clear a pointer to a buffer object.
- *
- * This function is deprecated. Use @ttm_bo_put instead.
- */
-void ttm_bo_unref(struct ttm_buffer_object **bo);
-
-/**
  * ttm_bo_add_to_lru
  *
  * @bo: The buffer object.
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index 1021106438b2..cbf3180cb612 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -381,6 +381,15 @@ struct ttm_bo_driver {
 	 */
 	int (*access_memory)(struct ttm_buffer_object *bo, unsigned long offset,
 			     void *buf, int len, int write);
+
+	/**
+	 * struct ttm_bo_driver member del_from_lru_notify
+	 *
+	 * @bo: the buffer object deleted from the LRU
+	 *
+	 * Notify the driver that a BO was deleted from the LRU.
+	 */
+	void (*del_from_lru_notify)(struct ttm_buffer_object *bo);
 };
 
 /**
@@ -867,7 +876,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
  *
  * @bo: A pointer to a struct ttm_buffer_object.
  *
- * Pipelined gutting a BO of it's backing store.
+ * Pipelined gutting a BO of its backing store.
  */
 int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo);
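A hedged sketch of a driver wiring up the new del_from_lru_notify hook; struct my_device and its lru_bo_count are hypothetical bookkeeping, not part of TTM:

struct my_device {
	struct ttm_bo_device bdev;
	atomic_t lru_bo_count;		/* hypothetical bookkeeping */
};

static void my_del_from_lru_notify(struct ttm_buffer_object *bo)
{
	struct my_device *mdev =
		container_of(bo->bdev, struct my_device, bdev);

	atomic_dec(&mdev->lru_bo_count);
}

static struct ttm_bo_driver my_bo_driver = {
	/* ... other mandatory hooks elided ... */
	.del_from_lru_notify = my_del_from_lru_notify,
};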
 
diff --git a/include/linux/dma-fence-array.h b/include/linux/dma-fence-array.h
index bc8940ca280d..c0ff417b4770 100644
--- a/include/linux/dma-fence-array.h
+++ b/include/linux/dma-fence-array.h
@@ -40,6 +40,7 @@ struct dma_fence_array_cb {
  * @num_fences: number of fences in the array
  * @num_pending: fences in the array still pending
  * @fences: array of the fences
+ * @work: internal irq_work function
  */
 struct dma_fence_array {
 	struct dma_fence base;
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 999e4b104410..6b788467b2e3 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -77,7 +77,7 @@ struct dma_fence {
 	struct list_head cb_list;
 	spinlock_t *lock;
 	u64 context;
-	unsigned seqno;
+	u64 seqno;
 	unsigned long flags;
 	ktime_t timestamp;
 	int error;
@@ -244,7 +244,7 @@ struct dma_fence_ops {
 };
 
 void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
-		    spinlock_t *lock, u64 context, unsigned seqno);
+		    spinlock_t *lock, u64 context, u64 seqno);
 
 void dma_fence_release(struct kref *kref);
 void dma_fence_free(struct dma_fence *fence);
@@ -414,9 +414,17 @@ dma_fence_is_signaled(struct dma_fence *fence)
  * Returns true if f1 is chronologically later than f2. Both fences must be
  * from the same context, since a seqno is not common across contexts.
  */
-static inline bool __dma_fence_is_later(u32 f1, u32 f2)
+static inline bool __dma_fence_is_later(u64 f1, u64 f2)
 {
-	return (int)(f1 - f2) > 0;
+	/* This is for backward compatibility with drivers which can only handle
+	 * 32bit sequence numbers. Use a 64bit compare when any of the higher
+	 * bits are non-zero, otherwise use a 32bit compare with wrap-around
+	 * handling.
+	 */
+	if (upper_32_bits(f1) || upper_32_bits(f2))
+		return f1 > f2;
+
+	return (int)(lower_32_bits(f1) - lower_32_bits(f2)) > 0;
 }
 
 /**
@@ -548,21 +556,21 @@ u64 dma_fence_context_alloc(unsigned num);
 	do {								\
 		struct dma_fence *__ff = (f);				\
 		if (IS_ENABLED(CONFIG_DMA_FENCE_TRACE))			\
-			pr_info("f %llu#%u: " fmt,			\
+			pr_info("f %llu#%llu: " fmt,			\
 				__ff->context, __ff->seqno, ##args);	\
 	} while (0)
 
 #define DMA_FENCE_WARN(f, fmt, args...) \
 	do {								\
 		struct dma_fence *__ff = (f);				\
-		pr_warn("f %llu#%u: " fmt, __ff->context, __ff->seqno,	\
+		pr_warn("f %llu#%llu: " fmt, __ff->context, __ff->seqno,\
 			 ##args);					\
 	} while (0)
 
 #define DMA_FENCE_ERR(f, fmt, args...) \
 	do {								\
 		struct dma_fence *__ff = (f);				\
-		pr_err("f %llu#%u: " fmt, __ff->context, __ff->seqno,	\
+		pr_err("f %llu#%llu: " fmt, __ff->context, __ff->seqno,	\
 			##args);					\
 	} while (0)
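A worked example (not from the patch) of the mixed comparison logic in __dma_fence_is_later() above:

/* Both values fit in 32 bits: the wrap-around compare is used.
 * (int)(0x00000001 - 0xffffffff) == 2 > 0, so seqno 1 issued just
 * after a wrap is still "later" than 0xffffffff:
 */
__dma_fence_is_later(0x00000001ULL, 0xffffffffULL);	/* true */

/* Upper 32 bits in use: a plain 64-bit compare is used instead. */
__dma_fence_is_later(0x100000000ULL, 0xffffffffULL);	/* true */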
 
diff --git a/include/linux/hdmi.h b/include/linux/hdmi.h
index d2bacf502429..927ad6451105 100644
--- a/include/linux/hdmi.h
+++ b/include/linux/hdmi.h
@@ -27,6 +27,21 @@
 #include <linux/types.h>
 #include <linux/device.h>
 
+enum hdmi_packet_type {
+	HDMI_PACKET_TYPE_NULL = 0x00,
+	HDMI_PACKET_TYPE_AUDIO_CLOCK_REGEN = 0x01,
+	HDMI_PACKET_TYPE_AUDIO_SAMPLE = 0x02,
+	HDMI_PACKET_TYPE_GENERAL_CONTROL = 0x03,
+	HDMI_PACKET_TYPE_ACP = 0x04,
+	HDMI_PACKET_TYPE_ISRC1 = 0x05,
+	HDMI_PACKET_TYPE_ISRC2 = 0x06,
+	HDMI_PACKET_TYPE_ONE_BIT_AUDIO_SAMPLE = 0x07,
+	HDMI_PACKET_TYPE_DST_AUDIO = 0x08,
+	HDMI_PACKET_TYPE_HBR_AUDIO_STREAM = 0x09,
+	HDMI_PACKET_TYPE_GAMUT_METADATA = 0x0a,
+	/* + enum hdmi_infoframe_type */
+};
+
 enum hdmi_infoframe_type {
 	HDMI_INFOFRAME_TYPE_VENDOR = 0x81,
 	HDMI_INFOFRAME_TYPE_AVI = 0x82,
diff --git a/include/linux/mfd/intel_soc_pmic.h b/include/linux/mfd/intel_soc_pmic.h
index ed1dfba5e5f9..bfecd6bd4990 100644
--- a/include/linux/mfd/intel_soc_pmic.h
+++ b/include/linux/mfd/intel_soc_pmic.h
@@ -26,4 +26,7 @@ struct intel_soc_pmic {
 	struct device *dev;
 };
 
+int intel_soc_pmic_exec_mipi_pmic_seq_element(u16 i2c_address, u32 reg_address,
+					      u32 value, u32 mask);
+
 #endif	/* __INTEL_SOC_PMIC_H__ */
diff --git a/include/trace/events/host1x.h b/include/trace/events/host1x.h
index a37ef73092e5..3d340b6f1ea3 100644
--- a/include/trace/events/host1x.h
+++ b/include/trace/events/host1x.h
@@ -80,6 +80,32 @@ TRACE_EVENT(host1x_cdma_push,
 		__entry->name, __entry->op1, __entry->op2)
 );
 
+TRACE_EVENT(host1x_cdma_push_wide,
+	TP_PROTO(const char *name, u32 op1, u32 op2, u32 op3, u32 op4),
+
+	TP_ARGS(name, op1, op2, op3, op4),
+
+	TP_STRUCT__entry(
+		__field(const char *, name)
+		__field(u32, op1)
+		__field(u32, op2)
+		__field(u32, op3)
+		__field(u32, op4)
+	),
+
+	TP_fast_assign(
+		__entry->name = name;
+		__entry->op1 = op1;
+		__entry->op2 = op2;
+		__entry->op3 = op3;
+		__entry->op4 = op4;
+	),
+
+	TP_printk("name=%s, op1=%08x, op2=%08x, op3=%08x op4=%08x",
+		__entry->name, __entry->op1, __entry->op2, __entry->op3,
+		__entry->op4)
+);
+
 TRACE_EVENT(host1x_cdma_push_gather,
 	TP_PROTO(const char *name, struct host1x_bo *bo,
 			u32 words, u32 offset, void *cmdbuf),
diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
index be84e43c1e19..4a53f6cfa034 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -272,13 +272,14 @@ union drm_amdgpu_vm {
 
 /* sched ioctl */
 #define AMDGPU_SCHED_OP_PROCESS_PRIORITY_OVERRIDE	1
+#define AMDGPU_SCHED_OP_CONTEXT_PRIORITY_OVERRIDE	2
 
 struct drm_amdgpu_sched_in {
 	/* AMDGPU_SCHED_OP_* */
 	__u32	op;
 	__u32	fd;
 	__s32	priority;
-	__u32	flags;
+	__u32   ctx_id;
 };
 
 union drm_amdgpu_sched {
@@ -523,6 +524,7 @@ struct drm_amdgpu_gem_va {
 #define AMDGPU_CHUNK_ID_SYNCOBJ_IN      0x04
 #define AMDGPU_CHUNK_ID_SYNCOBJ_OUT     0x05
 #define AMDGPU_CHUNK_ID_BO_HANDLES      0x06
+#define AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES	0x07
 
 struct drm_amdgpu_cs_chunk {
 	__u32		chunk_id;
@@ -565,6 +567,11 @@ union drm_amdgpu_cs {
  * caches (L2/vL1/sL1/I$). */
 #define AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE (1 << 3)
 
+/* Set GDS_COMPUTE_MAX_WAVE_ID = DEFAULT before PACKET3_INDIRECT_BUFFER.
+ * This will reset wave ID counters for the IB.
+ */
+#define AMDGPU_IB_FLAG_RESET_GDS_MAX_WAVE_ID (1 << 4)
+
 struct drm_amdgpu_cs_chunk_ib {
 	__u32 _pad;
 	/** AMDGPU_IB_FLAG_* */
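A hedged userspace sketch of the new context-scoped override, assuming libdrm's drmCommandWrite() and the pre-existing DRM_AMDGPU_SCHED ioctl; the priority value is illustrative:

#include <xf86drm.h>
#include <amdgpu_drm.h>

static int override_ctx_priority(int master_fd, int target_fd, __u32 ctx_id)
{
	union drm_amdgpu_sched args = {
		.in = {
			.op = AMDGPU_SCHED_OP_CONTEXT_PRIORITY_OVERRIDE,
			.fd = target_fd,
			.priority = AMDGPU_CTX_PRIORITY_HIGH,
			.ctx_id = ctx_id,	/* new field, replaces flags */
		},
	};

	return drmCommandWrite(master_fd, DRM_AMDGPU_SCHED,
			       &args, sizeof(args));
}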
diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
index 0b44260a5ee9..bab20298f422 100644
--- a/include/uapi/drm/drm_fourcc.h
+++ b/include/uapi/drm/drm_fourcc.h
@@ -196,6 +196,27 @@ extern "C" {
 #define DRM_FORMAT_NV42		fourcc_code('N', 'V', '4', '2') /* non-subsampled Cb:Cr plane */
 
 /*
+ * 2 plane YCbCr MSB aligned
+ * index 0 = Y plane, [15:0] Y:x [10:6] little endian
+ * index 1 = Cr:Cb plane, [31:0] Cr:x:Cb:x [10:6:10:6] little endian
+ */
+#define DRM_FORMAT_P010		fourcc_code('P', '0', '1', '0') /* 2x2 subsampled Cr:Cb plane 10 bits per channel */
+
+/*
+ * 2 plane YCbCr MSB aligned
+ * index 0 = Y plane, [15:0] Y:x [12:4] little endian
+ * index 1 = Cr:Cb plane, [31:0] Cr:x:Cb:x [12:4:12:4] little endian
+ */
+#define DRM_FORMAT_P012		fourcc_code('P', '0', '1', '2') /* 2x2 subsampled Cr:Cb plane 12 bits per channel */
+
+/*
+ * 2 plane YCbCr MSB aligned
+ * index 0 = Y plane, [15:0] Y little endian
+ * index 1 = Cr:Cb plane, [31:0] Cr:Cb [16:16] little endian
+ */
+#define DRM_FORMAT_P016		fourcc_code('P', '0', '1', '6') /* 2x2 subsampled Cr:Cb plane 16 bits per channel */
+
+/*
  * 3 plane YCbCr
  * index 0: Y plane, [7:0] Y
  * index 1: Cb plane, [7:0] Cb
@@ -238,6 +259,8 @@ extern "C" {
 #define DRM_FORMAT_MOD_VENDOR_VIVANTE 0x06
 #define DRM_FORMAT_MOD_VENDOR_BROADCOM 0x07
 #define DRM_FORMAT_MOD_VENDOR_ARM     0x08
+#define DRM_FORMAT_MOD_VENDOR_ALLWINNER 0x09
+
 /* add more to the end as needed */
 
 #define DRM_FORMAT_RESERVED	      ((1ULL << 56) - 1)
@@ -572,6 +595,9 @@ extern "C" {
  * AFBC has several features which may be supported and/or used, which are
  * represented using bits in the modifier. Not all combinations are valid,
  * and different devices or use-cases may support different combinations.
+ *
+ * Further information on the use of AFBC modifiers can be found in
+ * Documentation/gpu/afbc.rst
  */
 #define DRM_FORMAT_MOD_ARM_AFBC(__afbc_mode)	fourcc_mod_code(ARM, __afbc_mode)
 
@@ -581,10 +607,18 @@ extern "C" {
  * Indicates the superblock size(s) used for the AFBC buffer. The buffer
  * size (in pixels) must be aligned to a multiple of the superblock size.
 * The four least significant bits (LSBs) are reserved for the block size.
+ *
+ * Where one superblock size is specified, it applies to all planes of the
+ * buffer (e.g. 16x16, 32x8). When multiple superblock sizes are specified,
+ * the first applies to the Luma plane and the second applies to the Chroma
+ * plane(s). e.g. (32x8_64x4 means 32x8 Luma, with 64x4 Chroma).
+ * Multiple superblock sizes are only valid for multi-plane YCbCr formats.
  */
 #define AFBC_FORMAT_MOD_BLOCK_SIZE_MASK      0xf
 #define AFBC_FORMAT_MOD_BLOCK_SIZE_16x16     (1ULL)
 #define AFBC_FORMAT_MOD_BLOCK_SIZE_32x8      (2ULL)
+#define AFBC_FORMAT_MOD_BLOCK_SIZE_64x4      (3ULL)
+#define AFBC_FORMAT_MOD_BLOCK_SIZE_32x8_64x4 (4ULL)
 
 /*
  * AFBC lossless colorspace transform
@@ -644,6 +678,35 @@ extern "C" {
  */
 #define AFBC_FORMAT_MOD_SC      (1ULL <<  9)
 
+/*
+ * AFBC double-buffer
+ *
+ * Indicates that the buffer is allocated in a layout safe for front-buffer
+ * rendering.
+ */
+#define AFBC_FORMAT_MOD_DB      (1ULL << 10)
+
+/*
+ * AFBC buffer content hints
+ *
+ * Indicates that the buffer includes per-superblock content hints.
+ */
+#define AFBC_FORMAT_MOD_BCH     (1ULL << 11)
+
+/*
+ * Allwinner tiled modifier
+ *
+ * This tiling mode is implemented by the VPU found on all Allwinner platforms,
+ * codenamed sunxi. It is associated with a YUV format that uses either 2 or 3
+ * planes.
+ *
+ * With this tiling, the luminance samples are laid out in tiles representing
+ * 32x32 pixels and the chrominance samples in tiles representing 32x64 pixels.
+ * The pixel order in each tile is linear, and the tiles themselves are laid
+ * out linearly, both in row-major order.
+ */
+#define DRM_FORMAT_MOD_ALLWINNER_TILED fourcc_mod_code(ALLWINNER, 1)
+
 #if defined(__cplusplus)
 }
 #endif
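To make the modifier encoding concrete, a small sketch composing an AFBC modifier from the bits above; the feature combination is illustrative only, since validity is device-specific as noted:

#include <stdint.h>
#include <drm_fourcc.h>

/* 16x16 superblocks with the solid-color optimization enabled. */
static const uint64_t my_afbc_modifier =
	DRM_FORMAT_MOD_ARM_AFBC(AFBC_FORMAT_MOD_BLOCK_SIZE_16x16 |
				AFBC_FORMAT_MOD_SC);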
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 298b2e197744..397810fa2d33 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1486,9 +1486,73 @@ struct drm_i915_gem_context_param {
 #define   I915_CONTEXT_MAX_USER_PRIORITY	1023 /* inclusive */
 #define   I915_CONTEXT_DEFAULT_PRIORITY		0
 #define   I915_CONTEXT_MIN_USER_PRIORITY	-1023 /* inclusive */
+	/*
+	 * When using the following param, value should be a pointer to
+	 * drm_i915_gem_context_param_sseu.
+	 */
+#define I915_CONTEXT_PARAM_SSEU		0x7
 	__u64 value;
 };
 
+/**
+ * Context SSEU programming
+ *
+ * It may be necessary for either functional or performance reasons to
+ * configure a context to run with a reduced number of SSEU (where SSEU
+ * stands for Slice/Sub-slice/EU).
+ *
+ * This is done by specifying the SSEU configuration through the below
+ * @struct drm_i915_gem_context_param_sseu for every supported engine which
+ * userspace intends to use.
+ *
+ * Not all GPUs or engines support this functionality, in which case an
+ * error code of -ENODEV will be returned.
+ *
+ * Also, the flexibility of possible SSEU configuration permutations varies
+ * between GPU generations and is subject to software-imposed limitations.
+ * Requesting an unsupported combination will return an error code of
+ * -EINVAL.
+ *
+ * NOTE: When perf/OA is active the context's SSEU configuration is ignored in
+ * favour of a single global setting.
+ */
+struct drm_i915_gem_context_param_sseu {
+	/*
+	 * Engine class & instance to be configured or queried.
+	 */
+	__u16 engine_class;
+	__u16 engine_instance;
+
+	/*
+	 * Unused for now. Must be cleared to zero.
+	 */
+	__u32 flags;
+
+	/*
+	 * Mask of slices to enable for the context. Valid values are a subset
+	 * of the bitmask value returned for I915_PARAM_SLICE_MASK.
+	 */
+	__u64 slice_mask;
+
+	/*
+	 * Mask of subslices to enable for the context. Valid values are a
+	 * subset of the bitmask value return by I915_PARAM_SUBSLICE_MASK.
+	 */
+	__u64 subslice_mask;
+
+	/*
+	 * Minimum/Maximum number of EUs to enable per subslice for the
+	 * context. min_eus_per_subslice must be less than or equal to
+	 * max_eus_per_subslice.
+	 */
+	__u16 min_eus_per_subslice;
+	__u16 max_eus_per_subslice;
+
+	/*
+	 * Unused for now. Must be cleared to zero.
+	 */
+	__u32 rsvd;
+};
+
 enum drm_i915_oa_format {
 	I915_OA_FORMAT_A13 = 1,	    /* HSW only */
 	I915_OA_FORMAT_A29,	    /* HSW only */
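A hedged userspace sketch of programming the new parameter; the mask and EU values are illustrative and subject to the -EINVAL/-ENODEV rules above, and libdrm's drmIoctl() is assumed:

#include <stdint.h>
#include <xf86drm.h>
#include <i915_drm.h>

static int context_set_sseu(int fd, __u32 ctx_id)
{
	struct drm_i915_gem_context_param_sseu sseu = {
		.engine_class = I915_ENGINE_CLASS_RENDER,
		.engine_instance = 0,
		.slice_mask = 0x1,		/* illustrative values */
		.subslice_mask = 0x7,
		.min_eus_per_subslice = 8,
		.max_eus_per_subslice = 8,
	};
	struct drm_i915_gem_context_param arg = {
		.ctx_id = ctx_id,
		.param = I915_CONTEXT_PARAM_SSEU,
		.size = sizeof(sseu),
		.value = (uintptr_t)&sseu,
	};

	return drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &arg);
}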
diff --git a/include/uapi/drm/nouveau_drm.h b/include/uapi/drm/nouveau_drm.h
index 259588a4b61b..9459a6e3bc1f 100644
--- a/include/uapi/drm/nouveau_drm.h
+++ b/include/uapi/drm/nouveau_drm.h
@@ -133,12 +133,63 @@ struct drm_nouveau_gem_cpu_fini {
 #define DRM_NOUVEAU_NOTIFIEROBJ_ALLOC  0x05 /* deprecated */
 #define DRM_NOUVEAU_GPUOBJ_FREE        0x06 /* deprecated */
 #define DRM_NOUVEAU_NVIF               0x07
+#define DRM_NOUVEAU_SVM_INIT           0x08
+#define DRM_NOUVEAU_SVM_BIND           0x09
 #define DRM_NOUVEAU_GEM_NEW            0x40
 #define DRM_NOUVEAU_GEM_PUSHBUF        0x41
 #define DRM_NOUVEAU_GEM_CPU_PREP       0x42
 #define DRM_NOUVEAU_GEM_CPU_FINI       0x43
 #define DRM_NOUVEAU_GEM_INFO           0x44
 
+struct drm_nouveau_svm_init {
+	__u64 unmanaged_addr;
+	__u64 unmanaged_size;
+};
+
+struct drm_nouveau_svm_bind {
+	__u64 header;
+	__u64 va_start;
+	__u64 va_end;
+	__u64 npages;
+	__u64 stride;
+	__u64 result;
+	__u64 reserved0;
+	__u64 reserved1;
+};
+
+#define NOUVEAU_SVM_BIND_COMMAND_SHIFT          0
+#define NOUVEAU_SVM_BIND_COMMAND_BITS           8
+#define NOUVEAU_SVM_BIND_COMMAND_MASK           ((1 << 8) - 1)
+#define NOUVEAU_SVM_BIND_PRIORITY_SHIFT         8
+#define NOUVEAU_SVM_BIND_PRIORITY_BITS          8
+#define NOUVEAU_SVM_BIND_PRIORITY_MASK          ((1 << 8) - 1)
+#define NOUVEAU_SVM_BIND_TARGET_SHIFT           16
+#define NOUVEAU_SVM_BIND_TARGET_BITS            32
+#define NOUVEAU_SVM_BIND_TARGET_MASK            0xffffffff
+
+/*
+ * Below is used to validate the ioctl argument; userspace can also use it to
+ * make sure that no bits are set beyond known fields for a given kernel
+ * version.
+ */
+#define NOUVEAU_SVM_BIND_VALID_BITS     48
+#define NOUVEAU_SVM_BIND_VALID_MASK     ((1ULL << NOUVEAU_SVM_BIND_VALID_BITS) - 1)
+
+
+/*
+ * NOUVEAU_SVM_BIND_COMMAND__MIGRATE: synchronously migrate to the target
+ * memory.
+ * result: number of pages successfully migrated to the target memory.
+ */
+#define NOUVEAU_SVM_BIND_COMMAND__MIGRATE               0
+
+/*
+ * NOUVEAU_SVM_BIND_HEADER_TARGET__GPU_VRAM: target the GPU VRAM memory.
+ */
+#define NOUVEAU_SVM_BIND_TARGET__GPU_VRAM               (1UL << 31)
+
+
+#define DRM_IOCTL_NOUVEAU_SVM_INIT           DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_SVM_INIT, struct drm_nouveau_svm_init)
+#define DRM_IOCTL_NOUVEAU_SVM_BIND           DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_SVM_BIND, struct drm_nouveau_svm_bind)
+
 #define DRM_IOCTL_NOUVEAU_GEM_NEW            DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_NEW, struct drm_nouveau_gem_new)
 #define DRM_IOCTL_NOUVEAU_GEM_PUSHBUF        DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_PUSHBUF, struct drm_nouveau_gem_pushbuf)
 #define DRM_IOCTL_NOUVEAU_GEM_CPU_PREP       DRM_IOW (DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_CPU_PREP, struct drm_nouveau_gem_cpu_prep)
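A small sketch (hypothetical helper, not part of the patch) of packing the header field for the new bind ioctl from the shift/mask macros above:

static __u64 nouveau_svm_bind_header(__u64 command, __u64 priority,
				     __u64 target)
{
	return (command  << NOUVEAU_SVM_BIND_COMMAND_SHIFT) |
	       (priority << NOUVEAU_SVM_BIND_PRIORITY_SHIFT) |
	       (target   << NOUVEAU_SVM_BIND_TARGET_SHIFT);
}

/* e.g. migrate a range to VRAM:
 *	bind.header = nouveau_svm_bind_header(
 *		NOUVEAU_SVM_BIND_COMMAND__MIGRATE, 0,
 *		NOUVEAU_SVM_BIND_TARGET__GPU_VRAM);
 */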
diff --git a/include/uapi/drm/v3d_drm.h b/include/uapi/drm/v3d_drm.h
index 35c7d813c66e..ea70669d2138 100644
--- a/include/uapi/drm/v3d_drm.h
+++ b/include/uapi/drm/v3d_drm.h
@@ -52,6 +52,14 @@ extern "C" {
  *
  * This asks the kernel to have the GPU execute an optional binner
  * command list, and a render command list.
+ *
+ * The L1T, slice, L2C, L2T, and GCA caches will be flushed before
+ * each CL executes.  The VCD cache should be flushed (if necessary)
+ * by the submitted CLs.  The TLB writes are guaranteed to have been
+ * flushed by the time the render done IRQ happens, which is the
+ * trigger for out_sync.  Any dirtying of cachelines by the job (only
+ * possible using TMU writes) must be flushed by the caller using the
+ * CL's cache flush commands.
  */
 struct drm_v3d_submit_cl {
 	/* Pointer to the binner command list.
diff --git a/include/video/imx-ipu-v3.h b/include/video/imx-ipu-v3.h
index e582e8e7527a..b80b85f0d9d8 100644
--- a/include/video/imx-ipu-v3.h
+++ b/include/video/imx-ipu-v3.h
@@ -348,6 +348,7 @@ int ipu_prg_channel_configure(struct ipuv3_channel *ipu_chan,
 			      unsigned int axi_id,  unsigned int width,
 			      unsigned int height, unsigned int stride,
 			      u32 format, uint64_t modifier, unsigned long *eba);
+bool ipu_prg_channel_configure_pending(struct ipuv3_channel *ipu_chan);
 
 /*
  * IPU CMOS Sensor Interface (csi) functions