From patchwork Tue Jul 21 12:59:27 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John David Anglin <dave.anglin@bell.net>
X-Patchwork-Id: 11675655
To: linux-parisc <linux-parisc@vger.kernel.org>
Cc: Helge Deller <deller@gmx.de>,
        James Bottomley <James.Bottomley@HansenPartnership.com>
From: John David Anglin <dave.anglin@bell.net>
Subject: [PATCH v2] parisc: Various spin lock optimizations
Message-ID: <bf8a45cb-e32d-23c3-1da3-97f4c83a4b32@bell.net>
Date: Tue, 21 Jul 2020 08:59:27 -0400
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101
 Thunderbird/68.10.0
Content-Language: en-US
Sender: linux-parisc-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-parisc.vger.kernel.org>
X-Mailing-List: linux-parisc@vger.kernel.org

V2 adjusts the barrier used in arch_spin_unlock().

While investigating the stall problem, I looked closely at our spin lock implementation and found
a number of minor issues.

Regarding arch_spin_is_locked(), I wasn't convinced that the barrier was correct, so I switched the
code to use READ_ONCE().
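
For context, READ_ONCE() forces the compiler to emit exactly one, untorn load and not reuse a
previously cached value.  A rough sketch of the idea (not the kernel's actual macro, which also does
type checking; the name below is made up):

/* Roughly what READ_ONCE() boils down to for a scalar: a load through a
 * volatile-qualified pointer, which the compiler must emit exactly once
 * and may not tear, cache, or hoist out of a loop.
 */
#define read_once_sketch(x)	(*(const volatile __typeof__(x) *)&(x))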

Regarding arch_spin_lock(), cpu_relax() slightly pessimizes the loop code generated by gcc.  The pointer
"a" is volatile, so we can just use continue.

Regarding arch_spin_lock_flags(), I went back to the old code, which just toggles interrupts on and off
in the wait loop.  It's rather dangerous to let the routine set all the PSW flag bits, and weird
things happen if the local_save_flags() call is moved.
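
For reference, the flags value passed in is whatever the caller saved when it disabled interrupts, so
testing PSW_SM_I simply asks whether interrupts were enabled before the lock attempt.  A sketch of the
calling pattern (hypothetical wrapper name; the helpers are the usual kernel ones):

/* Sketch of the spin_lock_irqsave() pattern on parisc: flags holds the
 * PSW captured when interrupts were disabled, so a set PSW_SM_I bit
 * tells arch_spin_lock_flags() it may briefly re-enable interrupts
 * while spinning on a contended lock.
 */
static void lock_irqsave_sketch(arch_spinlock_t *lock)
{
	unsigned long flags;

	local_irq_save(flags);			/* interrupts off, old PSW in flags */
	arch_spin_lock_flags(lock, flags);	/* may toggle the I-bit while waiting */
	/* ... critical section ... */
	arch_spin_unlock(lock);
	local_irq_restore(flags);
}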

Regarding arch_spin_unlock(), we can use an ordered store to release the lock and eliminate the ldcw
barrier.
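
To spell out the release path: on PA-RISC 2.0 an ordered store guarantees that all earlier loads and
stores by this CPU are performed before the lock word is written, which is the release semantics the
ldcw barrier previously provided.  A commented sketch of the same store (my understanding is that
"stw,ma" with a zero displacement shares its encoding with the ordered form "stw,o" and is accepted by
older assemblers):

/* Release the ldcw lock by storing 1 (free) with release ordering.  The
 * "memory" clobber keeps the compiler from sinking critical-section
 * accesses below the store.
 */
static inline void unlock_sketch(volatile unsigned int *a)
{
	asm volatile("stw,ma %0,0(%1)" : : "r"(1), "r"(a) : "memory");
}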

Finally, regarding arch_spin_trylock(), I just shortened the C code.

Signed-off-by: Dave Anglin <dave.anglin@bell.net>

diff --git a/arch/parisc/include/asm/spinlock.h b/arch/parisc/include/asm/spinlock.h
index 70fecb8dc4e2..ad40b3a8e8e9 100644
--- a/arch/parisc/include/asm/spinlock.h
+++ b/arch/parisc/include/asm/spinlock.h
@@ -10,8 +10,7 @@
 static inline int arch_spin_is_locked(arch_spinlock_t *x)
 {
 	volatile unsigned int *a = __ldcw_align(x);
-	smp_mb();
-	return *a == 0;
+	return READ_ONCE(*a) == 0;
 }

 static inline void arch_spin_lock(arch_spinlock_t *x)
@@ -21,22 +20,21 @@ static inline void arch_spin_lock(arch_spinlock_t *x)
 	a = __ldcw_align(x);
 	while (__ldcw(a) == 0)
 		while (*a == 0)
-			cpu_relax();
+			continue;
 }

 static inline void arch_spin_lock_flags(arch_spinlock_t *x,
-					 unsigned long flags)
+					  unsigned long flags)
 {
 	volatile unsigned int *a;
-	unsigned long flags_dis;

 	a = __ldcw_align(x);
 	while (__ldcw(a) == 0) {
-		local_save_flags(flags_dis);
-		local_irq_restore(flags);
 		while (*a == 0)
-			cpu_relax();
-		local_irq_restore(flags_dis);
+			if (flags & PSW_SM_I) {
+				local_irq_enable();
+				local_irq_disable();
+			}
 	}
 }
 #define arch_spin_lock_flags arch_spin_lock_flags
@@ -46,23 +44,15 @@ static inline void arch_spin_unlock(arch_spinlock_t *x)
 	volatile unsigned int *a;

 	a = __ldcw_align(x);
-#ifdef CONFIG_SMP
-	(void) __ldcw(a);
-#else
-	mb();
-#endif
-	*a = 1;
+	asm volatile("stw,ma %0,0(%1)" : : "r"(1), "r"(a) : "memory");
 }

 static inline int arch_spin_trylock(arch_spinlock_t *x)
 {
 	volatile unsigned int *a;
-	int ret;

 	a = __ldcw_align(x);
-        ret = __ldcw(a) != 0;
-
-	return ret;
+	return __ldcw(a) != 0;
 }

 /*
