[PATCH 00/21] FIFO: Support multiple readers

Ken Brown kbrown@cornell.edu
Tue May 19 13:37:17 GMT 2020


On 5/19/2020 8:51 AM, Ken Brown via Cygwin-patches wrote:
> On 5/19/2020 2:15 AM, Takashi Yano via Cygwin-patches wrote:
>> On Tue, 19 May 2020 10:26:09 +0900
>> Takashi Yano via Cygwin-patches <cygwin-patches@cygwin.com> wrote:
>>> Hi Ken,
>>>
>>> On Mon, 18 May 2020 13:42:19 -0400
>>> Ken Brown via Cygwin-patches <cygwin-patches@cygwin.com> wrote:
>>>> Hi Takashi,
>>>>
>>>> On 5/18/2020 12:03 PM, Ken Brown via Cygwin-patches wrote:
>>>>> On 5/18/2020 1:36 AM, Takashi Yano via Cygwin-patches wrote:
>>>>>> On Mon, 18 May 2020 14:25:19 +0900
>>>>>> Takashi Yano via Cygwin-patches <cygwin-patches@cygwin.com> wrote:
>>>>>>> However, mc hangs by several operations.
>>>>>>>
>>>>>>> To reproduce this:
>>>>>>> 1. Start mc with 'env SHELL=tcsh mc -a'
>>>>>>
>>>>>> I mean 'env SHELL=/bin/tcsh mc -a'
>>>>>>
>>>>>>> 2. Select a file using up/down cursor keys.
>>>>>>> 3. Press F3 (View) key.
>>>>>
>>>>> Thanks for the report.  I can reproduce the problem and will look into it.
>>>>
>>>> I'm not convinced that this is a FIFO bug.  I tried two things.
>>>>
>>>> 1. I attached gdb to mc while it was hanging and got the following backtrace
>>>> (abbreviated):
>>>>
>>>> #1  0x00007ff901638037 in WaitForMultipleObjectsEx ()
>>>> #2  0x00007ff901637f1e in WaitForMultipleObjects ()
>>>> #3  0x0000000180048df5 in cygwait () at ...winsup/cygwin/cygwait.cc:75
>>>> #4  0x000000018019b1c0 in wait4 () at ...winsup/cygwin/wait.cc:80
>>>> #5  0x000000018019afea in waitpid () at ...winsup/cygwin/wait.cc:28
>>>> #6  0x000000018017d2d8 in pclose () at ...winsup/cygwin/syscalls.cc:4627
>>>> #7  0x000000018015943b in _sigfe () at sigfe.s:35
>>>> #8  0x000000010040d002 in get_popen_information () at filemanager/ext.c:561
>>>> [...]
>>>>
>>>> So pclose is blocking after calling waitpid.  As far as I can tell from looking
>>>> at backtraces of all threads, there are no FIFOs open.
>>>>
>>>> 2. I ran mc under strace (after exporting SHELL=/bin/tcsh), and I didn't see
>>>> anything suspicious involving FIFOs.  But I saw many EBADF errors from fstat 
>>>> and
>>>> close that don't appear to be related to FIFOs.
>>>>
>>>> So my best guess at this point is that the FIFO changes just exposed some
>>>> unrelated bug(s).
>>>>
>>>> Prior to the FIFO changes, mc would get an error when it tried to open 
>>>> tcsh_fifo
>>>> the second time, and it would then set
>>>>
>>>>     mc_global.tty.use_subshell = FALSE;
>>>>
>>>> see the mc source file subshell/common.c:1087.
>>>
>>> I looked into this problem and found pclose() stucks if FIFO
>>> is opened.
>>>
>>> Attached is a simple test case. It works under cygwin 3.1.4,
>>> but stucks at pclose() under cygwin git head.
>>>
>>> Isn't this a FIFO related issue?
>>
>> In the test case, fhandler_fifo::close() called from /bin/true
>> seems to get into infinite loop at:
>>
>> do
>> ...
>> while (inc_nreaders () > 0 && !found);
> 
> Thank you!  I see the problem.  After the popen call, the original 
> fhandler_fifo's fifo_reader_thread was no longer running, so it was unable to 
> take ownership.
> 
> It might take a little while for me to figure out how to fix this.

Actually, I think it's easy.  Please try the two attached patches.  The second 
one is the crucial one for the mc problem, but the first is something I noticed 
while working on this.

I just did a quick and dirty patch for testing purposes.  I still have to do a 
lot of cleanup and make sure the fix doesn't break something else.

Ken
-------------- next part --------------
>From bdc0bd172b09d386a2f15c41e7a764f48748b90c Mon Sep 17 00:00:00 2001
From: Ken Brown <kbrown@cornell.edu>
Date: Tue, 19 May 2020 09:03:59 -0400
Subject: [PATCH 1/2] Cygwin: FIFO: add missing reader_opening_unlock

---
 winsup/cygwin/fhandler_fifo.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
index e8a05dfbf..0dc356dfe 100644
--- a/winsup/cygwin/fhandler_fifo.cc
+++ b/winsup/cygwin/fhandler_fifo.cc
@@ -1047,6 +1047,7 @@ err_close_reader:
   saved_errno = get_errno ();
   close ();
   set_errno (saved_errno);
+  reader_opening_unlock ();
   return 0;
 err_close_cancel_evt:
   NtClose (cancel_evt);
-- 
2.21.0

-------------- next part --------------
>From ce60aa56a573e712ce5d212ca6aa0ddd9781be29 Mon Sep 17 00:00:00 2001
From: Ken Brown <kbrown@cornell.edu>
Date: Tue, 19 May 2020 09:28:30 -0400
Subject: [PATCH 2/2] Cygwin: FIFO: don't take ownership after exec

Proof of concept only.  Needs cleanup and further testing.

There's no need to transfer ownership after exec.  This will happen
automatically, if necessary, when the parent closes.  And canceling
the parent's fifo_reader_thread can cause problems, as reported here
for example:

https://cygwin.com/pipermail/cygwin-patches/2020q2/010235.html
---
 winsup/cygwin/fhandler_fifo.cc | 35 +++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/winsup/cygwin/fhandler_fifo.cc b/winsup/cygwin/fhandler_fifo.cc
index 0dc356dfe..79f83177f 100644
--- a/winsup/cygwin/fhandler_fifo.cc
+++ b/winsup/cygwin/fhandler_fifo.cc
@@ -1727,23 +1727,23 @@ fhandler_fifo::fixup_after_exec ()
       fc_handler = NULL;
       nhandlers = shandlers = 0;
 
-      /* Cancel parent's reader thread */
-      if (cancel_evt)
-	SetEvent (cancel_evt);
-      if (thr_sync_evt)
-	WaitForSingleObject (thr_sync_evt, INFINITE);
-
-      /* Take ownership if parent is owner. */
-      fifo_reader_id_t parent_fr = me;
-      me.winpid = GetCurrentProcessId ();
-      owner_lock ();
-      if (get_owner () == parent_fr)
-	{
-	  set_owner (me);
-	  if (update_my_handlers (true) < 0)
-	    api_fatal ("Can't update my handlers, %E");
-	}
-      owner_unlock ();
+      // /* Cancel parent's reader thread */
+      // if (cancel_evt)
+      // 	SetEvent (cancel_evt);
+      // if (thr_sync_evt)
+      // 	WaitForSingleObject (thr_sync_evt, INFINITE);
+
+      // /* Take ownership if parent is owner. */
+      // fifo_reader_id_t parent_fr = me;
+      // me.winpid = GetCurrentProcessId ();
+      // owner_lock ();
+      // if (get_owner () == parent_fr)
+      // 	{
+      // 	  set_owner (me);
+      // 	  if (update_my_handlers (true) < 0)
+      // 	    api_fatal ("Can't update my handlers, %E");
+      // 	}
+      // owner_unlock ();
       /* Close inherited cancel_evt and thr_sync_evt. */
       if (cancel_evt)
 	NtClose (cancel_evt);
@@ -1755,6 +1755,7 @@ fhandler_fifo::fixup_after_exec ()
 	api_fatal ("Can't create reader thread sync event during exec, %E");
       /* At this moment we're a new reader.  The count will be
 	 decremented when the parent closes. */
+      me.winpid = GetCurrentProcessId ();
       inc_nreaders ();
       new cygthread (fifo_reader_thread, this, "fifo_reader", thr_sync_evt);
     }
-- 
2.21.0



More information about the Cygwin-patches mailing list