RESOLVED FIXED 88414
[NRWT] XvfbDriver should choose the next free display
https://bugs.webkit.org/show_bug.cgi?id=88414
Summary [NRWT] XvfbDriver should choose the next free display
Kristóf Kosztyó
Reported 2012-06-06 07:08:18 PDT
Now the XfvbDriver chose the next display id with these simple way: running_displays = len(Executive().running_pids(x_filter)) display_id = self._worker_number * 2 + running_displays But if we ran some parallel nrwt's they could choose an already existing display which can cause problems.
Attachments
WIP patch (1.13 KB, patch)
2012-06-06 07:25 PDT, Kristóf Kosztyó
no flags
WIP patch (4.14 KB, patch)
2012-06-21 06:37 PDT, Kristóf Kosztyó
no flags
proposed fix (4.99 KB, patch)
2012-06-22 08:00 PDT, Kristóf Kosztyó
no flags
draft patch (2.64 KB, patch)
2012-09-14 07:29 PDT, Kristóf Kosztyó
no flags
proposed fix (9.40 KB, patch)
2012-09-21 02:29 PDT, Kristóf Kosztyó
no flags
Kristóf Kosztyó
Comment 1 2012-06-06 07:25:24 PDT
Created attachment 146020 [details] WIP patch We've discussed with Ossy that maybe we could copy the method the way xvfb-run scripts choose the next unreserved display. It works locally but I want to do more tests.
Csaba Osztrogonác
Comment 2 2012-06-07 02:45:02 PDT
Comment on attachment 146020 [details] WIP patch View in context: https://bugs.webkit.org/attachment.cgi?id=146020&action=review > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:62 > - display_id = self._worker_number * 2 + running_displays > + display_id = next_free_id() > + print 'start new xvfb with %d id' % display_id > if pixel_tests: > display_id += 1 But what if next_free_id() + 1 isn't free?
Csaba Osztrogonác
Comment 3 2012-06-07 02:49:19 PDT
Comment on attachment 146020 [details] WIP patch View in context: https://bugs.webkit.org/attachment.cgi?id=146020&action=review > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:54 > + def next_free_id(): > + for i in range(99): > + if not os.path.exists('/tmp/.X%d-lock' % i): > + return i Hmm ... I think checking /tmp/.X%d-lock isn't enough ... On the machine where run 3 buildslave, I have only these lock files: .X0-lock, .X90-lock . But there are three xvfb-run.XXXXXX directories with Xauthority files contain the following numbers: 84, 39, 90. (These numbers were passed to xvfb-run when we started the bots.) And there are two sockets in /tmp/.X11-unix dir: X0 and X90.
Kristóf Kosztyó
Comment 4 2012-06-19 00:41:08 PDT
Kristóf Kosztyó
Comment 5 2012-06-19 00:42:37 PDT
(In reply to comment #4) > Committed r120688: <http://trac.webkit.org/changeset/120688> Temporarily we disable the xvfb driver on qt
Csaba Osztrogonác
Comment 6 2012-06-21 04:07:21 PDT
Is there any progression?
Kristóf Kosztyó
Comment 7 2012-06-21 06:32:34 PDT
(In reply to comment #6) > Is there any progression? Yes there is. Now I check the first free xvfb id with the check of the already existing .X#-lock files. After when I found the first free one I lock this number with the webkitpy's FileLock. At locally it works fine also when I run more than one NRWT in parallel.
Kristóf Kosztyó
Comment 8 2012-06-21 06:37:37 PDT
Created attachment 148786 [details] WIP patch Work in progress patch with the FileLock. I would like to do some tests with it, but if everything go well I will upload the final patch.
Peter Gal
Comment 9 2012-06-22 02:25:42 PDT
View in context: https://bugs.webkit.org/attachment.cgi?id=148786&action=review > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:71 > + environment['DUMPRENDERTREE_TEMP'] = str(self._driver_tempdir) I did a little digging: Where will be this _driver_tempdir be set to something? In the parent class it'll be set in the _start method, but we override it here. Or am I missing something here?
Kristóf Kosztyó
Comment 10 2012-06-22 04:27:55 PDT
(In reply to comment #9) > View in context: https://bugs.webkit.org/attachment.cgi?id=148786&action=review > > > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:71 > > + environment['DUMPRENDERTREE_TEMP'] = str(self._driver_tempdir) > > I did a little digging: Where will be this _driver_tempdir be set to something? In the parent class it'll be set in the _start method, but we override it here. Or am I missing something here? You're right I missed out that.
Kristóf Kosztyó
Comment 11 2012-06-22 08:00:34 PDT
Created attachment 149029 [details] proposed fix
Dirk Pranke
Comment 12 2012-06-22 14:53:53 PDT
Comment on attachment 149029 [details] proposed fix View in context: https://bugs.webkit.org/attachment.cgi?id=149029&action=review the patch basically looks fine with a couple nits ... I'll leave cq? in case you want to change things. > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:53 > + _guard_lock_file = self._port._filesystem.join('/tmp', 'WebKitXvfb.lock.%i' % i) nit: there's no reason to use an underscore in front of the name. I'd probably just combine lines 53 and 54 ... > Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver.py:58 > + self.display_id = next_free_id() should this be self._display_id to indicate that it's a private variable?
Kristóf Kosztyó
Comment 13 2012-06-26 00:58:20 PDT
Thank you. I commited this in r121236: <http://trac.webkit.org/changeset/121236> if everything go well on the bots I will close the bug.
Kristóf Kosztyó
Comment 14 2012-06-26 01:59:06 PDT
(In reply to comment #13) > Thank you. > I commited this in r121236: <http://trac.webkit.org/changeset/121236> if everything go well on the bots I will close the bug. Rollouted in r121239 because it produced two type of error: * start the xvfb with the same display * or everything went fine until it starts getting warnings from the file lock and the tests are timed out or crashed
Csaba Osztrogonác
Comment 15 2012-07-12 03:29:50 PDT
Comment on attachment 149029 [details] proposed fix It is obsolete patch.
Csaba Osztrogonác
Comment 16 2012-08-23 03:23:15 PDT
After Balázs's patch - https://trac.webkit.org/changeset/124581 - NRWT doesn't run two DRT for non-pixel and pixel tests, so XVFBDriver's display handling should be refactored with this fix too.
Csaba Osztrogonác
Comment 17 2012-08-23 07:36:17 PDT
(In reply to comment #14) > (In reply to comment #13) > > Thank you. > > I commited this in r121236: <http://trac.webkit.org/changeset/121236> if everything go well on the bots I will close the bug. > > Rollouted in r121239 because it produced two type of error: > * start the xvfb with the same display > * or everything went fine until it starts getting warnings from the file lock and the tests are timed out or crashed I think I got the root of the problem. I saw the following failing webkitpy test on the Qt buildbot: Traceback (most recent call last): File "/ramdisk/qt-linux-release/build/Tools/Scripts/webkitpy/common/system/filesystem_unittest.py", line 220, in test_read_and_write_file text_contents = fs.read_text_file(binary_path) File "/ramdisk/qt-linux-release/build/Tools/Scripts/webkitpy/common/system/filesystem.py", line 213, in read_text_file with codecs.open(path, 'r', 'utf8') as f: File "/usr/lib/python2.6/codecs.py", line 881, in open file = __builtin__.open(filename, mode, buffering) IOError: [Errno 2] No such file or directory: '/tmp/tree_unittest_9TdIXB' [283/1600] webkitpy.common.system.file_lock_integrationtest.FileLockTest.test_lock_lifecycle passed ------------------------------------- It seems something is wrong with the FileLock class and it caused that NRWT started two driver on same display. Could you check it, please?
Zan Dobersek
Comment 18 2012-08-29 00:59:23 PDT
Instead of checking for the X locks in /tmp directory, how about listing all the current Xorg and Xvfb processes, parsing their commands to get the displays they are using and then using the first free display for the new Xvfb instance? Perhaps -nolock could be passed to new Xvfb processes as well to avoid Xvfb-spawned lock files, but I think using a FileLock to denote that Xvfb (spawned by NRWT) is using a specific display is a good idea.
Kristóf Kosztyó
Comment 19 2012-09-14 07:29:24 PDT
Created attachment 164148 [details] draft patch Now I check the next free display by the running processes. It seems more stable than the xvfb's lock files based approach. If this approach is good I will rewrite some unittest to fit this change.
Kristóf Kosztyó
Comment 20 2012-09-21 02:29:11 PDT
Created attachment 165091 [details] proposed fix It is process based choose of the next free display as the draft patch was but it contains unit tests :)
Kristóf Kosztyó
Comment 21 2012-09-28 06:57:54 PDT
Raphael Kubo da Costa (:rakuco)
Comment 22 2012-10-08 01:56:32 PDT
Do you guys have any idea why this keeps failing every once in a while? See, for example, <http://build.webkit.org/builders/EFL%20Linux%2064-bit%20Debug/builds/6866/steps/webkitpy-test/logs/stdio>: [1580/1585] webkitpy.layout_tests.port.xvfbdriver_unittest.XvfbDriverTest.test_next_free_display failed: Traceback (most recent call last): File "/home/buildslave-1/webkit-buildslave/efl-linux-64-debug/build/Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver_unittest.py", line 86, in test_next_free_display self.assertEqual(driver._next_free_display(), 2) AssertionError: 3 != 2
Zan Dobersek
Comment 23 2012-10-08 02:04:22 PDT
(In reply to comment #22) > Do you guys have any idea why this keeps failing every once in a while? > > See, for example, <http://build.webkit.org/builders/EFL%20Linux%2064-bit%20Debug/builds/6866/steps/webkitpy-test/logs/stdio>: > > [1580/1585] webkitpy.layout_tests.port.xvfbdriver_unittest.XvfbDriverTest.test_next_free_display failed: > Traceback (most recent call last): > File "/home/buildslave-1/webkit-buildslave/efl-linux-64-debug/build/Tools/Scripts/webkitpy/layout_tests/port/xvfbdriver_unittest.py", line 86, in test_next_free_display > self.assertEqual(driver._next_free_display(), 2) > AssertionError: 3 != 2 Set the bug #98346 to be blocking this one. As for the failure, it's strange. I'm pretty sure that if somehow there are any Xvfb processes left running after the NRWT is done, it still shouldn't affect this unit test as the mock executive is used. As much as I tried, though, I couldn't reproduce even one such failure locally.
Zan Dobersek
Comment 24 2012-10-09 12:39:05 PDT
Reopening so the work can continue.
Csaba Osztrogonác
Comment 25 2012-11-22 03:38:46 PST
As far as I know, xvfbdriver works fine, only the unit test is flakey because of stucked xvfbs. Am I right?
Zan Dobersek
Comment 26 2012-11-22 05:40:59 PST
(In reply to comment #25) > As far as I know, xvfbdriver works fine, only the unit test is flakey > because of stucked xvfbs. Am I right? As far as I can tell, that's the case. I think the main problem with the flaky unit tests is that the running Xvfb instances somehow affect them even though they shouldn't.
Zan Dobersek
Comment 27 2012-11-30 09:31:53 PST
Yup, I can reproduce this problem by running NRWT and test-webkitpy in parallel. I'll try to come up with a patch, but probably in another bug.
Zan Dobersek
Comment 28 2013-01-14 23:14:57 PST
(In reply to comment #27) > Yup, I can reproduce this problem by running NRWT and test-webkitpy in parallel. I'll try to come up with a patch, but probably in another bug. Handled in bug #103806, landed in r136314. http://trac.webkit.org/changeset/136314 Closing the bug.
Note You need to log in before you can comment on or make changes to this bug.