
JIT tracing - thundering herd causes lockup after opcache reset #11609

Closed

maaarghk opened this issue Jul 7, 2023 · 12 comments

Comments

@maaarghk
Contributor

maaarghk commented Jul 7, 2023

Description

We've spotted some bad behaviour around locking of the shared opcache memory on an 8.2 site (php-fpm). Unfortunately we can't reproduce it locally, only in production, so it's going to be difficult to get much more info than this - we've just left opcache disabled for now to avoid downtime in prod. I realise there has already been a big flamewar about whether the JIT code is actually going to see any fixes, and maybe we should just disable it, but I don't see any reason not to report a known issue in the meantime (it may be that a PR just needs to be reverted).

After calling opcache_reset() following a deploy, we occasionally end up in a situation where all php-fpm worker processes get stuck in a loop of locking and unlocking the SHM. The strace output looks like this:

20:53:12 fcntl(3</tmp/.ZendSem.Le8Hvi (deleted)>, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) = 0 <0.008294>
20:53:12 fcntl(3</tmp/.ZendSem.Le8Hvi (deleted)>, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) = 0 <0.000064>
20:53:12 fcntl(3</tmp/.ZendSem.Le8Hvi (deleted)>, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) = 0 <0.681358>
20:53:13 fcntl(3</tmp/.ZendSem.Le8Hvi (deleted)>, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) = 0 <0.000054>
20:53:13 fcntl(3</tmp/.ZendSem.Le8Hvi (deleted)>, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) = 0 <0.050460>
20:53:13 fcntl(3</tmp/.ZendSem.Le8Hvi (deleted)>, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) = 0 <0.000421>
20:53:13 fcntl(3</tmp/.ZendSem.Le8Hvi (deleted)>, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) = 0 <0.012670>
20:53:13 fcntl(3</tmp/.ZendSem.Le8Hvi (deleted)>, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) = 0 <0.000067>
20:53:13 fcntl(3</tmp/.ZendSem.Le8Hvi (deleted)>, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) = 0 <0.037775>
20:53:13 fcntl(3</tmp/.ZendSem.Le8Hvi (deleted)>, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) = 0 <0.000056>
20:53:13 fcntl(3</tmp/.ZendSem.Le8Hvi (deleted)>, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) = 0 <0.004498>
20:53:13 fcntl(3</tmp/.ZendSem.Le8Hvi (deleted)>, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) = 0 <0.000085>
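
For anyone not familiar with these locks: the pattern above is just a blocking byte-range write lock followed by an immediate unlock on opcache's (unlinked) /tmp/.ZendSem.* file. Here's a minimal standalone reproduction of that fcntl dance, purely to illustrate the syscalls - the temp file name is made up and this is obviously not opcache code:

/* standalone illustration of the lock/unlock pattern in the strace above;
 * not opcache code - just the same fcntl calls on a throwaway temp file */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	char path[] = "/tmp/.ZendSem.demoXXXXXX";
	int fd = mkstemp(path);
	if (fd < 0) { perror("mkstemp"); return 1; }
	unlink(path); /* opcache unlinks its lock file too, hence "(deleted)" in strace */

	struct flock lk = {
		.l_type = F_WRLCK, .l_whence = SEEK_SET, .l_start = 0, .l_len = 1
	};

	/* F_SETLKW: block until we hold the write lock on byte 0 */
	if (fcntl(fd, F_SETLKW, &lk) < 0) { perror("lock"); return 1; }

	/* ...critical section: this is where opcache would touch the SHM... */

	lk.l_type = F_UNLCK;
	/* F_SETLK: release immediately */
	if (fcntl(fd, F_SETLK, &lk) < 0) { perror("unlock"); return 1; }

	close(fd);
	return 0;
}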

After this occurred, I asked our hosting company to SIGABRT all the php-fpm worker processes to get coredumps. These aren't debug builds, but we do have symbols and I've been able to draw some conclusions.

99% of the processes have a backtrace like this:

(gdb) bt
#0  0x00007f0f7a90a974 in fcntl64 () from libc.so.6
#1  0x00007f0f79451129 in zend_shared_alloc_lock () from opcache.so
#2  0x00007f0f79458a98 in zend_jit_stop_persistent_op_array () from opcache.so
#3  0x00007f0f794df510 in zend_jit_trace_hot_root.part () from opcache.so
#4  0x000055fb59200647 in ?? () < note - seems to be inside jit memory map
#5  0x00007f0f6f117b30 in ?? ()
#6  0x000055fb4d269898 in ?? () < note - seems to be inside opcache memory map

and one process with a backtrace like this:

#0  0x00007f0f7a90f510 in ?? () from libc.so.6
#1  0x00007f0f7a90a944 in fcntl64 () from libc.so.6
#2  0x00007f0f7945119e in zend_shared_alloc_unlock () from opcache.so
#3  0x00007f0f794df5f0 in zend_jit_trace_hot_root.part () from opcache.so
#4  0x000055fb59200647 in ?? ()
#5  0x0000000000000001 in ?? ()
#6  0x000055fb53bed058 in ?? ()

I've taken a look in the core code and this does seem slightly wild:

within zend_jit_trace_hot_root:

	if (ZEND_JIT_TRACE_NUM >= JIT_G(max_root_traces)) {
		stop = ZEND_JIT_TRACE_STOP_TOO_MANY_TRACES;
		zend_jit_stop_counter_handlers();
		goto abort;
	}

zend_jit_stop_counter_handlers is a for loop in a for loop:

	for (uint32_t i = 0; i < ZCSG(hash).max_num_entries; i++) {
		zend_accel_hash_entry *cache_entry;
		for (cache_entry = ZCSG(hash).hash_table[i]; cache_entry; cache_entry = cache_entry->next) {
			zend_persistent_script *script;
			if (cache_entry->indirect) continue;
			script = (zend_persistent_script *)cache_entry->data;
			zend_jit_stop_persistent_script(script);
		}
	}

I'm certainly no expert (i.e. I've got no idea whatsoever) regarding the contents of ZCSG(hash), but if the outer loop runs over max_num_entries AND cache_entry->next is used to traverse a chain, it looks like there could be some redundant work here - could a cache_entry be visited more than once if it's pointed to by the next of some earlier cache_entry?

Anyway, the main issue: within this loop, zend_jit_stop_persistent_script is called -

static void zend_jit_stop_persistent_script(zend_persistent_script *script)
{
	zend_class_entry *ce;
	zend_op_array *op_array;

	zend_jit_stop_persistent_op_array(&script->script.main_op_array);

	ZEND_HASH_FOREACH_PTR(&script->script.function_table, op_array) {
		zend_jit_stop_persistent_op_array(op_array);
	} ZEND_HASH_FOREACH_END();

	ZEND_HASH_FOREACH_PTR(&script->script.class_table, ce) {
		ZEND_HASH_FOREACH_PTR(&ce->function_table, op_array) {
			if (op_array->type == ZEND_USER_FUNCTION) {
				zend_jit_stop_persistent_op_array(op_array);
			}
		} ZEND_HASH_FOREACH_END();
	} ZEND_HASH_FOREACH_END();
}

So here we have the main op array being stopped, followed by every entry in the script's function table, followed by every user-defined method in the class table. zend_jit_stop_persistent_op_array more or less wraps zend_jit_stop_hot_trace_counters, which locks the SHM, updates the function pointers back from JIT to VM, and then unlocks the SHM.

On our box, with 100 or so worker processes each entering this up-to-quadruply-nested foreach, and the innermost level of each doing a lock/unlock per op_array - potentially over files with many methods and functions - the end result is absolute chaos. The loop quickly tends towards taking minutes to complete as the number of queued F_SETLKW calls reaches the thousands. In the short strace snippet above you can see a 0.6s wait for a single lock, for example. Imagine that per function.
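
To put rough numbers on it (the figures below are invented for illustration, not measured - I don't know exactly how many op_arrays our cache holds):

/* back-of-envelope for the stampede after one opcache_reset();
 * the inputs are made-up round numbers, not measured values */
#include <stdio.h>

int main(void)
{
	long workers   = 100;   /* roughly pm.max_children in our pool */
	long op_arrays = 2000;  /* main op_arrays + functions + methods in the cache (a guess) */
	long fcntls    = 2;     /* one F_SETLKW + one F_SETLK per op_array */

	/* every worker that hits the max_root_traces branch repeats the full walk */
	printf("%ld fcntl round trips contending on one byte lock\n",
	       workers * op_arrays * fcntls);
	return 0;
}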

Notably, we seemed to trigger this whenever we flushed opcache, so possibly we're seeing some kind of thundering herd. I see the ZEND_JIT_TRACE_NUM >= JIT_G(max_root_traces) condition being checked; maybe there's a situation where several processes add a script to the JIT table at the same time and then all of them try to reverse out their own work as they fill it up? I'm really not familiar with JIT even as a concept, so I'm just guessing based on the fact that we were at max worker processes and under load whenever we saw this issue.

this nested foreach thing was introduced here: #9343 and an alternative solution was proposed here: #9964

@dstogov - the first was accepted but the second wasn't, the second having attracted the comment that it might cause "too many locks". It may be worth reconsidering both of the above together with this report to find a better solution, assuming that improvements and bugfixes are currently permitted on the JIT code.

PS: I don't believe this is a duplicate of any other issue I can find; unlike many other reports, we see no segfaults, just persistent lock waits.

PHP Version

8.2.5 - 8.2.7 / amd64 / Linux / FPM SAPI

@dstogov
Member

dstogov commented Jul 7, 2023

@maaarghk thanks for the report.

This should already be partially fixed by 5b033b0.

@iluuu1994, your fix could be significantly improved.

zend_jit_stop_persistent_script() needs to be called only once (by one process). It should set a flag in SHM, and the other processes should check this flag before and after taking the lock to prevent redundant work.

@dstogov
Member

dstogov commented Jul 7, 2023

@maaarghk disabling opcache is too conservative a solution; I recommend disabling just the JIT with opcache.jit=0.

@maaarghk
Contributor Author

maaarghk commented Jul 7, 2023

@dstogov Ah neat, I hadn't noticed that. Is the appropriate fix to add a bool to zend_accel_shared_globals?
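
i.e. roughly this shape? A toy sketch of the check-before-and-after-the-lock idea, with invented names - the bool stands in for a flag in zend_accel_shared_globals and the pthread mutex stands in for zend_shared_alloc_lock()/unlock(); it's not real php-src code:

/* toy model of "set a flag in SHM, check it before and after the lock" */
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t shm_lock = PTHREAD_MUTEX_INITIALIZER; /* stand-in for the SHM lock */
static bool jit_counters_stopped = false;                    /* stand-in for a bool in ZCSG */

static void stop_counter_handlers_once(void)
{
	if (jit_counters_stopped) {        /* cheap check before taking the lock */
		return;
	}
	pthread_mutex_lock(&shm_lock);
	if (!jit_counters_stopped) {       /* re-check now that we hold the lock */
		/* ...walk ZCSG(hash) once and restore the VM handlers... */
		jit_counters_stopped = true;
	}
	pthread_mutex_unlock(&shm_lock);
}

int main(void)
{
	/* in php-fpm this would race across ~100 workers; here it's a no-op demo */
	stop_counter_handlers_once();
	return 0;
}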

@dstogov
Member

dstogov commented Jul 7, 2023

I think so, yes, but let @iluuu1994 fix this. (I'm on vacation for the next two weeks.)

@iluuu1994
Member

@dstogov I'll have a look. Enjoy your vacation!

iluuu1994 added a commit to iluuu1994/php-src that referenced this issue Jul 7, 2023
Use atomic to prevent resetting the handlers from multiple processes or threads.

Fixes phpGH-11609
iluuu1994 added a commit to iluuu1994/php-src that referenced this issue Jul 7, 2023
Use atomic to prevent resetting handlers from multiple processes or threads.

Fixes phpGH-11609
@blackwolf12333

@maaarghk I just want to check whether we're seeing a similar issue in production: do you also see php-fpm logging that it's starting children which exit almost immediately, seemingly in a loop?

Otherwise I'll file a separate bug report, but it started happening when we switched over to PHP 8.2.

@maaarghk
Contributor Author

Believe it or not, I did the above debugging on a server where the hosting co doesn't give me easy access to basic things like the fpm error log - they had to generate the core dumps and strace for me - so I can't say conclusively. But it doesn't sound like it. The child processes here aren't broken as such, just locked up doing this memory dance. As far as I can tell they didn't die afterwards, they just went on to process the request normally, and I don't think they used enough memory to be killed by the OOM killer. There was, however, a supervisor process that would detect FPM not responding to requests and try to restart it - maybe that's what makes your logs look like that? Maybe post an excerpt.

@blackwolf12333

Hmm that doesn't really sound like the same issue then. I'll create a separate issue.

@iluuu1994
Member

@maaarghk Any luck checking if the newest version of PHP already mitigates this to some degree?

@maaarghk
Contributor Author

Hey @iluuu1994 - because I don't have root on the web server, I rely on a support team to intervene if I upgrade to the latest version and find that the issue still causes downtime. Combined with the fact that I only see this issue consistently on a production site, I'm not really in a position to experiment when I know that having the JIT turned off keeps the site stable. Sorry I can't be of more use.
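
(For what it's worth, "JIT turned off" here means opcache itself stayed enabled and we only disabled the JIT in the pool config, along the lines suggested above - something like:)

opcache.enable=1
opcache.jit=0
; a zero JIT buffer also keeps the JIT from ever starting
opcache.jit_buffer_size=0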

@maaarghk
Contributor Author

@iluuu1994 We decided to just YOLO it and re-enable the JIT in production after upgrading to 8.2.8, and it does appear to have mitigated the issue - I can't see an unusual number of failed requests since yesterday. It's hard to say conclusively, since the problem was exacerbated by traffic (it was only really bad enough to notice when every single fpm worker got caught in this loop while at max_children), but unless other reports come in from busier sites, it seems like your improvement can safely be targeted at 8.3. I'll follow up in a week or so.

@iluuu1994
Member

@maaarghk Great, thank you!

iluuu1994 added a commit to iluuu1994/php-src that referenced this issue Jul 27, 2023
Avoid stopping counters repeatedly from different threads/processes.

Fixes phpGH-11609
jorgsowa pushed a commit to jorgsowa/php-src that referenced this issue Aug 16, 2023
Avoid stopping counters repeatedly from different threads/processes.

Fixes phpGH-11609
Closes phpGH-11806