[ php-wasm ] Add intl dynamic extension to @php-wasm/web #2591
Little summary :
I finally found out the resulting
- var specialHTMLTargets = [0, document, window];
+ var specialHTMLTargets = [0, typeof document != 'undefined' ? document : 0, typeof window != 'undefined' ? window : 0];
/** @suppress {duplicate } */
var findEventTarget = (target) => {
target = maybeCStringToJsString(target);
- var domElement = specialHTMLTargets[target] || document.querySelector(target);
+ var domElement = specialHTMLTargets[target] || (typeof document != 'undefined' ? document.querySelector(target) : null);
return domElement;
};
And running
|
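For anyone who wants to check the guard from the diff above outside a browser, here is a minimal sketch. `makeFindEventTarget` and its `scope` parameter are hypothetical names introduced only for this illustration (the real glue code reads the globals directly and first converts the target with `maybeCStringToJsString`, which is skipped here).

```javascript
// Hypothetical helper mirroring the patched glue code: `scope` stands in for
// the global object, so the worker case (no document, no window) can be simulated.
function makeFindEventTarget(scope) {
  const doc = typeof scope.document != 'undefined' ? scope.document : 0;
  const win = typeof scope.window != 'undefined' ? scope.window : 0;
  const specialHTMLTargets = [0, doc, win];
  // Same fallback order as the diff: special target first, then a DOM query
  // only when a document actually exists.
  return (target) =>
    specialHTMLTargets[target] || (doc ? doc.querySelector(target) : null);
}

// Worker-like scope: nothing is defined, so every lookup resolves to null.
const inWorker = makeFindEventTarget({});

// Page-like scope with a stub document (an assumption, not a real DOM).
const fakeDocument = { querySelector: (selector) => `element:${selector}` };
const inPage = makeFindEventTarget({ document: fakeDocument, window: {} });

console.log(inWorker('#canvas'));
console.log(inPage('#canvas'));
```

The point of the sketch is only that the worker case returns `null` instead of throwing a `ReferenceError`, while the regular page case behaves as before.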
I had in mind to also try to add |
Good find! It should be fine as long as we're not breaking loading that script on a regular web page (not in a worker). And if we are breaking it, that may still be fine, but let's acknowledge that and discuss any consequences. |
Let's just use a specific E2E testing setup. I've tried that in the past in this repo and |
the
I implemented a I also managed to run the tests in JSPI and Asyncify separately by using |
I ended up only removing the
I tried to add the |
@adamziel That's it I think! The first full dynamic extension with its associated tests in Node, Web and Playground. I will clean up the old artifacts from static Intl and Playground CLI in the next pull request, to keep this one clean. Should I leave |
Currently, only |
Thank you @mho22! |
I had to upgrade |
🎉 |
I guess this is because of |
What does it add to the binary? Are those additions relevant? It worked without them earlier on and it was smaller despite shipping an additional php extension. Can we post process the wasm binary and remove the extra stuff? Or get emscripten to not include it in the first place? |
I'm currently listing the different steps and possibilities. I already tried
packages/php-wasm/web/public/php/jspi/8_3_25/php_8_3.wasm | Bin 24607385 -> 22006934 bytes
packages/php-wasm/web/public/php/jspi/php_8_3.js | Bin 577829 -> 161805 bytes
I'm still investigating. |
Ok. I found something interesting. First of all, why does
Based on this, I found two ways to decrease the wasm file size while adding the
But to be honest, these two options were not satisfying enough. I decided to benchmark the size of the wasm binary and its composition, with and without
I hope this will be readable. [ n.b. To build individually, GD needs LIBZIP, CURL needs OPENSSL and LIBZIP, and OPENSSL needs MBSTRING and MBREGEX. ]
Now using
Now a comparison between the increase based on the individual static extension :
Now the overall additional MB from all the extensions enabled equals
Why is
So, I don't know if this is possible right now, but I would like to suggest transforming
Summary: Current PHP.wasm web without
That sounds promising, right? |
It does sound promising, thank you for this great research!
The way I understand it, we only need the system libraries that are already shipped before this PR. Whatever makes up the additional FILEINFO-related 8MB or MAIN_MODULE-related 5MB can probably be slashed. It sounds like these are additional libraries loaded just in case some dynamic library tries to load them later on, but we know no dynamic library will need them because we're just splitting php.wasm into php.wasm + intl.so. The current system libraries are enough today, so they should also be enough for the dynamic library. Can we inspect the built wasm file for the functions it ships, diff that with the wasm file we have today, and just blanket remove all the additional functions? |
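A rough sketch of the "diff the two builds" idea, using the standard `WebAssembly.Module.exports()` API. Everything else is an assumption for illustration: `buildModuleExporting` only fabricates tiny stand-in binaries so the snippet runs on its own, and `addedExports` is a hypothetical helper name. Note that this only sees exported names; functions kept inside the binary without being exported would need a tool such as wasm-objdump.

```javascript
// Minimal wasm encoding helpers, only good for tiny demo modules.
const WASM_HEADER = [0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00];
const TYPE_SECTION = [0x01, 0x04, 0x01, 0x60, 0x00, 0x00]; // one type: () -> ()

function section(id, body) {
  if (body.length > 127) throw new Error('sketch only supports tiny sections');
  return [id, body.length, ...body];
}

function encodeName(str) {
  const bytes = Array.from(new TextEncoder().encode(str));
  return [bytes.length, ...bytes];
}

// Stand-in for a real build: a module exporting one empty function per name.
function buildModuleExporting(names) {
  const n = names.length;
  const functionSection = section(0x03, [n, ...names.map(() => 0x00)]);
  const exportSection = section(0x07, [
    n,
    ...names.flatMap((s, i) => [...encodeName(s), 0x00, i]),
  ]);
  const codeSection = section(0x0a, [n, ...names.flatMap(() => [0x02, 0x00, 0x0b])]);
  return new Uint8Array([
    ...WASM_HEADER, ...TYPE_SECTION, ...functionSection, ...exportSection, ...codeSection,
  ]);
}

function exportedNames(bytes) {
  const module = new WebAssembly.Module(bytes);
  return new Set(WebAssembly.Module.exports(module).map((e) => e.name));
}

// Names exported by `after` that `before` did not export.
function addedExports(before, after) {
  const old = exportedNames(before);
  return [...exportedNames(after)].filter((name) => !old.has(name)).sort();
}

const trunkBuild = buildModuleExporting(['zend_error', 'strlen']);
const mainModuleBuild = buildModuleExporting(['zend_error', 'strlen', 'extra_a', 'extra_b']);
console.log(addedExports(trunkBuild, mainModuleBuild));
```

With real artifacts, the two byte arrays would come from reading the `.wasm` files in Node instead of the stand-in builder.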
# Remove fileinfo if needed
RUN if [ "$WITH_FILEINFO" = "yes" ]; \
then \
echo -n ' --enable-fileinfo' >> /root/.php-configure-flags; \
+ rm /root/php-src/ext/fileinfo/data_file.c; \
+ echo -e 'const unsigned char php_magic_database[0] = {};' > /root/php-src/ext/fileinfo/data_file.c; \
else \
# light bundle should compile without fileinfo and libmagic
echo -n ' --disable-fileinfo' >> /root/.php-configure-flags; \
fi;
But, as you may understand the issue, even if
GLOBAL TO ALL PHP VERSIONS :
"_zend_string_init_interned", \n\
"___cxa_pure_virtual", \n\
"_executor_globals", \n\
"_std_object_handlers", \n\
"_zend_empty_string", \n\
"_timezone", \n\
"_tzname", \n\
"_zend_ce_aggregate", \n\
"__ZTVN10__cxxabiv120__si_class_type_infoE", \n\
"__ZTVN10__cxxabiv117__class_type_infoE", \n\
"__ZTVN10__cxxabiv121__vmi_class_type_infoE", \n\
"_zend_ce_exception", \n\
"_OnUpdateStringUnempty", \n\
"_OnUpdateLong", \n\
"_OnUpdateBool", \n\
"_zend_ini_boolean_displayer_cb", \n\
"_zend_ce_countable", \n\
"_zend_ce_iterator", \n\
"__ZNSt12length_errorD1Ev", \n\
"__ZTISt12length_error", \n\
"__ZTVSt12length_error", \n\
"__ZNSt20bad_array_new_lengthD1Ev", \n\
"__ZTISt20bad_array_new_length", \n\
"_zval_add_ref", \n\
"_free", \n\
"_object_init_ex", \n\
"__emalloc", \n\
"_object_properties_init", \n\
"_strlen", \n\
"_strstr", \n\
"_strchr", \n\
"__ZNSt3__211__call_onceERVmPvPFvS2_E", \n\
"__ZNSt3__25mutex4lockEv", \n\
"__ZNSt3__25mutex6unlockEv", \n\
"__ZNSt3__218condition_variable10notify_allEv", \n\
"_strcmp", \n\
"_strcpy", \n\
"_strncmp", \n\
"_strrchr", \n\
"_strncpy", \n\
"_getenv", \n\
"_setlocale", \n\
"_stat", \n\
"_open", \n\
"_mmap", \n\
"_close", \n\
"_memcmp", \n\
"_realloc", \n\
"_convert_to_double", \n\
"__safe_emalloc", \n\
"__efree", \n\
"_convert_to_long", \n\
"_zend_known_strings", \n\
"_zend_empty_array", \n\
"_compiler_globals", \n\
"_zend_add_attribute", \n\
"_zend_register_ini_entries", \n\
"_zend_register_ini_entries_ex", \n\
"_zend_declare_class_constant_long", \n\
"_zend_declare_class_constant_null", \n\
"_zend_declare_class_constant_string", \n\
"_zend_register_long_constant", \n\
"_zend_declare_class_constant_double", \n\
"_zend_register_internal_class_with_flags", \n\
"_zend_declare_typed_class_constant", \n\
"_zend_register_string_constant", \n\
"_zend_register_internal_class_ex", \n\
"_zend_unregister_ini_entries_ex", \n\
"_zend_throw_exception_ex", \n\
"_zend_strpprintf", \n\
"_zend_object_std_dtor", \n\
"_zend_object_std_init", \n\
"_zend_declare_class_constant_ex", \n\
"_zend_parse_method_parameters", \n\
"_zend_throw_error", \n\
"_zend_sort", \n\
"_zend_hash_sort_ex", \n\
"__zend_new_array_0", \n\
"_zend_hash_next_index_insert", \n\
"_zend_hash_update", \n\
"_zend_hash_index_update", \n\
"_zend_error", \n\
"_zend_parse_parameters", \n\
"_zend_replace_error_handling", \n\
"_zend_spprintf", \n\
"_zend_throw_exception", \n\
"_zend_restore_error_handling", \n\
"_zend_strtod", \n\
"_zend_wrong_parameters_none_error", \n\
"_zend_try_assign_typed_ref_long", \n\
"_zend_fcall_info_init", \n\
"_zend_array_destroy", \n\
"_zend_argument_value_error", \n\
"_zend_hash_str_find", \n\
"_zend_objects_clone_members", \n\
"_zend_call_function", \n\
"_zend_try_assign_typed_ref_str", \n\
"___zend_malloc", \n\
"_zend_alter_ini_entry", \n\
"_zend_argument_type_error", \n\
"_zend_hash_destroy", \n\
"_zend_memnstr_ex", \n\
"_zend_str_tolower", \n\
"_zend_create_internal_iterator_zval", \n\
"_zend_class_implements", \n\
"_zend_iterator_init", \n\
"_zend_update_property", \n\
"_zend_declare_typed_property", \n\
"_zend_wrong_parameters_count_error", \n\
"_zend_wrong_parameter_error", \n\
"_zend_parse_arg_str_slow", \n\
"_zend_parse_arg_long_slow", \n\
"_zend_parse_arg_str_or_long_slow", \n\
"_zend_release_fcall_info_cache", \n\
"_zend_try_assign_typed_ref_arr", \n\
"___zend_realloc", \n\
"_zend_objects_store_del", \n\
"_zend_get_gc_buffer_create", \n\
"_zend_get_gc_buffer_grow", \n\
"_zend_std_get_properties", \n\
"_zend_hash_index_find", \n\
"__zend_hash_init", \n\
"_zend_hash_str_update", \n\
"_zend_call_known_function", \n\
"_zend_std_compare_objects", \n\
"_zend_try_assign_typed_ref_bool", \n\
"_zend_hash_copy", \n\
"__zend_new_array", \n\
"_zend_argument_error", \n\
"_zend_argument_count_error", \n\
"_zend_call_method", \n\
ONLY PHP8.4 :
"_zend_register_internal_class_with_flags", \n\
"_zend_declare_typed_class_constant", \n\
ABOVE PHP8.0 :
"_zend_unregister_ini_entries_ex", \n\
"_zend_register_ini_entries_ex", \n\
ABOVE PHP7.4 :
"_zend_add_attribute", \n\
"_zend_argument_count_error", \n\
"_zend_argument_error", \n\
"_zend_argument_type_error", \n\
"_zend_argument_value_error", \n\
"_zend_call_known_function", \n\
"_zend_create_internal_iterator_zval", \n\
"_zend_get_gc_buffer_create", \n\
"_zend_get_gc_buffer_grow", \n\
"_zend_parse_arg_str_or_long_slow", \n\
"_zend_wrong_parameter_error", \n\
ABOVE PHP7.3 :
"_zend_declare_typed_property", \n\
"_zend_release_fcall_info_cache", \n\
"_zend_try_assign_typed_ref_arr", \n\
"_zend_try_assign_typed_ref_bool", \n\
"_zend_try_assign_typed_ref_long", \n\
"_zend_try_assign_typed_ref_str", \n\
ABOVE PHP7.2 :
"__zend_new_array", \n\
"__zend_new_array_0", \n\
"_object_init_ex", \n\
"_zend_empty_array", \n\
"_zend_hash_index_update", \n\
"_zend_hash_next_index_insert", \n\
"_zend_hash_str_update", \n\
"_zend_hash_update", \n\
"_zend_std_compare_objects", \n\
"_zend_string_init_interned", \n\
"_zend_wrong_parameters_none_error", \n\
"_zval_ptr_dtor", \n\
Setting
To be sure we have the correct amount of exported functions for
I would like to suggest doing this optimization in a follow-up pull request, since it will probably take some time to implement all the necessary exported functions for each PHP version.
As explained above, removing
So @adamziel, in summary, what do you think of these steps before merging this pull request?
|
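To make the per-version lists above a bit more concrete, here is one way the selection could be expressed. This is only a sketch under assumptions: the symbol names are copied from the lists above, but the grouping structure, the helper names, and the reading of "ABOVE PHP7.4" as "PHP 8.0 and later" (and so on for the other labels) are mine, not something the build currently does.

```javascript
// Hypothetical grouping. `since` is the first PHP version assumed to need the
// symbols; null means every supported version.
const SYMBOL_GROUPS = [
  { since: null, symbols: ['_zend_error', '_zend_hash_destroy'] },
  { since: [7, 3], symbols: ['__zend_new_array', '_zend_empty_array'] }, // "ABOVE PHP7.2"
  { since: [7, 4], symbols: ['_zend_declare_typed_property'] }, // "ABOVE PHP7.3"
  { since: [8, 0], symbols: ['_zend_add_attribute', '_zend_argument_error'] }, // "ABOVE PHP7.4"
  { since: [8, 1], symbols: ['_zend_register_ini_entries_ex'] }, // "ABOVE PHP8.0"
  { since: [8, 4], symbols: ['_zend_register_internal_class_with_flags'] }, // "ONLY PHP8.4"
];

function parseVersion(version) {
  const [major, minor] = version.split('.').map(Number);
  return [major, minor];
}

function atLeast(target, minimum) {
  return target[0] > minimum[0] || (target[0] === minimum[0] && target[1] >= minimum[1]);
}

// Collect the symbols a given PHP version would need to export.
function exportedFunctionsFor(version) {
  const target = parseVersion(version);
  return SYMBOL_GROUPS
    .filter((group) => group.since === null || atLeast(target, group.since))
    .flatMap((group) => group.symbols);
}

console.log(exportedFunctionsFor('7.2'));
console.log(exportedFunctionsFor('8.0'));
```

The resulting array could then be joined into whatever format the build expects for its exported functions setting.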
But why does it work today with the smaller |
Aha, I see the issue – we need to expose any function that the built |
I spent some time digging into why the
1. When we build php.wasm without
2. The
3. Through investigation, it appears that the linker and Emscripten optimizations remove most of the unused data. Specifically: Non-
Therefore, most of the 16MB constant ends up fully included in the wasm file, increasing its size by 8MB. This explains why the
I am stuck in a situation where that shrinking would be super useful, but I have no way to trigger it. I tried a lot of things and by modifying the
This is mostly right! But some missing functions make the tests crash too. I tested that with the PHP8.3 build, and almost every function I listed above is present, except these:
Every time I ran the Intl tests, it crashed because of one of the missing functions above. I had to compile again after adding the missing function, and so on until the tests passed. So in theory you're correct and the files can be analyzed to list the needed
FYI, there are 769 imports needed from the
I am not sure I understand what you mean by "only as an assertion". Did you mean, at the end of the |
I tried multiple options :
I came to the frustrating conclusion that we had two options left:
I hope you'll have better ideas and options than mine... |
Thank you @mho22! Let me think about that for a moment and follow up here |
Could wasm-split help us here? It seems to enable doing a static build and splitting it into the main module and the dynamic library after the fact. We don't need to support arbitrary future PHP extensions but only the set of extensions we're already building so perhaps we could use it to eliminate dead code first and split afterwards. Alternatively, perhaps there's a way we can do a full static build, list all the symbols left inside by the linker, and then do the MAIN_MODULE build and post-process it to remove the symbols that weren't there in the static build? |
The built dynamic library should have some imports and exports listed at the top (viewable via wasm2wat or wasm-objdump). Couldn't all the required imports be extracted from there? It would be weird if the dynamic library needed a PHP function and yet it didn't list that as an import. The list of required core PHP exports must be somewhere in the built artifact, either in the binary wasm file or the js file, because it, ultimately, tries to call those functions by name. |
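As a sketch of that idea: `WebAssembly.Module.imports()` lists what a module expects from its host, and `WebAssembly.Module.exports()` lists what the main module offers, so the gap between the two can be computed. The two builders below only fabricate tiny stand-in binaries so the snippet is runnable; `missingExports` and `toExportedFunctionsEntry` are hypothetical names, and I am assuming the library imports its functions from the `env` module. The one Emscripten fact relied on is that `EXPORTED_FUNCTIONS` entries are the symbol name with a leading underscore.

```javascript
const WASM_HEADER = [0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00];
const TYPE_SECTION = [0x01, 0x04, 0x01, 0x60, 0x00, 0x00]; // one type: () -> ()

function section(id, body) {
  if (body.length > 127) throw new Error('sketch only supports tiny sections');
  return [id, body.length, ...body];
}

function encodeName(str) {
  const bytes = Array.from(new TextEncoder().encode(str));
  return [bytes.length, ...bytes];
}

// Stand-in for intl.so: imports each name as a function from "env".
function buildModuleImporting(names) {
  const body = [
    names.length,
    ...names.flatMap((s) => [...encodeName('env'), ...encodeName(s), 0x00, 0x00]),
  ];
  return new Uint8Array([...WASM_HEADER, ...TYPE_SECTION, ...section(0x02, body)]);
}

// Stand-in for php.wasm: exports one empty function per name.
function buildModuleExporting(names) {
  const n = names.length;
  return new Uint8Array([
    ...WASM_HEADER,
    ...TYPE_SECTION,
    ...section(0x03, [n, ...names.map(() => 0x00)]),
    ...section(0x07, [n, ...names.flatMap((s, i) => [...encodeName(s), 0x00, i])]),
    ...section(0x0a, [n, ...names.flatMap(() => [0x02, 0x00, 0x0b])]),
  ]);
}

// Function imports of the library that the main module does not export.
function missingExports(libraryBytes, mainBytes) {
  const library = new WebAssembly.Module(libraryBytes);
  const main = new WebAssembly.Module(mainBytes);
  const provided = new Set(WebAssembly.Module.exports(main).map((e) => e.name));
  return WebAssembly.Module.imports(library)
    .filter((i) => i.kind === 'function' && !provided.has(i.name))
    .map((i) => i.name);
}

const toExportedFunctionsEntry = (name) => `_${name}`;

const library = buildModuleImporting(['zend_error', 'zend_hash_update', 'strlen']);
const main = buildModuleExporting(['zend_error', 'strlen']);
console.log(missingExports(library, main).map(toExportedFunctionsEntry));
```

Run against the real artifacts, the non-empty result would be the list of functions to add to the main module's exports, without the compile, crash, and repeat loop described earlier.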
Basically we want the same linking/dead code elimination outcome as when only the main module is built. It seems like Emscripten won't give us that by default, so we need to help it with other tools and, potentially, custom static analysis and rewriting of the generated build. If the linker can prune a large part of the |
I am thoroughly studying your suggestions. I am learning a lot of things related to emscripten and webassembly. I haven't found a solution yet. I created an issue on emscripten and will go back to Xdebug waiting for an answer. |
Motivation for the change, related issues
This is a pull request to dynamically load Intl in PHP.wasm Web.
Related issues and pull requests
Issues
Pull requests
intl dynamic extension to @php-wasm/node ASYNCIFY #2501 #2557
intl dynamic extension to @php-wasm/node JSPI #2501
intl extension #2187
Implementation details
MAIN_MODULE in node and webworker
to the [web] environment
ignore-lib-imports Vite plugin
wasm-feature-detect to simulate JSPI mode enabled based on Cypress ENV
Testing Instructions (or ideally a Blueprint)
CI
🧪 test-e2e-php-wasm-web-jspi
🧪 test-e2e-php-wasm-web-asyncify
Next steps