BUG: using delegate with transformer | AttributeError: 'NoneType' object has no attribute 'c_void_p' #758
Comments
Anyone, please?
Hi. I am not able to access the link to your model file. One suggestion, though, is to see whether the model loads another way, such as:
Sorry, the file was deleted by mistake. I'll upload it again, but in the meantime you can use https://drive.google.com/file/d/13dJeJgs2l562YnOWph4lku650rzymUnA/view?usp=drive_link which is the same model. The only difference is that it has dynamic inputs, which will throw a warning, but the crash is the same as above (so the crash is not caused by the dynamic inputs; I have tried it with fixed inputs and get the same crash). I have not been able to obtain anything newer than 2.14 for the tflite_runtime module using pip; I'll check their repo or compile it on my own (they might not have 2.15 for aarch64). I'll also check using ExecuteNetwork and report back.
Hi @tracyn-arm, I've tried with TensorFlow 2.15 and with the new (last week's) version of ARMNN.
Hi @federicoparra, ExecuteNetwork is available in the prebuilt binaries for the 24.02 release: https://github.com/ARM-software/armnn/releases/tag/v24.02 You can download and unzip the binary for your specific architecture and run the model with ExecuteNetwork (see the comment from @tracyn-arm above for the command). I will pick this up during the week to investigate the above error further. If possible, could you open access to the Google Drive link provided, as I am getting a permission denied error. Regards, Cathal.
Hi again, Unfortunately, we are not permitted to download models that are supplied to us privately. Is the model you supplied publicly available somewhere, so that we could download and use it? Regards, Cathal.
I compiled ARMNN myself using the scripts provided. Is there a script to compile ExecuteNetwork? My understanding is that your releases for Linux do not support the delegate (just the parser), and I want to use the delegate on Linux (my compiled version works well with most TFLite networks, just not with the one I'm talking about in this bug report). About the link, I apologize once more; I realize it wasn't shared publicly. Here's the link, it works now: https://drive.google.com/file/d/13dJeJgs2l562YnOWph4lku650rzymUnA/view?usp=sharing
The network is publicly available here https://huggingface.co/stabilityai/stablelm-2-zephyr-1_6b but it's a PyTorch network.
All prebuilt release binaries should have ExecuteNetwork included. For example, linux-aarch64 is here: https://github.com/ARM-software/armnn/releases/download/v24.02/ArmNN-linux-aarch64.tar.gz To build ArmNN manually, you can use the build tool (documentation here). Please use main, as there were some recent changes to the build tool. I have just realized that the build-armnn.sh script does not include ExecuteNetwork in the output tar file. I am going to create a patch to fix this. In the meantime you should be able to get ExecuteNetwork in the output Regards, Cathal.
Hello, I have created a patch, currently in review, to build ExecuteNetwork using the build tool. Please feel free to use it. 11282: Enable build of execute network in build tool. https://review.mlplatform.org/c/ml/armnn/+/11282
* Help with issue #758 Signed-off-by: Cathal Corbett <[email protected]> Change-Id: Ic9f4ff54e1e5a26b16c3d869815d09036ce5806c
Hi! I was finally able to run the model with ExecuteNetwork. This is the output:
$ LD_LIBRARY_PATH=. ./ExecuteNetwork -c CpuAcc GpuAcc -m ../models/stablelm16fixed.tflite -T delegate > output_file.txt
Changing the order of the backends (putting the GPU first):
$ LD_LIBRARY_PATH=. ./ExecuteNetwork -c GpuAcc CpuAcc -m ../models/stablelm16fixed.tflite -T delegate
@catcor01 @tracyn-arm Could it simply be that the ARMNN delegate doesn't support constant input tensors?
For GpuAcc: it looks like CONSTANT is not supported (in ArmNN or the delegate) because of the datatype, yet INT32 is one of the supported datatypes. I can see there are some INT64 operators in your model, so I am wondering whether any of your constants or inputs are INT64. I can see that INT64 input is not supported in ExecuteNetwork here: https://github.com/ARM-software/armnn/blob/branches/armnn_24_02/tests/ExecuteNetwork/TfliteExecutor.cpp#L180. I am going to follow up with the team and check whether adding INT64 to ExecuteNetwork is on our radar. That said, this seems contradictory: I would expect the same failure in the first test, but there the model seems to be parsed fine through the delegate. Running on only one backend might isolate the problem further. It may be that fallback from one backend to another is somehow allowing operators to be supported in the first case and not in the second.
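The suggestion to run on only one backend at a time could be scripted roughly as below. This is a minimal sketch, not part of the original thread: it assumes the `-c`, `-m` and `-T delegate` flags shown in the commands above, and the helper names `build_commands` and `run_isolated` are hypothetical.

```python
import subprocess


def build_commands(model_path, backends=("CpuAcc", "GpuAcc", "CpuRef"),
                   binary="./ExecuteNetwork"):
    """Return one ExecuteNetwork command per backend so each runs in isolation."""
    return [[binary, "-c", backend, "-m", model_path, "-T", "delegate"]
            for backend in backends]


def run_isolated(model_path, **kwargs):
    """Run each single-backend command; collect (backend, return code, output tail)."""
    results = []
    for cmd in build_commands(model_path, **kwargs):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        # Keep only the end of the combined output, where the failure usually is.
        tail = (proc.stdout + proc.stderr)[-500:]
        results.append((cmd[2], proc.returncode, tail))
    return results
```

The commands in this thread were launched with `LD_LIBRARY_PATH=.`, so `run_isolated` would need the same environment to find the Arm NN libraries.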
I went back to check whether the model included any int64 inputs or operators. Indeed it did, so I modified it to use int32 exclusively, never int64. This is the modified model: https://drive.google.com/file/d/1lLtmUjoTvvllNw6fR5Af0pct-agCJ3s5/view?usp=sharing
Unfortunately, this changed nothing:
Info: ArmNN v33.1.0
So it is still saying CONSTANT is not supported for INT32. The same is true when using just one backend (GPU or CPU), and the same is true when using ExecuteNetwork:
ExecuteNetwork -c GpuAcc -m ../wallE/stablelm16fixed.tflite -T delegate
Error: An error occurred when preparing the network workloads: StridedSliceQueueDescriptor: Tensor type is not supported.
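One way to double-check that no int64 tensors remain after such a conversion is to list the tensor dtypes that the interpreter reports. The sketch below is an illustration only; the helper names `find_int64_tensors` and `scan_model` are hypothetical, and it assumes the detail dictionaries have the `index`, `name` and `dtype` keys returned by `Interpreter.get_tensor_details()`.

```python
def find_int64_tensors(tensor_details):
    """Return (index, name) for every tensor whose dtype is int64.

    `tensor_details` is a list of dicts shaped like the output of
    Interpreter.get_tensor_details(): each has "index", "name" and "dtype".
    """
    hits = []
    for detail in tensor_details:
        dtype = detail["dtype"]
        # The dtype is normally a NumPy type such as numpy.int64; compare by name
        # so this helper does not need NumPy itself.
        dtype_name = getattr(dtype, "__name__", str(dtype))
        if dtype_name == "int64":
            hits.append((detail["index"], detail["name"]))
    return hits


def scan_model(model_path):
    """Load a model without any delegate and list its int64 tensors."""
    # Imported lazily so find_int64_tensors works without tflite_runtime installed.
    import tflite_runtime.interpreter as tflite
    interpreter = tflite.Interpreter(model_path=model_path)
    return find_int64_tensors(interpreter.get_tensor_details())
```

An empty list from `scan_model` would suggest the int64 tensors are gone, although it says nothing about how the delegate treats the remaining int32 constants.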
I also don't understand all these warnings: isn't ARMNN supposed to be compatible with all ops, falling back to the standard TensorFlow Lite implementation (i.e., CPU) when an op is not optimized? Please do test the model and help me.
As requested on #762 by @Colm-in-Arm, here is the license statement for my TFLite conversion of StableLM-2: I want to assert here that the converted TFLite model which is causing the bug described in this bug report, and which can be downloaded from the following link https://drive.google.com/file/d/1lLtmUjoTvvllNw6fR5Af0pct-agCJ3s5/view?usp=sharing, which is based on https://huggingface.co/stabilityai/stablelm-2-zephyr-1_6b, was shared here under license CC BY-NC-SA 4.0. I hope this allows you to experiment with the model and find out why it doesn't load using the ARMNN delegate (but does when using the interpreter without the delegate).
Hello Federico, I tried this model - it's big! Very big! I started with the changes I'd made for #762 , https://review.mlplatform.org/c/ml/armnn/+/11379 For CpuRef one additional change was required to get the model to execute:
I'll have to see if this is an appropriate change. We could instead fully implement the Signed64 Cast operation. As the model file is so large, I couldn't easily get it onto an Android device, but I did try it on an aarch64 Ubuntu device. It failed in a StridedSlice layer. I'll investigate that next. Colm.
Amazing, @Colm-in-Arm! Keep us posted if you get it to work past the StridedSlice layer error. I'll be using it on Ubuntu, so I'm happy that's how you are testing it!
Hello Federico, It only required one additional change to get past the StridedSlice error. However, as model loading continued, it started consuming massive amounts of memory, exceeding the 16 GB available on the device I was using and resulting in a SIGKILL. This happened with both CpuAcc and CpuRef. I don't have access to a device with more memory. Do you? The patch required to get this far is attached.
Hey! Here is the 8-bit version: https://drive.google.com/file/d/1uuuLCO_cD9cd2B06BnfAo0eq5KcJ0q_a/view?usp=sharing I don't know why the 16-bit version takes up so much RAM (in my own attempt just now, loading it in Google Colab took 13 GB). The 8-bit version takes very little RAM. As you posted on the other bug report, I'm assuming the 8-bit versions of TFLite models get accelerated by ARMNN, correct? I wonder whether the models run faster or slower than their 32-bit counterparts. Thank you!
Update @Colm-in-Arm: the 8-bit version I just shared with you above, with the changes in your patch, does load! I will be testing it in the next few days to see if inference works as expected, but at least the model does load :)
Running it on Orange Pi 5B
Python 3.11
tflite_runtime 2.14.0
Installation of ARMNN etc works fine with other models (example: runs fine with mirNET)
Link to tflite model that causes the error:
https://drive.google.com/file/d/1uX7sZn2idpQOqHwcLFhEvCmHhTJJlfPZ/view?usp=drive_link (3 GB download)
code snippet:
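import tensorflow as tf  # the traceback in the report comes from tensorflow/lite/python/interpreter.py

# Load the Arm NN delegate with the requested backends and trace logging.
armnn_delegate = tf.lite.experimental.load_delegate(
    library="/home/federico/Documents/code/ARM/aarch64_build/delegate/libarmnnDelegate.so",
    options={"backends": "CpuAcc,GpuAcc,CpuRef", "logging-severity": "trace"})
try:
    # Building the interpreter with the delegate is the step that fails for this model.
    model = tf.lite.Interpreter(model_path="../models/stablelm16.tflite", experimental_delegates=[armnn_delegate])
except Exception as e:
    print(f"An error occurred: {e}")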
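Since the thread reports that the model loads with the interpreter alone but not with the delegate, a loader that falls back to the stock interpreter can keep an application running while the delegate issue is investigated. The following is a minimal sketch under that assumption; `create_interpreter` is a hypothetical helper, and `make_interpreter` stands for any callable that accepts the `model_path` and `experimental_delegates` keyword arguments used in the snippet above.

```python
def create_interpreter(make_interpreter, model_path, delegate=None):
    """Try the delegate first, then fall back to the stock interpreter.

    Returns (interpreter, used_delegate).
    """
    if delegate is not None:
        try:
            interpreter = make_interpreter(model_path=model_path,
                                           experimental_delegates=[delegate])
            return interpreter, True
        except Exception as exc:
            # repr() keeps the exception type visible even when its message is
            # empty, which is what the bare "An error occurred:" line hides.
            print(f"Delegate path failed: {exc!r}; falling back to the stock interpreter")
    return make_interpreter(model_path=model_path), False
```

With TensorFlow installed this could be called as `create_interpreter(tf.lite.Interpreter, "../models/stablelm16.tflite", armnn_delegate)`; the fallback path runs without Arm NN acceleration.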
Report:
Info: ArmNN v33.1.0
arm_release_ver: g13p0-01eac0, rk_so_ver: 3
arm_release_ver of this libmali is 'g6p0-01eac0', rk_so_ver is '7'.
Info: Initialization time: 17.17 ms.
INFO: TfLiteArmnnDelegate: Requested unknown backend CpuAcc
INFO: TfLiteArmnnDelegate: Added backend GpuAcc
INFO: TfLiteArmnnDelegate: Requested unknown backend CpuRef
INFO: TfLiteArmnnDelegate: Created TfLite ArmNN delegate.
WARNING: ADD: not supported by armnn: in validate_arguments_with_arithmetic_rules src/gpu/cl/kernels/ClElementwiseKernel.cpp:160: ITensor data type S64 not supported by this kernel
WARNING: MINIMUM: not supported by armnn: in validate_arguments_with_arithmetic_rules src/gpu/cl/kernels/ClElementwiseKernel.cpp:160: ITensor data type S64 not supported by this kernel
WARNING: MAXIMUM: not supported by armnn: in validate_arguments_with_arithmetic_rules src/gpu/cl/kernels/ClElementwiseKernel.cpp:160: ITensor data type S64 not supported by this kernel
WARNING: BROADCAST_TO: not supported by armnn
WARNING: GATHER: not supported by armnn: in validate_arguments src/core/CL/kernels/CLGatherKernel.cpp:58: ITensor data type S64 not supported by this kernel
WARNING: GATHER: not supported by armnn: in validate_arguments src/core/CL/kernels/CLGatherKernel.cpp:58: ITensor data type S64 not supported by this kernel
WARNING: GATHER: not supported by armnn: in validate_arguments src/core/CL/kernels/CLGatherKernel.cpp:58: ITensor data type S64 not supported by this kernel
Info: ArmnnSubgraph creation
WARNING: CONSTANT: not supported by armnn: Unsupported DataType
An error occurred:
Exception ignored in: <function Delegate.__del__ at 0xffff2c7dc360>
Traceback (most recent call last):
File "/home/federico/miniconda3/envs/mlc-chat-venv/lib/python3.11/site-packages/tensorflow/lite/python/interpreter.py", line 110, in __del__
AttributeError: 'NoneType' object has no attribute 'c_void_p'
Info: Shutdown time: 2.70 ms.
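A long trace like the one above is easier to read once the "not supported by armnn" warnings are grouped by operator. The snippet below is a small sketch for doing that; the function name `summarize_unsupported` is hypothetical, and the pattern assumes the exact `WARNING: <OP>: not supported by armnn[: <reason>]` wording seen in this report.

```python
import re
from collections import Counter

# Matches e.g. "WARNING: GATHER: not supported by armnn: <optional reason>"
_UNSUPPORTED = re.compile(r"^WARNING: (\w+): not supported by armnn(?:: (.*))?$")


def summarize_unsupported(log_text):
    """Count unsupported-operator warnings and keep the first reason seen per operator."""
    counts = Counter()
    reasons = {}
    for line in log_text.splitlines():
        match = _UNSUPPORTED.match(line.strip())
        if match:
            op = match.group(1)
            reason = match.group(2) or ""
            counts[op] += 1
            reasons.setdefault(op, reason)
    return counts, reasons
```

Applied to the report above, this would separate the warnings that name the S64 data type (ADD, MINIMUM, MAXIMUM, GATHER) from BROADCAST_TO, which has no stated reason, and from CONSTANT, which reports "Unsupported DataType".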