ROCm PyTorch unit tests status
Legend: N: unit test group name, T: Total tests, F: Failed, E: Errors, S: Skipped (ROCm only), SG: Skipped on GPUs, EF: Expected Failures, P: Passed, PR: Pass rate [P*100/(T-EF-SG)], CM: Comments/Modifications
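| N | T | F | E | S | SG | EF | P | PR | CM |
|---|---|---|---|---|---|---|---|---|---|
| test_autograd | 849 | 0 | 0 | 9 | ? | 0 | 840 | 99% | |
| test_c10d | | | | | | | | | Not enabled yet (ready for testing?) |
| test_cpp_extensions | | | | | | | | | Not enabled yet |
| test_cuda | 2036 | 0 | 0 | 554 | ? | 0 | 1482 | 73% | |
| test_dataloader | 44 | 0 | 0 | 2 | ? | 0 | 42 | 95% | |
| test_distributed | | | | | | | | | Not enabled yet |
| test_distributions | 176 | 0 | 0 | 36 | ? | 0 | 140 | 80% | |
| test_indexing | 46 | 0 | 0 | 0 | ? | 0 | 46 | 100% | |
| test_jit | 1198 | 0 | 0 | 50 | ? | 3 | 1145 | 96% | |
| test_legacy_nn | 416 | 0 | 0 | 13 | ? | 0 | 403 | 97% | |
| test_multiprocessing | | | | | | | | | Not enabled yet |
| test_nccl | | | | | | | | | Not enabled yet |
| test_nn | 1221 | 0 | 0 | 216 | ? | 2 | 1003 | 82% | |
| test_optim | 34 | 0 | 0 | 2 | ? | 0 | 32 | 94% | |
| test_sparse | 594 | 0 | 0 | 193 | ? | 0 | 401 | 68% | |
| test_torch | 384 | 0 | 0 | 37 | ? | 0 | 347 | 90% | |
| test_utils | | | | | | | | | Not enabled yet |
| TOTAL | 6998 | | | | ? | 5 | 5881 | 84% | |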
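The PR column follows the formula given in the legend, P*100/(T-EF-SG). A minimal sketch of that calculation is shown here; the function name `pass_rate` is hypothetical, and because SG is listed as "?" in the table, the example assumes SG = 0.

```python
def pass_rate(total, expected_failures, skipped_on_gpus, passed):
    """Return the pass rate in percent, following the legend: P*100/(T-EF-SG)."""
    denominator = total - expected_failures - skipped_on_gpus
    if denominator <= 0:
        raise ValueError("no tests left to count after removing EF and SG")
    return passed * 100 / denominator


# test_jit row: T=1198, EF=3, P=1145. SG is unknown ("?"), assumed 0 here.
print(round(pass_rate(1198, 3, 0, 1145)))  # prints 96
```

Note that tests skipped on ROCm only (S) stay in the denominator, so every skip listed in the sections that follow lowers the reported pass rate.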
- test_autograd
 
Skip due to seg fault:
test_pin_memory at aten/src/ATen/RegisterCUDA.cpp:30 (JMD: works for me)
test_set_requires_grad_only_for_floats_cuda
Skip due to undefined symbol hiprngMakeMTGP32Constants:
test_rnn_backward_to_input_but_not_parameters_cuda
test_requires_grad_factory (failed in CI)
Skip due to 'Memory access fault' (Failed in CI):
test_inputbuffer_add_multigpu
test_type_conversions
test_unused_output_gpu
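One way such skips could be recorded in a test file is a small decorator built on `unittest.skipIf`, so that each skip carries its reason. This is only a sketch, not the mechanism this repository necessarily uses: the environment variable `EXAMPLE_TEST_WITH_ROCM`, the helper `skip_if_rocm`, and the class `ExampleAutogradTest` are all hypothetical names.

```python
import os
import unittest

# Assumption: a hypothetical environment flag marks a ROCm test run.
RUNNING_ON_ROCM = os.environ.get("EXAMPLE_TEST_WITH_ROCM", "0") == "1"


def skip_if_rocm(reason):
    """Skip the decorated test on ROCm runs and record the reason."""
    return unittest.skipIf(RUNNING_ON_ROCM, "skipped on ROCm: " + reason)


class ExampleAutogradTest(unittest.TestCase):
    @skip_if_rocm("memory access fault")
    def test_type_conversions(self):
        # Placeholder body, not the real PyTorch test.
        self.assertEqual(1 + 1, 2)
```

Running the file with `python -m unittest` executes the test normally; with the flag set to `1`, the test is reported as skipped together with its reason, which makes it easier to keep a list like the one above in sync with the code.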
- test_dataloader
 
Skip due to hang:
test_manager_unclean_exit (due to leaked semaphores (?)) (JMD: according to comments, this seems to be a Python 2.7 issue)
- test_jit
 
Skip due to "RuntimeError: cannot compile a CUDA fusion group, CUDA is not enabled":
test_cpp
test_exp
test_fusion_distribute
test_lstm_fusion_concat
test_lstm_fusion_cuda
test_relu
test_tensor_number_math_cuda
test_comparison_ge_le
test_comparison_gt_lt
test_concat_fusion
test_ge_cuda
test_traced_module
JMD: will require us to enable CUDAFusionFunction which explicitly seems to call nvcc
- test_optim
 
Skip due to hang:
test_adamax (JMD: works for me but fails on CI)
test_rprop - hangs in a thrust kernel
- test_torch
 
Skip due to memory access page fault:
test_topk_noncontiguous_gpu (a null pointer seems to be passed to bitonicSortKVInPlace) (JMD: fixed through gather changes, in branch)
Skip due to seg fault:
test_half_tensor_cuda (due to build/aten/src/ATen/CUDAHalfType.cpp:2263)
test_print (due to build/aten/src/ATen/CUDAHalfType.cpp:151 fill)
Skip due to AssertionError:
test_norm_cuda (failing with "dim reduction failed for 0-norm")
Skip due to hang:
test_empty_full
Skip due to cublas runtime error:
test_blas_alpha_beta_empty
test_blas_empty
Skip due to RuntimeError:
test_pairwise_distance_empty (failing with "RuntimeError: cuda runtime error (1011) : hipErrorInvalidValue")
test_tensor_factories_empty (failing with "RuntimeError: cuda runtime error (1011) : hipErrorInvalidValue")
test_tensor_shape_empty (failing with "RuntimeError: cuda runtime error (1011) : hipErrorInvalidValue")
- test_cuda
 
Skip due to assertion error:
8 test_\*Tensor_nonzero (Thrust issue; gives correct result for <=960 threads)  
16 test_\*Tensor_prod\*dim + 16 test_\*Tensor_sum\*dim + 4 test_\*Tensor_norm_3\*dim (issue with kernelReduceContigDim and kernelReduceNoncontigDim_shared)  
40 test_\*Tensor_sort\* + 24 test_\*Tensor_topk\* (memory access fault in bitonicSortKVInPlace; fails with an assertion error instead when the access fault does not occur)  
2 test_DoubleTensor_mean\*dim (Native elementwise_kernel with div_constant_impl<double>)  
8 test_\*Tensor_mvlgamma\* (Native elementwise_kernel with div_add_impl<>)  
12 test_\*Tensor_renorm\* (THCTensor_kernel_renorm ?)  
Skip due to runtime error:
test_fft_ifft_rfft_irfft (due to undefined symbol: hipfftCreate)  
test_from_sequence + test_randperm_cuda (due to undefined symbol: \_ZN12_GLOBAL__N_112__float2halfEf)  
test_DoubleTensor_inverse + test_FloatTensor_inverse + test_btrifact + test_btrisolve (due to forced(?) rocblas internal error)  
test_events + test_caching_pinned_memory + test_record_stream (due to 'NoneType' object has no attribute 'cudaEventCreateWithFlags')  
test_streams (due to 'NoneType' object has no attribute 'cudaStreamQuery')  
test_nvtx (due to "undefined symbol: nvtxMarkA")  
test_bincount_cuda (due to hipErrorInvalidValue)  
test_trtrs + test_symeig + test_pinverse + test_matrix_rank + 2 test_gesv\* + test_det_logdet_slogdet + 12 test_(Float|Double)Tensor_svd\* + 8 test_(Float|Double)Tensor_qr\* + 2 test_(Float|Double)Tensor_geqrf + 2 test_(Float|Double)Tensor_eig_with_eigvec (due to no MAGMA library detected)  
12 test_HalfTensor_<addbmm* | addmm* | addr* | baddbmm*> (cublas runtime error in THCBlas.cu)
Skip due to hang:
2 test_FloatTensor_mean\*dim (Native elementwise_kernel with div_constant_impl<float>)  
4 test_\*Tensor_add + 4 test_\*Tensor_add_ + 4 test_\*Tensor_sub + 4 test_\*Tensor_sub_ (Native elementwise_kernel with add_kernel_impl; float, double, int and long tensor tests pass for these)  
10 test_\*Tensor_div\* (Native elementwise_kernel with div_constant_impl; double, int and long tensor tests pass for these)  
8 test_\*Tensor_mul\* (Native elementwise_kernel with mul_kernel_impl; float, double, int and long tensor tests pass for these)  
8 test_\*Tensor_put_ + test_broadcast (TensorPutOp bug)  
8 test_\*Tensor_take + 3 test_advancedindex\* + test_index + test_multinomial (TensorTakeOp bug)  
Skip due to undefined symbol (float2half and half2float):
3 test_HalfTensor_addbmm*
3 test_HalfTensor_addmm*
6 test_HalfTensor_addmv*
3 test_HalfTensor_addr
3 test_HalfTensor_baddbmm
4 test_HalfTensor_cum<prod|sum>
3 test_HalfTensor_dist*
10 test_HalfTensor_pow*
1 test_HalfTensor_max
1 test_HalfTensor_min
4 test_HalfTensor_norm*
6 test_HalfTensor_renorm*
1 test_tiny_half_norm_
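
Many entries above are shell-style wildcard patterns (for example `test_*Tensor_sort*`) prefixed with the number of tests they cover. A minimal sketch of how such a count could be checked against a list of test names, using the standard library's `fnmatch` module, follows. The helper `count_matching` and the test names in the list are hypothetical and for illustration only.

```python
from fnmatch import fnmatchcase


def count_matching(pattern, test_names):
    """Count the test names that match a shell-style wildcard pattern."""
    return sum(1 for name in test_names if fnmatchcase(name, pattern))


# Hypothetical test names, for illustration only.
names = [
    "test_FloatTensor_sort",
    "test_DoubleTensor_sort_dim",
    "test_FloatTensor_topk",
    "test_HalfTensor_norm",
]

print(count_matching("test_*Tensor_sort*", names))  # prints 2
```

`fnmatchcase` is used rather than `fnmatch` so that matching stays case-sensitive on every platform, which matters for names that differ only in capitalization.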