Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clinical data bin counts service not returning correct result on (some) mskcc data #10960

Open
alisman opened this issue Aug 27, 2024 · 0 comments
Assignees
Labels

Comments

@alisman
Copy link
Contributor

alisman commented Aug 27, 2024

if you run the following query against the new clickhouse mskcc database, you'll see it returns 0 when it should return results based on unfiltered chart.

SELECT COUNT(*)
        FROM sample_to_gene_panel_derived
         WHERE alteration_type = 'COPY_NUMBER_ALTERATION' AND gene_panel_id = 'WES'
            AND
            
        
               
        sample_unique_id IN (
         
                SELECT sample_unique_id
                FROM sample_derived
                WHERE cancer_study_identifier IN
                (
                    'mskimpact'
                )
            
            
            
                
            
  
            
            
            
            
                INTERSECT
                    
        SELECT sample_unique_id
        FROM clinical_data_derived
        WHERE attribute_name = 'MUTATION_COUNT' AND
        type='sample'
         AND ((
              match(attribute_value, '^[-+]?[0-9]*[.,]?[0-9]+$')
                
                   
                    
                            AND abs(
                                minus(
                                    
        multiIf(
            (startsWith(attribute_value, '<=') OR startsWith(attribute_value, '>=')),
            cast(substr(attribute_value, 3) as float),
            startsWith(attribute_value, '<'),
            cast(substr(attribute_value, 2) as float) - exp(-10),
            startsWith(attribute_value, '>'),
            cast(substr(attribute_value, 2) as float) + exp(-10),
            cast(attribute_value as float)
        )
    ,
                                    4
                                )
                            ) < exp(-11)
        
            ) OR (  match(attribute_value, '^[-+]?[0-9]*[.,]?[0-9]+$')
                
                   
                    
                            AND abs(
                                minus(
                                    
        multiIf(
            (startsWith(attribute_value, '<=') OR startsWith(attribute_value, '>=')),
            cast(substr(attribute_value, 3) as float),
            startsWith(attribute_value, '<'),
            cast(substr(attribute_value, 2) as float) - exp(-10),
            startsWith(attribute_value, '>'),
            cast(substr(attribute_value, 2) as float) + exp(-10),
            cast(attribute_value as float)
        )
    ,
                                    5
                                )
                            ) < exp(-11)
        
            ) OR (  match(attribute_value, '^[-+]?[0-9]*[.,]?[0-9]+$')
                
                   
                    
                            AND abs(
                                minus(
                                    
        multiIf(
            (startsWith(attribute_value, '<=') OR startsWith(attribute_value, '>=')),
            cast(substr(attribute_value, 3) as float),
            startsWith(attribute_value, '<'),
            cast(substr(attribute_value, 2) as float) - exp(-10),
            startsWith(attribute_value, '>'),
            cast(substr(attribute_value, 2) as float) + exp(-10),
            cast(attribute_value as float)
        )
    ,
                                    6
                                )
                            ) < exp(-11)
        
            ) OR (  match(attribute_value, '^[-+]?[0-9]*[.,]?[0-9]+$')
                
                   
                    
                            AND abs(
                                minus(
                                    
        multiIf(
            (startsWith(attribute_value, '<=') OR startsWith(attribute_value, '>=')),
            cast(substr(attribute_value, 3) as float),
            startsWith(attribute_value, '<'),
            cast(substr(attribute_value, 2) as float) - exp(-10),
            startsWith(attribute_value, '>'),
            cast(substr(attribute_value, 2) as float) + exp(-10),
            cast(attribute_value as float)
        )
    ,
                                    7
                                )
                            ) < exp(-11)
        
            ) OR (  match(attribute_value, '^[-+]?[0-9]*[.,]?[0-9]+$')
                
                   
                    
                            AND abs(
                                minus(
                                    
        multiIf(
            (startsWith(attribute_value, '<=') OR startsWith(attribute_value, '>=')),
            cast(substr(attribute_value, 3) as float),
            startsWith(attribute_value, '<'),
            cast(substr(attribute_value, 2) as float) - exp(-10),
            startsWith(attribute_value, '>'),
            cast(substr(attribute_value, 2) as float) + exp(-10),
            cast(attribute_value as float)
        )
    ,
                                    8
                                )
                            ) < exp(-11)
        
            ) OR (  match(attribute_value, '^[-+]?[0-9]*[.,]?[0-9]+$')
                
                   
                    
                            AND abs(
                                minus(
                                    
        multiIf(
            (startsWith(attribute_value, '<=') OR startsWith(attribute_value, '>=')),
            cast(substr(attribute_value, 3) as float),
            startsWith(attribute_value, '<'),
            cast(substr(attribute_value, 2) as float) - exp(-10),
            startsWith(attribute_value, '>'),
            cast(substr(attribute_value, 2) as float) + exp(-10),
            cast(attribute_value as float)
        )
    ,
                                    9
                                )
                            ) < exp(-11)
        
            ) OR (  match(attribute_value, '^[-+]?[0-9]*[.,]?[0-9]+$')
                
                   
                    
                            AND abs(
                                minus(
                                    
        multiIf(
            (startsWith(attribute_value, '<=') OR startsWith(attribute_value, '>=')),
            cast(substr(attribute_value, 3) as float),
            startsWith(attribute_value, '<'),
            cast(substr(attribute_value, 2) as float) - exp(-10),
            startsWith(attribute_value, '>'),
            cast(substr(attribute_value, 2) as float) + exp(-10),
            cast(attribute_value as float)
        )
    ,
                                    10
                                )
                            ) < exp(-11)
        ))
    
                
            
            
                INTERSECT
                    
        SELECT sample_unique_id
        FROM clinical_data_derived
        WHERE attribute_name = 'MUTATION_COUNT' AND
        type='sample'
         AND ((
              (
                    
        multiIf(
            
        attribute_value=''
        OR upperUTF8(attribute_value)='NA'
        OR upperUTF8(attribute_value)='NAN'
        OR upperUTF8(attribute_value)='N/A'
    ,
            'NA',
            
        upperUTF8(attribute_value)='TRUE'
    ,
            'True',
            
        upperUTF8(attribute_value)='FALSE'
    ,
            'False',
            attribute_value
        )
    
                ) = ''
        
            ) OR (  (
                    
        multiIf(
            
        attribute_value=''
        OR upperUTF8(attribute_value)='NA'
        OR upperUTF8(attribute_value)='NAN'
        OR upperUTF8(attribute_value)='N/A'
    ,
            'NA',
            
        upperUTF8(attribute_value)='TRUE'
    ,
            'True',
            
        upperUTF8(attribute_value)='FALSE'
    ,
            'False',
            attribute_value
        )
    
                ) = ''
        
            ) OR (  (
                    
        multiIf(
            
        attribute_value=''
        OR upperUTF8(attribute_value)='NA'
        OR upperUTF8(attribute_value)='NAN'
        OR upperUTF8(attribute_value)='N/A'
    ,
            'NA',
            
        upperUTF8(attribute_value)='TRUE'
    ,
            'True',
            
        upperUTF8(attribute_value)='FALSE'
    ,
            'False',
            attribute_value
        )
    
                ) = ''
        
            ) OR (  (
                    
        multiIf(
            
        attribute_value=''
        OR upperUTF8(attribute_value)='NA'
        OR upperUTF8(attribute_value)='NAN'
        OR upperUTF8(attribute_value)='N/A'
    ,
            'NA',
            
        upperUTF8(attribute_value)='TRUE'
    ,
            'True',
            
        upperUTF8(attribute_value)='FALSE'
    ,
            'False',
            attribute_value
        )
    
                ) = ''
        
            ) OR (  (
                    
        multiIf(
            
        attribute_value=''
        OR upperUTF8(attribute_value)='NA'
        OR upperUTF8(attribute_value)='NAN'
        OR upperUTF8(attribute_value)='N/A'
    ,
            'NA',
            
        upperUTF8(attribute_value)='TRUE'
    ,
            'True',
            
        upperUTF8(attribute_value)='FALSE'
    ,
            'False',
            attribute_value
        )
    
                ) = ''
        
            ) OR (  (
                    
        multiIf(
            
        attribute_value=''
        OR upperUTF8(attribute_value)='NA'
        OR upperUTF8(attribute_value)='NAN'
        OR upperUTF8(attribute_value)='N/A'
    ,
            'NA',
            
        upperUTF8(attribute_value)='TRUE'
    ,
            'True',
            
        upperUTF8(attribute_value)='FALSE'
    ,
            'False',
            attribute_value
        )
    
                ) = ''
        
            ) OR (  (
                    
        multiIf(
            
        attribute_value=''
        OR upperUTF8(attribute_value)='NA'
        OR upperUTF8(attribute_value)='NAN'
        OR upperUTF8(attribute_value)='N/A'
    ,
            'NA',
            
        upperUTF8(attribute_value)='TRUE'
    ,
            'True',
            
        upperUTF8(attribute_value)='FALSE'
    ,
            'False',
            attribute_value
        )
    
                ) = ''
        ))
    
            
        )
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants