Add multi-table UniformSynthesizer #485

@amontanez24

Problem Description

As a user, I'd like a baseline synthesizer to compare multi-table synthesizers against.

In the single-table case we have the UniformSynthesizer. We also use it as the fallback when other synthesizers time out or fail. We need something similar for the multi-table case.

Our plan is to add a MultiTableUniformSynthesizer. All it will do is fit the single-table UniformSynthesizer on each table independently. It does not have to connect child rows to parent rows, so we expect it to have poor referential integrity.
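
To illustrate why sampling each table on its own tends to break referential integrity, here is a small standard-library sketch. It is not SDGym code: it assumes the key columns are treated as plain integer columns and sampled uniformly between their observed min and max, which may not match how the real UniformSynthesizer handles keys. The names `sample_uniform_ints`, `parent_keys`, and `child_foreign_keys` are made up for this example.

```python
import random

rng = random.Random(0)

# Toy real data: every child row references an existing parent key.
parent_keys = [10, 20, 30, 40]
child_foreign_keys = [10, 10, 20, 30, 40, 40]


def sample_uniform_ints(values, num_rows, rng):
    """Draw integers uniformly between min(values) and max(values), inclusive."""
    low, high = min(values), max(values)
    return [rng.randint(low, high) for _ in range(num_rows)]


# Each table is sampled on its own, so the two key columns are unrelated.
synthetic_parent_keys = sample_uniform_ints(parent_keys, len(parent_keys), rng)
synthetic_child_keys = sample_uniform_ints(child_foreign_keys, len(child_foreign_keys), rng)

# Fraction of synthetic child rows whose foreign key matches a synthetic parent.
known_parents = set(synthetic_parent_keys)
matched = sum(key in known_parents for key in synthetic_child_keys)
match_rate = matched / len(synthetic_child_keys)
print(f'child rows with a matching parent: {match_rate:.2f}')
```

In the real data the match rate is 1.0 by construction. In the independently sampled data nothing enforces that, which is the behavior this baseline is expected to show.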

Expected behavior

Add a new class

class MultiTableUniformSynthesizer(BaselineSynthesizer):
    def get_trained_synthesizer(self, data, metadata):
        """
        This function should train single table UniformSynthesizers on each table in the data.
        Args:
            data (dict): A dict mapping table name to table data.
            metadata (sdv.metadata.Metadata): The metadata
        
        Returns:
            A synthesizer object.
        """
        pass

    def sample_from_synthesizer(self, synthesizer, scale=1.0):
        """
        Sample synthetic data for every table.

        Args:
            synthesizer (sdgym.synthesizers.BaselineSynthesizer): The trained synthesizer instance.
            scale (float): The scale of data to sample. Defaults to 1.
        
        Returns:
            dict:  A dict mapping table name to the sampled data.
        """
        pass
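
A minimal sketch of the per-table delegation this class is meant to perform. It deliberately avoids SDGym and SDV so that it runs on its own. `ToyUniformSynthesizer` is a hypothetical stand-in for the single-table UniformSynthesizer, tables are plain dicts mapping column name to a list of numbers, `metadata` is accepted but ignored, and the `seed` argument is an addition for reproducibility. A real implementation would subclass BaselineSynthesizer and work with the actual data and metadata objects.

```python
import random


class ToyUniformSynthesizer:
    """Hypothetical stand-in for the single-table UniformSynthesizer."""

    def __init__(self, table):
        # Remember each numeric column's observed range and the row count.
        self.ranges = {name: (min(values), max(values)) for name, values in table.items()}
        self.num_rows = len(next(iter(table.values()))) if table else 0

    def sample(self, num_rows, rng):
        # Draw every column independently and uniformly within its range.
        return {
            name: [rng.uniform(low, high) for _ in range(num_rows)]
            for name, (low, high) in self.ranges.items()
        }


class ToyMultiTableUniformSynthesizer:
    """Sketch only: a real implementation would subclass BaselineSynthesizer."""

    def get_trained_synthesizer(self, data, metadata=None):
        # One independent single-table synthesizer per table; relationships are ignored.
        return {name: ToyUniformSynthesizer(table) for name, table in data.items()}

    def sample_from_synthesizer(self, synthesizer, scale=1.0, seed=None):
        rng = random.Random(seed)
        # Scale each table's row count, then sample that table on its own.
        return {
            name: table_synth.sample(round(table_synth.num_rows * scale), rng)
            for name, table_synth in synthesizer.items()
        }


data = {
    'users': {'user_id': [1, 2, 3, 4], 'age': [23, 35, 41, 58]},
    'orders': {'user_id': [1, 1, 2, 4, 4, 4], 'amount': [9.5, 20.0, 3.25, 7.0, 12.0, 15.5]},
}
baseline = ToyMultiTableUniformSynthesizer()
trained = baseline.get_trained_synthesizer(data)
synthetic = baseline.sample_from_synthesizer(trained, scale=2, seed=0)
```

Here the "trained synthesizer" is just a dict of per-table synthesizers, which keeps the sketch short. Whether the real class should return a dict or a dedicated wrapper object is left open by the spec above.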

Additional context
