
The best parameters for PICARD in eval. #85

Closed
JiexingQi opened this issue Apr 11, 2022 · 2 comments
Labels
question Further information is requested

Comments

@JiexingQi

Hi, @tscholak. I ran eval with a T5-3B + PICARD model. The performance improvement from PICARD is not as significant as the results reported in your paper. For example, when I fine-tune a T5-3B model myself, the improvement is 3 for EM and <4 for EX, whereas your paper reports an improvement of 4.0 for EM (71.5 --> 75.5 with PICARD) and 4.9 for EX (74.4 --> 79.3 with PICARD).

My eval.json is:

{
    "run_name": "eval_0411_spider_1984",
    "model_name_or_path": "t5-3b",
    "dataset": "spider",
    "source_prefix": "",
    "schema_serialization_type": "custom",
    "schema_serialization_randomized": false,
    "schema_serialization_with_db_id": true,
    "schema_serialization_with_db_content": true,
    "normalize_query": true,
    "target_with_db_id": true,
    "output_dir": "./experiment/eval_0411_spider_1984/",
    "cache_dir": "./transformers_cache",
    "do_train": false,
    "do_eval": true,
    "fp16": false,
    "per_device_eval_batch_size": 2,
    "seed": 1,
    "report_to": ["wandb"],
    "predict_with_generate": true,
    "num_beams": 4,
    "num_beam_groups": 1,
    "diversity_penalty": 0.0,
    "max_val_samples": 1034,
    "use_picard": true,
    "launch_picard": true,
    "picard_mode": "parse_with_guards",
    "picard_schedule": "incremental",
    "picard_max_tokens_to_check": 2,
    "eval_accumulation_steps": 1,
    "metric_config": "both",
    "val_max_target_length": 512,
    "val_max_time": 1200
}

Could you please tell me the eval parameters you used to obtain the best improvement? Thank you.

@tscholak
Collaborator

Hi @JiexingQi. Your configuration looks fine. What is the performance of your T5-3B model without PICARD? What happens if you use tscholak/cxmefzzi? Can you reproduce my results with it?
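A minimal way to try the released checkpoint is to point `model_name_or_path` at `tscholak/cxmefzzi` in the eval config and keep the remaining settings unchanged; the `run_name` and `output_dir` values below are only illustrative, not names used in this thread:

```json
{
    "run_name": "eval_cxmefzzi",
    "model_name_or_path": "tscholak/cxmefzzi",
    "output_dir": "./experiment/eval_cxmefzzi/"
}
```

If the released checkpoint reproduces the paper numbers while the self-fine-tuned T5-3B does not, the gap is in the fine-tuning rather than in the PICARD eval settings.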

@tscholak
Collaborator

There is currently also an issue with the PICARD parser that may cause timeouts and therefore empty results; see #80 and #82 (comment).

@tscholak tscholak added the question Further information is requested label Apr 15, 2022