I'm working with the GRIT-20M dataset for the Alpha-CLIP project as described in the training README. However, I've encountered a discrepancy between the instructions and the format of the dataset I obtained.
Dataset Format:
The data preparation script (sam_grit.py) is configured to use .tar files, as evidenced by the line: parser.add_argument('--tar-pth', type=str, default="GRIT-1m/00001.tar")
However, the dataset I've downloaded is in .parquet format (e.g., coyo_0_snappy.parquet, coyo_10_snappy.parquet, etc.).
Could you confirm if this .parquet format is correct for the latest version of the dataset?
Thank you for your time and assistance.
You can follow the download script in KOSMOS-2 to download the .tar files. If you download from Hugging Face instead, you need to adjust the script. (By the way, this script only uses SAM to convert boxes into masks, so it is easy to reimplement for the .parquet format.)
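For anyone adapting the script to the .parquet files, here is a minimal sketch of the box-to-mask step, not the actual sam_grit.py logic. It assumes each parquet row has `url`, `caption`, and `ref_exps` columns, that each `ref_exps` entry is laid out as [start, end, x0, y0, x1, y1, score] with box coordinates normalized to [0, 1], and that the images themselves still have to be downloaded from `url`. The function names are hypothetical; `predictor` is expected to be a `segment_anything.SamPredictor`.

```python
from typing import Iterator, List, Sequence, Tuple


def ref_exp_to_pixel_box(
    ref_exp: Sequence[float], width: int, height: int
) -> Tuple[int, int, int, int]:
    """Convert one ref_exps entry to an (x0, y0, x1, y1) pixel box.

    Assumed layout: [start, end, x0, y0, x1, y1, score], coordinates in [0, 1].
    """
    _, _, x0, y0, x1, y1 = ref_exp[:6]
    return (
        round(x0 * width),
        round(y0 * height),
        round(x1 * width),
        round(y1 * height),
    )


def iter_parquet_rows(parquet_path: str) -> Iterator[Tuple[str, str, List[list]]]:
    """Yield (url, caption, ref_exps) per row; column names are assumptions."""
    import pandas as pd  # imported lazily so the box helper has no dependencies

    df = pd.read_parquet(parquet_path)
    for row in df.itertuples(index=False):
        yield row.url, row.caption, [list(r) for r in row.ref_exps]


def boxes_to_masks(predictor, image, pixel_boxes):
    """Prompt SAM with each box and keep one mask per box.

    `image` is an HxWx3 uint8 RGB array; `predictor` is a SamPredictor.
    """
    import numpy as np

    predictor.set_image(image)
    masks = []
    for box in pixel_boxes:
        mask, _, _ = predictor.predict(box=np.array(box), multimask_output=False)
        masks.append(mask[0])  # predict returns shape (1, H, W) here
    return masks
```

Usage would be roughly: iterate over `iter_parquet_rows("coyo_0_snappy.parquet")`, download each image, convert its `ref_exps` with `ref_exp_to_pixel_box`, and pass the resulting boxes to `boxes_to_masks`.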