
Add fp16, multi-GPU training script (toy dataset) #123

Merged
SaulLu merged 7 commits into master from JC/joint_training_fp16 on Jan 21, 2022
Conversation

@changjonathanc
Collaborator

  • Changed an argument in load_dataset so that it can read private datasets.

  • Added gradient_step logging, so we can see time vs. gradient step on wandb.

  • Added 2 sub-experiments:

      • fp16
      • fp16, 2 GPU
          • note: I only have 1 GPU, so I couldn't test this one, but I included a local_test.sh that can be used to test it.

  • The toy dataset used is now private. To run the experiments, you'd need to configure your Hugging Face access token with huggingface-cli login.
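The gradient_step logging mentioned above boils down to counting completed optimizer steps (not micro-batches) and logging that counter alongside each metric. A minimal sketch of the idea, with wandb.log stubbed out as a list append and a hypothetical accumulation count (neither is taken verbatim from the script):

```python
# Sketch of gradient-step logging, assuming gradient accumulation.
# `log_metrics` stands in for wandb.log; names here are illustrative.
logs = []

def log_metrics(metrics):
    logs.append(metrics)  # stand-in for wandb.log(metrics)

gradient_accumulation_steps = 4
gradient_step = 0

for batch_idx in range(8):  # toy loop over 8 micro-batches
    # ... forward/backward on a micro-batch would happen here ...
    if (batch_idx + 1) % gradient_accumulation_steps == 0:
        gradient_step += 1  # one optimizer step completed
        log_metrics({"gradient_step": gradient_step})
```

With 8 micro-batches and accumulation 4, this logs two gradient steps, which is what lets wandb plot wall-clock time against gradient step rather than micro-batch index.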

@changjonathanc changjonathanc requested a review from SaulLu January 14, 2022 14:14
import functools
import logging

from datasets import load_dataset

logger = logging.getLogger(__name__)

# Pass the auth token on every call so private datasets can be loaded.
load_dataset = functools.partial(load_dataset, use_auth_token=True)
Collaborator

This is something we don't necessarily need in our workflow, I think, as we work on a copy of the dataset already cloned locally. That said, I understand that you may need it for other tests on your side. ☺️
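The functools.partial wrapper in the diff makes every later load_dataset call pass use_auth_token=True without touching the call sites. A minimal self-contained sketch of the same pattern, using a stand-in function instead of the real datasets.load_dataset (the dataset name is hypothetical):

```python
import functools

# Stand-in for datasets.load_dataset, just to illustrate the pattern.
def load_dataset(path, use_auth_token=False):
    return {"path": path, "authenticated": use_auth_token}

# Same trick as in the diff: bake the auth flag into all later calls.
load_dataset = functools.partial(load_dataset, use_auth_token=True)

# Call sites stay unchanged but now authenticate implicitly.
ds = load_dataset("my-org/private-toy-dataset")
```

The advantage is that scripts sharing this module don't each need to remember the flag; the trade-off, as noted in the review, is that it forces token-based auth even when the dataset is already available locally.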

@SaulLu SaulLu merged commit 6552342 into master Jan 21, 2022
tianjianjiang added a commit to tianjianjiang/bigscience-metadata that referenced this pull request Jan 21, 2022
* master: (141 commits)
  build: bump nltk to 3.6.7 for security and performance (bigscience-workshop#130)
  build: bump nltk to 3.6.7 for security and performance (#5)
  Add fp16, multi-GPU training script (toy dataset) (bigscience-workshop#123)
  create dataset with html, timestamp, url, datasource, generation length and website description metadata and titles, footers and headers from HTML (bigscience-workshop#119)
  remove `#SBATCH --gres=gpu:0 ` from `03_create_dataset.slurm` (bigscience-workshop#121)
  Add joint training slurm script (bigscience-workshop#111)
  Add features types for the metadata to extract and test multiprocessing (bigscience-workshop#118)
  feat: add a feature to choose where to extract metadata (bigscience-workshop#116)
  Use dateutil to parse date (bigscience-workshop#117)
  feat: change how the entity extraction process use ids (bigscience-workshop#115)
  add `path_or_url_flair_ner_model` in order to execute the entity extraction on a partition without internet (bigscience-workshop#106)
  delete old submodule
  delete ds_store
  style check
  style & quality
  imports
  handle IndexError for `wikipedia_desc_utils` (bigscience-workshop#102)
  handle the comment specific type not recognized by pyarrow (bigscience-workshop#83)
  quality check
  Change torch version + make it optional (bigscience-workshop#82)
  ...

# Conflicts:
#	bsmetadata/metadata_utils.py