build: bump nltk to 3.6.7 for security and performance#130

Merged
tianjianjiang merged 11 commits into bigscience-workshop:master from tianjianjiang:dependabot/pip/nltk-3.6.6
Jan 21, 2022

Collaborator

@tianjianjiang tianjianjiang commented Jan 21, 2022

@shanyas10 and @SaulLu I'm going to merge it since it is based on dependabot. But if this makes website description preprocessing slower or even unusable, we can always revert it.

@tianjianjiang tianjianjiang changed the title from "build: bump nltk o 3.6.7 for security and performance" to "build: bump nltk to 3.6.7 for security and performance" Jan 21, 2022
@tianjianjiang tianjianjiang self-assigned this Jan 21, 2022
@tianjianjiang tianjianjiang added the bug Something isn't working label Jan 21, 2022
@dependabot dependabot bot deleted the dependabot/pip/nltk-3.6.6 branch January 21, 2022 15:38
@tianjianjiang tianjianjiang marked this pull request as ready for review January 21, 2022 15:39
@tianjianjiang tianjianjiang merged commit 9382b4f into bigscience-workshop:master Jan 21, 2022
tianjianjiang added a commit to tianjianjiang/bigscience-metadata that referenced this pull request Jan 21, 2022
* master: (141 commits)
  build: bump nltk to 3.6.7 for security and performance (bigscience-workshop#130)
  build: bump nltk to 3.6.7 for security and performance (#5)
  Add fp16, multi-GPU training script (toy dataset) (bigscience-workshop#123)
  create dataset with html, timestamp, url, datasource, generation length and website description metadata and tittles, footers and headers from HTML (bigscience-workshop#119)
  remove `#SBATCH --gres=gpu:0 ` from `03_create_dataset.slurm` (bigscience-workshop#121)
  Add joint training slurm script (bigscience-workshop#111)
  Add features types for the metadata to extract and test multiprocessing (bigscience-workshop#118)
  feat: add a feature to choose where to extract metadata (bigscience-workshop#116)
  Use dateutil to parse date (bigscience-workshop#117)
  feat: change how the entity extraction process use ids (bigscience-workshop#115)
  add `path_or_url_flair_ner_model` in order to execute the entity extraction on a partition without internet (bigscience-workshop#106)
  delete old submodule
  delete ds_store
  style check
  style & quality
  imports
  handle IndexError for `wikipedia_desc_utils` (bigscience-workshop#102)
  handle the comment specific type not recognized by pyarrow (bigscience-workshop#83)
  quality check
  Change torch version + make it optional (bigscience-workshop#82)
  ...

# Conflicts:
#	bsmetadata/metadata_utils.py
Collaborator

@SaulLu SaulLu left a comment

Let's try it 🚀

Labels

bug Something isn't working

2 participants