UBC Scientific Software Seminar: Practical Data Science

ubcs3/2017-Fall

UBC Scientific Software Seminar

The UBC Scientific Software Seminar is inspired by Software Carpentry. Its goal is to help students, graduates, fellows and faculty at UBC develop software skills for science.

Fall 2017: Practical Data Science

OUTLINE

  • What are the learning goals?
    • To practice data wrangling using pandas
    • To construct data visualizations using matplotlib and bokeh
    • To build models and make predictions using scikit-learn
    • To submit models and solutions to Kaggle competitions
    • To meet and collaborate with other students and faculty interested in scientific computing
  • What software tools are we going to use?
  • What scientific topics will we study?
    • Data wrangling
    • Data visualization
    • Machine learning
  • Where do we start? What are the prerequisites?
    • Calculus, linear algebra, probability and statistics
    • Basic Python (see UBCS3 Summer 2016)
  • Who is the target audience?
    • Everyone is invited!
    • If the outline above is at your level, perfect! Get ready to write a lot of code!
    • If the outline above seems too intimidating, come anyway! You'll learn things just by being exposed to new tools and ideas, and meeting new people!
    • If you have experience with all the topics outlined above, come anyway! You'll become more of an expert by participating as a helper/instructor!

SCHEDULE

Please join the mailing list to receive weekly updates about the seminar.

  • Week 1 - Friday September 29 - 1-2pm - DLAM Learning Lab - [Notes]
    • Introduction to Kaggle
      • Competitions, datasets, kernels, and community
    • Getting Started
      • Titanic: Machine Learning from Disaster
      • Make a submission using a Decision Tree Classifier
  • Week 2 - Friday October 6 - 1-2pm - DLAM Learning Lab - [Notes]
    • Feature engineering on the Titanic dataset
      • Titles and decks
      • Filling missing data
    • Random forest classifiers
  • Week 3 - Friday October 13 - 1-2pm - DLAM Learning Lab - [Notes]
    • Tuning parameters for a random forest classifier
    • Our best attempt
  • Week 4 - Friday October 20 - 1-2pm - DLAM Learning Lab - [Notes]
    • NYC Taxi Trip Duration
      • Outliers and clusters
  • Week 5 - Friday October 27 - 1-2pm - DLAM Learning Lab - [Notes]
    • NYC Taxi Trip Duration
      • Distance and spatial features for a random forest regressor
  • Week 6 - Friday November 3 - 1-2pm - DLAM Learning Lab - [Notes]
    • NYC Taxi Trip Duration
      • A random forest regressor for every route
  • Week 7 - Friday November 10 - No meeting
  • Week 8 - Friday November 17 - 1-2pm - DLAM Learning Lab - [Notes] (presented by @sempwn)
    • West Nile Virus Prediction
      • Data exploration
  • Week 9 - Friday November 24 - 1-2pm - DLAM Learning Lab - [Notes] (presented by @sempwn)
    • West Nile Virus Prediction
      • Models and predictions
  • Week 10 - Friday December 1 - 1-2pm - DLAM Learning Lab
    • Quora Question Pairs
      • Natural language processing with NLTK
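
As an illustration of the kind of distance feature mentioned for the NYC Taxi Trip Duration weeks, here is a minimal sketch of the haversine (great-circle) distance between a pickup and a dropoff point. It is not taken from the seminar notes: the function name `haversine_km`, the Earth radius constant, and the sample coordinates are all assumptions chosen for illustration, and the seminar may have used a different distance measure.

```python
import math

# Mean Earth radius in kilometres (an assumed, commonly used value).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length between the points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical pickup and dropoff coordinates (latitude, longitude) in degrees.
pickup = (40.7580, -73.9855)
dropoff = (40.6413, -73.7781)
print(haversine_km(*pickup, *dropoff))
```

A value like this could be added as one numeric column per trip before fitting a regressor. It measures straight-line distance over the sphere, so it underestimates the distance actually driven along streets.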
