Computational Modelling Group

Workshop  3rd May 2017 6 p.m.  Nuffield Theatre Room 1083 (6/1083)

Detect clickbait with machine learning

Oliver Laslett
University of Southampton

Web page
http://southampton-python.github.io/
Categories
Classification, Computational Social Science, Data Aggregation, Data Management, Data Science, Digital Humanities, Git, IPython/Jupyter Notebook, Linux, Mac OS X, Machine learning, NGCM, Pandas, Python, Scientific Computing, Social Networks, Support Vector Machine, Windows
Submitter
Thomas Kluyver

With this one weird trick you can build a text processing pipeline!

We've all fallen for clickbait articles online. They pollute our news feeds and make it harder to find valuable information. In this workshop we'll stream news articles in real time and detect clickbait using simple machine learning techniques.

By the end of the workshop you'll have your very own Python app for streaming real-time news and detecting clickbait. In the workshop we'll cover:

  • Streaming data from a REST API
  • Preprocessing textual data
  • Training a simple machine learning classifier for clickbait
  • Putting everything together in a scikit-learn pipeline
  • Analysing our results (which news source is the most clickbaity?)
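To give a flavour of the topics above, here is a minimal sketch of a scikit-learn text pipeline. It is not the workshop's actual code. The training headlines, the source names (`source_a`, `source_b`) and the choice of a TF-IDF vectoriser feeding a linear support vector machine are all assumptions made for illustration, and a hard-coded list stands in for articles streamed from a REST API.

```python
from collections import defaultdict

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Toy training data invented for this sketch: 1 = clickbait, 0 = not clickbait.
train_headlines = [
    "You won't believe what happened next",
    "10 weird tricks doctors don't want you to know",
    "This one simple trick will change your life",
    "What this celebrity did will shock you",
    "Parliament votes on new budget proposal",
    "Central bank holds interest rates steady",
    "Scientists publish study on ocean temperatures",
    "City council approves new transport plan",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Preprocessing (lowercasing, tokenising, TF-IDF weighting) and the
# classifier are chained so that raw strings go in and labels come out.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True, ngram_range=(1, 2))),
    ("svm", LinearSVC()),
])
pipeline.fit(train_headlines, train_labels)

# Hypothetical (source, headline) pairs standing in for streamed articles.
incoming = [
    ("source_a", "You won't believe this one weird trick"),
    ("source_a", "Council approves budget for transport"),
    ("source_b", "Bank publishes study on interest rates"),
]
predictions = pipeline.predict([headline for _, headline in incoming])

# Group predicted labels by source and compute the clickbait fraction.
labels_by_source = defaultdict(list)
for (source, _), label in zip(incoming, predictions):
    labels_by_source[source].append(int(label))

clickbait_rate = {
    source: sum(labels) / len(labels)
    for source, labels in labels_by_source.items()
}
print(clickbait_rate)
```

With only eight training headlines the predictions are not meaningful; a real classifier needs a much larger labelled dataset. The point is the shape of the code: one `Pipeline` object handles both preprocessing and classification, so the same `predict` call works on any new batch of headlines.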

This session is an interactive workshop - please bring a laptop.