At a high level, this project’s goal was to automatically find important events (i.e., when a whale lunged to eat) in time-series data that encoded a whale’s activity. Labeling these events manually was time-intensive, but researchers had already done a fair amount of it, so there was labeled data to learn from in a supervised setting. Through this project, we had to think about:

  • the data (what was tracked and from where, how it was encoded, what it looked like, how researchers currently used it, what other data sources would later be available, etc.)
  • what approaches had been tried in the past
  • which metric(s) were best to track performance and compare models
  • which models would be appropriate (for a baseline, to take advantage of special properties of the data, etc.)
  • which hyperparameters were most impactful to performance

If you want to learn more, you can look at any of the following (or reach out to me):

For an early write-up of this project, which goes into much more detail on how to get the core parts of the machine learning project working, click here.

For the latest code and other files, you can view the GitHub repo here.