Data Science Updates is the University of Wisconsin-Madison's resource for news, training, events, and professional opportunities in data science, brought to you by the Data Science Institute, powered by American Family Insurance, and the Data Science Hub.
May 1, 2024
|
Steve Wright Receives Hilldale Professorship
Computer Sciences Professor Steve Wright was recently honored with a Hilldale Professorship, along with four other UW–Madison faculty.
Hilldale Professorships are given to faculty members who excel in scholarly activity, have records of outstanding research or creative work, and show promise of continued productivity. Wright designs and analyzes algorithms for continuous optimization and studies optimization and machine learning in many areas. Congratulations!
|
Complete the Open Source Survey by May 3
Have you taken the Open Source Survey? The Open Source Program Office (OSPO) is conducting this survey to understand how open source is utilized and regarded, and its impact on academics and research at UW–Madison. The survey should take no more than 10 minutes. Your responses will help improve understanding of the open-source landscape at UW–Madison and guide future endeavors in this field. Please complete the survey by Friday, May 3, 2024, and thank you for taking the time to participate. Questions? Contact ospo@datascience.wisc.edu.
|
Data Science Hub Hiring a Data Science Facilitator
The Data Science Hub is on the lookout for an additional data science facilitator to join their team!
Role: Conduct training and foster community engagement in data science across the UW-Madison campus.
|
Carpentries Instructor Training
Want to improve your skills with coding or teaching coding? The best way to improve your programming skill is to teach it. The best way to improve your teaching skill is to practice! Join us for Carpentries Instructor Training to work toward those goals.
The Carpentries is a global community centered around teaching foundational coding and data science skills to researchers worldwide. The University of Wisconsin–Madison has a long-standing relationship with the Carpentries, and we offer a handful of instructor certificates annually to those interested in joining the community. Carpentries Instructor Training is designed to prepare trainees to certify and participate as Carpentries instructors. Much of the curriculum focuses on educational principles that apply across a wide variety of contexts.
Carpentries Instructor Training has the following goals:
- Introduce you to evidence-based teaching practices
- Teach you how to create a positive environment for learners at your workshops
- Provide opportunities for you to practice and build your teaching skills
- Help you become integrated into the Carpentries community
- Prepare you to use these teaching skills in teaching Carpentries workshops
|
Intro to Deep Learning with Keras
May 29-31, 8:30 a.m. - 12:30 p.m.; Orchard View Room, Discovery Building
Description: Deep learning has seen a sharp increase in popularity and applicability over the last decade. While deep learning can be a useful tool for researchers from a wide range of domains, taking the first steps in the world of deep learning can be somewhat intimidating. This introduction aims to cover the basics of deep learning in a practical and hands-on manner, so that upon completion, you will be able to train your first neural network and understand what next steps to take to improve the model. We will focus on the Keras framework for this workshop, an excellent choice for those who don't want to write 300 lines of code to fit a single neural network model. The concepts will apply across all deep learning frameworks. Next year, we will teach a similar lesson using the PyTorch framework. Preview the lesson materials for more information on what topics will be covered.
Prerequisites: Learners are expected to know Python (how to create functions, for loops, conditional logic, and use the pandas library, etc.) and machine learning fundamentals (overfitting and underfitting, common evaluation metrics, common use-cases).
|
Geospatial Data Carpentry
June 3-6, 1:00 p.m. - 5:00 p.m.; Room 115, Ingraham Hall
Description: The goal of this workshop is to provide an introduction to core geospatial data concepts and dive into working with raster/vector data, including how to open, work with, and plot vector and raster-format spatial data in R. Additional topics include working with spatial metadata (extent and coordinate reference systems), reprojecting spatial data, and working with raster time series data. Preview the lesson materials (under schedule) for more information on what topics will be covered.
Prerequisites: This lesson assumes you have some knowledge of R. If you have not used R, or want a refresher, review the Introduction to R for Geospatial Data lesson. You can follow along with this prerequisite lesson via this recorded video series.
|
Have questions about anything data science-related? Come see the Data Science Hub facilitators at Coding Meetup on Tuesdays and Thursdays from 2:30-4:30 p.m. CT. To join Coding Meetup, join data-science-hubgroup.slack.com
|
Genomics Seminar Series: Decoding the Evolutionary Histories and Functions of Human Accelerated Regions
May 2, 1:30 p.m., Room 1111 Biotechnology Center Auditorium & Online, The Center for Genomic Science Innovation is hosting Dr. Katherine S. Pollard, Director of the Gladstone Institute of Data Science & Biotechnology and Professor at the University of California San Francisco, as part of their Genomics Seminar Series for "Decoding the Evolutionary Histories and Functions of Human Accelerated Regions." The abstract for the talk is below. Attend the event at the Biotechnology Center Auditorium or at this Zoom link.
Abstract: “Human accelerated regions (HARs) are sequences that have been highly conserved through millions of years of vertebrate evolution and then changed dramatically in the human genome since divergence from our common ancestor with chimpanzees. This evolutionary signature suggests that HARs play important roles and that their functions may have been lost or changed in our ancestors, making HARs exciting candidates for understanding the genetic basis for what makes us human. However, it has been challenging to determine what HARs do and why the evolutionary forces constraining HAR sequences in other species suddenly changed in our lineage. In this talk, I will describe updated methods for identifying accelerated regions in any lineage using large multiple sequence alignments and machine learning approaches that have shed light on the evolutionary histories of HARs. These modeling approaches are generating new hypotheses about the fastest evolving regions in the human genome, which we are testing using high-throughput genomic tools for functional characterization of non-coding sequences. This prediction-first strategy exemplifies my vision for a proactive, rather than reactive, role for data science in biomedical research.”
|
El Zoominario: What can we learn from farm kids' immune system when studying house dust microbiome?
May 3, 3 p.m., Online, The Solís-Lemus lab is hosting Rene Welch from UW-Madison as part of their El Zoominario seminar series for "What can we learn from farm kids’ immune system when studying house dust microbiome?" Attend the talk with this Zoom link. After the talk, the recording will also be uploaded to YouTube.
This talk will feature some ML components including:
- Classical algorithms, i.e., naïve Bayes or decision trees, that assign microbiome sequences to their respective bacteria, though these are quite conservative at the moment.
- Classifier fitting and a variable importance analysis of which bacteria are important when predicting farm status.
About Rene Welch
Rene Welch Schwartz is a computational biologist working in Dr. Irene Ong’s lab and supporting Carbone Cancer Center scientists at the University of Wisconsin-Madison. Rene is from Mexico City, where he earned his BSc in Applied Mathematics at ITAM, and later moved to Madison, where he earned a PhD in Statistics at UW-Madison. Rene has worked with multiple omics data types studying disease, cancer, and its associations with the immune system.
|
HTC 24: Connect with the High Throughput Computing Community
July 8-12, University of Wisconsin-Madison, You are invited to the second annual Throughput Computing event (HTC 24) from July 8-12 to be held in beautiful Madison, Wisconsin. HTC 24 brings together researchers, campuses, science collaborations, facilitators, administrators, government representatives, and professionals interested in high throughput computing to:
- Engage with the throughput computing community, including the OSG Consortium, and the HTCondor, PATh, and Pelican teams and many others contributing to HTC
- Be inspired by presentations and conversations with community leaders and contributors sharing common interests
- Learn about HTC and new developments to advance your science, your collaboration, or your campus
Connect with CC* Campuses and OSG Staff
CC* campuses (current and potential) will have the opportunity to build connections and to advance their technical know-how at the dedicated CC* track held Wednesday, July 10th. These sessions will bring together campus staff, including staff involved directly with HTC technology, with the OSG Consortium staff. The goal is to engage with and to learn from each other to improve the experience of providing or utilizing capacity and to advance scientific research on your own campus and across the nation.
Speaking Opportunities
Lightning Showcases will be introduced from the community on Tuesday, July 9. Come and give a lightning talk about your project, tool, or activities around HTC. To keep the session relaxed and informal, there will be opportunities for signing up for a slot on the first day of the workshop. We also encourage you to consider a more formal talk. Technical presentations at HTC 24 are short, typically 20 minutes in length. Applying merely requires a brief abstract submission.
Registration
Registration Is Open! Visit the event site for registration information. Registration is required for attendees, even if you plan to attend remotely only. Registration for in-person attendance will cost $125 per day; there is no fee for registration for virtual attendance.
Questions and Resources
HTC 24 is sponsored by the OSG Consortium, the HTCondor team, and the UW-Madison Center for High Throughput Computing. For questions about attending, speaking, accommodations, and other concerns, please contact the Partnership to Advance Throughput Computing at htc@path-cc.io.
|
EVIL in Spring 2024
May 10, 10:00 a.m. - 11:00 a.m., The Ethics, Values, Information, and Law (EVIL) reading group pursues scholarship at the intersections of ethics, law, and data and information technologies. The EVIL reading group meets online on Fridays, roughly every three weeks, and is hosted in collaboration with the iSchool and ML+X. Learn more about the community and how to attend the meeting at the EVIL website.
This week's reading: TBD
|
How Can I Apply ML to My Data? Get Insights at ML+Coffee
May 15, 9:00 a.m. - 11:00 a.m., As part of the ML+X community's monthly coffee event, ML+Coffee, researchers and students with little or no background in machine learning (ML) are invited to join and ask how ML can be applied in their domain of work. ML+Coffee offers a casual and social atmosphere where ML practitioners can problem-solve with one another. Slides can be displayed on a large TV in the event room (Room 1145, Discovery Building). If interested, please contact the community's leadership team (ml-community-leaders@g-groups.wisc.edu) with a short description of the problem and dataset. For additional context, check out some of the previous projects discussed at ML+Coffee.
|
PROFESSIONAL
|
Psych 750 Instructor
Apply by May 12 – The job posting and application for the Fall 2024 Instructor position for Psych 750 has just opened up! Psych 750: Programming for Human Behavioral Data Science is designed to provide students with knowledge and experience conducting large-scale behavioral data science projects, independently and in collaboration with others, using a variety of contemporary software tools and environments.
Brief Descriptions
- Psych 750 is a 3-credit course that meets on Thursdays from 9:00-11:30 a.m. Each week includes one 150-minute lecture led by the lecturer and one code-along discussion section run by a graduate teaching assistant.
In this course, students will:
- Learn fundamentals of Python and best practices for writing efficient and understandable code
- Learn how to code a variety of experimental paradigms and how to capture participants' responses
- Improve their debugging and problem-solving skills
- Learn basic data wrangling in R's tidyverse environment
- Learn how to scrape data and manage complex data structures
- Learn techniques for automating repetitive tasks
- Learn to efficiently use and build on existing open-source APIs
- Have an opportunity to learn about topics specific to student interests
Course materials such as homework assignments, course website, GitHub repository, and Slack workspace have already been developed. These materials will be made available to the lecturer in case they would like to utilize them.
Requirements to Apply
- Master's degree in any field (PhD Student with ABD or Master’s Standing)
- Permission from your advisor/supervisor
|
STUDENT
|
Web Developer - Internet Scout Research Group
Apply by May 31 - The Internet Scout Research Group, based in the Computer Sciences Department, is a pioneering digital library research center in the U.S. and is seeking a Web Developer to develop web and digital repository software and services.
Requirements
- Experience with PHP / JavaScript, CSS, and hand-coded HTML
- Written communication skills
- 12-15 hours of weekly availability
- Student at UW-Madison graduating December 2025 or later
Position Summary
- Will use SQL, jQuery / Bootstrap, Selenium / Behave on Linux to develop digital repositories
- Hours are flexible. Must work 8 hours between 8:00 a.m. and 5:00 p.m. Monday and Friday. Remaining hours can be worked anytime
|
DATA VISUALIZATION OF THE WEEK
|
Nestor Maslej. IEEE Spectrum and Stanford HAI Report 2024. 15 Graphs that Explain the State of AI in 2024. 15 April 2024.
|
Data Science Updates is a collaborative effort of the Data Science Institute and Data Science Hub.
Use our submission form to send us your news, events, opportunities and data visualizations for future issues.