Data Science Updates is the University of Wisconsin-Madison's resource for news, training, events, and professional opportunities in data science, brought to you by the Data Science Institute, powered by American Family Insurance, and the Data Science Hub.
March 20, 2024
Grant Supports Open-Source Tool Creation for Crop Disease Forecasting
The UW–Madison Open Source Program Office (OSPO) and Damon Smith, Plant Pathology, were recently awarded a Wisconsin Idea Collaboration Grant from UW–Madison Extension. Their project focuses on the development of a crop disease forecasting tool that is openly accessible to farmers. Located at the Data Science Institute, OSPO collaborates with key stakeholders and community members to further the Wisconsin Idea by connecting open-source practitioners and adopters on campus and beyond.
ML for Wisconsin: Feature Your Data for the 2024 Machine Learning Marathon!
The ML+X community invites principal investigators and companies in Wisconsin to contribute their data and associated machine learning task (classification or prediction) to the 2024 Machine Learning Marathon (MLM24), to be featured as competitions on Kaggle. This 12-week summer event offers a platform for machine learning practitioners to collaborate, learn, and innovate on real-world datasets, with challenges for both beginners and advanced participants. Weekly meetings will facilitate knowledge exchange and discussion on ML tools and strategies. Check out the ML Marathon blog post for more information. Applications are due by April 29.
Introduction to Text Analysis Workshop
Online, April 15-17, 2024 8:30 am - 12:30 pm. Join this recently developed Carpentries workshop for a practical Introduction to Text Analysis, designed for those with Python experience (how to create functions, for loops, conditional logic, use the pandas library, etc.). The workshop covers Natural Language Processing (NLP) basics, API usage, data preparation, document/word embeddings, topic modeling, Word2Vec, Transformer models using Hugging Face, and ethical considerations. Students and researchers working in the digital humanities are especially encouraged to attend! Register for Intro to Text Analysis here, and complete the short pre-workshop survey at your earliest convenience.
Next Generation Data Analysis Workshops
March and April, The Bioinformatics Resource Core (BRC) at the UW Biotechnology Center (UWBC) is offering heavily hands-on workshops on Next-Generation Sequencing (NGS) data analysis skills:
- Access and analyze data with bash command line
- SNP and RNA-Seq analysis examples with open-source software on a Linux Platform
These day-long workshops are held in person and have an associated fee. Read workshop descriptions, access registration links, and view the calendar at the Bioinformatics website.
Have questions about anything data science-related? Come see the Data Science Hub facilitators at Coding Meetup on Tuesdays and Thursdays from 2:30-4:30 p.m. CT. To join Coding Meetup, join
Registration Opens for the 2024 Midwest Machine Learning Symposium (MMLS'24)
May 20-21, University of Minnesota, We are pleased to announce that this year's Midwest Machine Learning Symposium (MMLS'24) will be held May 20-21 at Graduate Minneapolis on the University of Minnesota campus! Registration is now open; please see the MMLS'24 conference website for details. As with all previous editions of MMLS, there is no registration fee. You are encouraged to present your work as a poster during the conference (the registration form includes a place to submit a title and abstract).
Please note that there is a limited budget for free student housing. The deadline for applying is April 15, and priority in allocating these spaces will be given to those who register before the deadline.
Thank you and we are looking forward to seeing you in Minneapolis!
August 21-24, Seattle & Limited Online, Whether you’re just starting your data science journey, a skilled professional, or a data science leader, posit::conf(2024) has it all, with four talk tracks, community events, updates on product enhancements, all-day workshops, and keynotes from your favorite data scientists. Learn more and register by March 21 for posit::conf(2024).
EVIL in Spring 2024
March 29, 10:00 a.m. - 11:00 a.m., The Ethics, Values, Information, and Law (EVIL) reading group pursues scholarship at the intersections of ethics, law, and data and information technologies. The EVIL reading group meets online on Fridays, roughly every three weeks, and is hosted in collaboration with the iSchool and ML+X. Learn more about the community and how to attend the meeting at the EVIL website.
ComBEE Python and R Study Group
April 2, 1:00 p.m. - 2:00 p.m., Computational Biology, Ecology, & Evolution (ComBEE) is a group of researchers at UW-Madison interested in computational biology in ecology and evolution. Each workshop has a facilitator to guide discussion and will focus on the use of Python and R. The ComBEE study group meets every two weeks in 4503 Microbial Sciences. Learn more about the community and how to attend the meeting at the ComBEE website.
How Can I Apply ML to My Data? Get Insights at ML+Coffee
April 10, 9:00 a.m. - 11:00 a.m., As part of the ML+X community's monthly coffee event, ML+Coffee, we invite researchers and students with little or no background in machine learning (ML) to join us and ask how ML can be applied in their domain of work. ML+Coffee offers a casual and social atmosphere where ML practitioners can problem-solve with one another. You don't need to prepare anything formal to discuss your project. That said, a few slides might be helpful to go over the problem and dataset. We can display the slides on a large TV in the event room (room 1145, Discovery building). If interested, please contact the community's leadership team with a short description of the problem and dataset. For additional context, check out some of the previous projects discussed at ML+Coffee.
Psych 755 Instructor
Apply by March 8 – The job posting and application for the Summer 2024 Instructor position for Psych 755 has just opened up! Psych 755: Environments and tools for large-scale behavioral data science is designed to provide students with knowledge and experience conducting large-scale behavioral data science projects, independently and in collaboration with others, using a variety of contemporary software tools and environments.
Brief Descriptions
Psych 755 is a 3-credit course that meets on Tuesdays and Thursdays from 2-4:30 pm and is a required course for Data Science in Human Behavior master’s students.
The course syllabus & assignments have been developed.
After taking this course students will be able to:
- Use online crowd-sourcing platforms for collecting behavioral data, such as Amazon Mechanical Turk, and understand issues of design, sampling, and interpretation associated with such platforms.
- Use integrated tools for conducting, documenting, and publishing complex behavioral data analyses, including Jupyter notebooks and R Markdown.
- Use the GitHub platform to conduct collaborative behavioral data science, including documentation, analysis development, versioning, forking and merging.
- Use SQL database management systems to manage large behavioral datasets.
- Understand how to access and use high-throughput and high-performance infrastructure for computationally expensive jobs, and when use of these platforms is warranted.
Requirements to Apply
- Master's degree in any field (PhD Student with ABD or Master’s Standing)
- Permission from your advisor/supervisor
Open Source Program Office Seeks Communications Intern
Apply by March 24 – The Open Source Program Office in the Data Science Institute is hiring an undergraduate summer intern to assist with outreach and communications. If you have a passion for communicating science and technology, this could be the internship for you! While open-source experience is not required, a background in journalism, life sciences communication, technical communication, or education is preferred. View the position description and application information on the student jobs website.
OSG School 2024: Learn to harness large-scale computing for research
Apply by April 1 – CHTC is seeking applicants for the OSG School 2024, to be held August 5-9 at UW-Madison (please note this is an in-person event). CHTC will pay all basic travel, hotel, and food costs for applicants who are selected to attend.
Using lectures, demonstrations, hands-on exercises, roleplays, and personal consulting with OSG experts, the OSG School will teach you how to use high-throughput computing (HTC) effectively and get a research workload up and running. Past participants have come from physics, chemistry, life sciences, engineering, earth sciences, agricultural and animal sciences, economics, social sciences, medicine, and more.
Ideal candidates are:
- Researchers (especially graduate students and post-docs) in any research area for which large-scale computing is a key part of the research process;
- People (especially students and staff) who support researchers who are current or potential users of high-throughput computing;
- Instructors (at the post-secondary level) who teach future researchers and are ready to integrate high-throughput computing into their curriculum.
- Application Period (OPEN NOW): 3 March - 1 April 2024
- OSG School: 5-9 August 2024
David McCandless, Tony Camme, Nell Simon-Batsford, Sven Ehmann. (2024) Facebook’s “Supreme Court” Major Rulings to Date.
“In 2020, Meta (then Facebook) created a quasi-independent, 20 member oversight board with the power to override the company’s decisions around content-moderation.”
“Out of 78 cases, the Oversight Board:
- overturned 63 of Meta/Facebook’s moderation decisions (80%)
- upheld 13 cases (16%)
- delivered mixed verdicts in 2 cases (3%)”
Data Science Updates is a collaborative effort of the Data Science Institute and Data Science Hub.
Use our submission form to send us your news, events, opportunities and data visualizations for future issues.