|
Data Science Updates is the University of Wisconsin-Madison's resource for news, training, events, and professional opportunities in data science, brought to you by the Data Science Institute, powered by American Family Insurance, and the Data Science Hub.
February 5, 2025
|
|
|
|
Research Bazaar Registration Opens Feb. 12
Get ready for the 6th annual Data Science Research Bazaar, March 19-20 at the Discovery Building! Hosted by the Data Science Hub, this year’s event will explore the potential and limitations of AI and ML in research. Presentations will also highlight fundamental and applied data science across research fields and industries.
The Research Bazaar is an opportunity to connect with UW–Madison’s thriving data science community and learn from inspiring presentations. All are welcome! Registration opens February 12. Learn more at bazaar.datascience.wisc.edu.
|
|
|
|
RABBIT a Leap Forward for Campus Collaboration and Industry Engagement
The Office of Business Engagement (OBE) and Data Science Institute (DSI) have launched a new tool that helps university faculty and staff discover relevant research, shared interests and potential collaborators across campus.
RABBIT (Research and Business-Bridging Intelligence Tool) is an AI-powered faculty discovery tool that is available to all UW–Madison faculty and staff, and WARF employees, with a NetID. It was designed and developed by OBE and DSI to help industry engagement offices on campus connect faculty with applied research opportunities. RABBIT is also a powerful tool for any faculty or staff seeking researchers on campus with specific expertise.
|
|
|
|
Diakonikolas Receives NSF CAREER Award
Assistant Professor Jelena Diakonikolas, a Computer Sciences researcher in the field of large-scale optimization, recently received a National Science Foundation (NSF) CAREER Award for her proposal “Optimization and Learning with Changing Distributions.” The award comes with funding to support research and education initiatives through 2029.
Early inspiration for her proposal came several years ago, when Diakonikolas, an affiliate with the Data Science Institute, participated in a DSI symposium on climate-smart agriculture and forestry. Diakonikolas noticed a pattern in the obstacles encountered by experts in this area: theoretical machine learning results were not specific enough to address the dynamic, sometimes imperfect, real-world data sets used in applied research.
|
|
|
|
CDIS Leadership Transitions to Arpaci-Dusseau
Remzi Arpaci-Dusseau, the Grace Wahba and Vilas Distinguished Achievement Professor of Computer Sciences, assumed leadership of the School of Computer, Data & Information Sciences in January. Arpaci-Dusseau is an internationally renowned computer scientist, known for his research in storage and distributed systems. He is a long-standing leader within the university’s computing community and has been a key figure in CDIS for the past five years. In a recent video, he shares his goals and vision for the future of CDIS.
Tom Erickson, CDIS Founding Director, stepped down from this leadership role in late January. Erickson had led CDIS since its inception in 2019. Under his guidance, CDIS achieved significant milestones, including the completion of a $267 million campaign to construct Morgridge Hall, the largest privately funded project in university history.
|
|
|
|
Exploring AI in Teaching: Critical AI Literacy
February 5, 12:00 p.m. - 1:00 p.m.; Zoom. Participants will engage with the ethical use of AI tools, potential biases in AI systems, implications for academic integrity, and strategies for integrating AI into writing assignments and activities in thoughtful, engaging ways. This session will provide a space for discussion and practical insights, empowering instructors to make informed decisions about incorporating AI while considering its challenges. To learn more and register, visit the Exploring AI in Teaching calendar listing.
|
|
|
|
Introduction to NVivo
February 5, 1:00 p.m. - 2:30 p.m.; 4218 Sewell Social Sciences. NVivo is a popular qualitative data analysis software that allows for organization, storage, coding, and analysis of any qualitative data including text, images, and video. This course will provide an introduction to the NVivo interface and teach you how to import data to NVivo, organize and code data, perform analysis such as word counts and cross tabulation queries, and export query data. To learn more, visit the Introduction to NVivo calendar listing.
|
|
|
|
Functions and Iteration in R
February 6, 9:00 a.m. - 12:00 p.m.; 3218 Sewell Social Sciences. This workshop will teach you the basics of function writing so you can turn existing code procedures into functions and return multiple and conditional values. We will then learn how to apply the functions we have written to series of data to perform tasks such as calculating the standard error of each column in a dataframe, running simulations, and writing and reading files. We will also discuss how to parallelize iterative processes on the SSCC Slurm cluster.
|
|
|
|
Introduction to MaxQDA
February 6, 1:00 p.m. - 2:30 p.m.; 4218 Sewell Social Sciences. MaxQDA is a popular qualitative data analysis software that allows for organization, storage, coding, and analysis of any qualitative data including text, images, and video. This course will provide an introduction to the MaxQDA interface and teach you how to import data, organize your project, code data, perform analysis such as word counts and cross tabulation queries, and export query data. To learn more, visit the Introduction to MaxQDA calendar listing.
|
|
|
|
Loops and Macros in Stata
February 7, 1:00 p.m. - 3:00 p.m.; 4218 Sewell Social Sciences. Have you ever had to do the same thing to ten different variables and wished you didn't have to write it out ten times? If so, this workshop is for you. If you haven't experienced that, you will, so we recommend this workshop for anyone who anticipates using Stata regularly.
|
|
|
|
R Programming: R Basics (repeat)
February 7, 10:00 a.m. - 12:30 p.m.; Zoom. This workshop, an exact repeat of the January 31 session, will cover the basics of R programming. By the end of this session, you will be able to create variables, use pre-defined functions, understand data types, and load and inspect a dataset using RStudio. We will work through setting up a project directory and cover key concepts and terminology. This workshop is geared toward programming novices; no previous experience required. To learn more, visit the R Programming calendar listing.
|
|
|
|
Regression Diagnostics with R
February 11, 9:00 a.m. - 12:00 p.m.; 3218 Sewell Social Sciences. The usefulness and accuracy of regression models depend on whether several assumptions are satisfied, but many researchers do not check whether their model assumptions are met. In this workshop, we will learn the importance of satisfying each regression assumption, how to check for assumption violations with statistical and visual tests, and how to correct for any violations.
This workshop assumes you are familiar with the basics of fitting models as taught in Regression Review with R or any introductory statistics course. To learn more and register, visit the Regression Diagnostics with R calendar listing.
|
|
|
|
Design Principles for Data Visualization
February 11, 6:00 p.m. - 7:00 p.m.; Media Studio Room 2252B, College Library, Helen C. White Hall. Want to learn how to make well-designed data visualizations that stand out? This Love Data Week, join other UW-Madison data viz lovers for a workshop on Design Principles for Data Visualization hosted by DesignLab. This workshop will cover design strategies and ways to think outside the box when visualizing your data. Registration is suggested to help with planning, but not required to attend. To learn more, visit the Design Principles for Data Visualization calendar listing.
|
|
|
|
R Programming: Data Wrangling
February 14, 10:00 a.m. - 12:30 p.m.; Zoom. Data is rarely perfect out of the box. This workshop will cover how to manipulate datasets using an R package called dplyr. After this session, you will be able to select rows and columns, add new columns, remove missing data and create summary tables of your data. A basic working knowledge of R and RStudio (functions, operators, data types) would be helpful for you to get the most out of this session. To learn more and register, visit the R Programming: Data Wrangling calendar listing.
|
|
|
|
Dates and Times in Stata
February 14, 1:00 p.m. - 3:00 p.m.; 4218 Sewell Social Sciences. In this workshop, we'll discuss working with dates and times in Stata, including how Stata stores dates and times, converting dates and times into Stata's format, and using dates and times in Stata code. The materials for this course can be found in the SSCC's Knowledge Base. To learn more and register, visit the Dates and Times in Stata calendar listing.
|
|
|
|
Python Programming: Loops, Lists, and Functions
February 18, 10:00 a.m. - 12:00 p.m.; Zoom. This workshop will take a deeper dive into Python, covering essential topics such as automating tasks using loops, lists, and functions. A basic understanding of Python concepts (e.g., variables, data types) is helpful. To learn more and register, visit the Python Programming: Loops, Lists, and Functions calendar listing.
|
|
|
|
R Programming: Data Visualization
February 21, 10:00 a.m. - 12:00 p.m.; Zoom. If you're familiar with R but want to do more with your plots than the base graphics package allows, this workshop will show you how to use the ggplot2 package in R. After this session, you will be able to create a variety of plot types, alter their aesthetics, and create custom themes. A working knowledge of R, RStudio, and dplyr would be helpful for you to get the most out of this session. To learn more and register, visit the R Programming: Data Visualization calendar listing.
|
|
|
|
Have questions about anything data science-related? Come see the Data Science Hub facilitators at Coding Meetup on Tuesdays and Thursdays from 2:30-4:30 p.m. CT. To join Coding Meetup, join data-science-hubgroup.slack.com
|
|
|
|
Neural Operators for Scientific Applications: Learning on Function Spaces
TODAY February 5, 12:30 p.m. - 1:30 p.m.; Discovery Building, Orchard room 3280. Join Jean Kossaifi, Senior Research Scientist at NVIDIA, to discuss how weather forecasting and aerodynamics require spatiotemporal processes and solutions to partial differential equations on continuous domains at multiple scales. However, traditional deep learning approaches only learn mappings between finite-dimensional vector spaces. Neural operators address this limitation by generalizing deep learning to learn mappings between function spaces. Kossaifi will introduce the fundamentals of neural operators and demonstrate their application to concrete problems such as weather forecasting. For more information, view the full abstract from SILO's upcoming talks page.
|
|
|
|
Developing Responsible AI Monitoring Technologies for Chronic Care
February 6, 12:00 p.m. - 1:00 p.m.; 1240 Computer Sciences and Zoom. Data from everyday devices are increasingly being repurposed to monitor symptoms of heterogeneous chronic conditions: conditions where symptoms present diversely across individuals, and the devices used for symptom monitoring vary across a population. While these variations may not greatly affect personal tracking applications, they pose challenges for use in clinical settings.
|
|
|
|
Toward Flexible and Effective Human-Robot Teaming
February 7, 12:00 p.m. - 1:00 p.m.; 1240 Computer Sciences and Zoom. Despite nearly seventy years of development, robots are not yet realizing their promise of handling the undesirable day-to-day tasks of skilled industrial workers. Recent studies indicate that today’s robots are still too inflexible and difficult to program, particularly for less structured and high-variability tasks.
Mike Hagenow, postdoctoral fellow at MIT, will present three recent approaches to human-robot teaming that aim to unlock new opportunities for robots. These approaches address key questions in human-robot teaming, such as how to optimize human input during teaming and how skilled workers can teach robots complex behaviors. For more information, visit the Toward Flexible and Effective Human-Robot Teaming calendar listing.
|
|
|
|
Identifying New Sources of Subseasonal Predictability
February 7, 12:00 p.m. - 1:00 p.m.; 811 Atmospheric, Oceanic and Space Sciences and Zoom. Subseasonal forecasts predict one- or two-week mean temperature patterns up to 6 weeks in advance. To make subseasonal forecasts, we often rely heavily on climate mechanisms, such as El Niño, that are themselves predictable and that are known to impact temperature. Looking at the historical side of these sources of predictability, all of them were first discovered and then later their impact on predictability was established. This leaves us with the question – have we missed a source of predictability?
|
Learning Through Comparison: Use Cases of Contrastive Learning
February 10, 10:30 a.m. - 11:30 a.m.; Discovery Building, Orchard room 3280 and Zoom.
Contrastive learning is more than just a tool for big tech: it's a practical approach that improves models across a variety of machine learning tasks, from structured data to vision and NLP. At its core, contrastive learning transforms how models learn by focusing on relationships rather than rigid class assignments. Instead of training models to classify data into fixed categories, contrastive learning encourages them to structure representations based on similarity and dissimilarity between pairs of observations. This relational approach has been key to advancements in feature learning, clustering, multimodal models, and out-of-distribution detection, helping models make better use of unlabeled data and generalize more effectively. This forum, led by Prof. Yin Li and Chris Endemann, will explore how contrastive learning works, why it's useful, and how you can integrate it into your workflows. We'll cover key methods, real-world applications, and practical ways to get started. To join the discussion, please fill out the ML+X registration form. We’d love to hear your insights and questions!
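For readers new to the idea, here is a minimal sketch of one classic pairwise contrastive loss. It is an illustration only, not material from the forum: the function names and the margin value are assumptions. Similar pairs are penalized for being far apart, and dissimilar pairs are penalized only when they sit closer than the margin.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, similar, margin=1.0):
    """Pairwise contrastive loss for one pair of embeddings.

    similar=True  -> loss grows with distance (pulls the pair together).
    similar=False -> loss is nonzero only inside the margin (pushes apart).
    The margin of 1.0 is an arbitrary illustrative default.
    """
    d = euclidean_distance(a, b)
    if similar:
        return d ** 2
    return max(0.0, margin - d) ** 2


# A similar pair that is far apart is penalized heavily...
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], similar=True))
# ...while a dissimilar pair that is already far apart costs nothing.
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], similar=False))
```

In practice the embeddings would come from a trainable model and the loss would be averaged over many pairs; this sketch shows only the per-pair arithmetic.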
|
|
|
|
Nexus Winter Challenge: Share ML/AI Resources and Win Prizes!
Nexus is the ML+X community’s website for crowdsourcing machine learning (ML) and AI resources such as recorded talks, ML/AI libraries, genAI tools, model-use guides, datasets (e.g., for transfer learning), workshop materials, and more. Since launching in summer 2024, Nexus has quickly grown to feature 40 ML/AI resources in total. Thanks, contributors! While this number is an impressive milestone, we'd love to see additional ML/AI practitioners contribute and fully leverage Nexus for their own needs. To incentivize additional contributions over winter, we are thrilled to offer a $50 Amazon gift card to whoever contributes the greatest number of resources by February 10th. This is only intended as an extra incentive, as the real prize is helping to build a stronger, more connected community of ML/AI practitioners on campus! Visit the How to Contribute page for detailed guidance on contributing. If you’re unsure about the scope (most submissions are accepted) or have other questions, please contact endemann@wisc.edu.
|
|
|
|
ML+Coffee: Connect, Share, and Explore Machine Learning
February 12, 9:00 a.m. - 11:00 a.m.; Rm. 1145, Discovery Building. ML+Coffee offers a supportive and casual environment to discuss ongoing machine learning (ML) projects and share knowledge & tools across campus. Whether you're looking for advice on applying ML/AI to your data, hoping to demo a favorite tool, or interested in discussing an ML/AI paper, ML+Coffee offers the perfect space. All experience levels and backgrounds are welcome. Caffeinated beverages provided ☕ to keep the ideas flowing, courtesy of our sponsors.
The first spring event will be held on Wednesday, February 12, from 9:00 a.m. to 11:00 a.m. in Room 1145, Discovery Building, and we’re seeking two volunteers to kick us off by discussing an ongoing project (seeking feedback), showcasing an ML/AI tool, or discussing a paper. No formal presentation is required—this event prioritizes open dialogue over formal presentations. Use the ML+Coffee discussion/demo form to sign up for February 12 or later dates (March 5, April 2, and May 7). No matter your background, you’ll find a welcoming space to connect, learn, and share insights with the ML+X community.
|
|
|
|
Property-Based Testing for the People
February 14, 12:00 p.m. - 1:00 p.m.; 1240 Computer Sciences and Zoom. Property-based testing (PBT) is a testing methodology that allows users to write executable specifications of programs and then test those specifications with automatically generated program inputs. PBT is well-documented as a power tool for bug-finding, with success stories at companies like Amazon, Jane Street, Dropbox, and Volvo, but it still has significant room to grow.
Harry Goldstein, Victor Basili Postdoctoral Fellow at the University of Maryland, combines techniques from programming languages, human-computer interaction, and software engineering to better understand the needs of real PBT users and increase the reach of this powerful testing tool. To learn more about Goldstein's work, visit the Property-Based Testing for the People calendar listing.
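As a rough illustration of the idea (not taken from Goldstein's talk), the sketch below states one property, "decoding an encoded string returns the original," and checks it against randomly generated inputs. It uses only the Python standard library, and every function name here is hypothetical. Dedicated PBT libraries such as Hypothesis offer richer input generation and shrink failing cases to minimal counterexamples.

```python
import random


def run_length_encode(text):
    """Compress a string into a list of (character, count) pairs."""
    encoded = []
    for ch in text:
        if encoded and encoded[-1][0] == ch:
            encoded[-1] = (ch, encoded[-1][1] + 1)
        else:
            encoded.append((ch, 1))
    return encoded


def run_length_decode(pairs):
    """Expand (character, count) pairs back into a string."""
    return "".join(ch * count for ch, count in pairs)


def check_round_trip(trials=200, seed=0):
    """Test the round-trip property on random strings.

    Returns the first counterexample found, or None if every trial passes.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        length = rng.randint(0, 20)
        # A two-letter alphabet makes repeated runs likely.
        text = "".join(rng.choice("ab") for _ in range(length))
        if run_length_decode(run_length_encode(text)) != text:
            return text
    return None


print(check_round_trip())  # prints None: no counterexample found
```

The property plays the role of the executable specification: rather than hand-picking example inputs, the test describes what must hold for all inputs and lets the generator look for violations.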
|
|
|
|
Distinguished Entrepreneurs Lunch sponsored by Neider & Boucher, S.C.
February 19, 12:15 p.m. - 1:00 p.m.; 4151 Grainger Hall. If you are interested in strategic sourcing and supply chain management with top tech firms, now is your chance to meet Matt Billings, Principal Category Manager of Microsoft Cloud & AI. Billings has led billion-dollar global teams, driven agile supply strategies, and enhanced semiconductor supply chain resilience. RSVP to the event via the Distinguished Entrepreneurs Lunch calendar listing.
|
|
|
|
MadData25 Hackathon
February 22-23, 9:00 a.m. - 9:00 a.m.; 1240 CS Building. MadData is dotData's flagship event, a 24-hour data-focused hackathon happening February 22-23, 2025, that attracts top computer science and data science students from across campus. The MadData25 Competition will be open for 24 hours. There will be a kick-off event in the morning, lunch in the afternoon, and various companies and keynote speakers throughout the weekend. This event presents a unique opportunity for students to work in a team environment to strengthen their data science skills.
Students with little coding experience can still participate. Registration opened January 30th. To learn more and register, view the Hackathon flyer and visit the MadData webpage.
|
|
|
|
CHTC Researcher Forum
February 26, 2:30 p.m. - 4:30 p.m.; Discovery Building Orchard View Room. Join the Center for High Throughput Computing users and the UW-Madison research computing community at the inaugural CHTC Researcher Forum this February! At the forum, you will learn about recent changes at CHTC and what’s new and upcoming, connect with CHTC users and learn from each other about their use of CHTC at a mini poster session, engage with CHTC staff, and meet a computer server.
CHTC hopes that this is an event that empowers all CHTC users and community members to start and continue conversations with us as collaborative research partners. To learn more and register, visit the CHTC Researcher Forum 2025 webpage.
|
|
|
|
|
Opportunity Scholarships for posit::conf(2025)
Registration is now open for posit::conf(2025), happening September 16 - 18 in Atlanta, GA! Super Early Bird pricing is available for a limited time only. For the next month, get conference passes at the lowest possible price. At this time, we are offering in-person tickets only; virtual tickets will be available at a later date!
Cat Hicks, a leading psychologist specializing in software teams, is one of many keynote speakers at posit::conf(2025). Hicks's work focuses on improving developer performance and well-being, helping teams become more effective problem solvers. Her insights will offer valuable perspectives for data scientists, as we often grapple with complex challenges and rely heavily on teamwork.
Posit is committed to ensuring our conference is accessible to all, regardless of economic means. Posit::conf will continue to include a hybrid format for those who cannot attend in person. Posit also offers sponsorships to 40 individuals worldwide who are members of a group underrepresented at posit::conf. The application process is officially open and will close on Tuesday, February 21. Learn more and apply on the posit webpage.
|
|
|
|
Undergraduate Research Assistant - Laryngeal Physiology Lab
Apply by February 14 - The Laryngeal Physiology Laboratory is a part of the Department of Surgery. We study various aspects of the vocal folds and how they relate to a healthy voice. Student researchers will work with other lab members on a variety of projects related to the physiology and biomechanics of vocal fold vibration, as well as acoustic analysis.
|
|
|
|
Space Science and Engineering Center Summer Internship Program
Apply by February 17 - The University of Wisconsin Space Science and Engineering Center (SSEC) is a research and development center with an international reputation for developing instrumentation for both terrestrial and space flight applications. The SSEC internship provides highly motivated students an opportunity to apply their abilities in support of cutting-edge atmospheric research.
From year to year, projects may include a combination of designing, monitoring, and debugging data ingest archival systems; creating and maintaining data analysis tools; and processing and visualization of atmospheric data collected from ground-, aircraft-, or satellite-based instrumentation in collaboration with partners and funders. To learn more and apply, visit the SSEC Summer Internship Program job posting on the Student Jobs board.
|
|
|
|
Baseball Statistics Intern
Apply by February 23 - Dairyland Collegiate League (DCL) is a collegiate baseball league based in south-central Wisconsin. Internships at Dairyland Collegiate League offer a front-row experience of the day-to-day grind of the sports industry. DCL is a small league, and working here offers you the chance to experience a variety of roles, take on your own projects, and bring your ideas to life.
The intern will be in charge of entering play-by-play through GameChanger during games, which is broadcast to fans; transferring statistics from GameChanger to an online statistics platform; and staying up to date on player statistics and leaderboards. To learn more and apply, visit the Statistics Intern job posting on the Student Jobs board.
|
|
|
|
AI Institute for Intelligent Cyberinfrastructure with Computational Learning in the Environment (ICICLE) Educational Fellows Program
Apply by February 24 - The National Science Foundation-funded AI Institute for Intelligent Cyberinfrastructure with Computational Learning in the Environment (ICICLE) is now accepting applications for its 2025 Educational Fellows Program. Graduate students, postdoctoral fellows, and early-career educators and researchers from all domains and disciplines who are actively enrolled in or formally affiliated with a US-based institution are eligible to apply. Those interested in developing skills and/or posing research questions using AI-enabled cyberinfrastructure and knowledge systems to support areas such as digital agriculture, food access and security, and animal ecology are strongly encouraged to apply. Fellows will be positioned to apply their educational, community building, and workforce development expertise in a practical experience setting that broadens the impact of ICICLE's mission.
|
|
|
|
Data Scientist, Sage Bionetworks
Apply ASAP - Sage Bionetworks is looking for a Data Scientist to support and drive computational research to study cancer and other diseases using publicly available datasets. The Data Scientist will use their understanding of large-scale data modalities and investigate computational approaches to integrate multiple modalities and sources of data to answer disease-relevant questions. Initial focus will include: 1) applying computational, statistical, and systems biology approaches to high-dimensional data from human and model systems, and 2) applying methods to harmonize various data modalities from multiple sources.
Ideal candidates will have a computational biology background and have a desire to gain experience in open science, FAIR data, and data-driven research in cancer or other disease domains. To learn more and apply, visit the Data Scientist job posting.
|
|
|
|
Research Data Scientist (AI/ML)
Apply by February 9 - The Research Data Scientist (AI/ML) will join the Wisconsin Health Data Hub (WHDH) as part of the Tech Hub initiative. As a Data Scientist, the incumbent will use real-world data, including Electronic Health Record (EHR) data, to develop and implement advanced computational algorithms and support the conduct of groundbreaking data-driven research.
On a day-to-day basis, the incumbent could expect to engage in data collection and preprocessing, develop and optimize algorithms, build predictive models and insights, and collaborate with key internal and external stakeholders regarding data science best practices and methodologies. To learn more and apply, visit the Research Data Scientist job posting.
|
|
|
|
Research Data Scientist - Ontology
Apply by February 9 - The Research Data Scientist (Ontology) will join the Wisconsin Health Data Hub (WHDH) as part of the Tech Hub initiative. On a day-to-day basis, the incumbent could expect to develop/implement and maintain biomedical ontologies to organize and manage data, collaborate with cross-functional teams to define and document data models, design and implement data taxonomy and classification systems, conduct ontology mapping and integration, and develop metadata dictionaries as necessary for research datasets. To learn more and apply, visit the Research Data Scientist - Ontology job posting.
|
|
|
|
Honest Broker
Apply by February 16 - The UW School of Medicine and Public Health (SMPH) is seeking a highly skilled and ethical Honest Broker to join their team. The Honest Broker will: advise researchers in the best use of various state-of-the-art cyber infrastructure (CI) systems, tools, and software; partner with facilitators and researchers to co-create and co-learn research activities and relevant advanced computing and data capabilities; coordinate projects with technical teams; contribute to strategy for engaging with faculty research groups and building facilitation capacities; and participate in cross-functional management teams and projects. To learn more and apply, visit the Honest Broker job posting.
|
Postdoctoral Position(s) Research Associate(s) in Dairy AI
Apply by February 28 - The Department of Animal and Dairy Sciences at the University of Wisconsin–Madison invites applications for one or two Postdoctoral Research Associate positions to lead advancements in artificial intelligence (AI) applications for sustainable and climate-smart dairy farm management.
This role will focus on applying Reinforcement Learning (RL) to optimize complex decision-making processes such as culling and replacement strategies by integrating economic, environmental, and animal welfare considerations and leveraging farm-specific data streams. The Research Associate will also leverage or extend modeling frameworks like the Ruminant Farm Systems (RuFaS) model to evaluate management scenarios, focusing on actionable outcomes such as improving diets, manure management, and cropping systems to enhance economic viability, environmental sustainability, and overall farm productivity. For more information and to apply, visit the Postdoctoral Research Associate job posting.
|
|
|
|
|
DATA VISUALIZATION OF THE WEEK
|
|
|
|
Stop emissions, stop warming: A climate reality check
In a hypothetical year zero where GHG emissions fall to zero, the amount of CO2 in the atmosphere is projected (by all models) to fall much more quickly (left graph) than the global temperature (right graph), which is predicted to essentially stay around the new equilibrium temperature present at year zero, taking 100 years to fall about 0.1 degree Celsius (though there is much wider variation from model to model).
The amount of CO2 in the atmosphere decreases more quickly than global temperature because when emissions stop, atmospheric CO2 will begin to decline as it is absorbed by the ocean and land biosphere. This in turn reduces heating of the climate system. In much the same way that it takes a very long time for a hot tub filled with cold water to warm after you set the heater, the oceans will take a very long time to fully warm to reach equilibrium with the fixed atmospheric composition.
Reposted from the Data Science Community Newsletter, an Academic Data Science Alliance project.
|
|
|
|
Data Science Updates is a collaborative effort of the Data Science Institute and Data Science Hub.
Use our submission form to send us your news, events, opportunities and data visualizations for future issues.
|
|
|
|
|