Data Science Updates is the University of Wisconsin-Madison's resource for news, training, events, and professional opportunities in data science, brought to you by the Data Science Institute, powered by American Family Insurance, and the Data Science Hub.
May 14, 2025
Data science students leave UW–Madison with skills that can be applied in a wide range of fields including finance, insurance, humanities, scientific research, health care, sports, and more! Congratulations to the 448 students who received undergraduate degrees in data science, 253 students who earned data science certificates, and 60 students who received master’s degrees in data science this year. Meet some of the data science graduates in this feature from the Statistics Department.
The undergraduate-only Data Carpentry workshop on June 2–5, from 9:00 a.m. to 12:30 p.m. at the Discovery Building, teaches fundamental data skills needed to conduct research. This workshop uses ecological data but is generally suited for undergraduate researchers, teaching them how to work with data, from project organization in spreadsheets and data cleaning with OpenRefine to data analysis with SQL and R and data visualization in R. This workshop is for beginners wanting to get started using these tools and requires no prerequisite knowledge. Register on the workshop website by May 28th.
Join the Geospatial Data Carpentry workshop on June 2–5, from 1:00 p.m. to 5:00 p.m. at the Discovery Building, to learn how to read, manipulate, and visualize geospatial data using R. This workshop teaches how to work with raster and vector data. Intro R skills are required, such as knowledge of working with objects and functions. The workshop webpage includes links to lessons that can help you learn the necessary R skills before the workshop. Register on the workshop website by May 28th.
The undergraduate-only Software Carpentry workshop on June 9–10, from 9 a.m. to 4:30 p.m. at the Discovery Building, helps researchers get their work done in less time and with less pain by teaching research computing skills. This hands-on workshop will cover introductory concepts including program design, version control, data management, and task automation, using tools such as the Unix shell, Git/GitHub, and Python. This workshop is for beginners wanting to get started using these tools and requires no prerequisite knowledge. Register on the workshop website by June 4th.
May 15 and June 5, 1:00 p.m. - 1:45 p.m.; Zoom. This workshop will teach you to use the SSCC's Linux servers, Linstat and Slurm, for your research. The agenda will be flexible, depending on participants' needs, interests, and backgrounds. Topics may include moving your data to the SSCC file system, modifying your code so it will run on the servers, logging into the servers, running jobs interactively, submitting jobs to the Slurm cluster, running programs like R, Stata, and Python, and running jobs that use GPUs.
May 19 - 22, 9:00 a.m. - 12:30 p.m.; Orchard View Room, Discovery Building. Join the Data Science Hub for an introductory workshop on the basics of deep learning. Upon completion, you'll be able to train your first neural network. We will focus on the Keras framework — an excellent choice for those who don't want to write 300 lines of code to fit a single neural network model. Participants should be comfortable coding in Python and understand machine learning fundamentals. Registration closes today, May 14th!
May 27, 10:00 a.m. - 3:00 p.m.; 4218 Sewell Social Sciences. This class covers the basics of Stata. This class (or comparable experience) is a prerequisite for the rest of SSCC's Stata training. It will also prepare you to excel in classes that use Stata, like Sociology 361 or Economics 410. We suggest that new graduate students consider taking this class before or during their first semester. Registration is required.
May 28-30, 10:00 a.m. - 3:00 p.m.; 4218 Sewell Social Sciences. In this class, you'll learn how to wrangle data using Stata. We'll cover key concepts and workflows of data science and the structure and logic of Stata. We'll emphasize real-world issues like handling missing data and checking for errors and best practices for research computing and reproducibility. Before taking this class, students should take SSCC's Introduction to Stata or have equivalent experience.
May 28, 10:00 a.m. - 11:00 a.m.; Zoom. When we think of LLMs, general-purpose chatbots like ChatGPT or code assistants like GitHub Copilot usually come to mind. As useful as ChatGPT and Copilot are, LLMs have much more to offer—if you know how to code. Join Posit to learn LLM application programming interfaces from zero, and build and deploy custom LLM-empowered data workflows and apps.
May 28, 1:00 p.m. - 4:00 p.m.; 3218 Sewell Social Sciences. NVivo is a popular qualitative data analysis software that allows for organization, storage, coding, and analysis of any qualitative data, including text, images, and video. This course will introduce the NVivo interface. You'll learn to import data to NVivo, organize and code data, perform analyses such as word counts and cross-tabulation queries, and export query data.
May 29, 1:00 p.m. - 4:00 p.m.; 3218 Sewell Social Sciences. MaxQDA is a popular qualitative data analysis software that allows for organization, storage, coding, and analysis of any qualitative data, including text, images, and video. This course will introduce the MaxQDA interface. You'll learn to import data, organize your project, code data, perform analyses such as word counts and cross-tabulation queries, and export query data.
June 2, 10:00 a.m. - 3:00 p.m.; 3218 Sewell Social Sciences. This class introduces the basics of the RStudio interface and the R language, including creating and running scripts, saving your work, using functions, and installing packages. There will be opportunities to apply what we learn during class time. Registration is required.
June 3-6, 10:00 a.m. - 3:00 p.m.; 3218 Sewell Social Sciences. "Data wrangling" is the process of preparing data for analysis. This is a hands-on class with time devoted to practicing essential data wrangling skills. This course will first cover the tools for working with different data types. Then we will apply all of these in the context of datasets to create, transform, and clean variables. Registration is required.
June 3, 1:00 p.m. - 3:00 p.m.; 4218 Sewell Social Sciences. Have you ever had to do the same thing to ten different variables and wished you didn't have to write it out ten times? If so, this workshop is for you. If you haven't, you will, so we recommend this workshop for anyone who anticipates using Stata regularly. Registration is required.
June 4, 10:00 a.m. - 12:00 p.m.; Zoom. Harness AI to transform continuous improvement. This Skills Accelerator explores AI-powered insights, automation, and decision-making to enhance efficiency. Learn practical strategies to integrate AI into CI projects and drive real impact in your organization. Registration is required.
June 9, 10:00 a.m. - 3:00 p.m.; 4218 Sewell Social Sciences. In this workshop, you'll learn about Python's fundamental concepts and structures and Pandas DataFrames. This workshop is primarily intended to be taken in conjunction with the Data Wrangling in Python workshop. Registration is required.
June 10-12, 10:00 a.m. - 3:00 p.m.; 4218 Sewell Social Sciences. This hands-on course teaches wrangling skills, mostly using the data wrangling tools of the Pandas package in Python. Pandas is a collection of functions/methods for working with data comparable to R's tidyverse. This course will cover importing and cleaning data, creating and transforming variables, merging data, and basic data visualization. Almost all students should take Introduction to Python for Data Analysis before this course.
June 17, 10:00 a.m. - 12:00 p.m.; 3218 Sewell Social Sciences. This workshop will teach you how to create and modify data visualizations using ggplot2, a popular plotting package in R. Emphasis is placed on using plots to understand distributions of different numbers and types of variables. Registration is required.
June 17, 1:00 p.m. - 3:00 p.m.; 4218 Sewell Social Sciences. In this workshop, we'll discuss working with dates and times in Stata, including how Stata stores dates and times, converting dates and times into Stata's format, and using dates and times in Stata code. Registration is required.
Have questions about anything data science-related? Come see the Data Science Hub facilitators at Coding Meetup on Tuesdays and Thursdays from 2:30-4:30 p.m. CT. To join Coding Meetup, join data-science-hubgroup.slack.com.
May 16-18; Van Vleck Hall. The 2025 AWM Research Symposium will showcase research from women in the mathematical sciences across the career spectrum in academia, government, and industry. Learn from plenary lecturers from Sandbox AQ, Cornell University, Occidental College, and Smith College. Join the AWM for panel discussions, workshops, poster sessions, and networking opportunities.
May 16, 8:30 a.m. - 6:30 p.m.; 750 S Halsted St, Chicago, IL. This symposium unites experts and practitioners for a dynamic program of keynotes, talks, fireside chats, and panels. We will explore cutting-edge AI applications, responsible strategies, and workforce implications that ensure AI drives business success and societal benefits. There will be sessions on applied AI, interpretability, energy infrastructure, and responsible innovation.
May 16, 2:00 p.m. - 3:15 p.m.; Orchard View Room, Discovery Building. Join the Data Science Institute and the Department of Mathematics for a special seminar with Dr. Talitha Washington, professor of mathematics at Howard University. Dr. Washington will share insights on advancing AI technology and policy responsible for ensuring these technologies promote innovation, accountability, and broad societal benefit. Coffee and cookies will be served.
May 21, 11:00 a.m. - 12:00 p.m.; 175 Science Hall. Dr. Jinmeng Rao, AI researcher at Google DeepMind, will discuss three main challenges in trajectory privacy protection and Spatial AI methods: the trade-off between privacy and utility, data sparsity and imbalance issues, and endogenous privacy risks.
May 26, 1:00 p.m. - 2:30 p.m.; Zoom. This talk provides insight into the general functionality of QDA software and will help researchers confidently search for a software tool that suits their research project and analysis workflow. We will discuss the promises of QDA software, cautions, and general strategies for selecting a tool. Additionally, we will introduce NVivo, MaxQDA, Atlas.ti, and Dedoose to compare and contrast their features and functionality.
May 27, 9:00 a.m. - 10:30 a.m.; DeLuca Forum, Discovery Building. Join the new UW RSE community at their first town hall meeting. The UW RSE community shares ideas, methods, and best practices to improve how we do and share research. The UW RSE community is open to anyone at UW–Madison engaged or interested in research software engineering. Coffee will be provided. RSVP by May 19th to help us best prepare for the event, but walk-ins are welcome.
Register by May 31. Join the National Center for Quantitative Biology of Complex Systems for the annual North American Mass Spectrometry Summer School from July 21-24 at the Discovery Building. Students will experience an engaging program covering fundamentals of mass spectrometry and the latest in its application to the analysis of plants (NSF) and animals (NIH). There will be lectures and workshops for scientific and professional development.
June 2-6, 2025; Fluno Center at the University of Wisconsin-Madison & Zoom. HTC25 brings together researchers, campuses, scientific collaborations, facilitators, administrators, and professionals interested in high-throughput computing to engage with the throughput computing community (OSG Consortium, Center for High Throughput Computing, HTCondor staff, PATh and Pelican teams). Register by May 22 to learn about HTC and new developments to advance your science and collaboration.
June 11, 18, and 25, 12:00 p.m. - 1:00 p.m.; Zoom. Join UW–Madison's Center for Teaching, Learning & Mentoring for a light-lift, high-impact discussion series exploring how AI is reshaping teaching and learning. We’ll discuss AI’s role in teaching and assessment, equity and academic integrity, and disciplinary perspectives. Each session will center on a short journal article or podcast episode, offering space to connect, reflect, and imagine what's next. Open to anyone who teaches or supports instruction.
June 20 - July 25, 1:00 p.m. - 4:00 p.m. The Sky's the Limit STEM Camp, hosted by the Nelson Institute Center for Climatic Research, broadens science opportunities for autistic youth in grades 5-12 with a medical diagnosis, self-diagnosis, or suspected diagnosis. The camp provides nature-based and interactive learning opportunities to build interest and appreciation for STEM (science, technology, engineering, and mathematics). Attendees, accompanied by their caregivers, will participate in science experiments and outdoor activities. Registration is limited to 20 participants.
June 23-24; University of Chicago, Logan Center for the Arts. The Midwest ML Symposium convenes regional machine learning researchers to stimulate discussions and debates, foster cross-institutional collaboration, and showcase ML researchers' collective talent at all career stages. There is an exciting lineup of plenary speakers and invited speakers, spanning various directions such as GenAI, trustworthy ML, ML for science/robotics, and deep learning theory. Graduate and undergraduate students are encouraged to submit a poster by May 15th to share their latest research!
- AI Engineer, MLOps Specialist, Vice Chancellor for Research and Graduate Education/Data Science Institute
- Data Science Program Manager, UW–Madison School of Medicine and Public Health, Informatics and Information Technology
- Data Scientist, Vice Chancellor for Research and Graduate Education/Data Science Institute
- Digital Pathology & AI Software Developer, UW–Madison School of Medicine and Public Health, Anatomic Pathology
- Research Data Scientist, UW–Madison School of Medicine and Public Health, Informatics and Information Technology
- Scientist I, UW–Madison School of Medicine and Public Health, Medical Physics, Gen
DATA VISUALIZATION OF THE WEEK
Below is an illustration of how federally-funded university and industrial research and development precede the emergence of large IT industries by decades. Seven out of eight industries originated in university labs.
Data Science Updates is a collaborative effort of the Data Science Institute and Data Science Hub. This newsletter was originally created by the Data Science Hub and published as Hub Updates.
Use our submission form to send us your news, events, opportunities and data visualizations for future issues.