Data Science Updates is the University of Wisconsin-Madison's resource for news, training, events, and professional opportunities in data science, brought to you by the Data Science Institute, powered by American Family Insurance, and the Data Science Hub.
April 3, 2024
2024 Research Bazaar Recordings Available
Data in Action was the theme of the 5th annual Research Bazaar, held February 7 and 8, 2024 at the Discovery Building. Video recordings of the introduction, lightning talks, career panel, and closing panel are now available on YouTube. If you missed these presentations, or if you want to re-watch something that intrigued or inspired you, check out the recordings!
UW–Madison Grad Students Selected for Data Science Leadership Summit
Graduate students Olivia Krebs and Yuchen Zeng will represent UW-Madison at the Michigan Institute for Data Science (MIDAS) Future Leaders Summit, April 8-10 at the University of Michigan. This event provides outstanding graduate students, postdocs, and early-career faculty from around the US with an opportunity to engage in research discussions with peers and research leaders, and receive career mentoring, as they grow to become leaders in data science and Artificial Intelligence (AI) research. Yuchen is a Research Assistant working with Kangwook Lee in the College of Engineering. Olivia is a graduate intern working with Pallavi Tiwari in the School of Medicine and Public Health. Congratulations, Yuchen and Olivia!
ML for Wisconsin: Feature Your Data for the 2024 Machine Learning Marathon!
The ML+X community invites principal investigators and companies in Wisconsin to contribute their data and associated machine learning task (classification or prediction) to the 2024 Machine Learning Marathon (MLM24), to be featured as competitions on Kaggle. This 12-week summer event offers a platform for machine learning practitioners to collaborate, learn, and innovate on real-world datasets, with challenges for both beginners and advanced participants. Weekly meetings will facilitate knowledge exchange and discussion on ML tools and strategies. Check out the ML Marathon blog post for more information. Applications are due by April 29.
In the News
Kangwook Lee, Assistant Professor in the Department of Electrical and Computer Engineering, was recently honored with an NSF CAREER Award. With his collaborators and students, Lee will develop a unified theory and new algorithms with provable guarantees for learning with frozen pretrained models, also known as foundation models.
Introduction to Text Analysis Workshop
Online, April 15-17, 2024, 8:30 a.m. - 12:30 p.m., Join this recently developed Carpentries workshop for a practical Introduction to Text Analysis, designed for those with Python experience (writing functions and for loops, using conditional logic and the pandas library, etc.). The workshop covers Natural Language Processing (NLP) basics, API usage, data preparation, document/word embeddings, topic modeling, Word2Vec, Transformer models using Hugging Face, and ethical considerations. Students and researchers working in the digital humanities are especially encouraged to attend! Register for Intro to Text Analysis here, and complete the short pre-workshop survey at your earliest convenience.
Next Generation Data Analysis Workshops
March and April, The Bioinformatics Resource Core (BRC) at the UW Biotechnology Center (UWBC) is offering hands-on workshops on Next-Generation Sequencing (NGS) data analysis skills:
- Access and analyze data with bash command line
- SNP and RNA-Seq analysis examples with open-source software on a Linux Platform
These day-long workshops are held in person and have a fee. Read workshop descriptions, access registration links, and view the calendar at the Bioinformatics website.
Have questions about anything data science-related? Come see the Data Science Hub facilitators at Coding Meetup on Tuesdays and Thursdays from 2:30-4:30 p.m. CT. To join Coding Meetup, join data-science-hubgroup.slack.com.
ML4MI Seminar: Segmentation of Kidney Structures
April 10, 10:00 a.m., Machine Learning for Medical Imaging (ML4MI) is hosting a virtual seminar, Segmentation of Kidney Structures, with Dr. Pinaki Sarder, Associate Professor of AI in Quantitative Health and Associate Director for Imaging at the Intelligent Critical Care Center, Department of Medicine, University of Florida. The abstract for the talk is below. Attend the event at this Zoom link.
Abstract: This talk will introduce the emerging field of digital and computational pathology, utilizing examples from studies focused on kidney microanatomy. We will delve into an overview of our research and that of others in the literature, specifically concerning the segmentation and feature extraction of kidney microanatomy from histology. Furthermore, we’ll explore the impact of our work in areas such as diabetic nephropathy classification, chronic kidney disease trajectory prediction, and its relevance to the NIH Kidney Precision Medicine Project (KPMP) consortium. Additionally, we will highlight our ongoing efforts within the Human Biomolecular Atlas Project (HuBMAP) consortium. Our focus here is directed towards detecting and segmenting multiple cell types and states exclusively from brightfield histology images. We will demonstrate a cloud-based, end-to-end system that operates through the UF supercomputing center. This system is designed to conduct various computational tasks related to renal pathology, starting with the analysis of brightfield histology images and extending to the integration of histology with spatial omics data. Lastly, we’ll conclude by discussing new opportunities and potential directions for collective contributions in the field of computational pathology.
All Are Invited to the 2024 Bioethics Symposium
April 11, 1:00 - 5:00 p.m., Health Sciences Learning Center 1325, This year’s theme is “AI, Ethics, and Health Care.” Artificial intelligence has taken the world by storm — including the world of health care. Many have touted AI’s promise for improving diagnosis, clinical judgment, and patient care. At the same time, the growing presence of AI in health care raises formidable ethical challenges. Algorithmic bias can exacerbate inequities. AI in the clinic can erode trust between patients and clinicians. And accountability can be undermined when opaque algorithms replace human deliberation. This year’s Bioethics Symposium will examine these and other AI-related concerns. Its goal is a broad roadmap for using health-centered AI in ways that are equitable, empathetic, and empowering.
Register for the NMDSI's AI Symposium
April 24, UW-Milwaukee and Virtually, The Northwestern Mutual Data Science Institute (NMDSI) will host its second annual AI symposium, “Bridging Innovation & Impact,” on Wednesday, April 24, at the University of Wisconsin-Milwaukee (UWM) and virtually.
This dynamic event will feature a range of topics centered on the ever-evolving landscape of AI. Attendees will hear from a variety of experts in the data science industry via a series of topical talks – including keynote addresses, fireside chats, panels, lightning talks, and other presentations.
The two keynote speakers are Dr. Chris Wiggins, associate professor of applied mathematics at Columbia University and co-author of How Data Happened: A History from the Age of Reason to the Age of Algorithms; and Anne (Gregg) Skeet, senior director of leadership ethics at the Markkula Center for Applied Ethics.
View the full list of speakers, learn more about the symposium, and register via the Eventbrite page.
Registration Opens for the 2024 Midwest Machine Learning Symposium (MMLS'24)
May 20-21, University of Minnesota, This year's Midwest Machine Learning Symposium (MMLS'24) will be held May 20-21 at Graduate Minneapolis on the campus of the University of Minnesota! Registration is now open; please see the MMLS'24 conference website for details. As with all previous editions of MMLS, there is no registration fee. You are encouraged to present your work as a poster during the conference (the registration form has a place to submit a title and abstract).
Please note that there is a limited budget for free student housing. The deadline to apply is April 15, and priority for these spaces will be given to those who register before the deadline.
posit::conf(2024)
August 21-24, Seattle & Limited Online, Whether you’re just starting your data science journey, a skilled professional, or a data science leader, posit::conf(2024) has it all, with four talk tracks, community events, updates on product enhancements, all-day workshops, and keynotes from your favorite data scientists. Learn more and register by May 28 for posit::conf(2024).
ComBEE Python and R Study Group
April 9, 12:00 p.m. - 1:00 p.m., Computational Biology, Ecology, & Evolution (ComBEE) is a group of researchers at UW-Madison interested in computational biology in ecology and evolution. Each workshop has a facilitator to guide discussion and will focus on the use of Python and R. This week will focus on R.
The ComBEE study group meets every two weeks in 4503 Microbial Sciences. Learn more about the community and how to attend the meeting at the ComBEE website.
How Can I Apply ML to My Data? Get Insights at ML+Coffee
April 10, 9:00 a.m. - 11:00 a.m., As part of the ML+X community's monthly coffee event, ML+Coffee, researchers and students with little or no background in machine learning (ML) are invited to join and ask how ML can be applied in their domain of work. ML+Coffee offers a casual and social atmosphere where ML practitioners can problem-solve with one another. Slides can be displayed on a large TV in the event room (room 1145, Discovery Building). If interested, please contact the community's leadership team (ml-community-leaders@g-groups.wisc.edu) with a short description of the problem and dataset. For additional context, check out some of the previous projects discussed at ML+Coffee.
EVIL in Spring 2024
April 19, 10:00 a.m. - 11:00 a.m., The Ethics, Values, Information, and Law (EVIL) reading group pursues scholarship at the intersections of ethics, law, and data and information technologies. The EVIL reading group meets online on Fridays, roughly every three weeks, and is hosted in collaboration with the iSchool and ML+X. Learn more about the community and how to attend the meeting at the EVIL website.
PROFESSIONAL
Psych 755 Instructor
Apply by April 14 – The job posting and application for the Summer 2024 Instructor position for Psych 755 has just opened up! Psych 755: Environments and tools for large-scale behavioral data science is designed to provide students with knowledge and experience conducting large-scale behavioral data science projects, independently and in collaboration with others, using a variety of contemporary software tools and environments.
Brief Descriptions
Psych 755 is a 3-credit course that meets on Tuesdays and Thursdays from 2:00-4:30 p.m. and is a required course for Data Science in Human Behavior master's students.
The course syllabus & assignments have been developed.
After taking this course students will be able to:
- Use online crowd-sourcing platforms for collecting behavioral data, such as Amazon Mechanical Turk, and understand issues of design, sampling, and interpretation associated with such platforms.
- Use integrated tools for conducting, documenting, and publishing complex behavioral data analyses, including Jupyter notebooks and R Markdown.
- Use the GitHub platform to conduct collaborative behavioral data science, including documentation, analysis development, versioning, forking and merging.
- Use SQL database management systems to manage large behavioral datasets.
- Understand how to access and use high-throughput and high-performance infrastructure for computationally expensive jobs, and when use of these platforms is warranted.
Requirements to Apply
- Master's degree in any field (PhD Student with ABD or Master’s Standing)
- Permission from your advisor/supervisor
DATA VISUALIZATION OF THE WEEK
Anson Ho, Tamay Besiroglu, Ege Erdil, David Owen, Robi Rahman, Zifan Carl Guo, David Atkinson, Neil Thompson, and Jaime Sevilla. Algorithmic Progress in Language Models. 12 March 2024.
“Language models have come a long way since 2012, when recurrent networks struggled to form coherent sentences. Our new paper finds that the compute needed to achieve a set performance level has been halving every 5 to 14 months on average. (1/10)” “While algorithmic progress has been rapid, our Shapley value analysis suggests that 60-95% of the performance improvements stem from increased computing power and training data, while novel algorithms account for only 5-40% of the progress. (4/10)”
Data Science Updates is a collaborative effort of the Data Science Institute and Data Science Hub.
Use our submission form to send us your news, events, opportunities and data visualizations for future issues.