Texas A&M and the world: Exploring the networks and impact of Texas A&M’s research
Final Event
Tuesday, April 19th, 6:00pm-8:40pm, ILSB Auditorium, RM 1105 and online. Program and Zoom details (TAMU NetID login required)
TECHNICAL ORIENTATION SESSION
March 1, 2022, 6-8pm Interdisciplinary Life Sciences Building Auditorium (ILSB 1105) & Online (Zoom info will be emailed to registrants)
Program (approximate times)
- 6:00pm Welcome and logistics, Nick Duffield, ECE/TAMIDS
- 6:10pm The competition challenge: networks and impact of Texas A&M’s research, Bruce Herbert, Library
- 6:20pm Bibliometrics & the competition datasets, Dong Joon Lee, Library
- 6:50pm Midpoint graphic prizes; what makes a good visualization? Darren Homrighausen, Statistics
- 7:20pm Introduction to Natural Language Processing, Jian Tao, VIZ/TAMIDS
- 7:50pm Review and discussion, Nick Duffield, ECE/TAMIDS
- 8:00pm Conclusion
Resources
- Experiences from 2019 Competition: LA Metro Bike Share, Josiah Coad, John Deere
- TAMIDS Data Science Primer, Jian Tao
- Introduction to Data Science
- Introduction to Graph Analytics with NetworkX
- Exploratory Data Analysis with pandas and matplotlib
- Introduction to Machine Learning with scikit-learn
- Introduction to Deep Learning with Keras
COMPETITION SETTING
As a Land Grant Institution, a major part of Texas A&M University’s mission is to conduct research that expands human knowledge and creates practical advances that serve the public good. Increasingly, we want our research to address societal grand challenges that have been identified as regional, national, and global priorities, such as energy, sustainability, climate change, human and animal health, poverty, and education. Describing Texas A&M’s research to the public and other stakeholders is a real challenge because of the size and diversity of Texas A&M’s research community.
Further challenging our efforts at communicating with the public is that much of Texas A&M’s research to address complex, emergent problems involves teams drawn from multiple disciplines of research. The National Science Foundation has labelled this type of research as convergent research, which entails research practices that integrate knowledge, methods, and expertise from different disciplines to form novel frameworks to catalyze scientific discovery and innovation.
Competition Challenge
The 2022 Data Science Competition challenges participants to conduct a data-driven study that describes and visualizes Texas A&M research in ways that illuminate the patterns of collaboration across disciplines and help Texas A&M communicate the significance and impact of its research. Participants should consider how they would describe Texas A&M’s research to university leaders, state representatives, funding agencies, or the public, to answer:
- How have patterns of research involving multiple disciplines evolved at Texas A&M?
- What have been the successes for Texas A&M research in solving complex problems, and which new collaborations across disciplines could strengthen Texas A&M’s response to emerging societal challenges?
- Where has Texas A&M research been represented in public discourse that sets priorities for progress, and where can this representation be increased?
Potential Approaches
In the 2022 Data Science Competition, participants will combine bibliometric data concerning research publications from Texas A&M with data concerning wider publications, grants, policy, external partnerships and media to understand the nature and impact of Texas A&M’s research involving multiple disciplines. Competitors will be provided access to multiple data sources, but may also identify and incorporate data from other sources into their solution.
Competitors are encouraged to apply existing and/or develop new metrics that capture research across disciplines represented in publications, based, for example, on organizational affiliation and/or keywords. Other potential metrics may express representation of research in public discourse, for example, in policy deliberations, news and social media. The collaboration graph representing joint publications is a rich resource on which to apply such metrics. Competitors may scope their study, for example, to focus on longitudinal aspects, i.e., metric development over time, and/or cross-sectional aspects, i.e., between different organizations within Texas A&M or in comparison with other universities. Diagnostics that reveal opportunities to enhance latent capacity for research across disciplines are welcome. This problem area presents a rich opportunity to apply multiple techniques from Data Science, including graph analysis, natural language processing, statistical analysis, and recommendation systems, and to develop innovative ways to represent the results of the analysis through visualization.
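As one illustration of a metric on the collaboration graph, the sketch below computes, for each year, the share of co-author pairs that span two departments. It is only a minimal example under stated assumptions: the records, author labels, and department assignments are hypothetical toy data, not the competition datasets, and the function names are invented for illustration. It uses only the Python standard library; a graph library such as NetworkX could serve the same purpose.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records: (year, [(author, department), ...]).
# These are NOT competition data; they only illustrate the idea.
PUBLICATIONS = [
    (2019, [("A", "Statistics"), ("B", "Statistics")]),
    (2019, [("A", "Statistics"), ("C", "Visualization")]),
    (2020, [("B", "Statistics"), ("D", "Library"), ("C", "Visualization")]),
    (2020, [("D", "Library"), ("E", "Library")]),
]


def collaboration_edges(publications):
    """Yield (year, author_a, author_b) for every co-author pair on a paper."""
    for year, authors in publications:
        # Each unordered pair of distinct co-authors is one collaboration edge.
        for left, right in combinations(sorted(set(authors)), 2):
            yield year, left, right


def cross_department_share(publications):
    """Return {year: fraction of co-author pairs spanning two departments}."""
    totals = defaultdict(int)
    crossing = defaultdict(int)
    for year, (_, dept_a), (_, dept_b) in collaboration_edges(publications):
        totals[year] += 1
        if dept_a != dept_b:
            crossing[year] += 1
    # Years with no co-author pairs never enter `totals`, so no division by zero.
    return {year: crossing[year] / totals[year] for year in sorted(totals)}


shares = cross_department_share(PUBLICATIONS)
for year, share in shares.items():
    print(year, round(share, 2))
```

Tracking this share over years gives a simple longitudinal view; computing it separately per department or college would give a cross-sectional one. Counting pairs rather than papers weights large author teams more heavily, which is a design choice a team may want to revisit.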
COMPETITION ORGANIZATION & SCHEDULE
Registration is Closed; Deadline Was: Sunday February 27, 2022
- Registration is now closed. Competitors register using a Google Form (TAMU NetID login required) and acknowledge their understanding and intent to follow the rules of the competition. All registrations will be acknowledged. Registrants should list their teammates. Registrations can be updated to include further teammates. All team members must individually register by 11:59pm February 27, 2022; otherwise they will not be included in the competition.
- Teams, Divisions & Mentors: Students will work in teams of up to 5 members. The competition is split into two divisions: graduate and undergraduate. A team that contains at least one graduate student will be assigned to the graduate division. Teams have the option to have a faculty mentor to provide guidance. Competitors must obtain the agreement of the mentor to serve in this role before listing them in the registration. Competition judges and organizers cannot serve as team mentors.
- Eligibility: The competition is open to graduate and undergraduate students from all majors at Texas A&M University, including Galveston and Qatar campuses. Competitors must be enrolled as a student at Texas A&M University during the Spring 2022 semester.
- Find Teammates through the Competition Slack Channel. Quick start guide: (1) Click on the Slack Channel; (2) Click on “create an account” in top right corner; (3) Under “OR”, enter TAMU NetID and click “Continue”; (4) Click “Confirm Email” in confirmation email; (5) In the browser window that opens, enter your name and chosen password then click on “Create Account”; (6) Open workspace in browser or app. Problems? Email Carlie Payne to be invited to the channel.
- Competition Operation: Data release and competition submission will be conducted through the competition’s Canvas community, into which the organizers will enroll registrants and configure teams.
- Questions or Concerns? Email Carlie Payne.
March 1, 6-8pm: Technical Orientation Session (Hybrid); Data Release; Competition Opens
- Technical Orientation Session: will be held on March 1, 2022, 6-8pm, in person in the ILSB with remote participation via Zoom. The session will present information on conference organization, the competition datasets, the competition context of bibliometrics analysis, and some orientation around approaches to statistical analysis, visualization, and data science project management.
- Data Release: datasets will be released through Canvas and registered teams may commence their analyses.
March and Early April: Office Hours (Online)
- Office Hours: Members of the organizing committee will be on hand to advise competitors on access and use of the competition data resources, and technical issues concerning analysis.
- Sign up here (one signup is good for all team members) for a 30-minute Zoom meeting (tamu.edu authentication required) during these times:
- Mondays, 12-2pm: March 7, 21, 28; April 4.
- Thursdays, 3-5pm: March 3, 10, 24, 31.
March 22, 6-7pm: Midpoint Event (Online): Best Progress Graphic Prizes
- Best Progress Graphic Prize: Teams may optionally submit a one-page summary graphic of their initial work via Canvas by Saturday March 19, 2022, 11:59pm. The graphic must include the team name. The submissions will be displayed and prizes of $250 will be awarded to each of the top three team entries at the online midpoint event on Tuesday March 22, 2022, 6:00pm-7:00pm. Zoom details to be announced.
April 5: Report Submission Deadline
- Report Submission & Format: teams will submit their report through Canvas as a PDF file, maximum 10 pages using 10 pt Arial font with 1 inch margins all round. Teams may submit supplementary materials such as (but not limited to): a code or data repository, a Jupyter notebook, a dashboard, or an app. A rubric will be supplied to registered competitors through Canvas. Reports must be submitted through Canvas by the deadline of April 5, 2022, at 11:59 pm.
April 12: Finalists Announced
- Finalist Selection and Preparation: After the close of the submission period, judges will review all entries and select participants to advance to the final round of the competition. Finalist teams will be notified through email by April 12, 2022. Finalist teams will prepare a 10-minute presentation for in-person delivery of their findings and solutions at the finalist event.
April 19, 6-9pm: Final Event (Hybrid): Presentations and Prize Awards
- Event Format: Finalist teams will deliver their presentations to the judging panel at the final event on April 19, 2022, 6-8pm, held in person in the ILSB with remote participation via Zoom. Judges will review and select the winning teams based on their written report and presentation. The competition winners will be announced at the final event, along with special team prizes. Detailed program and Zoom details to be announced.
Prizes
Graduate Division:
- First Placed Team: $1,500
- Second Placed Team: $1,000
- Third Placed Team: $500
Undergraduate Division:
- First Placed Team: $1,500
- Second Placed Team: $1,000
- Third Placed Team: $500
Special Team Prizes:
- Best presentation design: $500
- Best use of additional data: $500
- Best supplementary materials: $500
Organizing Team
- Nick Duffield (Electrical & Computer Engineering, TAMIDS)
- Bruce Herbert (Library)
- Darren Homrighausen (Statistics)
- Shuiwang Ji (Computer Science and Engineering)
- Dong Joon Lee (Library)
- David Lowe (Library)
- Carlie Payne (TAMIDS)
- Jennifer South (TAMIDS)
- Jian Tao (Visualization, TAMIDS)
Sponsors
TAMIDS gratefully acknowledges support for the 2022 Data Science Competition from Chevron, the Texas A&M Division of Research, the Texas A&M Libraries, and the Texas A&M Departments of Electrical and Computer Engineering, Statistics, and Visualization.