Due: Thursday, Jan. 11
Due: Thursday, Jan. 18
Interact with the Python interpreter. For these 5 short exercises you only need to type directly into the interpreter; there should be no need for a text editor or other editing program. Each exercise is only a few lines.
Submit the output of your Python interpreter session, similar to what we have seen in class. All submissions are through the course canvas site.
Due: Thursday, Jan. 18
Create a group of up to four (4) people who will work as a team on the course project.
Submit to the instructor (either on paper or by email) the full name and UWNetID of each person on your team.
Due: Monday, Jan. 22
Due: Thursday, Jan. 25
Write 4 short Python programs.
Submit a sample run of your code for each program, similar to what we have seen in class. An edited transcript of your Python interpreter session is fine. All submissions are through the course canvas site.
Due: Monday, Jan. 29
Write 2 short Python programs.
Submit a sample run of your code for each program, similar to what we have seen in class. An edited transcript of your Python interpreter session is fine. All submissions are through the course canvas site.
Due: Monday, Jan. 29
Due: Thursday, Feb. 1
Setup HCDE Python User Module
Submit:
(1) The text of your command line session or Python interpreter session showing that HCDE530Test01 and HCDE530Test02 are authorized.
(2) The query term that you used with Search.py and 4 example tweets that were returned from the search.
Note: All submissions are through the course canvas site.
Due: Thursday, Feb. 1
Write a proposal to collect, analyze and present some Twitter data. You can choose to collect data of your own, or propose an analysis and presentation of one of two datasets provided in class (Election 2016 or Oscar 2016). Your proposal should describe:
Proposals should be a maximum of 4 pages in 12pt Times Roman (or equivalent) font. You should clearly address each of the four main areas (i.e., Motivation, Data, Analysis, Presentation). The questions above are only to help you think about what belongs in each of the sections.
Submit your written proposal. Only 1 submission is needed from each group. Please make sure all team members' names are on the submission. If more than one person from your group submits the proposal, I will take the last one submitted prior to the deadline.
Note: All submissions are through the course canvas site.
Due: Monday, Feb. 5
Due: Monday, Feb. 5
Database Connection and Data Access
Submit:
(1) The text of your Python interpreter session showing that you can import the pymysql and sqlalchemy modules.
(2) A list of the top 5 hashtags found and the frequency count for each hashtag.
Note: All submissions are through the course canvas site.
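The hashtag count in item (2) can be sketched as follows. This is only a minimal illustration, assuming the tweet texts have already been read from the database into a plain list; the function name top_hashtags, the regular expression, and the sample texts are hypothetical and not part of the course code.

```python
import re
from collections import Counter

# Assumption: a hashtag is "#" followed by letters, digits, or underscores.
HASHTAG = re.compile(r"#(\w+)")

def top_hashtags(texts, n=5):
    """Return the n most frequent hashtags as (hashtag, count) pairs."""
    counts = Counter()
    for text in texts:
        # Lowercase so that #Oscars and #oscars are counted together.
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts.most_common(n)

# Hypothetical sample data standing in for rows fetched from the database.
sample_texts = [
    "#Oscars tonight",
    "Watching the #oscars #redcarpet",
    "#redcarpet looks",
    "#Oscars2016 #oscars",
]

top = top_hashtags(sample_texts)
for tag, count in top:
    print(tag, count)
```

With real data, the list of texts would come from your database query rather than from a hard-coded sample.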
Due: Monday, Feb. 12
Due: Thursday, Feb. 22
Tweet Text Analysis - Jaccard Distance
Using the samples from class and from the book, write a Python script that calculates the Jaccard distance between all of the tweets for two different days. Your code should produce both the Jaccard distance for each pair of days and the cardinality of the term intersection (i.e., the total number of terms that occur on both days). When calculating the Jaccard distance and term intersection, your script should use at most the top 3000 terms. Further, your script should not consider any terms that occur only once in a given day.
Run your script for one of the weeks indicated for both course datasets, producing 7 pairwise comparisons each week. Use the output of your script to produce a chart showing the distance computed for each day. You should assume that the values for a given day represent the distance and term intersection relative to the prior day. That is, if you are calculating the distance and intersection for Sunday February 21, 2016, then you would need to have the terms for Saturday February 20, 2016.
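As a rough illustration of the calculation described above, the following sketch computes the Jaccard distance and the size of the term intersection for two days. It is a sketch under stated assumptions, not the class sample: terms are taken to be lowercased, whitespace-separated tokens, two empty term sets are given a distance of 0.0, and the names day_terms and jaccard_distance and the sample tweets are hypothetical.

```python
from collections import Counter

def day_terms(tweets, max_terms=3000, min_count=2):
    """Return the set of the most frequent terms in one day of tweets."""
    counts = Counter()
    for text in tweets:
        # Assumption: simple lowercase, whitespace tokenization.
        counts.update(text.lower().split())
    # Drop terms that occur only once, then keep at most the top max_terms.
    frequent = [(term, n) for term, n in counts.most_common() if n >= min_count]
    return {term for term, _ in frequent[:max_terms]}

def jaccard_distance(terms_a, terms_b):
    """Return (Jaccard distance, size of the term intersection)."""
    intersection = terms_a & terms_b
    union = terms_a | terms_b
    if not union:
        # Assumption: two empty term sets are treated as identical.
        return 0.0, 0
    return 1.0 - len(intersection) / len(union), len(intersection)

# Hypothetical sample tweets for two consecutive days.
saturday = ["vote today", "vote early", "polls open today"]
sunday = ["vote counted", "vote results", "results today today"]

distance, shared = jaccard_distance(day_terms(saturday), day_terms(sunday))
print(distance, shared)
```

Here the Saturday terms are {vote, today} and the Sunday terms are {vote, results, today}, so the intersection has 2 terms, the union has 3, and the distance is 1 - 2/3.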
Submit:
(1) The Python script/code that you used to generate the Jaccard distance for each day.
(2) Two jpg, png, or pdf charts (one per dataset) showing the total tweets for each day, the count of the term intersection, and the Jaccard distance for the week you chose in each dataset.
Note: All submissions are through the course canvas site.
Due: Thursday, Mar. 1
Tweet Network Analysis - 2-Mode Hashtag Network
Using the samples from class and from the book, write a Python script that builds a 2-mode network linking users and hashtags. That is, the social network will have user nodes and hashtag nodes. Edges in the network will only link users to the hashtags that they use. Make sure your Python script cleans the network of any singleton user-hashtag pairs. Your script should save/write a file that is either GraphML (for use with Gephi) or JSON (for use with D3).
Run your script for one of the following days in the Election 2016 dataset. Use the output of your script to create a visual presentation of the graph data. You may use either Gephi (as demonstrated in class) or D3 (based on the examples in the book).
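The network construction described above might be sketched as follows, using only the standard library and producing a JSON node-link structure of the kind commonly used with D3. This is a sketch under assumptions, not the class sample: a "singleton" pair is taken to mean a user-hashtag pair that occurs only once, the input is assumed to be (user, text) tuples, and the name build_two_mode, the "u:" prefix for user ids, and the sample data are hypothetical.

```python
import json
import re
from collections import Counter

# Assumption: a hashtag is "#" followed by letters, digits, or underscores.
HASHTAG = re.compile(r"#(\w+)")

def build_two_mode(tweets, min_uses=2):
    """Build a user-hashtag network from (user, text) pairs."""
    pair_counts = Counter()
    for user, text in tweets:
        for tag in HASHTAG.findall(text):
            pair_counts[(user, tag.lower())] += 1
    # Assumption: a singleton is a pair seen only once; drop those edges.
    links = [
        {"source": "u:" + user, "target": "#" + tag, "weight": n}
        for (user, tag), n in sorted(pair_counts.items())
        if n >= min_uses
    ]
    # Keep only nodes that still have at least one edge.
    node_ids = sorted({l["source"] for l in links} | {l["target"] for l in links})
    nodes = [
        {"id": nid, "mode": "hashtag" if nid.startswith("#") else "user"}
        for nid in node_ids
    ]
    return {"nodes": nodes, "links": links}

# Hypothetical sample data.
sample = [
    ("alice", "Go #vote"),
    ("alice", "#Vote now"),
    ("bob", "#vote"),
    ("bob", "#debate tonight"),
    ("bob", "more #debate"),
]

graph = build_two_mode(sample)
graph_json = json.dumps(graph, indent=2)
print(graph_json)
```

In this sample the bob/#vote pair occurs only once and is removed, leaving two edges. To save the result for D3, the same dictionary can be written to a file with json.dump.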
Submit:
(1) The Python script/code that you used to generate the 2-mode network.
(2) A jpg, png, or pdf graph of your data from Gephi or D3.
Note: All submissions are through the course canvas site.
In Class: Thursday, Mar. 8
Brief 10-minute presentation (+5 minutes of questions) on your project.
Submit:
The slides or visual aids for your presentation - before class.
Note: All submissions are through the course canvas site.
Due: Monday, Mar. 12
Written report of your project. Please submit one copy of your project report for the entire team. Please make sure all team members' names are on the report.
Note: All submissions are through the course canvas site.