HCDE 530 Assignments

Quiz 1

Due: Thursday, Jan. 11

 

Assignment 1

Due: Thursday, Jan. 18

Interact with the Python interpreter. In these 5 short exercises you only need to type directly into the interpreter; there should be no need for a text editor or other editing program. Each exercise takes only a few lines.

  1. Make the Python interpreter calculate and print 13! (13 factorial).
  2. Make the Python interpreter output "Happy New Year!" using 3 different string variables.
  3. Define three procedures that each return one string: "Happy", "New", and "Year!". In the Python interpreter, execute the three procedures and show what they output.
  4. Write a new procedure using the ones you created in the prior problem. Make your new procedure print "Happy New Year!"
  5. Write a procedure that takes two parameters and adds them together. The procedure should write output that looks like an addition statement. For example, if the procedure was given the values 3 and 4 the output should be something like: "3 + 4 = 7"
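
The exercises above can be typed into the interpreter along the following lines. This is only a rough sketch: the procedure names (happy, new, year, greeting, addition_statement) are made up for illustration, and the procedures return strings that are then printed, which is one of several reasonable ways to produce the output.

```python
import math

def happy():
    return "Happy"

def new():
    return "New"

def year():
    return "Year!"

def greeting():
    """Build the full greeting from the three smaller procedures."""
    return " ".join([happy(), new(), year()])

def addition_statement(a, b):
    """Format an addition statement such as '3 + 4 = 7'."""
    return "{} + {} = {}".format(a, b, a + b)

print(math.factorial(13))        # 13! = 6227020800
print(greeting())                # Happy New Year!
print(addition_statement(3, 4))  # 3 + 4 = 7
```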

Submit the output of your Python interpreter session, similar to what we have seen in class. All submissions are through the course Canvas site.

 

Course Project - Team Membership (Project Team)

Due: Thursday, Jan. 18

Create a group of up to four (4) people who will work as a team on the course project.

Submit to the instructor (either on paper or by email) the full name and UWNetID of each person on your team.

 

Quiz 2

Due: Monday, Jan. 22

 

Assignment 2

Due: Thursday, Jan. 25

Write 4 short python programs.

  1. Write a procedure that accepts one parameter (count), generates (count) random integers between 0 and 1000, puts those integers into a list, and returns the list.
  2. Write a procedure called "no_5xx" which searches a list (like the one from #1 above) and removes any integer value in the range 500 to 599, and returns the resulting list.
  3. Write a procedure that takes four parameters (lastname, firstname, score, grade) and returns a new dictionary item with those four items.
  4. Write a procedure called "update_lastname" that takes two parameters (a dictionary, like the one from #3 above, and a string value) and updates the value for the "lastname" key in the dictionary.
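
One possible shape for these four procedures is sketched below. The names random_list and make_record are hypothetical (only no_5xx and update_lastname are named in the assignment), "between 0 and 1000" is assumed to include both endpoints, and the dictionary keys are assumed to match the parameter names.

```python
import random

def random_list(count):
    """Return a list of `count` random integers from 0 to 1000 (inclusive, by assumption)."""
    return [random.randint(0, 1000) for _ in range(count)]

def no_5xx(values):
    """Return a new list with every value in the range 500 to 599 removed."""
    return [v for v in values if not 500 <= v <= 599]

def make_record(lastname, firstname, score, grade):
    """Return a dictionary holding the four values, keyed by parameter name."""
    return {
        "lastname": lastname,
        "firstname": firstname,
        "score": score,
        "grade": grade,
    }

def update_lastname(record, new_lastname):
    """Replace the value stored under the "lastname" key and return the record."""
    record["lastname"] = new_lastname
    return record

numbers = random_list(10)
print(numbers)
print(no_5xx(numbers))
```

Building a new list in no_5xx avoids removing items from a list while iterating over it, which can silently skip elements.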

Submit a sample run of your code for each program, similar to what we have seen in class. An edited transcript of your Python interpreter session is fine. All submissions are through the course Canvas site.

 

Assignment 3

Due: Monday, Jan. 29

Write 2 short python programs.

  1. Write a short program that reads input lines from standard input and writes each line to a file, prepending a three-digit line number and a colon. If the user enters two blank lines, stop writing lines and close the file.
  2. Subclass the "Person" object to create a "Faculty" object type. Faculty should have a field called "rank", which can be one of "lecturer", "assistant", "associate" or "full", and a boolean value for whether the faculty member is tenured or not. Modify __repr__ to do the right thing and create set/get methods for Faculty-specific fields.
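
A rough sketch of both programs follows, under several assumptions. The Person class here is a hypothetical stand-in, since the class from lecture is not reproduced on this page. "Two blank lines" is read as two consecutive blank lines, with the first one still written to the file. The names number_lines, get_rank, set_rank, get_tenured, and set_tenured are illustrative only.

```python
import io

def number_lines(lines, out):
    """Copy lines to `out`, prefixing each with a three-digit number and a colon.

    Stops at the second of two consecutive blank lines (an assumed reading of
    the stopping rule). Returns the number of lines written.
    """
    written = 0
    blank_run = 0
    for line in lines:
        text = line.rstrip("\n")
        blank_run = blank_run + 1 if text == "" else 0
        if blank_run == 2:
            break
        written += 1
        out.write("{:03d}:{}\n".format(written, text))
    return written

class Person(object):
    """Hypothetical stand-in for the Person class used in lecture."""
    def __init__(self, lastname, firstname):
        self.lastname = lastname
        self.firstname = firstname

    def __repr__(self):
        return "Person({!r}, {!r})".format(self.lastname, self.firstname)

class Faculty(Person):
    RANKS = ("lecturer", "assistant", "associate", "full")

    def __init__(self, lastname, firstname, rank, tenured=False):
        super().__init__(lastname, firstname)
        self.set_rank(rank)
        self.set_tenured(tenured)

    def get_rank(self):
        return self.rank

    def set_rank(self, rank):
        # Reject anything outside the four ranks listed in the assignment.
        if rank not in self.RANKS:
            raise ValueError("unknown rank: {!r}".format(rank))
        self.rank = rank

    def get_tenured(self):
        return self.tenured

    def set_tenured(self, tenured):
        self.tenured = bool(tenured)

    def __repr__(self):
        return "Faculty({!r}, {!r}, rank={!r}, tenured={!r})".format(
            self.lastname, self.firstname, self.rank, self.tenured)

# In-memory demonstration; a real run would pass sys.stdin and an open file.
buffer = io.StringIO()
number_lines(["first\n", "\n", "second\n", "\n", "\n", "ignored\n"], buffer)
print(buffer.getvalue())
print(Faculty("Doe", "Jane", "associate", tenured=True))
```

For the actual assignment, the same function could be called as number_lines(sys.stdin, out) inside a "with open(...) as out:" block, so the file is closed when input stops.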

Submit a sample run of your code for each program, similar to what we have seen in class. An edited transcript of your Python interpreter session is fine. All submissions are through the course Canvas site.

 

Quiz 3

Due: Monday, Jan. 29

 

Assignment 4

Due: Thursday, Feb. 1

Set up the HCDE Python User Module

  1. Download a copy of the hcde user module, unzip and copy the whole directory into your python code directory.
  2. Install the following modules; most can be installed using easy_install or pip:
    • requests
    • oauthlib
    • requests_oauthlib
    • networkx
    • nltk
  3. Run Login.py twice to authorize the Twitter apps HCDE530Test01 and HCDE530Test02 with your own Twitter user account. Login.py is found in the hcde/twitter directory of the hcde Python user module. You can do this in the Python interpreter or from the command line.
  4. After running the login, run Search.py with a query term of your choosing. Search.py is found in the hcde/twitter directory of the hcde python user module. You can do this in the python interpreter or from the command line.
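
Before running Login.py, it can help to confirm that the modules from step 2 are visible to your interpreter. The following is a small sketch, assuming Python 3; the helper name missing_modules is made up for illustration.

```python
import importlib.util

# Module names copied from step 2 of the setup list.
REQUIRED = ["requests", "oauthlib", "requests_oauthlib", "networkx", "nltk"]

def missing_modules(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]

print(missing_modules(REQUIRED))
```

An empty list means every module was found; any name that is printed still needs to be installed.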

Submit:
(1) The text of your command line session or python interpreter session showing that HCDE530Test01 and HCDE530Test02 are authorized.
(2) The query term that you used with Search.py and 4 example tweets that were returned from the search.
Note: All submissions are through the course canvas site.

 

Course Project Proposal (Project Team)

Due: Thursday, Feb. 1

Write a proposal to collect, analyze and present some Twitter data. You can choose to collect data of your own, or propose an analysis and presentation of one of two datasets provided in class (Election 2016 or Oscar 2016). Your proposal should describe:

  1. The Motivation - You should provide a motivation for the question you are trying to address. The motivation might be supported by answering questions like:
    • Why is this an interesting question?
    • Who would care about answering this question?
    • What prior work has been done that relates to this problem/question? You should identify (and fully cite) at least 6 pieces of prior research that are related to your problem/question - and explain how/why the work is related.
  2. The Data - You should be clear about what data you are collecting. If you are using one of the existing datasets, you should know what data is there and how you will access it from the existing database.
    • What are you collecting?
    • What entities/terms/features need to be extracted?
    • How will you store the data? Do you need to keep it all, or just some of the metadata?
  3. The Analysis - You should clearly describe how you will analyze the data.
    • How will this data be analyzed? Will you use descriptive statistics, correlation, regression?
    • How are you using the entities/terms/features?
  4. The Presentation
    • How will the results be presented?

Proposals should be a maximum of 4 pages in 12pt Times Roman (or equivalent) font. You should clearly address each of the four main areas (i.e., Motivation, Data, Analysis, Presentation). The questions above are only to help you think about what belongs in each of the sections.

Submit your written proposal. Only 1 submission is needed from each group. Please make sure all team members' names are on the submission. If more than one person from your group submits the proposal, then I will take the last one submitted prior to the deadline.
Note: All submissions are through the course canvas site.

 

Quiz 4

Due: Monday, Feb. 5

 

Assignment 5

Due: Monday, Feb. 5

Database Connection and Data Access

  1. Install a local copy of the MySQL database.
  2. Install the following modules; most can be installed using easy_install or pip:
    • pymysql
    • sqlalchemy
  3. Run find_hashtags.py found in the oscar_2016 directory (i.e., hcde/data/oscar_2016) picking one of these dates:
    • January 25 2016
    • February 10 2016
    • February 26 2016
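
To make the idea of a hashtag frequency count concrete, here is a minimal sketch that counts hashtags in a list of tweet texts. It is not the course's find_hashtags.py, which reads from the database. The function name top_hashtags, the simple "#" plus word-characters pattern, and the case-folding are all assumptions made for illustration.

```python
import re
from collections import Counter

# Simplified pattern: a '#' followed by letters, digits, or underscores.
HASHTAG = re.compile(r"#\w+")

def top_hashtags(texts, n=5):
    """Return the n most frequent hashtags as (hashtag, count) pairs."""
    counts = Counter(
        tag.lower() for text in texts for tag in HASHTAG.findall(text)
    )
    return counts.most_common(n)

sample = ["#Oscars so white #oscars", "I love #Oscars", "#redcarpet"]
print(top_hashtags(sample, 2))
```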

Submit:
(1) The text of your python interpreter session showing that you can import pymysql and sqlalchemy modules.
(2) A list of the top 5 hashtags found and the frequency count for each hashtag.
Note: All submissions are through the course canvas site.

 

Quiz 5

Due: Monday, Feb. 12

 

Assignment 6

Due: Thursday, Feb. 22

Tweet Text Analysis - Jaccard Distance
Using the samples from class and from the book, write a Python script that calculates the Jaccard distance between all of the tweets for two different days. Your code should produce both the Jaccard distance for each pair of days and the cardinality of the term intersection (i.e., the total number of terms that occur on both days). When calculating the Jaccard distance and term intersection, your script should use at most the top 3000 terms. Further, your script should not consider any terms that occur only once in a given day.
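
The calculation described above can be sketched as follows, with several simplifying assumptions: each day's tweets are already loaded as a list of strings, terms are produced by lowercasing and splitting on whitespace (the class samples may tokenize differently), and the function names are hypothetical. The Jaccard distance between term sets A and B is 1 - |A ∩ B| / |A ∪ B|.

```python
from collections import Counter

def top_terms(tweets, limit=3000, min_count=2):
    """Return the set of up to `limit` most frequent terms for one day,
    ignoring terms that occur fewer than `min_count` times that day."""
    counts = Counter(term for text in tweets for term in text.lower().split())
    return {term for term, n in counts.most_common(limit) if n >= min_count}

def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty sets are treated as distance 0."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - float(len(a & b)) / len(union)

def compare_days(previous_tweets, current_tweets):
    """Return (jaccard distance, size of the term intersection) for two days."""
    prev_terms = top_terms(previous_tweets)
    cur_terms = top_terms(current_tweets)
    return jaccard_distance(prev_terms, cur_terms), len(prev_terms & cur_terms)

day1 = ["a b c", "a b d"]       # terms kept: a, b
day2 = ["b e", "b e a", "a"]    # terms kept: a, b, e
print(compare_days(day1, day2))
```

Because most_common returns terms in descending order of count, taking the top 3000 and then dropping single-occurrence terms gives the same set as filtering first and truncating afterwards.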

Run your script for one of the weeks indicated for both course datasets, producing 7 pairwise comparisons each week. Use the output of your script to produce a chart showing the distance computed for each day. You should assume that the values for a given day represent the distance and term intersection relative to the prior day. That is, if you are calculating the distance and intersection for Sunday February 21, 2016, then you would need to have the terms for Saturday February 20, 2016.

  1. Oscar 2016 dataset
    • January 31 through February 6, 2016
    • February 7 through February 13, 2016
    • February 21 through February 27, 2016
  2. Election 2016 dataset
    • February 3 through February 9, 2016 (note: a GOP debate occurred on February 6th)
    • March 3 through March 9, 2016 (note: a DNC debate occurred on March 6th)
    • October 6 through October 12, 2016 (note: a presidential debate occurred on October 9th)

Submit:
(1) The python script/code that you used to generate the Jaccard distance for each day.
(2) Two jpg, png, or pdf charts showing the total tweets for each day, the count of the term intersection, and the Jaccard distance for the weeks you choose in each dataset.
Note: All submissions are through the course canvas site.

 

Assignment 7

Due: Thursday, Mar. 1

Tweet Network Analysis - 2-Mode Hashtag Network
Using the samples from class and from the book, write a python script that builds a 2-mode network linking users and hashtags. That is, the social network will have user nodes and hashtag nodes. Edges in the network will only link users to the hashtags that they use. Make sure your python script will clean the network of any singleton user-hashtag pairs. Your script should save/write a file that is either GraphML (for use with Gephi) or JSON (for use with D3).
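
As a rough illustration of the structure described above, the sketch below builds a 2-mode edge list from (user, hashtag) pairs using only the standard library, rather than the networkx module used in class. It assumes that a "singleton user-hashtag pair" means a user and a hashtag connected only to each other; check that reading against the class examples. The function names and the node-link JSON layout are illustrative choices, not a required format.

```python
import json
from collections import Counter

def build_edges(pairs):
    """Count how often each (user, hashtag) pair occurs; hashtags are case-folded."""
    return Counter((user, tag.lower()) for user, tag in pairs)

def drop_singleton_pairs(edges):
    """Remove edges whose user and hashtag have no other connections (assumed definition)."""
    user_degree = Counter(user for user, _ in edges)
    tag_degree = Counter(tag for _, tag in edges)
    return {
        (user, tag): weight
        for (user, tag), weight in edges.items()
        if user_degree[user] > 1 or tag_degree[tag] > 1
    }

def to_node_link(edges):
    """Return a nodes/links dictionary that can be dumped to JSON."""
    users = sorted({user for user, _ in edges})
    tags = sorted({tag for _, tag in edges})
    # Prefix ids so a user and a hashtag with the same text stay distinct.
    nodes = [{"id": "user:" + u, "mode": "user"} for u in users]
    nodes += [{"id": "tag:" + t, "mode": "hashtag"} for t in tags]
    links = [
        {"source": "user:" + u, "target": "tag:" + t, "weight": w}
        for (u, t), w in sorted(edges.items())
    ]
    return {"nodes": nodes, "links": links}

pairs = [("ann", "#vote"), ("bob", "#vote"), ("cat", "#lonely"), ("ann", "#Vote")]
graph = to_node_link(drop_singleton_pairs(build_edges(pairs)))
print(json.dumps(graph, indent=2))
```

Every link joins a user node to a hashtag node, which is what makes the network 2-mode; there are no user-to-user or hashtag-to-hashtag edges.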

Run your script for one of the following days in the Election 2016 dataset. Use the output of your script to create a visual presentation of the graph data. You may use either Gephi (as demonstrated in class) or D3 (based on the examples in the book).

  1. Election 2016 dataset
    • February 6, 2016
    • March 6, 2016
    • October 9, 2016
    • November 7, 2016

Submit:
(1) The python script/code that you used to generate the 2-mode network.
(2) A jpg, png, or pdf graph of your data from Gephi or D3.
Note: All submissions are through the course canvas site.

 

Course Project Presentation (Project Team)

In Class: Thursday, Mar. 8

A brief 10-minute presentation (plus 5 minutes of questions) on your project.

Submit:
The slides or visual aids for your presentation - before class.
Note: All submissions are through the course canvas site.

 

Course Project Write-up (Project Team)

Due: Monday, Mar. 12

Written report of your project. Please submit one copy of your project report for the entire team. Please make sure all team members' names are on the report.
Note: All submissions are through the course canvas site.

 

© 2018 David W. McDonald