[GISA] IRB application

When & Where?

Inside The Application

– Protocol

  • Protocol Type:
    • Exempt: surveys or interviews that do not collect sensitive data (names, pictures, or other identifiable information)
    • Expedited: when using more sensitive data
  • Principal investigator & lead unit
    • Advisor or faculty member and his or her affiliation
  • Participant type
    • Total: general population in the US
    • Transnational: when open to people worldwide

– Questionnaire

  • Benefit #23301: not monetary benefits, but qualitative potential benefits for participants

– Attachment

  • survey, pamphlet, poster, interview guides, sample questions, and so on.

Other Helpful Resources


[2015 Spring, Complex System Seminar] Game theory

Definition of “Game Theory”

  • “… the study of mathematical models of conflict and cooperation between intelligent rational decision-makers.” ([1])
  • originated as a subfield of microeconomics and applied mathematics

Definition of “Game”

  • “In the language of game theory, a game refers to any social situation involving two or more individuals. The individuals involved in a game may be called the players.”([1])
  • Assumptions about players
    • rational: A player is rational if he or she makes decisions consistently in pursuit of his or her own objectives (frequently, the maximization of his or her utility).
    • intelligent: A player is intelligent if he or she knows everything that we know about the game and can make any inference about the situation that we can make.

Applications of Game Theory

  • Industrial organization (and firm behavior): analyzing cooperation (e.g. cartels) and competition between firms
  • Auction theory: from the viewpoint of the auctioneer and the auction participants, e.g. Google auctions, Yahoo auctions, Sotheby's, eBay and so on.
  • Contract theory: Employer vs. Employee / Consumer vs. Producer
  • Evolutionary biology
  • Political science: international relations, political parties
  • Public policy: tragedy of the commons, welfare policy design

List of Games

Why Do People Cooperate?

1. Kin selection

  • When an agent's self-sacrificing behavior contributes more to the spread of its genes than it costs the agent itself, the agent chooses to sacrifice. ([2], [3])
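
The condition above is usually written as Hamilton's rule, rB > C, where r is the genetic relatedness between actor and recipient, B is the benefit to the recipient, and C is the cost to the actor. Below is a minimal sketch of that inequality; the function name and the example numbers are illustrative assumptions, not values taken from [3].

```python
def hamilton_rule_favors_altruism(relatedness, benefit, cost):
    """Return True when r * B > C, i.e. the sacrificing act is favored."""
    return relatedness * benefit > cost

# r = 0.5 is the usual value for full siblings (diploid assumption)
print(hamilton_rule_favors_altruism(0.5, benefit=3.0, cost=1.0))    # True: 1.5 > 1.0
# r = 0.125 is the usual value for first cousins
print(hamilton_rule_favors_altruism(0.125, benefit=3.0, cost=1.0))  # False: 0.375 < 1.0
```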

2. Indirect reciprocity

  • If each player decides whether to help someone or not based on the recipient’s image accumulated through previous altruistic behaviors, altruistic behavior becomes dominant. ([4])

3. Direct reciprocity

  • Repeated PD (Prisoner's Dilemma) game
    • Tit-For-Tat: Cooperate in the first round, then copy your partner's previous move. ([5])
    • win-stay, lose-shift: If your previous move earned you a high payoff against your partner's move, keep it. Otherwise, change it. ([6])
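
The two strategies above can be sketched as a small repeated-PD simulation. This is only an illustration under assumptions: the payoff values (T=5, R=3, P=1, S=0), the "high payoff" threshold for win-stay, lose-shift, the opening move of each strategy, and all function names are choices made here, not taken from [5] or [6]. The code is written for Python 3.

```python
# Payoff to me for (my_move, partner_move); assumed values T=5, R=3, P=1, S=0
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def tit_for_tat(my_last, partner_last, my_last_payoff):
    """Cooperate first, then copy the partner's previous move."""
    return "C" if partner_last is None else partner_last

def win_stay_lose_shift(my_last, partner_last, my_last_payoff):
    """Keep the last move after a high payoff (3 or 5), otherwise switch."""
    if my_last is None:
        return "C"
    if my_last_payoff >= 3:
        return my_last
    return "D" if my_last == "C" else "C"

def always_defect(my_last, partner_last, my_last_payoff):
    """Baseline opponent that never cooperates."""
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    """Play a repeated PD; return (moves_a, moves_b, total_a, total_b)."""
    last_a = last_b = pay_a = pay_b = None
    moves_a, moves_b, total_a, total_b = [], [], 0, 0
    for _ in range(rounds):
        a = strategy_a(last_a, last_b, pay_a)
        b = strategy_b(last_b, last_a, pay_b)
        pay_a, pay_b = PAYOFF[(a, b)], PAYOFF[(b, a)]
        total_a += pay_a
        total_b += pay_b
        moves_a.append(a)
        moves_b.append(b)
        last_a, last_b = a, b
    return "".join(moves_a), "".join(moves_b), total_a, total_b

print(play(tit_for_tat, always_defect, rounds=6))          # ('CDDDDD', 'DDDDDD', 5, 10)
print(play(win_stay_lose_shift, always_defect, rounds=6))  # ('CDCDCD', 'DDDDDD', 3, 18)
print(play(tit_for_tat, win_stay_lose_shift, rounds=6))    # ('CCCCCC', 'CCCCCC', 18, 18)
```

In this toy setting Tit-For-Tat is exploited only once by a pure defector, while win-stay, lose-shift keeps returning to cooperation; the two strategies cooperate fully with each other.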

4. Costly signaling ([7])

  • Group members have a personal characteristic, which we will call quality, that can either be high or low.
  • Each individual has occasion to enter into a profitable alliance (e.g. mating or political coalition) with any one of the other group members.

5. Altruistic punishment ([8])

  • If individuals can punish free riders in their group, cooperation flourishes, even though the punishment is costly and yields no material gain to the punisher.

6. Evolution of Social Network ([9])

  • If a cooperator pays the required cost, all of its neighbors in the network receive a benefit.
  • In every turn, one randomly chosen player dies.
  • The strategy (cooperate or defect) of the new player that fills the vacant position depends on the accumulated benefits of all the neighbors of that position.
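
The update described above can be sketched as a single "death-birth" step on a small network. This is a rough illustration, not the model of [9] as published: the fitness form (1 - w + w * payoff with a small w), the ring network, the parameter values, and all names are assumptions made here. The code is written for Python 3.

```python
import random

def death_birth_step(graph, strategy, benefit, cost, rng, selection=0.1):
    """Kill one random player and refill the spot from its neighbors.

    Returns the index of the replaced node; `strategy` is updated in place.
    """
    def payoff(node):
        # A cooperator pays `cost` for each neighbor; every cooperating
        # neighbor provides `benefit` to this node.
        received = benefit * sum(1 for n in graph[node] if strategy[n] == "C")
        paid = cost * len(graph[node]) if strategy[node] == "C" else 0.0
        return received - paid

    dead = rng.choice(sorted(graph))
    neighbors = graph[dead]
    # Assumed weak-selection fitness; stays positive for the parameters below
    fitness = [1.0 - selection + selection * payoff(n) for n in neighbors]
    parent = rng.choices(neighbors, weights=fitness, k=1)[0]
    strategy[dead] = strategy[parent]
    return dead

# Hypothetical example: a ring of 6 players, half cooperators, half defectors
n = 6
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
strategy = {i: "C" if i < 3 else "D" for i in range(n)}
rng = random.Random(0)
for _ in range(20):
    death_birth_step(ring, strategy, benefit=3.0, cost=1.0, rng=rng)
print(strategy)
```

The "simple rule" in the title of [9] is that cooperation is favored when the benefit-to-cost ratio exceeds the average number of neighbors (b/c > k); here b/c = 3 and k = 2.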

7. Static Network ([10])

  • If a social network is static, the cooperative strategy becomes more stable.
  • “We find that people cooperate at high stable levels, as long as the benefits created by cooperation are larger than the number of neighbors in the network.”


[1] Myerson, Roger B. Game theory. Harvard University Press, 2013.
[2] http://en.wikipedia.org/wiki/Kin_selection
[3] Hamilton, William D. “The genetical evolution of social behaviour. II.” Journal of Theoretical Biology 7.1 (1964): 17-52.
[4] Nowak, Martin A., and Karl Sigmund. “Evolution of indirect reciprocity by image scoring.” Nature 393.6685 (1998): 573-577.
[5] Axelrod, Robert, and William D. Hamilton. “The evolution of cooperation.” Science 211.4489 (1981): 1390-1396.
[6] Nowak, Martin, and Karl Sigmund. “A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner’s Dilemma game.” Nature 364.6432 (1993): 56-58.
[7] Gintis, Herbert, Eric Alden Smith, and Samuel Bowles. “Costly signaling and cooperation.” Journal of Theoretical Biology 213.1 (2001): 103-119.
[8] Fehr, Ernst, and Simon Gächter. “Altruistic punishment in humans.” Nature 415.6868 (2002): 137-140.
[9] Ohtsuki, Hisashi, et al. “A simple rule for the evolution of cooperation on graphs and social networks.” Nature 441.7092 (2006): 502-505.
[10] Rand, David G., et al. “Static network structure can stabilize human cooperation.” Proceedings of the National Academy of Sciences 111.48 (2014): 17093-17098.


[Fashion] Related Articles

1. How to be a model agent by DAZED magazine

Carole White, a founder of the Premier fashion agency, talks about running an agency and the lives of fashion models. The following passage from the article looks worth keeping as a reference.

“Now we’re in a social media era. The whole business has changed so much in the last five years. It’s changing how advertising is done; it’s changing how we evaluate how much a job is worth. Before it used to be how many posters and billboards are there, but that’s not the crucial element anymore. Followers have become a currency and agents around the world have been slow to click onto that.”

2. How Instagram is Changing Fashion Week by HYPEBEAST


[2015 Spring AI: Mining the Social Web] Working for Your Project!

  • Course material for March 3 and 5.


For the next two classes, working on your project

  1. Have a seat with your group members.
  2. Encourage your group members not to skip the classes.
  3. (5 to 10 min) Set a specific goal for each class; I will ask for it at the beginning of each class.
    • e.g. “We will finish constructing a Pandas table with our dataset.”, “Our goal is to collect Yelp data from the last 3 months.”, “We’ll make a presentation file.”
  4. (5 to 10 min) At the end of each class, evaluate what you achieved against the goals you set. Briefly explain the challenges and the major achievements.

Expected elements in your upcoming project presentation

  • What is the dataset you would use? (not only website, but also specific attributes.)
  • What is the basic question you try to answer with this dataset? (Being specific is better.) Why is the question important?
  • Why is your dataset (or attributes of the dataset) good to answer your question?
  • What techniques (in machine learning) will you use to answer your question?
  • Why is the technique better compared to other possible candidate techniques?
  • Are there any similar attempts? If so, what is the advantage of your approach over them?
  • How will you present your result? (e.g. predicting something, clustering something)

[2015 Spring AI: Mining the Social Web] TF-IDF (2) & Sentiment Analysis


1. TF-IDF

– Open the Data File, then Build the Word Corpus and Tweet Dictionary

In [2]:
# Mostly similar to Example 4-9, "Querying Google+ data with TF-IDF",
# in our textbook "Mining the Social Web", 4.4.2 Applying TF-IDF to Human Languages

data = "oscar_tweets.txt"
tweet_dictionary = {}  # tweet index -> lowercased tweet text
words_corpus = []      # tweet index -> list of whitespace-separated tokens
i = 0
with open(data) as f:
    for line in f:
        # Skip blank lines
        if len(line.strip().split()) != 0:
            tweet_dictionary[i] = line.lower()
            words_corpus.append(line.lower().split())
            i += 1
print(tweet_dictionary[1])
print(words_corpus[1])
rt @dory: when you're washing the dishes at 7:15 but you remember you gotta be at the oscars by 7:30 http://t.co/27faqodhpm

['rt', '@dory:', 'when', "you're", 'washing', 'the', 'dishes', 'at', '7:15', 'but', 'you', 'remember', 'you', 'gotta', 'be', 'at', 'the', 'oscars', 'by', '7:30', 'http://t.co/27faqodhpm']

– Set Your Query Terms and Score Each Document (Tweet)

In [3]:
# Set your query with tf-idf method
QUERY_TERMS = ['lego']

# TextCollection provides tf, idf, and tf_idf abstractions so
# that we don't have to maintain/compute them ourselves
import nltk
tc = nltk.TextCollection(words_corpus)

relevant_tweets = []

for idx in range(len(words_corpus)):
    score = 0
    for term in [t.lower() for t in QUERY_TERMS]:
        score += tc.tf_idf(term, words_corpus[idx])
    if score > 0:
        relevant_tweets.append({'score':score, 'tweet':tweet_dictionary[idx]})

– Sort by Score and Display Results

In [5]:
relevant_tweets = sorted(relevant_tweets, key=lambda p: p['score'], reverse=True)
# Show the five highest-scoring tweets
for tweet in relevant_tweets[:5]:
    print(tweet['tweet'])
    print('\tScore: %s' % (tweet['score'],))
how the lego oscars were built http://t.co/glbdphfyn9

    Score: 0.867215250635

http://t.co/lghymlygns - is getting a lego oscar bet

    Score: 0.758813344306

see how the awesome lego oscars were made https://t.co/lheategesj

    Score: 0.674500750494

how the lego oscars were built - gif on imgur

    Score: 0.607050675445

rt @thingswork: this is how the lego oscars were built http://t.co/kzuabkuy1u

    Score: 0.551864250404

2. Sentiment Analysis

– Scoring Positivity (or Negativity) of Tweets

In [7]:
# source: http://textblob.readthedocs.org/en/dev/quickstart.html#sentiment-analysis
# The sentiment property returns a namedtuple of the form Sentiment(polarity, subjectivity).
# The polarity score is a float within the range [-1.0, 1.0].
# The subjectivity is a float within the range [0.0, 1.0],
# where 0.0 is very objective and 1.0 is very subjective.

from textblob import TextBlob

# Collect strongly negative tweets (polarity <= -0.9).
# To collect strongly positive tweets instead, test polarity >= 0.9.
negative_tweets = []
for idx in range(len(tweet_dictionary)):
    polarity = TextBlob(tweet_dictionary[idx]).sentiment.polarity
    if polarity <= -0.9:
        negative_tweets.append({'polarity': polarity, 'tweet': tweet_dictionary[idx]})

# Most negative tweets first
negative_tweets = sorted(negative_tweets, key=lambda p: p['polarity'])
for tweet in negative_tweets[:5]:
    print(tweet['tweet'])
    print('\tScore: %s' % (tweet['polarity'],))
zendaya defends oscars dreadlocks after 'outrageously offensive' remark via @abc7ny http://t.co/jirc40gy8p

    Score: -1.0

@mrbradgoreski travolta was the worst dressed wax figure at the oscars.

    Score: -1.0

the amount of pics of scarlett johansson &amp; john travolta at the oscars people texted me is obscene. i hate u all! (and u know me so well.)

    Score: -1.0

rt @mygeektime: just getting over an awful stomach virus...

    Score: -1.0

behati's style at the oscars was the worst ive ever seen omg

    Score: -1.0

– Scoring Subjectivity (or Objectivity) of Tweets

In [8]:
# Collect fully subjective tweets (subjectivity == 1.0, the maximum)
subjective_tweets = []
for idx in range(len(tweet_dictionary)):
    subjectivity = TextBlob(tweet_dictionary[idx]).sentiment.subjectivity
    if subjectivity >= 1:
        subjective_tweets.append({'subjectivity': subjectivity, 'tweet': tweet_dictionary[idx]})

subjective_tweets = sorted(subjective_tweets, key=lambda p: p['subjectivity'], reverse=True)
for tweet in subjective_tweets[:5]:
    print(tweet['tweet'])
    print('\tScore: %s' % (tweet['subjectivity'],))
rt @9gag: remember the greatest oscars ever? http://t.co/qw3xdbmne9

    Score: 1.0

rt @logotv: confirmed. @actuallynph's bulge at the #oscars was indeed padded. watch: http://t.co/a8iaxitxcu

    Score: 1.0

rt @girlposts: remember the greatest oscars ever? http://t.co/ij9fm4cdhm

    Score: 1.0

rt @ryanabe: another oscars, another sad leo.

    Score: 1.0

rt @9gag: remember the greatest oscars ever? http://t.co/qw3xdbmne9

    Score: 1.0


[2015 Spring AI: Mining the Social Web] TF-IDF (1)

  • Class material for February 26, 2015

After TF-IDF scoring of 100 documents…

  1. All words in 100 docs are in the corpus.
  2. Separately, each word in each doc has its own TF-IDF score. That is, each doc is represented as a vector of the TF-IDF scores of all words in the corpus.
    • e.g. “It was awesome!” -> [0, .2345, 0, 0, …, 1.23, 3.4] (if the corpus is ordered as [“you”, “it”, “sucks”, “cold”, …, “was”, “awesome”])
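
The vector representation above can be sketched in a few lines. This is a minimal illustration that assumes one common definition (TF = term count / document length, IDF = log(number of docs / number of docs containing the term)); the three toy documents and the function names are made up here, and the resulting numbers differ from the illustrative scores in the example above. The code is written for Python 3.

```python
import math

# Hypothetical toy corpus: each document is a list of tokens
docs = [
    "it was awesome".split(),
    "it was cold".split(),
    "you know it sucks".split(),
]
# The corpus vocabulary fixes the order of the vector components
vocabulary = sorted({w for d in docs for w in d})

def tf(term, doc):
    """Term frequency: share of the document's tokens equal to `term`."""
    return doc.count(term) / len(doc)

def idf(term, docs):
    """Inverse document frequency: rarer terms get larger weights."""
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df) if df else 0.0

def tfidf_vector(doc, docs, vocabulary):
    """Represent `doc` as one TF-IDF score per vocabulary word."""
    return [tf(t, doc) * idf(t, docs) for t in vocabulary]

vec = tfidf_vector(docs[0], docs, vocabulary)
for term, score in zip(vocabulary, vec):
    print("%-8s %.4f" % (term, score))
```

Words that do not occur in the document score 0, and so does “it”, because it occurs in every document and its IDF is log(3/3) = 0.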


What is this TF-IDF for?

We’ve learned a lot about the TF-IDF method: how to calculate the TF and IDF scores, the conceptual assumption behind the method (bag of words), and so on. Then, what is it for? How, and for what, can we use it?

  1. Have a seat with your group members.
  2. Discuss how to use this score, in general or for your project. (10 min)

Is TF-IDF better than just counting hits?

One of the easiest ways to find documents relevant to a specific query is to find the documents that contain the query words many times. In what situations does TF-IDF work better than this?
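
One possible situation, sketched below, is a query that contains a very common word: raw hit counting then favors long documents that repeat that word, while TF-IDF gives it zero weight. The three toy documents, the query, and the function names are assumptions made for illustration, using the same TF and IDF definitions as one common textbook form. The code is written for Python 3.

```python
import math

# Hypothetical documents: "a" repeats a common word, "b" and "c" mention lego
docs = {
    "a": "the the the the show was long and the host was the best".split(),
    "b": "the lego oscars were built".split(),
    "c": "the lego statue".split(),
}
query = ["the", "lego"]

def hit_count(query, doc):
    """Raw relevance: how many times the query words occur in the doc."""
    return sum(doc.count(t) for t in query)

def tf_idf(term, doc, all_docs):
    """TF-IDF of one term: (count / doc length) * log(N / document frequency)."""
    df = sum(1 for d in all_docs if term in d)
    if df == 0:
        return 0.0
    return (doc.count(term) / len(doc)) * math.log(len(all_docs) / df)

def tf_idf_score(query, doc, all_docs):
    """Relevance of a doc to the query: sum of TF-IDF over the query terms."""
    return sum(tf_idf(t, doc, all_docs) for t in query)

corpus = list(docs.values())
raw = {name: hit_count(query, d) for name, d in docs.items()}
weighted = {name: tf_idf_score(query, d, corpus) for name, d in docs.items()}

print(raw)                                 # {'a': 6, 'b': 2, 'c': 2}
print(max(raw, key=raw.get))               # a
print(max(weighted, key=weighted.get))     # c
```

Document "a" wins on raw hits although it never mentions “lego”; under TF-IDF its score is 0 because “the” appears in every document.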


How to be Effective in the Classroom: From Communicating Difficult Concepts to Storytelling

Part of the AI classes for the 2015 spring semester.

Time & Brief Explanation

  • Friday, February 20th, 10:00 am to 11:30 am
  • Professor Siegel, who gets many kudos from students for his engaging teaching methods, will lead this spring interactive workshop for AI’s. Come be inspired by his session on communication in the classroom and making a difference.
    (Professor Marty Siegel)

Contents Summary

  • Do not try to “cover” the material; try to “uncover” it.

  • Playing the whole game
    • A math class should help students understand the games currently being played in the field of mathematics.
    • Example: learning soccer
      • You play from the very first day of learning.
      • There is no “Soccer 101: Kick”, “Soccer 201: Pass”, and so on.
    • Then, what is the game of my field, or of the field of the course?
    • Show students the big picture. Make them imagine it.
    • Communicating with students about these topics gives them “aha” moments.

  • Storytelling
    • Telling the material as a story helps students remember it well.
    • A personal story makes you more human and approachable, and easier to communicate with.

For I400 (the class for which I’m currently an AI)

  • Possible questions for an application
    • Why are you taking this course? What made you decide to take it?
    • Why is social media data important? How does social media data differ from earlier data sources? Why do you want to collect and use “social media data” rather than other sources?
    • What kinds of commands (or sets of commands) do we need to do what you want? Can you explain why each part is necessary?
  • (I missed the later part of the class because of another group meeting.)