[GISA] IRB application

When & Where?

Inside The Application

– Protocol

  • Protocol Type:
    • Exempt: surveys or interviews that do not collect sensitive data (names, pictures, or other identifiable information)
    • Expedited: studies that use more sensitive data
  • Principal investigator & lead unit
    • Advisor or faculty member and their affiliation
  • Participant type
    • Total: general population in the US
    • Transnational: when the study is open to people worldwide

– Questionnaire

  • Benefit #23301: describe potential benefits for participants that are qualitative rather than monetary

– Attachment

  • Surveys, pamphlets, posters, interview guides, sample questions, and so on.

Other Helpful Resources


[2015 Spring, Complex System Seminar] Game theory

Definition of “Game Theory”

  • “… the study of mathematical models of conflict and cooperation between intelligent rational decision-makers.” ([1])
  • Originated as a subfield of microeconomics and applied mathematics

Definition of “Game”

  • “In the language of game theory, a game refers to any social situation involving two or more individuals. The individuals involved in a game may be called the players.”([1])
  • Assumptions about players
    • Rational: a player is rational if he or she makes decisions consistently in pursuit of his or her own objectives (frequently, the maximization of his or her utility).
    • Intelligent: a player is intelligent if he or she knows everything that we know about the game and can make any inference about the situation that we can make.
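The rationality assumption above can be made concrete with a small sketch. It encodes a two-player game as a payoff table and picks the move that maximizes a player's own payoff against a fixed move of the other player. The Prisoner's Dilemma payoff values (5, 3, 1, 0) and the function name `best_response` are illustrative assumptions, not taken from [1].

```python
# (row move, column move) -> (row payoff, column payoff).
# The numbers are conventional illustrative values (an assumption).
PRISONERS_DILEMMA = {
    ('C', 'C'): (3, 3),
    ('C', 'D'): (0, 5),
    ('D', 'C'): (5, 0),
    ('D', 'D'): (1, 1),
}


def best_response(game, opponent_move, moves=('C', 'D')):
    """Return the move that maximizes the row player's own payoff
    against a fixed move of the opponent."""
    return max(moves, key=lambda m: game[(m, opponent_move)][0])


for other in ('C', 'D'):
    print("Against %s, a rational player chooses %s"
          % (other, best_response(PRISONERS_DILEMMA, other)))
```

With these payoffs, defecting is the best response to either move, which is why cooperation in such games needs the extra mechanisms discussed later in this post.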

Applications of Game Theory

  • Industrial organization (and firm behavior): analyzing cooperation (e.g. cartels) and competition between firms
  • Auction theory: from the viewpoint of auctioneers and auction participants, e.g. Google auction, Yahoo auction, Sotheby's, eBay and so on.
  • Contract theory: Employer vs. Employee / Consumer vs. Producer
  • Evolutionary biology
  • Political science: international relations, political parties
  • Public policy: tragedy of the commons, welfare policy design

List of Games

Why Do People Cooperate?

1. Kin selection

  • When an agent's self-sacrificing behavior contributes more to the spread of its genes than it costs the agent itself, the agent would choose to sacrifice. ([2], [3])
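The condition above is usually written as Hamilton's rule, r × B > C, where r is the genetic relatedness between actor and recipient, B the reproductive benefit to the recipient, and C the cost to the actor. A minimal sketch of that inequality follows; the function name and the numeric values are illustrative assumptions.

```python
def kin_altruism_favored(relatedness, benefit, cost):
    """Hamilton's rule: altruism is favored when r * B > C."""
    return relatedness * benefit > cost


# r = 0.5 for full siblings, r = 0.125 for first cousins.
print(kin_altruism_favored(0.5, benefit=3.0, cost=1.0))
print(kin_altruism_favored(0.125, benefit=3.0, cost=1.0))
```

The same benefit and cost can favor helping a sibling but not a cousin, because only the relatedness term changes.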

2. Indirect reciprocity

  • If each player decides whether to help someone or not based on the recipient’s image accumulated through previous altruistic behaviors, altruistic behavior becomes dominant. ([4])

3. Direct reciprocity

  • Repeated PD (Prisoner's Dilemma) game
  • Tit-for-tat: play the move your partner chose in the previous round ([5])
  • Win-stay, lose-shift: if your previous move paid off well against your partner's move, keep it. Otherwise, switch. ([6])
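The two strategies above can be sketched as small functions and played against each other in a repeated Prisoner's Dilemma. This is only an illustration: the payoff values, the rule that a payoff of 3 or more counts as a "win", the choice to open with cooperation, and all function names are assumptions made for the example.

```python
# (move A, move B) -> (payoff A, payoff B); illustrative values (assumption).
PAYOFF = {
    ('C', 'C'): (3, 3),
    ('C', 'D'): (0, 5),
    ('D', 'C'): (5, 0),
    ('D', 'D'): (1, 1),
}


def tit_for_tat(my_last, partner_last, my_last_payoff):
    # Cooperate first, then copy the partner's previous move.
    return 'C' if partner_last is None else partner_last


def win_stay_lose_shift(my_last, partner_last, my_last_payoff):
    if my_last is None:
        return 'C'
    if my_last_payoff >= 3:  # "win": keep the previous move
        return my_last
    return 'D' if my_last == 'C' else 'C'  # "lose": switch


def always_defect(my_last, partner_last, my_last_payoff):
    return 'D'


def play(strategy_a, strategy_b, rounds):
    """Return the list of (move A, move B) pairs over the given rounds."""
    last_a = last_b = pay_a = pay_b = None
    history = []
    for _ in range(rounds):
        move_a = strategy_a(last_a, last_b, pay_a)
        move_b = strategy_b(last_b, last_a, pay_b)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        history.append((move_a, move_b))
        last_a, last_b = move_a, move_b
    return history


print(play(tit_for_tat, always_defect, 4))
print(play(win_stay_lose_shift, always_defect, 4))
print(play(tit_for_tat, win_stay_lose_shift, 4))
```

Against a constant defector, tit-for-tat settles into mutual defection, while win-stay, lose-shift keeps alternating because it is never satisfied with its payoff.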

4. Costly signaling ([7])

  • Group members have a personal characteristic, which we will call quality, that can either be high or low.
  • Each individual has occasion to enter into a profitable alliance (e.g. mating or political coalition) with any one of the other group members.

5. Altruistic punishment ([8])

  • If individuals can punish free riders in their group, cooperation flourishes, even though the punishment is costly and yields no material gain to the punisher.

6. Evolution of Social Network ([9])

– If a cooperator pays the required cost, all of its neighbors in the network receive a benefit.
– In every turn, one randomly chosen player dies.
– The strategy of the new player who takes that position is decided by the accumulated benefits of the neighbors of the vacant position.
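One update step of the process above can be sketched as follows. The sketch assumes that a cooperator pays cost c for each neighbor, that each neighbor then receives benefit b, and that a neighbor's chance of filling the vacant position is proportional to a fitness of the form 1 - w + w × payoff with a small selection strength w. The 6-node ring, the parameter values, and the function names are all illustrative assumptions.

```python
def payoffs(neighbors, strategy, b, c):
    """Accumulated payoff of every node: each cooperator ('C') pays c
    per neighbor, and each of its neighbors receives b."""
    pay = {node: 0.0 for node in neighbors}
    for node, nbrs in neighbors.items():
        if strategy[node] == 'C':
            for nb in nbrs:
                pay[node] -= c
                pay[nb] += b
    return pay


def prob_cooperator_replaces(dead, neighbors, strategy, b, c, w=0.1):
    """Probability that the vacant position is filled by a cooperator,
    with neighbors competing in proportion to their fitness."""
    pay = payoffs(neighbors, strategy, b, c)
    fitness = {n: 1.0 - w + w * pay[n] for n in neighbors[dead]}
    total = sum(fitness.values())
    coop = sum(f for n, f in fitness.items() if strategy[n] == 'C')
    return coop / total


# Ring of 6 players: 0, 1, 2 cooperate and 3, 4, 5 defect.
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
strategy = {0: 'C', 1: 'C', 2: 'C', 3: 'D', 4: 'D', 5: 'D'}

# Player 3 dies; its neighbors 2 (cooperator) and 4 (defector) compete.
print(prob_cooperator_replaces(3, ring, strategy, b=3.0, c=1.0))
```

In this toy configuration the cooperating neighbor has the higher accumulated payoff, so it is slightly more likely than the defector to take over the vacant position.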

7. Static Network ([10])

– If a social network is static, cooperative strategies become more stable.
– “We find that people cooperate at high stable levels, as long as the benefits created by cooperation are larger than the number of neighbors in the network.”


[1] Myerson, Roger B. Game theory. Harvard University Press, 2013.
[2] http://en.wikipedia.org/wiki/Kin_selection
[3] Hamilton, William D. “The genetical evolution of social behaviour. II.” Journal of Theoretical Biology 7.1 (1964): 17-52.
[4] Nowak, Martin A., and Karl Sigmund. “Evolution of indirect reciprocity by image scoring.” Nature 393.6685 (1998): 573-577.
[5] Axelrod, Robert, and William D. Hamilton. “The evolution of cooperation.” Science 211.4489 (1981): 1390-1396.
[6] Nowak, Martin, and Karl Sigmund. “A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner’s Dilemma game.” Nature 364.6432 (1993): 56-58.
[7] Gintis, Herbert, Eric Alden Smith, and Samuel Bowles. “Costly signaling and cooperation.” Journal of Theoretical Biology 213.1 (2001): 103-119.
[8] Fehr, Ernst, and Simon Gächter. “Altruistic punishment in humans.” Nature 415.6868 (2002): 137-140.
[9] Ohtsuki, Hisashi, et al. “A simple rule for the evolution of cooperation on graphs and social networks.” Nature 441.7092 (2006): 502-505.
[10] Rand, David G., et al. “Static network structure can stabilize human cooperation.” Proceedings of the National Academy of Sciences 111.48 (2014): 17093-17098.


[Fashion] Related Articles

1. How to be a model agent by DAZED magazine

Carole White, a founder of the Premier fashion agency, talks about running an agency and the lives of fashion models. The following passage from the article looks worth keeping as a reference.

“Now we’re in a social media era. The whole business has changed so much in the last five years. It’s changing how advertising is done; it’s changing how we evaluate how much a job is worth. Before it used to be how many posters and billboards are there, but that’s not the crucial element anymore. Followers have become a currency and agents around the world have been slow to click onto that.”

2. How Instagram is Changing Fashion Week by HYPEBEAST


[2015 Spring, Complex System Seminar] Go Viral!

Social Influence

1. Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market ([1])

  • Method
    • The authors created an artificial music market, recruited 14,341 participants (mostly teenagers), and presented them with unknown songs by unknown bands. After listening to the songs they chose, participants were asked to rate the quality of each song and to decide whether to download it.
    • Participants were randomly assigned to one of two groups.
      • Independent: only the names of the bands and their songs were provided.
      • Social influence: in addition to the above, the download counts of each song by other participants were shown. This group was split into 8 subgroups ("worlds"), each evolving independently of the others.
    • The experiment was run twice, with different ways of displaying the download counts.
      • In experiment 1, the download counts of the 48 songs were shown in a 16 × 3 grid, in random order.
      • In experiment 2, the counts were shown in a single column, in descending order.
  • Result
    • Gini coefficient of Exp. 2 > Gini coefficient of Exp. 1 > Gini coefficient of Independent Group
    • Unpredictability of Exp. 2 > Unpredictability of Exp. 1 > Unpredictability of Independent Group (Unpredictability: the average difference in a song's market share across the 8 different worlds)
    • Market share in Exp. 1 looks linearly related to market share in the Independent group, while market share in Exp. 2 looks exponentially related to it.
  • Personal opinion
    • It would also be interesting to trace the dynamics by which the most popular songs became top-ranked. (Finding phase transition moments and conditions.)
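The two measures compared in the Result bullets can be sketched in a few lines. The Gini coefficient is computed here as the normalized mean absolute difference between all pairs of market shares, and unpredictability as the average, over songs, of the mean absolute difference in a song's market share between pairs of worlds. The function names and the toy inputs are assumptions for illustration; they are not the paper's data.

```python
from itertools import combinations


def gini(shares):
    """Gini coefficient of a list of market shares (0 = perfect equality)."""
    n = len(shares)
    total = float(sum(shares))
    diff_sum = sum(abs(a - b) for a in shares for b in shares)
    return diff_sum / (2.0 * n * total)


def unpredictability(shares_by_world):
    """shares_by_world[w][s] is the market share of song s in world w.
    Average over songs of the mean absolute share difference
    between every pair of worlds."""
    n_songs = len(shares_by_world[0])
    world_pairs = list(combinations(range(len(shares_by_world)), 2))
    per_song = []
    for s in range(n_songs):
        diffs = [abs(shares_by_world[j][s] - shares_by_world[k][s])
                 for j, k in world_pairs]
        per_song.append(sum(diffs) / len(diffs))
    return sum(per_song) / n_songs


# Toy inputs: an equal market, a winner-take-all market,
# and two worlds that disagree completely about two songs.
print(gini([0.25, 0.25, 0.25, 0.25]))
print(gini([1.0, 0.0, 0.0, 0.0]))
print(unpredictability([[1.0, 0.0], [0.0, 1.0]]))
```

A higher Gini coefficient means downloads are concentrated on fewer songs, and a higher unpredictability means the same song ends up with very different market shares in different worlds.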

2. Complex Contagions and the Weakness of Long Ties ([2])

  • Related works
    • Two different meanings of tie strength according to Granovetter
      • Relational: a strong tie is a close friend or family member, while a weak tie is an acquaintance.
      • Structural: a strong tie has a greater ability to facilitate diffusion, cohesion, and integration of its social network by linking others.
    • Granovetter’s insight is that a relationally weak tie can be structurally strong, because it acts as a shortcut in a small-world network.
    • Threshold model in contagion process (by Granovetter and Schelling)
    • Mechanisms of Complex Contagion
    1. Strategic complementarity: the (social or economic) cost of adoption decreases as the number of adopters nearby increases.
    2. Credibility: some innovations (or information) become reliable enough to adopt once credible neighbors have already adopted them.
    3. Legitimacy: the number of close friends who participate matters for recognizing an event or social movement as legitimate.
    4. Emotional Contagion
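The threshold model mentioned above can be sketched on a ring lattice to show why complex contagions need more than a single tie. In this sketch a node adopts once at least `threshold` of its neighbors have adopted. The lattice sizes, the seed set, and the function names are illustrative assumptions rather than the paper's exact setup.

```python
def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbors on each side."""
    return {i: {(i + d) % n for d in range(-k, k + 1) if d != 0}
            for i in range(n)}


def spread(neighbors, seeds, threshold):
    """Repeat until nothing changes: a node adopts when at least
    `threshold` of its neighbors have already adopted."""
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in neighbors.items():
            if node not in adopted and len(nbrs & adopted) >= threshold:
                adopted.add(node)
                changed = True
    return adopted


seeds = {0, 1}
# Simple contagion (threshold 1) crosses single ties.
print(len(spread(ring_lattice(20, 1), seeds, threshold=1)))
# Complex contagion (threshold 2) stalls when neighborhoods do not overlap...
print(len(spread(ring_lattice(20, 1), seeds, threshold=2)))
# ...but spreads when each node has overlapping neighbors on each side.
print(len(spread(ring_lattice(20, 2), seeds, threshold=2)))
```

The stalled case is the point of the "weakness of long ties": one shortcut delivers a single exposure, which is enough for a simple contagion but not for one that requires reinforcement from several adopters.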

3. A 61-million-person experiment in social influence and political mobilization ([3])

  • Question: Can political behaviour spread through an online social network?
  • Effect of message to encourage voting

    • All Facebook users in the US aged 18 or older were divided into 3 groups: a social message group, an informational message group, and a control group.
    • Social message group (N = 60,055,176) vs. informational message group (N = 611,044): they differ in whether the message shows the profile pictures of 6 friends.
    • In addition to self-reported voting (“I Voted” in the message), the authors also examined public voting records.
  • Effect of strong ties

    • The authors first validate that Facebook friends who interact more are likely to be closer friends.
    • Then, based on these interaction counts, they compare the effect on voting behavior, measured in 3 different ways, across levels of closeness.
    • Their explanation follows.
    • “Figure 2 shows that the observed per-friend treatment effects increase as tie-strength increases. All of the observed treatment effects fall outside the null distribution for expressed vote (Fig. 2b), suggesting that they are significantly different from chance outcomes. For validated vote (Fig. 2c), the observed treatment effect is near zero for weak ties, but it spikes upwards and falls outside the null distribution for the top two deciles. This suggests that strong ties are important for the spread of real-world voting behaviour. Finally, the treatment effect for polling place search gradually increases (Fig. 2d), with several of the effects falling outside the 95% confidence interval of the null distribution.”
    • However, in the graph, the mean change in the probability of voting looks about the same, and only the variance grows as the amount of interaction increases. (This might be because close friends are far fewer than ordinary friends?)

4. Structural diversity in social contagion ([4])


  • Contagion of joining Facebook through friends’ invitation e-mails
  • Data: corpus of 54 million invitation e-mails
  • Question: “How does an individual’s probability of accepting an invitation depend on the structure of his or her contact neighborhood?”
  • Result: Acceptance probability is related to the number of connected components in the contact neighborhood (the existing users who have the invited user's e-mail address in common). That is, the more components, the higher the probability of acceptance.
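The quantity in the result above, the number of connected components in a contact neighborhood, can be counted with a simple graph traversal. This is a minimal sketch: the function name and the toy neighborhoods are illustrative assumptions, not data from the paper.

```python
def count_components(nodes, edges):
    """Count connected components among `nodes`, given undirected `edges`
    (friendships between members of the contact neighborhood)."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        # Depth-first traversal marks everything reachable from `start`.
        stack = [start]
        seen.add(start)
        while stack:
            current = stack.pop()
            for nb in adjacency[current]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
    return components


contacts = ['a', 'b', 'c', 'd']
# Same number of contacts, different structural diversity.
print(count_components(contacts, [('a', 'b'), ('b', 'c'), ('c', 'd')]))
print(count_components(contacts, [('a', 'b'), ('c', 'd')]))
print(count_components(contacts, []))
```

All three neighborhoods have four contacts, but they contain one, two, and four components respectively, which is the distinction the paper's result relies on.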

[1] – M.J. Salganik et al., Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market, Science 311, 854 (2006).
[2] – D. Centola and M. Macy, Complex Contagions and the Weakness of Long Ties, AJS 113, 702 (2007).
[3] – R.M. Bond et al., A 61-million-person experiment in social influence and political mobilization, Nature 489, 295 (2012).
[4] – Ugander, Johan, et al. “Structural diversity in social contagion.” Proceedings of the National Academy of Sciences (2012): 201116502.


Useful Links about Large-Scale Network Analysis

MapReduce and Hadoop



[2015 Spring AI: Mining the Social Web] Working for Your Project!

  • Course material for March 3 and 5.


For the next 2 classes for your project

  1. Have a seat with your group members.
  2. Encourage your group members not to skip the classes.
  3. (5 to 10 min) Set a specific goal for each class: I will ask for it at the beginning of each class.
    • e.g. “We will finish building a pandas table from our dataset.”, “Our goal is to collect Yelp data from the last 3 months.”, “We’ll make a presentation file.”
  4. (5 to 10 min) At the end of each class, evaluate what you achieved against the goals you set. Briefly explain the challenges and the major achievements.

Expected elements in your upcoming project presentation

  • What dataset will you use? (Name not only the website but also the specific attributes.)
  • What is the basic question you try to answer with this dataset? (Being specific is better.) Why is the question important?
  • Why is your dataset (or attributes of the dataset) good to answer your question?
  • What techniques (in machine learning) will you use to answer your question?
  • Why is the technique better compared to other possible candidate techniques?
  • Are there any similar attempts? If so, what is the competitive advantage of yours?
  • How will you present your result? (e.g. predicting something, clustering something)

[2015 Spring AI: Mining the Social Web] TF-IDF (2) & Sentiment Analysis


1. TF-IDF

– Open the Data File, then Build the Words Corpus and Tweet Dictionary

In [2]:
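# Mostly similar to Example 4-9, "Querying Google+ data with TF-IDF",
# in our textbook "Mining the Social Web", 4.4.2 Applying TF-IDF to Human Languages

data = "oscar_tweets.txt"
tweet_dictionary = {}   # tweet index -> lowercased tweet text
words_corpus = []       # tweet index -> list of whitespace-separated tokens
i = 0
with open(data) as f:
    for line in f:
        # Skip blank lines so that both structures share the same indices
        if len(line.strip().split()) != 0:
            tweet_dictionary[i] = line.lower()
            words_corpus.append(line.lower().split())
            i += 1
print(tweet_dictionary[1])
print(words_corpus[1])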
rt @dory: when you're washing the dishes at 7:15 but you remember you gotta be at the oscars by 7:30 http://t.co/27faqodhpm

['rt', '@dory:', 'when', "you're", 'washing', 'the', 'dishes', 'at', '7:15', 'but', 'you', 'remember', 'you', 'gotta', 'be', 'at', 'the', 'oscars', 'by', '7:30', 'http://t.co/27faqodhpm']

– Set Your Query Terms and Scoring Each Document (Tweet)

In [3]:
# Set your query with tf-idf method
QUERY_TERMS = ['lego']

# TextCollection provides tf, idf, and tf_idf abstractions so
# that we don't have to maintain/compute them ourselves
import nltk
tc = nltk.TextCollection(words_corpus)

relevant_tweets = []

for idx in range(len(words_corpus)):
    score = 0
    for term in [t.lower() for t in QUERY_TERMS]:
        score += tc.tf_idf(term, words_corpus[idx])
    if score > 0:
        relevant_tweets.append({'score':score, 'tweet':tweet_dictionary[idx]})
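To see what the score above measures, here is a hand-written sketch of TF-IDF that does not depend on NLTK. It assumes term frequency is the term's count divided by the tweet's length, and inverse document frequency is the natural log of the number of tweets divided by the number of tweets containing the term. That appears to match the scores printed in this post, but treat it as an approximation of `TextCollection`, not its definition. The toy corpus and the function names are made up for illustration.

```python
import math


def tf(term, tokens):
    """Term frequency: share of the tweet's tokens equal to `term`."""
    return tokens.count(term) / float(len(tokens))


def idf(term, corpus):
    """Inverse document frequency: log(total tweets / tweets containing term)."""
    matches = sum(1 for tokens in corpus if term in tokens)
    return math.log(len(corpus) / float(matches)) if matches else 0.0


def tf_idf(term, tokens, corpus):
    return tf(term, tokens) * idf(term, corpus)


# Toy corpus (hypothetical tweets, already lowercased and split).
toy_corpus = [
    ['how', 'the', 'lego', 'oscars', 'were', 'built'],
    ['oscars', 'red', 'carpet'],
    ['best', 'picture', 'oscars'],
    ['oscars', 'tonight'],
]

print(tf_idf('lego', toy_corpus[0], toy_corpus))
print(tf_idf('oscars', toy_corpus[0], toy_corpus))
```

A rare term such as 'lego' gets a positive score in the tweet that contains it, while 'oscars', which appears in every tweet, scores zero. Shorter tweets containing the query term also score higher, which is visible in the ranking of the results.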

– Sort by Score and Display Results

In [5]:
relevant_tweets = sorted(relevant_tweets, key=lambda p: p['score'], reverse=True)
for tweet in relevant_tweets[:5]:
    print(tweet['tweet'])
    print('\tScore: %s' % (tweet['score'],))
how the lego oscars were built http://t.co/glbdphfyn9

    Score: 0.867215250635

http://t.co/lghymlygns - is getting a lego oscar bet

    Score: 0.758813344306

see how the awesome lego oscars were made https://t.co/lheategesj

    Score: 0.674500750494

how the lego oscars were built - gif on imgur

    Score: 0.607050675445

rt @thingswork: this is how the lego oscars were built http://t.co/kzuabkuy1u

    Score: 0.551864250404

2. Sentiment Analysis

– Scoring Positivity (or Negativity) of Tweets

In [7]:
# source: http://textblob.readthedocs.org/en/dev/quickstart.html#sentiment-analysis
# The sentiment property returns a namedtuple of the form Sentiment(polarity, subjectivity). 
# The polarity score is a float within the range [-1.0, 1.0]. 
# The subjectivity is a float within the range [0.0, 1.0] 
# where 0.0 is very objective and 1.0 is very subjective.

from textblob import TextBlob

# Collect the most negative tweets (polarity <= -0.9)
negative_tweets = []
for idx in range(len(words_corpus)):
    polarity = TextBlob(tweet_dictionary[idx]).sentiment.polarity
    if polarity <= -0.9:
        negative_tweets.append({'polarity':polarity, 'tweet':tweet_dictionary[idx]})

# Sort in ascending order so that the most negative tweets come first
negative_tweets = sorted(negative_tweets, key=lambda p: p['polarity'])
for tweet in negative_tweets[:5]:
    print(tweet['tweet'])
    print('\tScore: %s' % (tweet['polarity'],))
zendaya defends oscars dreadlocks after 'outrageously offensive' remark via @abc7ny http://t.co/jirc40gy8p

    Score: -1.0

@mrbradgoreski travolta was the worst dressed wax figure at the oscars.

    Score: -1.0

the amount of pics of scarlett johansson &amp; john travolta at the oscars people texted me is obscene. i hate u all! (and u know me so well.)

    Score: -1.0

rt @mygeektime: just getting over an awful stomach virus...

    Score: -1.0

behati's style at the oscars was the worst ive ever seen omg

    Score: -1.0

– Scoring Subjectivity (or Objectivity) of Tweets

In [8]:
subjective_tweets = []
for idx in range(len(words_corpus)):
    subjectivity = TextBlob(tweet_dictionary[idx]).sentiment.subjectivity
    if subjectivity >= 1:
        subjective_tweets.append({'subjectivity':subjectivity, 'tweet':tweet_dictionary[idx]})

subjective_tweets = sorted(subjective_tweets, key=lambda p: p['subjectivity'], reverse=True)
for tweet in subjective_tweets[:5]:
    print(tweet['tweet'])
    print('\tScore: %s' % (tweet['subjectivity'],))
rt @9gag: remember the greatest oscars ever? http://t.co/qw3xdbmne9

    Score: 1.0

rt @logotv: confirmed. @actuallynph's bulge at the #oscars was indeed padded. watch: http://t.co/a8iaxitxcu

    Score: 1.0

rt @girlposts: remember the greatest oscars ever? http://t.co/ij9fm4cdhm

    Score: 1.0

rt @ryanabe: another oscars, another sad leo.

    Score: 1.0

rt @9gag: remember the greatest oscars ever? http://t.co/qw3xdbmne9

    Score: 1.0
