Commit

remove extraneous images
arencambre committed May 27, 2019
1 parent 0b692d4 commit a63abb6
Showing 110 changed files with 160 additions and 0 deletions.
Binary file removed Advocate_ED .png
Binary file not shown.
Binary file removed BenRussellNBC5 .png
Binary file not shown.
Binary file removed BradfordPearson .png
Binary file not shown.
Binary file removed CarrTamicbs11 .png
Binary file not shown.
Binary file removed Central_Track .png
Binary file not shown.
Binary file removed CoryNBC .png
Binary file not shown.
Binary file removed CourtneyNBC5 .png
Binary file not shown.
Binary file removed CultureMapDAL .png
Binary file not shown.
Binary file removed DMNOpinion .png
Binary file not shown.
Binary file removed DMagazine .png
Binary file not shown.
Binary file removed D_FrontRow .png
Binary file not shown.
Binary file removed Dallas_Observer .png
Binary file not shown.
Binary file removed DavidSchechter .png
Binary file not shown.
Binary file removed DeborahNBC5 .png
Binary file not shown.
Binary file removed JeffSmithi24 .png
Binary file not shown.
Binary file removed JennySivie .png
Binary file not shown.
Binary file removed JimSchutze .png
Binary file not shown.
Binary file removed JohnnyNBC6 .png
Binary file not shown.
Binary file removed JustinWWaldrop .png
Binary file not shown.
Binary file removed KRLDEmily .png
Binary file not shown.
Binary file removed KenKalthoffNBC5 .png
Binary file not shown.
Binary file removed Knightengale .png
Binary file not shown.
Binary file removed Lindenberger .png
Binary file not shown.
Binary file removed MonicaTVNews .png
Binary file not shown.
Binary file removed NBC5photog .png
Binary file not shown.
Binary file removed POLSDFW .png
Diff not rendered.
Binary file removed PappalardoJoe .png
Diff not rendered.
Binary file removed PhilipTKingston .png
Diff not rendered.
Binary file removed RayLeszcynski .png
Diff not rendered.
Binary file removed RobertWilonsky .png
Diff not rendered.
Binary file removed ScottNBC5 .png
Diff not rendered.
Binary file removed ToddNEWS .png
Diff not rendered.
Binary file removed ToddWFAA8 .png
Diff not rendered.
Binary file removed TristanHallman .png
Diff not rendered.
Binary file removed Wylie_H_Dallas .png
Diff not rendered.
Binary file removed ahuguelet .png
Diff not rendered.
Binary file removed aviselk .png
Diff not rendered.
Binary file removed brandonformby .png
Diff not rendered.
Binary file removed clairezcardona .png
Diff not rendered.
Binary file removed dallasweekly .png
Diff not rendered.
Binary file removed docs/Advocate_ED .png
Diff not rendered.
Binary file removed docs/BenRussellNBC5 .png
Diff not rendered.
Binary file removed docs/BradfordPearson .png
Diff not rendered.
Binary file removed docs/CarrTamicbs11 .png
Diff not rendered.
Binary file removed docs/Central_Track .png
Diff not rendered.
Binary file removed docs/CoryNBC .png
Diff not rendered.
Binary file removed docs/CourtneyNBC5 .png
Diff not rendered.
Binary file removed docs/CultureMapDAL .png
Diff not rendered.
Binary file removed docs/DMNOpinion .png
Diff not rendered.
Binary file removed docs/DMagazine .png
Diff not rendered.
Binary file removed docs/D_FrontRow .png
Diff not rendered.
Binary file removed docs/Dallas_Observer .png
Diff not rendered.
Binary file removed docs/DavidSchechter .png
Diff not rendered.
Binary file removed docs/DeborahNBC5 .png
Diff not rendered.
Binary file removed docs/JeffSmithi24 .png
Diff not rendered.
Binary file removed docs/JennySivie .png
Diff not rendered.
Binary file removed docs/JimSchutze .png
Diff not rendered.
Binary file removed docs/JohnnyNBC6 .png
Diff not rendered.
Binary file removed docs/JustinWWaldrop .png
Diff not rendered.
Binary file removed docs/KRLDEmily .png
Diff not rendered.
Binary file removed docs/KenKalthoffNBC5 .png
Diff not rendered.
Binary file removed docs/Knightengale .png
Diff not rendered.
Binary file removed docs/Lindenberger .png
Diff not rendered.
Binary file removed docs/MonicaTVNews .png
Diff not rendered.
Binary file removed docs/NBC5photog .png
Diff not rendered.
Binary file removed docs/POLSDFW .png
Diff not rendered.
Binary file removed docs/PappalardoJoe .png
Diff not rendered.
Binary file removed docs/PhilipTKingston .png
Diff not rendered.
Binary file removed docs/RayLeszcynski .png
Diff not rendered.
Binary file removed docs/RobertWilonsky .png
Diff not rendered.
Binary file removed docs/ScottNBC5 .png
Diff not rendered.
Binary file removed docs/ToddNEWS .png
Diff not rendered.
Binary file removed docs/ToddWFAA8 .png
Diff not rendered.
Binary file removed docs/TristanHallman .png
Diff not rendered.
Binary file removed docs/Wylie_H_Dallas .png
Diff not rendered.
Binary file removed docs/ahuguelet .png
Diff not rendered.
Binary file removed docs/aviselk .png
Diff not rendered.
Binary file removed docs/brandonformby .png
Diff not rendered.
Binary file removed docs/clairezcardona .png
Diff not rendered.
Binary file removed docs/dallasweekly .png
Diff not rendered.
Binary file removed docs/erickreindler .png
Diff not rendered.
Binary file removed docs/etjohnstone .png
Diff not rendered.
Binary file removed docs/jasonheid .png
Diff not rendered.
Binary file removed docs/jdmiles11 .png
Diff not rendered.
Binary file removed docs/jmchiquillo .png
Diff not rendered.
Binary file removed docs/johnmccaa .png
Diff not rendered.
Binary file removed docs/medenix .png
Diff not rendered.
Binary file removed docs/rlopezwfaa .png
Diff not rendered.
Binary file removed docs/shaunrabbfox4 .png
Diff not rendered.
Binary file removed docs/timmytyper .png
Diff not rendered.
Binary file removed docs/ttsiaperas .png
Diff not rendered.
Binary file removed docs/wfaalauren .png
Diff not rendered.
Binary file removed docs/wfaashelly .png
Diff not rendered.
Binary file removed docs/zaccrain .png
Diff not rendered.
Binary file removed erickreindler .png
Diff not rendered.
Binary file removed etjohnstone .png
Diff not rendered.
160 changes: 160 additions & 0 deletions get twitter data.R
@@ -0,0 +1,160 @@
library(tidyverse)
library(tidytext)
library(rvest)
library(rtweet)
library(httpuv)
# install.packages("devtools")
# library(devtools)
# install_github("dgrtwo/drlib")
library(drlib)
library(widyr)

source("tokens.R")

# list of prominent Dallas media members
# from https://twitter.com/advocamentum/lists/news-media/members
advocamentum_news_media <- lists_members(owner_user = "Advocamentum", slug = "news-media")

# add Wylie to advocamentum's list
advocamentum_news_media <- advocamentum_news_media %>%
add_row(name = "Wylie H. Dallas",
screen_name = "Wylie_H_Dallas") %>%
add_row(name = "Philip Kingston",
screen_name = "PhilipTKingston")

# Download up to ~3200 of the most recent tweets from each account
tweets <- map_df(advocamentum_news_media$screen_name,
get_timeline,
n = 3200)

wylie_tweets <- get_timeline("Wylie_H_Dallas", n = 3200)

save(tweets, file="tweets_20190507.Rda")
save(wylie_tweets, file="wylie_tweets_20190507.Rda")

kingston_tweets <- get_timeline("PhilipTKingston", n = 3200)
tweets <- tweets %>%
bind_rows(kingston_tweets)

load(file = "tweets.Rda")
load(file = "wylie_tweets.Rda")

# merge data frames together
tweets <- bind_rows(tweets, wylie_tweets)

# create plot showing time of day when everyone tweets

# first get the hour
tweets$hour <- as.numeric(strftime(tweets$created_at, format = "%H")) +
as.numeric(strftime(tweets$created_at, format = "%M")) / 60

tweets %>%
filter(screen_name %in% c("JimSchutze",
"Wylie_H_Dallas",
"RobertWilonsky")) %>%
ggplot(aes(x = hour)) +
geom_histogram(bins = 24 * 6) +
facet_grid(~screen_name)

# let's do the above on everyone individually
for (i in unique(tweets$screen_name)) {
tweets %>%
filter(screen_name == i |
screen_name == "Wylie_H_Dallas") %>%
ggplot(data = .[.$screen_name == i,], aes(x = hour)) +
geom_histogram(bins = 24 * 6) +
xlim(0, 24) +
xlab("hour of day") +
ylab("count of tweets") +
ggtitle(paste(advocamentum_news_media[advocamentum_news_media$screen_name == i,]$name, "'s tweet time histogram", sep = ""))
ggsave(paste0(i, ".png"))
}

i <- "PhilipTKingston"
tweets %>%
filter(screen_name == i) %>%
ggplot(aes(x = hour)) +
geom_histogram(bins = 24 * 6) +
xlim(0, 24) +
xlab("hour of day") +
ylab("count of tweets") +
ggtitle(paste(advocamentum_news_media[advocamentum_news_media$screen_name == i,]$name, "'s tweet time histogram", sep = ""))


tweets %>%
filter(screen_name == i |
screen_name == "Wylie_H_Dallas") %>%
ggplot(data = .[.$screen_name == i,], aes(x = hour)) +
geom_histogram(bins = 24 * 6) +
xlim(0, 24) +
xlab("hour of day") +
ylab("count of tweets") +
ggtitle(paste(advocamentum_news_media[advocamentum_news_media$screen_name == i,]$name, "'s tweet time histogram", sep = ""))
ggsave(paste0(i, ".png"))

# do word analysis
reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"

tweet_words <- tweets %>%
filter(!is_retweet) %>%
arrange(created_at) %>%
distinct(text, .keep_all = TRUE) %>%
select(screen_name, status_id, text) %>%
mutate(text = str_replace_all(text, "https?://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
unnest_tokens(word, text, token = "regex", pattern = reg) %>%
filter(str_detect(word, "[a-z]")) %>%
filter(!word %in% stop_words$word)

tweet_words %>%
filter(!word %in% stop_words$word) %>%
count(word, sort = TRUE) %>%
head(16) %>%
mutate(word = reorder(word, n)) %>%
ggplot(aes(word, n)) +
geom_col() +
coord_flip() +
labs(y = "# of uses among staff Twitter accounts")

word_counts <- tweet_words %>%
count(screen_name, word, sort = TRUE)

# Compute TF-IDF using "word" as term and "screen_name" as document
word_tf_idf <- word_counts %>%
bind_tf_idf(word, screen_name, n) %>%
arrange(desc(tf_idf))

similarity <- word_tf_idf %>%
pairwise_similarity(screen_name, word, tf_idf, upper = FALSE, sort = TRUE) %>%
filter(item1 == "Wylie_H_Dallas" |
item2 == "Wylie_H_Dallas")

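# get technology data (compare the clients, i.e. the "source" column, used to post)
# NOTE: tweets_datefiltered is not defined in this script; it must already exist in the session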
tweets_datefiltered %>%
group_by(screen_name, source) %>%
summarize(count = n()) %>%
ungroup() %>%
pairwise_similarity(screen_name, source, count, upper = FALSE, sort = TRUE) %>%
filter(item1 == "Wylie_H_Dallas" |
item2 == "Wylie_H_Dallas")

# pairwise similarity by hour
hour_matches <- tweets_datefiltered %>%
mutate(hour_number = as.numeric(strftime(created_at, format = "%H"))) %>%
group_by(screen_name, hour_number) %>%
summarize(count = n()) %>%
ungroup() %>%
pairwise_similarity(screen_name, hour_number, count, upper = FALSE, sort = TRUE) %>%
filter(item1 == "Wylie_H_Dallas" |
item2 == "Wylie_H_Dallas")

# pairwise similarity by date
date_matches <- tweets_datefiltered %>%
mutate(date_string = as.Date(created_at)) %>%
group_by(screen_name, date_string) %>%
summarize(count = n()) %>%
ungroup() %>%
pairwise_similarity(screen_name, date_string, count, upper = FALSE, sort = TRUE) %>%
filter(item1 == "Wylie_H_Dallas" |
item2 == "Wylie_H_Dallas")
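
The TF-IDF and pairwise-similarity steps in the script can be illustrated with a small base-R sketch. It assumes `bind_tf_idf()` uses the natural-log inverse document frequency and that `pairwise_similarity()` is a cosine similarity over the weighted values; the account names, words, counts, and the `cosine_similarity()` helper are all hypothetical and are not part of the commit.

```r
# Toy word counts per account (hypothetical data, not from the commit).
word_counts <- data.frame(
  screen_name = c("acct_a", "acct_a", "acct_b", "acct_b", "acct_c", "acct_c"),
  word        = c("dallas", "council", "dallas", "council", "dallas", "weather"),
  n           = c(2, 2, 1, 3, 3, 1),
  stringsAsFactors = FALSE
)

# Term frequency: share of an account's words taken up by each word.
totals <- ave(word_counts$n, word_counts$screen_name, FUN = sum)
word_counts$tf <- word_counts$n / totals

# Inverse document frequency: ln(accounts / accounts that use the word).
# Each (screen_name, word) pair appears once, so counting rows per word works.
n_docs <- length(unique(word_counts$screen_name))
docs_with_word <- ave(rep(1, nrow(word_counts)), word_counts$word, FUN = sum)
word_counts$idf <- log(n_docs / docs_with_word)
word_counts$tf_idf <- word_counts$tf * word_counts$idf

# Spread into an account-by-word matrix; missing pairs become 0.
m <- xtabs(tf_idf ~ screen_name + word, data = word_counts)

# Hypothetical helper: cosine of the angle between two weight vectors.
cosine_similarity <- function(x, y) {
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

sim_ab <- cosine_similarity(m["acct_a", ], m["acct_b", ])
sim_ac <- cosine_similarity(m["acct_a", ], m["acct_c", ])
```

Because "dallas" appears in every toy account, its IDF is zero and it contributes nothing. `acct_a` and `acct_b` then differ only in how heavily they use "council", so their similarity comes out at 1, while `acct_a` and `acct_c` share no weighted word and score 0. This is why TF-IDF weighting is a reasonable choice before comparing accounts: words everyone uses stop dominating the comparison.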

Diff not rendered.
Binary file removed jasonheid .png
Diff not rendered.
Binary file removed jdmiles11 .png
Diff not rendered.
Binary file removed jmchiquillo .png
Diff not rendered.
Binary file removed johnmccaa .png
Diff not rendered.
Binary file removed medenix .png
Diff not rendered.
Binary file removed rlopezwfaa .png
Diff not rendered.
Binary file removed shaunrabbfox4 .png
Diff not rendered.
Binary file removed timmytyper .png
Diff not rendered.
Binary file removed ttsiaperas .png
Diff not rendered.
Binary file removed wfaalauren .png
Diff not rendered.
Binary file removed wfaashelly .png
Diff not rendered.
Binary file removed zaccrain .png
Diff not rendered.
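
Every image removed in this commit has a space before `.png`. That is what `paste()` produces with its default separator, so a filename built that way is a plausible origin for these files, although the commit message does not say so. A minimal sketch of the difference, using one screen name from the list as an example value:

```r
screen_name <- "JimSchutze"

# paste() joins its arguments with sep = " " by default.
with_space <- paste(screen_name, ".png")

# paste0() joins with no separator.
without_space <- paste0(screen_name, ".png")

print(with_space)
print(without_space)
```

Using `paste0()`, or `paste(..., sep = "")`, keeps the filename free of the stray space.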
